Three (plus one) Different Versions of map in Haskell
If you are here, then map
requires no introduction. You have seen it.
You have used it. Many times. You have even implemented it before. But did you
know that there are more than one ways to implement the function map
?
Of all the archetypal higher-order functions that I know, map
is the function
that I hold dearest in my heart. That’s because it marked a very pleasant memory
when I finished The Little Schemer (for the first time, back in 2016) and I found
out that I grokked the concept enough to be able to write it on my own.
As a quick explainer, for the sake of completeness, map
is a function that takes
two arguments, a function fn
and a list lat
, and produces a new list with the
elements being the result of the application of the function fn
to the elements
of lat
(in mathematical writing, this would be ∀x ∈ lat, fn(x)
- that is, for
all x
that belong to lat
we take the result of fn
applied to x
, fn(x)
).
I always knew of the (classic?) way to define that in Scheme
(or more generally,
in Lisp
), which I had seen also being used in OCaml
/SML
, which looks like this:
(define (map fn lat)
(cond
((null? lat) '())
(else
(cons (fn (car lat)) (map fn (cdr lat))))))
That is, a recursive function definition, that does the following:
- If the input list
lat
is empty, it returns the empty list()
(we use that as the base forcons
structing a list of values), otherwise - It returns the list
cons
tructed from the application of the function to the first element of the list(fn (car lat))
and the result of the recursive call on the remaining list(map fn (cdr lat))
.
In OCaml, the same function would probably be defined using pattern matching, but the pattern is the same:
let rec map fn lat =
match lat with
| [] -> []
| h::t -> (fn h)::(map fn t)
Recently though, I started reading Graham Hutton’s excellent Programming in Haskell, as a preparation for some more advanced material I wanted to read that required Haskell knowledge.
In that book, I was delighted to find not one, not two, but three different ways of
defining the function map
in Haskell.
Let’s have a look at the first two ways of defining map
in Haskell, both defined in
chapter 7 of the book.
The classical recursive definition of map
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
The first definition is the one that is closest to what we described above for both Scheme and OCaml. It’s the (structural) recursive definition using pattern matching against two patterns:
- The empty list in the first pattern
[]
, which as before returns the empty list, and - A pattern of a list with at least one element, which we immediately de-structure
into a
head
andtail
component in(x:xs)
. When matched, it will build acons
(:
) of the result of the function applicationf x
and the result of the recursive call of the remaining list (map f xs
).
This definition is basically the exact equivalent of the OCaml definition above, with the only notable difference being the explicit function signature given:
map :: (a -> b) -> [a] -> [b]
This tells us that the map
function takes two arguments:
- A polymorphic function that maps elements of type
a
to typeb
(a -> b)
, - A list of elements of type
a
([a]
)
and as a result produces a list of elements of type b
([b]
).
(The a
and b
above are called type variables, and are implicitly universally
quantified, i.e. they read as “for all elements of type a
”).
So far, so classic.
Nothing out of the ordinary here.
Let the show start: The list comprehension definition of map
Later on in the book, we come across this definition:
map :: (a -> b) -> [a] -> [b]
map f xs = [f x | x <- xs]
Wow!
This one packs a punch in terms of conciseness, but I find it very elegant and very expressive at the same time.
This definition is based in the syntax for list comprehensions in Haskell.
List comprehensions are a concise and convenient way to define new lists by manipulating elements of other lists. As an example, consider the following:
> [x^2 | x <- [1, 2, 3, 4, 5]]
[1, 4, 9, 16, 25]
This is read as make a list of “x
squared, with x
drawn from (|
) the list
[1, 2, 3, 4, 5]
.
Taking this into account, and back to our map
function definition, our comprehension
there:
map f xs = [f x | x <- xs]
reads as “make a list of the results of f x
, where x
is drawn from the list xs
”.
Beautiful.
Roll the carpet for Mr. Fold: The foldr
definition of map
One of the coolest things the book opened my eyes to was the fact that we can
use a fold
in a generative fashion.
I was well aware of fold
from both Racket and OCaml before, but I always thought
of that in terms of a generalisation of reduce
- it never occured to me that
I can use the accumulating function to yield a value in a generative recursion
fashion, like a new list 🤯
(Embarassingly, it now looks kind of obvious, and related to ideas I’ve been exposed
to in the past - this one is related to the collector functions
idea that the Little
Schemer uses in chapter 8).
Back to our definition.
This one is not given by the book - in fact, it’s left for the reader to define as one of the exercises. This is what I came up with:
map :: (a -> b) -> [a] -> [b]
map f = foldr (\ x xs -> f x : xs) []
This definition is the one that left me most stunned and amazed with the generalising power of a fold.
What this definition tells us is that we define map
as a right fold (foldr
)
of a lambda that takes two arguments, x
and xs
, and returns
the cons of f x
and xs
. The last value we pass to the foldr
is the empty list []
, which is the
value that is used as the base case.
In order to understand what the foldr
does, I find the following visual from
Wikipedia to be
very helpful:
To understand the above visualisation, you need to understand that a list in Haskell
(and most functional programming languages for that matter), is a cons
of various
values and the empty list. I.e. you can think of [1, 2, 3, 4, 5]
as
1 : (2 : (3 : (4 : (5 : []))))
.
With that in mind, what the foldr
function does is that replaces the cons
(:
)
operator with the function argument supplied to it, and reduces it all (folds the
list) into a single value.
(It also replaces the base case empty list value []
with the value of the last
argument supplied to it. In our case, we are passing the empty list ([]
) again,
which we are going to use as a base to build a new list of values on top of.)
And here’s the trick - because the (anonymous) function we passed to it is building
a new list (by applying the cons
operator again), what we end up is a new list instead
of just a single value!
In our case, this means that if we had the list:
1 : (2 : (3 : (4 : (5 : []))))
after the application of the anonymous function we would have a new list, equivalent to
f 1 : (f 2 : (f 3 : (f 4 : (f 5 : []))))
🤯
Bonus round: a monadic definition of map
Okay, this is a bit of a cheat because this is basically our first definition, except that the function and the return type are monadic:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f [] = return []
mapM f (x:xs) = do y <- f x
ys <- mapM f xs
return (y : ys)
This is using the do
notation to define a sequence of actions,
but the actions themselves have a near 1-1 correspondence to our original map implementation:
- First we assign the name
y
to the result off x
, then - We assign the name
ys
to the result of the recursive call on the of the list (mapM f xs
) - And the return the
cons
of the two values(y : ys)
How is this definition useful?
You may have observed that the return type is m [b]
- a monadic list of type b
.
Consider the following: we want a function that converts a string into a list only if all of the string characters correspond to a digit, or fail gracefully otherwise.
One way to do that is to write a function to convert a single character into a
Maybe Int
:
conv :: Char -> Maybe Int
conv c | isDigit c = Just (digitToInt c)
| otherwise = Nothing
With this definition, we can now use our function mapM
like this:
> mapM conv "1234"
Just [1, 2, 3, 4]
> mapM conv "123a"
Nothing
Conclusion
I enjoyed writing this post quite a bit.
It’s a very humbling experience to be visiting books and seeing that ideas that you considered to be elementary and rather surface-level are a lot more nuanced once you start digging deeper into them.
It’s something I need to keep top of mind as I go forward as well.
This post also highlights one of the nice things of going for breadth of knowledge: exposure
to a wider set of ideas. Had I only stayed in the Racket/OCaml realm, I probably wouldn’t have
been exposed to this range of map
implementations, for the reason that these languages don’t
offer some of the language features that enable them, like list comprehensions.
All in all, happy I went down this path, and looking forward to what’s next.