**The exercise can be stated like this: given a list of functions (with appropriate types), construct the composition of the functions from that list using a foldr.**

The strategy to follow for this problem (and for others that request to write a function in a foldr form) is the following:

1. Write the function in a more mathematical notation (in order to be easier to manipulate it).

2. Derive a recursive version of the function.

3. Use the foldr universality property for the recursive definition and derive the foldr version of it.

So let’s see how we concretely apply these steps to our problem.

**Step 1. **A more mathematical specification of our problem can be done like this:

compFuncs [f1,f2,…,fn] = f1 o f2 o … o fn (where o is the symbol for function composition)

For the moment we’re not interested in the definition of compFuncs [], because that will be given in step 2.

**Step 2. **Let’s derive a recursive definition for our function.

**compFuncs [f1,f2,…,fn] **

**=**(by definition)

f1 o f2 o … o fn

=(function composition is associative)

f1 o (f2 o … o fn)

=(by definition, f2 o … o fn = compFuncs [f2,…,fn])

**f1 o compFuncs [f2,…,fn]**

So what we have by now is: **compFuncs [f1,f2,…,fn] = f1 o compFuncs [f2,…,fn] (1)**

To convert it into Haskell form, let’s fix some notation:

- o is denoted in Haskell by dot: .
- the list [f1,f2, … , fn] can be split into the head f1 and the tail [f2,…,fn]; we write the tail as fs = [f2,…,fn]

With these notations, we rewrite the relation (1) into:

compFuncs (f1:fs) = f1 o compFuncs fs, or equivalently (renaming f1 to simply f):

**compFuncs (f:fs) = f . compFuncs fs (2)**

The only remaining problem is what form **compFuncs []** takes. For that, let’s see how the function works:

**f1 o f2 o … o fn**

=(by definition)

compFuncs (f1:[f2, … ,fn])

=(using the relation 1)

f1 o compFuncs [f2, … , fn]

=(using the relation 1)

f1 o (f2 o compFuncs [f3, … , fn])

=

…

=

**f1 o f2 o … o fn o (compFuncs [])**

So we have that **f1 o f2 o … o fn** = **f1 o f2 o … o fn o (compFuncs [])****. **We know that the identity function (id x = x) has the property that f o id = f for all functions f, so the obvious choice here is **compFuncs [] = id (3)**

Using the equalities (2) and (3), we obtain the recursive definition of **compFuncs:**

**compFuncs [] = id**

**compFuncs (f:fs) = f . compFuncs fs **
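The recursive definition can be tried out directly; here is a minimal self-contained sketch (the example functions (+1) and (*2) are my own):

```haskell
-- Recursive compFuncs, exactly as derived above.
compFuncs :: [a -> a] -> (a -> a)
compFuncs []     = id
compFuncs (f:fs) = f . compFuncs fs

main :: IO ()
main = do
  print (compFuncs [(+1), (*2)] (3 :: Int))  -- (+1) ((*2) 3) = 7
  print (compFuncs [] (5 :: Int))            -- id 5 = 5
```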

**Step 3. **Convert the recursive definition into a foldr.

At this point we’ll use the foldr universality property (you can read about it in detail in [1]).

**Theorem (foldr universality property):**

** g = foldr h v**

** <=>**

** g [] = v **

** g (x : xs) = h x (g xs)**

So let’s apply this to our recursive definition.

**compFuncs = foldr h v**

=>

**The first equation of the theorem:**

compFuncs [] = v, but from our recursive definition we know that compFuncs [] = id**, so v = id (A)**

**The second equation of the theorem:**

compFuncs (f:fs) = h f (compFuncs fs)

So our task here is to find the function h.

We know from the recursive definition that compFuncs (f:fs) = f o compFuncs fs, so we have:

compFuncs (f:fs) =** f o compFuncs fs = h f (compFuncs fs)**

Now let’s focus on the bolded part: **f o compFuncs fs = h f (compFuncs fs)**

We notice that **compFuncs fs** appears on both sides of the equation, and because h must work in all cases, we can generalize it to **a fresh variable r = compFuncs fs**; the equation becomes: **f o r = h f r.**

So with **h f r = f o r **, the second equation of the theorem is satisfied.

Let’s write the composition with dot notation, like in Haskell: **h f r = f . r (B)**

Putting together the equalities (A) and (B), **we transformed compFuncs into a foldr:**

```haskell
compFuncs :: [a -> a] -> (a -> a)
compFuncs = foldr h id
  where h f r = f . r
```
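Since the helper h here is exactly the composition operator, the definition collapses to foldr (.) id; a quick self-contained check (the example functions are mine):

```haskell
compFuncs :: [a -> a] -> (a -> a)
compFuncs = foldr h id
  where h f r = f . r

main :: IO ()
main = do
  print (compFuncs [(+1), (*2), subtract 3] (10 :: Int))     -- (+1) ((*2) (10-3)) = 15
  print (foldr (.) id [(+1), (*2), subtract 3] (10 :: Int))  -- the same composition: 15
```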

[1] Hutton, Graham (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming. Cambridge University Press.


This note highlights some important points:

– you can use mathematical theorems to actually improve your code in terms of execution efficiency;

– using theorems, you can derive solutions which are hard to construct just in an intuitive manner;

– a highly efficient solution has, in many cases, a powerful theory standing behind it.

That being said, we can state the problem:

**Problem. Find the ascending lists of the cartesian product of non-empty lists of integers.**

**Solution.**

**Aside.**

For the solution we clearly need a way to compute the cartesian product of lists. For this, you can read a more detailed post here: http://myhaskelljournal.com/cartesian-product-of-lists/

As an example: cartprod [ [2], [1,3] ] = [ [2,1], [2,3] ]

**End of Aside**

Back to the solution. It’s clear that, if we keep only the ascending lists among those produced by the cartesian product, we have all the ascending lists of this cartesian product (a useful truism, so let’s use it).

In symbols, what we said reduces to:

**fcp = filter n . cp**, where cp is the cartesian product of lists and n is a predicate that, given a list of integers, is True if and only if the list is ascending.

**Remark. **This is a correct definition of our desired function, but to evaluate it we must examine every element produced by the cartesian product and check whether it is ascending. And there are a lot of these elements (this algorithm is exponential). So what to do next?

We can think of alternative ways to solve the problem. What if we can generate from the start only the ascending lists of the cartesian product? This would be, indeed, a much much faster algorithm, because instead of searching an exponential space and filtering, we’ll generate **directly **the ascending lists desired.

But here our intuition stops. It’s hard to derive a directly generating algorithm from pure intuition alone. Other mechanisms must be employed to arrive at a correct and efficient solution.

First of all, we’ll use a definition of the cartesian product in terms of folding (**exercise 1**):

```haskell
cp :: [[Int]] -> [[Int]]
cp = foldr f [[]]
  where f xs yss = [x:ys | x <- xs, ys <- yss]
```
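As a quick sanity check of this foldr definition (the first example matches the cartprod example given earlier):

```haskell
-- Cartesian product of lists as a foldr (exercise 1).
cp :: [[Int]] -> [[Int]]
cp = foldr f [[]]
  where f xs yss = [x:ys | x <- xs, ys <- yss]

main :: IO ()
main = do
  print (cp [[2], [1,3]])    -- [[2,1],[2,3]]
  print (cp [[1,2], [1,2]])  -- [[1,1],[1,2],[2,1],[2,2]]
```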

At this point, we have that fcp = filter n . cp, so fcp is a composition of a foldr (the cp function) with another function.

And here comes into use **the foldr fusion theorem (law)**, which states that, under certain precise conditions, the composition of a function with a foldr is itself a foldr. This fused foldr will generate the ascending lists of the cartesian product directly, as we’ll see next.

Let’s state the foldr fusion law:

**Theorem (foldr fusion law)**

**If we have the following properties:**

1. **f is a strict function.**
2. **f (g x y) = h x (f y), for all x and y in the appropriate ranges.**

**Put b = f a.**

**Then: f . foldr g a = foldr h b.**

This theorem will enable us to reduce fcp to a foldr which will generate directly the ascending lists needed.

Let’s write again what we have by now:

**fcp = filter n . cp = filter n . foldr f [ [ ] ], where f xs yss = [x:ys | x <- xs, ys <- yss]**

We’ll apply all we know about the stated foldr fusion law in order to transform fcp into a foldr.

First of all, property 1 of the theorem is satisfied, because filter n is a strict function.

We’ll put b = filter n [ [ ] ] = [ [ ] ], so we have that **b = [ [ ] ].**

It remains to construct the function h from the theorem.

Property 2 of the theorem becomes the following in our context:

**filter n (f xs yss) = h xs (filter n yss). (E1)**

In this step, we can try something like an inductive construction based on xs or yss (the only variables we’re dealing with). I confess that induction on xs was a pain in the neck: it is hard to do and involves too many reasoning steps.

We’ll do a kind of inductive reasoning based on yss.

Because property 2 of the theorem must be valid **for all **yss, it must in particular be valid for concrete instances of yss.

**Case 0 (Base Case)**. The simplest concrete case is yss = []. Using it in the equation E1, we’ll have:

**filter n (f xs [ ]) = h xs (filter n [ ]) <=> filter n [ ] = h xs [ ]**

Computing the left part, we have that **h xs [ ] =** **[ ] (E2)**

**Case 1 (Inductive case). **Let’s decompose the list into head and tail: **(ys:yss)**

h xs (filter n (ys:yss)) = filter n (f xs (ys:yss))

We’ll look only at the case when **ys is not ascending** (the other one is similar):

filter n (f xs (ys:yss)) = h xs (filter n (ys:yss)) = {ys is not ascending} h xs (filter n yss) (E3)

But f xs (ys:yss) = [x:zs | x <- xs, zs <- ys:yss]

=> filter n (f xs (ys:yss)) = filter n [x:zs | x <- xs, zs <- ys:yss]. (E4)

Knowing that **ys is not ascending, **we’ll not take ys into account, so:

filter n [x:zs | x <- xs, zs <- ys:yss] = filter n [x:zs | x <- xs, zs <- yss], or, renaming zs to ys for convenience:

filter n [x:zs | x <- xs, zs <- ys:yss] = filter n [x:ys | x <- xs, ys <- yss] (E5)

From E3, E4 and E5, we conclude that:

h xs (filter n yss) = filter n [x:ys | x <- xs, ys <- yss] =

{we can restrict yss to its ascending lists – filter n yss – because x:zs can be ascending only if zs is ascending}

filter n [x:zs | x <- xs, zs <- filter n yss]

So we have that: ** h xs (filter n yss) = filter n [x:zs | x <- xs, zs <- filter n yss]**

**Having filter n yss in both parts of the previous equation** (we were heading for it), we use the standard technique to** raise filter n yss to a fresh variable zss**, and we have:

**h xs zss = filter n [x:zs | x<-xs, zs <- zss] **(E6)

Putting together E2 and E6, we constructed the function h needed:

**h xs [ ] = [ ]**

** h xs zss = filter n [x:zs | x<-xs, zs <- zss]**

Using the foldr fusion theorem, we arrived at **a definition of fcp = filter n . cp in terms of foldr** (of course, **the function n** isn’t defined yet; we don’t need it for the moment!):

```haskell
fcp :: [[Int]] -> [[Int]]
fcp = foldr h [[]]
  where h xs zss = filter n [x:zs | x <- xs, zs <- zss]
```
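To run this fused definition we need some predicate n; as a minimal sketch, I take n to be the non-decreasing test (my own choice, but consistent with the y <= head zs comparison derived later), and check that the fused fold agrees with filter n . cp:

```haskell
-- n: True iff the list is ascending (non-decreasing); my own definition,
-- consistent with the y <= head zs test derived later in the post.
n :: [Int] -> Bool
n xs = and (zipWith (<=) xs (drop 1 xs))

cp :: [[Int]] -> [[Int]]
cp = foldr f [[]]
  where f xs yss = [x:ys | x <- xs, ys <- yss]

-- The fused definition obtained from the fusion theorem.
fcp :: [[Int]] -> [[Int]]
fcp = foldr h [[]]
  where h xs zss = filter n [x:zs | x <- xs, zs <- zss]

main :: IO ()
main = do
  print (fcp [[1,2], [1,2]])                                  -- [[1,1],[1,2],[2,2]]
  print (fcp [[1,2], [1,2]] == filter n (cp [[1,2], [1,2]]))  -- True
```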

This is all well and good, and (given a recursive definition for the function **n** that decides whether a list is ascending) we’re done. But a little problem remains: we still have this **filter n** occurring in our foldr definition, and it can be a source of inefficiency.

To remove the filtering and to actually generate the ascending lists directly, let’s see what foldr actually means:

In the general setting, foldr f e [x1,x2,…,xn] = f x1 (f x2 (… (f xn e)…))

**In our case: **foldr h [ [ ] ] [l1,l2,…,ln] = h l1 (h l2 (h l3 (… (h ln [ [ ] ])…))) (**foldr expansion**)

h xs zss = filter n [x:zs | x<-xs, zs<-zss]

h ln [ [ ] ] = [ [x] | x <- ln]

h l(n-1) (h ln [ [ ] ]) = filter n [y:zs | y <- l(n-1), zs <- [ [x] | x <- ln]], **so it’s sufficient to require y <= head zs in order to keep only the ascending lists!**

So h l(n-1) zss = [y:zs | y <- l(n-1), zs <- zss, y<= head zs]

And the reasoning goes exactly like before for all the terms of the foldr expansion.

So it’s clear by now that h can be redefined by:

**h xs [ [ ] ] = [ [x] | x<-xs]**

**h xs zss = [y:zs | y <- xs, zs <- zss, y <= head zs]**

And in this way we constructed the final definition of fcp, which (by the foldr expansion) is clear that generates only the ascending lists of the cartesian product:

```haskell
fcp :: [[Int]] -> [[Int]]
fcp = foldr h [[]]
  where
    h xs [[]] = [[x] | x <- xs]
    h xs zss  = [y:zs | y <- xs, zs <- zss, y <= head zs]
```
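The final definition runs as-is; a self-contained check on sample inputs of my own (for five copies of [1..3] there are C(7,2) = 21 non-decreasing lists):

```haskell
fcp :: [[Int]] -> [[Int]]
fcp = foldr h [[]]
  where
    h xs [[]] = [[x] | x <- xs]                            -- innermost step: singletons
    h xs zss  = [y:zs | y <- xs, zs <- zss, y <= head zs]  -- extend only ascending lists

main :: IO ()
main = do
  print (fcp [[1,2], [1,2]])                          -- [[1,1],[1,2],[2,2]]
  print (length (fcp [[1..3] | _ <- [1..5 :: Int]]))  -- 21
```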

**End of solution.**

**Remark1. **Doing a comparison test in WinGHCi, the inefficient solution of fcp (I’ve named it fcpi) is taking a long time:

*Main> last (fcpi [[1..3] | i<-[1..14]])

[3,3,3,3,3,3,3,3,3,3,3,3,3,3]

(6.27 secs, 4,051,273,576 bytes)

The efficient solution derived (named fcp) is doing a lot better (time and space usage!)

*Main> last (fcp [[1..3] | i<-[1..14]])

[3,3,3,3,3,3,3,3,3,3,3,3,3,3]

(0.05 secs, 379,392 bytes)

Evaluating compiled versions of fcpi and fcp doesn’t change the picture: the gap between them remains.

**So indeed the fusion law helped us to optimize in space and time!**

**Remark2. **Behind an optimization usually stands a powerful mathematical theorem. This was noticed by others too – Edsger Dijkstra once said that the most efficient algorithms are derived by taking powerful mathematical theorems into account.

**Remark3. **Intuition is helped a lot when we have a little more knowledge of what’s going on mathematically. Derivation by mathematical laws is a tool that we must use in the more difficult problems that we encounter – they are precise in nature and help us to see why our programs must work in all cases!

So, on a more Leibnizian note,

Calculemus!

**Exercises:**

** 1.** Derive (using the universal property of foldr) the definition of cp in terms of foldr.

** 2.** Verify that h from the final definition of fcp satisfies the property 2 of the foldr fusion theorem.

** 3.** Compute the complexity of the first definition of fcp and the last one.

————————————————–

[1] Bird, Richard. (2015) Thinking Functionally with Haskell. Cambridge University Press. (Exercise D, page 173)

**Problem. Given a list of float numbers, write a function that computes the mean of these numbers. **

**Solution: **We can easily specify a function that computes the mean of a list of floats:

**mean:: [Float] -> Float**

** mean [] = 0**

** mean xs = sum xs / fromIntegral (length xs)**

The problem with this definition is that we have to loop twice through the elements of xs in order to compute the mean (because xs is needed in computing the sum and the length). Also, because xs is needed for sum and length, it will be retained in memory. This is a major space leak if we think of lists with millions of elements.

To avoid looping twice through the elements of xs, we need to compute the sum and the length at the same time. We’ll use the simple but powerful technique of tupling and define a function called sumlen:

**sumlen :: [Float] -> (Float,Int)**

** sumlen xs = (sum xs, length xs)**

**The real challenge begins here: how can we define sumlen as a function that loops only once through xs?**

We need to derive a recursive definition of sumlen. Let’s lay out our plan for doing that (explaining in detail every step from [2]):

- Find a recursive definition for sumlen that loops through xs only once.
- Find a definition of sumlen using foldr.
- Make explicit the transition from foldr to foldl and define sumlen as a foldl.
- We can use foldl’ instead of foldl and, in this way, hope for an algorithm that works in constant space.

**1. Finding a recursive definition for sumlen that loops through xs only once.**

First of all, I define sumcomp to be the sum by components of two pairs of numbers like this:

(a,b) `sumcomp` (c,d) = (a+c, b+d)

And now let’s do a little computation in order to transform sumlen into a recursive function:

I write [x1,x2,…,xn] = x1 : [x2,…,xn] and I define xs = [x2,…,xn]. With this in mind, we proceed like this:

sumlen (x1:xs)

= {definition of sumlen}

(sum (x1:xs), length (x1:xs))

= {definitions of sum and length}

(x1 + sum xs, 1 + length xs)

= {sum of pairs by components: sumcomp}

(x1,1) `sumcomp` (sum xs, length xs)

= {definition of sumlen}

(x1,1) `sumcomp` sumlen xs

= {denote (s,n) = sumlen xs}

(x1,1) `sumcomp` (s,n), where (s,n) = sumlen xs

= {definition of `sumcomp`}

(x1 + s, 1 + n), where (s,n) = sumlen xs

So (rewriting a little to be more conventional: x = x1): **sumlen (x:xs) = (x+s,1+n), where (s,n) = sumlen xs. **From the previous inefficient specification of sumlen, it’s clear that **sumlen [] = (0,0)**.

We achieved our goal of computing the sum and length of a list of numbers with only one looping through the elements:

**sumlen :: [Float] -> (Float,Int)**

** sumlen [] = (0,0)**

** sumlen (x:xs) = (x+s,1+n)**

** where (s,n) = sumlen xs**
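A quick check of the one-pass recursive definition (the sample list is mine):

```haskell
-- One-pass recursive sumlen, as derived above.
sumlen :: [Float] -> (Float, Int)
sumlen []     = (0, 0)
sumlen (x:xs) = (x + s, 1 + n)
  where (s, n) = sumlen xs

main :: IO ()
main = print (sumlen [1, 2, 3])  -- (6.0,3)
```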

**Remark: We found that specifying an inefficient algorithm first is not a bad thing. We can use then symbol manipulations to compute more efficient definitions starting with the inefficient definition. This is a very powerful technique in functional programming and you can see more elaborate examples in Richard Bird’s book “Pearls of Functional Algorithm Design” ([3]).**

**2. Finding a definition of sumlen using foldr.**

For this, we need more mathematical machinery, because it simplifies things a lot. I’ll use the universality theorem described by Graham Hutton in [4]:

**Theorem:**

** g = foldr f v**

** <=>**

** g [] = v**

** g (x : xs) = f x (g xs)**

This is a very powerful theorem stating that we can write a function g as a foldr if and only if the function g respects two equations (one for the empty list and one for the general list (x:xs)).

In my case, the function g = sumlen. So I want a definition like sumlen = foldr f v. Our task is to find the actual f and v for our instance.

The first equation of the theorem says that sumlen [] = v. But we found that sumlen [] = (0,0), so v = (0,0).

The second equation of the theorem says that: sumlen (x:xs) = f x (sumlen xs). But sumlen (x:xs) = (x+s, 1+n), where (s,n) = sumlen xs. So we have that:

f x (sumlen xs) = (s+x, n+1), where (s,n) = sumlen xs. If I raise (sumlen xs) to a general variable y, we’ve found our function f:

f x y = (s+x, n+1), where y = (s,n); or, replacing y with (s,n), we have **f x (s,n) = (s+x,n+1)**.

In this way, we’re sure the right part of the universality theorem holds and, applying the theorem, we conclude that:

**sumlen :: [Float] -> (Float,Int)**

**sumlen = foldr f (0,0)**

** where f x (s,n) = (s+x,n+1);**
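The foldr version can be checked the same way (sample list mine):

```haskell
-- sumlen as a foldr, via the universality property.
sumlen :: [Float] -> (Float, Int)
sumlen = foldr f (0, 0)
  where f x (s, n) = (s + x, n + 1)

main :: IO ()
main = print (sumlen [1.5, 2.5, 3.0])  -- (7.0,3)
```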

**Remark: As you can see, the universality theorem of foldr is a very powerful technique of transforming a function into a foldr. It can be used in many more contexts and you can find more about it in the article of Graham Hutton ([4]).**

**3.** **Making explicit the transition from foldr to foldl and defining sumlen as a foldl.**

In order to transform the foldr definition of sumlen into a foldl, we’ll need again more mathematical machinery that simplifies the process. We’ll use a theorem proved in [2], stating that under certain conditions, a foldr can be transformed into a foldl:

**Theorem:**

**For all finite lists xs, we have:**

**foldr f e xs = foldl g e xs**

**provided that the following equations hold:**

**f x (g y z) = g (f x y) z (eq. 1)**

** f x e = g e x (eq. 2)**

For our example, we want to find a function g such that: foldr f (0,0) xs = foldl g (0,0) xs, where f x (s,n) = (s+x,n+1).

It’s clear from the given data that e = (0,0).

I’ll use the second equation (eq. 2), because it is the simpler one and gives some information about g at the base value (0,0):

f x (0,0) = g (0,0) x

f x (0,0)

={definition of f}

(x,1)

So we have that **g (0,0) x = (x,1), for all x of appropriate type**. **(R1)**

But we also must preserve the equation f x (g y z) = g (f x y) z, for all x,y,z of appropriate types.

Trying to use R1, I’ll instantiate y = (0,0) in equation eq. 1 and obtain:

f x (g (0,0) z) = g (f x (0,0)) z = {definition of f} g (x,1) z.

So we have: f x (g (0,0) z) = g (x,1) z (i)

But f x (g (0,0) z) = {using R1} f x (z,1). (ii)

From (i) and (ii) we conclude that f x (z,1) = g (x,1) z. (iii)

But f x (z,1) = {definition of f} (x+z,2) (iv)

From (iii) and (iv) we have that g (x,1) z = (x+z,2). A straightforward generalization gives us:

**g (x,n) z = (x+z,n+1).**

**Exercise: Verify the found function g satisfies eq.1 and eq.2 of the theorem.**

Because g satisfies eq.1 and eq.2, the theorem says that:

**foldr f (0,0) xs = foldl g (0,0) xs, with g (x,n) z = (x+z,n+1).**

**So we arrived at sumlen = foldr f (0,0) = foldl g (0,0), with g (x,n) z = (x+z,n+1), which was our task.**
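We can spot-check on a sample list (mine) that the two folds agree, as the theorem guarantees:

```haskell
-- The two step functions from the derivation.
f :: Float -> (Float, Int) -> (Float, Int)
f x (s, n) = (s + x, n + 1)

g :: (Float, Int) -> Float -> (Float, Int)
g (x, n) z = (x + z, n + 1)

main :: IO ()
main = do
  let xs = [1, 2, 3, 4] :: [Float]
  print (foldr f (0, 0) xs)                      -- (10.0,4)
  print (foldr f (0, 0) xs == foldl g (0, 0) xs) -- True
```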

**4. **Because we’re working with strict functions, we can use foldl’ instead of foldl and we have:

**sumlen :: [Float] -> (Float,Int)**

**sumlen = foldl' g (0,0)**

** where g (x,n) z = (x+z,n+1);**

To make it work in constant space, follow the explanations from [1] (Chapter 25) and rewrite sumlen like this:

```haskell
sumlen :: [Float] -> (Float,Int)
sumlen = foldl' g (0,0)
  where g (x,n) z = x `seq` n `seq` (x+z,n+1)
```

This function can be optimized further, but you can read about that in [1], chapter 25.
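Putting it together with the mean from the problem statement (the mean wrapper below is my own phrasing of the original specification):

```haskell
import Data.List (foldl')

-- Strict one-pass sumlen.
sumlen :: [Float] -> (Float, Int)
sumlen = foldl' g (0, 0)
  where g (x, n) z = x `seq` n `seq` (x + z, n + 1)

-- The original problem: the mean of a list of floats, now in one pass.
mean :: [Float] -> Float
mean [] = 0
mean xs = s / fromIntegral n
  where (s, n) = sumlen xs

main :: IO ()
main = do
  print (sumlen [1, 2, 3, 4])  -- (10.0,4)
  print (mean [1, 2, 3, 4])    -- 2.5
```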

As an important conclusion, knowing mathematics can help a lot in optimizations and that is part of the beauty of functional programming!


—————————————————————————————————————————————————————

[1]. Bryan O’Sullivan, John Goerzen, Donald Bruce Stewart (2008) Real World Haskell. O’Reilly Media.

[2]. Bird, Richard. (2015) Thinking Functionally with Haskell. Cambridge University Press.

[3]. Bird, Richard. (2010) Pearls of Functional Algorithm Design. Cambridge University Press.

[4] Hutton, Graham (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming. Cambridge University Press.

Sometimes we have to answer questions like: I have something that looks like a functional equation, but one function is missing – how do I find it? To be more concrete, **given the functions f,g,h, how do I find the function r in the equation: f.g = h.r? **The purpose of this post is to show you how to find such a function in a specific example and, after you’ve found it, to prove that the functional equation holds.

**Problem: Given the following equation: reverse . concat = concat . reverse . h, find the function h.**

**Part 1**. First of all, we’ll start with the definitions of the functions concat and reverse.

concat is a function that takes a list of lists and concatenates the innermost lists into a single one.

Example: concat [ [1,2] , [3,4] ] = [1,2,3,4].

**Definition of concat:**

**concat :: [[a]] -> [a]**

** concat [] = []**

** concat (xs:xss) = xs ++ concat xss**

reverse is a function that takes a list and returns a list with the original elements read from end to beginning.

Example: reverse [1,2,3,4] = [4,3,2,1]

**Definition of reverse:**

**reverse :: [a] -> [a]**

** reverse [] = []**

** reverse (x:xs) = reverse xs ++ [x]**

**Exercise: Prove that reverse (reverse xs) = xs, for all finite lists xs (in other words, reverse is an involution).**

Back to our problem, we want to find the function h that satisfies the functional equation **reverse . concat = concat . reverse . h.** Because reverse is an involution, we can simplify our equation by a standard technique – compose it on the left with reverse:

reverse . (reverse . concat) = reverse . (concat . reverse . h)

Because composing is associative, we have that:

(reverse . reverse) . concat = reverse . (concat . reverse . h)

Because reverse . reverse = Id (reverse is an involution), we derive that:

**concat = reverse . (concat . reverse . h) (E1), which is much more amenable to manipulations than the initial equation, because the left-hand side has been reduced to a single function – concat.**

Let’s simplify the problem a little and say we’re dealing with a finite list of 2 elements: L = [ [a1,a2] , [a3,a4] ]. Applying equation E1 to L, we have:

**concat L = reverse(concat(reverse( h L ))) (E2)**

E2 is a very important equation, because it gives us a lot of information about the type of h.

**Remark 1.** h needs to work on L, which is a list of lists, so the **domain of h is [[a]], for a fixed type ‘a’**.

**Remark 2**. reverse(h L) in equation E2 says the function reverse needs to work on the values of function h. But reverse is a function that works on lists, so the value (h L) must be a list! In that way, we found that (h L) :: [b], for a fixed type b. So, **the codomain of the function h is [b], for a fixed type b.**

**Conclusion 1: **From remarks 1 and 2, we conclude that **h :: [[a]] -> [b]**, for fixed types a and b.

As we said, h(L) :: [b]. We can write h(L) = [b1, b2, …, bm]. Equation E2 becomes:

concat L = reverse(concat(reverse [b1,…,bm]))

= {definition of reverse}

reverse(concat [bm,…,b1] ) (1)

But concat L = concat [ [a1,a2] , [a3, a4]] = [a1,a2,a3,a4]. Using the derived equation (1), we deduce that:

**[a1,a2,a3,a4] = reverse(concat [bm,…,b1]) (E3)**

**Remark 3. **In the right-hand side of E3, we have the expression concat [bm,…,b1], and knowing that concat works on lists of lists, we deduce that bm,…,b1 are also lists.

Remark 3 helps us to write bm,…,b1 as lists in the following way:

bm = [cm_1,cm_2,…., cm_im], … , b1 = [c1_1, c1_2, … , c1_i1].

In this way, concat [bm,…,b1] = [cm_1,cm_2, … , cm_im, … , c1_1, c1_2, … , c1_i1] (2)

Coming back to equation E3, we’ll have (together with (2)):

[a1,a2,a3,a4] = reverse [cm_1,cm_2, … , cm_im, … , c1_1, c1_2, … , c1_i1]

= {definition of reverse}

[c1_i1, … , c1_1, …, cm_im, …, cm_1]

So, we have now that:

**[a1,a2,a3,a4] = [c1_i1, … c1_2, c1_1, …, cm_im, …cm_2, cm_1] (E4)**

Because the left-hand side of equation E4 has only 4 elements, it’s clear that the right-hand side also has only 4 elements. We’ll take m = 2, pick two elements from b1 and two from b2, and see what happens next:

[a1,a2,a3,a4] = [c1_2,c1_1,c2_2,c2_1]

It’s clear now how to pick: c1_2 = a1; c1_1 = a2; c2_2 = a3; c2_1 = a4.

So we’re left only with **b1 = [c1_1,c1_2] = [a2,a1] and b2 = [c2_1,c2_2] = [a4,a3].**

We first introduced h(L) = [b1,…,bm], which becomes now:

**h(L) = **h [ [a1,a2], [a3,a4] ] = [b1,b2] = [ [a2,a1], [a4,a3] ] = [reverse [a1,a2], reverse [a3,a4]] = map reverse [[a1,a2], [a3,a4]]** = map reverse L**

This derivation says that **h = map reverse**, and we’ve found the function needed.

The equation in the problem becomes: **reverse . concat = concat . reverse . map reverse**
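The derived equation can be spot-checked on a few finite lists (the sample lists are mine):

```haskell
-- Both sides of the derived equation.
lhs, rhs :: [[Int]] -> [Int]
lhs = reverse . concat
rhs = concat . reverse . map reverse

main :: IO ()
main = do
  print (lhs [[1,2], [3,4]])                            -- [4,3,2,1]
  print (rhs [[1,2], [3,4]])                            -- [4,3,2,1]
  print (lhs [[1], [2,3], []] == rhs [[1], [2,3], []])  -- True
```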

But we’ve been over-specific so far, working on a list of 2 lists of 2 elements each. Does the equation hold for all finite lists? The answer is yes, and we’ll prove it:

**Proposition (P): reverse . concat = concat . reverse . map reverse, for all finite lists.**

**End of Part 1.**

**Part 2.**

Prove that:

**Proposition (P): reverse . concat = concat . reverse . map reverse, for all finite lists.**

We could prove this equation using induction, but the aim of this post is to introduce some advanced techniques that you’ll need in your future projects. For that, we’ll start with a little theory.

First of all, all the functions used in proposition P can be rewritten using foldr. We’ll leave the proofs as an exercise.

In particular, **concat = foldr (++) []**.

Also, we need to know a very important theorem, which is demonstrated in [1].

**The fusion theorem.**

Given a function f and a function h with the following conditions:

- f is a strict function.
- f(g x y) = h x (f y), for all x,y of appropriate types.

Let b = f a.

Then f . foldr g a = foldr h b.

**End of fusion theorem.**

The fusion theorem says that, in certain conditions, a function composed with a foldr is also a foldr. This theorem will be used to simplify a lot of equations in our exercise.

Now we’re ready to give the proof of our proposition, which is reminded below:

**reverse . concat = concat . reverse . map reverse, for all finite lists**

We’ll prove that in the following way:

- Express the left-hand side as a foldr.
- Express the right-hand side as a foldr.
- Compare the definitions to see that they are equal.

**1. Express the left-hand side as a foldr.**

**reverse . concat**

= { because, as we said, concat = foldr (++) [] }

**reverse . foldr (++) []**

We have a composition of reverse with a foldr function, so the only thing we can think of is to try applying the fusion theorem:

reverse . foldr (++) [] = foldr h b.

- First of all, the first condition of the theorem is satisfied – reverse is a strict function.
- reverse ((++) xs ys) = h xs (reverse ys)

**reverse ((++) xs ys)**

= { infix notation for (++) }

reverse (xs ++ ys)

= { this can be easily proved }

**reverse ys ++ reverse xs**

We have in this way that:

reverse ys ++ reverse xs = h xs (reverse ys)

We define h to work generally like this:

**h xs zs = zs ++ reverse xs,** and we’re done, we’ve found the desired function h for the fusion theorem. The technique is simple: we just generalized reverse ys (present in left side and right side of the equation) to a fresh variable named zs.

The theorem states also that b = reverse [] = []. So **b = []**.

We’ve found our equation:

**reverse . concat = reverse . foldr (++) [] = foldr h [], where h xs zs = zs ++ reverse xs. (R1)**

**End – express the left-hand side as a foldr.**

**2. Express the right-hand side as a foldr (concat . reverse . map reverse)**

We’ll use the following result, also left as an exercise to the reader: **map f = foldr ((:) . f) [] (where (:) is the list cons operator).**

Using the map rewriting as a foldr, we’ll have:

concat . reverse . map reverse

= { function composition is associative }

concat . (reverse . map reverse)

= { map f function as a foldr }

concat . (reverse . foldr ((:) . reverse) [] )

We apply now the fusion theorem for the composition:

reverse . foldr ((:) . reverse) [] = foldr h b.

It’s clear that reverse is a strict function.

Also, b = reverse [] = []. So **b = [].**

The second condition of the theorem becomes (instantiated in our case):

**reverse (((:) . reverse) xs yss) = h xs (reverse yss) (1)**

**reverse (((:) . reverse) xs yss)**

= { function composition }

reverse ((:) (reverse xs) yss)

= { infix notation of (:) }

reverse(reverse xs : yss)

= { second equation of reverse’s definition }

**reverse yss ++ [reverse xs]**

This derivation, together with (1), gives us:

**reverse yss ++ [reverse xs] = h xs (reverse yss) **

Generalizing reverse yss (found in both sides of the equation) to the fresh variable zss, we have found our function h:

h xs zss = zss ++ [reverse xs].

Applying the fusion theorem, we have that:

**reverse . foldr ((:) . reverse) [] = foldr h [], where the function h is:**

**h xs zss = zss ++ [reverse xs]**

I’ll rewrite the function h above as h1, in order to not confuse notations in the following argument.

We have so the original equation, which became:

**concat . reverse . map reverse = concat . foldr h1 [], where h1 xs zss = zss ++ [reverse xs].**

We’ll try to apply again the fusion theorem and to write:

concat . foldr h1 [] = foldr h b.

We know that concat is a strict function, so the first condition of the theorem applies.

Let b = concat [] = []. **So b = [].**

**concat (h1 xs zss) = h xs (concat zss) (2)**

**concat (h1 xs zss) **

= { definition of h1}

concat(zss ++ [reverse xs])

= {simple to prove: concat (xss ++ zss) = concat xss ++ concat zss}

concat zss ++ concat [reverse xs]

= { it’s trivial to see that concat [xs] = xs}

**concat zss ++ reverse xs**

From this derivation and equation (2), we conclude that:

** h xs (concat zss) = concat zss ++ reverse xs**

Generalizing concat zss into the fresh variable ys, we have that:

h xs ys = ys ++ reverse xs

In this way, we have that concat . foldr h1 [] = foldr h [], where h xs ys = ys ++ reverse xs.

We have completed, in this way, the right-hand side of the original equation:

**concat . reverse . map reverse = foldr h [], where h xs ys = ys ++ reverse xs. (R2)**

But we also have the result obtained for the left-hand side:

**reverse . concat = foldr h [], where h xs zs = zs ++ reverse xs (R3)**

3. Comparing R2 and R3, it’s clear that they are equal. So the proposition holds:

**reverse . concat = concat . reverse . map reverse**

**End of Part 2.**

I know the exercise is a bit long and technical, but along the way we’ve shown some powerful techniques for finding functions that satisfy given properties and for proving equations using the foldr fusion theorem. These are mathematical tools that you can use for optimizing equations, for proving, and even for constructing new functions as folds (see more about this in [2]).

If you see an error in my calculations, I’m more than happy to hear your opinion.


**References**

[1] Bird, R. (2015). Thinking Functionally with Haskell. Cambridge University Press.

[2] Hutton, G. (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming.

As an aside, I believe the role of intuition and invention is important in our work, but I believe also the study of the mechanisms behind our thought processes is important too. I say this because, if we want to derive complex algorithms based on more than simple hunches and common sense considerations, we need reliable reasoning methods that can help us in the development of ambitious projects.

Also, I think Edsger Dijkstra was right: we need (as much as we can) to be very explicit about what we’re doing. Without this, the future of computing will be that of software systems resembling biological systems and black magic (in this sense, read the text of Leslie Lamport’s talk “The Future of Computing – Logic or Biology”).

************

All that being said, let’s proceed with our first example. In all that follows, I’ll work with Haskell lists [a], where “a” is a general type.

I’ll also use pattern matching and say that a list can be viewed as split into its head and its tail (x:xs). For example, the list [1,2,3] = 1:[2,3].

**Example 1. (Sum function) Given a finite list of numbers, write a function that computes their sum.**

**Derivation: **That looks simple enough and it’s appropriate for our more formal treatment. The first step is to use formal notations like in classic mathematics (using also Haskell notation). We have the following mathematical definition (**MD**):

sumfunc :: Num a => [a] -> a

sumfunc[x1,x2,…,xn] = x1 + x2 + … + xn. (**MD**)

The notation is a classical mathematical formalization that follows from the statement of the example 1.

The advantage of the technique is that the notation lets us express example 1 in a more concise manner. It’s clear that sumfunc associates, to a given finite list [x1,x2,…,xn], its sum.

The downside of the story is that sumfunc can’t be simply implemented in Haskell, because we don’t have a real rule of computation for its formula.

But the formula provides us with a lot more. We can manipulate the symbols x1,…,xn following the simple rule of associativity:

**Derivation**

sumfunc[x1,x2,…,xn]

= {definition of sumfunc}

x1 + x2 + … + xn

= {associativity of “+”}

x1 + (x2 + x3 + … + xn)

= {definition of sumfunc}

x1 + sumfunc[x2,x3,…,xn]

**End of Derivation**

——————————————-

**So, by simple derivation from the mathematical formalization, we obtained the formula E0:**

**sumfunc[x1,x2,…,xn] = x1 + sumfunc[x2,x3,…,xn].**

If we read this formula in Haskell terms, we have:

[x1,x2,…,xn] = x1:[x2,…,xn], and we note xs = [x2,…,xn].

In this way: sumfunc(x1:xs) = x1 + sumfunc xs,

Or equivalently, in the more conventional notation:

**sumfunc(x:xs) = x + sumfunc xs (E1)**

All is fine so far, but let’s observe that the recursive definition E1 shortens the list by one element at every step. The question is: how long does it take before the computation of sumfunc stops for a given list?

Let’s compute from the equation E0:

sumfunc[x1,x2,…,xn] = x1 + sumfunc[x2,x3,…,xn] = … = x1 + (x2 + (….(xn + sumfunc[]))) = {associativity of “+”} x1 + x2+ … + xn + sumfunc [] = {mathematical definition **MD**} x1 + x2 + … + xn.

We obtained, in this way, that: x1 + x2 + … + xn + sumfunc [] = x1 + x2 + … + xn. It follows immediately that **sumfunc [] = 0**. (**E2**)

Because every computation of sumfunc for a list ends in sumfunc [], we’re done.

We derived, in this way, the classical formula for the sum function on numerical lists:

**sumfunc :: Num a => [a] -> a**

**sumfunc [] = 0**

**sumfunc (x:xs) = x + sumfunc xs**

**End of example 1.**
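For completeness, here is the derived definition as a runnable snippet, together with a couple of evaluations (Prelude’s sum behaves the same way):

```haskell
-- The definition derived above, exactly as stated.
sumfunc :: Num a => [a] -> a
sumfunc []     = 0
sumfunc (x:xs) = x + sumfunc xs

main :: IO ()
main = do
  print (sumfunc [1,2,3,4] :: Integer)   -- 10
  print (sumfunc ([] :: [Integer]))      -- 0
```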

**Example 2. (Product function) Given a finite list of numbers, write a function that computes their product.**

**Derivation. **Left as an exercise for the reader. It’s very similar to the derivation for the sum function.

**End of example 2.**

**Example 3. (Map function) Given a function f :: a->b and a finite list, construct a function that produces a finite list whose elements are the mapping of function f on every element of the list given.**

**Derivation. **

Let’s formalize the statement of the problem. What we want to construct is a function (we’ll call it **mapfunc**) which works something like this:

**mapfunc( f, [x1,x2,…,xn] ) = [f x1, f x2, … ,f xn], for all f :: a-> b and x1,…,xn :: a (E0)**

This is the well-known function map, and it’s used in a lot of applications. As an example, if f(x) = x+2, where x is an element of a numeric type, then we have mapfunc(f,[2,6,7]) = [4,8,9].

All is well and beautiful, but we have to convert the equation E0 into an equivalent recursive one, which can be used for implementation in a programming language like Haskell. Following the pattern of derivation from example 1, we’ll have:

**Derivation. **

mapfunc(f,[x1,x2,…,xn])

= {definition of list operator “:”}

mapfunc(f,x1:[x2,…,xn])

= {notation: xs = [x2,…,xn]}

** mapfunc(f,x1:xs) (R1)**

= {equation E0}

[f x1,f x2,…,f xn]

= {definition of list operator “:”}

f x1 : [f x2, f x3, … , f xn]

= {from E0 we have: [f x2, … , f xn] = mapfunc(f,[x2,…,xn])}

f x1 : mapfunc(f,[x2,…,xn])

= {notation: xs = [x2,…,xn]}

** f x1 : mapfunc (f,xs) (R2)**

**End of derivation.**

We derived, in this way, R1 = R2, which gives the equation:

mapfunc(f,x1:xs) = f x1 : mapfunc (f,xs)

or, rewritten in the conventional notation:

**mapfunc(f,x:xs) = f x : mapfunc(f,xs). (E1)**

The next question is: mapfunc being a recursive function, when does its computation stop?

Let’s use again the mathematical formalization E0 and compute on a general list [x1,x2,…,xn]:

[f x1,…,f xn] = mapfunc(f,x1:[x2,…,xn]) = f x1 : mapfunc(f,[x2,…,xn]) = … = f x1 : (f x2 : … (f xn : mapfunc(f,[]))…).

So, we have [f x1, … , f xn] = f x1 : (f x2: … (f xn : mapfunc(f,[]))…).

But we have also, from the definition of list operator “:”, that [f x1, …, f xn] = f x1 : (f x2 : … (f xn : [])…).

In this way, we deduce that** mapfunc(f,[]) = []. (E2)**

**From the equations E1 and E2, we derived the well-known recursive function map (you can easily translate it in curried form):**

**mapfunc(f,[]) = []**

**mapfunc(f,x:xs) = f x : mapfunc(f,xs)**


**End of example 3.**
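The derived (uncurried) definition can be run as-is. A small sketch, using the mapfunc name from the derivation, with Prelude’s curried map for comparison:

```haskell
-- The uncurried map derived from E1 and E2; Prelude's map is its curried form.
mapfunc :: (a -> b, [a]) -> [b]
mapfunc (_, [])   = []
mapfunc (f, x:xs) = f x : mapfunc (f, xs)

main :: IO ()
main = do
  print (mapfunc ((+2), [2,6,7]) :: [Integer])          -- [4,8,9], as in the text
  print (mapfunc ((+2), [2,6,7]) == map (+2) [2,6,7 :: Integer])
```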

I hope these simple examples make clear that formalizing a problem can help us a lot in deriving algorithms. I’ll write some posts about constructing more complex functions; this one is intended to show, on simpler examples, how to derive recursive functions based on classical mathematics.

As a suggestion for reading, I recommend C.A.R. Hoare’s exceptional article “An Axiomatic Basis for Computer Programming”. It’s written in an imperative style, but the essence of the methods is applicable to functional programming too.

Also, you can read The Science of Programming (David Gries), very well written and with a lot of generally applicable principles of formal methods for computing science.



The example which I’ll discuss is derived from an implementation of the transpose function.

Let’s start with an informal definition: **the transpose of a matrix **M is the matrix MT which is derived very simply – every row of the matrix M becomes the column of the matrix MT.

**Example:**

Let M be the matrix:

1 2 3

4 5 6

Then the transpose MT is the matrix:

1 4

2 5

3 6

In order to formalize this, using Haskell, we’ll conceive **every matrix as a list of rows. **

In this way, **M = [ [1,2,3] , [4,5,6] ]** and **MT = [ [1,4], [2,5], [3,6] ]. **In other words, a matrix is a list of lists of elements from a generic type ‘a’ (for every matrix M, M::[[a]]).

That being said, we define the function transpose in the following way:

**> transpose :: [[a]] -> [[a]] **

**> transpose [xs] = [[x] | x <- xs] **

**> transpose (xs:xss) = zipWith (:) xs (transpose xss)**

Let’s denote **E1:** **transpose [xs] = [[x] | x <- xs] **and **E2**: **transpose (xs:xss) = zipWith (:) xs (transpose xss).**

The interesting part is that **the base case in this definition is the transpose of a matrix with a single row (transpose [xs])**, which (intuitively) is the natural base case, because we’ll transpose row by row and we have to make sure we know how to transpose a single row.

But what if we want to **replace the base case** (**transpose [xs] = [[x] | x <- xs]**) with **transpose []? **It’s not obvious at all how we can go about and give a definition for transpose [].

One of the ways we can think about it is that we need a definition for transpose [] that will enable us to recover the definition E1 for the transpose of a single row with the help of equation E2 . If we can recover E1 in this way, we’re done and transpose will work correctly.

With this idea in mind, we’ll rewrite the equation E1:

transpose [xs] = transpose (xs:[])

**= {definition of E2} zipWith (:) xs (transpose [])**

**= {definition of E1} [[x] | x <- xs]**

In this way, we arrive at the functional equation (in the **unknown transpose []**):

**E3: zipWith (:) xs (transpose []) = [[x] | x <- xs] **

Let’s reason a little element by element and say that:

**xs = [x1,x2,…,xn], where x1,..,xn are elements of the generic type a.**

** transpose [] = [xs1,xs2,…,xsn], where xs1,…,xsn are elements of [a].**

In this way, the equation E3 becomes:

**zipWith (:) [x1,…,xn] ([xs1,…,xsn]) = [[x1],[x2],…,[xn]]**

** <=>**

** [x1:xs1,x2:xs2,…,xn:xsn] = [[x1],[x2],…,[xn]]**

Because two lists are equal iff they are equal element by element, we have:

**x1:xs1 = [x1]; x2:xs2 = [x2]; … xn:xsn = [xn].**

From the definition of the operator (:) it’s clear that **xs1 = []; xs2 = []; … ; xsn = [] (for example: x1:[] = [x1], … xn:[] = [xn]).**

In this way, we arrived at the conclusion that:

transpose [] = [xs1,…,xsn] = [ [], [], …, [] ]. If we take into consideration that the list xs can be any list (so it can have any number of elements), transpose [] must be an infinite list of [], because zipWith will use only a finite number of empty lists from it (more precisely, it will use as many empty lists as there are elements in xs).

So,** transpose [] = repeat [] = [ [], [], [], [], …].**

In this way, (because we explicitly constructed transpose [] in order to recover transpose [xs]) we’re also assured that transpose [xs] will work as intended and the function transpose will be correct. Of course, you can do a more formal proof for this assertion.

We arrived at the new definition of transpose:

**> transpose :: [[a]] -> [[a]]**

**> transpose [] = repeat []**

**> transpose (xs:xss) = zipWith (:) xs (transpose xss)**
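As a quick check of the new definition (the function is renamed transpose' here only to avoid a clash with Data.List.transpose; otherwise it is exactly the definition above):

```haskell
-- The derived transpose with the empty-matrix base case.
transpose' :: [[a]] -> [[a]]
transpose' []       = repeat []
transpose' (xs:xss) = zipWith (:) xs (transpose' xss)

main :: IO ()
main = print (transpose' [[1,2,3],[4,5,6]])   -- [[1,4],[2,5],[3,6]]
```

Note that zipWith truncates to the length of its shorter argument, so the infinite repeat [] never causes non-termination on a non-empty matrix.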

We see now that equational reasoning provided a lot of heuristic guidance in finding the definition of transpose []. I think we can apply this kind of reasoning in other contexts too, where our intuition can’t reach a result instantly. A more systematic approach to problem solving, using mathematical techniques consistently, can enlarge our problem-solving skills in the long run.

If you have any question or comment to add, don’t hesitate!



Let’s introduce a little theory first. **A monoid** is a pair (M,o) where M is a set and “o” (**called the law of composition**) is a function o: M×M -> M satisfying the following axioms:

- **Associativity:** for all x, y, z in M, (x o y) o z = x o (y o z).
- **Identity element:** there exists an element e in M such that e o x = x o e = x, for every x in M.

As can be seen, associativity is a very important property: it says that the way we group compositions doesn’t matter, because the result remains the same. **This is a very important property, which leads to mathematical laws used in optimisation!**

You know for sure a lot of monoids in practice, but most of the time you didn’t call them that way. Let’s see some examples:

**Example 1. **The set of natural numbers (N = {0,1,2,3,4,…}) is a monoid under the addition learnt in school – the identity element is 0.

**Example 2. **The well-known boolean set **B = {False, True}** forms a monoid under the laws:

– **AND, with the identity True (True AND x = x AND True = x, where x can be True or False);**

– **OR, with the identity False (False OR x = x OR False = x, where x can be True or False).**

**Example 3. **This probably is one of the most important for computing, everyone uses it when working with lists. Lists form a monoid under concatenation, and the identity is the empty list (denoted by “[]”).
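We can check the monoid axioms on concrete elements of these three examples directly in Haskell (a simple sanity check, not a proof):

```haskell
-- Spot-checking the monoid axioms for the three examples above.
main :: IO ()
main = do
  -- naturals under addition, identity 0
  print (0 + 5 == (5 :: Int) && 5 + 0 == 5 && (2 + 3) + 4 == 2 + (3 + 4 :: Int))
  -- booleans under AND (identity True) and OR (identity False)
  print ((True && False) == (False && True) && (False || True) == (True || False))
  -- lists under concatenation, identity []
  print (([] ++ "abc") == "abc" && ("abc" ++ []) == "abc")
```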

Now is the moment to introduce the important notion of power.

**Power laws of monoids:** In a monoid (M,o), we denote by x^0 the 0th power of the element x. By convention, x^0 = e (the identity element of the monoid). We also define x^n = x o x o … o x (the composition applied n times) and, using associativity, it’s easily seen that **(x^n)^m = x^(n*m)** for all natural numbers n, m. It’s also easily seen that **x^(n+m) = x^n o x^m**.

These are two of the most important properties of monoids. We’ll employ them in finding an efficient solution for computing the exponent n of a given number x (denoted by x^n).

Let’s give our function the type exp1 :: (Integer,Integer) -> Integer. The goal is to compute exp1(x,n) = x^n.

**The first idea **would be to use the simple law: x^n = x*x^(n-1), which becomes exp1(x,n) = x*exp1(x,n-1).

That leads us to the first program, which is very inefficient (because we have to recurse through almost all the exponents up to n):

> exp1 :: (Integer,Integer) -> Integer

> exp1(x,n)

> | n == 0 = 1

> | n == 1 = x

> | otherwise = x*exp1(x,(n-1))

**The second idea is based on the powers laws of monoids:**

– if the exponent is even (n = 2m), we’ll have **exp1(x,2m)** = x^(2m) = (x^2)^m **= exp1(x^2,m) = exp1(x*x,m)**

– if the exponent is odd (n = 2m+1), we’ll have **exp1(x,2m+1)** = x^(2m+1) = x^(2m) * x = x * x^(2m) = x*((x^2)^m) **= x*exp1(x^2,m) = x*exp1(x*x,m)**

Thus, we arrive at the program:

> exp1 :: (Integer,Integer) -> Integer

> exp1(x,n)

> | n == 0 = 1

> | n == 1 = x

> | even n = exp1(x*x,m)

> | odd n = x*exp1(x*x,m)

> where m = div n 2

**The power of this solution is that, at every recursion step, n is halved, which leads to far fewer steps in the computation of the exponential function. The halving of n was made possible by systematically applying the power laws of monoids, which rest on the associativity of the composition law and the existence of the identity element.**
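As a sketch of a sanity check, we can compare exp1 with Haskell’s built-in power operator (^) on a range of sample inputs (the definition below repeats the fast version from above):

```haskell
-- Fast exponentiation via the monoid power laws, checked against (^).
exp1 :: (Integer, Integer) -> Integer
exp1 (x, n)
  | n == 0    = 1
  | n == 1    = x
  | even n    = exp1 (x*x, m)
  | otherwise = x * exp1 (x*x, m)
  where m = div n 2

main :: IO ()
main = print (and [ exp1 (x, n) == x ^ n | x <- [2,3,5], n <- [0..12] ])  -- True
```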

For more information about monoids and algebraic structures, and also about functional programming:

- You can read this very beautiful article of Bartosz Milewski: Free Monoids.
- You can read Graham Hutton’s: Programming in Haskell.
- You can read Richard Bird’s: Thinking Functionally with Haskell.


Think, for example, of this simple problem. Given two lists of numbers, how do you combine their elements in every possible way?

More concretely, you’re given the lists **[1,2,3] and [4,5]**. The task is to enumerate all the possible ways in which you can choose a single element from the first list to be paired with a single element from the second list. The pattern of choices is as in the following image:

In this way, we’ve constructed all the pairs needed: [1,4], [2,4], [3,4], [1,5], [2,5], [3,5]. This is called the** cartesian product of lists**. **Generally, given two sets A and B, the cartesian product AxB = {(a,b) | a is in A, b is in B}.**

**Why is the cartesian product of lists so important?**

Let’s say you work with a given matrix M and you want to find all the possible matrices (starting from M) with a given property. Take, for example, the matrix below (we’ll refer to it as the matrix **M**):

**1 0**

**0 4**

Let’s restrict M and say that 0 can be only 1,2,3 or 4. How do you proceed to generate all the matrices (starting from M) by instantiating 0 with the values from the list [1..4]?

A first solution would be to represent the matrix as an array of rows (which are also arrays). That suggests looping through the rows using indices, which is not very elegant.

A second solution would be to represent the matrix as a list of rows (which are also lists). In this case, we have the opportunity to use the cartesian product of lists to generate all the possible choices. This is more elegant and much easier to manipulate mathematically.

More concretely, the matrix M is encoded like this: M = [ [1,0], [0,4] ] (the list of rows). In order to be able to use the cartesian product of lists, we need to represent every possible choice like this:

- a digit d <> 0 will become a singleton list [d].
- a digit d = 0 will become the list of choices [1,2,3,4] = [1..4].

In this way we’ll have the matrix S = [ [ [1], [1..4] ], [ [1..4], [4] ] ]. We can now distinguish the internal elements:

**L1 = [ [1], [1..4] ] and L2 = [ [1..4], [4] ].**

**If we apply the cartesian product to L1 and to L2, we obtain all the possible rows; applying the cartesian product again to those results, we obtain all the possible matrices derived from M, with the restriction that 0 can be any element of [1..4].**

For example, cartesian product applied to L1 will result in L11 = [ [1,1], [1,2] ,[1,3], [1,4] ]. Doing the same with L2, we obtain L21 = [ [1,4], [2,4], [3,4], [4,4]].

Now we only have to apply the cartesian product to L11 and L21, and we obtain all the matrices derived from M with the restriction that 0 can only be from the list [1..4]:

- choice 1: M1 = [ [1,1], [1,4] ]
- choice 2: M2 = [ [1,2], [1,4] ]
- choice 3: M3 = [ [1,3], [1,4] ]
- etc.

**How to construct the cartesian product of lists?**

Let’s start, again, with an example. We’ll denote cartesian product on lists by cartprod.

cartprod [ [2], [1,3]] = [ [2,1], [2,3] ]

But how to construct cartprod [ [1,2], [2], [1,3] ]? From cartprod [ [1,2], [2], [1,3] ] = [ [1,2,1], [1,2,3], [2,2,1], [2,2,3] ] we observe that basically what we’ve done is the following:

- take 1 from the [1,2] and prefix to the elements of cartprod [ [2], [1,3]]
- take 2 from the [1,2] and prefix to the elements of cartprod [ [2], [1,3]]

That being said, we can formalize it with a list comprehension:

cartprod :: [[a]] -> [[a]]

cartprod [] = [[]]

cartprod (xs:xss) = [x:ys | x <- xs, ys <- yss]

  where yss = cartprod xss

This definition says nothing more than: split the list into its head and tail (xs = [1,2] is the head and xss = [ [2], [1,3] ] is the tail), compute the cartesian product of the tail as yss, and prefix each element x of the head to each element of yss.
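Here is the definition again as a runnable snippet, evaluated on the examples from above:

```haskell
-- Cartesian product of lists: all ways of picking one element per list.
cartprod :: [[a]] -> [[a]]
cartprod []       = [[]]
cartprod (xs:xss) = [ x:ys | x <- xs, ys <- yss ]
  where yss = cartprod xss

main :: IO ()
main = do
  print (cartprod [[2],[1,3]])         -- [[2,1],[2,3]]
  print (cartprod [[1,2],[2],[1,3]])   -- [[1,2,1],[1,2,3],[2,2,1],[2,2,3]]
```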

**As a final remark, the cartesian product of lists is a very powerful method for implementing exhaustive search, so it’s worth knowing and understanding it!**

This blog post is a more elaborate explanation of a part of Richard Bird’s Sudoku solver. If you want to read more about it, you can find it in Richard Bird’s book, Thinking Functionally with Haskell.


The question is: where is the living mathematics? Eric Temple Bell answers this question in his masterpiece, The Development of Mathematics. Mathematics isn’t about formulas and equations (although it incorporates this form of expression). Mathematics is about methods of thinking concerning certain problems, so effective that all sciences use it for expressing their theories. Mathematics is about ideas and precise thinking, arrived at through centuries of clumsy techniques.

Bell tells us also that living mathematics is not the exact science perceived from the outside world. Its methods are not given from above, there’s no theology or metaphysics involved in it. Beneath all the clear cut thinking there is a lot of struggling from great men to understand natural and abstract phenomena.

From the empirical tests the Babylonians used in measuring particular areas of certain fields, to Pythagoras with his famous theorem, there is a long and hard way. No one knows how, from this empirical mindset, which was a kind of submathematical activity, the idea of proof came about. The human mind evolved and needed proof for facts: proof that some measurements from practice hold for large sets of numbers, proof that in a right triangle there exists a definite relationship between the hypotenuse and the other two sides (the square of the hypotenuse is equal to the sum of the squares of the other two sides).

And, most of all, the Greek Euclid gave us the most efficient tool of reasoning: the postulational method. The method is simple – you start from postulates, generally accepted statements, and deduce the consequences. It’s the most powerful method known to humankind for reasoning about mathematical objects and nature.

Compared with the so-called informal method and with intuition, the postulational method is so direct and straightforward that no one can argue with its efficiency in proving statements.

Reinforced by David Hilbert in the 20th century, the postulational method became more powerful than ever. And, along with it and helping it, the abstract method came to life. The abstract method followed the detailed work of 19th-century mathematics, encompassing it in simple, beautiful and elegant results that generalized what was known before. The abstract method is the method of finding the essence through the details (addition of integers shares the same properties of associativity, neutral element and commutativity with the addition of matrices over commutative rings, for example). Before the abstract method, the connections between mathematical objects were not clearly seen and computations lacked elegance. Emmy Noether, the woman who pushed abstract thinking into the front lines of living mathematics, said that her computation-filled dissertation was “crap”. After the dissertation she devoted herself to abstract thinking, simplifying much of what was known before her and discovering new and profound results.

After all, in mathematics we discover the history of the failure and success of human reason, with the most outstanding methods of thinking still being used today. It’s a history of making order from otherwise disorderly objects. It’s a method of understanding the very nature of physical reality and of pure thought, while struggling with the unknown and the confusion. It’s about the human mind and its reasoning techniques. Formulas are just the language, but the ideas behind them are the life of mathematics.

Eric Temple Bell’s book seems to leave us with a lot of questions about ourselves, one being: how is the human race capable of abstract thinking, and how did abstract thinking appear? After all, using abstract thinking, we can understand a lot of things about this vast and intricate universe with our limited skulls. For Bell (and for me), it’s a wonder that we can think so efficiently. Sometimes.

You can find this book here: The Development of Mathematics.

P.S. Let me know what you think of the book when you’re done.



A first example of a law is **map fusion**: **given any two composable functions f and g, map (f . g) = map f . map g** (by the “.” sign we mean function composition). You can find more about the map function in my earlier post: A little bit of recursion – the map function.

The proof of this law is left as an exercise. You can prove it element by element, applying map (f . g) to a general list [a1,…,an], where a1,…,an are elements of a fixed type a.

**Remark. We observe that the fusion law for the map function says that you can traverse a list once, applying f . g to every element, or you can traverse the whole list applying g and then traverse the output list again applying f. Fusing two traversals into one gives a real gain in efficiency (think of a list with several million elements, or even more).**
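To see the fusion law in action on a concrete list, we can compare the fused (one-traversal) and unfused (two-traversal) versions; a small sketch, where the names fused and unfused are mine:

```haskell
-- Both sides of the map fusion law, for f = (*2) and g = (+1).
fused, unfused :: [Int] -> [Int]
fused   = map ((*2) . (+1))     -- one traversal
unfused = map (*2) . map (+1)   -- two traversals

main :: IO ()
main = do
  print (fused [1..5])                    -- [4,6,8,10,12]
  print (fused [1..5] == unfused [1..5])  -- True
```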

That being said, we are asked to prove the following equation: **cross(map f,map g) . unzip = unzip . map (cross(f,g)).**

We are given the definitions:

- **cross function definition:** cross(f,g) = fork(f . fst, g . snd)
- **unzip function definition:** unzip = fork(map fst, map snd)
- **fork function definition:** fork :: (a->b , a->c) -> a -> (b,c); fork (f,g) x = (f x, g x)
- **fst function definition:** fst :: (a,b) -> a; fst(x,y) = x
- **snd function definition:** snd :: (a,b) -> b; snd(x,y) = y

We are given the following equational laws:

- **Law 1 (cross and fork composition by “components”):** cross(f,g) . fork(h,k) = fork(f . h, g . k)
- **Law 2 (fork fusion law):** fork(f,g) . h = fork(f . h, g . h)
- **Law 3:** fst . cross(f,g) = f . fst
- **Law 4:** snd . cross(f,g) = g . snd

We restate what we have to prove:

**Lemma. cross(map f,map g) . unzip = unzip . map (cross(f,g))**

**Proof. **

**cross(map f, map g) . unzip**

= {**using unzip function definition**} cross(map f, map g).fork(map fst, map snd)

= {**using Law 1 of cross and fork composition**} fork(map f . map fst, map g . map snd)

= {**using the map fusion law**} fork(map(f . fst), map(g . snd))

= {**using Law 3 and Law 4**} fork( map(fst . cross(f,g)) , map(snd . cross(f,g)) )

= {**using the map fusion law**} fork( map fst . map (cross (f,g)) , map snd . map (cross (f,g)) )

= {**using Law 2 – fork fusion law**} fork(map fst, map snd) . map (cross (f,g))

={**unzip definition**}

**unzip . map (cross (f,g))**

**End of Proof.**
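The lemma can also be sanity-checked numerically. The sketch below uses the definitions from the text, with unzip renamed to unzip' to avoid the clash with Prelude’s unzip, and with f and g chosen arbitrarily:

```haskell
-- fork, cross and unzip as defined in the text, plus a numeric check.
fork :: (a -> b, a -> c) -> a -> (b, c)
fork (f, g) x = (f x, g x)

cross :: (a -> b, c -> d) -> (a, c) -> (b, d)
cross (f, g) = fork (f . fst, g . snd)

unzip' :: [(a, b)] -> ([a], [b])
unzip' = fork (map fst, map snd)

main :: IO ()
main = print (lhs == rhs)   -- True
  where
    input = [(1,2),(3,4),(5,6)] :: [(Int, Int)]
    f = (+10); g = (*2)
    lhs = (cross (map f, map g) . unzip') input
    rhs = (unzip' . map (cross (f, g))) input
```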

As you can see, equational reasoning is a very powerful method for proving equational laws. And sometimes, as with the map fusion law, it can lead to optimizations that in other languages are almost impossible to see or prove.

