The following are some personal notes on how a study of classical mathematical notations can help us to derive algorithms. The functions obtained in this way are simple enough to understand and clarify the technique used.

As an aside, I believe the role of intuition and invention is important in our work, but I believe also the study of the mechanisms behind our thought processes is important too. I say this because, if we want to derive complex algorithms based on more than simple hunches and common sense considerations, we need reliable reasoning methods that can help us in the development of ambitious projects.

Also, I think Edsger Dijkstra was right: we need (as much as we can) to be very explicit about what we’re doing. Without this, the future of computing will be that of software systems resembling biological systems and black magic (in this sense, read the text talk of Leslie Lamport – “The Future of Computing – Logic or Biology”).


All that being said, let’s proceed with our first example. In all that follows, I’ll work with Haskell lists [a], where “a” is a general type.

I’ll use also pattern matching and say that a list can be viewed as split into its head and its tail (x:xs). For example, the list [1,2,3] = 1:[2,3].

Example 1. (Sum function) Given a finite list of numbers, write a function that computes their sum.

Derivation: That looks simple enough and it’s appropriate for our more formal treatment. The first step is to use formal notations like in classic mathematics (using also Haskell notation). We have the following mathematical definition (MD)

sumfunc :: Num a => [a] -> a
sumfunc[x1,x2,…,xn] = x1 + x2 + … + xn. (MD)

The notation is a classical mathematical formalization that follows from the statement of the example 1.

The advantage of the technique is the notation enabled us to express example 1 in a more concise manner. It’s clear that sumfunc associates, for a given finite list [x1,x2,…,xn], its sum.

The downside of the story is that sumfunc can’t be simply implemented in Haskell, because we don’t have a real rule of computation for its formula.

But the formula provides us with a lot more. We can manipulate the symbols x1,…,xn following the simple rule of associativity:


= {definition of sumfunc}
x1 + x2 + … + xn
= {associativity of “+”}
x1  + (x2 + x3 + … + xn)
= {definition of sumfunc}
x1 + sumfunc[x2,x3,…,xn]

End of Derviation


So, by simple derivation from the mathematical formalization, we obtained the formula E0:

sumfunc[x1,x2,…,xn] = x1 + sumfunc[x2,x3,…,xn].

If we regard this formula like in Haskell, we have:

[x1,x2,…,xn] = x1:[x2,…,xn], and we note xs = [x2,…,xn].

In this way: sumfunc(x1:xs)  = x1 + sumfunc xs,

Or equivalently, in the more conventional notation:

sumfunc(x:xs) = x + sumfunc xs (E1)

All it’s fine by now, but let’s observe that the recursive definition E1 reduces, at every step, the list with one element. The question is, how long does it take before it stops the computation of sumfunc for a given list?

Let’s compute from the equation E0:

sumfunc[x1,x2,…,xn] = x1 + sumfunc[x2,x3,…,xn] = … = x1 + (x2 + (….(xn + sumfunc[]))) = {associativity of “+”} x1 + x2+ … + xn + sumfunc [] = {mathematical definition MD} x1 + x2 + … + xn.

We obtained, in this way, that: x1 + x2 + … + xn + sumfunc [] = x1 + x2 + … + xn. It follows immediately that sumfunc [] = 0. (E2)

Because every computation of sumfunc for a list ends in sumfunc [], we’re done.

We derived, in this way, the classical formula for the sum function on numerical lists:

sumfunc :: Num a => [a] -> a
sumfunc [] = 0
sumfunc (x:xs) = x + sumfunc xs

End of example 1.

Example 2. (Product function) Given a finite list of numbers, write a function that computes their product.

Derivation. Left as an exercise for the reader. It’s very similar with the derivation for the sum function.

End of example 2.

Example 3. (Map function) Given a function f :: a->b and a finite list, construct a function that produces a finite list whose elements are the mapping of function f on every element of the list given.


Let’s formalize the statement of the problem. What we want to construct is a function (we’ll call it mapfunc) which works something like this:

mapfunc( f, [x1,x2,…,xn] ) = [f x1, f x2, … ,f xn], for all f :: a-> b and x1,…,xn :: a (E0)

This is the well-known function map, and it’s used in a lot of applications. As an example, if f(x) = x+2, where x is an element of a numeric type, then we have mapfunc(f,[2,6,7]) = [4,8,9].

All is well and beautiful, but we have to convert the equation E0 into an equivalent recursive one, which can be used for implementation in a programming language like Haskell. Following the pattern of derivation from example 1, we’ll have:


= {definition of list operator “:”}
= {notation: xs = [x2,…,xn]}
    mapfunc(f,x1:xs) (R1)
= {equation E0}
[f x1,f x2,…,f xn]
= {definition of list operator “:”}
f x1 : [f x2, f x3, … , f xn]
= {from E0 we have: [f x2, … , f xn] = mapfunc(f,[x2,…,xn])}
f x1 : mapfunc(f,[x2,…,xn])
= {notation: xs = [x2,…,xn]}
   f x1 : mapfunc (f,xs) (R2)

End of derivation.

We derived, in this way, R1 = R2, which gives the equation:

mapfunc(f,x1:xs) = f x1 : mapfunc (f,xs)

or, rewritten in the conventional notation:

mapfunc(f,x:xs) = f x : mapfunc(f,xs). (E1)

The next question is: being a recursive function, how long does mapfunc compute an element?

Let’s use again the mathematical formalization E0 and compute on a general list [x1,x2,…,xn]:

[f x1,…,f xn] = mapfunc(f,x1:[x2,…,xn]) = f x1 : mapfunc(f,[x2,…,xn]) = … = f x1 : (f x2 : … (f xn : mapfunc(f,[]))…).

So, we have [f x1, … , f xn] = f x1 : (f x2: … (f xn : mapfunc(f,[]))…).

But we have also, from the definition of list operator “:”, that [f x1, …, f xn] = f x1 : (f x2 : … (f xn : [])…).

In this way, we deduce that mapfunc(f,[]) = []. (E2)

From the equations E1 and E2, we derived the well-known recursive function map (you can easily translate it in curried form):

mapfunc(f,[]) = []
mapfunc(f,x:xs) = f x : mapfunc(f,xs)

End of example 3.

I hope these simple examples make clear that formalizing a problem can help us a lot in deriving algorithms. I’ll write some posts about constructing more complex functions, this one is intended to exemplify on simpler examples how to derive recursive functions based on classical mathematics.

As a suggestion for reading, I recommend the exceptional article of C.A.R.Hoare: “An Axiomatic Basis for Computer Programming”. It’s written in imperative style, but the essence of the methods is applicable to functional programming too.

Also, you can read The Science of Programming (David Gries), very well written and with a lot of general applicable principles of formal methods for computing science.


Constantin Liculescu


Share Button