In this post I’ll show you how you can practically apply formal methods to obtain a correct algorithm to solve a problem. The chosen problem is a challenge posted on CodeEval:

Self Describing Numbers

Simply put, a self describing number is a number that has the property the digit found in a position is the same with the number of times that the digit is found in the number. An example will clarify the definition:

Example

1210 is viewed as follows: 1 is in the 0 -position, 2 is in the 1-position, 1 is in the 2-position, and 0 is in the 3-position

1210 is a self describing number because:
– in the position 0 we have the value 1, and 0 appears in 1210 exactly one time;
– in the position 1 we have the value 2, and 1 appears in 1210 exactly 2 times;
– in the position 2 we have the value 1, and 2 appears in 1210 exactly 1 time;
– in the position 3 we have the value 0, and 3 doesn’t appear in 1210 , which is the same with saying that 3 appears in 1210 exactly 0 times;

In conclusion, 1210 is a self describing number.

End of Example

To make the most out of formal methods (as I’ll exemplify in the solving of the current problem), you must learn a simple but powerful principle: you don’t have to formalize all the problem in order to solve it, but you must formalize the hardest part, in order to be sure you don’t introduce unexpected bugs. This is the clear way which will develop in you, as a programmer, the will and skill of doing programming with the goal with not introducing bugs in your software at all.

Back to our problem. The single thing that must be formalized in this problem is the definition of a self describing number, which will lead the way of developing an algorithm to solve it.

Observations

1. Let’s view a number like 1210 like an array of integers, in this way:
d[0,3] is an array of integers that represents 1210 in the sense that d = 1, d = 2, d = 1, d = 0. In this way, we can view the index positions as the indexes of the array d, and we can formalize our definition.

2. I introduce the following notation to designate the number of times d[i] appears in the array d[0,n-1]:

for a given i fixed, card{d[i]: d[i] is in d[0,n-1]}, where {d[i]: d[i] is in d[0,n-1]} simply means the “bag” (a set which allows duplicates) that contains all d[i] from d[0,n-1], and card means the number of the elements of that “bag”. We can formalize further the definition of card, using another notation from David Gries book “The science of programming”, but my notation is sufficiently clear for the purpose of solving this problem.
(Remark, you’ll find the concept of bag or multiset in the notions of Databases ans SQL Queries returning duplicate records. The records returned by an SQL query is nothing more than a bag. End of Remark)

End of Observations

Now, let’s proceed to the algorithm that decides a number is a self describing number or not.

A. The formalization of the definition of a self describing number

A number with n digits (which we view represented in the form of an array of integers d[0,n-1], where d = the first digit of the number, d is the second digit of the number, etc) is self describing number if:

For all 0<=i<=n-1 (card{i: i is in d[0,n-1]} = d[i])

B. The construction of the algorithm

The construction of the algorithm from the formalized definition comes very naturally now.

1. Let’s observe that the expression for all 0<=i<=n-1 in the formalized definition indicates that we must loop through all the indexes of the array d, so I’ll choose a for loop to do that.

2. The beauty of our formalized definition is in the fact that we can put it in the negated form:
A number (described like in definition by the array of integers d[0,n-1]) is not a self describing number if and only if exists 0<=i<=n-1 such that card{i: i is in d[0,n-1]} <> d[i].
This is the principle based on which we can decide (when we’re looping through the digits of the number) if a number is self describing or not.

We’ll start from the premise that the number is self describing and we’ll give the algorithm in pseudo-code and in C# implementation:

a) pseudo-code

bool is_self_describing_number = true;

for all 0<=i<=n-1
{
Step 1. Compute x = card{i: i is in d[0,n-1]}
Step 2. If x<>d[i], then is_self_describing_number = false
}

b) c# code

And my solution in C# for the challenge Self Describing Numbers is the following:

Sugestion: the problem is not solved in optimal time, but the scope is the correctness of the algorithm.
The problem can be solved with a single loop through the elements of the array d[0,n-1]. Please feel free to post an optimal solution!

If you want to learn more about constructing correct software, you can check these exceptional books, written in a clear and understandable style:

1. The father of the shortest path algorithm,
Edsger W. Dijkstra – A Discipline of Programming 