A Brief Forage into Functional Thinking

Seth Juarez12/13/2013

I have long considered software engineering to be a craft that must be honed over years of careful study and precise implementation. What we learn in the labs of school and contrived examples are indeed important but only when carefully used in our day to day work. As such my challenge to you, dear reader, is to try to find at least one application where thinking functionally might assist you in your daily endeavors to produce excellent software.

Years ago I took a graduate class in functional programming. The professor allowed each student to implement a series of exercises in a functional language of their choice. At first my imperative background clashed a great deal with this new (at least to me) approach to writing software. Finally the thought occurred to me: "think of functions as data types." Now while this revelation is indeed a simple one, it completely altered my approach to solving problems in software. Here we will look at the evolution of functions as data types in C# as well as a series of practical applications that produce more readable and concise code.

Delegates

There have been many approaches to describing delegates. I think the best and simplest is as follows:

A delegate declaration simply defines a new data type that is inherently a deferred unit of work.

Consider the following line of code:

int x = 3;


There are three basic parts to this statement: the data type, the variable name, and the assignment. While the following code is a bit more complex, the same basic elements also apply.

delegate int doWork(int x, int y);

static int add(int x, int y)
{
return x + y;
}

static void Main(string[] args)
{
}


The penultimate line is the one that is most similar to them simple integer declaration. We have:

1. The Data Type is doWork,
2. The variable name is work, and
3. The assigned value is called add

Notice that while the first and third elements are a bit different, the basic concepts are still very much the same.

Functions at their core have a basic signature that includes the return type, the parameters, and a name. Delegates are just a definition of a new function type. The challenge I had early on was thinking of functions as nouns rather than verbs. Using intellisense one can readily see that the variable work from above has a series of properties and methods (one of which includes executing the function). Just like objects, delegates can even be passed around as parameters to other functions. Above we see that we have declared a new delegate that returns and integer, takes two integers as parameters, and is called doWork.

The function (or method) add from above matches the delegate doWork in both the return type as well as in the required parameters. As such, creating a new variable called work and assigning it the value add is perfectly legal. Change any two aspects in the function add and the compiler is sure to complain. Now that we have this object called work we can treat it just like any other object.

Lambda Expressions

Lambda expressions are simply syntactic sugar over the same ideas from above. Below we see the exact functionality from above using a lambda expression:

static void Main(string[] args)
{
Func<int, int, int> work = (x, y) => x + y;
}


Instead of calling the delegate doWork, we use the built in generic delegate called Func. The basic premise behind this built in delegate is that parameters are the first set of types used in the generic declaration and the last is the return type. A similar delegate called Action is used when the function has no return type. The assignment portion has the actual lambda expression. The first portion provides names to the parameters used in the function. The arrow delineates the beginning of the actual function and the final section specifies the type of work performed. Notice that in this case there is no need to use the keyword return as this is implicit in the delegate itself. In the case of multi-line lambda expressions one need only use curly braces to begin and end the function declaration and specify the return value at the end.

An Example

I am currently working on an open source machine learning library. In it there is a considerable amount of linear algebra used to implement many of these algorithms. Oftentimes this requires the ability to slice a matrix based upon certain properties of the row or column vectors. Standard implementation would simply dictate one create a number of for loops to evaluate the appropriate criteria and then reassemble the correct matrix. Consider a more functional approach:

public Matrix this[Func<vector , bool> f, VectorType t]
{
get
{
int count = 0;
if (t == VectorType.Row)
{
// Part 1
for (int i = 0; i < Rows; i++)
if (f(this[i, t]))
count++;

// Part 2
Matrix m = new Matrix(count, Cols);
int j = -1;
for (int i = 0; i < Rows; i++)
if (f(this[i, t]))
m[++j, t] = this[i, t];

return m;
}
else ...
}
}


The usefulness of this approach is not readily apparent until it is actually used:

Matrix m = new[,]
{{ 1, 2, 3},
{ 4, 5, 6},
{ 7, 8, 9},
{ 4, 9, 1}};

Matrix col1 = new[,]
{{ 1, 3},
{ 4, 6},
{ 7, 9},
{ 4, 1}};

Assert.AreEqual(col1, m[v => v[0] != 2, VectorType.Column]);


Let’s first talk about the usage and then move on to the implementation. The example shows a matrix m and a matrix called col1. The last line simply instructs the matrix to construct a new matrix whose columns (second parameter) do not have their first member equal to 2 (first parameter). The beauty of doing it this way is that one can construct any number of new matrices by defining a deferred unit of execution that can be subsequently passed in to the matrix. The imperative alternative is to create a set of nested for loops every time a slice operation is required.

The implementation is quite simple and can be divided into two parts:

1. Figure out the new size of the matrix, and
2. Build the new matrix

Those two parts are clearly labeled in the implementation above. The delegate we are passing in takes a vector and returns a boolean. In essence its basic job is to tell us which elements to keep. The second parameter tells us whether we are working in the column or row space of the matrix. This minor change has allowed users of the machine learning library to more concisely work with matrices and vectors.

Conclusion

While the things we discussed are not earth shattering, keeping this main concept in mind can definitely change the way you have implemented any number of algorithms. Generally these kinds of approaches yield a more concise and compact representation of what we would have otherwise have implemented. I would love to see how! Drop me a line if you have found a piece of existing code that benefits from this slight shift in thinking.