Function application in la-la land

Here’s a familiar scenario for a programmer: You have some useful function that you would like to apply to some values. It could be concat that concatinates two strings, or add that adds two integers, or cons which prepends an element to a list, or truncate which cuts off a string at the specified length, or indeed any old function f you come up with which takes a bunch of arguments and produces some result.

Simple, right? But there’s a twist! The values you’d like to apply your function to are all trapped in la-la land! And once you have values in la-la land, it’s not obvious how you’d go about getting them out of there. It really depends on the kind of la-la land your values are in. It’s sort of like being trapped in the afterlife. You might be able to return to the land of the living, but it’s not trivial. Certainly not something you’d want your pure, innocent function to have to deal with!

You might wonder how the values ended up in la-la land in the first place. In many cases, they were born there. They are la-la land natives – it’s the only existence they’ve ever known. It sounds weird, but it’s surprisingly common. Indeed, many programs contain not one but several distinct la-la lands, each with their own peculiar laws and customs! Some familiar la-la lands in the .NET world include Task, Nullable and List.

Since la-la lands are so pervasive in our programs, clearly we need to be able to apply functions to the values that dwell there. Previously we’ve seen that if your la-la land is a Functor, there is a Map function that lets us do that. But there is a problem: Map cannot work with any of the functions I mentioned above. The reason is that they all take more than one argument. Map can transform a single value of type T1 inside la-la land to a single value of type T2 inside la-la land. What Map does is teleport the T1 value out of la-la land, apply your function to obtain a T2 value, and teleport that back into la-la land. You can of course map multiple times, but you’ll still involving just one la-la land value at a time. So that’s not going to work.

What alternatives do we have? Well, one idea that springs to mind is partial application. If we had a curried function, we could apply it to the la-la land values one by one, producing intermediate functions until we have the final result. For instance, say we have a curried version of add which looks like this:

Func<int, Func<int, int>> add = a => b => a + b;

Now we have a single-argument function that returns a single-argument function that returns a value. So we can use it like this:

Func<int, Func<int, int&gt> add = a => b => a + b;
Func<int, int> incr = add(1);
int four = incr(3);

Unfortunately, this still won’t work with Map. What would happen if we passed the curried add to Map? We would get an incr function stuck inside of la-la land! And then we’d be stuck too. But what if we replaced Map with something that could work with functions living in la-la land? Something like this:

Lala<TR> Apply<T, TR>(Lala<Func<T, TR>> f, Lala<T> v);

What Apply needs to do is teleport both the function and the value out of la-la land, apply the function to the value, and teleport the result back into la-la land.

How would this work with our curried add function? Well, first we’d need to teleport the function itself into la-la land. For this, we need a function, which we’ll call Pure. It looks like this:

Lala<T> Pure<T>(T val);

In other words, Pure is a one-way portal to la-la land.

Let’s see how this would work for our curried add function:

static Lala<int> AddLalaIntegers(Lala<int> a, Lala<int> b) 
{
    Func<int, Func<int, int>> add = a => b => a + b;
    Lala<Func<int, Func<int, int>>> lalaAdd = Lala.Pure(add);
    Lala<Func<int, int>> lalaPartial = Lala.Apply(lalaAdd, a);
    Lala<int> lalaResult = Lala.Apply(lalaPartial, b);
    return lalaResult;    
}

Success! Who would have thought?

Well, someone, obviously. It turns out that la-la lands that support Pure and Apply are known as Applicative.

But there are still questions worth asking, such as: How do we implement these functions? Like Map, Pure and Apply must obey the laws of the particular la-la land they work with. We’re going to look at two examples in C#.

First, consider the la-la land known as Task<T>.

public static class TaskApplicative 
{
    public static Task<T> Pure(T val) 
    {
        return Task.FromResult(val);
    }

    public static async Task<TR> Apply<T, TR>(
        this Task<Func<T, TR> funTask, 
        Task<T> valTask)
    {
        var fun = await funTask;
        var val = await valTask;
        return fun(val);
    }
}

Awaiting the tasks bring them out of Task-land, and the return value is automatically transported back by the async machinery.

Second, imagine a type called Mayhaps<T>. Mayhaps is like Nullable, but it works on any type T. Why is this important? Because delegates are reference types, which means they can’t be put inside a Nullable. In other words, functions are not allowed into the la-la land that is Nullable. So Mayhaps it is.

Mayhaps has two possible values, Indeed and Sorry. Indeed holds a value, Sorry does not. That’s really all you need to know about Mayhaps. (For implementation details, look here.)

Here are Pure and Apply for Mayhaps:

public static class MayhapsApplicative
{
    public static Mayhaps<TR> Pure<TR>(TR v)
    {
        return Mayhaps<TR>.Indeed(v);
    }

    public static Mayhaps<TR> Apply<T, TR>(
        this Mayhaps<Func<T, TR>> mayhapsFunction,
        Mayhaps<T> mayhapsValue)
    {
        if (mayhapsFunction.HasValue && mayhapsValue.HasValue)
        {
            var fun = mayhapsFunction.Value;
            var val = mayhapsValue.Value;
            return Mayhaps<TR>.Indeed(fun(val));
        }
        else
        {
            return Mayhaps<TR>.Sorry;
        }
    }
}

The semantics of Mayhaps is to propagate Sorry – you can only get a new Indeed if you have both a function and a value.

And of course the nice thing now is that we can separate our logic from the idiosyncracies of each la-la land! Which is pretty great.

But I’ll admit that we’re currently in a situation where calling a function is a little bit involved and awkward. It’s involved because there’s quite a bit of boilerplate, and it’s awkward because working with curried functions and partial application isn’t necessarily the bread and butter of C# programming. So let’s write some helper functions to alleviate some of that pain.

We can start by writing functions to curry Funcs, which should reduce the awkward. There are quite a few of them; here’s an example that curries a Func with four input parameters:

public static Func<T1, Func<T2, Func<T3, Func<T4, TR>>>> Curry<T1, T2, T3, T4, TR>(
    this Func<T1, T2, T3, T4, TR> f)
{
    return a => b => c => d => f(a, b, c, d);
}

We can use it like this:

Func<int, int, int, int, int> sirplusalot = 
    (a, b, c, d) => a + b + c + d; 
Func<int, Func<int, Func<int, Func<int, int>>>> = 
    sirplusalot.Curry();

A little less awkward. What about involved? We’ll define some helper functions to reduce the boilerplate. The idea is to use a function Lift to handle pretty much everything for us. Here is one that can be used with sirplusalot:

public static Lala<TR> Lift<T1, T2, T3, T4, TR>(
    this Func<T1, T2, T3, T4, TR> f,
    Lala<T1> v1,
    Lala<T2> v2,
    Lala<T3> v3,
    Lala<T4> v4)
{
    return Pure(f.Curry()).Apply(v1).Apply(v2).Apply(v3).Apply(v4);
}

Note that all Lift functions will have the same structure, regardless of which la-la land they operate in. Only the implementations of Pure and Apply will vary.

And now we can implement functions that look like this:

private async static Task<int> Plus(
    Task<int> ta, 
    Task<int> tb, 
    Task<int> tc, 
    Task<int> td) 
{ 
    Func<int, int, int, int, int> sirplusalot = 
        (a, b, c, d) => a + b + c + d;
    return await sirplusalot.Lift(ta, tb, tc, td);
}

private static Mayhaps<int> Plus(
    Mayhaps<int> ma, 
    Mayhaps<int> mb, 
    Mayhaps<int> mc, 
    Mayhaps<int> md)
{
    Func<int, int, int, int, int> sirplusalot = 
        (a, b, c, d) => a + b + c + d;
    return sirplusalot.Lift(ma, mb, mc, md);
}

Which is quite nice? Yes?


How to reduce bunches of things

So there you are, a pragmatic C# programmer out to provide business value for your end users and all that stuff. That’s great.

One of the (admittedly many) things you might want to do is reduce a bunch of things of some type into a single thing of that type. For instance, you might want to add a bunch of numbers together, or concatinate a bunch of strings and so on. How would you do that? (Assuming there’s no built-in Aggregate method available, that is.) Well, you’d write a Reduce function, right? And since we haven’t specified in advance what kinds of things we should reduce, we better make it generic. So it could work on an IEnumerable<T> of things.

Now how should the actual reduction take place? An obvious idea is to do it stepwise. It’s both a good problem solving strategy in general, and kind of necessary when dealing with an IEnumerable. For that to work, though, you need some way of taking two values and combining them to produce a single value. So Reduce needs to be a higher-order function. The caller should pass in a combine function, as well as some initial value to combine with the first element. And then the completed function might look something like this:

public static T Reduce(this IEnumerable<T> things, 
  Func<T, T, T> combine, 
  T initialValue) 
{
  T result = initialValue;
  foreach (var t in things) 
  {
    result = combine(result, t);
  }
  return result;
}

And now if you have a bunch of integers, say, you can add them all up like this:

var integers = new [] { 1, 2, 3, 4 };
var sum = integers.Reduce((a, b) => a + b, 0);

If, on the other hand, you have a bunch of lists, you’d do something like this instead:

var lists = new [] {
  new List { 1 },
  new List { 2, 3 }
};
var sum = lists.Reduce((a, b) => 
  {
    var list = new List();
    list.AddRange(a);
    list.AddRange(b);
    return list;
  },
  new List());

And this would give you the list of elements 1, 2, 3. Great.

Now there are other things you might wonder about with respect to the combine function. For whatever reason, you might want to consider alternative implementations of Reduce. For instance, you’d might like to create batches of n things, reduce each batch, and then reduce those results for the final result. It would be nice to have that freedom of implementation. For that to be an option, though, you need your combine function to be associative.

Assume you have three values t1, t2, t3. Your combine function is associative if the following holds:

combine(t1, combine(t2, t3)) == combine(combine(t1, t2), t3)

Unfortunately there is nothing in the C# type system that lets us specify and verify that a function is associative, so we need to rely on documentation and discipline for that.

Alternatively, we can turn to mathematics. It turns out that mathematicians have a name for precisely the kind of thing we’re talking about. A semigroup is a structure that consists of a set of values and an associative binary operation for combining such values. Granted, it’s a strange-sounding name, but it identifies a very precise concept that gives us something to reason about. So it’s a useful abstraction that actually gives us some guarantees that we can rely on when programming.

To represent a semigroup in our program, we can introduce an interface:

public interface ISemigroup<T>
{
  T Combine(T a, T b);
}

And we can modify our Reduce function to work with semigroups, which by definition guarantees that the Combine function is associative.

public static T Reduce<T>(this IEnumerable<T> things, 
  ISemigroup<T> semigroup, 
  T initialValue)
{
  T result = initialValue;
  foreach (var thing in things)
  {
    result = semigroup.Combine(result, thing);
  }
  return result;
}

And we can introduce a bunch of concrete implementations of this interface, like:

class IntegerUnderAdditionSemigroup : ISemigroup<int>
{
  public int Combine(int a, int b)
  {
    return a + b;
  }
}

class IntegerUnderMultiplicationSemigroup : ISemigroup<int>
{
  public int Combine(int a, int b)
  {
    return a * b;
  }
}

class StringSemigroup : ISemigroup<string>
{
  public string Combine(string a, string b) 
  {
    return a + b;
  }
}

class ListSemigroup<T> : ISemigroup<List<T>> 
{
  public List Combine(List a, List b) 
  {
    var result = new List();
    result.AddRange(a);
    result.AddRange(b);
    return result;
  }
}

class FuncSemigroup<T> : ISemigroup<Func<T, T>>
{
  public Func<T, T> Combine(Func<T, T> f, Func<T, T> g) 
  {
    return it => g(f(it));
  }
}

So that’s quite nice. We can rely on meaningful and precise abstractions to give us some guarantees in our programs.

There is still a small problem when working with semigroups for reduction though. What should the initial value be? We really just want to reduce a bunch of values of some type, we don’t want to be bothered with some additional value.

One approach, I guess, would be to just pick the first value and then perform reduce on the rest.

public static T Reduce(this IEnumerable<T> things, 
  ISemigroup<T> semigroup)
{
  return things.Skip(1).Reduce(semigroup, things.First();
}

This would work for non-empty bunches of things. But that means we’d have to check for that in some way before calling Reduce. That’s quite annoying.

What would be useful is some sort of harmless value that we could combine with any other value and just end up with the other value. So we could just use that magical value as the initial value for our Reduce.

Luckily, it turns out that there are such magical values for all the semigroups we’ve looked at. In fact, we’ve seen two such values already. For integers under addition, it’s zero. For lists, it’s the empty list. But there are others. For integers under multiplication, it’s one. For strings, it’s the empty string. And for functions it’s the identity function, which just returns whatever value you hand it. Now if you can provide such a value, which is called the unit value, for your semigroup, you get what the mathematicians call a monoid. It’s another intensely unfamiliar-sounding name, but again the meaning is very precise.

We can represent monoids in our programs by introducing another interface:

public interface IMonoid<T> : ISemigroup<T> 
{
  T Unit { get; }
}

So there is nothing more to a monoid than exactly this: it’s a semigroup with a unit value. And the contract that the unit value operates under is this:

Compose(Unit, t) == Compose(t, Unit) == t

This just says that the unit value is magical in the sense we outlined. We can combine it with any value t any way we want, and we end up with t.

Now we can write a new Reduce function that works on monoids:

public static T Reduce(this IEnumerable<T> things, 
  IMonoid<T> monoid)
{
  return things.Reduce(monoid, monoid.Unit);
}

This is quite nice, because we don’t have to worry any more about whether or not the bunch of things is empty. We can proceed to implement concrete monoids that we might want to use.

class IntegerUnderAdditionMonoid 
  : IntegerUnderAdditionSemigroup, IMonoid<int>
{
  public int Unit
  {
    get { return 0; }
  }
}

class IntegerUnderMultiplicationMonoid 
  : IntegerUnderMultiplicationSemigroup, IMonoid<int>
{
  public int Unit
  {
    get { return 1; }
  }
}

class StringMonoid : StringSemigroup, IMonoid<string>
{
  public string Unit
  {
    get { return ""; }
  }
}

class ListMonoid<T> 
  : ListSemigroup<T>, IMonoid<List<T>>
{
  public List<T> Unit
  {
    get { return new List<T>(); }
  }
}

class FuncMonoid<T> : FuncSemigroup<T>, IMonoid<Func<T, T>>
{
  public Func<T, T> Unit 
  {
    get { return it => it; }
  }
}

And we might write a small test program to see if they work as intended.

public static void Main(string[] args)
{
  var integers = new[] { 1, 2, 4, 8 };
  var sum = integers.Reduce(new IntegerUnderAdditionMonoid());
  var product = integers.Reduce(new IntegerUnderMultiplicationMonoid());
  var strings = new[] { "monoids", " ", "are", " ", "nifty" };
  var str = strings.Reduce(new StringMonoid());
  var lists = new[] {
    new List { "monoids", " " },
    new List { "are" },
    new List { " ", "nice" }
  };
  var list = lists.Reduce(new ListMonoid());
  var str2 = list.Reduce(new StringMonoid());
  var integerFunctions = new Func<T, T>[] { it => it + 1, it => it % 3 };
  var intFun = integerFunctions.Reduce(new FuncMonoid());
  var stringFunctions = new Func<T, T>[] { s => s.ToUpper(), s => s.Substring(0, 5) };
  var strFun = stringFunctions.Reduce(new FuncMonoid());

  Console.WriteLine(sum);
  Console.WriteLine(product);
  Console.WriteLine(str);
  Console.WriteLine(list.Count);
  Console.WriteLine(str2);
  Console.WriteLine(intFun(1));
  Console.WriteLine(intFun(2));
  Console.WriteLine(strFun("hello world"));
  Console.WriteLine(strFun(str));
}

Can you work out what the program will print? If not, you might want to try to run it.

Hopefully this post gives some indication of the flexibility and power that can come with very simple abstractions. It might even give you a creeping sensation that these Haskell heads are onto something when they claim that mathematics that studies structure and composition can be useful for programmers. At the face of things, the processes of adding up integers, concatenating strings, appending lists and composing functions seem quite different, but structurally they nevertheless share some fundamental traits that can be leveraged to great effect.


LINQ to Nullable

Things got a bit out of hand today.

It all started when I added a point to the agenda for our backend team meeting saying I’d explain real quick what a functor is – or at least what my understanding of what a functor is is. And so I did.

Now the explanation itself didn’t go half bad, I don’t think. While I’m sure I would have offended mathematicians and possibly some haskellites, they weren’t there. Instead, the room was filled with C# programmers.

I think I said something like the following. Assume you have a parameterized type S<T>, where S defines some structure on top of type T. The obvious example for a C# programmer would be an IEnumerable<T>, but of course there are others, including Task<T> and Nullable<T> and indeed Whatever<T>. Now if you have such an S and a mapping function that given some S<T> and a function from T to U produces an S<U> then you almost have a functor already! In addition to that, you just need to make sure that your mapping is well-behaved in a sense. First, mapping the identity function over a structure shouldn’t change it. So if you map it => it over some structure S, that should just give you the same structure you started with. And second, assume you have a function f from T to U and a function g from U to V. If you map f over S to yield S<U> and then map g over that to yield S<V>, that should give you the same result as mapping the composed function it => g(f(it)) over S<T>.

To illustrate, I explained that Nullable<T> is a functor – or at least it should be. And it would be, if we defined the appropriate mapping function for Nullable<T>. So I wrote the following on the whiteboard:

public static class NullableExtensions {
  public static TTarget? Select<TSource, TTarget>(
      this TSource? t, 
      Func<TSource, TTarget> selector)
    where TSource : struct
    where TTarget : struct
  {
    return t.HasValue ? (TTarget?) selector(t.Value) : null;
  }
}

So this is our mapping function, even though I named it Select, which is the name used in the C# and LINQ world. A benefit of this function is that you no longer have to manually handle the mundane issues of worrying about whether or not some Nullable<T> is null. So instead of writing code like this, which resembles something from our code base:

Duration? duration = null;
if (thing.Frames.HasValue)
{
  var ms = thing.Frames.Value * 40;
  duration = Duration.FromMilliseconds(ms);
}

You can write this instead:

Duration? duration = thing.Frames.Select(fs => Duration.FromMilliseconds(fs * 40));

I think it is quite nice – at least if you can get comfortable calling an extension method on something that might be null. But from this point on, things started to go awry. But it wasn’t my fault! They started it!

See, some of the people in the meeting said they kind of liked the approach, but argued that Map would be a better name because it would avoid confusion with Select, which is associated with LINQ and IEnumerable<T>. In some sense, this was the opposite argument I used for choosing Select over Map in the first place! I thought it would make sense to call it Select precisely because that’s the name for the exact same thing for another kind of structure.

So as I left the meeting, I started wondering. I suspected that there really was nothing particular that tied LINQ and the query syntax to IEnumerable<T>, which would mean you could use it for other things. Other functors. And so I typed the following into LinqPad:

DateTime? maybeToday = DateTime.Today;
var maybeTomorrow = from dt in maybeToday select dt.AddDays(1);
maybeTomorrow.Dump();

And it worked, which I thought was pretty cool. I consulted the C# specification and found that as long as you implement methods of appropriate names and signatures, you can use the LINQ query syntax. And so I decided to let functors be functors and just see what I could do with Nullables using LINQ. So I wrote this:

public static TSource? Where<TSource>(
    this TSource? t, 
    Func<TSource, bool> predicate)
  where TSource : struct
  {
    return t.HasValue && predicate(t.Value) ? t : null;
  }

Which allowed me to write

DateTime? MaybeSaturday(DateTime? maybeDateTime)
{
  return
    from dt in maybeDateTime
    where dt.DayOfWeek == DayOfWeek.Friday
    select dt.AddDays(1);
}

Which will return null unless it’s passed a Nullable that wraps a DateTime representing a Friday. Useful.

It should have stopped there, but the C# specification is full of examples of expressions written in query syntax and what they’re translated into. For instance, I found that implementing this:

public static TTarget? SelectMany<TSource, TTarget>(
    this TSource? t, 
    Func<TSource, TTarget?> selector)
  where TSource : struct
  where TTarget : struct
{
  return t.HasValue ? selector(t.Value) : null;
}

public static TResult? SelectMany<TSource, TIntermediate, TResult>(
    this TSource? t, 
    Func<TSource, TIntermediate?> selector, 
    Func<TSource, TIntermediate, TResult> resultSelector)
  where TSource : struct
  where TIntermediate : struct
  where TResult : struct
{
  return t.SelectMany(selector)
          .Select(it => resultSelector(t.Value, it));
}

I could suddenly write this, which is actually quite nice:

TimeSpan? Diff(DateTime? maybeThis, DateTime? maybeThat)
{
  return
    from dt1 in maybeThis
    from dt2 in maybeThat
    select (dt2 - dt1);
}

It will give you a wrapped TimeSpan if you pass it two wrapped DateTimes, null otherwise. How many checks did you write? None.

And as I said, it sort of got a bit out of hand. Which is why I now have implementations of Contains, Count, Any, First, FirstOrDefault, even Aggregate, and I don’t seem to be stopping. You can see the current state of affairs here.

What I find amusing is that you can usually find a reasonable interpretation and implementation for each of these functions. Count, for instance, will only ever return 0 or 1, but that sort of makes sense. First means unwrapping the value inside the Nullable<T> without checking that there is an actual value there. Any answers true if the Nullable<T> holds a value. And so on and so forth.

Finally, as an exercise for the reader: what extension methods would you write to enable this?

static async Task Greet()
{
  var greeting =
    from v1 in Task.FromResult("hello")
    from v2 in Task.FromResult("world")
    select (v1 + " " + v2);

  Console.WriteLine(await greeting);
}

Picture combinators and recursive fish

On February 9th 2017, I was sitting in an auditorium in Krakow, listening to Mary Sheeran and John Hughes give the opening keynote at the Lambda Days conference. It was an inspired and inspiring keynote, that discussed some of the most influential ideas in some of the most interesting papers written on functional programming. You should absolutely check it out.

One of the papers that was mentioned was Functional Geometry by Peter Henderson, written in 1982. In it, Henderson shows a deconstruction of an Escher woodcut called Square Limit, and how he can elegantly reconstruct a replica of the woodcut by using functions as data. He defines a small set of picture combinators – simple functions that operate on picture functions – to form complex pictures out of simple ones.

Escher’s original woodcut looks like this:

escher-square-limit-309.png

Which is pretty much a recursive dream. No wonder Henderson found it so fascinating – any functional programmer would.

As I was listening the keynote, I recalled that I had heard about the paper before, in the legendary SICP lectures by Abelson and Sussman (in lecture 3A, in case you’re interested). I figured it was about time I read the paper first hand. And so I did. Or rather, I read the revised version from 2002, because that’s what I found online.

And of course one thing led to another, and pretty soon I had implemented my own version in F#. Which is sort of why we’re here. Feel free to tag along as I walk through how I implemented it.

A key point in the paper is to distinguish between the capability of rendering some shape within a bounding box onto a screen on the one hand, and the transformation and composition of pictures into more complex pictures on the other. This is, as it were, the essence of decoupling through abstraction.

Our basic building block will be a picture. We will not think of a picture as a collection of colored pixels, but rather as something that is capable of scaling and fitting itself with respect to a bounding box. In other words, we have this:

type Picture : Box -> Shape list

A picture is a function that takes a box and creates a list of shapes for rendering.

What about the box itself? We define it using three vectors a, b and c.

type Vector = { x : float; y : float }
type Box = { a : Vector; b : Vector; c : Vector}

The vector a denotes the offset from the origin to the bottom left corner of the box. The vectors b and c span out the bounding box itself. Each vector is defined by its magnitude in the x and y dimensions.

For example, assume we have a picture F that will produce the letter F when given a bounding box. A rendering might look like this:

basic-f

But if we give F a different box, the rendering will look different too:

skewed-f-image

So, how do we create and render such a magical, self-fitting picture?

We can decompose the problem into three parts: defining the basic shape, transforming the shape with respect to the bounding box, and rendering the final shape.

We start by defining a basic shape relative to the unit square. The unit square has sides of length 1, and we position it such that the bottom left corner is at (0, 0) and top right corner is at (1, 1). Here’s a definition that puts a polygon outlining the F picture inside the unit square:

let fShape = 
  let pts = [ 
    { x = 0.30; y = 0.20 } 
    { x = 0.40; y = 0.20 }
    { x = 0.40; y = 0.45 }
    { x = 0.60; y = 0.45 }
    { x = 0.60; y = 0.55 }
    { x = 0.40; y = 0.55 }
    { x = 0.40; y = 0.70 }
    { x = 0.70; y = 0.70 }
    { x = 0.70; y = 0.80 }
    { x = 0.30; y = 0.80 } ]
  Polygon { points = pts }

To make this basic shape fit the bounding box, we need a mapping function. That’s easy enough to obtain:

let mapper { a = a; b = b; c = c } { x = x; y = y } =
   a + b * x + c * y

The mapper function takes a bounding box and a vector, and produces a new vector adjusted to fit the box. We’ll use partial application to create a suitable map function for a particular box.

As you can see, we’re doing a little bit of vector arithmetic to produce the new vector. We’re adding three vectors: a, the vector obtained by scaling b by x, and the vector obtained by scaling c by y. As we proceed, we’ll need some additional operations as well. We implement them by overloading some operators for the Vector type:

static member (+) ({ x = x1; y = y1 }, { x = x2; y = y2 }) =
    { x = x1 + x2; y = y1 + y2 }

static member (~-) ({ x = x; y = y }) =
    { x = -x; y = -y }

static member (-) (v1, v2) = v1 + (-v2)

static member (*) (f, { x = x; y = y }) =
    { x = f * x; y = f * y }

static member (*) (v, f) = f * v

static member (/) (v, f) = v * (1 / f)

This gives us addition, negation, subtraction, scalar multiplication and scalar division for vectors.

Finally we need to render the shape in some way. It is largely an implementation detail, but we’ll take a look at one possible simplified rendering. The code below can be used to produce an SVG image of polygon shapes using the NGraphics library.

type PolygonShape = { points : Vector list }

type Shape = Polygon of PolygonShape

let mapShape m = function 
  | Polygon { points = pts } ->
    Polygon { points = pts |> List.map m }

let createPicture shapes = 
   fun box ->
     shapes |> List.map (mapShape (mapper box))

let renderSvg width height filename shapes = 
  let size = Size(width, height)
  let canvas = GraphicCanvas(size)
  let p x y = Point(x, height - y) 
  let drawShape = function 
  | Polygon { points = pts } ->
    match pts |> List.map (fun { x = x; y = y } -> p x y) with 
    | startPoint :: t ->
      let move = MoveTo(startPoint) :> PathOp
      let lines = t |> List.map (fun pt -> LineTo(pt) :> PathOp) 
      let close = ClosePath() :> PathOp
      let ops = (move :: lines) @ [ close ] 
      canvas.DrawPath(ops, Pens.Black)
    | _ -> ()
  shapes |> List.iter drawShape
  use writer = new StreamWriter(filename)
  canvas.Graphic.WriteSvg(writer)

When we create the picture, we use the mapShape function to apply our mapping function to all the points in the polygon that makes up the F. The renderSvg is used to do the actual rendering of the shapes produced by the picture function.

Once we have the picture abstraction in place, we can proceed to define combinators that transform or compose pictures. The neat thing is that we can define these combinators without having to worry about the rendering of shapes. In other words, we never have to pry open our abstraction, we will trust it to do the right thing. All our work will be relative, with respect to the bounding boxes.

We start with some basic one-to-one transformations, that is, functions with this type:

type Transformation = Picture -> Picture

The first transformation is turn, which rotates a picture 90 degrees counter-clockwise around its center (that is, around the center of its bounding box).

The effect of turn looks like this:

turn-f

Note that turning four times produces the original picture. We can formulate this as a property:

(turn >> turn >> turn >> turn) p = p

(Of course, for pictures with symmetries, turning twice or even once might be enough to yield a picture equal to the original. But the property above should hold for all pictures.)

The vector arithmetic to turn the bounding box 90 degrees counter-clockwise is as follows:

(a’, b’, c’) = (a + b, c, -b)

And to reiterate: the neat thing is that this is all we need to consider. We define the transformation using nothing but this simple arithmetic. We trust the picture itself to cope with everything else.

In code, we write this:

let turnBox { a = a; b = b; c = c } =
    { a = a + b; b = c; c = -b }

let turn p = turnBox >> p

The overloaded operators we defined above makes it very easy to translate the vector arithmetic into code. It also makes the code very easy to read, and hopefully convince yourself that it does the right thing.

The next transformation is flip, which flips a picture about the center vertical axis of the bounding box.

Which might sound a bit involved, but it’s just this:

flip-f

Flipping twice always produces the same picture, so the following property should hold:

(flip >> flip) p = p

The vector arithmetic is as follows:

(a’, b’, c’) = (a + b, -b, c)

Which translates neatly to:

let flipBox { a = a; b = b; c = c } =
   { a = a + b; b = -b; c = c }

let flip p = flipBox >> p

The third transformation is a bit peculiar, and quite particular to the task of mimicking Escher’s Square Limit, which is what we’re building up to. Henderson called the transformation rot45, but I’ll refer to it as toss, since I think it resembles light-heartedly tossing the picture up in the air:

toss-f

What’s going on here? Its a 45 degree counter-clockwise rotation around top left corner, which also shrinks the bounding box by a factor of √2.

It’s not so easy to define simple properties that should hold for toss. For instance, tossing twice is not the same as turning once. So we won’t even try.

The vector arithmetic is still pretty simple:

(a’, b’, c’) = (a + (b + c) / 2, (b + c) / 2, (c − b) / 2)

And it still translates very directly into code:

let tossBox { a = a; b = b; c = c } =
  let a' = a + (b + c) / 2
  let b' = (b + c) / 2
  let c' = (c − b) / 2
  { a = a'; b = b'; c = c' }

let toss p = tossBox >> p

That’s all the transformations we’ll use. We can of course combine transformations, e.g:

(turn >> turn >> flip >> toss)

Which produces this:

turn-turn-flip-toss

We proceed to compose simple pictures into more complex ones. We define two basic functions for composing pictures, above and beside. The two are quite similar. Both functions take two pictures as arguments; above places the first picture above the second, whereas beside places the first picture to the left of the second.

above-beside.png

Here we see the F placed above the turned F, and the F placed beside the turned F. Notice that each composed picture forms a square, whereas each original picture is placed within a half of that square. What happens is that the bounding box given to the composite picture is split in two, with each original picture receiving one of the split boxes as their bounding box. The example shows an even split, but in general we can assign a fraction of the bounding box to the first argument picture, and the remainder to the second.

For implementation details, we’ll just look at above:

let splitHorizontally f box =
  let top = box |> moveVertically (1. - f) |> scaleVertically f  
  let bottom = box |> scaleVertically (1. - f)
  (top, bottom)

let aboveRatio m n p1 p2 =
  fun box ->
    let f = float m / float (m + n)
    let b1, b2 = splitHorizontally f box
    p1 b1 @ p2 b2

let above = aboveRatio 1 1

There are three things we need to do: work out the fraction of the bounding box assigned to the first picture, split the bounding box in two according to that fraction, and pass the appropriate bounding box to each picture. We “split” the bounding box by creating two new bounding boxes, scaled and moved as appropriate. The mechanics of scaling and moving is implemented as follows:

let scaleVertically s { a = a; b = b; c = c } = 
  { a = a
    b = b 
    c = c * s }

let moveVertically offset { a = a; b = b; c = c } = 
  { a = a + c * offset
    b = b
    c = c }

Now we can create more interesting images, such as this one:

composite-f

Which is made like this:

above (beside (turn (turn (flip p))) (turn (turn p)))
      (beside (flip p) p)

With this, our basic toolset is complete. Now it is time to lose the support wheels and turn our attention to the original task: creating a replica of Henderson’s replica of Escher’s Square Limit!

We start with a basic picture that is somewhat more interesting than the F we have been using so far.

According to the paper, Henderson created his fish from 30 bezier curves. Here is my attempt at recreating it:

Henderson's fish

You’ll notice that the fish violates the boundaries of the unit square. That is, some points on the shape has coordinates that are below zero or above one. This is fine, the picture isn’t really bound by its box, it’s just scaled and positioned relative to it.

We can of course turn, flip and toss the fish as we like.

Henderson's fish (turned, flipped and tossed)

But there’s more to the fish than might be immediately obvious. After all, it’s not just any fish, it’s an Escher fish. An interesting property of the fish is shown if we overlay it with itself turned twice.

We define a combinator over that takes two pictures and places both pictures with respect to the same bounding box. And voila:

overlay-fish

As we can see, the fish is designed so that it fits together neatly with itself. And it doesn’t stop there.

The t tile

This shows the tile t, which is one of the building blocks we’ll use to construct Square Limit. The function ttile creates a t tile when given a picture:

let ttile f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over fishN fishE)

Here we see why we needed the toss transformation defined earlier, and begin to appreciate the ingenious design of the fish.

The second building block we’ll need is called tile u. It looks like this:

The u tile

And we construct it like this:

let utile (f : Picture) = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over fishN fishW)
       (over fishE fishS)

To compose the Square Limit itself, we observe that we can construct it from nine tiles organized in a 3×3 grid. We define a helper function nonet that takes nine pictures as arguments and lays them out top to bottom, left to right. Calling nonet with pictures of the letters H, E, N, D, E, R, S, O, N produces this result:

H-E-N-D-E-R-S-O-N

The code for nonet looks like this:

let nonet p q r s t u v w x =
  aboveRatio 1 2 (besideRatio 1 2 p (beside q r))
                 (above (besideRatio 1 2 s (beside t u))
                        (besideRatio 1 2 v (beside w x)))

Now we just need to figure out the appropriate pictures to pass to nonet to produce the Square Limit replica.

The center tile is the easiest: it is simply the tile u that we have already constructed. In addition, we’ll need a side tile and a corner tile. Each of those will be used four times, with the turn transformation applied 0 to 3 times.

Both side and corner have a self-similar, recursive nature. We can think of both tiles as consisting of nested 2×2 grids. Similarly to nonet, we define a function quartet to construct such grids out of four pictures:

let quartet p q r s = above (beside p q) (beside r s)

What should we use to fill our quartets? Well, first off, we need a base case to terminate the recursion. To help us do so, we’ll use a degenerate picture blank that produces nothing when given a bounding box.

We’ll discuss side first since it is the simplest of the two, and also because corner uses side. The base case should look like this:

side 1 fish

For the recursive case, we’ll want self-similar copies of the side-tile in the top row instead of blank pictures. So the case one step removed from the base case should look like this:

side 2 fish

The following code helps us construct sides of arbitrary depth:

let rec side n p = 
  let s = if n = 1 then blank else side (n - 1) p
  let t = ttile p
  quartet s s (t |> turn) t

This gives us the side tile that should be used as the “north” tile in the nonet function. We obtain “west”, “south” and “east” as well by turning it around once, twice or thrice.

Creating a corner is quite similar to creating a side. The base case should be a quartet consisting of three blank pictures, and a u tile for the final, bottom right picture. It should look like this:

corner 1 fish

The recursive case should use self-similar copies of both the corner tile (for the top left or “north-west” picture) and the side tile (for the top right and bottom left pictures), while keeping the u tile for the bottom right tile.

corner 2 fish

Here’s how we can write it in code:

let rec corner n p = 
  let c, s = if n = 1 then blank, blank 
             else corner (n - 1) p, side (n - 1) p
  let u = utile p
  quartet c s (s |> turn) u

This gives us the top left corner for our nonet function, and of course we can produce the remaining corners by turning it a number of times.

Putting everything together, we have:

let squareLimit n picture =
  let cornerNW = corner n picture
  let cornerSW = turn cornerNW
  let cornerSE = turn cornerSW
  let cornerNE = turn cornerSE
  let sideN = side n picture
  let sideW = turn sideN
  let sideS = turn sideW
  let sideE = turn sideS
  let center = utile picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

Calling squareLimit 3 fish produces the following image:

sqlimit-3.png

Which is a pretty good replica of Henderson’s replica of Escher’s Square Limit, to a depth of 3. Sweet!

Misson accomplished? Are we done?

Sort of, I suppose. I mean, we could be.

However, if you take a look directly at Escher’s woodcut (or, more likely, the photos of it that you can find online), you’ll notice a couple of things. 1) Henderson’s basic fish looks a bit different from Escher’s basic fish. 2) Escher’s basic fish comes in three hues: white, grey and black, whereas Henderson just has a white one. So it would be nice to address those issues.

Here’s what I came up with.

escher-fish-all.png

To support different hues of the same fish requires a bit of thinking – we can’t just follow Henderson’s instructions any more. But we can use exactly the same approach! In addition to transforming the shape of the picture, we need to be able to transform the coloring of the picture. For this, we introduce a new abstraction, that we will call a Lens.

type Hue = Blackish | Greyish | Whiteish

type Lens = Box * Hue

We redefine a picture to accept a lens instead of just a box. That way, the picture can take the hue (that is, the coloring) into account when figuring out what to draw. Now we can define a new combinator rehue that changes the hue given to a picture:

let rehue p =
  let change = function
  | Blackish -> Greyish
  | Greyish -> Whiteish
  | Whiteish -> Blackish
  fun (box, hue) -> p (box, change hue)

Changing hue three times takes us back to the original hue:

(rehue >> rehue >> rehue) p = p

We need to revise the tiles we used to construct the Square Limit to incorporate the rehue combinator. It turns out we need to create two variants of the t tile.

ttile-shades

But of course it’s just the same old t tile with appropriate calls to rehue for each fish:

let ttile hueN hueE f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over (fishN |> hueN)
                (fishE |> hueE))

let ttile1 = ttile rehue (rehue >> rehue)

let ttile2 = ttile (rehue >> rehue) rehue

For the u tile, we need three variants:

utiles.png

Again, we just call rehue to varying degrees for each fish.

let utile hueN hueW hueS hueE f = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over (fishN |> hueN) (fishW |> hueW))
       (over (fishE |> hueE) (fishS |> hueS))

let utile1 = 
  utile (rehue >> rehue) id (rehue >> rehue) id

let utile2 = 
  utile id (rehue >> rehue) rehue (rehue >> rehue)

let utile3 = 
  utile (rehue >> rehue) id rehue id 

We use the two variants of the t tile in two side functions, one for the “north” and “south” side, another for the “west” and “east” side.

let side tt hueSW hueSE n p = 
  let rec aux n p =
    let t = tt p
    let r = if n = 1 then blank else aux (n - 1) p
    quartet r r (t |> turn |> hueSW) (t |> hueSE)
  aux n p

let side1 =
  side ttile1 id rehue 

let side2 =
  side ttile2 (rehue >> rehue) rehue

We define two corner functions as well, one for the “north-west” and “south-east” corner, another for the “north-east” and the “south-west” corner.

let corner ut sideNE sideSW n p = 
  let rec aux n p = 
    let c, ne, sw = 
      if n = 1 then blank, blank, blank 
               else aux (n - 1) p, sideNE (n - 1) p, sideSW (n - 1) p
    let u = ut p
    quartet c ne (sw |> turn) u
  aux n p 

let corner1 = 
  corner utile3 side1 side2

let corner2 = 
  corner utile2 side2 side1

Now we can write an updated squareLimit that uses our new tile functions.

let squareLimit n picture =
  let cornerNW = corner1 n picture
  let cornerSW = corner2 n picture |> turn
  let cornerSE = cornerNW |> turn |> turn
  let cornerNE = cornerSW |> turn |> turn
  let sideN = side1 n picture
  let sideW = side2 n picture |> turn
  let sideS = sideN |> turn |> turn
  let sideE = sideW |> turn |> turn
  let center = utile1 picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

Was it worth it? I think so. Calling squareLimit 5 fish produces the following image:

square-limit-shades-level-5.png

Which is a pretty good replica of Escher’s Square Limit, to a depth of 5.

The complete code is here.


Donkey code

This is an attempt to write down a very simple example I’ve been using to explain the profound impact the language we use has on thought, discussion and ultimately code.

Imagine you have a computer system, and that you’re one of the programmers working on that system (not too hard, is it?). The system is called, oh I don’t know, eQuest. It has to do with horses. So it typically works with entities of this kind:

horse-only

eQuest is a tremendous success for whatever reason, perhaps there’s very little competition. But it is a success, and so it’s evolving, and one day your product owner comes up with the idea to expand to handle entities of this kind as well:

donkey-only

It’s a new kind of horse! It’s mostly like the other horses and so lots of functionality can easily be reused. However, it has some special characteristics, and must be treated a little differently in some respects. Physically it is quite short, but very strong. Behaviour-wise, it is known to be stubborn, intelligent and not easily startled. It’s an interesting kind of horse.  It also likes carrots a lot (but then don’t all horses?). Needless to say, there will be some adjustments to some of the modules.

Design meetings ensue, to flesh out the new functionality and figure out the correct adjustments to be made to eQuest. Discussions go pretty well. Everyone has heard of these “horses that are small and stubborn” as they’re called. (Some rumors indicate that genetically they’re not actually horses at all – apparently there are differences at the DNA level, but the real world is always riddled with such technicalities. From a pragmatic viewpoint, they’re certainly horses. Albeit short and stubborn, of course. And strong, too.) So it’s not that hard to discuss features that apply to the new kind of horse.

There is now a tendency for confusion when discussing other kinds of changes to the product, though. The unqualified term “horse” is obviously used quite a bit in all kinds of discussions, but sometimes the special short and stubborn kind is meant to be included and sometimes it is not. So to clarify, you and your co-workers find yourself saying things like “any horse” to mean the former and “regular horse”, “ordinary horse”, “old horse”, “horse-horse” or even “horse that’s not small and stubborn” to mean the latter.

To implement support for the new horse in eQuest, you need some way of distinguish between it and an ordinary horse-horse. So you add an IsShort property to your Horse data representation. That’s easy, it’s just a derived property from the Height property. No changes to the database schema or anything. In addition, you add an IsStubborn property and checkbox to the registration form for horses in eQuest. That’s a new flag in the database, but that’s OK. With that in place, you have everything you need to implement the new functionality and make the necessary adjustments otherwise.

Although much of the code applies to horses and short, stubborn horses alike, you find that the transport module, the feeding module, the training module and the breeding module all need a few adjustments, since the new horses aren’t quite like the regular horses in all respects. You need to inject little bits of logic here, split some cases in two there. It takes a few different forms, and you and your co-workers do things a bit differently. Sometimes you employ if-branches with logic that looks like this:

if (horse.IsShort && horse.IsStubborn) {
  // Logic for the new horse case.
}
else {
  // Regular horse code here.
}

Other times you go fancy with LINQ:

var newHorses = horses.Where(h => h.IsShort && h.IsStubborn);
var oldHorses = horses.Except(newHorses);
foreach (var h in newHorses) {
  // New horse logic.
}
foreach (var h in oldHorses) {
  // Old horse logic.
}

And that appears to work pretty well, and you go live with support for short and stubborn horses.

Next day, you have a couple of new bug reports, one in the training module and two concerning the feeding module. It turns out that some of the regular horses are short and stubborn too, so your users would register short regular horses, tick the stubborn checkbox, and erroneously get new horse logic instead of the appropriate horse-horse logic. That’s awkward, but understandable. So you call a few meetings, discuss the issue with your fellow programmers, scratch your head, talk to a UX designer and your product owner. And you figure out that not only are the new horses short and stubborn, they make a distinct sound too. They don’t neigh the way regular horses do, they hee-haw instead.

So you fix the bug. A new property on horse, Sound, with values Neigh and HeeHaw, and updates to logic as appropriate. No biggie.

In design meetings, most people still use the term “horse that’s short and stubborn” to mean the new kind of horse, even though you’re encouraging people to include the sound they make as well, or even just say “hee-hawing horse”. But apart from this nit-picking from your side, things proceed well. It appears that most bugs have been ironed out, and your product owner is happy. So happy, in fact, that there is a new meeting. eQuest is expanding further, to handle entities of this kind as well:

mule-only.png

What is it? Well, it’s the offspring from a horse and a horse that’s short and stubborn and says hee-haw. It shares some properties with the former kind of horse and some with the latter, so obviously there will be much reuse! And a few customizations and adjustments unique for this new new kind of horse. At this point you’re getting worried. You sense there will be trouble if you can’t speak clearly about this horse, so you cry out “let’s call it a half hee-haw!” But it doesn’t catch on. Talking about things is getting a bit cumbersome.

“But at least I can still implement it,” you think for yourself. “And I can mostly guess what kind of horses the UX people are thinking about when they say ‘horse’ anyway, I’ll just map it in my head to the right thing”. You add a Sire and a Dam property to Horse. And you proceed to update existing logic and write new logic.

You now have code that looks like this:

if (horse.IsShort && 
    horse.IsStubborn && 
    horse.Sound == Sound.HeeHaw) || 
   (horse.Sire.IsShort && 
    horse.Sire.IsStubborn && 
    horse.Sire.Sound == Sound.HeeHaw) ||
   (horse.Dam.IsShort && 
    horse.Dam.IsStubborn && 
    horse.Dam.Sound == Sound.HeeHaw)) {
  // Logic for both the new horse and the new-new horse!
}
else {
  // Really regular horse code here.
}

Which turns out to be wrong, since the new new horse doesn’t really neigh or hee-haw, it does something in-between. There is no word for it, so you invent one: the neigh-haw. You extend the Sound enumeration to incorporate it and fix your code.

Getting all the edge cases right takes a while. Your product owner is starting to wonder why development is slowing down, when so much of the functionality can be reused. You mumble something about technical debt. But you manage to get the bug count down to acceptable levels, much thanks to diligent testing.

At this point, there is another meeting. You are shown two photographs of horses of the newest kind. Or so you think. The product owner smiles. “They’re almost identical, but not quite!” he says. “You see, this one has a horse as a mother and a short stubborn horse as a father.” You see where this is going. “But this one, this one has a short stubborn horse as a mother and a horse as a father.” “Does it matter?” you ask. “This last one is always sterile,” he says. “So you need to handle that in the breeding module.” Oh.

“And then there’s this.”

zeedonk-pretty-cut-245.png

The point of this example is that it takes very little for software development to get crippled by complexity without precise language. You need words for the things you want to talk about, both in design discussions and in code. Without them, it becomes very difficult to have meaningful communication, and the inability to articulate a thought precisely is made manifest in the code itself. Your task quickly turns from implementing new functionality to making sure that you don’t break existing functionality.

The example is special in that the missing words are sort of jumping out at you. It’s so obvious what they should be. This is not always the case. In many domains, it can be much, much harder to figure out what the words should be. It’s likely to require a lot of time and effort, and include frustrating and heated discussions with people who think differently than you. You might find that you and your team have to invent new words to represent the concepts that are evolving in your collective mind. But it can also be that you’ve all become accustomed to the set of words you’re currently using, and gone blind to the donkeys in your system.


Something for nothing

I thought I’d jot down some fairly obvious things about values in programs.

Say you have some value in your program. For instance, it could be a String, or a Thing. Then conceptually, each String belongs to the set of possible Strings, and each Thing belongs to the set of possible Things. Right?

Like so:

thing-string

Of course, you might even have something like a function from String to Thing or the other way around. That’s no different, they’re just values that belong to some set of possible values. Hard to draw, though.

In programs, this notion of sets of possible values is baked into the concept of types. So instead of saying that some value belongs to the set of possible Things, we say that it has type Thing or is of type Thing.

I wish that was all there was to it, but alas, it gets more complicated. Not only do we want to represent the presence of values in our programs, sometimes we want to represent the absence of values as well. The absence of a value isn’t necessarily an error. It could be, but it could be fine, too. There are many valid reasons why we might end up with absences flowing through our programs. So we need to represent them.

This is where null enters the picture.

In languages like C# and Java – any object-oriented language that carries DNA from Tony Hoare’s ALGOL W – the drawing of the sets above doesn’t map directly over to types. For each so-called reference type (object values that are accessed by means of references), there’s a twist. In addition to the set of possible values, each reference type also allows for the value null:

thing-string-null.png

It looks pretty similar, but the consequences for program semantics are significant.

The purpose of null is, of course, to represent the absence of a value of a given type. But now your type identifies a pretty weird set of possible values. In the case of Thing, for instance, you have all the legitimate actual Things, but also this weird thing that represents the absence of a Thing. So it’s decidedly not a Thing as such, yet it is of type Thing. It’s bizarre.

But it’s not just confusing to think about. It causes practical problems since null just doesn’t fit in. It’s a phony – a hologram that successfully fools the compiler, which is unable to distinguish between null and proper values of a given type. It’s nothing like the other values. Hence you need to think about it and worry about it all the time. Since it’s not really there, you can’t treat it like a proper value. You most decidedly can not invoke a method on it, which is sort of what you do with objects. The interpretation of null is radically different from the interpretation of all the other values of the same type. (Interestingly, it’s different in precisely the same way for all types: how null sticks out from legit Things mirrors how it sticks out from legit Strings and everything else)

But it gets worse. Once you’ve invited null into your home, there’s no way of getting rid of it! In other words, when you make null part of, say, the Thing type, you can no longer express the idea one of the actual, legit Things, not that spectral special “Thing”. There is no way you can say explicitly in your program that you know that a value is present. It’s all anecdotes and circumstance. You can obviously take a look at some value at any given time in your program and decide whether it’s a legit Thing or just an illusion, but it’s completely ephemeral. You’ve given your type system a sort of brain damage that prevents it from forming memories about absence and presence of values: you might check for null, but your program immediately forgets about it!

So much for reference types. What about so-called primitive values, like integers and booleans? Neither can be null in C# or Java. So the question is: lacking null, how do we represent the absence of a value?

Well, one hack is to think “Gee, I’m not really going to use all the possible values, so I can take one of them and pretend it’s sort of like null.” So instead of interpreting the value literally, you override the interpretation for some magic values. (Using -1 as a special value for integers is a classic, in the case where your legit values are non-negative.) The consequence is that you now have two kinds of values inside your type, operating at different semantic levels and being used for different purposes. Your set of values isn’t just a set of values anymore – it’s a conglomerate of conceptually different things.

This leaves us in a situation that’s similar to the reference type situation, except it’s ad-hoc, convention-based at best. In both cases, we have two things we’re trying to express. One thing is the presence or absence of a value. And the other thing is the set of possible values that value belongs to. These are distinct, orthogonal concerns. And usually, the word of wisdom in programming is to let distinct things be distinct, and avoid mixing things up.

So how can we do that? If we reject the temptation to put special values with special interpretation rules into the same bag as the legit values, what can we do?

If you’re a C# programmer, you’re thinking that you can use Nullable to represent the absence of a primitive value instead. This is true, and you should. Nullable is a wrapper around a primitive type, creating a new type that can have either null or a legit instance of the primitive type as value. On top of that, the the compiler works pretty to hard to blur the line between the underlying type and its Nullable counterpart, such as special syntax and implicit type conversion from T to Nullable<T>.

In a language with sum types, we can do something similar to Nullable, but in a completely generic way. What is a sum type? It is a composite type created by combining various classes of things into a single thing. Here’s an example:

type Utterance = Fee | Fie | Foe | Fum

So it’s sort of like an enumeration of variants. But it’s a rich man’s enumeration, since you can associate a variant with a value of another type. Like so:

type ThingOrNothing = Indeed of Thing | Nothing

This creates a type that distinguishes neatly between legit Things and the absence of a Thing. Either you’ll indeed have a Thing, or you’ll have nothing. And since absence stands to presence in the same way for all types, we can generalize it:

type Mayhaps<'T> = Indeed of 'T | Nothing

The nice thing is that we can now distinguish between the case where we might have a value or not and the case where we know we do have a value (provided our language doesn’t allow null, of course). In other words, we can have a function of type Mayhaps<Thing> -> Thing. This is a big deal! We can distinguish between the parts of our program that have to worry about absent values and the parts that don’t. We’ve fixed the brain damage, our program can form memories of the checks we’ve made. It’s a beautiful feat of surgery, enabled by sum types and the absence of null.

So sum types neatly solves this problem with representing absence and presence of values, on top of, and orthogonally to, the issue of defining a set of possible values for a given shape of data. What’s more, this is just one application of a general feature of the language. There is no need to complicate the language itself with special handling of absence. Instead, you’re likely to find something like Mayhaps in the standard library. In F# it’s called Option, in Elm it’s called Maybe. In languages like F# and Elm – any functional language that carries DNA from Robin Milner’s ML – you’ll find that you have both sum types and pattern matching. Pattern matching makes it very easy to work with the sum type variants.

The code you write to handle absence and presence follows certain patterns, which means you can create abstractions. For instance, say you have some function thingify of type String -> Thing, which takes a String and produces a Thing. Now suppose you’re given an Mayhaps<String> instead of a String. If you don’t have a String, you don’t get a Thing, but if indeed you do have a String, you’d like to apply thingify to get a Thing. Right?

Here’s how you might write it out, assuming F#-style syntax and pattern matching:

match dunnoCouldBe with 
| Indeed str -> Indeed (thingify str) 
| Nothing -> Nothing

This pattern is going to pop up again and again when you’re working with Mayhaps values, where the only thing that varies is the function you’d like to apply. So you can write a general function to handle this:

let quux (f : 'T -> 'U) (v : Mayhaps<'T>) : Mayhaps<'U> = 
  match v with 
  | Indeed t -> Indeed (f t) 
  | Nothing -> Nothing

Whenever you need to optionally transform an option value using some function, you just pass both of them to quux to handle it. But of course, chances are you’ll find that quux has already been written for you. In F#, it’s called Option.map. Because it map from one kind of Option value to another.

At this point, we’ve got the mechanics and practicalities of working with values that could be present or absent worked out. Now what should you do when you change your mind about where and how you handle the absence of a value? This is a design decision, even a business rule. These things change.

The short answer is that when these things change, you get a rippling change in type signatures in your program – from the place where you made a change, to the place where you handle the effect of the change. This is a good thing. This is the compiler pointing out where you need to do some work to make the change work as planned. That’s another benefit of treating the absence of values explicitly instead of mixing it up with the values themselves.

That’s all well and good, but what can you do if you’re a C# or Java programmer? What if your programming language has null and no sum types? Well, you could implement something similar to Mayhaps using the tools available to you.

Here’s a naive implementation written down very quickly, without a whole lot of thought put into it:

Now you can write code like this:

var foo = Mayhaps<string>.Nothing;
var bar = Mayhaps<string>.Indeed("lol");
var couldBeStrings = new[] { foo, bar };
var couldBeLengths = couldBeStrings.Select(it => it.Map(s => s.Length));

A better solution would be to use a library such as Succinc<T> to do the job for you.

Regardless of how you do it, however, it’s always going to be a bit clunky. What’s more is it won’t really solve our problem.

As you’ll recall, the problem with null is that you can’t escape from it. In a sense, what is missing isn’t Mayhaps. It’s the opposite. With null, everything is Mayhaps! We still don’t have a way to say that we know that the value is there. So perhaps a better solution is to implement the opposite? We could try. Here’s a very simple type that banishes null:

Now the question is – apart from being very clunky – does it work? And the depressing answer is: not really. It addresses the correct problem, but it fails for an obvious reason – how do you ensure that the Indeed value itself isn’t null? Put it inside another Indeed?

Implementing Indeed as a struct (that is, a value type) doesn’t work too great either. While a struct Indeed cannot be null, you can very easily obtain an uninitialized Indeed, for instance by getting the default value of the struct, which is always available. In that case, you would end up with an Indeed which wraps a null, which is unacceptable.

So I’m afraid it really is true. You can’t get rid of null once you’ve invited it in. It’s pretty annoying. I wish they hadn’t invited in null in the first place.


Pragmatism is poison

Yesterday I gave a lightning talk called “Pragmatism is Poison” at the Booster Conference in Bergen. This blog post is essentially that talk in written form.

The basic idea of the talk, and therefore of this blog post, is to launch a public attack on perhaps the most fundamental virtue of the software craftsman: pragmatism. The reason is that I think pragmatism has turned toxic, to the point where it causes more harm than good.

While I do believe that pragmatism was once a useful maxim to combat analysis paralysis and over-engineering, I also believe that the usefulness has expired. Pragmatism no longer represents a healthy attitude, in my mind it’s not even a word anymore. It has degenerated into a thought-terminating cliché, which is used to stifle discussion, keep inquiries at bay and to justify not thinking things through.

These are, of course, pretty harsh words. I’ll try to make my case though.

Let’s start by having a look at what it means to be pragmatic. The definition Google gave me is as follows:

Dealing with things sensibly and realistically in a way that is based on practical rather than theoretical considerations.

[A small caveat: this isn’t necessarily the only or “correct” definition of what it means to be pragmatic. But I suppose it’s a sensible and realistic approximation?]

Anyways: this is good by definition! How can it possibly be bad?

Well, for a whole bunch of reasons really. We can divide the problems in two parts: 1) that pragmatism lends itself to abuse because it’s ambiguous and subjective, and 2) all the forms that abuse can take.

So the underlying problem is that the definition relies heavily on subjective judgment. (Consider the task of devising a test that would determine whether or not someone was being pragmatic – the very idea is absurd!) One thing is that what qualifies as “realistic” and “sensible” is clearly subjective. For any group of people, you will find varying degrees of agreement and disagreement depending on who they are and the experiences they’ve had. But the distinction between “practical” and “theoretical” considerations is problematic as well.

Here are some quick considerations to -uh- consider:

#1 laws of nature change
#2 bomb hits data center
#3 cloud providers go down
#4 servers are hacked
#5 unlikely timing of events

I suppose we can agree that #1 is of theoretical interest only. What about the others? If #2 happens, maybe we have more serious problems that your service going down, but it depends on how critical that service is. The same goes for #3. With respect to #4, a lot of people seem surprised when their servers are hacked – as if they hadn’t thought it possible! And #5 depends on whether or not you’d like luck to be a part of your architecture. I’ve been on projects where people would tell me that scenarios I asked about were so unlikely that they were practically impossible. Which to me just means they’ll be hard to debug and reproduce when they happen in production.

The point, though, is that the distinction between “practical” and “theoretical” is pretty much arbitrary. Where do we draw the line? Who’s to say? But it’s important, because mislabeling important considerations – things that affect the quality of software! – as “theoretical” leads to bad software.

So that’s the underlying problem of ambiguity. On to the various forms of abuse!

The first is that pragmatism is often used to present a false dilemma in software development. We love our dichotomies! I could use cheap rhetoric to imply that it traces back to the 0s and 1s of our computers, but let’s not – I have no evidence for such a claim! Luckily I don’t need it either. Suffice it to say that the world is complex, and it’s always very tempting to see things in black and white instead. And so we pit “pragmatism” on the one hand against “dogmatism” on the other, and it’s really important to stay on the right side of that divide! Sometimes we use different words instead, like “practical” vs “theoretical” or “real world” vs “ivory tower”. It all means the same thing: “good” vs “bad”. Which is a big lie, because we’re not making ethical judgments, we’re trying to assess the pros and cons of different solutions to particular problems in concrete contexts. This isn’t The Lord of the Rings.

The consequences of this polarizing are pretty severe. The false dilemma is often used as a self-defense bluff in discussions between team members. The so-called impostor syndrome is rampant in our industry, and so we reach for tools that help us deal with insecurity. One such tool is pragmatism, which can be abused as a magical spell to turn insecurity on its head.

Here’s how it works. Because of the false dilemma, a claim to be pragmatic is implicitly an accusation that says that whoever disagrees is dogmatic: they’re not being sensible, they’re not being realistic, they’re just obsessing over theoretical considerations. So while a statement like “I’m being pragmatic” sounds innocuous enough, it’s really not. It leads to stupid, unrefined, pointless discussions where no knowledge is gained. Instead we’re fighting over who’s good and who’s bad. Polarizing does not make discussions more interesting, it makes them degenerate into banality.

A related strategy is to use pragmatism as a diversion or a smoke bomb, offering the confronted part with an easy exit and effectively ending the discussion. The reason is that it takes a lot of guts and perseverance to call someone’s bluff when they’re claiming to be pragmatic. You might approach your co-worker with a concern like “hey, I was looking at the code, and it seems like we’re blocking in our streaming API, which sort of defeats the purpose of a streaming API in the first place”. It sounds like a valid concern until your co-worker says the magic words “I was just being pragmatic” and vanishes in a puff of smoke, like so:

puffanimate

What we should do instead is accept the complexity we’re faced with and resist the urge to trivialize it. There’s always a need for thinking and discussion, and spurious claims of pragmatism don’t help.

Another problem with pragmatism is that it can be – and is – used as an excuse for sloppy thinking, or no thinking at all. Pragmatism encourages partial “solutions” that work not by reflecting a conceptual solution to a problem, but rather by mimicking correct behavior for various inputs. That way, we can short-circuit the need for design and collaboration. Instead we start with a trivial “happy path” solution and add flags and epicycles to flesh it out into richer behavior as needed. This approach yields software that more or less works for the inputs we’ve tried, and maybe for other inputs as well. (Behavior in the latter case is not as well understood, for obvious reasons.) If we come across inputs that cause problems, we apply patches in the form of additional flags and epicycles.

Because the approach sounds rather dubious when written out like that, we use the magic word “pragmatic” to make it sound better. It’s pragmatic problem-solving. We call the solutions themselves “good enough” solutions – wasting any more time thinking would be gold-plating!  Sometimes we use quotes like “perfect is the enemy of good” as further evidence that we’re doing a good thing – as if the problem we’re facing is too much perfect software in the world!

Here’s an obviously made-up example of this approach:

function square(x) = {
    if (x == 1) then return 1;
    if (x == 2) then return 4;
    if (x == 3) then return 9;
    if (x == 4) then return 15;
    if (x == 5) then return 25;
    // Should never happen.
    return -1;
 }

This is a function that computes the square of the integers 1-5, not by reflecting any understanding of what it means to square a number, but rather by emulating correct behavior. It has a small bug for the input number 4, but that doesn’t matter much, we rarely get that value, and the result isn’t too far off. It’s a perfectly pragmatic solution if you can assume that x will always be in the range 1-5.

A more general solution would be this:

function square(x) = {
    return x * x;
}

Which solves the general case of calculating the square of integers – sort of! Unfortunately, integers themselves are deeply pragmatic. (What happens when x * x is greater than the maximum value for integers?)

But these are all silly examples – theoretical considerations!

So let’s consider something more “real world”. Since software exists and executes in time, software typically needs to take time into account – by registering time stamps for various events and so forth.

How do you handle dates and times in your applications? Are you aware of the related complexities that exist? Do you handle those complexities explicitly? Do you think about how they might affect your application in various ways? Or do you simply close your eyes and hope for the best? Assume that all the systems you integrate with use time stamps from the same time zone? Assume that leap years and leap seconds won’t affect you (that all years have 365 days and all minutes have 60 seconds)? Assume that daylight savings time won’t cause any problems (even though it means that time isn’t linear – depending on your time zone(s), some points in time may not exist, whereas others may exist more than once)? Assume that everyone else around you are making the same assumptions? That’s mighty pragmatic!

Finally, pragmatism is sometimes used to create outright logical contradictions. Pragmatism is about compromise, but some compromises cannot be made without compromising the concept itself! For instance, some architectural properties have principles that cannot be violated while still maintaining the properties – they are simply no longer present due to the violation. (A vegetarian cannot eat meat in the weekends and still be a vegetarian, if you will.) Not even pragmatism can fix this, but that doesn’t stop people from trying!

To illustrate, here’s (a reproduction of) a funny meme I found on the Internet.

batman-rest

I think it’s funny because a lot of people seem to get annoyed when someone points out that their self-proclaimed RESTful APIs aren’t really RESTful because they violate some property of REST or other – typically the hypermedia constraint. People get annoyed because they don’t want to think about that, they’d rather be pragmatic and get on with stuff.

But for some reason they still want to keep that word, REST. Maybe they think it sounds good, or maybe they promised their manager a REST API, I don’t know. It doesn’t matter. They want to keep that word. And so they say “well, maybe it’s not your Ivory Tower REST” (implicitly bad!), “maybe it’s Pragmatic REST instead” (implicitly good!). And then they go on to do something like JSON over HTTP, which is really simple and great, and they can easily deserialize the JSON in their JavaScript, and they’ve practically shipped it already. And when someone comes along and talks about hypermedia being a requirement for REST, they just slap them! Pretty funny!

Here’s another meme. I made this one myself. Unfortunately it’s not funny.

batman-secure

Why isn’t it funny? It looks a lot like the previous one.

The problem is that when someone violates some established principle of security – maybe they decide it’s convenient to store encrypted passwords on their server or to roll their own cryptography – we think it’s a bad idea. And we don’t think it’s a good excuse to say “well, maybe it’s not Ivory Tower Security, maybe it’s Pragmatic Security instead”. We simply don’t agree that it’s very secure at all. So in some sense it’s not funny because the roles have been reversed. Turns out it’s much more funny being Batman than being Robin in this meme. Go figure.

Now we have the strange situation that it’s apparently OK for some words in software to have no meaning (REST), whereas we insist that others do have meaning (secure). For the meaningless words, prefixing “pragmatic” will absolve you from your sins, for the meaningful words, it will not. This is a problem, I think. Who’s to decide which words should have meaning?

Here’s a third meme. I made this one too.

batman-tea

It’s a bit strange, I’ll admit that. But bear with me. It’s the last one, I promise.

What would you say if someone offered you hot water and a biscuit and said “have some pragmatic tea”? Would you be content? Would you pay for a cup of pragmatic tea? Or would you take the route of the dogmatist and argue that it’s not really tea if no tea was involved in the preparation? Well, SLAP you! Crazy tea zealot! Hang out with the other ivory tower hipsters, and have your fancy tea! Who drinks tea anyway?!

At this point you can see I’ve gone absurd – but we’re still just doing variations of the same joke. I didn’t bring the absurdity, it was there all along. The point I’m trying to make is that words do have meaning, whether it’s REST or security or tea. We should respect that meaning instead of using pragmatism as a poor excuse for undermining it. Some properties have principles that cannot be broken while at the same time keeping the properties intact. You can’t have REST without Representational State Transfer, because that’s literally what the acronym means. Secure applications shouldn’t be storing passwords, even if they’re encrypted. Tea should contain tea. (Please don’t get me started on Rooibos.)

I should add that it’s perfectly fine to not care about REST, or to dispute the value of REST – or tea, or security, for that matter. Those are conversations worth having. It’s a free world, and everyone is entitled to choose the properties they care about, based on the context they’re in. Lots of people very vocally don’t care about REST, perhaps even more people than people who know what REST actually is! I have no problem with that. What’s less fine is pretending it has no meaning.

This concludes my attack on pragmatism, at least for now! To reiterate: pragmatism is easily abused because it’s hard to tell if someone is genuinely pragmatic or just claiming to be so. The abuse takes various forms: false dilemma, self-defense bluff, smoke bomb, justification for sloppy thinking and undermining of the meaning of words.

A call for action? Please stop using pragmatism as an excuse for doing sloppy work! And if you find you do need to use the word, please have it be the beginning of a discussion rather than the end of one.