Picture combinators and recursive fish

On February 9th 2017, I was sitting in an auditorium in Krakow, listening to Mary Sheeran and John Hughes give the opening keynote at the Lambda Days conference. It was an inspired and inspiring keynote, that discussed some of the most influential ideas in some of the most interesting papers written on functional programming. You should absolutely check it out.

One of the papers that was mentioned was Functional Geometry by Peter Henderson, written in 1982. In it, Henderson shows a deconstruction of an Escher woodcut called Square Limit, and how he can elegantly reconstruct a replica of the woodcut by using functions as data. He defines a small set of picture combinators – simple functions that operate on picture functions – to form complex pictures out of simple ones.

Escher’s original woodcut looks like this:

escher-square-limit-309.png

Which is pretty much a recursive dream. No wonder Henderson found it so fascinating – any functional programmer would.

As I was listening the keynote, I recalled that I had heard about the paper before, in the legendary SICP lectures by Abelson and Sussman (in lecture 3A, in case you’re interested). I figured it was about time I read the paper first hand. And so I did. Or rather, I read the revised version from 2002, because that’s what I found online.

And of course one thing led to another, and pretty soon I had implemented my own version in F#. Which is sort of why we’re here. Feel free to tag along as I walk through how I implemented it.

A key point in the paper is to distinguish between the capability of rendering some shape within a bounding box onto a screen on the one hand, and the transformation and composition of pictures into more complex pictures on the other. This is, as it were, the essence of decoupling through abstraction.

Our basic building block will be a picture. We will not think of a picture as a collection of colored pixels, but rather as something that is capable of scaling and fitting itself with respect to a bounding box. In other words, we have this:

type Picture : Box -> Shape list

A picture is a function that takes a box and creates a list of shapes for rendering.

What about the box itself? We define it using three vectors a, b and c.

type Vector = { x : float; y : float }
type Box = { a : Vector; b : Vector; c : Vector}

The vector a denotes the offset from the origin to the bottom left corner of the box. The vectors b and c span out the bounding box itself. Each vector is defined by its magnitude in the x and y dimensions.

For example, assume we have a picture F that will produce the letter F when given a bounding box. A rendering might look like this:

basic-f

But if we give F a different box, the rendering will look different too:

skewed-f-image

So, how do we create and render such a magical, self-fitting picture?

We can decompose the problem into three parts: defining the basic shape, transforming the shape with respect to the bounding box, and rendering the final shape.

We start by defining a basic shape relative to the unit square. The unit square has sides of length 1, and we position it such that the bottom left corner is at (0, 0) and top right corner is at (1, 1). Here’s a definition that puts a polygon outlining the F picture inside the unit square:

let fShape = 
  let pts = [ 
    { x = 0.30; y = 0.20 } 
    { x = 0.40; y = 0.20 }
    { x = 0.40; y = 0.45 }
    { x = 0.60; y = 0.45 }
    { x = 0.60; y = 0.55 }
    { x = 0.40; y = 0.55 }
    { x = 0.40; y = 0.70 }
    { x = 0.70; y = 0.70 }
    { x = 0.70; y = 0.80 }
    { x = 0.30; y = 0.80 } ]
  Polygon { points = pts }

To make this basic shape fit the bounding box, we need a mapping function. That’s easy enough to obtain:

let mapper { a = a; b = b; c = c } { x = x; y = y } =
   a + b * x + c * y

The mapper function takes a bounding box and a vector, and produces a new vector adjusted to fit the box. We’ll use partial application to create a suitable map function for a particular box.

As you can see, we’re doing a little bit of vector arithmetic to produce the new vector. We’re adding three vectors: a, the vector obtained by scaling b by x, and the vector obtained by scaling c by y. As we proceed, we’ll need some additional operations as well. We implement them by overloading some operators for the Vector type:

static member (+) ({ x = x1; y = y1 }, { x = x2; y = y2 }) =
    { x = x1 + x2; y = y1 + y2 }

static member (~-) ({ x = x; y = y }) =
    { x = -x; y = -y }

static member (-) (v1, v2) = v1 + (-v2)

static member (*) (f, { x = x; y = y }) =
    { x = f * x; y = f * y }

static member (*) (v, f) = f * v

static member (/) (v, f) = v * (1 / f)

This gives us addition, negation, subtraction, scalar multiplication and scalar division for vectors.

Finally we need to render the shape in some way. It is largely an implementation detail, but we’ll take a look at one possible simplified rendering. The code below can be used to produce an SVG image of polygon shapes using the NGraphics library.

type PolygonShape = { points : Vector list }

type Shape = Polygon of PolygonShape

let mapShape m = function 
  | Polygon { points = pts } ->
    Polygon { points = pts |> List.map m }

let createPicture shapes = 
   fun box ->
     shapes |> List.map (mapShape (mapper box))

let renderSvg width height filename shapes = 
  let size = Size(width, height)
  let canvas = GraphicCanvas(size)
  let p x y = Point(x, height - y) 
  let drawShape = function 
  | Polygon { points = pts } ->
    match pts |> List.map (fun { x = x; y = y } -> p x y) with 
    | startPoint :: t ->
      let move = MoveTo(startPoint) :> PathOp
      let lines = t |> List.map (fun pt -> LineTo(pt) :> PathOp) 
      let close = ClosePath() :> PathOp
      let ops = (move :: lines) @ [ close ] 
      canvas.DrawPath(ops, Pens.Black)
    | _ -> ()
  shapes |> List.iter drawShape
  use writer = new StreamWriter(filename)
  canvas.Graphic.WriteSvg(writer)

When we create the picture, we use the mapShape function to apply our mapping function to all the points in the polygon that makes up the F. The renderSvg is used to do the actual rendering of the shapes produced by the picture function.

Once we have the picture abstraction in place, we can proceed to define combinators that transform or compose pictures. The neat thing is that we can define these combinators without having to worry about the rendering of shapes. In other words, we never have to pry open our abstraction, we will trust it to do the right thing. All our work will be relative, with respect to the bounding boxes.

We start with some basic one-to-one transformations, that is, functions with this type:

type Transformation = Picture -> Picture

The first transformation is turn, which rotates a picture 90 degrees counter-clockwise around its center (that is, around the center of its bounding box).

The effect of turn looks like this:

turn-f

Note that turning four times produces the original picture. We can formulate this as a property:

(turn >> turn >> turn >> turn) p = p

(Of course, for pictures with symmetries, turning twice or even once might be enough to yield a picture equal to the original. But the property above should hold for all pictures.)

The vector arithmetic to turn the bounding box 90 degrees counter-clockwise is as follows:

(a’, b’, c’) = (a + b, c, -b)

And to reiterate: the neat thing is that this is all we need to consider. We define the transformation using nothing but this simple arithmetic. We trust the picture itself to cope with everything else.

In code, we write this:

let turnBox { a = a; b = b; c = c } =
    { a = a + b; b = c; c = -b }

let turn p = turnBox >> p

The overloaded operators we defined above makes it very easy to translate the vector arithmetic into code. It also makes the code very easy to read, and hopefully convince yourself that it does the right thing.

The next transformation is flip, which flips a picture about the center vertical axis of the bounding box.

Which might sound a bit involved, but it’s just this:

flip-f

Flipping twice always produces the same picture, so the following property should hold:

(flip >> flip) p = p

The vector arithmetic is as follows:

(a’, b’, c’) = (a + b, -b, c)

Which translates neatly to:

let flipBox { a = a; b = b; c = c } =
   { a = a + b; b = -b; c = c }

let flip p = flipBox >> p

The third transformation is a bit peculiar, and quite particular to the task of mimicking Escher’s Square Limit, which is what we’re building up to. Henderson called the transformation rot45, but I’ll refer to it as toss, since I think it resembles light-heartedly tossing the picture up in the air:

toss-f

What’s going on here? Its a 45 degree counter-clockwise rotation around top left corner, which also shrinks the bounding box by a factor of √2.

It’s not so easy to define simple properties that should hold for toss. For instance, tossing twice is not the same as turning once. So we won’t even try.

The vector arithmetic is still pretty simple:

(a’, b’, c’) = (a + (b + c) / 2, (b + c) / 2, (c − b) / 2)

And it still translates very directly into code:

let tossBox { a = a; b = b; c = c } =
  let a' = a + (b + c) / 2
  let b' = (b + c) / 2
  let c' = (c − b) / 2
  { a = a'; b = b'; c = c' }

let toss p = tossBox >> p

That’s all the transformations we’ll use. We can of course combine transformations, e.g:

(turn >> turn >> flip >> toss)

Which produces this:

turn-turn-flip-toss

We proceed to compose simple pictures into more complex ones. We define two basic functions for composing pictures, above and beside. The two are quite similar. Both functions take two pictures as arguments; above places the first picture above the second, whereas beside places the first picture to the left of the second.

above-beside.png

Here we see the F placed above the turned F, and the F placed beside the turned F. Notice that each composed picture forms a square, whereas each original picture is placed within a half of that square. What happens is that the bounding box given to the composite picture is split in two, with each original picture receiving one of the split boxes as their bounding box. The example shows an even split, but in general we can assign a fraction of the bounding box to the first argument picture, and the remainder to the second.

For implementation details, we’ll just look at above:

let splitHorizontally f box =
  let top = box |> moveVertically (1. - f) |> scaleVertically f  
  let bottom = box |> scaleVertically (1. - f)
  (top, bottom)

let aboveRatio m n p1 p2 =
  fun box ->
    let f = float m / float (m + n)
    let b1, b2 = splitHorizontally f box
    p1 b1 @ p2 b2

let above = aboveRatio 1 1

There are three things we need to do: work out the fraction of the bounding box assigned to the first picture, split the bounding box in two according to that fraction, and pass the appropriate bounding box to each picture. We “split” the bounding box by creating two new bounding boxes, scaled and moved as appropriate. The mechanics of scaling and moving is implemented as follows:

let scaleVertically s { a = a; b = b; c = c } = 
  { a = a
    b = b 
    c = c * s }

let moveVertically offset { a = a; b = b; c = c } = 
  { a = a + c * offset
    b = b
    c = c }

Now we can create more interesting images, such as this one:

composite-f

Which is made like this:

above (beside (turn (turn (flip p))) (turn (turn p)))
      (beside (flip p) p)

With this, our basic toolset is complete. Now it is time to lose the support wheels and turn our attention to the original task: creating a replica of Henderson’s replica of Escher’s Square Limit!

We start with a basic picture that is somewhat more interesting than the F we have been using so far.

According to the paper, Henderson created his fish from 30 bezier curves. Here is my attempt at recreating it:

Henderson's fish

You’ll notice that the fish violates the boundaries of the unit square. That is, some points on the shape has coordinates that are below zero or above one. This is fine, the picture isn’t really bound by its box, it’s just scaled and positioned relative to it.

We can of course turn, flip and toss the fish as we like.

Henderson's fish (turned, flipped and tossed)

But there’s more to the fish than might be immediately obvious. After all, it’s not just any fish, it’s an Escher fish. An interesting property of the fish is shown if we overlay it with itself turned twice.

We define a combinator over that takes two pictures and places both pictures with respect to the same bounding box. And voila:

overlay-fish

As we can see, the fish is designed so that it fits together neatly with itself. And it doesn’t stop there.

The t tile

This shows the tile t, which is one of the building blocks we’ll use to construct Square Limit. The function ttile creates a t tile when given a picture:

let ttile f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over fishN fishE)

Here we see why we needed the toss transformation defined earlier, and begin to appreciate the ingenious design of the fish.

The second building block we’ll need is called tile u. It looks like this:

The u tile

And we construct it like this:

let utile (f : Picture) = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over fishN fishW)
       (over fishE fishS)

To compose the Square Limit itself, we observe that we can construct it from nine tiles organized in a 3×3 grid. We define a helper function nonet that takes nine pictures as arguments and lays them out top to bottom, left to right. Calling nonet with pictures of the letters H, E, N, D, E, R, S, O, N produces this result:

H-E-N-D-E-R-S-O-N

The code for nonet looks like this:

let nonet p q r s t u v w x =
  aboveRatio 1 2 (besideRatio 1 2 p (beside q r))
                 (above (besideRatio 1 2 s (beside t u))
                        (besideRatio 1 2 v (beside w x)))

Now we just need to figure out the appropriate pictures to pass to nonet to produce the Square Limit replica.

The center tile is the easiest: it is simply the tile u that we have already constructed. In addition, we’ll need a side tile and a corner tile. Each of those will be used four times, with the turn transformation applied 0 to 3 times.

Both side and corner have a self-similar, recursive nature. We can think of both tiles as consisting of nested 2×2 grids. Similarly to nonet, we define a function quartet to construct such grids out of four pictures:

let quartet p q r s = above (beside p q) (beside r s)

What should we use to fill our quartets? Well, first off, we need a base case to terminate the recursion. To help us do so, we’ll use a degenerate picture blank that produces nothing when given a bounding box.

We’ll discuss side first since it is the simplest of the two, and also because corner uses side. The base case should look like this:

side 1 fish

For the recursive case, we’ll want self-similar copies of the side-tile in the top row instead of blank pictures. So the case one step removed from the base case should look like this:

side 2 fish

The following code helps us construct sides of arbitrary depth:

let rec side n p = 
  let s = if n = 1 then blank else side (n - 1) p
  let t = ttile p
  quartet s s (t |> turn) t

This gives us the side tile that should be used as the “north” tile in the nonet function. We obtain “west”, “south” and “east” as well by turning it around once, twice or thrice.

Creating a corner is quite similar to creating a side. The base case should be a quartet consisting of three blank pictures, and a u tile for the final, bottom right picture. It should look like this:

corner 1 fish

The recursive case should use self-similar copies of both the corner tile (for the top left or “north-west” picture) and the side tile (for the top right and bottom left pictures), while keeping the u tile for the bottom right tile.

corner 2 fish

Here’s how we can write it in code:

let rec corner n p = 
  let c, s = if n = 1 then blank, blank 
             else corner (n - 1) p, side (n - 1) p
  let u = utile p
  quartet c s (s |> turn) u

This gives us the top left corner for our nonet function, and of course we can produce the remaining corners by turning it a number of times.

Putting everything together, we have:

let squareLimit n picture =
  let cornerNW = corner n picture
  let cornerSW = turn cornerNW
  let cornerSE = turn cornerSW
  let cornerNE = turn cornerSE
  let sideN = side n picture
  let sideW = turn sideN
  let sideS = turn sideW
  let sideE = turn sideS
  let center = utile picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

Calling squareLimit 3 fish produces the following image:

sqlimit-3.png

Which is a pretty good replica of Henderson’s replica of Escher’s Square Limit, to a depth of 3. Sweet!

Misson accomplished? Are we done?

Sort of, I suppose. I mean, we could be.

However, if you take a look directly at Escher’s woodcut (or, more likely, the photos of it that you can find online), you’ll notice a couple of things. 1) Henderson’s basic fish looks a bit different from Escher’s basic fish. 2) Escher’s basic fish comes in three hues: white, grey and black, whereas Henderson just has a white one. So it would be nice to address those issues.

Here’s what I came up with.

escher-fish-all.png

To support different hues of the same fish requires a bit of thinking – we can’t just follow Henderson’s instructions any more. But we can use exactly the same approach! In addition to transforming the shape of the picture, we need to be able to transform the coloring of the picture. For this, we introduce a new abstraction, that we will call a Lens.

type Hue = Blackish | Greyish | Whiteish

type Lens = Box * Hue

We redefine a picture to accept a lens instead of just a box. That way, the picture can take the hue (that is, the coloring) into account when figuring out what to draw. Now we can define a new combinator rehue that changes the hue given to a picture:

let rehue p =
  let change = function
  | Blackish -> Greyish
  | Greyish -> Whiteish
  | Whiteish -> Blackish
  fun (box, hue) -> p (box, change hue)

Changing hue three times takes us back to the original hue:

(rehue >> rehue >> rehue) p = p

We need to revise the tiles we used to construct the Square Limit to incorporate the rehue combinator. It turns out we need to create two variants of the t tile.

ttile-shades

But of course it’s just the same old t tile with appropriate calls to rehue for each fish:

let ttile hueN hueE f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over (fishN |> hueN)
                (fishE |> hueE))

let ttile1 = ttile rehue (rehue >> rehue)

let ttile2 = ttile (rehue >> rehue) rehue

For the u tile, we need three variants:

utiles.png

Again, we just call rehue to varying degrees for each fish.

let utile hueN hueW hueS hueE f = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over (fishN |> hueN) (fishW |> hueW))
       (over (fishE |> hueE) (fishS |> hueS))

let utile1 = 
  utile (rehue >> rehue) id (rehue >> rehue) id

let utile2 = 
  utile id (rehue >> rehue) rehue (rehue >> rehue)

let utile3 = 
  utile (rehue >> rehue) id rehue id 

We use the two variants of the t tile in two side functions, one for the “north” and “south” side, another for the “west” and “east” side.

let side tt hueSW hueSE n p = 
  let rec aux n p =
    let t = tt p
    let r = if n = 1 then blank else aux (n - 1) p
    quartet r r (t |> turn |> hueSW) (t |> hueSE)
  aux n p

let side1 =
  side ttile1 id rehue 

let side2 =
  side ttile2 (rehue >> rehue) rehue

We define two corner functions as well, one for the “north-west” and “south-east” corner, another for the “north-east” and the “south-west” corner.

let corner ut sideNE sideSW n p = 
  let rec aux n p = 
    let c, ne, sw = 
      if n = 1 then blank, blank, blank 
               else aux (n - 1) p, sideNE (n - 1) p, sideSW (n - 1) p
    let u = ut p
    quartet c ne (sw |> turn) u
  aux n p 

let corner1 = 
  corner utile3 side1 side2

let corner2 = 
  corner utile2 side2 side1

Now we can write an updated squareLimit that uses our new tile functions.

let squareLimit n picture =
  let cornerNW = corner1 n picture
  let cornerSW = corner2 n picture |> turn
  let cornerSE = cornerNW |> turn |> turn
  let cornerNE = cornerSW |> turn |> turn
  let sideN = side1 n picture
  let sideW = side2 n picture |> turn
  let sideS = sideN |> turn |> turn
  let sideE = sideW |> turn |> turn
  let center = utile1 picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

Was it worth it? I think so. Calling squareLimit 5 fish produces the following image:

square-limit-shades-level-5.png

Which is a pretty good replica of Escher’s Square Limit, to a depth of 5.

The complete code is here.


Donkey code

This is an attempt to write down a very simple example I’ve been using to explain the profound impact the language we use has on thought, discussion and ultimately code.

Imagine you have a computer system, and that you’re one of the programmers working on that system (not too hard, is it?). The system is called, oh I don’t know, eQuest. It has to do with horses. So it typically works with entities of this kind:

horse-only

eQuest is a tremendous success for whatever reason, perhaps there’s very little competition. But it is a success, and so it’s evolving, and one day your product owner comes up with the idea to expand to handle entities of this kind as well:

donkey-only

It’s a new kind of horse! It’s mostly like the other horses and so lots of functionality can easily be reused. However, it has some special characteristics, and must be treated a little differently in some respects. Physically it is quite short, but very strong. Behaviour-wise, it is known to be stubborn, intelligent and not easily startled. It’s an interesting kind of horse.  It also likes carrots a lot (but then don’t all horses?). Needless to say, there will be some adjustments to some of the modules.

Design meetings ensue, to flesh out the new functionality and figure out the correct adjustments to be made to eQuest. Discussions go pretty well. Everyone has heard of these “horses that are small and stubborn” as they’re called. (Some rumors indicate that genetically they’re not actually horses at all – apparently there are differences at the DNA level, but the real world is always riddled with such technicalities. From a pragmatic viewpoint, they’re certainly horses. Albeit short and stubborn, of course. And strong, too.) So it’s not that hard to discuss features that apply to the new kind of horse.

There is now a tendency for confusion when discussing other kinds of changes to the product, though. The unqualified term “horse” is obviously used quite a bit in all kinds of discussions, but sometimes the special short and stubborn kind is meant to be included and sometimes it is not. So to clarify, you and your co-workers find yourself saying things like “any horse” to mean the former and “regular horse”, “ordinary horse”, “old horse”, “horse-horse” or even “horse that’s not small and stubborn” to mean the latter.

To implement support for the new horse in eQuest, you need some way of distinguish between it and an ordinary horse-horse. So you add an IsShort property to your Horse data representation. That’s easy, it’s just a derived property from the Height property. No changes to the database schema or anything. In addition, you add an IsStubborn property and checkbox to the registration form for horses in eQuest. That’s a new flag in the database, but that’s OK. With that in place, you have everything you need to implement the new functionality and make the necessary adjustments otherwise.

Although much of the code applies to horses and short, stubborn horses alike, you find that the transport module, the feeding module, the training module and the breeding module all need a few adjustments, since the new horses aren’t quite like the regular horses in all respects. You need to inject little bits of logic here, split some cases in two there. It takes a few different forms, and you and your co-workers do things a bit differently. Sometimes you employ if-branches with logic that looks like this:

if (horse.IsShort && horse.IsStubborn) {
  // Logic for the new horse case.
}
else {
  // Regular horse code here.
}

Other times you go fancy with LINQ:

var newHorses = horses.Where(h => h.IsShort && h.IsStubborn);
var oldHorses = horses.Except(newHorses);
foreach (var h in newHorses) {
  // New horse logic.
}
foreach (var h in oldHorses) {
  // Old horse logic.
}

And that appears to work pretty well, and you go live with support for short and stubborn horses.

Next day, you have a couple of new bug reports, one in the training module and two concerning the feeding module. It turns out that some of the regular horses are short and stubborn too, so your users would register short regular horses, tick the stubborn checkbox, and erroneously get new horse logic instead of the appropriate horse-horse logic. That’s awkward, but understandable. So you call a few meetings, discuss the issue with your fellow programmers, scratch your head, talk to a UX designer and your product owner. And you figure out that not only are the new horses short and stubborn, they make a distinct sound too. They don’t neigh the way regular horses do, they hee-haw instead.

So you fix the bug. A new property on horse, Sound, with values Neigh and HeeHaw, and updates to logic as appropriate. No biggie.

In design meetings, most people still use the term “horse that’s short and stubborn” to mean the new kind of horse, even though you’re encouraging people to include the sound they make as well, or even just say “hee-hawing horse”. But apart from this nit-picking from your side, things proceed well. It appears that most bugs have been ironed out, and your product owner is happy. So happy, in fact, that there is a new meeting. eQuest is expanding further, to handle entities of this kind as well:

mule-only.png

What is it? Well, it’s the offspring from a horse and a horse that’s short and stubborn and says hee-haw. It shares some properties with the former kind of horse and some with the latter, so obviously there will be much reuse! And a few customizations and adjustments unique for this new new kind of horse. At this point you’re getting worried. You sense there will be trouble if you can’t speak clearly about this horse, so you cry out “let’s call it a half hee-haw!” But it doesn’t catch on. Talking about things is getting a bit cumbersome.

“But at least I can still implement it,” you think for yourself. “And I can mostly guess what kind of horses the UX people are thinking about when they say ‘horse’ anyway, I’ll just map it in my head to the right thing”. You add a Sire and a Dam property to Horse. And you proceed to update existing logic and write new logic.

You now have code that looks like this:

if (horse.IsShort && 
    horse.IsStubborn && 
    horse.Sound == Sound.HeeHaw) || 
   (horse.Sire.IsShort && 
    horse.Sire.IsStubborn && 
    horse.Sire.Sound == Sound.HeeHaw) ||
   (horse.Dam.IsShort && 
    horse.Dam.IsStubborn && 
    horse.Dam.Sound == Sound.HeeHaw)) {
  // Logic for both the new horse and the new-new horse!
}
else {
  // Really regular horse code here.
}

Which turns out to be wrong, since the new new horse doesn’t really neigh or hee-haw, it does something in-between. There is no word for it, so you invent one: the neigh-haw. You extend the Sound enumeration to incorporate it and fix your code.

Getting all the edge cases right takes a while. Your product owner is starting to wonder why development is slowing down, when so much of the functionality can be reused. You mumble something about technical debt. But you manage to get the bug count down to acceptable levels, much thanks to diligent testing.

At this point, there is another meeting. You are shown two photographs of horses of the newest kind. Or so you think. The product owner smiles. “They’re almost identical, but not quite!” he says. “You see, this one has a horse as a mother and a short stubborn horse as a father.” You see where this is going. “But this one, this one has a short stubborn horse as a mother and a horse as a father.” “Does it matter?” you ask. “This last one is always sterile,” he says. “So you need to handle that in the breeding module.” Oh.

“And then there’s this.”

zeedonk-pretty-cut-245.png

The point of this example is that it takes very little for software development to get crippled by complexity without precise language. You need words for the things you want to talk about, both in design discussions and in code. Without them, it becomes very difficult to have meaningful communication, and the inability to articulate a thought precisely is made manifest in the code itself. Your task quickly turns from implementing new functionality to making sure that you don’t break existing functionality.

The example is special in that the missing words are sort of jumping out at you. It’s so obvious what they should be. This is not always the case. In many domains, it can be much, much harder to figure out what the words should be. It’s likely to require a lot of time and effort, and include frustrating and heated discussions with people who think differently than you. You might find that you and your team have to invent new words to represent the concepts that are evolving in your collective mind. But it can also be that you’ve all become accustomed to the set of words you’re currently using, and gone blind to the donkeys in your system.


Strings with assumptions

TL;DR Strings always come with strings attached.

I had a little rant about strings on Twitter the other day. It started like this:

This blog post is essentially the same rant, with a bit of extra cheese.

Here’s the thing: I find that most code bases I encounter are unapologetically littered with strings. Strings are used to hold values of all kinds of kinds, from customer names to phone numbers to XML and JSON structures and what have you. As such, strings are incredibly versatile and flexible; properties we tend to think of as positive when we talk about code. So why do I hate strings?

Well, the problem is that we don’t want our types to be flexible like that – as in “accepting of all values”. In fact, the whole point of types is to avoid this flexibility! Types are about restricting the number of possible values in your program, to make it easier to reason about. You want to allow exactly the legal values, and to forbid all the illegal values. String restricts nothing! String is essentially object! But people who have the decency to refrain from using object will still gladly use string all over the place. It’s weird. And dangerous. That’s why we should never give in to the temptation to escape from the type system by submerging our values in the untyped sea of string. When that value resurfaces sometime later on, we’ll effectively be attempting a downcast from object back to the actual type. Will it succeed? Let’s hope so!

So to be very explicit about it: if you have a string in your program, it could be anything – anything! You’re communicating to the computer that you’re willing to accept any and all of the following fine string specimen as data in your program:

Your program does not distinguish between them, they’re all equally good. When you declare a string in your program, you’re literally saying that I’m willing to accept – I’m expecting! – any and all of those as a value. (I should point out that you’re expecting very big strings too, but I didn’t feel like putting any of them in here, because they’re so unwieldy. Not to mention the door is open to that mirage doppelganger of a string, null, as well – but that’s a general problem, not limited to string.)

Of course, we never want this. This is never what the programmer intends. Instead, the programmer has assumptions in their head, that the string value should really be drawn from a very small subset of the entire domain of strings, a subset that fits the programmer’s purpose. Common assumptions include “not terribly big”, “as large as names get”, “reasonable”, “benign”, “as big as the input field in the form that should provide the value”, “similar to values we’ve seen before”, “some format parsable as a date”, “a number”, “fits the limit of the database column that’s used to persist the value”, “well-formed XML”, “matching some regular expression pattern or other” and so on and so forth. I’m sure you can come up with some additional ones as well.

The assumptions might not be explicitly articulated anywhere, but they’re still there. If the assumptions are implicit, what we have is basically a modelling issue that the programmer hasn’t bothered to tackle explicitly yet. It is modelling debt. So whenever you see string in a program, you’re really seeing “string with assumptions”, with the caveats that the assumptions may not be terribly well defined and there may or may not be attempts to have them enforced. In other words, we can’t trust that the assumptions hold. This is a problem.

So what should we do instead? We can’t realistically eradicate strings from our programs altogether. For instance, we do need to be able to speak string at the edges of our programs. Quite often, we need to use strings to exchange data with others, or to persist values in a database. This is fine. But we can minimize the time we allow strings to be “raw”, without enforced assumptions. As soon as we can, we should make our assumptions explicit – even though that means we might need to spend a little time articulating and modelling those assumptions. (That’s a bonus by the way, not a drawback.) We should never allow a string pass unchecked through any part of our system. An unchecked string is Schrodinger’s time bomb. You don’t know if it will explode or not until you try to use it. If it turns out your string is a bomb, the impact may vary from the inconvenient to the embarrassing to the catastrophic.

Unsurprisingly, the good people who care about security (which should be all of us!) find strings with assumptions particularly interesting. Why? Because security bugs can be found precisely where assumptions are broken. In particular, since the string type allows for any string, the scene is set for “Houdini strings” to try to escape the cage where they’re held as data, and break free into the realm of code.

To make our assumptions explicit, we need to use types that are not strings. But it’s perfectly fine for them to carry strings along. Here’s a class to represent a phone number in C#:

Nothing clever, perfectly mundane. You create your PhoneNumber and use it whenever you’d use “string with assumption: valid phone number”. As you can see, the class does nothing more than hold on to a string value, but it does make sure that the string belongs to that small subset of strings that happen to be valid phone numbers as well. It will reject all the other strings. When you need to speak string (at the edges of your program, you just never do it internally), you call ToString() and shed the protection of your type – but at least at that point you know you have a valid phone number.

So it’s not difficult. So why do we keep littering our programs with strings with assumptions?


Thinging names

The other night I made a tweet. It was this:

Programmers are always chasing proximate causes. This is why naming things is considered hard, not finding the right abstractions to name.

And I meant something by that, but what? I got some responses that indicated that some people interpreted it differently than I intended, so evidently it’s not crystal clear. I can see why, too. Like most tweets, it is lacking in at least two ways: it lacks context and it lacks precision. (Incidentally this is why I write “I made a tweet”, much like I’d write “I made a mistake”.) Of course, tweets are prone to these shortcomings, and it takes special talent and a gift for brevity to avoid them. Alas, as the poor reader may have noticed, it is a gift I don’t possess – that much is evident from this paragraph alone!

Therefore, I’m making this attempt at a long-winded deliberation of what I tried to express – that should better suit my talents. It turns out I was even stupid enough to try to say two or even three things at once, which is surely hubris and the death of pithy tweets. First, I was trying to make a rather bold general claim about programmers: that we tend to chase proximate causes rather than ultimate ones. Second, I said that programmers often talk about how hard a problem naming things is, but that instead we should be worried about choosing the appropriate abstractions to name in the first place. And third, I implied that the latter is a particular instance of the former.

So, let’s see if I can clarify and justify what I mean by all these things.

A bit of context first – where does this come from? I’ve been increasingly preoccupied with domain modelling lately, so the tweet ideally should be interpreted with that in mind. I’m absolutely convinced that the only way we can succeed with non-trivial software projects is by working domain-driven. The work we do must reflect insight that we arrive at by talking to users and domain experts and thinking really hard about the problem domain. Otherwise we go blind – and although we might be going at high velocity, we’ll quite simply miss our target and get lost. In the words of Eric Evans, we need to do knowledge crunching to develop a deep model and keep refactoring towards greater insight to ensure that the software 1) solves the current problem and 2) can co-evolve with the business. This is the primary concern. Everything else is secondary, including all the so-called “best practices” you might be employing. Kudos to you, but your craftmanship really is nothing unless it’s applied to the domain.

I think that many of the things we struggle with as programmers are ultimately caused by inadequate domain modelling. Unfortunately, we’re not very good at admitting that to ourselves. Instead, we double and triple our efforts at chasing proximate causes. We keep our code squeaky clean. We do TDD. We program to interfaces and inject our dependencies. This is all very well and good, but it has limited effect, because we’re treating the symptoms rather than curing the disease. In fact, I’ve made drive-by tweets about SOLID (with similar lack of context and precision) that hint at the same thing. Why? Because I think that SOLID is insufficient to ensure a sensible design. It’s not that SOLID is bad advice, it’s just that it deals with secondary rather than primary causes and hence has too little leverage to fix the issues that matter. Even if you assume that SOLID will expose all your modelling inadequacies as design smells and implementation pains, fixing the problem that way is inefficient at best.

So that was a bit of context. Now for precision. The term “naming things” is incredible vague in and of itself, so to make any sense of the tweet, I should qualify what I meant by that. The term stems from a famous quote by Phil Karlton, which goes like this:

There are only two hard problems in Computer Science: cache invalidation and naming things.

Unfortunately I don’t know much about Phil Karlton, except that he was at Netscape when Netscape mattered, and that he obviously had the gift of brevity that I lack.

What are “things” though? I don’t know what Phil Karlton had in mind, but for the purposes of my tweet, I was thinking about “code things”, things like classes and methods. Naming such things appropriately is important, since we rely on those names when we abstract away from the details of implementation. But it shouldn’t be hard to name them! If it is hard, it is because we’re doing something wrong – it is a symptom that the very thing we’re naming has problems and should probably not exist. This is why I think that addressing the “naming things” problem is dealing with proximate causes. Naming things is hard because of the ultimate problem of inadequate domain modelling.

Of course it is quite possible to think of different “things” when you speak of “naming things”, in particular “things” in the domain. In that case, tackling the problem of naming things really is dealing with ultimate causes! This is the most important activity in domain-driven design! And with that interpretation, my tweet is completely nonsensical, since naming things in the domain and finding the right domain abstractions become one and the same. (Ironically, this all goes to show that “naming things” interpreted as “describing things with words” certainly is problematic!)

So where does this leave us? To summarize, I think that domain names should precede code things. This is really just another way of stating that we need the model to drive the implementation. We should concentrate our effort on coming up with the right concepts to embody in code, rather than writing chunks code and coming up with names for them afterwards. Making things from names (“thinging names”) is easy. Making names from things (“naming things”), however, can be hard.