Donkey code

This is an attempt to write down a very simple example I’ve been using to explain the profound impact the language we use has on thought, discussion and ultimately code.

Imagine you have a computer system, and that you’re one of the programmers working on that system (not too hard, is it?). The system is called, oh I don’t know, eQuest. It has to do with horses. So it typically works with entities of this kind:

[Image: a horse]

eQuest is a tremendous success for whatever reason, perhaps there’s very little competition. But it is a success, and so it’s evolving, and one day your product owner comes up with the idea to expand to handle entities of this kind as well:

[Image: a donkey]

It’s a new kind of horse! It’s mostly like the other horses and so lots of functionality can easily be reused. However, it has some special characteristics, and must be treated a little differently in some respects. Physically it is quite short, but very strong. Behaviour-wise, it is known to be stubborn, intelligent and not easily startled. It’s an interesting kind of horse.  It also likes carrots a lot (but then don’t all horses?). Needless to say, there will be some adjustments to some of the modules.

Design meetings ensue, to flesh out the new functionality and figure out the correct adjustments to be made to eQuest. Discussions go pretty well. Everyone has heard of these “horses that are small and stubborn” as they’re called. (Some rumors indicate that genetically they’re not actually horses at all – apparently there are differences at the DNA level, but the real world is always riddled with such technicalities. From a pragmatic viewpoint, they’re certainly horses. Albeit short and stubborn, of course. And strong, too.) So it’s not that hard to discuss features that apply to the new kind of horse.

There is now a tendency for confusion when discussing other kinds of changes to the product, though. The unqualified term “horse” is obviously used quite a bit in all kinds of discussions, but sometimes the special short and stubborn kind is meant to be included and sometimes it is not. So to clarify, you and your co-workers find yourself saying things like “any horse” to mean the former and “regular horse”, “ordinary horse”, “old horse”, “horse-horse” or even “horse that’s not small and stubborn” to mean the latter.

To implement support for the new horse in eQuest, you need some way of distinguishing between it and an ordinary horse-horse. So you add an IsShort property to your Horse data representation. That’s easy, it’s just a property derived from the Height property. No changes to the database schema or anything. In addition, you add an IsStubborn property and checkbox to the registration form for horses in eQuest. That’s a new flag in the database, but that’s OK. With that in place, you have everything you need to implement the new functionality and make the necessary adjustments otherwise.
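
In code, the Horse might end up looking something like this – a sketch, where the exact height threshold is of course a stand-in:

public class Horse {
  public double Height { get; set; }

  // New flag, backed by a new column in the database.
  public bool IsStubborn { get; set; }

  // Derived from Height; the threshold here is just a stand-in.
  public bool IsShort {
    get { return Height < 1.5; }
  }
}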

Although much of the code applies to horses and short, stubborn horses alike, you find that the transport module, the feeding module, the training module and the breeding module all need a few adjustments, since the new horses aren’t quite like the regular horses in all respects. You need to inject little bits of logic here, split some cases in two there. It takes a few different forms, and you and your co-workers do things a bit differently. Sometimes you employ if-branches with logic that looks like this:

if (horse.IsShort && horse.IsStubborn) {
  // Logic for the new horse case.
}
else {
  // Regular horse code here.
}

Other times you go fancy with LINQ:

var newHorses = horses.Where(h => h.IsShort && h.IsStubborn);
var oldHorses = horses.Except(newHorses);
foreach (var h in newHorses) {
  // New horse logic.
}
foreach (var h in oldHorses) {
  // Old horse logic.
}

And that appears to work pretty well, and you go live with support for short and stubborn horses.

Next day, you have a couple of new bug reports, one in the training module and two concerning the feeding module. It turns out that some of the regular horses are short and stubborn too, so your users would register short regular horses, tick the stubborn checkbox, and erroneously get new horse logic instead of the appropriate horse-horse logic. That’s awkward, but understandable. So you call a few meetings, discuss the issue with your fellow programmers, scratch your head, talk to a UX designer and your product owner. And you figure out that not only are the new horses short and stubborn, they make a distinct sound too. They don’t neigh the way regular horses do, they hee-haw instead.

So you fix the bug. A new property on Horse, Sound, with values Neigh and HeeHaw, and updates to logic as appropriate. No biggie.

In design meetings, most people still use the term “horse that’s short and stubborn” to mean the new kind of horse, even though you’re encouraging people to include the sound they make as well, or even just say “hee-hawing horse”. But apart from this nit-picking from your side, things proceed well. It appears that most bugs have been ironed out, and your product owner is happy. So happy, in fact, that there is a new meeting. eQuest is expanding further, to handle entities of this kind as well:

[Image: a mule]

What is it? Well, it’s the offspring from a horse and a horse that’s short and stubborn and says hee-haw. It shares some properties with the former kind of horse and some with the latter, so obviously there will be much reuse! And a few customizations and adjustments unique for this new new kind of horse. At this point you’re getting worried. You sense there will be trouble if you can’t speak clearly about this horse, so you cry out “let’s call it a half hee-haw!” But it doesn’t catch on. Talking about things is getting a bit cumbersome.

“But at least I can still implement it,” you think to yourself. “And I can mostly guess what kind of horses the UX people are thinking about when they say ‘horse’ anyway, I’ll just map it in my head to the right thing”. You add a Sire and a Dam property to Horse. And you proceed to update existing logic and write new logic.

You now have code that looks like this:

if ((horse.IsShort && 
    horse.IsStubborn && 
    horse.Sound == Sound.HeeHaw) || 
   (horse.Sire.IsShort && 
    horse.Sire.IsStubborn && 
    horse.Sire.Sound == Sound.HeeHaw) ||
   (horse.Dam.IsShort && 
    horse.Dam.IsStubborn && 
    horse.Dam.Sound == Sound.HeeHaw)) {
  // Logic for both the new horse and the new-new horse!
}
else {
  // Really regular horse code here.
}

Which turns out to be wrong, since the new new horse doesn’t really neigh or hee-haw, it does something in-between. There is no word for it, so you invent one: the neigh-haw. You extend the Sound enumeration to incorporate it and fix your code.

Getting all the edge cases right takes a while. Your product owner is starting to wonder why development is slowing down, when so much of the functionality can be reused. You mumble something about technical debt. But you manage to get the bug count down to acceptable levels, much thanks to diligent testing.

At this point, there is another meeting. You are shown two photographs of horses of the newest kind. Or so you think. The product owner smiles. “They’re almost identical, but not quite!” he says. “You see, this one has a horse as a mother and a short stubborn horse as a father.” You see where this is going. “But this one, this one has a short stubborn horse as a mother and a horse as a father.” “Does it matter?” you ask. “This last one is always sterile,” he says. “So you need to handle that in the breeding module.” Oh.

“And then there’s this.”

[Image: a zeedonk]

The point of this example is that it takes very little for software development to get crippled by complexity without precise language. You need words for the things you want to talk about, both in design discussions and in code. Without them, it becomes very difficult to have meaningful communication, and the inability to articulate a thought precisely is made manifest in the code itself. Your task quickly turns from implementing new functionality to making sure that you don’t break existing functionality.

The example is special in that the missing words are sort of jumping out at you. It’s so obvious what they should be. This is not always the case. In many domains, it can be much, much harder to figure out what the words should be. It’s likely to require a lot of time and effort, and include frustrating and heated discussions with people who think differently than you. You might find that you and your team have to invent new words to represent the concepts that are evolving in your collective mind. But it can also be that you’ve all become accustomed to the set of words you’re currently using, and gone blind to the donkeys in your system.


Something for nothing

I thought I’d jot down some fairly obvious things about values in programs.

Say you have some value in your program. For instance, it could be a String, or a Thing. Then conceptually, each String belongs to the set of possible Strings, and each Thing belongs to the set of possible Things. Right?

Like so:

[Image: the set of possible Strings and the set of possible Things]

Of course, you might even have something like a function from String to Thing or the other way around. That’s no different, they’re just values that belong to some set of possible values. Hard to draw, though.

In programs, this notion of sets of possible values is baked into the concept of types. So instead of saying that some value belongs to the set of possible Things, we say that it has type Thing or is of type Thing.

I wish that was all there was to it, but alas, it gets more complicated. Not only do we want to represent the presence of values in our programs, sometimes we want to represent the absence of values as well. The absence of a value isn’t necessarily an error. It could be, but it could be fine, too. There are many valid reasons why we might end up with absences flowing through our programs. So we need to represent them.

This is where null enters the picture.

In languages like C# and Java – any object-oriented language that carries DNA from Tony Hoare’s ALGOL W – the drawing of the sets above doesn’t map directly over to types. For each so-called reference type (object values that are accessed by means of references), there’s a twist. In addition to the set of possible values, each reference type also allows for the value null:

[Image: the same sets, each with null added]

It looks pretty similar, but the consequences for program semantics are significant.

The purpose of null is, of course, to represent the absence of a value of a given type. But now your type identifies a pretty weird set of possible values. In the case of Thing, for instance, you have all the legitimate actual Things, but also this weird thing that represents the absence of a Thing. So it’s decidedly not a Thing as such, yet it is of type Thing. It’s bizarre.

But it’s not just confusing to think about. It causes practical problems since null just doesn’t fit in. It’s a phony – a hologram that successfully fools the compiler, which is unable to distinguish between null and proper values of a given type. It’s nothing like the other values. Hence you need to think about it and worry about it all the time. Since it’s not really there, you can’t treat it like a proper value. You most decidedly can not invoke a method on it, which is sort of what you do with objects. The interpretation of null is radically different from the interpretation of all the other values of the same type. (Interestingly, it’s different in precisely the same way for all types: how null sticks out from legit Things mirrors how it sticks out from legit Strings and everything else.)

But it gets worse. Once you’ve invited null into your home, there’s no way of getting rid of it! In other words, when you make null part of, say, the Thing type, you can no longer express the idea “one of the actual, legit Things, not that spectral special Thing”. There is no way you can say explicitly in your program that you know that a value is present. It’s all anecdotes and circumstance. You can obviously take a look at some value at any given time in your program and decide whether it’s a legit Thing or just an illusion, but it’s completely ephemeral. You’ve given your type system a sort of brain damage that prevents it from forming memories about absence and presence of values: you might check for null, but your program immediately forgets about it!

So much for reference types. What about so-called primitive values, like integers and booleans? Neither can be null in C# or Java. So the question is: lacking null, how do we represent the absence of a value?

Well, one hack is to think “Gee, I’m not really going to use all the possible values, so I can take one of them and pretend it’s sort of like null.” So instead of interpreting the value literally, you override the interpretation for some magic values. (Using -1 as a special value for integers is a classic, in the case where your legit values are non-negative.) The consequence is that you now have two kinds of values inside your type, operating at different semantic levels and being used for different purposes. Your set of values isn’t just a set of values anymore – it’s a conglomerate of conceptually different things.
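
This is exactly what Array.IndexOf does in .NET, for instance:

int[] primes = { 2, 3, 5, 7 };
int index = Array.IndexOf(primes, 4);
// index is -1: a perfectly legit int, hijacked to mean "not found".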

This leaves us in a situation that’s similar to the reference type situation, except it’s ad-hoc, convention-based at best. In both cases, we have two things we’re trying to express. One thing is the presence or absence of a value. And the other thing is the set of possible values that value belongs to. These are distinct, orthogonal concerns. And usually, the word of wisdom in programming is to let distinct things be distinct, and avoid mixing things up.

So how can we do that? If we reject the temptation to put special values with special interpretation rules into the same bag as the legit values, what can we do?

If you’re a C# programmer, you’re thinking that you can use Nullable to represent the absence of a primitive value instead. This is true, and you should. Nullable is a wrapper around a primitive type, creating a new type that can have either null or a legit instance of the primitive type as value. On top of that, the compiler works pretty hard to blur the line between the underlying type and its Nullable counterpart, offering special syntax and implicit type conversion from T to Nullable<T>.
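
So you can write things like this:

int? age = null;   // int? is special syntax for Nullable<int>
age = 42;          // implicit conversion from int to Nullable<int>
if (age.HasValue) {
  Console.WriteLine(age.Value);
}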

In a language with sum types, we can do something similar to Nullable, but in a completely generic way. What is a sum type? It is a composite type created by combining various classes of things into a single thing. Here’s an example:

type Utterance = Fee | Fie | Foe | Fum

So it’s sort of like an enumeration of variants. But it’s a rich man’s enumeration, since you can associate a variant with a value of another type. Like so:

type ThingOrNothing = Indeed of Thing | Nothing

This creates a type that distinguishes neatly between legit Things and the absence of a Thing. Either you’ll indeed have a Thing, or you’ll have nothing. And since absence stands to presence in the same way for all types, we can generalize it:

type Mayhaps<'T> = Indeed of 'T | Nothing

The nice thing is that we can now distinguish between the case where we might have a value or not and the case where we know we do have a value (provided our language doesn’t allow null, of course). In other words, we can have a function of type Mayhaps<Thing> -> Thing. This is a big deal! We can distinguish between the parts of our program that have to worry about absent values and the parts that don’t. We’ve fixed the brain damage, our program can form memories of the checks we’ve made. It’s a beautiful feat of surgery, enabled by sum types and the absence of null.

So sum types neatly solve this problem of representing absence and presence of values, on top of, and orthogonally to, the issue of defining a set of possible values for a given shape of data. What’s more, this is just one application of a general feature of the language. There is no need to complicate the language itself with special handling of absence. Instead, you’re likely to find something like Mayhaps in the standard library. In F# it’s called Option, in Elm it’s called Maybe. In languages like F# and Elm – any functional language that carries DNA from Robin Milner’s ML – you’ll find that you have both sum types and pattern matching. Pattern matching makes it very easy to work with the sum type variants.

The code you write to handle absence and presence follows certain patterns, which means you can create abstractions. For instance, say you have some function thingify of type String -> Thing, which takes a String and produces a Thing. Now suppose you’re given a Mayhaps<String> instead of a String. If you don’t have a String, you don’t get a Thing, but if indeed you do have a String, you’d like to apply thingify to get a Thing. Right?

Here’s how you might write it out, assuming F#-style syntax and pattern matching:

match dunnoCouldBe with 
| Indeed str -> Indeed (thingify str) 
| Nothing -> Nothing

This pattern is going to pop up again and again when you’re working with Mayhaps values, where the only thing that varies is the function you’d like to apply. So you can write a general function to handle this:

let quux (f : 'T -> 'U) (v : Mayhaps<'T>) : Mayhaps<'U> = 
  match v with 
  | Indeed t -> Indeed (f t) 
  | Nothing -> Nothing

Whenever you need to optionally transform an option value using some function, you just pass both of them to quux to handle it. But of course, chances are you’ll find that quux has already been written for you. In F#, it’s called Option.map. Because it maps from one kind of Option value to another.
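
With the built-in Option type, the thingify example above collapses to a one-liner – assuming dunnoCouldBe is a String option rather than a Mayhaps<String>:

let maybeThing = dunnoCouldBe |> Option.map thingify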

At this point, we’ve got the mechanics and practicalities of working with values that could be present or absent worked out. Now what should you do when you change your mind about where and how you handle the absence of a value? This is a design decision, even a business rule. These things change.

The short answer is that when these things change, you get a rippling change in type signatures in your program – from the place where you made a change, to the place where you handle the effect of the change. This is a good thing. This is the compiler pointing out where you need to do some work to make the change work as planned. That’s another benefit of treating the absence of values explicitly instead of mixing it up with the values themselves.

That’s all well and good, but what can you do if you’re a C# or Java programmer? What if your programming language has null and no sum types? Well, you could implement something similar to Mayhaps using the tools available to you.

Here’s a naive implementation written down very quickly, without a whole lot of thought put into it:
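
using System;

// A naive sketch: a private constructor and two nested cases,
// just enough to support the Nothing, Indeed and Map used below.
public abstract class Mayhaps<T> {
  private Mayhaps() {}

  public abstract Mayhaps<U> Map<U>(Func<T, U> f);

  private class MayhapsValue : Mayhaps<T> {
    private readonly T _value;

    public MayhapsValue(T value) {
      _value = value;
    }

    public override Mayhaps<U> Map<U>(Func<T, U> f) {
      return Mayhaps<U>.Indeed(f(_value));
    }
  }

  private class MayhapsNothing : Mayhaps<T> {
    public override Mayhaps<U> Map<U>(Func<T, U> f) {
      return Mayhaps<U>.Nothing;
    }
  }

  private static readonly Mayhaps<T> _nothing = new MayhapsNothing();

  public static Mayhaps<T> Nothing { get { return _nothing; } }

  public static Mayhaps<T> Indeed(T t) { return new MayhapsValue(t); }
}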

Now you can write code like this:

var foo = Mayhaps<string>.Nothing;
var bar = Mayhaps<string>.Indeed("lol");
var couldBeStrings = new[] { foo, bar };
var couldBeLengths = couldBeStrings.Select(it => it.Map(s => s.Length));

A better solution would be to use a library such as Succinc<T> to do the job for you.

Regardless of how you do it, however, it’s always going to be a bit clunky. What’s more, it won’t really solve our problem.

As you’ll recall, the problem with null is that you can’t escape from it. In a sense, what is missing isn’t Mayhaps. It’s the opposite. With null, everything is Mayhaps! We still don’t have a way to say that we know that the value is there. So perhaps a better solution is to implement the opposite? We could try. Here’s a very simple type that banishes null:
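
using System;

// A sketch: a wrapper that refuses to wrap null.
public sealed class Indeed<T> {
  private readonly T _value;

  public Indeed(T value) {
    if (value == null) {
      throw new ArgumentNullException("value");
    }
    _value = value;
  }

  public T Value { get { return _value; } }
}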

Now the question is – apart from being very clunky – does it work? And the depressing answer is: not really. It addresses the correct problem, but it fails for an obvious reason – how do you ensure that the Indeed value itself isn’t null? Put it inside another Indeed?

Implementing Indeed as a struct (that is, a value type) doesn’t work too great either. While a struct Indeed cannot be null, you can very easily obtain an uninitialized Indeed, for instance by getting the default value of the struct, which is always available. In that case, you would end up with an Indeed which wraps a null, which is unacceptable.

So I’m afraid it really is true. You can’t get rid of null once you’ve invited it in. It’s pretty annoying. I wish null had never been invited in the first place.


Pragmatism is poison

Yesterday I gave a lightning talk called “Pragmatism is Poison” at the Booster Conference in Bergen. This blog post is essentially that talk in written form.

The basic idea of the talk, and therefore of this blog post, is to launch a public attack on perhaps the most fundamental virtue of the software craftsman: pragmatism. The reason is that I think pragmatism has turned toxic, to the point where it causes more harm than good.

While I do believe that pragmatism was once a useful maxim to combat analysis paralysis and over-engineering, I also believe that its usefulness has expired. Pragmatism no longer represents a healthy attitude; in my mind it’s not even a word anymore. It has degenerated into a thought-terminating cliché, which is used to stifle discussion, keep inquiries at bay and justify not thinking things through.

These are, of course, pretty harsh words. I’ll try to make my case though.

Let’s start by having a look at what it means to be pragmatic. The definition Google gave me is as follows:

Dealing with things sensibly and realistically in a way that is based on practical rather than theoretical considerations.

[A small caveat: this isn’t necessarily the only or “correct” definition of what it means to be pragmatic. But I suppose it’s a sensible and realistic approximation?]

Anyways: this is good by definition! How can it possibly be bad?

Well, for a whole bunch of reasons really. We can divide the problems into two parts: 1) that pragmatism lends itself to abuse because it’s ambiguous and subjective, and 2) all the forms that abuse can take.

So the underlying problem is that the definition relies heavily on subjective judgment. (Consider the task of devising a test that would determine whether or not someone was being pragmatic – the very idea is absurd!) One thing is that what qualifies as “realistic” and “sensible” is clearly subjective. For any group of people, you will find varying degrees of agreement and disagreement depending on who they are and the experiences they’ve had. But the distinction between “practical” and “theoretical” considerations is problematic as well.

Here are some quick considerations to -uh- consider:

#1 laws of nature change
#2 bomb hits data center
#3 cloud providers go down
#4 servers are hacked
#5 unlikely timing of events

I suppose we can agree that #1 is of theoretical interest only. What about the others? If #2 happens, maybe you have more serious problems than your service going down, but it depends on how critical that service is. The same goes for #3. With respect to #4, a lot of people seem surprised when their servers are hacked – as if they hadn’t thought it possible! And #5 depends on whether or not you’d like luck to be a part of your architecture. I’ve been on projects where people would tell me that scenarios I asked about were so unlikely that they were practically impossible. Which to me just means they’ll be hard to debug and reproduce when they happen in production.

The point, though, is that the distinction between “practical” and “theoretical” is pretty much arbitrary. Where do we draw the line? Who’s to say? But it’s important, because mislabeling important considerations – things that affect the quality of software! – as “theoretical” leads to bad software.

So that’s the underlying problem of ambiguity. On to the various forms of abuse!

The first is that pragmatism is often used to present a false dilemma in software development. We love our dichotomies! I could use cheap rhetoric to imply that it traces back to the 0s and 1s of our computers, but let’s not – I have no evidence for such a claim! Luckily I don’t need it either. Suffice it to say that the world is complex, and it’s always very tempting to see things in black and white instead. And so we pit “pragmatism” on the one hand against “dogmatism” on the other, and it’s really important to stay on the right side of that divide! Sometimes we use different words instead, like “practical” vs “theoretical” or “real world” vs “ivory tower”. It all means the same thing: “good” vs “bad”. Which is a big lie, because we’re not making ethical judgments, we’re trying to assess the pros and cons of different solutions to particular problems in concrete contexts. This isn’t The Lord of the Rings.

The consequences of this polarizing are pretty severe. The false dilemma is often used as a self-defense bluff in discussions between team members. The so-called impostor syndrome is rampant in our industry, and so we reach for tools that help us deal with insecurity. One such tool is pragmatism, which can be abused as a magical spell to turn insecurity on its head.

Here’s how it works. Because of the false dilemma, a claim to be pragmatic is implicitly an accusation that says that whoever disagrees is dogmatic: they’re not being sensible, they’re not being realistic, they’re just obsessing over theoretical considerations. So while a statement like “I’m being pragmatic” sounds innocuous enough, it’s really not. It leads to stupid, unrefined, pointless discussions where no knowledge is gained. Instead we’re fighting over who’s good and who’s bad. Polarizing does not make discussions more interesting, it makes them degenerate into banality.

A related strategy is to use pragmatism as a diversion or a smoke bomb, offering the confronted party an easy exit and effectively ending the discussion. The reason is that it takes a lot of guts and perseverance to call someone’s bluff when they’re claiming to be pragmatic. You might approach your co-worker with a concern like “hey, I was looking at the code, and it seems like we’re blocking in our streaming API, which sort of defeats the purpose of a streaming API in the first place”. It sounds like a valid concern until your co-worker says the magic words “I was just being pragmatic” and vanishes in a puff of smoke, like so:

[Animation: vanishing in a puff of smoke]

What we should do instead is accept the complexity we’re faced with and resist the urge to trivialize it. There’s always a need for thinking and discussion, and spurious claims of pragmatism don’t help.

Another problem with pragmatism is that it can be – and is – used as an excuse for sloppy thinking, or no thinking at all. Pragmatism encourages partial “solutions” that work not by reflecting a conceptual solution to a problem, but rather by mimicking correct behavior for various inputs. That way, we can short-circuit the need for design and collaboration. Instead we start with a trivial “happy path” solution and add flags and epicycles to flesh it out into richer behavior as needed. This approach yields software that more or less works for the inputs we’ve tried, and maybe for other inputs as well. (Behavior in the latter case is not as well understood, for obvious reasons.) If we come across inputs that cause problems, we apply patches in the form of additional flags and epicycles.

Because the approach sounds rather dubious when written out like that, we use the magic word “pragmatic” to make it sound better. It’s pragmatic problem-solving. We call the solutions themselves “good enough” solutions – wasting any more time thinking would be gold-plating!  Sometimes we use quotes like “perfect is the enemy of good” as further evidence that we’re doing a good thing – as if the problem we’re facing is too much perfect software in the world!

Here’s an obviously made-up example of this approach:

function square(x) {
    if (x == 1) return 1;
    if (x == 2) return 4;
    if (x == 3) return 9;
    if (x == 4) return 15;
    if (x == 5) return 25;
    // Should never happen.
    return -1;
}

This is a function that computes the square of the integers 1-5, not by reflecting any understanding of what it means to square a number, but rather by emulating correct behavior. It has a small bug for the input number 4, but that doesn’t matter much, we rarely get that value, and the result isn’t too far off. It’s a perfectly pragmatic solution if you can assume that x will always be in the range 1-5.

A more general solution would be this:

function square(x) {
    return x * x;
}

Which solves the general case of calculating the square of integers – sort of! Unfortunately, integers themselves are deeply pragmatic. (What happens when x * x is greater than the maximum value for integers?)

But these are all silly examples – theoretical considerations!

So let’s consider something more “real world”. Since software exists and executes in time, software typically needs to take time into account – by registering time stamps for various events and so forth.

How do you handle dates and times in your applications? Are you aware of the related complexities that exist? Do you handle those complexities explicitly? Do you think about how they might affect your application in various ways? Or do you simply close your eyes and hope for the best? Assume that all the systems you integrate with use time stamps from the same time zone? Assume that leap years and leap seconds won’t affect you (that all years have 365 days and all minutes have 60 seconds)? Assume that daylight savings time won’t cause any problems (even though it means that time isn’t linear – depending on your time zone(s), some points in time may not exist, whereas others may exist more than once)? Assume that everyone else around you is making the same assumptions? That’s mighty pragmatic!
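
To make just one of those complexities concrete, here’s a little .NET sketch (the time zone id assumes Windows):

// Central European Time sprang forward at 02:00 on March 27th 2016,
// so 02:30 that night simply never existed.
var cet = TimeZoneInfo.FindSystemTimeZoneById("W. Europe Standard Time");
var badTime = new DateTime(2016, 3, 27, 2, 30, 0);
Console.WriteLine(cet.IsInvalidTime(badTime)); // True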

Finally, pragmatism is sometimes used to create outright logical contradictions. Pragmatism is about compromise, but some compromises cannot be made without compromising the concept itself! For instance, some architectural properties have principles that cannot be violated while still maintaining the properties – they are simply no longer present due to the violation. (A vegetarian cannot eat meat in the weekends and still be a vegetarian, if you will.) Not even pragmatism can fix this, but that doesn’t stop people from trying!

To illustrate, here’s (a reproduction of) a funny meme I found on the Internet.

[Image: Batman slap meme about “pragmatic REST”]

I think it’s funny because a lot of people seem to get annoyed when someone points out that their self-proclaimed RESTful APIs aren’t really RESTful because they violate some property of REST or other – typically the hypermedia constraint. People get annoyed because they don’t want to think about that, they’d rather be pragmatic and get on with stuff.

But for some reason they still want to keep that word, REST. Maybe they think it sounds good, or maybe they promised their manager a REST API, I don’t know. It doesn’t matter. They want to keep that word. And so they say “well, maybe it’s not your Ivory Tower REST” (implicitly bad!), “maybe it’s Pragmatic REST instead” (implicitly good!). And then they go on to do something like JSON over HTTP, which is really simple and great, and they can easily deserialize the JSON in their JavaScript, and they’ve practically shipped it already. And when someone comes along and talks about hypermedia being a requirement for REST, they just slap them! Pretty funny!

Here’s another meme. I made this one myself. Unfortunately it’s not funny.

[Image: Batman slap meme about “pragmatic security”]

Why isn’t it funny? It looks a lot like the previous one.

The problem is that when someone violates some established principle of security – maybe they decide it’s convenient to store encrypted passwords on their server or to roll their own cryptography – we think it’s a bad idea. And we don’t think it’s a good excuse to say “well, maybe it’s not Ivory Tower Security, maybe it’s Pragmatic Security instead”. We simply don’t agree that it’s very secure at all. So in some sense it’s not funny because the roles have been reversed. Turns out it’s much more funny being Batman than being Robin in this meme. Go figure.

Now we have the strange situation that it’s apparently OK for some words in software to have no meaning (REST), whereas we insist that others do have meaning (secure). For the meaningless words, prefixing “pragmatic” will absolve you from your sins, for the meaningful words, it will not. This is a problem, I think. Who’s to decide which words should have meaning?

Here’s a third meme. I made this one too.

[Image: Batman slap meme about “pragmatic tea”]

It’s a bit strange, I’ll admit that. But bear with me. It’s the last one, I promise.

What would you say if someone offered you hot water and a biscuit and said “have some pragmatic tea”? Would you be content? Would you pay for a cup of pragmatic tea? Or would you take the route of the dogmatist and argue that it’s not really tea if no tea was involved in the preparation? Well, SLAP you! Crazy tea zealot! Hang out with the other ivory tower hipsters, and have your fancy tea! Who drinks tea anyway?!

At this point you can see I’ve gone absurd – but we’re still just doing variations of the same joke. I didn’t bring the absurdity, it was there all along. The point I’m trying to make is that words do have meaning, whether it’s REST or security or tea. We should respect that meaning instead of using pragmatism as a poor excuse for undermining it. Some properties have principles that cannot be broken while at the same time keeping the properties intact. You can’t have REST without Representational State Transfer, because that’s literally what the acronym means. Secure applications shouldn’t be storing passwords, even if they’re encrypted. Tea should contain tea. (Please don’t get me started on Rooibos.)

I should add that it’s perfectly fine to not care about REST, or to dispute the value of REST – or tea, or security, for that matter. Those are conversations worth having. It’s a free world, and everyone is entitled to choose the properties they care about, based on the context they’re in. Lots of people very vocally don’t care about REST, perhaps even more people than people who know what REST actually is! I have no problem with that. What’s less fine is pretending it has no meaning.

This concludes my attack on pragmatism, at least for now! To reiterate: pragmatism is easily abused because it’s hard to tell if someone is genuinely pragmatic or just claiming to be so. The abuse takes various forms: false dilemma, self-defense bluff, smoke bomb, justification for sloppy thinking and undermining of the meaning of words.

A call for action? Please stop using pragmatism as an excuse for doing sloppy work! And if you find you do need to use the word, please have it be the beginning of a discussion rather than the end of one.


Strings with assumptions

TL;DR Strings always come with strings attached.

I had a little rant about strings on Twitter the other day.

This blog post is essentially the same rant, with a bit of extra cheese.

Here’s the thing: I find that most code bases I encounter are unapologetically littered with strings. Strings are used to hold values of all kinds of kinds, from customer names to phone numbers to XML and JSON structures and what have you. As such, strings are incredibly versatile and flexible; properties we tend to think of as positive when we talk about code. So why do I hate strings?

Well, the problem is that we don’t want our types to be flexible like that – as in “accepting of all values”. In fact, the whole point of types is to avoid this flexibility! Types are about restricting the number of possible values in your program, to make it easier to reason about. You want to allow exactly the legal values, and to forbid all the illegal values. String restricts nothing! String is essentially object! But people who have the decency to refrain from using object will still gladly use string all over the place. It’s weird. And dangerous. That’s why we should never give in to the temptation to escape from the type system by submerging our values in the untyped sea of string. When that value resurfaces sometime later on, we’ll effectively be attempting a downcast from object back to the actual type. Will it succeed? Let’s hope so!

So to be very explicit about it: if you have a string in your program, it could be anything – anything! You’re communicating to the computer that you’re willing to accept any and all of the following fine string specimens as data in your program:
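
""
"Bob"
"Robert'); DROP TABLE Students;--"
"11001001 00101101"
"<script>alert('pwned');</script>"
"Lorem ipsum dolor sit amet, consectetur adipiscing elit"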

Your program does not distinguish between them, they’re all equally good. When you declare a string in your program, you’re literally saying that I’m willing to accept – I’m expecting! – any and all of those as a value. (I should point out that you’re expecting very big strings too, but I didn’t feel like putting any of them in here, because they’re so unwieldy. Not to mention the door is open to that mirage doppelganger of a string, null, as well – but that’s a general problem, not limited to string.)

Of course, we never want this. This is never what the programmer intends. Instead, the programmer has assumptions in their head, that the string value should really be drawn from a very small subset of the entire domain of strings, a subset that fits the programmer’s purpose. Common assumptions include “not terribly big”, “as large as names get”, “reasonable”, “benign”, “as big as the input field in the form that should provide the value”, “similar to values we’ve seen before”, “some format parsable as a date”, “a number”, “fits the limit of the database column that’s used to persist the value”, “well-formed XML”, “matching some regular expression pattern or other” and so on and so forth. I’m sure you can come up with some additional ones as well.

The assumptions might not be explicitly articulated anywhere, but they’re still there. If the assumptions are implicit, what we have is basically a modelling issue that the programmer hasn’t bothered to tackle explicitly yet. It is modelling debt. So whenever you see string in a program, you’re really seeing “string with assumptions”, with the caveats that the assumptions may not be terribly well defined and there may or may not be attempts to have them enforced. In other words, we can’t trust that the assumptions hold. This is a problem.

So what should we do instead? We can’t realistically eradicate strings from our programs altogether. For instance, we do need to be able to speak string at the edges of our programs. Quite often, we need to use strings to exchange data with others, or to persist values in a database. This is fine. But we can minimize the time we allow strings to be “raw”, without enforced assumptions. As soon as we can, we should make our assumptions explicit – even though that means we might need to spend a little time articulating and modelling those assumptions. (That’s a bonus by the way, not a drawback.) We should never allow a string to pass unchecked through any part of our system. An unchecked string is Schrödinger’s time bomb. You don’t know if it will explode or not until you try to use it. If it turns out your string is a bomb, the impact may vary from the inconvenient to the embarrassing to the catastrophic.

Unsurprisingly, the good people who care about security (which should be all of us!) find strings with assumptions particularly interesting. Why? Because security bugs can be found precisely where assumptions are broken. In particular, since the string type allows for any string, the scene is set for “Houdini strings” to try to escape the cage where they’re held as data, and break free into the realm of code.

To make our assumptions explicit, we need to use types that are not strings. But it’s perfectly fine for them to carry strings along. Here’s a class to represent a phone number in C#:
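
using System;
using System.Text.RegularExpressions;

public class PhoneNumber {
  private readonly string _value;

  public PhoneNumber(string value) {
    if (value == null) {
      throw new ArgumentNullException("value");
    }
    // A stand-in rule: eight digits, say. Substitute whatever
    // "valid phone number" actually means in your domain.
    if (!Regex.IsMatch(value, @"^\d{8}$")) {
      throw new ArgumentException("Not a valid phone number.", "value");
    }
    _value = value;
  }

  public override string ToString() {
    return _value;
  }
}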

Nothing clever, perfectly mundane. You create your PhoneNumber and use it whenever you’d use “string with assumption: valid phone number”. As you can see, the class does nothing more than hold on to a string value, but it does make sure that the string belongs to that small subset of strings that happen to be valid phone numbers as well. It will reject all the other strings. When you need to speak string (at the edges of your program, you just never do it internally), you call ToString() and shed the protection of your type – but at least at that point you know you have a valid phone number.

So it’s not difficult. So why do we keep littering our programs with strings with assumptions?


Technical debt isn’t technical

TL;DR

Technical debt is not primarily caused by clumsy programming, it is a third-order effect of poor communication. Technical debt is a symptom of an underlying lack of appropriate abstractions, which in turn stems from insufficient modelling of the problem domain. This means that necessary communication has not taken place: discussions and decisions to resolve ambiguity and make informed trade-offs have been swept under the rug. Technical debt is the reification of this lack of resolution in code.

The technical debt meme

For a while now, I’ve been wanting to write about technical debt. As we all know, technical debt is a very successful meme in software development – it needs no introduction as a concept. Like any good virus, it has self-replicated and spread throughout the software development world, even reaching into the minds of project leaders and stakeholders. This is good, since the notion of technical debt brings attention to the fact that the internal quality of software matters – that there are aspects of software that are invisible to anyone but the programmers, but still have very visible effects – in the form of prolonged quality problems, missed deadlines, development grinding to a halt and so forth. For this reason, we should tip our hats to Ward Cunningham for coming up with the term. It gives us terminology that allows us to communicate better with non-technical stakeholders in software projects.

Why technical debt is a misnomer

That’s not what I want to write about however. What I want to say is that technical debt is also a deeply problematic notion, because it speaks little of the causes of technical debt, or how to fix them.

The usual story is that technical debt stems from project deadlines. If the code is inadequate, sloppy or otherwise “bad”, it has probably been written in a hurry because the project leader said so. This indicates that time is the cause of our problems, and also conveniently places the responsibility of the mess on someone else than the developers.

This is certainly true in some cases; we have all written code like that, and for those exact reasons. I just don’t think it’s the whole story, or even a major part of it. It seems to me entirely inadequate to explain the majority of technical debt that I’ve seen on software projects. The so-called technical problems go much deeper than mere sloppiness of implementation, and reveal fundamental problems in the process of understanding the business domain and how that understanding is captured and represented in software. In particular, it is very common to see weak abstractions that fail to represent the richness of the domain. The code tends to be overgrown with conditionals and flags, which indicates a weak model that has handled evolution and change very poorly – by ad hoc sprouting of extra branches and the booleans needed to navigate them as appropriate for different use cases. Complexity grows like ever new epicycles on the inadequate model – easily recognizable as things in your code that cannot be given meaningful names because they have no meaningful counterpart in the problem domain. The end result is a horrific steampunk contraption of accidental complexity.

This makes the code extraordinarily difficult to reason about. Hence, it would seem that the so-called technical debt really stems from modelling debt; the code lacks the higher-level concepts of a rich domain model that would make it possible to express the use cases more directly.

The currency of technical debt is knowledge. — @sarahmei

In DDD terms, modelling debt indicates that insufficient knowledge crunching has taken place. Knowledge crunching involves learning about the problem domain and capturing that knowledge in a suitable domain model. This is a communication-driven process that involves identifying and resolving ambiguity in the problem domain, and expressing the domain as clearly as possible. Most of all, it is a chaotic and messy process that involves people and discussion. Insufficient knowledge crunching in turn points towards the ultimate cause of technical debt: poor communication.

Communication is the principal portion of the “technical debt.” Messy code is just the ever-increasing interest. — @nycplayer

Why technical debt is misrepresented

So if technical debt isn’t really technical – or at least not ultimately caused by technical issues – why do we keep referring to it as technical debt? Unfortunately, it seems to me that developers have a tendency to look for technical solutions to soft problems.

Technical tasks are alluring because, unlike modelling and communication, they have no psychological dimensions, and tend not to lead to conflict. I don’t want to add to the stereotype of the programmer as particularly socially inept; suffice it to say that most people will prefer to avoid conflict if possible. Technical work is a series of puzzles to be solved. Modelling work uncovers human issues, differences of opinion, different focus, different hopes for the application, even personal conflicts. Figuring out what the application should do exposes all of these issues, and it is painful! This is why everyone is quoting Conway’s Law these days.

And so, even as widespread as the meme technical debt is, it seems to be poorly understood, even by developers – perhaps even particularly by developers! Indeed, the Wikipedia page for technical debt – no doubt authored by developers – currently lists 11 causes of technical debt, but lack of understanding of the problem domain is not one of them! “Lack of knowledge” sounds promising, until you read the explanation: “when the developer simply doesn’t know how to write elegant code”(!)

Again we see the focus on the technical aspects, as if technical debt were caused by clumsy, unskilled programmers with nagging, incompetent project leaders – and hence as if it were fixable by some virtuous programmer – a master craftsman, no less! – using generic, context-free principles like SOLID, dependency injection and patterns. It is not! Code hygiene is certainly a virtue, but it is no substitute for modelling, just like frantically washing your hands is not sufficient for successful surgery. Getting lost in code hygiene discussions is like arguing about the optimal kinds of soap and water temperature while the patient is dying on the operating table.

And yet it is indirectly true: a developer who doesn’t know the importance of understanding the problem domain, of proper modelling, will certainly fail to write elegant code. Elegance of the implementation can only stem from an elegant model that reflects a deep understanding of the problem addressed.

Down payment

This has deep ramifications, in particular in how we address technical debt. Refactoring is another successful meme in software development, and we often use it to describe the process of making down payments on technical debt. But if technical debt isn’t just clumsy code, if instead it is clumsy code caused by unresolved ambiguity in the problem domain, then it is poorly addressed by rearranging code. We need to start in the other end, with a better understanding of the problem we are trying to solve, and with modelling concepts permeating the code instead of branches and booleans. This is what Eric Evans calls “refactoring towards deeper insight”. Unless we have a model to drive our efforts, there is no reason to believe that we will be able to do much better than before. Refactoring without an improved domain model is just hubris.

A rewrite will end up with the same problems as the original unless you close the understanding gap. — @sarahmei

To conclude

That’s what I wanted to say about technical debt. It’s not very technical at all. It’s about code that gets bad because humans fail to communicate when trying to solve problems in some business domain using software. It usually is.


Thinging names

The other night I made a tweet. It was this:

Programmers are always chasing proximate causes. This is why naming things is considered hard, not finding the right abstractions to name.

And I meant something by that, but what? I got some responses that indicated that some people interpreted it differently than I intended, so evidently it’s not crystal clear. I can see why, too. Like most tweets, it is lacking in at least two ways: it lacks context and it lacks precision. (Incidentally this is why I write “I made a tweet”, much like I’d write “I made a mistake”.) Of course, tweets are prone to these shortcomings, and it takes special talent and a gift for brevity to avoid them. Alas, as the poor reader may have noticed, it is a gift I don’t possess – that much is evident from this paragraph alone!

Therefore, I’m making this attempt at a long-winded deliberation of what I tried to express – that should better suit my talents. It turns out I was even stupid enough to try to say two or even three things at once, which is surely hubris and the death of pithy tweets. First, I was trying to make a rather bold general claim about programmers: that we tend to chase proximate causes rather than ultimate ones. Second, I said that programmers often talk about how hard a problem naming things is, but that instead we should be worried about choosing the appropriate abstractions to name in the first place. And third, I implied that the latter is a particular instance of the former.

So, let’s see if I can clarify and justify what I mean by all these things.

A bit of context first – where does this come from? I’ve been increasingly preoccupied with domain modelling lately, so the tweet ideally should be interpreted with that in mind. I’m absolutely convinced that the only way we can succeed with non-trivial software projects is by working domain-driven. The work we do must reflect insight that we arrive at by talking to users and domain experts and thinking really hard about the problem domain. Otherwise we go blind – and although we might be going at high velocity, we’ll quite simply miss our target and get lost. In the words of Eric Evans, we need to do knowledge crunching to develop a deep model and keep refactoring towards greater insight to ensure that the software 1) solves the current problem and 2) can co-evolve with the business. This is the primary concern. Everything else is secondary, including all the so-called “best practices” you might be employing. Kudos to you, but your craftsmanship really is nothing unless it’s applied to the domain.

I think that many of the things we struggle with as programmers are ultimately caused by inadequate domain modelling. Unfortunately, we’re not very good at admitting that to ourselves. Instead, we double and triple our efforts at chasing proximate causes. We keep our code squeaky clean. We do TDD. We program to interfaces and inject our dependencies. This is all very well and good, but it has limited effect, because we’re treating the symptoms rather than curing the disease. In fact, I’ve made drive-by tweets about SOLID (with similar lack of context and precision) that hint at the same thing. Why? Because I think that SOLID is insufficient to ensure a sensible design. It’s not that SOLID is bad advice, it’s just that it deals with secondary rather than primary causes and hence has too little leverage to fix the issues that matter. Even if you assume that SOLID will expose all your modelling inadequacies as design smells and implementation pains, fixing the problem that way is inefficient at best.

So that was a bit of context. Now for precision. The term “naming things” is incredibly vague in and of itself, so to make any sense of the tweet, I should qualify what I meant by that. The term stems from a famous quote by Phil Karlton, which goes like this:

There are only two hard problems in Computer Science: cache invalidation and naming things.

Unfortunately I don’t know much about Phil Karlton, except that he was at Netscape when Netscape mattered, and that he obviously had the gift of brevity that I lack.

What are “things” though? I don’t know what Phil Karlton had in mind, but for the purposes of my tweet, I was thinking about “code things”, things like classes and methods. Naming such things appropriately is important, since we rely on those names when we abstract away from the details of implementation. But it shouldn’t be hard to name them! If it is hard, it is because we’re doing something wrong – it is a symptom that the very thing we’re naming has problems and should probably not exist. This is why I think that addressing the “naming things” problem is dealing with proximate causes. Naming things is hard because of the ultimate problem of inadequate domain modelling.

Of course it is quite possible to think of different “things” when you speak of “naming things”, in particular “things” in the domain. In that case, tackling the problem of naming things really is dealing with ultimate causes! This is the most important activity in domain-driven design! And with that interpretation, my tweet is completely nonsensical, since naming things in the domain and finding the right domain abstractions become one and the same. (Ironically, this all goes to show that “naming things” interpreted as “describing things with words” certainly is problematic!)

So where does this leave us? To summarize, I think that domain names should precede code things. This is really just another way of stating that we need the model to drive the implementation. We should concentrate our effort on coming up with the right concepts to embody in code, rather than writing chunks of code and coming up with names for them afterwards. Making things from names (“thinging names”) is easy. Making names from things (“naming things”), however, can be hard.


Diamond mirrors

My friend Bjørn Einar did a nice write-up about the Diamond code kata in F# the other day. He did so in the context of TDD-style evolutionary design vs up-front thinking away from the keyboard. Apparently he has this crazy idea that it might be worthwhile to do a bit of conceptual problem-solving and thinking about properties of the domain before you start typing. Very out of vogue, I know.

Anyways, he ended up with an interesting implementation centered on exploiting something called the taxicab norm. (I hadn’t heard of it either, which makes it all the more interesting.) I really like that approach: cast your problem as an instance of an existing, well-understood problem for which there exists a well-understood solution. It replaces ad-hoc code with a mathematical idea, and is rather a far step away from typical implementations that get heavy on string manipulations and where the solution to the problem in general is swamped with things related to outputting the diamond to the console.

I wondered if I could come up with an alternative approach, and hence I got to thinking a bit myself. Away from the keyboard, like a madman. The solution I came up with is perhaps a bit more conventional, a bit less mathematical (I’m sorry to say), but still centered on a single idea. That idea is mirroring.

To illustrate the approach, consider a sample diamond built from five letters, A through E. It should look like the following:

....A....
...B.B...
..C...C..
.D.....D.
E.......E
.D.....D.
..C...C..
...B.B...
....A....

The mirroring is fairly obvious. One way to look at the diamond is to consider it as a pyramid mirrored along the E-row. But at the same time, it is also a pyramid mirrored along the A-column. So it goes both ways. This means that we could rather easily build our diamond from just a quarter of it, by mirroring it twice. We would start with just this:

A....
.B...
..C..
...D.
....E

We could then proceed by mirroring along the A-column to produce this:

....A....
...B.B...
..C...C..
.D.....D.
E.......E

And then we could complete the diamond by mirroring along the E-row, and it would look like the diamond we wanted.

So far so good. But we need the first quarter. How could we go about producing that?

Assume we start with a list ['A' .. 'E']. We would like to use that to produce this list:
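
[['A'; '.'; '.'; '.'; '.'];
 ['.'; 'B'; '.'; '.'; '.'];
 ['.'; '.'; 'C'; '.'; '.'];
 ['.'; '.'; '.'; 'D'; '.'];
 ['.'; '.'; '.'; '.'; 'E']]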

But that’s rather easy. Each inner list is just the original list ['A' .. 'E'] with all letters except one replaced by ‘.’. That’s a job for map. Say I want to keep only the ‘B’:
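
['A' .. 'E'] |> List.map (fun c -> if c = 'B' then c else '.')
// val it : char list = ['.'; 'B'; '.'; '.'; '.']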

And so on and so forth for each letter in the original list. We can use a list comprehension to generate all of them for us. For convenience, we’ll create a function genLists:
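
// A sketch – one way to write it:
let genLists lst =
  [ for e in lst -> lst |> List.map (fun c -> if c = e then c else '.') ]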

This gives us the first quarter. Now for the mirroring. That’s easy too:
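
// Mirrors a list along its head element.
let mirror lst =
  match lst with
  | [] -> []
  | _ :: t -> List.rev t @ lst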

(We’ll never actually call mirror with an empty list, but I think it’s better form to include it anyway.)

So now we can map the mirror function over the quarter diamond to produce a half diamond:
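
['A' .. 'E'] |> genLists |> List.map mirror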

Excellent. Now we’re almost ready to do the second mirroring. The only problem is that the mirror function uses the head element as the pivot for mirroring, so we would end up with an X instead of a diamond!

That’s trivial to fix though. We’ll just reverse the list first, and then do the mirroring. I’m not even going to write up the result for that – it is obviously the completed diamond. Instead, here’s the complete diamond function, built from the parts we’ve seen so far:
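
// Reverse the rows so the E-row becomes the pivot, then mirror.
let diamond letters =
  letters
  |> genLists
  |> List.map mirror
  |> List.rev
  |> mirror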

Could I speed up things by reversing my lists before the first mapping instead of after? No, because the (outer) list has the same number of elements before and after the first mirroring. Plus it’s easier to explain this way. And really, perf optimization for a code kata? Come on!

Now for rendering:
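
let toString (chars : char list) = System.String(List.toArray chars)

let render rows =
  rows |> List.map toString |> String.concat "\n"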

And to run everything (for a full-sized diamond, because why not):
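
['A' .. 'Z'] |> diamond |> render |> printfn "%s"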

And that’s all there is to it. The entire code looks like this:
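
// The whole thing, assembled from the sketches above:

let genLists lst =
  [ for e in lst -> lst |> List.map (fun c -> if c = e then c else '.') ]

let mirror lst =
  match lst with
  | [] -> []
  | _ :: t -> List.rev t @ lst

let diamond letters =
  letters
  |> genLists
  |> List.map mirror
  |> List.rev
  |> mirror

let toString (chars : char list) = System.String(List.toArray chars)

let render rows =
  rows |> List.map toString |> String.concat "\n"

['A' .. 'Z'] |> diamond |> render |> printfn "%s"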