Into the Tar Pit

I recently re-read the “Out of the Tar Pit” paper by Ben Moseley and Peter Marks for a Papers We Love session at work. It is a pretty famous paper. You can find it in the Papers We Love repository on GitHub for the simple reason that lots of people love it. Reading the paper again triggered some thoughts, hence this blog post.

The title of the paper is taken from Alan Perlis (epigram #54):

Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.

A Turing tar-pit is a language or system that is Turing complete (and so can do as much as any language or system can do) yet is cumbersome and impractical to use. The Turing machine itself is a Turing tar-pit, because you probably wouldn’t use it at work to solve real problems. It might be amusing but not practical.

The implication of the title is that we are currently in a Turing tar-pit, and we need to take measures to get out of it. Specifically, the measures outlined in the paper.

The paper consists of two parts. The first part is an essay about the causes and effects of complexity in software. The second part is a proposed programming model to minimize so-called accidental complexity in software.

The argument of the paper goes like this: Complexity is the primary cause of problems in software development. Complexity is problematic because it hinders the understanding of software systems. This leads to all kinds of bad second-order effects including unreliability, security issues, late delivery and poor performance, and – in a vicious circle – compound complexity, making all the problems worse in a non-linear fashion as systems grow larger.

Following Fred Brooks in “No Silver Bullet“, the authors distinguish between essential and accidental complexity. The authors define “essential complexity” as the complexity inherent in the problem seen by the users. The “accidental complexity” is all the rest, including everything that has to do with the mundane, practical aspects of computing.

The authors identify state handling, control flow and code volume as drivers of complexity. Most of this complexity is labeled “accidental”, since it has to do with the physical reality of the machine, not with the user’s problem.

The proposed fix is to turn the user’s informal problem statement into a formal one: to derive an executable specification. Beyond that, we should only allow for the minimal addition of “accidental” complexity as needed for practical efficiency concerns.

The authors find our current programming models inadequate because they incur too much accidental complexity. Hence a new programming model is needed, one that incurs a minimum of accidental complexity. The second part of the paper presents such a model.

What struck me as I was reading the paper again was that it is wrong about the causes of complexity and naive about software development in general.

The paper is wrong for two reasons. First, because it treats software development as an implementation problem. It would be nice if that were true. It’s not. We will not get much better at software development if we keep thinking that it is. Second, because it ignores the dynamics of software development and makes invalid assumptions. Specifically, it is naive about the nature of the problems we address by making software.

I agree with the authors that complexity is a tremendous problem in software. The non-linear accumulation of complexity often threatens to make non-trivial software development efforts grind to a halt. Many software systems are not just riddled with technical debt (the term we often use for runaway complexity); they have practically gone bankrupt! However, the problem of complexity cannot be solved by means of a better programming model alone. We must trace the causes of complexity beyond the realm of the machine and into the real world. While better programming models would be nice, we can’t expect wonders from them. The reason is that the root cause of complexity is to be found in the strained relationship between the software system and the world in which it operates. This is outside the realm of programming models.

According to the authors, the role of a software development team is “to produce (using some given language and infrastructure) and maintain a software system which serves the purposes of its users”. In other words, the role is to implement the software. The role of the user, on the other hand, is to act as an oracle with respect to the problem that needs to be solved. The authors note in parenthesis that they are assuming “that the users do in fact know and understand the problem that they want solved”. Yet it is well-known that this assumption doesn’t hold! Ask anyone in software! Already we’re in trouble. How can we create an executable specification without a source for this knowledge and understanding?

The paper’s analysis of the causes of complexity begins like this:

In any non-trivial system there is some complexity inherent in the problem that needs to be solved.

So clearly the problem is important. But what is it? In fact, let’s pull the sentence “the problem that needs to be solved” apart a bit, by asking some questions.

Where did the problem come from? Who defined the problem? How is the problem articulated and communicated, by whom, to whom? Is there agreement on what the problem is? How many interpretations and formulations of the problem are there? Why this problem and not some other problem? Who are affected by this problem? Who has an interest in it? Who owns it? Why does the problem matter? Who determined that it was a problem worth solving? Why does it need to be solved? How badly does it need to be solved? Is time relevant? Has it always been a problem? How long has this problem or similar problems existed? Could it cease to be a problem? What happens if it isn’t solved? Is a partial solution viable? How does the problem relate to other problems, or to solutions to other problems? How often does the problem change? What does it mean for the problem to change? Will it still need solving? What forces in the real world could potentially lead to changes? How radical can we expect such changes to be?

We quickly see that the problem isn’t the solution, the problem is the problem itself! How can we even begin to have illusions about how to best develop a “solution” to our “problem” without answers to at least some of these questions? The curse of software development is that we can never fully answer all of these questions, yet they are crucial to our enterprise! If we are to look for root causes of complexity in software, we must start by addressing questions such as these.

When we treat the problem definition as somehow outside the scope of the software development effort, we set ourselves up for nasty surprises – and rampant complexity. As Gerald Weinberg put it in “Are Your Lights On?“: “The computer field is a mother lode of problem definition lessons.” Indeed, any ambiguity, misunderstandings, conflicts, conflict avoidance and so on with respect to what the problem is will naturally come back to haunt us in the form of complexity when we try to implement a solution.

Consider an example of actual software development by an actual organization in an actual domain: the TV streaming service offered by NRK, the national public broadcaster in Norway. It’s where I work. What is the problem? It depends on who you ask. Who should you ask? I happen to be nearby. If you ask me, one of many developers working on the service, I might say something like “to provide a popular, high-quality, diverse TV streaming service for the Norwegian public”. It is immediately clear that providing such a service is not a purely technical problem: we need great content, great presentation, great usability, a great delivery platform, among many other things. Creating useful, non-trivial software systems is a multi-disciplinary effort.

It is also clear that such a high-level problem statement must be interpreted and detailed in a million ways in order to be actionable. All the questions above start pouring in. Who provides the necessary interpretation and deliberation? Who owns this problem? Is it our CEO? The product owner for the TV streaming service? The user experience experts? Me? The public? The answer is all and none of us!

But it gets worse, or more interesting, depending on your perspective. The world is dynamic. It changes all the time, whether we like it or not. Hence “the problem” changes as well. It is not something that we can exercise full control over. We don’t exist in a vacuum. We are heavily influenced by changes to the media consumption habits of the public, for instance. The actions of the international media giants influence our actions as well, as do the actions of the large social media platforms. Everything changes, sometimes in surprising ways from surprising angles.

With this backdrop, how do we address “the problem”? What is the best future direction for our service? What would make it more popular, higher quality, more diverse, better for the public? Opinions vary! Is it ML-driven customization and personalization? Is it more social features? Is it radical new immersive and interactive experiences that challenge what TV content is and how it is consumed? We don’t know. No-one knows.

It is naive to think that there is such a thing as “the user”. If there were such a thing as “the user”, it is naive to think that they could provide us with “the problem”. If they could provide us with “the problem”, it is naive to think that it would stay stable over time. If “the problem” did stay stable over time, it is naive to think that everyone would understand it the same way. And so on and so forth.

We cannot expect “the user” to provide us with a problem description, at least not one that we could use to implement an executable specification. The problem of defining the problem unfolds over time in a concrete yet shifting context, in a complex system of human actors. There is nothing inessential about this, it is ingrained in everything we do. We can’t escape from it. Labeling it accidental won’t make it go away.

Instead of ignoring it or dreaming about an “ideal world” where all of these aspects of software development can be ignored, we should accept it. Not only accept it, in fact, but see it as our job to handle. Software developers should provide expertise not just in programming or in running software in production, but also in the externalization of mental models to facilitate communication and enable collaborative modelling. Software development is largely a communication problem. We should take active part in defining, delineating, describing and exploring the problem domain itself, which goes beyond the software system. We should contribute to better concepts and a richer language to describe the domain. It will help us uncover new and better problem descriptions, which will lead to new and better software systems. This exploration is a never-ending process of discovery, negotiation and reevaluation. We should lead in this effort, not wait for someone else to do it for us.

When we pretend that there is such a thing as “the essential problem” that the user can hand over to “the development team” for implementation, we are being naive Platonists. We’re acting as if “the problem” is something stable and eternal, an a priori, celestial entity that we can uncover. But that is not the reality of most problem domains. It may be possible to identify such problems for purely abstract, mathematical structures – structures that need no grounding in the fleeting world that we inhabit. But most software systems don’t deal with such structures.

Instead, most programs or software systems deal with informal, ambiguous, self-contradictory, fluctuating, unstable problems in a shifting, dynamic world. “The problem that needs solving” is always in a state of negotiation and partial understanding. Assumptions and presumed invariants are rendered invalid by a reality that has no particular regard for our attempts to describe it. Indeed, there can be no innovation without the invalidation of existing models! The problem of software development is not “how to implement a solution to a given problem without shooting yourself in the foot”. It is to formalize something that in its nature is informal and unformalizable. As Stephen Jay Gould puts it in “What, if anything, is a zebra?“, “I do not believe that nature frustrates us by design, but I rejoice in her intransigence nonetheless.”

As software developers, we can’t turn a blind eye to this state of affairs. It is an intrinsic and hence essential problem in software development, and one that we must tackle head-on. In “The World and the Machine“, Michael A. Jackson refers to what he calls the “Von Neumann principle”:

There is no point in using exact methods where there is no clarity in the concepts and issues to which they are to be applied.

This means that we must gain a deep understanding and a rich language to describe the problem domain itself, not just the software system we want to operate in that problem domain.

The challenge is to fight an impossible battle successfully. We must constantly try to pin down a problem to a sufficient degree to be able to construct a useful machine that helps solve the problem, as we currently understand it. We must accept that this solution is temporary since the problem will change. And then we must try to keep a dance going between a fundamentally unstable problem and a machine that longs for stability, without toppling over.

We can’t hope to be successful in this endeavor if we ignore the nature of this process. An account of complexity in software that doesn’t account for the continuous tension between a necessarily formal system and an irreducibly informal world is missing something essential about software development. That’s why “Out of the Tar Pit” is wrong.

I think we need to accept and embrace the tar-pit. At least then we’re grappling with the real causes of complexity. The real world is a hot and sticky place. This is where our software systems and we, as software developers, must operate. Nothing of interest is ever going to be easy. But perhaps we can take heart that everything is still possible!

Proper JSON and property bags

I recently wrote a blog post where I argued that “JSON serialization” as commonly practiced in the software industry is much too ambitious. This is the case at least in the .NET and Java ecosystems. I can’t really speak to the state of affairs in other ecosystems, although I note that the amount of human folly appears to be a universal constant much like gravity.

The problem is that so-called “JSON serializers” handle not just serialization and deserialization of JSON, they tend to support arbitrary mapping between the JSON model and some other data model as well. This additional mapping, I argued, causes much unnecessary complexity and pain. Whereas serialization and deserialization of JSON is a “closed” problem with bounded complexity, arbitrary mapping between data models is an “open” problem with unbounded complexity. Hence JSON serializers should focus on the former and let us handle the latter by hand.

I should add that there is nothing that forces developers to use the general data model mapping capabilities of JSON serializers of course. We’re free to use them in much more modest ways. And we should.

That’s all well and good. But what should we do in practice? There are many options open to us. In this blog post I’d like to explore a few. Perhaps we’ll learn something along the way.

Before we proceed though, we should distinguish cleanly between serialization and deserialization. When a software module uses JSON documents for persistence, it may very well do both. In many cases, however, a module will do one or the other. A producer of JSON documents only does serialization, a consumer only does deserialization. In general, the producer and consumer are separate software modules, perhaps written in different languages by different teams.

It looks like this:

Source and target models for JSON serialization and deserialization

There doesn’t have to be a bi-directional mapping between a single data model and JSON text. There could very well be two independent unidirectional mappings, one from a source model to JSON (serialization) and the other from JSON to a target model (deserialization). The source model and the target model don’t have to be the same. Why should they be? Creating a model that is a suitable source for serialization is a different problem from creating a model that is a suitable target for deserialization. We are interested in suitable models for the task at hand. What I would like to explore, then, are some alternatives with respect to what the JSON serializer actually should consume during serialization (the source model) and produce during deserialization (the target model).

In my previous blog post I said that the best approach was to use “an explicit representation of the JSON data model” to act as an intermediate step. You might be concerned about the performance implications of the memory allocations involved in populating such a model. I am not, to be honest. Until I discover that those allocations are an unacceptable performance bottleneck in my application, I will optimize for legibility and changeability, not memory footprint.

But let’s examine closer what a suitable explicit representation could be. JSON objects are property bags. They have none of the ambitions of objects as envisioned in the bold notion of object-oriented programming put forward by Alan Kay. There is no encapsulation and definitely no message passing involved. They’re not alive. You can’t interact with them. They’re just data. That may be a flaw or a virtue depending on your perspective, but that’s the way it is. JSON objects are very simple things. A JSON object has keys that point to JSON values, which may be null, true, false, a number, a string, an array of JSON values, or another object. That’s it. So the question becomes: what is an appropriate representation for those property bags?

JSON serializers typically come with their own representation of the JSON data model. To the extent that it is public, this representation is an obvious possibility. But what about others?

I mentioned in my previous blog post that the amount of pain related to ambitious “JSON serializers” is proportional to the conceptual distance involved in the mapping, and to the rate of change of that mapping. In other words, if the model you’re mapping to or from is significantly different from the JSON model, there will be pain. If the model changes often, there will be no end to the pain. It’s a bad combination. Conversely, if you have a model that is very close to the JSON model and that hardly ever changes, the amount of pain will be limited. We are still technically in the land of unbounded complexity, but if we are disciplined and stay close to the border, it might not be so bad? A question might be: how close to the JSON model must we stay to stay out of trouble? Another might be: what would be good arguments to deviate from the JSON model?

When exploring these alternatives, I’ll strive to minimize the need for JSON serializer configuration. Ideally there should be no configuration at all, it should just work as expected out of the box. In my previous blog post I said that the black box must be kept closed lest the daemon break free and wreak havoc. Once we start configuring, we need to know the internal workings of the JSON serializer and the fight to control the daemon will never stop. Now your whole team needs to be experts in JSON serializer daemon control. Let’s make every effort to stay out of trouble and minimize the need for configuration. In other words, need for configuration counts very negatively in the evaluation of a candidate model.

Example: shopping-cart.json

To investigate our options, I’m going to need an example. I’m going to adapt the shopping cart example from Scott Wlaschin’s excellent book on domain modelling with F#.

A shopping cart can be in one of three states: empty, active or paid. How would we represent something like that as JSON documents?

First, an empty shopping cart.

Second, an active shopping cart with two items in it, a gizmo and a widget. You’ll notice that items may have an optional description that we include when it’s present.

And finally, a paid shopping cart with two items in it, the amount and currency paid, and a timestamp for the transaction.
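For concreteness, here is a sketch of what the three documents might look like. The exact property names (and the gizmo and widget themselves) are my own inventions for illustration:

```json
{ "_state": "empty" }
```

```json
{
  "_state": "active",
  "unpaidItems": [
    { "name": "gizmo", "description": "A gizmo of the finest quality" },
    { "name": "widget" }
  ]
}
```

```json
{
  "_state": "paid",
  "paidItems": [
    { "name": "gizmo", "description": "A gizmo of the finest quality" },
    { "name": "widget" }
  ],
  "amount": 123.45,
  "currency": "NOK",
  "timestamp": "2021-02-08T12:30:00Z"
}
```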

You’ll notice that I’ve added a _state property to make it easier for a client to check which case they’re dealing with. This is known as a discriminator in OpenAPI, and can be used with the oneOf construct to create a composite schema for a JSON document.

So what are our options for explicit representations of these JSON documents in code?

We’ll take a look at the following:

  • Explicit JSON model (Newtonsoft)
  • Explicit DTO model
  • Anonymous DTO model
  • Dictionary

Explicit JSON model

Let’s start by using an explicit JSON model. An obvious possibility is to use the JSON model from whatever JSON serializer library we happen to be using. In this case, we’ll use the model offered by Newtonsoft.

We’ll look at serialization first. Here’s how we might create a paid cart as a JObject and use it to serialize to the appropriate JSON.
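A sketch of what that might look like, building the paid cart document property by property (the property names are my own illustrative choices):

```csharp
using Newtonsoft.Json.Linq;

// Construct the paid cart explicitly as a JObject.
var paidCart = new JObject(
    new JProperty("_state", "paid"),
    new JProperty("paidItems", new JArray(
        new JObject(
            new JProperty("name", "gizmo"),
            new JProperty("description", "A gizmo of the finest quality")),
        new JObject(
            new JProperty("name", "widget")))),
    new JProperty("amount", 123.45),
    new JProperty("currency", "NOK"),
    new JProperty("timestamp", "2021-02-08T12:30:00Z"));

// ToString() on a JObject yields the JSON text.
string json = paidCart.ToString();
```

Note that the optional description is simply present for the gizmo and absent for the widget. No special handling required.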

There’s no denying it: it is a bit verbose. At the same time, it’s very clear what we’re creating. We are making no assumptions that could be invalidated by future changes. We have full control over the JSON since we are constructing it by hand. We have no problems with optional properties.

What about deserialization?

The deserialization itself is trivial, a one-liner. More importantly: there is no configuration involved, which is great news. Deserialization is often a one-liner, but you have to set up and configure the JSON serializer “just so” to get the output you want. Not so in this case. There are no hidden mechanisms and hence no surprises.

We can read data from the deserialized JObject by using indexers, which read pretty nicely. Unfortunately the last step is a little bit cumbersome, since we need to cast the JToken to a JValue before we can actually get to the value itself. Also, we obviously have to make sure that we get the property names right.
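A sketch, assuming a string `json` holding the paid cart document (property names again illustrative):

```csharp
using Newtonsoft.Json.Linq;

// The deserialization itself: a single call.
JObject cart = JObject.Parse(json);

// Reading values with indexers. Indexing yields JToken instances,
// so we cast to JValue to get at the underlying value.
var state = (string) cart["_state"];
var amount = ((JValue) cart["amount"]).Value;
var firstItemName = (string) cart["paidItems"][0]["name"];
```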

A drawback of using Newtonsoft’s JSON model is, of course, that we get locked in to Newtonsoft. If we decide we want to try a hot new JSON serializer for whatever reason, we have to rewrite a bunch of pretty boring code. An alternative would be to create our own simple data model for JSON. But that approach has its issues too. Not only would we have to implement that data model, but we would probably have to teach our JSON serializer how to use it as a serialization source or deserialization target as well. A lot of work for questionable gain.

Explicit DTO model

Many readers of my previous blog post said they mitigated the pain of JSON serialization by using dedicated data transfer objects or DTOs as intermediaries between their domain model and any associated JSON documents. The implied cure for the pain, of course, is that the DTOs are much nearer to the JSON representation than the domain model is. The DTOs don’t have to concern themselves with things such as data integrity and business rules. The domain model will handle all those things. The domain model in turn doesn’t need to know that such a thing as JSON even exists. This gives us a separation of concerns, which is great.

However, the picture is actually a little bit more complex.

JSON serialization and deserialization with DTOs.

To keep the drawing simple, I’m pretending that there is a single DTO model and a bi-directional mapping between the DTO and the JSON. That doesn’t have to be the case. There might well be just a unidirectional mapping.

Even with a DTO, we have ventured into the land of unbounded complexity, on the tacit promise that we won’t go very far. The pain associated with JSON serialization will be proportional to the distance we travel. So let’s agree to stay within an inch of actual JSON. In fact, let’s just treat our explicit DTO model as named, static property bags.

To minimize pain, we’ll embrace some pretty tough restrictions on our DTOs. We’ll only allow properties of the following types: booleans, numbers (integers and doubles), strings, arrays, lists and objects that are themselves also DTOs. That might seem like a draconian set of restrictions, but it really just follows from the guideline that JSON serialization and deserialization should work out of the box, without configuration.

You’ll probably notice that there are no types representing dates or times in that list. The reason is that there are no such types in JSON. Dates and times in JSON are just strings. Ambitious JSON serializers will take a shot at serializing and deserializing types like DateTime for you of course, but the exact behavior varies between serializers. You’d have to know what your JSON serializer of choice happens to do, and you’d have to configure your JSON serializer to override the default behavior if you didn’t like it. That, to me, is venturing too far from the JSON model. I’ve seen many examples of developers being burned by automatic conversion of dates and times by JSON serializers.

Even with those restrictions, we still have many choices to make. In fact, it’s going to be a little difficult to achieve the results we want without breaking the no-configuration goal.

First we’re going to have to make some decisions about property names. In C#, the convention is to use PascalCase for properties, whereas our JSON documents use camelCase. This is a bit of a “when in Rome” issue, with the complicating matter that there are two Romes. There are two possible resolutions to this problem.

One option is to combine an assumption with an admission. That is, we can 1) make the assumption that our JSON documents will only contain “benign” property names that don’t contain whitespace or control characters and 2) accept that our DTOs will have property names that violate the sensibilities of a C# style checker. That will yield the following set of DTOs:
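A sketch of such DTOs, assuming shopping cart documents with a _state discriminator and camelCase properties (the names are my own illustrative choices):

```csharp
using System.Collections.Generic;

// Property names deliberately mirror the JSON, C# conventions be damned.
public abstract class ShoppingCart
{
    public string _state { get; set; }
}

public class EmptyCart : ShoppingCart { }

public class ActiveCart : ShoppingCart
{
    public List<Item> unpaidItems { get; set; }
}

public class PaidCart : ShoppingCart
{
    public List<Item> paidItems { get; set; }
    public double amount { get; set; }
    public string currency { get; set; }
    public string timestamp { get; set; }
}

public class Item
{
    public string name { get; set; }
    public string description { get; set; }  // optional: may be null
}
```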

Depending on your sensibilities, you may have run away screaming at this point. A benefit, however, is that it works reasonably well out of the box. The property names in the DTOs and in the JSON are identical, which makes sense since the DTOs are a representation of the same property bags we find in the JSON. In this scenario, coupling of names is actually a good thing.

Another option is to add custom attributes to the properties of our DTOs. Custom attributes are a mechanism that some JSON serializers employ to let us create an explicit mapping between property names in our data model and property names in the JSON document. This clearly is a violation of the no-configuration rule, though. Do it at your own peril.
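With Newtonsoft, the mechanism is the [JsonProperty] attribute. A sketch of the resulting DTOs, with the same illustrative names as assumptions:

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

public abstract class ShoppingCart
{
    [JsonProperty("_state")]
    public string State { get; set; }
}

public class EmptyCart : ShoppingCart { }

public class ActiveCart : ShoppingCart
{
    [JsonProperty("unpaidItems")]
    public List<Item> UnpaidItems { get; set; }
}

public class PaidCart : ShoppingCart
{
    [JsonProperty("paidItems")]
    public List<Item> PaidItems { get; set; }

    [JsonProperty("amount")]
    public double Amount { get; set; }

    [JsonProperty("currency")]
    public string Currency { get; set; }

    [JsonProperty("timestamp")]
    public string Timestamp { get; set; }
}

public class Item
{
    [JsonProperty("name")]
    public string Name { get; set; }

    // NullValueHandling.Ignore drops the property from the output when null.
    [JsonProperty("description", NullValueHandling = NullValueHandling.Ignore)]
    public string Description { get; set; }
}
```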

This yields perhaps more conventional-looking DTOs. They are, however, now littered with custom attributes specific to the JSON serializer I’m using. There’s really a lot of configuration going on: every property is being reconfigured to use a different name.

We also have the slightly strange situation where the property names for the DTOs don’t really matter. It is a decoupling of sorts, but it doesn’t really do much work for us, seeing as the whole purpose of the DTO is to represent the data being transferred.

But ok. Let’s look at how our DTOs hold up as source models for serialization and target models for deserialization, respectively.

Here’s how you would create an instance of DTO v1 and serialize it to JSON.
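A sketch, assuming camelCase DTO classes PaidCart and Item (hypothetical names) whose properties mirror the JSON property names:

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

var paidCart = new PaidCart
{
    _state = "paid",
    paidItems = new List<Item>
    {
        new Item { name = "gizmo", description = "A gizmo of the finest quality" },
        new Item { name = "widget" }  // description stays null
    },
    amount = 123.45,
    currency = "NOK",
    timestamp = "2021-02-08T12:30:00Z"
};

// Out of the box, the widget gets "description": null in the output.
string json = JsonConvert.SerializeObject(paidCart);
```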

It’s pretty succinct and legible, and arguably looks quite similar to the JSON text it serializes to. However, there is a small caveat: our optional description is included with a null value in the JSON. That’s not really what we aimed for. To change that behaviour, we can configure our JSON serializer to omit properties with null values from the serialized output. But now we have two problems. The first is that we had to resort to configuration, the second is that we’ve placed a bet: that all properties with null values should always be omitted from the output. That’s the case today, but it could definitely change. To gain more fine-grained control, we’d have to dig out more granular and intrusive configuration options, like custom attributes or custom serializers. Or perhaps some combination? That’s even worse, now our configuration is spread over multiple locations – who knows what the aggregated behavior is and why?

What about DTO v2? The code looks very similar, except it follows C# property naming standards and at the same time deviates a little bit from the property names that we actually find in the JSON document. We’d have to look at the definition of the PaidCart to convince ourselves that it probably will serialize to the appropriate JSON text, since we find the JSON property names there – not at the place we’re creating our DTO.
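A sketch of the same cart built with the attribute-mapped, PascalCase DTO v2 (again hypothetical names):

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

var paidCart = new PaidCart
{
    State = "paid",  // maps to "_state" via [JsonProperty]
    PaidItems = new List<Item>
    {
        new Item { Name = "gizmo", Description = "A gizmo of the finest quality" },
        new Item { Name = "widget" }
    },
    Amount = 123.45,
    Currency = "NOK",
    Timestamp = "2021-02-08T12:30:00Z"
};

string json = JsonConvert.SerializeObject(paidCart);
```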

A benefit is that since we already littered the DTO with custom attributes, I made sure to add a NullValueHandling.Ignore to the Description property, so that the property is not included in the JSON if the value is null. Of course I had to Google how to do it, since I can’t ever remember all the configuration options and how they fit together.

So that’s serialization. We can get it working, but it’s obvious that the loss of control compared to using the explicit JSON model is pushing us towards making assumptions and having to rely on configuration to tweak the JSON output. We’ve started pushing buttons and levers. The daemon is banging against walls of the black box.

What about deserialization? Here’s how it looks for a paid cart using DTO v2:
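A sketch, again assuming the attribute-mapped PaidCart DTO:

```csharp
using System;
using Newtonsoft.Json;

// Works fine - if we already know the document holds a paid cart.
var paidCart = JsonConvert.DeserializeObject<PaidCart>(json);
Console.WriteLine($"{paidCart.Amount} {paidCart.Currency}");
```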

Well, what can I say. It’s quite easy if we know in advance whether we’re dealing with an empty cart, an active cart or a paid cart! And it’s very easy to access the various property values.

But of course we generally don’t know what kind of shopping cart the JSON document describes. That information is in the JSON document!

What we would like to write in our code is something like this:
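Something along these lines, deserializing to the common base class and branching on the runtime type (a sketch, assuming the hypothetical DTO class hierarchy):

```csharp
using System;
using Newtonsoft.Json;

// What we would like to write - but the serializer has no way of
// knowing which subclass of ShoppingCart to instantiate.
ShoppingCart cart = JsonConvert.DeserializeObject<ShoppingCart>(json);

switch (cart)
{
    case PaidCart paid:
        Console.WriteLine($"Paid {paid.Amount} {paid.Currency}");
        break;
    case ActiveCart active:
        Console.WriteLine($"{active.UnpaidItems.Count} item(s) in the cart");
        break;
    case EmptyCart _:
        Console.WriteLine("The cart is empty");
        break;
}
```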

But the poor JSON serializer can’t do that, not without help! The problem is that the JSON serializer doesn’t know which subclass of ShoppingCart to instantiate. In fact, it doesn’t even know that the subclasses exist.

We have three choices at this point. First, we can create a third variation of our DTO, one that doesn’t have this problem. We could just collapse our fancy class hierarchy and use something like this:
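For instance (a sketch, with nullable properties standing in for the three cases):

```csharp
using System.Collections.Generic;

// All three document kinds muddled into a single class; most
// properties will be null, depending on the _state.
public class ShoppingCart
{
    public string _state { get; set; }
    public List<Item> unpaidItems { get; set; }  // only for active carts
    public List<Item> paidItems { get; set; }    // only for paid carts
    public double? amount { get; set; }          // only for paid carts
    public string currency { get; set; }         // only for paid carts
    public string timestamp { get; set; }        // only for paid carts
}
```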

It’s not ideal, to put it mildly. I think we can probably agree that this is not a good DTO, as it completely muddles together what was clearly three distinct kinds of JSON documents. We’ve lost that now, in an effort to make the JSON deserialization process easier.

The second option is to pull out the big guns and write a custom deserializer. That way we can sneak a peek at the _state property in the JSON document, and based on that create the appropriate object instance. Managing that requires, needless to say, a fair bit of knowledge of the workings of our JSON serializer. Chances are your custom deserializer will be buggy. If the JSON document format and hence the DTOs are subject to change (as typically happens), chances are it will stay buggy over time.

The third option is to protest against the design of the JSON documents! That would mean that we’re letting our problems with deserialization dictate our communication with another software module and potentially a different team of developers. It is not the best of reasons for choosing a design, I think. After all, there are alternative target models for deserialization that don’t have these problems. Why can’t we use one of them? But we might still be able to pull it off, if we really want to. It depends on our relationship with the supplier of the JSON document. It is now a socio-technical issue that involves politics and power dynamics between organizations (or different parts of the same organization): do we have enough leverage with the supplier of the JSON document to make them change their design to facilitate deserialization at our end? Do we want to exercise that leverage? What are the consequences?

It’s worth noting that these problems only apply to DTOs as target models for deserialization. As source models for serialization, we can use our previous two variations, with the caveats mentioned earlier.

To conclude then, explicit DTOs are relatively straightforward as source models for serialization, potentially less so as target models for deserialization. A general drawback of using explicit DTOs is that we must write, maintain and configure a bunch of classes. That should be offset by some real, tangible advantage. Is it?

Anonymous classes

We can avoid the chore of having to write and maintain such classes by using anonymous classes in C# as DTOs instead. It might not be as silly as it sounds, at least for simple use cases.

For the serialization case, it would look something like this:
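A sketch using Newtonsoft.Json (the concrete item names and prices are made up):

```csharp
using Newtonsoft.Json;

var cart = new
{
    _state = "paid",
    paidItems = new object[]
    {
        new { name = "apple", price = 10 },
        new { name = "pear", price = 12, description = "juicy" }
    }
};

string json = JsonConvert.SerializeObject(cart);
// json: {"_state":"paid","paidItems":[{"name":"apple","price":10},{"name":"pear","price":12,"description":"juicy"}]}
```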

This is actually very clean! The code looks really similar to the target JSON output. You may notice that the paidItems array is typed as object. This is to allow for the optional description of items. The two items are actually instances of distinct anonymous classes generated by the compiler. One is a DTO with two properties, the other a DTO with three properties. For the compiler, the two DTOs have no more in common than the fact that they are both objects.

As long as we’re fine with betting that the property names of the target JSON output will never contain whitespace or control characters, this isn’t actually a bad choice. No configuration is necessary to handle the optional field appropriately.

A shortcoming compared to explicit DTOs is ease of composition and reuse across multiple DTOs. That’s not an issue in the simple shopping cart example, but you are likely to encounter it in a real-world scenario. Presumably you will have smaller DTOs that are building blocks for multiple larger DTOs. That might be more cumbersome to do using anonymous DTOs.

What about deserialization? Surely it doesn’t make sense to use an anonymous type as target model for deserialization? Newtonsoft thinks otherwise! Ambitious JSON serializers indeed!
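Newtonsoft supports this through DeserializeAnonymousType, which takes a throw-away instance as a template. A sketch (the JSON literal is a simplified stand-in for the paid cart document):

```csharp
using Newtonsoft.Json;

string json = "{\"_state\":\"paid\",\"paidItems\":[{\"name\":\"apple\",\"price\":10}]}";

// The instance exists only so the compiler generates an anonymous type
// for the serializer to reflect over.
var template = new { _state = "", paidItems = new object[0] };

var cart = JsonConvert.DeserializeAnonymousType(json, template);
// cart._state is now "paid".
```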

This actually works, but it’s a terrible idea, I hope you’ll agree. Creating a throw-away instance of an anonymous type in order to be able to reflect over the type definition is not how you declare types. It’s convoluted and confusing.

So while it is technically possible to use anonymous DTOs as target models for deserialization, you really shouldn’t. As source models for serialization, however, anonymous DTOs are not too bad. In fact, they have some advantages over explicit DTOs in that you don’t have to write and maintain them yourself.


Dictionaries

Finally, we come to the venerable old dictionary! For representing a property bag, it really is the obvious choice, isn’t it? A property bag is literally what a dictionary is. In particular, it should be a dictionary that uses strings for keys and objects for values.

Here is a dictionary used as serialization source:
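Something like this, using Newtonsoft.Json (the item names and prices are made up):

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

var cart = new Dictionary<string, object>
{
    ["_state"] = "paid",
    ["paidItems"] = new List<object>
    {
        new Dictionary<string, object>
        {
            ["name"] = "apple",
            ["price"] = 10
        },
        new Dictionary<string, object>
        {
            ["name"] = "pear",
            ["price"] = 12,
            ["description"] = "juicy"   // the optional field needs no configuration
        }
    }
};

string json = JsonConvert.SerializeObject(cart);
```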

It is more verbose than the versions using explicit or anonymous DTOs above. I’m using the dictionary initializer syntax in C# to make it as compact as possible, but still.

It is very straightforward however. It makes no assumptions and places no bets against future changes to the JSON document format. Someone could decide to rename the paidItems property in the JSON to paid items and we wouldn’t break a sweat. The code change would be trivial. Moreover the effect of the code change would obviously be local – there would be no surprise changes to the serialization of other properties.

What about the deserialization target scenario, which caused so much trouble for our DTOs? We would like to be able to write something like this:
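That is, a single call that gives us nested dictionaries all the way down:

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

string json = "{\"_state\":\"paid\",\"paidItems\":[{\"name\":\"apple\",\"price\":10}]}";

var cart = JsonConvert.DeserializeObject<Dictionary<string, object>>(json);
// cart["_state"] comes back as "paid", but cart["paidItems"] comes back as
// a Newtonsoft JArray, not as a List<object> of nested dictionaries.
```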

Alas, it doesn’t work! The reason is that while we can easily tell the JSON serializer that we want the outermost object to be a dictionary, it doesn’t know that we want that rule to apply recursively. For the nested JSON objects and JSON arrays, it doesn’t know what we want, so it must revert to its defaults.

We’re back to custom deserializers, in fact. Deserialization really is much more iffy than serialization. The only good news is that deserialization of JSON into a nested structure of string-to-object dictionaries, object lists and primitive values is again a closed problem. It is not subject to change. We could do it once, and not have to revisit it again. Since our target model won’t change, our custom deserializer won’t have to change either. So while it’s painful, the pain is at least bounded.

Here is a naive attempt at an implementation, thrown together in maybe half an hour:
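A sketch in that naive spirit, going via Newtonsoft’s JToken model as an intermediate step (an approximation, not necessarily how the half-hour original looked):

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class NaiveJson
{
    public static object Deserialize(string json) => ToPlain(JToken.Parse(json));

    // Recursively map the JSON model to dictionaries, lists and primitives.
    private static object ToPlain(JToken token)
    {
        switch (token.Type)
        {
            case JTokenType.Object:
                var dict = new Dictionary<string, object>();
                foreach (var property in ((JObject)token).Properties())
                    dict[property.Name] = ToPlain(property.Value);
                return dict;
            case JTokenType.Array:
                var list = new List<object>();
                foreach (var element in (JArray)token)
                    list.Add(ToPlain(element));
                return list;
            default:
                // JValue covers null, booleans, numbers and strings.
                return ((JValue)token).Value;
        }
    }
}
```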

It probably has bugs. It didn’t crash on my one test input (the paid cart JSON we’ve seen multiple times in this blog post); that’s all the verification I have done. Writing custom deserializers is a pain, and few developers have enough time available to become experts at it. I’m certainly no expert; I have to look it up and go through a slow discovery process every time. But there is a chance that it might one day become relatively bug-free, since the target isn’t moving. There are no external sources of trouble.

With the custom deserializer, deserializing to a dictionary looks like this:
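Something like this fragment, where Deserialize stands in for whatever custom deserializer we ended up writing (it returns nested string-to-object dictionaries, object lists and primitive values):

```csharp
// Deserialize is a stand-in for our custom deserializer.
var cart = (Dictionary<string, object>)Deserialize(json);

var state = (string)cart["_state"];
var items = (List<object>)cart["paidItems"];
var firstItem = (Dictionary<string, object>)items[0];
var name = (string)firstItem["name"];
```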

There is a lot of casting going on. We might be able to gloss it over a bit by offering some extension methods on dictionary and list. I’m not sure whether it would help or make matters worse.
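To illustrate, such extension methods might look something like this (the names are invented):

```csharp
using System.Collections.Generic;

// Hypothetical extension methods to hide some of the casting.
public static class PropertyBagExtensions
{
    public static Dictionary<string, object> Obj(this Dictionary<string, object> bag, string key) =>
        (Dictionary<string, object>)bag[key];

    public static List<object> Arr(this Dictionary<string, object> bag, string key) =>
        (List<object>)bag[key];

    public static T Val<T>(this Dictionary<string, object> bag, string key) =>
        (T)bag[key];
}
```

The casts are still there, of course; they have just moved out of sight.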

The reason lies in JSON’s nature, I guess. It is a completely heterogeneous property bag. It’s never going to be a frictionless thing in a statically typed language, at least not one of the C# ilk.


What did we learn? Did we learn anything?

Well, we learned that deserialization in general is much more bothersome than serialization. Perhaps we already suspected as much, but it really became painfully clear, I think. In fact, the only target model that will let us do deserialization without either extensive configuration, making bets against the future or potentially engaging in organizational tug of war is the explicit JSON model. But luckily that’s actually a very clean model as well. The explicit JSON model is verbose when you use it to create instances by hand. But we’re not doing that. The JSON serializer does all of that, and it does it robustly because it’s the JSON serializer’s own model. Reading values out of the JSON model is actually quite succinct and nice. And when we’re deserializing, we’re only reading. I therefore recommend using that model as target model for deserialization.
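For illustration, reading values out of Newtonsoft’s explicit JSON model looks something like this (the JSON literal is a simplified stand-in for the paid cart document):

```csharp
using Newtonsoft.Json.Linq;

string json = "{\"_state\":\"paid\",\"paidItems\":[{\"name\":\"apple\",\"price\":10}]}";

var cart = JObject.Parse(json);

var state = (string)cart["_state"];               // "paid"
var name = (string)cart["paidItems"][0]["name"];  // "apple"
```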

For serialization, there is more competition and hence the conclusion is less clear-cut. The explicit JSON model is still a good choice, but it is pretty verbose. You might prefer to use a dictionary or some sort of DTO, either explicit or anonymous. However, both of the latter come with some caveats and pitfalls. I think actually the good old dictionary might be the best choice as source model for serialization.

What do you think?

On the complexity of JSON serialization

I vented a bit on Twitter the other day about my frustrations with JSON serialization in software development.

I thought I’d try to write it out in a bit more detail.

I’m going to embrace ambiguity and use the term “serialization” to mean both actual serialization (producing a string from some data structure) and deserialization (producing a data structure from some string). The reason is that our terminology sucks, and we have no word that encompasses both. No, (de-)serialization is not a word. I think you’ll be able to work out which sense I’m using at any given point. You’re not a machine after all.

Here’s the thing: on every single software project or product I’ve worked on, JSON serialization has been an endless source of pain and bugs. It’s a push stream of trouble. Why is that so? What is so inherently complicated about the problem of JSON serialization that we always, by necessity, struggle with it?

It’s weird, because the JSON object model is really really simple. Moreover, it’s a bounded, finite set of problems, isn’t it? How do you serialize or deserialize JSON? Well, gee, you need to map between text and the various entities in the JSON object model. Specifically, you need to be able to handle the values null, true and false, you need to handle numbers, strings and whitespace (all of which are unambiguously defined), and you need to handle arrays and objects of values. That’s it. Once you’ve done that, you’re done. There are no more problems!
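In code terms, using Newtonsoft’s JToken types as one concrete representation of the JSON object model, the whole problem is a round trip:

```csharp
using Newtonsoft.Json.Linq;

// Text to JSON model: nulls, booleans, numbers, strings, arrays, objects...
var model = JToken.Parse("{\"ok\": true, \"values\": [1, 2.5, null, \"x\"]}");

// ...and JSON model back to text. That is the entire, bounded problem.
string text = model.ToString(Newtonsoft.Json.Formatting.None);
```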

I mean, it’s probably an interesting engineering challenge to make that process fast, but that’s not something that should ever end up causing us woes. Someone else could solve that problem for us once and for all. People have solved problems much, much more complicated than that once and for all.

But I’m looking at the wrong problem of course. The real problem is something else, because “JSON serialization” as commonly practiced in software development today is much more than mere serializing and deserializing JSON!

This is actual JSON serialization:

[Diagram: actual JSON serialization – mapping between a text string and the JSON object model.]

This problem has some very nice properties! It is well-defined. It is closed. It has bounded complexity. There are no sources of new complexity unless the JSON specification itself is changed. It is an eminent candidate for a black box magical solution – some highly tuned, fast, low footprint enterprise-ready library or other. Great.

This, however, is “JSON serialization” as we practice it:

[Diagram: JSON serialization as practised – mapping between a text string and our own data model.]

“JSON serialization” is not about mapping to or from a text string a single, canonical, well-defined object model. It is much more ambitious! It is about mapping between a text string containing JSON and some arbitrarily complex data model that we invented using a programming language of our choice. The reason, of course, is that we don’t want to work with a JSON representation in our code, we want to work with our own data structure. We may have a much richer type system, for instance, that we would like to exploit. We may have business rules that we want to enforce. But at the same time, it’s so tedious to write the code to map between representations. Boilerplate, we call it, because we don’t like it. It would be very nice if the “JSON serializer” could somehow produce our own, custom representation directly! Look ma, no boilerplate! But now the original problem has changed drastically.

It now includes this:

[Diagram: generic data model mapping.]

The sad truth is that this belongs to a class of problems that are both boring and non-trivial. Such problems do exist. This one is general data model mapping, where only one side has fixed properties. It could range from very simple (if the two models are identical) to incredibly complex or even unsolvable. It depends on your concrete models. And since models are subject to change, so is the complexity of your problems. Hence the endless stream of pain and bugs mentioned above.

How does an ambitious “JSON serializer” attempt to solve this problem? It can’t really know how to do the mapping correctly, so it must guess, based on conventions and heuristics. Like if two names are the same, you should probably map between them. Probably. Like 99% certain that it should map. Obviously it doesn’t really know about your data model, so it needs to use the magic of reflection. For deserialization, it needs to figure out the correct way to construct instances of your data representation. What if there are multiple ways? It needs to choose one. Sometimes it will choose or guess wrong, so there need to be mechanisms for giving it hints to rectify that. What if there are internal details in your data representation that don’t have a natural representation in JSON? It needs to know about that. More mechanisms to put in place. And so on and so forth, ad infinitum.

This problem has some very bad properties! It is ill-defined. It is open. It has unbounded, arbitrary complexity. There are endless sources of new complexity, because you can always come up with new ways of representing your data in your own code, and new exceptions to whatever choices the “JSON serializer” needs to make. The hints you gave it may become outdated. It’s not even obvious that there exists an unambiguous mapping to or from your data model and JSON. It is therefore a terrible candidate for a black box magical solution!

It’s really mind-boggling that we can talk about “single responsibility principle” with a grave expression on our faces and then happily proceed to do our “JSON serialization”. Clearly we’re doing two things at once. Clearly our “JSON serializer” now has two responsibilities, not one. Clearly there is more than one reason to change. And yet here we are. Because it’s so easy, until it isn’t.

But there are more problems. Consider a simple refactoring: changing the name of a property of your data model, for instance. It’s trivial, right? Just go ahead and change it! Now automatically your change also affects your JSON document. Is that what you want? Always? To change your external contract when your internal representation changes? You want your JSON representation tightly coupled to your internal data model? Really? “But ha, ha! That’s not necessarily so!” you may say, because you might be a programmer and a knight of the technically correct. “You can work around that!” And indeed you can. You can use the levers on the black box solution to decouple what you’ve coupled! Very clever! You can perhaps annotate your property with an explicit name, or even take full control over the serialization process by writing your own custom serializer plug-in thing. But at that point it is time for a fundamental question: “why are you doing this again?”.

Whenever there is potentially unbounded complexity involved in a problem, you really want full control over the solution. You want maximum transparency. Solving the problem by trying to give the black box the right configurations and instructions is much, much more difficult than just doing it straightforwardly “by hand”, as it were. By hand, there are no “exceptions to the default”, you just make the mapping you want. Conversely, if and when you summon a daemon to solve a problem using the magic of reflection, you really want that problem to be a fixed one. Keep the daemon locked in a sealed box. If you ever have to open the box, you’ve lost. You’ll need to tend to the daemon endlessly, and its moods vary. It is a Faustian bargain.

So what am I suggesting? I’m suggesting letting JSON serialization be about JSON only. Let JSON serializer libraries handle translating between text and a representation of the JSON object model. They can do that one job really well, quickly and robustly. Once you have that, you take over! You take direct control over the mapping from JSON to your own model.

It looks like this:

[Diagram: JSON serialization and mapping by hand.]

There is still potentially arbitrary complexity involved of course, in the mapping between JSON and your own model. But it is visible, transparent complexity that you can address with very simple means. So simple that we call it boilerplate.
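As a sketch of what that boilerplate might look like with Newtonsoft’s JSON model (the domain type here is invented):

```csharp
using Newtonsoft.Json.Linq;

// An invented domain type; yours will look different.
public record PaidItem(string Name, int Price, string Description);

public static class PaidItemMapper
{
    // The mapping is dumb, explicit and entirely under our control.
    public static PaidItem FromJson(JToken item) =>
        new PaidItem(
            Name: (string)item["name"],
            Price: (int)item["price"],
            Description: (string)item["description"]); // null when the field is absent
}
```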

There is a famous paper by Fred Brooks Jr. called “No Silver Bullet“. In it, Brooks distinguishes between “essential” and “accidental” complexity. That’s an interesting distinction, worthy of a discussion of its own. But I think it’s fair to say that for “JSON serialization”, we’re deep in the land of the accidental. There is nothing inescapable about the complexity of serializing and deserializing JSON.

Death of a Craftsman

This is a transcript of a talk I did at KanDDDinsky 2019 in Berlin.

Hi! My name is Einar. I design and write software for a living, presumably like you. You could call me a software developer, a coder, a programmer, and also a software designer, I guess. Those are all things I do and labels that I’m comfortable with. I am not, however, a software craftsman. This blog post is about why.

I must say that this is a very difficult topic to write about. It’s difficult because I don’t want to insult or belittle other people and their beliefs, and I’m sure that some of you identify as – and even take pride in your identity as – a software craftsman. So to clarify, this is not “death to all craftsmen”, “craftsmen are bad”, “craftsmen are stupid” or anything like that. In fact – and this is super super important – if you think of yourself as a craftsman or a crafter, and this gives you energy and inspires you, if it gives you the drive to improve and do better, that’s great. Never mind me! I’m just one guy! This blog post is just about why I don’t think of myself as a software craftsman. That’s all. Granted I wouldn’t be writing this if I didn’t think that some of the things I’m going to bring up might resonate with some of you, or at least trigger some reflection. But we’ll see.

Why the dramatic title though? The title is a pun on the title of a stage play by Arthur Miller, called Death of a Salesman. The play deals with identity crisis and the relationship between dreams and reality. So it seemed a fitting title for a story about my own identity crisis.

Before we get to that though, I also want to acknowledge all the great work done by many people who do identify as software craftsmen or software crafters. There are lots of conferences and unconferences and meetups and what have you where people learn from and inspire each other. That’s great. I don’t want to diminish that in any way. What’s more, we agree on a lot of things about software development. We follow many of the same practices.

And yet I’m not a software craftsman.

I should explain why I use the label “software craftsman” here, rather than “software crafter”. I’m very aware that the label “craftsman” is problematic and unfortunate in an industry that has real diversity issues. I share that awareness with hopefully a growing part of the software development community. Moreover, in the subset of that community that forms the craft community, there are people who are working hard to replace the notion of craftsman with crafter and similarly craftsmanship with craft. I think that’s great and very commendable.

Those people, if they read this blog post, might feel that my critique is a bit unfair, or that at least some of the issues I’m going to bring up are no longer relevant. But I’m not so sure. Change takes time and craftsmanship is a concept with a lot of mindshare. For a litmus test, consider this: do you think there are more software developers out there today that identify as software crafters or as software craftsmen? There are multiple books – very, very popular books – on software craftsmanship, and none on software craft that I’ve found. That is the status as of this writing.

So I’m going to use the old craftsman term here. It feels better to aim the critique at the death star, as it were, than at the nascent community trying to do better. Also, my identity crisis really was with the original craftsman concept, not with crafting, which came later. Though I will say that I have problems with the metaphor of craft itself that cause me to reject the identity as crafter as well, and I’m going to touch on some of those too.

So. I am not a software craftsman. The obvious question is “why not?”.

After all, I used to be. I used to think of myself as a software craftsman, until maybe, what, five years ago? It’s hard to tell, time goes fast.

In fact, let’s rewind time quite a bit, much more than five years. Back in 2002, I was fresh out of the university with my Master’s degree in Computer Science and got my first job as a programmer. And I was delighted to learn of all kinds of interesting stuff concerning software development that wasn’t taught at the university! Agile obviously was the hot new thing that everyone was talking about. I remember attending a conference around 2003 and Kent Beck was there, Robert C. Martin, Ward Cunningham, Rebecca Wirfs-Brock, even Eric Evans. I worked hard and read a lot of books and blogs as well. It was exciting times. I was looking to prove myself, I guess, trying to find out who I was as a programmer and software developer. And I wanted to be good.

Then I heard about craftsmanship and it just clicked. I had found my identity. I’m guessing I first read about it in the book The Pragmatic Programmer. I remember the title page with a woodworker’s tool on it, and the subtitle “from journeyman to master”. It resonated. That was me. I wanted to master programming. You might say I tried on the craftsman’s cloak, and my god, it fit me! It looked good on me! I was a “journeyman” on my way to mastery. And of course I also liked this idea that craftsmanship seemed to solve the age-old dilemma: is programming art or engineering? It was neither and both: it was a craft! I could be half engineer, half artist, and sort of make up the mix myself as I went along. Of course, what exactly that meant was a bit unclear, but it felt good.

Over time, I gained much experience with software development in practice, and the forces that influence it. I learned a lot about myself and about others. I learned about my capabilities and limitations, my strengths and weaknesses. I learned that my brain and my ability to understand complex code is finite, for instance. Not only that, it’s shrinking! I also learned that my self-discipline and my rationality are pretty fixed entities. I’m disciplined most of the time, rational most of the time, but not always. But that was OK, I could live with being human and a craftsman, even though I started to realize that I might not actually be able to level up indefinitely. But there was a bigger problem. I was starting to feel that there were vitally important things influencing the success and failure of the software I worked on that had nothing to do with craftsmanship. There were both things out of my control, and things that I could potentially influence, but that craftsmanship said nothing about.

Eventually this led to an identity crisis. The cloak of craftsmanship was starting to itch and no longer felt right. It no longer fit me. It felt quaint and awkward. So finally, I had to take it off. But of course, this left me with a dilemma. If I’m not a craftsman, who or what am I then?

I was back where I started. I once again faced the problem that I wanted to be a good developer. But what does that mean? What is a good developer?

And that’s an important question to answer, because it enables you to answer the even more important question: am I a good developer? Because this has to do with sense of worth. Am I valuable? Am I a good egg? We all want to be good eggs.

[Photo: Roberto Benigni in Down by Law.]

And that was the original lure of software craftsmanship for me – it provided simple and clear answers to these questions.

But what exactly were those answers? What is software craftsmanship?

According to Robert C. Martin, an authority on software craftsmanship if ever there was one, software craftsmanship is a meme that contains values, disciplines, techniques, attitudes and answers. And I wholeheartedly agree with that. It’s an idea that has proven very successful in spreading from brain to brain among software developers.

Moreover, software craftsmanship is an identity. It is an identity because it offers a narrative about what matters in software development and hence a way to distinguish yourself.

It is a narrative that talks about pride and professionalism. You’ll find that craftsmanship and professionalism are used more or less as synonyms in the literature. In fact, you can get the impression that craftsmanship is just a slightly kitschy word for professionalism. And this is actually quite sneaky and one of the things I don’t like so much. Because when you start to use those terms interchangeably, then suddenly you can’t be professional unless you’re a craftsman. And that’s problematic.

So let’s dig into the narrative a bit deeper. The way I hear it, the craftsman’s tale is a slightly medievally themed story of heroes and adversaries.

Let’s start with the hero. Who is the hero of software development in the craftsmanship narrative? That’s easy. It’s the programmer of course! The programmer is the main character in this adventure. The picture I have in my head is of Geralt of Rivia – The Witcher himself. The hero of software development is the pragmatic good software craftsman, who relies on his skill and training and his own good judgment to do good in the world. He answers to no-one but himself. He is armed with the twin swords of TDD and SOLID, drilled through countless katas. He’s a trained coding machine!

Being a good developer in the craftsmanship narrative means seeking mastery of the craft, technical excellence, flawless execution. There’s an underlying belief that discipline and virtue can fix software. If I’m just disciplined enough, I will do well. I just need to follow the best practices diligently. There may not be a silver bullet but there is a silver sword, and I can slay the monster of software development!

So it all comes down to me. Me and my skills. If I’m high level enough, the software will be great. And in some ways, that’s a very optimistic tale, because it means it is within my power to determine the outcome. It’s empowering.

Of course if that’s the case, then it also goes the other way around. If the software is broken, I simply wasn’t good enough. My coding skills weren’t up to the task, or I wasn’t disciplined enough in my practices, or I wasn’t professional enough in my communication. Human, all too human!

And this is where the cloak really starts to itch for me. I think this is just wrong. And since it’s wrong, it’s also dangerous. Because now you have a narrative of software development that sells you a false promise and sets you up to feel bad about yourself. Gerry Weinberg warned us about this 50 years ago, in The Psychology of Computer Programming, when he talked about egoless programming.

But why is it wrong? Isn’t it just the harsh, cold truth that if I fail, I’m not good enough? Shouldn’t I just face reality, accept the challenge, buckle down and work harder? Be better? Get good?

Let’s consider another question. Who makes software?

That’s a trick question, right? We all know who makes software, right? This is like asking who the hero of software development is again! It’s us, the programmers, we sit at the keyboard, we type the code, we test it, we ship it to production, it’s us? Surely it’s us! No one else can make software!

At this point, I would like you to form an image of a bubble in your head. A beautiful soap bubble, as multi-chrome as Saruman’s cloak in the Lord of the Rings, a perfect sphere floating in the air.

The bubble you just conjured up in your head is the craftsman’s model of software development. Coding happens inside the bubble. This is where software is created. You have a border of professionalism to protect against the outside, and you have passion on the inside. It’s very inward-facing. The problem is that the border isn’t going to hold. It’s super-weak. The bubble is going to burst and there is nothing you can do, great hero! No amount of skills or professionalism is going to change that. Because the outside forces you’re dealing with are much stronger than you are.

Those forces are organizational forces, and the reason they matter so much is that programmers don’t make software. We may think we do, we might wish we did, but we don’t. Organizations make software. Conway’s Law is one expression of this. Despite your values and disciplines and practices, the health of the organization will be an upper bound on the health of your code. Communication patterns, weak concepts, ambiguity, conflicts, conflict avoidance, all that is going to affect your code. It seeps in and you can’t stop it. You have no practices that help against that influence. None of your weapons work. Moreover, your code or your team’s code doesn’t live in isolation. Most modern software talks to other software, made by other teams and organizations. Therefore, to make good software, you need to direct your attention outward and work on the conditions for making software. You need to address the organizational forces head-on, build culture and understanding, spread your ideas of what healthy software development looks like. You might still fail because there are many things beyond your control, but at least you’ve entered the playing field.

With this realization, a seemingly simple question is suddenly difficult. If organizations rather than programmers make software, what is the craft of making software? Is it coding? Deploying software? That doesn’t sound right anymore. Rather: isn’t it about using your skill, knowledge and influence to shape software that somehow benefits your organization? And you can do that without writing code, can’t you?

Now everything is in flux! It all falls apart! Who performs the craft? What is the craft anyway? Who can be a software crafter? Who is the hero? Isn’t it true that any work that affects the finished product must be considered part of the craft of making software? How is coding special?

There’s an interesting dynamic when we program, when we write code, between our intent and the reality of the code and ultimately the machine. The code offers resistance to being formed, so we try different strategies, making tradeoffs. And that’s part of why the craft metaphor is so compelling to us. But! That’s not the only malleable material involved in making software. Consider user experience work or visual design – there is the same dynamic of forming and resistance. The organization itself can be changed, again with the same dynamic, as well as the business it engages in and the products it makes. What this means is that if there is craft involved in software development, there is a multiplicity of crafts. It’s not just the craft performed by coders. So how come we are the craftsmen?

The craftsman narrative fails to see this because it is caught up in a fetish about code. Like the code is the only thing that matters and everything else is noise and waste and irrationality. Software craftsmen are fond of saying things like “The code is the truth!”. (You have to say that in like a deep masculine voice, regardless of your gender or pitch of voice.) But there is no truth about anything interesting in the code! Apart from tautology, that is. The code can only tell “the truth” about what the code does – or will do, if you compile and run it. For instance, it can’t tell you what it should have done, how to make users happy, how to provide value to someone, how to earn money. It can’t even tell you if it’s in production!

We know that software is organic, it’s alive and changing, but the code at any given time doesn’t point in any particular direction. It has no idea of what’s going to happen next. The code doesn’t want anything. It’s not strategic. It doesn’t have plans. It doesn’t know how to improve itself. There’s no reflection except the bad kind. It doesn’t know how to make priorities. But these are super-important things to talk about.

The idea that “the truth is in the code” is a passive-aggressive power trick as well. Because only we, the coders, can read the code. So if the code is the truth, then we are keepers of the truth, aren’t we? Not those other people. They might hold other kinds of power and prestige but they don’t wield the truth! So the question becomes: do we really want to communicate? Or do we prefer to hold on to this monopoly of truth as leverage to exert our will over others?

Another problem is what I call the RPG model of skill acquisition. An RPG is a role-playing game. In your typical RPG you find yourself in a medieval setting, just like a software craftsman, and your character has a level. You start at level 1 and then you kill some rats to gain experience because you don’t have many hit points yet, and eventually you reach level 2 and maybe you kill some bigger rats and you get to level 3 and you can kill, I don’t know, undead rats or something.

And you have the same thing in the craft narrative. There’s a linear notion of leveling up, step-wise, as you gain experience. It’s a model taken from medieval guilds. There’s a book by Pete McBreen that discusses it in detail, and you’ll also find it referenced lots of other places. Recall that the subtitle of the first edition of The Pragmatic Programmer was “from journeyman to master”. I think they changed it for the new edition; now it says “your journey to mastery”, which sort of tones it down a notch.

It’s very simple. You start out as an apprentice. Then at some ill-defined point you progress to journeyman and I think this is basically where every craftsman I’ve ever met puts themselves, because they’re too modest to claim to be a master and too experienced to be a mere apprentice. But of course the implied goal is to become a master, whatever that entails in software development. In a sense it’s a pretty sentimental and naive model. But it’s not entirely harmless. A side-effect of this model is that it establishes a hierarchy of practitioners. It ranks us.

Now since many craftsmen are helpful and want to share their knowledge, some people form mentorships or apprenticeships. That’s great, it’s great that people want to help others “level up” as it were. But there’s something here that’s strange to me, in that the relationship has such a clear direction. Knowledge flows from the mentor to the mentee. Of course mentors will say that they also learn from that process, that teaching clarifies their understanding or something, but it’s still directional. There is an ordering in the relationship.

Contrast that with a camerata of peers, to use Jessica Kerr’s term. The relationships I’ve learned the most from have always been bi-directional, between people with mutual respect and complementary skills. That enables a group dynamic that lifts everyone up. You can take turns inspiring each other and taking things to the next level.

Finally there’s the monopoly on virtue. It’s this thing where being a craftsman is the same as, or a prerequisite of, being professional. You see, in the craftsman’s tale, there are three kinds of programmers.

There are the unwashed masses, programmers who don’t care, who have no passion for programming – lazy, undisciplined filth! Apparently there are many of these. (I personally haven’t met too many of them, I must have been lucky.) But these are bad, obviously. Luckily if you merely attend a talk on craftsmanship or buy and perhaps even read a book on craftsmanship, you’re not one of them! Because then you’re a software craftsman. These are good, by definition. You could be an apprentice, could be a journeyman, could be a master, doesn’t matter, you’re on the virtuous path!

And all would be pretty much well if that were all there was to it. A simple, unproblematic story. But then there is the third category: the ivory tower zealots! These are terrible! They have passion but they use it all wrong! They have principles but they are wrong! They could be into category theory instead of SOLID! Unrealistic! Unpragmatic! Proofs! Maths! Maybe they write comments or even specifications!

For these terrible sins, they must be banished from the real world! It is inconceivable that the ivory tower zealots inhabit the same world as the software craftsmen. Why? Because they’re a direct challenge to authority. They represent a threat to the monopoly on virtue. Both can’t be right, both can’t wield the truth about what matters in software development, and so the zealots have to go. It turns out that craftsmanship is really surprisingly conservative! Software craftsmen are a rare breed of dogmatic pragmatists.

As an example, let’s take a topic that presumably all programmers care about, which is software correctness.

One practice that is always mentioned by craftsmen is test-driven development, TDD. Of course, it’s hailed as something much more than just a testing technique. It’s also a design technique, and the by-product of that technique is a comprehensive test suite. Since TDD is example-driven, the test suite is an instance of example-based testing. This helps ensure software correctness and offers some protection against regression in the face of changes. So that’s fine. Many people, self-proclaimed craftsmen or not, find TDD to be a valuable practice – both for the design aspect and for the effect on software correctness. What’s weird is that software craftsmen seem generally uninterested in other approaches.

Because there are other approaches to ensure software correctness. One example is property-based testing, where you verify properties that will hold for all inputs to your code. That’s a powerful design technique as well, as it guides you to think of your code in terms of properties and invariants, not just examples. There are advanced static type systems aiming to make software correct by construction, by disallowing illegal states in the software. There are formal methods, mathematically based techniques for the specification and verification of software. But craftsmen seem completely uninterested in all of these, brushing them off as “academic” or “impractical” – the sort of stuff that ivory tower zealots meddle with instead of solving real problems.
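To make the contrast with example-based testing concrete, here is a hand-rolled sketch of what a property-based test does. Real frameworks like FsCheck generate, run and shrink inputs for you; this is just the core idea, with names of my own invention:

```csharp
using System;
using System.Linq;

public static class PropertySketch
{
    // Property: for all strings a and b, concatenation preserves total length.
    // Unlike a single example, this claim is checked over many generated inputs.
    public static bool ConcatLengthProperty(string a, string b)
    {
        return (a + b).Length == a.Length + b.Length;
    }

    // Check the property against a batch of pseudo-random inputs,
    // returning the first counter-example found, or null if none is found.
    public static Tuple<string, string> FindCounterExample(int runs)
    {
        var rng = new Random(42);
        Func<string> randomString = () =>
            new string(Enumerable.Range(0, rng.Next(0, 20))
                .Select(_ => (char)rng.Next('a', 'z' + 1))
                .ToArray());

        for (var i = 0; i < runs; i++)
        {
            var a = randomString();
            var b = randomString();
            if (!ConcatLengthProperty(a, b))
                return Tuple.Create(a, b);
        }
        return null; // no counter-example in this batch
    }
}
```

The point is the shift in mindset: instead of asserting what the code does for one input, you state something that must hold for all inputs, and let generated data hunt for a refutation.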

For instance, I haven’t heard of a single craftsman with any interest in a tool like TLA+. TLA+ is a language for modelling concurrent and distributed systems, developed by Turing Award winner Leslie Lamport. Is it an ivory tower language? Is it for academics? Microsoft has used it to find bugs in the Xbox and Azure. Amazon has used it to verify AWS. That’s pretty real world. Is it overkill for your application? Perhaps, I don’t know, it depends on your context. Wouldn’t you want to find out? Most applications these days are concurrent and distributed in some way. (If you do want to find out, you should check out one of Hillel Wayne’s talks.)

The conservatism of craftsmanship is baked into that three-step ladder of skill. The master already has all the knowledge written on a scroll and just needs to pass it on to eager students of the true way. It’s not terribly progressive. Progressive learning requires you to look for counter-examples to your current beliefs rather than confirmation. Radical improvement to the way we do software must challenge and overturn best practices, and is likely to come from outside our bubble.

So I wonder: Do craftsmen question their beliefs? Do craftsmen welcome differences of opinion? (Do craftsmen like this blog post?) I don’t know, but I think disagreement is good! We need disagreement! Disagreement has energy! We should seek out tension and dissonance, because that’s where we might learn something.

Speaking of dissonance: let’s backtrack a bit. If the programmer is the hero in the craftsman’s tale about software, who are the adversaries? Apart from the ivory tower people, that is.

The business. The suits. The accursed clueless managers. People who have never coded in their life!

People who are also part of your organization. People who have an interest in the software we help make. Sometimes they’re called stakeholders, because they have a stake in the software. Just like us, in a sense.

Software craftsmen speak a lot about respect. But it seems pretty one-sided from what I can tell. It’s mostly about programmers respecting themselves, or demanding respect from others. Which is fine and good and important. But it’s not so great when it’s not balanced with an equal respect for others. Have you seen and heard how craftsmen speak about non-programmers? It’s not pretty. And it’s not just embarrassing and undignified, but actively harmful to software development.

For one, it’s not a very good way of forming productive partnerships. Instead it deepens the divide between the so-called “business” and the craftsman. It’s a bit of medieval kitsch imagery again. The business people, the suits, can visit the craftsman’s shoppe, where everything is neat and tidy, a stronghold of sanity in a crazy, irrational world. There “the business” can order hand-crafted artisan software of the highest quality, and maybe there’s some negotiation and the craftsman can offer alternatives at various costs, according to need. So it’s a client-supplier kind of relationship. Communication is negotiation.

But the problem is that there is no “business”. Or rather we, developers, are the business, just as much as anyone else. There are no separate “business people”! There are other people doing things besides programming in the same business. We’re all in this together. We have the same goals. If not, there are going to be problems! Software development should be a cross-functional team effort to achieve a business goal. Communication should be about collaboration.

So in this environment, how can we be valuable? How can we be good eggs?

I would like to see more programmers looking beyond the craftsmanship narrative, beyond technical skill, beyond code, to be valuable. In my work, I try to bring value by doing domain modelling, because I believe that poor modelling and weak, ambiguous language is the primary cause of accumulating complexity and technical debt. (Indeed, I like to say that technical debt isn’t technical.) I also work on storytelling and culture, by which I mean I try to shape the story of how we do software development and help grow a healthy and safe culture to work in. I believe mob programming, where the mob includes so-called non-technical people, can help us move away from the programmer-as-hero myth. In particular I have high hopes for what I call “Conway’s Mob” as a self-empowering juggernaut to overcome organizational silos and work across existing team and system boundaries. I would like to see us programmers engage in organizational refactoring, trying to influence how we are organized to provide better solutions to the right problems. We should do more strategic architecture work, working iteratively to improve the architecture to support evolving business needs. Of course, that requires that we know what those evolving business needs are, so we need to learn about that. There are so many valuable skills for a software developer besides writing code.

Which brings us back to where we started. What is a good developer? What is my identity? Am I a good egg?

What identities are there for those of us who are not software craftsmen? I for one want to see more heresy in software development, so I would welcome more software infidels! I want us to fear orthodoxy. Is there such a thing as a software jester? Someone who can turn things upside down and make jokes at the king’s expense yet get away with it? I wouldn’t even mind meeting an actual, genuine software engineer! Not the ill-advised 1990s-style kind, perhaps, but someone who actually made meaningful measurements to understand the dynamics of a running distributed software system, say, and ran experiments and made strategic decisions based on those experiments. That sounds valuable to me. Or even a proper software artist, whatever that might be.

I would like to end this long blog post with a book recommendation. There’s a great story by Italo Calvino called “The non-existent knight“. It was written in 1959, but it’s clearly about software development. The main character – the hero, as it were – is Sir Agilulf Emo Bertrandin of the Guildivern. (I’m not making it up, his name really is Agilulf. Of the Guildivern.) He is the perfect knight in white armor. He carries out all the duties and routines of a knight with the utmost discipline and precision and keeps everything shining and clean. He berates his fellow knights for all their sloppiness and shortcomings. The only problem with Sir Agilulf is that he doesn’t exist. There’s no body inside the armor. When he takes off his armor, he simply disappears.

We should not let ourselves disappear.

Function application in la-la land

Here’s a familiar scenario for a programmer: You have some useful function that you would like to apply to some values. It could be concat that concatenates two strings, or add that adds two integers, or cons which prepends an element to a list, or truncate which cuts off a string at the specified length, or indeed any old function f you come up with which takes a bunch of arguments and produces some result.

Simple, right? But there’s a twist! The values you’d like to apply your function to are all trapped in la-la land! And once you have values in la-la land, it’s not obvious how you’d go about getting them out of there. It really depends on the kind of la-la land your values are in. It’s sort of like being trapped in the afterlife. You might be able to return to the land of the living, but it’s not trivial. Certainly not something you’d want your pure, innocent function to have to deal with!

You might wonder how the values ended up in la-la land in the first place. In many cases, they were born there. They are la-la land natives – it’s the only existence they’ve ever known. It sounds weird, but it’s surprisingly common. Indeed, many programs contain not one but several distinct la-la lands, each with their own peculiar laws and customs! Some familiar la-la lands in the .NET world include Task, Nullable and List.

Since la-la lands are so pervasive in our programs, clearly we need to be able to apply functions to the values that dwell there. Previously we’ve seen that if your la-la land is a Functor, there is a Map function that lets us do that. But there is a problem: Map cannot work with any of the functions I mentioned above. The reason is that they all take more than one argument. Map can transform a single value of type T1 inside la-la land to a single value of type T2 inside la-la land. What Map does is teleport the T1 value out of la-la land, apply your function to obtain a T2 value, and teleport that back into la-la land. You can of course map multiple times, but you’ll still be involving just one la-la land value at a time. So that’s not going to work.
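In signature form, using a hypothetical Lala&lt;T&gt; type as a stand-in for any la-la land, Map might look like this:

```csharp
// Map for a hypothetical la-la land type Lala<T>: it transforms the
// single wrapped value, but can never combine two la-la land values.
Lala<T2> Map<T1, T2>(Lala<T1> value, Func<T1, T2> f);
```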

What alternatives do we have? Well, one idea that springs to mind is partial application. If we had a curried function, we could apply it to the la-la land values one by one, producing intermediate functions until we have the final result. For instance, say we have a curried version of add which looks like this:

Func<int, Func<int, int>> add = a => b => a + b;

Now we have a single-argument function that returns a single-argument function that returns a value. So we can use it like this:

Func<int, Func<int, int>> add = a => b => a + b;
Func<int, int> incr = add(1);
int four = incr(3);

Unfortunately, this still won’t work with Map. What would happen if we passed the curried add to Map? We would get an incr function stuck inside of la-la land! And then we’d be stuck too. But what if we replaced Map with something that could work with functions living in la-la land? Something like this:

Lala<TR> Apply<T, TR>(Lala<Func<T, TR>> f, Lala<T> v);

What Apply needs to do is teleport both the function and the value out of la-la land, apply the function to the value, and teleport the result back into la-la land.

How would this work with our curried add function? Well, first we’d need to teleport the function itself into la-la land. For this, we need a function, which we’ll call Pure. It looks like this:

Lala<T> Pure<T>(T val);

In other words, Pure is a one-way portal to la-la land.

Let’s see how this would work for our curried add function:

static Lala<int> AddLalaIntegers(Lala<int> a, Lala<int> b)
{
    Func<int, Func<int, int>> add = x => y => x + y;
    Lala<Func<int, Func<int, int>>> lalaAdd = Lala.Pure(add);
    Lala<Func<int, int>> lalaPartial = Lala.Apply(lalaAdd, a);
    Lala<int> lalaResult = Lala.Apply(lalaPartial, b);
    return lalaResult;
}

Success! Who would have thought?

Well, someone, obviously. It turns out that la-la lands that support Pure and Apply are known as Applicative.

But there are still questions worth asking, such as: How do we implement these functions? Like Map, Pure and Apply must obey the laws of the particular la-la land they work with. We’re going to look at two examples in C#.

First, consider the la-la land known as Task<T>.

public static class TaskApplicative
{
    public static Task<T> Pure<T>(T val)
    {
        return Task.FromResult(val);
    }

    public static async Task<TR> Apply<T, TR>(
        this Task<Func<T, TR>> funTask,
        Task<T> valTask)
    {
        var fun = await funTask;
        var val = await valTask;
        return fun(val);
    }
}
Awaiting the tasks brings them out of Task-land, and the return value is automatically transported back by the async machinery.

Second, imagine a type called Mayhaps<T>. Mayhaps is like Nullable, but it works on any type T. Why is this important? Because delegates are reference types, which means they can’t be put inside a Nullable. In other words, functions are not allowed into the la-la land that is Nullable. So Mayhaps it is.

Mayhaps has two possible values, Indeed and Sorry. Indeed holds a value, Sorry does not. That’s really all you need to know about Mayhaps. (For implementation details, look here.)

Here are Pure and Apply for Mayhaps:

public static class MayhapsApplicative
{
    public static Mayhaps<TR> Pure<TR>(TR v)
    {
        return Mayhaps<TR>.Indeed(v);
    }

    public static Mayhaps<TR> Apply<T, TR>(
        this Mayhaps<Func<T, TR>> mayhapsFunction,
        Mayhaps<T> mayhapsValue)
    {
        if (mayhapsFunction.HasValue && mayhapsValue.HasValue)
        {
            var fun = mayhapsFunction.Value;
            var val = mayhapsValue.Value;
            return Mayhaps<TR>.Indeed(fun(val));
        }
        return Mayhaps<TR>.Sorry;
    }
}
The semantics of Mayhaps is to propagate Sorry – you can only get a new Indeed if you have both a function and a value.
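To see the Sorry-propagation in action, here is a small self-contained sketch. The Mayhaps stand-in below is a minimal version of my own invention, just enough to make the example run (the real implementation is linked above):

```csharp
using System;

// A minimal stand-in for the Mayhaps<T> type described above,
// only here so the Apply example below is runnable.
public sealed class Mayhaps<T>
{
    public bool HasValue { get; private set; }
    public T Value { get; private set; }
    private Mayhaps(bool hasValue, T value) { HasValue = hasValue; Value = value; }
    public static Mayhaps<T> Indeed(T value) { return new Mayhaps<T>(true, value); }
    public static readonly Mayhaps<T> Sorry = new Mayhaps<T>(false, default(T));
}

public static class MayhapsDemo
{
    public static Mayhaps<TR> Pure<TR>(TR v) { return Mayhaps<TR>.Indeed(v); }

    public static Mayhaps<TR> Apply<T, TR>(
        this Mayhaps<Func<T, TR>> mayhapsFunction,
        Mayhaps<T> mayhapsValue)
    {
        return mayhapsFunction.HasValue && mayhapsValue.HasValue
            ? Mayhaps<TR>.Indeed(mayhapsFunction.Value(mayhapsValue.Value))
            : Mayhaps<TR>.Sorry;
    }

    public static void Main()
    {
        Func<int, Func<int, int>> add = x => y => x + y;

        // Both arguments are Indeed: the result is Indeed(3).
        var sum = Pure(add).Apply(Mayhaps<int>.Indeed(1)).Apply(Mayhaps<int>.Indeed(2));
        Console.WriteLine(sum.HasValue ? sum.Value.ToString() : "Sorry"); // prints "3"

        // A single Sorry anywhere poisons the whole computation.
        var fail = Pure(add).Apply(Mayhaps<int>.Indeed(1)).Apply(Mayhaps<int>.Sorry);
        Console.WriteLine(fail.HasValue ? fail.Value.ToString() : "Sorry"); // prints "Sorry"
    }
}
```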

And of course the nice thing now is that we can separate our logic from the idiosyncrasies of each la-la land! Which is pretty great.

But I’ll admit that we’re currently in a situation where calling a function is a little bit involved and awkward. It’s involved because there’s quite a bit of boilerplate, and it’s awkward because working with curried functions and partial application isn’t necessarily the bread and butter of C# programming. So let’s write some helper functions to alleviate some of that pain.

We can start by writing functions to curry Funcs, which should reduce the awkward. There are quite a few of them; here’s an example that curries a Func with four input parameters:

public static Func<T1, Func<T2, Func<T3, Func<T4, TR>>>> Curry<T1, T2, T3, T4, TR>(
    this Func<T1, T2, T3, T4, TR> f)
{
    return a => b => c => d => f(a, b, c, d);
}

We can use it like this:

Func<int, int, int, int, int> sirplusalot =
    (a, b, c, d) => a + b + c + d;
Func<int, Func<int, Func<int, Func<int, int>>>> curried =
    sirplusalot.Curry();

A little less awkward. What about involved? We’ll define some helper functions to reduce the boilerplate. The idea is to use a function Lift to handle pretty much everything for us. Here is one that can be used with sirplusalot:

public static Lala<TR> Lift<T1, T2, T3, T4, TR>(
    this Func<T1, T2, T3, T4, TR> f,
    Lala<T1> v1,
    Lala<T2> v2,
    Lala<T3> v3,
    Lala<T4> v4)
{
    return Pure(f.Curry()).Apply(v1).Apply(v2).Apply(v3).Apply(v4);
}

Note that all Lift functions will have the same structure, regardless of which la-la land they operate in. Only the implementations of Pure and Apply will vary.
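For instance, a two-argument Lift would have the same shape with fewer Apply steps. This is a sketch in the same placeholder-Lala style as above, assuming a matching two-argument Curry overload:

```csharp
// Same structure as the four-argument Lift: curry, lift into
// la-la land with Pure, then Apply once per argument.
public static Lala<TR> Lift<T1, T2, TR>(
    this Func<T1, T2, TR> f,
    Lala<T1> v1,
    Lala<T2> v2)
{
    return Pure(f.Curry()).Apply(v1).Apply(v2);
}
```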

And now we can implement functions that look like this:

private async static Task<int> Plus(
    Task<int> ta,
    Task<int> tb,
    Task<int> tc,
    Task<int> td)
{
    Func<int, int, int, int, int> sirplusalot =
        (a, b, c, d) => a + b + c + d;
    return await sirplusalot.Lift(ta, tb, tc, td);
}

private static Mayhaps<int> Plus(
    Mayhaps<int> ma,
    Mayhaps<int> mb,
    Mayhaps<int> mc,
    Mayhaps<int> md)
{
    Func<int, int, int, int, int> sirplusalot =
        (a, b, c, d) => a + b + c + d;
    return sirplusalot.Lift(ma, mb, mc, md);
}

Which is quite nice? Yes?

How to reduce bunches of things

So there you are, a pragmatic C# programmer out to provide business value for your end users and all that stuff. That’s great.

One of the (admittedly many) things you might want to do is reduce a bunch of things of some type into a single thing of that type. For instance, you might want to add a bunch of numbers together, or concatenate a bunch of strings and so on. How would you do that? (Assuming there’s no built-in Aggregate method available, that is.) Well, you’d write a Reduce function, right? And since we haven’t specified in advance what kinds of things we should reduce, we better make it generic. So it could work on an IEnumerable<T> of things.

Now how should the actual reduction take place? An obvious idea is to do it stepwise. It’s both a good problem solving strategy in general, and kind of necessary when dealing with an IEnumerable. For that to work, though, you need some way of taking two values and combining them to produce a single value. So Reduce needs to be a higher-order function. The caller should pass in a combine function, as well as some initial value to combine with the first element. And then the completed function might look something like this:

public static T Reduce<T>(this IEnumerable<T> things,
  Func<T, T, T> combine,
  T initialValue)
{
  T result = initialValue;
  foreach (var t in things)
  {
    result = combine(result, t);
  }
  return result;
}
And now if you have a bunch of integers, say, you can add them all up like this:

var integers = new [] { 1, 2, 3, 4 };
var sum = integers.Reduce((a, b) => a + b, 0);

If, on the other hand, you have a bunch of lists, you’d do something like this instead:

var lists = new [] {
  new List<int> { 1 },
  new List<int> { 2, 3 }
};
var sum = lists.Reduce((a, b) => {
    var list = new List<int>();
    list.AddRange(a);
    list.AddRange(b);
    return list;
  },
  new List<int>());

And this would give you the list of elements 1, 2, 3. Great.

Now there are other things you might wonder about with respect to the combine function. For whatever reason, you might want to consider alternative implementations of Reduce. For instance, you might like to create batches of n things, reduce each batch, and then reduce those results for the final result. It would be nice to have that freedom of implementation. For that to be an option, though, you need your combine function to be associative.

Assume you have three values t1, t2, t3. Your combine function is associative if the following holds:

combine(t1, combine(t2, t3)) == combine(combine(t1, t2), t3)

Unfortunately there is nothing in the C# type system that lets us specify and verify that a function is associative, so we need to rely on documentation and discipline for that.
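To see why associativity buys you that freedom of implementation, here is a small self-contained sketch (using the built-in Aggregate for brevity) showing that reducing in batches agrees with reducing sequentially when the combine function is associative:

```csharp
using System;
using System.Linq;

public static class BatchingDemo
{
    // Reduce sequentially, and also in batches of three, then compare.
    // Because addition is associative, both strategies must agree.
    public static Tuple<int, int> SumBothWays(int[] numbers)
    {
        Func<int, int, int> combine = (a, b) => a + b; // associative

        var sequential = numbers.Aggregate(0, combine);

        var batched = numbers
            .Select((n, i) => new { n, batch = i / 3 })
            .GroupBy(x => x.batch, x => x.n)
            .Select(g => g.Aggregate(0, combine))
            .Aggregate(0, combine);

        return Tuple.Create(sequential, batched);
    }

    public static void Main()
    {
        var result = SumBothWays(Enumerable.Range(1, 10).ToArray());
        Console.WriteLine(result.Item1 == result.Item2); // True
    }
}
```

With a non-associative combine function (subtraction, say), the two strategies would generally disagree, which is why associativity is the price of admission for batched or parallel reduction.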

Alternatively, we can turn to mathematics. It turns out that mathematicians have a name for precisely the kind of thing we’re talking about. A semigroup is a structure that consists of a set of values and an associative binary operation for combining such values. Granted, it’s a strange-sounding name, but it identifies a very precise concept that gives us something to reason about. So it’s a useful abstraction that actually gives us some guarantees that we can rely on when programming.

To represent a semigroup in our program, we can introduce an interface:

public interface ISemigroup<T>
{
  T Combine(T a, T b);
}

And we can modify our Reduce function to work with semigroups, which by definition guarantees that the Combine function is associative.

public static T Reduce<T>(this IEnumerable<T> things,
  ISemigroup<T> semigroup,
  T initialValue)
{
  T result = initialValue;
  foreach (var thing in things)
  {
    result = semigroup.Combine(result, thing);
  }
  return result;
}

And we can introduce a bunch of concrete implementations of this interface, like:

class IntegerUnderAdditionSemigroup : ISemigroup<int>
{
  public int Combine(int a, int b)
  {
    return a + b;
  }
}

class IntegerUnderMultiplicationSemigroup : ISemigroup<int>
{
  public int Combine(int a, int b)
  {
    return a * b;
  }
}

class StringSemigroup : ISemigroup<string>
{
  public string Combine(string a, string b)
  {
    return a + b;
  }
}

class ListSemigroup<T> : ISemigroup<List<T>>
{
  public List<T> Combine(List<T> a, List<T> b)
  {
    var result = new List<T>();
    result.AddRange(a);
    result.AddRange(b);
    return result;
  }
}

class FuncSemigroup<T> : ISemigroup<Func<T, T>>
{
  public Func<T, T> Combine(Func<T, T> f, Func<T, T> g)
  {
    return it => g(f(it));
  }
}

So that’s quite nice. We can rely on meaningful and precise abstractions to give us some guarantees in our programs.

There is still a small problem when working with semigroups for reduction though. What should the initial value be? We really just want to reduce a bunch of values of some type, we don’t want to be bothered with some additional value.

One approach, I guess, would be to just pick the first value and then perform reduce on the rest.

public static T Reduce<T>(this IEnumerable<T> things,
  ISemigroup<T> semigroup)
{
  return things.Skip(1).Reduce(semigroup, things.First());
}

This would work for non-empty bunches of things. But that means we’d have to check for that in some way before calling Reduce. That’s quite annoying.

What would be useful is some sort of harmless value that we could combine with any other value and just end up with the other value. So we could just use that magical value as the initial value for our Reduce.

Luckily, it turns out that there are such magical values for all the semigroups we’ve looked at. In fact, we’ve seen two such values already. For integers under addition, it’s zero. For lists, it’s the empty list. But there are others. For integers under multiplication, it’s one. For strings, it’s the empty string. And for functions it’s the identity function, which just returns whatever value you hand it. Now if you can provide such a value, which is called the unit value, for your semigroup, you get what the mathematicians call a monoid. It’s another intensely unfamiliar-sounding name, but again the meaning is very precise.

We can represent monoids in our programs by introducing another interface:

public interface IMonoid<T> : ISemigroup<T>
{
  T Unit { get; }
}

So there is nothing more to a monoid than exactly this: it’s a semigroup with a unit value. And the contract that the unit value operates under is this:

Combine(Unit, t) == Combine(t, Unit) == t

This just says that the unit value is magical in the sense we outlined. We can combine it with any value t any way we want, and we end up with t.

Now we can write a new Reduce function that works on monoids:

public static T Reduce<T>(this IEnumerable<T> things,
  IMonoid<T> monoid)
{
  return things.Reduce(monoid, monoid.Unit);
}

This is quite nice, because we don’t have to worry any more about whether or not the bunch of things is empty. We can proceed to implement concrete monoids that we might want to use.

class IntegerUnderAdditionMonoid
  : IntegerUnderAdditionSemigroup, IMonoid<int>
{
  public int Unit
  {
    get { return 0; }
  }
}

class IntegerUnderMultiplicationMonoid
  : IntegerUnderMultiplicationSemigroup, IMonoid<int>
{
  public int Unit
  {
    get { return 1; }
  }
}

class StringMonoid : StringSemigroup, IMonoid<string>
{
  public string Unit
  {
    get { return ""; }
  }
}

class ListMonoid<T>
  : ListSemigroup<T>, IMonoid<List<T>>
{
  public List<T> Unit
  {
    get { return new List<T>(); }
  }
}

class FuncMonoid<T> : FuncSemigroup<T>, IMonoid<Func<T, T>>
{
  public Func<T, T> Unit
  {
    get { return it => it; }
  }
}

And we might write a small test program to see if they work as intended.

public static void Main(string[] args)
{
  var integers = new[] { 1, 2, 4, 8 };
  var sum = integers.Reduce(new IntegerUnderAdditionMonoid());
  var product = integers.Reduce(new IntegerUnderMultiplicationMonoid());
  var strings = new[] { "monoids", " ", "are", " ", "nifty" };
  var str = strings.Reduce(new StringMonoid());
  var lists = new[] {
    new List<string> { "monoids", " " },
    new List<string> { "are" },
    new List<string> { " ", "nice" }
  };
  var list = lists.Reduce(new ListMonoid<string>());
  var str2 = list.Reduce(new StringMonoid());
  var integerFunctions = new Func<int, int>[] { it => it + 1, it => it % 3 };
  var intFun = integerFunctions.Reduce(new FuncMonoid<int>());
  var stringFunctions = new Func<string, string>[] { s => s.ToUpper(), s => s.Substring(0, 5) };
  var strFun = stringFunctions.Reduce(new FuncMonoid<string>());

  Console.WriteLine(strFun("hello world"));
}

Can you work out what the program will print? If not, you might want to try to run it.

Hopefully this post gives some indication of the flexibility and power that can come with very simple abstractions. It might even give you a creeping sensation that these Haskell heads are onto something when they claim that mathematics that studies structure and composition can be useful for programmers. On the face of it, the processes of adding up integers, concatenating strings, appending lists and composing functions seem quite different, but structurally they nevertheless share some fundamental traits that can be leveraged to great effect.

LINQ to Nullable

Things got a bit out of hand today.

It all started when I added a point to the agenda for our backend team meeting saying I’d explain real quick what a functor is – or at least what my understanding of a functor is. And so I did.

Now the explanation itself didn’t go half bad, I don’t think. While I’m sure I would have offended mathematicians and possibly some haskellites, they weren’t there. Instead, the room was filled with C# programmers.

I think I said something like the following. Assume you have a parameterized type S<T>, where S defines some structure on top of type T. The obvious example for a C# programmer would be an IEnumerable<T>, but of course there are others, including Task<T> and Nullable<T> and indeed Whatever<T>. Now if you have such an S and a mapping function that given some S<T> and a function from T to U produces an S<U> then you almost have a functor already! In addition to that, you just need to make sure that your mapping is well-behaved in a sense. First, mapping the identity function over a structure shouldn’t change it. So if you map it => it over some structure S, that should just give you the same structure you started with. And second, assume you have a function f from T to U and a function g from U to V. If you map f over S to yield S<U> and then map g over that to yield S<V>, that should give you the same result as mapping the composed function it => g(f(it)) over S<T>.

To illustrate, I explained that Nullable<T> is a functor – or at least it should be. And it would be, if we defined the appropriate mapping function for Nullable<T>. So I wrote the following on the whiteboard:

public static class NullableExtensions {
  public static TTarget? Select<TSource, TTarget>(
      this TSource? t,
      Func<TSource, TTarget> selector)
    where TSource : struct
    where TTarget : struct
  {
    return t.HasValue ? (TTarget?) selector(t.Value) : null;
  }
}

So this is our mapping function, even though I named it Select, which is the name used in the C# and LINQ world. A benefit of this function is that you no longer have to manually handle the mundane issues of worrying about whether or not some Nullable<T> is null. So instead of writing code like this, which resembles something from our code base:

Duration? duration = null;
if (thing.Frames.HasValue)
{
  var ms = thing.Frames.Value * 40;
  duration = Duration.FromMilliseconds(ms);
}

You can write this instead:

Duration? duration = thing.Frames.Select(fs => Duration.FromMilliseconds(fs * 40));

I think it is quite nice – at least if you can get comfortable calling an extension method on something that might be null. But from this point on, things started to go awry. But it wasn’t my fault! They started it!

See, some of the people in the meeting said they kind of liked the approach, but argued that Map would be a better name because it would avoid confusion with Select, which is associated with LINQ and IEnumerable<T>. In some sense, this was the opposite of the argument I used for choosing Select over Map in the first place! I thought it would make sense to call it Select precisely because that’s the name for the exact same thing for another kind of structure.

So as I left the meeting, I started wondering. I suspected that there really was nothing particular that tied LINQ and the query syntax to IEnumerable<T>, which would mean you could use it for other things. Other functors. And so I typed the following into LinqPad:

DateTime? maybeToday = DateTime.Today;
var maybeTomorrow = from dt in maybeToday select dt.AddDays(1);

And it worked, which I thought was pretty cool. I consulted the C# specification and found that as long as you implement methods of appropriate names and signatures, you can use the LINQ query syntax. And so I decided to let functors be functors and just see what I could do with Nullables using LINQ. So I wrote this:

public static TSource? Where<TSource>(
    this TSource? t, 
    Func<TSource, bool> predicate)
  where TSource : struct
{
  return t.HasValue && predicate(t.Value) ? t : null;
}

Which allowed me to write

DateTime? MaybeSaturday(DateTime? maybeDateTime)
  => from dt in maybeDateTime
     where dt.DayOfWeek == DayOfWeek.Friday
     select dt.AddDays(1);

Which will return null unless it’s passed a Nullable that wraps a DateTime representing a Friday. Useful.

It should have stopped there, but the C# specification is full of examples of expressions written in query syntax and what they’re translated into. For instance, I found that implementing this:

public static TTarget? SelectMany<TSource, TTarget>(
    this TSource? t, 
    Func<TSource, TTarget?> selector)
  where TSource : struct
  where TTarget : struct
{
  return t.HasValue ? selector(t.Value) : null;
}

public static TResult? SelectMany<TSource, TIntermediate, TResult>(
    this TSource? t, 
    Func<TSource, TIntermediate?> selector, 
    Func<TSource, TIntermediate, TResult> resultSelector)
  where TSource : struct
  where TIntermediate : struct
  where TResult : struct
{
  return t.SelectMany(selector)
          .Select(it => resultSelector(t.Value, it));
}

I could suddenly write this, which is actually quite nice:

TimeSpan? Diff(DateTime? maybeThis, DateTime? maybeThat)
  => from dt1 in maybeThis
     from dt2 in maybeThat
     select (dt2 - dt1);

It will give you a wrapped TimeSpan if you pass it two wrapped DateTimes, null otherwise. How many checks did you write? None.

And as I said, it sort of got a bit out of hand. Which is why I now have implementations of Contains, Count, Any, First, FirstOrDefault, even Aggregate, and I don’t seem to be stopping. You can see the current state of affairs here.

What I find amusing is that you can usually find a reasonable interpretation and implementation for each of these functions. Count, for instance, will only ever return 0 or 1, but that sort of makes sense. First means unwrapping the value inside the Nullable<T> without checking that there is an actual value there. Any answers true if the Nullable<T> holds a value. And so on and so forth.
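As a sketch of this "zero-or-one element collection" reading, here is a hedged Python rendition; `count`, `any_` and `first` are illustrative stand-ins for the C# extension methods, not any real API.

```python
# Treat an optional value as a collection holding zero or one elements.

def count(option):
    # Count can only ever be 0 or 1.
    return 0 if option is None else 1

def any_(option):
    # Any answers whether there is a value at all.
    return option is not None

def first(option):
    # Like First on an empty sequence, this fails when there is no value.
    if option is None:
        raise ValueError("no value")
    return option

assert count(None) == 0
assert count("x") == 1
assert any_(3) and not any_(None)
assert first("x") == "x"
```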

Finally, as an exercise for the reader: what extension methods would you write to enable this?

static async Task Greet()
  var greeting =
    from v1 in Task.FromResult("hello")
    from v2 in Task.FromResult("world")
    select (v1 + " " + v2);

  Console.WriteLine(await greeting);

Picture combinators and recursive fish

On February 9th 2017, I was sitting in an auditorium in Krakow, listening to Mary Sheeran and John Hughes give the opening keynote at the Lambda Days conference. It was an inspired and inspiring keynote that discussed some of the most influential ideas in some of the most interesting papers written on functional programming. You should absolutely check it out.

One of the papers that was mentioned was Functional Geometry by Peter Henderson, written in 1982. In it, Henderson shows a deconstruction of an Escher woodcut called Square Limit, and how he can elegantly reconstruct a replica of the woodcut by using functions as data. He defines a small set of picture combinators – simple functions that operate on picture functions – to form complex pictures out of simple ones.

Escher’s original woodcut looks like this:


Which is pretty much a recursive dream. No wonder Henderson found it so fascinating – any functional programmer would.

As I was listening to the keynote, I recalled that I had heard about the paper before, in the legendary SICP lectures by Abelson and Sussman (in lecture 3A, in case you’re interested). I figured it was about time I read the paper first-hand. And so I did. Or rather, I read the revised version from 2002, because that’s what I found online.

And of course one thing led to another, and pretty soon I had implemented my own version in F#. Which is sort of why we’re here. Feel free to tag along as I walk through how I implemented it.

A key point in the paper is to distinguish between the capability of rendering some shape within a bounding box onto a screen on the one hand, and the transformation and composition of pictures into more complex pictures on the other. This is, as it were, the essence of decoupling through abstraction.

Our basic building block will be a picture. We will not think of a picture as a collection of colored pixels, but rather as something that is capable of scaling and fitting itself with respect to a bounding box. In other words, we have this:

type Picture = Box -> Shape list

A picture is a function that takes a box and creates a list of shapes for rendering.

What about the box itself? We define it using three vectors a, b and c.

type Vector = { x : float; y : float }
type Box = { a : Vector; b : Vector; c : Vector}

The vector a denotes the offset from the origin to the bottom left corner of the box. The vectors b and c span out the bounding box itself. Each vector is defined by its magnitude in the x and y dimensions.
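To see the arithmetic in action, here is a hedged Python sketch of the mapping a + x·b + y·c, using plain tuples for vectors; the helper names are illustrative, not part of the F# code.

```python
# A point (x, y) in the unit square maps to the vector a + x*b + y*c.

def add(v, w): return (v[0] + w[0], v[1] + w[1])
def scale(s, v): return (s * v[0], s * v[1])

def mapper(box, point):
    a, b, c = box
    x, y = point
    return add(a, add(scale(x, b), scale(y, c)))

# The unit box: origin at (0,0), b along x, c along y.
unit = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
assert mapper(unit, (0.3, 0.2)) == (0.3, 0.2)

# A box offset by (2,1), twice as wide: x stretches, y is unchanged.
box = ((2.0, 1.0), (2.0, 0.0), (0.0, 1.0))
assert mapper(box, (0.5, 0.5)) == (3.0, 1.5)
```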

For example, assume we have a picture F that will produce the letter F when given a bounding box. A rendering might look like this:


But if we give F a different box, the rendering will look different too:


So, how do we create and render such a magical, self-fitting picture?

We can decompose the problem into three parts: defining the basic shape, transforming the shape with respect to the bounding box, and rendering the final shape.

We start by defining a basic shape relative to the unit square. The unit square has sides of length 1, and we position it such that the bottom left corner is at (0, 0) and top right corner is at (1, 1). Here’s a definition that puts a polygon outlining the F picture inside the unit square:

let fShape = 
  let pts = [ 
    { x = 0.30; y = 0.20 } 
    { x = 0.40; y = 0.20 }
    { x = 0.40; y = 0.45 }
    { x = 0.60; y = 0.45 }
    { x = 0.60; y = 0.55 }
    { x = 0.40; y = 0.55 }
    { x = 0.40; y = 0.70 }
    { x = 0.70; y = 0.70 }
    { x = 0.70; y = 0.80 }
    { x = 0.30; y = 0.80 } ]
  Polygon { points = pts }

To make this basic shape fit the bounding box, we need a mapping function. That’s easy enough to obtain:

let mapper { a = a; b = b; c = c } { x = x; y = y } =
   a + b * x + c * y

The mapper function takes a bounding box and a vector, and produces a new vector adjusted to fit the box. We’ll use partial application to create a suitable map function for a particular box.

As you can see, we’re doing a little bit of vector arithmetic to produce the new vector. We’re adding three vectors: a, the vector obtained by scaling b by x, and the vector obtained by scaling c by y. As we proceed, we’ll need some additional operations as well. We implement them by overloading some operators for the Vector type:

static member (+) ({ x = x1; y = y1 }, { x = x2; y = y2 }) =
    { x = x1 + x2; y = y1 + y2 }

static member (~-) ({ x = x; y = y }) =
    { x = -x; y = -y }

static member (-) (v1, v2) = v1 + (-v2)

static member (*) (f, { x = x; y = y }) =
    { x = f * x; y = f * y }

static member (*) (v, f) = f * v

static member (/) (v, f) = v * (1.0 / f)

This gives us addition, negation, subtraction, scalar multiplication and scalar division for vectors.

Finally we need to render the shape in some way. It is largely an implementation detail, but we’ll take a look at one possible simplified rendering. The code below can be used to produce an SVG image of polygon shapes using the NGraphics library.

type PolygonShape = { points : Vector list }

type Shape = Polygon of PolygonShape

let mapShape m = function 
  | Polygon { points = pts } ->
    Polygon { points = pts |> List.map m }

let createPicture shapes = 
   fun box ->
     shapes |> List.map (mapShape (mapper box))

let renderSvg width height filename shapes = 
  let size = Size(width, height)
  let canvas = GraphicCanvas(size)
  let p x y = Point(x, height - y) 
  let drawShape = function 
  | Polygon { points = pts } ->
    match pts |> List.map (fun { x = x; y = y } -> p x y) with 
    | startPoint :: t ->
      let move = MoveTo(startPoint) :> PathOp
      let lines = t |> List.map (fun pt -> LineTo(pt) :> PathOp) 
      let close = ClosePath() :> PathOp
      let ops = (move :: lines) @ [ close ] 
      canvas.DrawPath(ops, Pens.Black)
    | _ -> ()
  shapes |> List.iter drawShape
  use writer = new StreamWriter(filename)
  canvas.Graphic.WriteSvg(writer) // write the SVG using NGraphics

When we create the picture, we use the mapShape function to apply our mapping function to all the points in the polygon that makes up the F. The renderSvg is used to do the actual rendering of the shapes produced by the picture function.

Once we have the picture abstraction in place, we can proceed to define combinators that transform or compose pictures. The neat thing is that we can define these combinators without having to worry about the rendering of shapes. In other words, we never have to pry open our abstraction, we will trust it to do the right thing. All our work will be relative, with respect to the bounding boxes.

We start with some basic one-to-one transformations, that is, functions with this type:

type Transformation = Picture -> Picture

The first transformation is turn, which rotates a picture 90 degrees counter-clockwise around its center (that is, around the center of its bounding box).

The effect of turn looks like this:


Note that turning four times produces the original picture. We can formulate this as a property:

(turn >> turn >> turn >> turn) p = p

(Of course, for pictures with symmetries, turning twice or even once might be enough to yield a picture equal to the original. But the property above should hold for all pictures.)

The vector arithmetic to turn the bounding box 90 degrees counter-clockwise is as follows:

(a’, b’, c’) = (a + b, c, -b)

And to reiterate: the neat thing is that this is all we need to consider. We define the transformation using nothing but this simple arithmetic. We trust the picture itself to cope with everything else.

In code, we write this:

let turnBox { a = a; b = b; c = c } =
    { a = a + b; b = c; c = -b }

let turn p = turnBox >> p

The overloaded operators we defined above make it very easy to translate the vector arithmetic into code. They also make the code very easy to read, so you can hopefully convince yourself that it does the right thing.
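As a quick sanity check of the turn property, here is a hedged Python sketch of the box arithmetic, with tuples standing in for the Vector and Box types:

```python
# Turning a box four times should give back the original box,
# per the arithmetic (a', b', c') = (a + b, c, -b).

def neg(v): return (-v[0], -v[1])
def add(v, w): return (v[0] + w[0], v[1] + w[1])

def turn_box(box):
    a, b, c = box
    return (add(a, b), c, neg(b))

box = ((1.0, 2.0), (3.0, 0.0), (0.0, 4.0))
turned = box
for _ in range(4):
    turned = turn_box(turned)
assert turned == box
```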

The next transformation is flip, which flips a picture about the center vertical axis of the bounding box.

Which might sound a bit involved, but it’s just this:


Flipping twice always produces the same picture, so the following property should hold:

(flip >> flip) p = p

The vector arithmetic is as follows:

(a’, b’, c’) = (a + b, -b, c)

Which translates neatly to:

let flipBox { a = a; b = b; c = c } =
   { a = a + b; b = -b; c = c }

let flip p = flipBox >> p
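The flip property can be checked the same way; again a hedged Python sketch of the box arithmetic, with tuples for vectors:

```python
# Flipping a box twice should be the identity,
# per the arithmetic (a', b', c') = (a + b, -b, c).

def neg(v): return (-v[0], -v[1])
def add(v, w): return (v[0] + w[0], v[1] + w[1])

def flip_box(box):
    a, b, c = box
    return (add(a, b), neg(b), c)

box = ((1.0, 2.0), (3.0, 0.0), (0.0, 4.0))
assert flip_box(flip_box(box)) == box
```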

The third transformation is a bit peculiar, and quite particular to the task of mimicking Escher’s Square Limit, which is what we’re building up to. Henderson called the transformation rot45, but I’ll refer to it as toss, since I think it resembles light-heartedly tossing the picture up in the air:


What’s going on here? It’s a 45-degree counter-clockwise rotation around the top left corner, which also shrinks the bounding box by a factor of √2.

It’s not so easy to define simple properties that should hold for toss. For instance, tossing twice is not the same as turning once. So we won’t even try.

The vector arithmetic is still pretty simple:

(a’, b’, c’) = (a + (b + c) / 2, (b + c) / 2, (c − b) / 2)

And it still translates very directly into code:

let tossBox { a = a; b = b; c = c } =
  let a' = a + (b + c) / 2
  let b' = (b + c) / 2
  let c' = (c - b) / 2
  { a = a'; b = b'; c = c' }

let toss p = tossBox >> p
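We can still check the rotate-and-shrink claim numerically; here is a hedged Python sketch of the toss arithmetic on the unit box, with tuples for vectors:

```python
import math

# For the unit box, toss should rotate b and c by 45 degrees and
# shrink their lengths by a factor of sqrt(2), per the arithmetic
# (a', b', c') = (a + (b + c)/2, (b + c)/2, (c - b)/2).

def add(v, w): return (v[0] + w[0], v[1] + w[1])
def sub(v, w): return (v[0] - w[0], v[1] - w[1])
def half(v): return (v[0] / 2, v[1] / 2)
def norm(v): return math.hypot(v[0], v[1])

def toss_box(box):
    a, b, c = box
    return (add(a, half(add(b, c))), half(add(b, c)), half(sub(c, b)))

unit = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
a2, b2, c2 = toss_box(unit)
assert b2 == (0.5, 0.5) and c2 == (-0.5, 0.5)   # rotated 45 degrees
assert abs(norm(b2) - 1 / math.sqrt(2)) < 1e-12  # shrunk by sqrt(2)
```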

That’s all the transformations we’ll use. We can of course combine transformations, e.g:

(turn >> turn >> flip >> toss)

Which produces this:


We proceed to compose simple pictures into more complex ones. We define two basic functions for composing pictures, above and beside. The two are quite similar. Both functions take two pictures as arguments; above places the first picture above the second, whereas beside places the first picture to the left of the second.


Here we see the F placed above the turned F, and the F placed beside the turned F. Notice that each composed picture forms a square, whereas each original picture is placed within a half of that square. What happens is that the bounding box given to the composite picture is split in two, with each original picture receiving one of the split boxes as its bounding box. The example shows an even split, but in general we can assign a fraction of the bounding box to the first argument picture, and the remainder to the second.

For implementation details, we’ll just look at above:

let splitHorizontally f box =
  let top = box |> moveVertically (1. - f) |> scaleVertically f  
  let bottom = box |> scaleVertically (1. - f)
  (top, bottom)

let aboveRatio m n p1 p2 =
  fun box ->
    let f = float m / float (m + n)
    let b1, b2 = splitHorizontally f box
    p1 b1 @ p2 b2

let above = aboveRatio 1 1

There are three things we need to do: work out the fraction of the bounding box assigned to the first picture, split the bounding box in two according to that fraction, and pass the appropriate bounding box to each picture. We “split” the bounding box by creating two new bounding boxes, scaled and moved as appropriate. The mechanics of scaling and moving is implemented as follows:

let scaleVertically s { a = a; b = b; c = c } = 
  { a = a
    b = b 
    c = c * s }

let moveVertically offset { a = a; b = b; c = c } = 
  { a = a + c * offset
    b = b
    c = c }
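To convince ourselves that the two new boxes really tile the original one, here is a hedged Python sketch of the split, again with tuples standing in for the F# types:

```python
# Splitting a box horizontally: the top box should start exactly where
# the bottom box ends, and the two heights should sum to the original.

def add(v, w): return (v[0] + w[0], v[1] + w[1])
def scale(s, v): return (s * v[0], s * v[1])

def scale_vertically(s, box):
    a, b, c = box
    return (a, b, scale(s, c))

def move_vertically(offset, box):
    a, b, c = box
    return (add(a, scale(offset, c)), b, c)

def split_horizontally(f, box):
    top = scale_vertically(f, move_vertically(1 - f, box))
    bottom = scale_vertically(1 - f, box)
    return top, bottom

box = ((0.0, 0.0), (2.0, 0.0), (0.0, 3.0))
top, bottom = split_horizontally(0.25, box)
assert bottom[0] == (0.0, 0.0)               # bottom keeps the origin
assert top[0] == add(bottom[0], bottom[2])   # top starts at bottom's far edge
assert add(top[2], bottom[2]) == box[2]      # heights sum to the original
```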

Now we can create more interesting images, such as this one:


Which is made like this:

above (beside (turn (turn (flip p))) (turn (turn p)))
      (beside (flip p) p)

With this, our basic toolset is complete. Now it is time to lose the support wheels and turn our attention to the original task: creating a replica of Henderson’s replica of Escher’s Square Limit!

We start with a basic picture that is somewhat more interesting than the F we have been using so far.

According to the paper, Henderson created his fish from 30 bezier curves. Here is my attempt at recreating it:

Henderson's fish

You’ll notice that the fish violates the boundaries of the unit square. That is, some points on the shape have coordinates that are below zero or above one. This is fine: the picture isn’t really bound by its box, it’s just scaled and positioned relative to it.

We can of course turn, flip and toss the fish as we like.

Henderson's fish (turned, flipped and tossed)

But there’s more to the fish than might be immediately obvious. After all, it’s not just any fish, it’s an Escher fish. An interesting property of the fish is shown if we overlay it with itself turned twice.

We define a combinator over that takes two pictures and places both pictures with respect to the same bounding box. And voila:


As we can see, the fish is designed so that it fits together neatly with itself. And it doesn’t stop there.

The t tile

This shows the tile t, which is one of the building blocks we’ll use to construct Square Limit. The function ttile creates a t tile when given a picture:

let ttile f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over fishN fishE)

Here we see why we needed the toss transformation defined earlier, and begin to appreciate the ingenious design of the fish.

The second building block we’ll need is called tile u. It looks like this:

The u tile

And we construct it like this:

let utile (f : Picture) = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over fishN fishW)
       (over fishE fishS)

To compose the Square Limit itself, we observe that we can construct it from nine tiles organized in a 3×3 grid. We define a helper function nonet that takes nine pictures as arguments and lays them out top to bottom, left to right. Calling nonet with pictures of the letters H, E, N, D, E, R, S, O, N produces this result:


The code for nonet looks like this:

let nonet p q r s t u v w x =
  aboveRatio 1 2 (besideRatio 1 2 p (beside q r))
                 (above (besideRatio 1 2 s (beside t u))
                        (besideRatio 1 2 v (beside w x)))
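It may not be obvious that this combination of ratios yields nine equal tiles, but the arithmetic checks out; here is a hedged Python sketch of the fractions involved (the `ratio` helper is illustrative):

```python
# aboveRatio 1 2 hands the first picture 1/(1+2) = 1/3 of the height,
# and the nested besideRatio calls do the same for widths, so all nine
# tiles end up as equal thirds.

def ratio(m, n):
    return m / (m + n)

# Heights: top row gets 1/3; the remaining 2/3 is split evenly by above.
top = ratio(1, 2)
middle = (1 - top) * ratio(1, 1)
bottom = (1 - top) * (1 - ratio(1, 1))
assert abs(top - 1/3) < 1e-12
assert abs(middle - 1/3) < 1e-12 and abs(bottom - 1/3) < 1e-12

# Widths within a row: first tile gets 1/3, beside splits the rest evenly.
first = ratio(1, 2)
second = (1 - first) * ratio(1, 1)
assert abs(second - 1/3) < 1e-12
```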

Now we just need to figure out the appropriate pictures to pass to nonet to produce the Square Limit replica.

The center tile is the easiest: it is simply the tile u that we have already constructed. In addition, we’ll need a side tile and a corner tile. Each of those will be used four times, with the turn transformation applied 0 to 3 times.

Both side and corner have a self-similar, recursive nature. We can think of both tiles as consisting of nested 2×2 grids. Similarly to nonet, we define a function quartet to construct such grids out of four pictures:

let quartet p q r s = above (beside p q) (beside r s)

What should we use to fill our quartets? Well, first off, we need a base case to terminate the recursion. To help us do so, we’ll use a degenerate picture blank that produces nothing when given a bounding box.

We’ll discuss side first since it is the simpler of the two, and also because corner uses side. The base case should look like this:

side 1 fish

For the recursive case, we’ll want self-similar copies of the side-tile in the top row instead of blank pictures. So the case one step removed from the base case should look like this:

side 2 fish

The following code helps us construct sides of arbitrary depth:

let rec side n p = 
  let s = if n = 1 then blank else side (n - 1) p
  let t = ttile p
  quartet s s (t |> turn) t

This gives us the side tile that should be used as the “north” tile in the nonet function. We obtain “west”, “south” and “east” as well by turning it around once, twice or thrice.

Creating a corner is quite similar to creating a side. The base case should be a quartet consisting of three blank pictures, and a u tile for the final, bottom right picture. It should look like this:

corner 1 fish

The recursive case should use self-similar copies of both the corner tile (for the top left or “north-west” picture) and the side tile (for the top right and bottom left pictures), while keeping the u tile for the bottom right tile.

corner 2 fish

Here’s how we can write it in code:

let rec corner n p = 
  let c, s = if n = 1 then blank, blank 
             else corner (n - 1) p, side (n - 1) p
  let u = utile p
  quartet c s (s |> turn) u

This gives us the top left corner for our nonet function, and of course we can produce the remaining corners by turning it a number of times.

Putting everything together, we have:

let squareLimit n picture =
  let cornerNW = corner n picture
  let cornerSW = turn cornerNW
  let cornerSE = turn cornerSW
  let cornerNE = turn cornerSE
  let sideN = side n picture
  let sideW = turn sideN
  let sideS = turn sideW
  let sideE = turn sideS
  let center = utile picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

Calling squareLimit 3 fish produces the following image:


Which is a pretty good replica of Henderson’s replica of Escher’s Square Limit, to a depth of 3. Sweet!

Mission accomplished? Are we done?

Sort of, I suppose. I mean, we could be.

However, if you take a look directly at Escher’s woodcut (or, more likely, the photos of it that you can find online), you’ll notice a couple of things. 1) Henderson’s basic fish looks a bit different from Escher’s basic fish. 2) Escher’s basic fish comes in three hues: white, grey and black, whereas Henderson just has a white one. So it would be nice to address those issues.

Here’s what I came up with.


To support different hues of the same fish requires a bit of thinking – we can’t just follow Henderson’s instructions any more. But we can use exactly the same approach! In addition to transforming the shape of the picture, we need to be able to transform the coloring of the picture. For this, we introduce a new abstraction that we will call a Lens.

type Hue = Blackish | Greyish | Whiteish

type Lens = Box * Hue

We redefine a picture to accept a lens instead of just a box. That way, the picture can take the hue (that is, the coloring) into account when figuring out what to draw. Now we can define a new combinator rehue that changes the hue given to a picture:

let rehue p =
  let change = function
  | Blackish -> Greyish
  | Greyish -> Whiteish
  | Whiteish -> Blackish
  fun (box, hue) -> p (box, change hue)

Changing hue three times takes us back to the original hue:

(rehue >> rehue >> rehue) p = p
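This cycle is easy to verify; here is a hedged Python sketch of the hue-changing function used by rehue (strings stand in for the Hue cases):

```python
# The change function cycles Blackish -> Greyish -> Whiteish -> Blackish,
# so applying it three times returns any hue to its starting point.

def change(hue):
    return {"Blackish": "Greyish",
            "Greyish": "Whiteish",
            "Whiteish": "Blackish"}[hue]

for hue in ("Blackish", "Greyish", "Whiteish"):
    assert change(change(change(hue))) == hue
```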

We need to revise the tiles we used to construct the Square Limit to incorporate the rehue combinator. It turns out we need to create two variants of the t tile.


But of course it’s just the same old t tile with appropriate calls to rehue for each fish:

let ttile hueN hueE f = 
   let fishN = f |> toss |> flip
   let fishE = fishN |> turn |> turn |> turn 
   over f (over (fishN |> hueN)
                (fishE |> hueE))

let ttile1 = ttile rehue (rehue >> rehue)

let ttile2 = ttile (rehue >> rehue) rehue

For the u tile, we need three variants:


Again, we just call rehue to varying degrees for each fish.

let utile hueN hueW hueS hueE f = 
  let fishN = f |> toss |> flip
  let fishW = fishN |> turn
  let fishS = fishW |> turn
  let fishE = fishS |> turn
  over (over (fishN |> hueN) (fishW |> hueW))
       (over (fishE |> hueE) (fishS |> hueS))

let utile1 = 
  utile (rehue >> rehue) id (rehue >> rehue) id

let utile2 = 
  utile id (rehue >> rehue) rehue (rehue >> rehue)

let utile3 = 
  utile (rehue >> rehue) id rehue id 

We use the two variants of the t tile in two side functions, one for the “north” and “south” side, another for the “west” and “east” side.

let side tt hueSW hueSE n p = 
  let rec aux n p =
    let t = tt p
    let r = if n = 1 then blank else aux (n - 1) p
    quartet r r (t |> turn |> hueSW) (t |> hueSE)
  aux n p

let side1 =
  side ttile1 id rehue 

let side2 =
  side ttile2 (rehue >> rehue) rehue

We define two corner functions as well, one for the “north-west” and “south-east” corner, another for the “north-east” and the “south-west” corner.

let corner ut sideNE sideSW n p = 
  let rec aux n p = 
    let c, ne, sw = 
      if n = 1 then blank, blank, blank 
               else aux (n - 1) p, sideNE (n - 1) p, sideSW (n - 1) p
    let u = ut p
    quartet c ne (sw |> turn) u
  aux n p 

let corner1 = 
  corner utile3 side1 side2

let corner2 = 
  corner utile2 side2 side1

Now we can write an updated squareLimit that uses our new tile functions.

let squareLimit n picture =
  let cornerNW = corner1 n picture
  let cornerSW = corner2 n picture |> turn
  let cornerSE = cornerNW |> turn |> turn
  let cornerNE = cornerSW |> turn |> turn
  let sideN = side1 n picture
  let sideW = side2 n picture |> turn
  let sideS = sideN |> turn |> turn
  let sideE = sideW |> turn |> turn
  let center = utile1 picture
  nonet cornerNW sideN cornerNE  
        sideW center sideE
        cornerSW sideS cornerSE

And now calling squareLimit 5 fish produces the following image:


Which is a pretty good replica of Escher’s Square Limit, to a depth of 5.

The complete code is here.

Update: I have also written a version using Fable and SAFE that I use for presentations. You can find it here.

Donkey code

This is an attempt to write down a very simple example I’ve been using to explain the profound impact the language we use has on thought, discussion and ultimately code.

Imagine you have a computer system, and that you’re one of the programmers working on that system (not too hard, is it?). The system is called, oh I don’t know, eQuest. It has to do with horses. So it typically works with entities of this kind:


eQuest is a tremendous success for whatever reason, perhaps there’s very little competition. But it is a success, and so it’s evolving, and one day your product owner comes up with the idea to expand to handle entities of this kind as well:


It’s a new kind of horse! It’s mostly like the other horses and so lots of functionality can easily be reused. However, it has some special characteristics, and must be treated a little differently in some respects. Physically it is quite short, but very strong. Behaviour-wise, it is known to be stubborn, intelligent and not easily startled. It’s an interesting kind of horse.  It also likes carrots a lot (but then don’t all horses?). Needless to say, there will be some adjustments to some of the modules.

Design meetings ensue, to flesh out the new functionality and figure out the correct adjustments to be made to eQuest. Discussions go pretty well. Everyone has heard of these “horses that are small and stubborn” as they’re called. (Some rumors indicate that genetically they’re not actually horses at all – apparently there are differences at the DNA level, but the real world is always riddled with such technicalities. From a pragmatic viewpoint, they’re certainly horses. Albeit short and stubborn, of course. And strong, too.) So it’s not that hard to discuss features that apply to the new kind of horse.

There is now a tendency for confusion when discussing other kinds of changes to the product, though. The unqualified term “horse” is obviously used quite a bit in all kinds of discussions, but sometimes the special short and stubborn kind is meant to be included and sometimes it is not. So to clarify, you and your co-workers find yourself saying things like “any horse” to mean the former and “regular horse”, “ordinary horse”, “old horse”, “horse-horse” or even “horse that’s not small and stubborn” to mean the latter.

To implement support for the new horse in eQuest, you need some way of distinguishing between it and an ordinary horse-horse. So you add an IsShort property to your Horse data representation. That’s easy, it’s just a derived property from the Height property. No changes to the database schema or anything. In addition, you add an IsStubborn property and checkbox to the registration form for horses in eQuest. That’s a new flag in the database, but that’s OK. With that in place, you have everything you need to implement the new functionality and make the necessary adjustments otherwise.

Although much of the code applies to horses and short, stubborn horses alike, you find that the transport module, the feeding module, the training module and the breeding module all need a few adjustments, since the new horses aren’t quite like the regular horses in all respects. You need to inject little bits of logic here, split some cases in two there. It takes a few different forms, and you and your co-workers do things a bit differently. Sometimes you employ if-branches with logic that looks like this:

if (horse.IsShort && horse.IsStubborn) {
  // Logic for the new horse case.
} else {
  // Regular horse code here.
}
Other times you go fancy with LINQ:

var newHorses = horses.Where(h => h.IsShort && h.IsStubborn);
var oldHorses = horses.Except(newHorses);
foreach (var h in newHorses) {
  // New horse logic.
}
foreach (var h in oldHorses) {
  // Old horse logic.
}

And that appears to work pretty well, and you go live with support for short and stubborn horses.

Next day, you have a couple of new bug reports, one in the training module and two concerning the feeding module. It turns out that some of the regular horses are short and stubborn too, so your users would register short regular horses, tick the stubborn checkbox, and erroneously get new horse logic instead of the appropriate horse-horse logic. That’s awkward, but understandable. So you call a few meetings, discuss the issue with your fellow programmers, scratch your head, talk to a UX designer and your product owner. And you figure out that not only are the new horses short and stubborn, they make a distinct sound too. They don’t neigh the way regular horses do, they hee-haw instead.

So you fix the bug. A new property on horse, Sound, with values Neigh and HeeHaw, and updates to logic as appropriate. No biggie.

In design meetings, most people still use the term “horse that’s short and stubborn” to mean the new kind of horse, even though you’re encouraging people to include the sound they make as well, or even just say “hee-hawing horse”. But apart from this nit-picking from your side, things proceed well. It appears that most bugs have been ironed out, and your product owner is happy. So happy, in fact, that there is a new meeting. eQuest is expanding further, to handle entities of this kind as well:


What is it? Well, it’s the offspring from a horse and a horse that’s short and stubborn and says hee-haw. It shares some properties with the former kind of horse and some with the latter, so obviously there will be much reuse! And a few customizations and adjustments unique for this new new kind of horse. At this point you’re getting worried. You sense there will be trouble if you can’t speak clearly about this horse, so you cry out “let’s call it a half hee-haw!” But it doesn’t catch on. Talking about things is getting a bit cumbersome.

“But at least I can still implement it,” you think to yourself. “And I can mostly guess what kind of horses the UX people are thinking about when they say ‘horse’ anyway, I’ll just map it in my head to the right thing”. You add a Sire and a Dam property to Horse. And you proceed to update existing logic and write new logic.

You now have code that looks like this:

if ((horse.IsShort && 
     horse.IsStubborn && 
     horse.Sound == Sound.HeeHaw) || 
    (horse.Sire.IsShort && 
     horse.Sire.IsStubborn && 
     horse.Sire.Sound == Sound.HeeHaw) ||
    (horse.Dam.IsShort && 
     horse.Dam.IsStubborn && 
     horse.Dam.Sound == Sound.HeeHaw)) {
  // Logic for both the new horse and the new-new horse!
} else {
  // Really regular horse code here.
}
Which turns out to be wrong, since the new new horse doesn’t really neigh or hee-haw, it does something in-between. There is no word for it, so you invent one: the neigh-haw. You extend the Sound enumeration to incorporate it and fix your code.

Getting all the edge cases right takes a while. Your product owner is starting to wonder why development is slowing down, when so much of the functionality can be reused. You mumble something about technical debt. But you manage to get the bug count down to acceptable levels, much thanks to diligent testing.

At this point, there is another meeting. You are shown two photographs of horses of the newest kind. Or so you think. The product owner smiles. “They’re almost identical, but not quite!” he says. “You see, this one has a horse as a mother and a short stubborn horse as a father.” You see where this is going. “But this one, this one has a short stubborn horse as a mother and a horse as a father.” “Does it matter?” you ask. “This last one is always sterile,” he says. “So you need to handle that in the breeding module.” Oh.

“And then there’s this.”


The point of this example is that it takes very little for software development to get crippled by complexity without precise language. You need words for the things you want to talk about, both in design discussions and in code. Without them, it becomes very difficult to have meaningful communication, and the inability to articulate a thought precisely is made manifest in the code itself. Your task quickly turns from implementing new functionality to making sure that you don’t break existing functionality.

The example is special in that the missing words are sort of jumping out at you. It’s so obvious what they should be. This is not always the case. In many domains, it can be much, much harder to figure out what the words should be. It’s likely to require a lot of time and effort, and involve frustrating and heated discussions with people who think differently than you. You might find that you and your team have to invent new words to represent the concepts that are evolving in your collective mind. But it can also be that you’ve all become accustomed to the set of words you’re currently using, and gone blind to the donkeys in your system.

Strings with assumptions

TL;DR Strings always come with strings attached.

I had a little rant about strings on Twitter the other day. It started like this:

This blog post is essentially the same rant, with a bit of extra cheese.

Here’s the thing: I find that most code bases I encounter are unapologetically littered with strings. Strings are used to hold values of all kinds, from customer names to phone numbers to XML and JSON structures and what have you. As such, strings are incredibly versatile and flexible; properties we tend to think of as positive when we talk about code. So why do I hate strings?

Well, the problem is that we don’t want our types to be flexible like that – as in “accepting of all values”. In fact, the whole point of types is to avoid this flexibility! Types are about restricting the number of possible values in your program, to make the program easier to reason about. You want to allow exactly the legal values, and to forbid all the illegal values. String restricts nothing! String is essentially object! But people who have the decency to refrain from using object will still gladly use string all over the place. It’s weird. And dangerous. That’s why we should never give in to the temptation to escape from the type system by submerging our values in the untyped sea of string. When that value resurfaces sometime later on, we’ll effectively be attempting a downcast from object back to the actual type. Will it succeed? Let’s hope so!

So to be very explicit about it: if you have a string in your program, it could be anything – anything! You’re communicating to the computer that you’re willing to accept any and all of the following fine string specimens as data in your program:

Your program does not distinguish between them; they’re all equally good. When you declare a string in your program, you’re literally saying: I’m willing to accept – I’m expecting! – any and all of those as a value. (I should point out that you’re expecting very big strings too, but I didn’t feel like putting any of them in here, because they’re so unwieldy. Not to mention the door is open to that mirage doppelganger of a string, null, as well – but that’s a general problem, not limited to string.)

Of course, we never want this. This is never what the programmer intends. Instead, the programmer has assumptions in their head, that the string value should really be drawn from a very small subset of the entire domain of strings, a subset that fits the programmer’s purpose. Common assumptions include “not terribly big”, “as large as names get”, “reasonable”, “benign”, “as big as the input field in the form that should provide the value”, “similar to values we’ve seen before”, “some format parsable as a date”, “a number”, “fits the limit of the database column that’s used to persist the value”, “well-formed XML”, “matching some regular expression pattern or other” and so on and so forth. I’m sure you can come up with some additional ones as well.

The assumptions might not be explicitly articulated anywhere, but they’re still there. If the assumptions are implicit, what we have is basically a modelling issue that the programmer hasn’t bothered to tackle explicitly yet. It is modelling debt. So whenever you see string in a program, you’re really seeing “string with assumptions”, with the caveats that the assumptions may not be terribly well defined and there may or may not be attempts to have them enforced. In other words, we can’t trust that the assumptions hold. This is a problem.

So what should we do instead? We can’t realistically eradicate strings from our programs altogether. For instance, we do need to be able to speak string at the edges of our programs. Quite often, we need to use strings to exchange data with others, or to persist values in a database. This is fine. But we can minimize the time we allow strings to be “raw”, without enforced assumptions. As soon as we can, we should make our assumptions explicit – even though that means we might need to spend a little time articulating and modelling those assumptions. (That’s a bonus by the way, not a drawback.) We should never allow a string to pass unchecked through any part of our system. An unchecked string is Schrödinger’s time bomb. You don’t know if it will explode or not until you try to use it. If it turns out your string is a bomb, the impact may vary from the inconvenient to the embarrassing to the catastrophic.

Unsurprisingly, the good people who care about security (which should be all of us!) find strings with assumptions particularly interesting. Why? Because security bugs can be found precisely where assumptions are broken. In particular, since the string type allows for any string, the scene is set for “Houdini strings” to try to escape the cage where they’re held as data, and break free into the realm of code.

To make our assumptions explicit, we need to use types that are not strings. But it’s perfectly fine for them to carry strings along. Here’s a class to represent a phone number in C#:
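A minimal sketch of such a class might look like this (the particular validation rule – a simple pattern of digits, spaces and common punctuation – is an assumption chosen for illustration; a real implementation would encode whatever your domain considers a valid phone number):

```csharp
using System;
using System.Text.RegularExpressions;

public sealed class PhoneNumber
{
    private readonly string _value;

    public PhoneNumber(string value)
    {
        if (value == null)
        {
            throw new ArgumentNullException(nameof(value));
        }
        // Hypothetical rule: optional leading '+', then 5-15 digits,
        // spaces, parentheses or hyphens. Adjust to your actual domain.
        if (!Regex.IsMatch(value, @"^\+?[0-9 ()\-]{5,15}$"))
        {
            throw new ArgumentException("Not a valid phone number.", nameof(value));
        }
        _value = value;
    }

    // Speak string at the edges of the program only.
    public override string ToString()
    {
        return _value;
    }
}
```

The constructor is the single gate where the assumption is enforced: once you hold a PhoneNumber, you know the string inside it passed the check.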

Nothing clever, perfectly mundane. You create your PhoneNumber and use it whenever you’d use “string with assumption: valid phone number”. As you can see, the class does nothing more than hold on to a string value, but it does make sure that the string belongs to that small subset of strings that happen to be valid phone numbers as well. It will reject all the other strings. When you need to speak string (at the edges of your program – you just never do it internally), you call ToString() and shed the protection of your type – but at least at that point you know you have a valid phone number.

It’s not difficult, then. So why do we keep littering our programs with strings with assumptions?