Function application in la-la land
Posted: October 11, 2017 Filed under: Uncategorized Leave a commentHere’s a familiar scenario for a programmer: You have some useful function that you would like to apply to some values. It could be concat
that concatinates two strings, or add
that adds two integers, or cons
which prepends an element to a list, or truncate
which cuts off a string at the specified length, or indeed any old function f
you come up with which takes a bunch of arguments and produces some result.
Simple, right? But there’s a twist! The values you’d like to apply your function to are all trapped in la-la land! And once you have values in la-la land, it’s not obvious how you’d go about getting them out of there. It really depends on the kind of la-la land your values are in. It’s sort of like being trapped in the afterlife. You might be able to return to the land of the living, but it’s not trivial. Certainly not something you’d want your pure, innocent function to have to deal with!
You might wonder how the values ended up in la-la land in the first place. In many cases, they were born there. They are la-la land natives – it’s the only existence they’ve ever known. It sounds weird, but it’s surprisingly common. Indeed, many programs contain not one but several distinct la-la lands, each with their own peculiar laws and customs! Some familiar la-la lands in the .NET world include Task
, Nullable
and List
.
Since la-la lands are so pervasive in our programs, clearly we need to be able to apply functions to the values that dwell there. Previously we’ve seen that if your la-la land is a Functor, there is a Map
function that lets us do that. But there is a problem: Map
cannot work with any of the functions I mentioned above. The reason is that they all take more than one argument. Map
can transform a single value of type T1
inside la-la land to a single value of type T2
inside la-la land. What Map
does is teleport the T1
value out of la-la land, apply your function to obtain a T2
value, and teleport that back into la-la land. You can of course map multiple times, but you’ll still involving just one la-la land value at a time. So that’s not going to work.
What alternatives do we have? Well, one idea that springs to mind is partial application. If we had a curried function, we could apply it to the la-la land values one by one, producing intermediate functions until we have the final result. For instance, say we have a curried version of add
which looks like this:
Func<int, Func<int, int>> add = a => b => a + b;
Now we have a single-argument function that returns a single-argument function that returns a value. So we can use it like this:
Func<int, Func<int, int>> add = a => b => a + b; Func<int, int> incr = add(1); int four = incr(3);
Unfortunately, this still won’t work with Map
. What would happen if we passed the curried add
to Map
? We would get an incr
function stuck inside of la-la land! And then we’d be stuck too. But what if we replaced Map
with something that could work with functions living in la-la land? Something like this:
Lala<TR> Apply<T, TR>(Lala<Func<T, TR>> f, Lala<T> v);
What Apply
needs to do is teleport both the function and the value out of la-la land, apply the function to the value, and teleport the result back into la-la land.
How would this work with our curried add
function? Well, first we’d need to teleport the function itself into la-la land. For this, we need a function, which we’ll call Pure
. It looks like this:
Lala<T> Pure<T>(T val);
In other words, Pure
is a one-way portal to la-la land.
Let’s see how this would work for our curried add
function:
static Lala<int> AddLalaIntegers(Lala<int> a, Lala<int> b) { Func<int, Func<int, int>> add = a => b => a + b; Lala<Func<int, Func<int, int>>> lalaAdd = Lala.Pure(add); Lala<Func<int, int>> lalaPartial = Lala.Apply(lalaAdd, a); Lala<int> lalaResult = Lala.Apply(lalaPartial, b); return lalaResult; }
Success! Who would have thought?
Well, someone, obviously. It turns out that la-la lands that support Pure
and Apply
are known as Applicative.
But there are still questions worth asking, such as: How do we implement these functions? Like Map
, Pure
and Apply
must obey the laws of the particular la-la land they work with. We’re going to look at two examples in C#.
First, consider the la-la land known as Task<T>
.
public static class TaskApplicative { public static Task<T> Pure(T val) { return Task.FromResult(val); } public static async Task<TR> Apply<T, TR>( this Task<Func<T, TR> funTask, Task<T> valTask) { var fun = await funTask; var val = await valTask; return fun(val); } }
Awaiting the tasks bring them out of Task-land, and the return value is automatically transported back by the async machinery.
Second, imagine a type called Mayhaps<T>
. Mayhaps
is like Nullable
, but it works on any type T
. Why is this important? Because delegates are reference types, which means they can’t be put inside a Nullable
. In other words, functions are not allowed into the la-la land that is Nullable
. So Mayhaps
it is.
Mayhaps
has two possible values, Indeed
and Sorry
. Indeed
holds a value, Sorry
does not. That’s really all you need to know about Mayhaps
. (For implementation details, look here.)
Here are Pure
and Apply
for Mayhaps
:
public static class MayhapsApplicative { public static Mayhaps<TR> Pure<TR>(TR v) { return Mayhaps<TR>.Indeed(v); } public static Mayhaps<TR> Apply<T, TR>( this Mayhaps<Func<T, TR>> mayhapsFunction, Mayhaps<T> mayhapsValue) { if (mayhapsFunction.HasValue && mayhapsValue.HasValue) { var fun = mayhapsFunction.Value; var val = mayhapsValue.Value; return Mayhaps<TR>.Indeed(fun(val)); } else { return Mayhaps<TR>.Sorry; } } }
The semantics of Mayhaps
is to propagate Sorry
– you can only get a new Indeed
if you have both a function and a value.
And of course the nice thing now is that we can separate our logic from the idiosyncracies of each la-la land! Which is pretty great.
But I’ll admit that we’re currently in a situation where calling a function is a little bit involved and awkward. It’s involved because there’s quite a bit of boilerplate, and it’s awkward because working with curried functions and partial application isn’t necessarily the bread and butter of C# programming. So let’s write some helper functions to alleviate some of that pain.
We can start by writing functions to curry Func
s, which should reduce the awkward. There are quite a few of them; here’s an example that curries a Func
with four input parameters:
public static Func<T1, Func<T2, Func<T3, Func<T4, TR>>>> Curry<T1, T2, T3, T4, TR>( this Func<T1, T2, T3, T4, TR> f) { return a => b => c => d => f(a, b, c, d); }
We can use it like this:
Func<int, int, int, int, int> sirplusalot = (a, b, c, d) => a + b + c + d; Func<int, Func<int, Func<int, Func<int, int>>>> = sirplusalot.Curry();
A little less awkward. What about involved? We’ll define some helper functions to reduce the boilerplate. The idea is to use a function Lift
to handle pretty much everything for us. Here is one that can be used with sirplusalot
:
public static Lala<TR> Lift<T1, T2, T3, T4, TR>( this Func<T1, T2, T3, T4, TR> f, Lala<T1> v1, Lala<T2> v2, Lala<T3> v3, Lala<T4> v4) { return Pure(f.Curry()).Apply(v1).Apply(v2).Apply(v3).Apply(v4); }
Note that all Lift
functions will have the same structure, regardless of which la-la land they operate in. Only the implementations of Pure
and Apply
will vary.
And now we can implement functions that look like this:
private async static Task<int> Plus( Task<int> ta, Task<int> tb, Task<int> tc, Task<int> td) { Func<int, int, int, int, int> sirplusalot = (a, b, c, d) => a + b + c + d; return await sirplusalot.Lift(ta, tb, tc, td); } private static Mayhaps<int> Plus( Mayhaps<int> ma, Mayhaps<int> mb, Mayhaps<int> mc, Mayhaps<int> md) { Func<int, int, int, int, int> sirplusalot = (a, b, c, d) => a + b + c + d; return sirplusalot.Lift(ma, mb, mc, md); }
Which is quite nice? Yes?
How to reduce bunches of things
Posted: October 5, 2017 Filed under: Uncategorized Leave a commentSo there you are, a pragmatic C# programmer out to provide business value for your end users and all that stuff. That’s great.
One of the (admittedly many) things you might want to do is reduce a bunch of things of some type into a single thing of that type. For instance, you might want to add a bunch of numbers together, or concatinate a bunch of strings and so on. How would you do that? (Assuming there’s no built-in Aggregate
method available, that is.) Well, you’d write a Reduce
function, right? And since we haven’t specified in advance what kinds of things we should reduce, we better make it generic. So it could work on an IEnumerable<T>
of things.
Now how should the actual reduction take place? An obvious idea is to do it stepwise. It’s both a good problem solving strategy in general, and kind of necessary when dealing with an IEnumerable
. For that to work, though, you need some way of taking two values and combining them to produce a single value. So Reduce
needs to be a higher-order function. The caller should pass in a combine
function, as well as some initial value to combine with the first element. And then the completed function might look something like this:
public static T Reduce(this IEnumerable<T> things, Func<T, T, T> combine, T initialValue) { T result = initialValue; foreach (var t in things) { result = combine(result, t); } return result; }
And now if you have a bunch of integers, say, you can add them all up like this:
var integers = new [] { 1, 2, 3, 4 }; var sum = integers.Reduce((a, b) => a + b, 0);
If, on the other hand, you have a bunch of lists, you’d do something like this instead:
var lists = new [] { new List { 1 }, new List { 2, 3 } }; var sum = lists.Reduce((a, b) => { var list = new List(); list.AddRange(a); list.AddRange(b); return list; }, new List());
And this would give you the list of elements 1, 2, 3. Great.
Now there are other things you might wonder about with respect to the combine function. For whatever reason, you might want to consider alternative implementations of Reduce
. For instance, you’d might like to create batches of n
things, reduce each batch, and then reduce those results for the final result. It would be nice to have that freedom of implementation. For that to be an option, though, you need your combine function to be associative.
Assume you have three values t1
, t2
, t3
. Your combine function is associative if the following holds:
combine(t1, combine(t2, t3)) == combine(combine(t1, t2), t3)
Unfortunately there is nothing in the C# type system that lets us specify and verify that a function is associative, so we need to rely on documentation and discipline for that.
Alternatively, we can turn to mathematics. It turns out that mathematicians have a name for precisely the kind of thing we’re talking about. A semigroup is a structure that consists of a set of values and an associative binary operation for combining such values. Granted, it’s a strange-sounding name, but it identifies a very precise concept that gives us something to reason about. So it’s a useful abstraction that actually gives us some guarantees that we can rely on when programming.
To represent a semigroup in our program, we can introduce an interface:
public interface ISemigroup<T> { T Combine(T a, T b); }
And we can modify our Reduce
function to work with semigroups, which by definition guarantees that the Combine
function is associative.
public static T Reduce<T>(this IEnumerable<T> things, ISemigroup<T> semigroup, T initialValue) { T result = initialValue; foreach (var thing in things) { result = semigroup.Combine(result, thing); } return result; }
And we can introduce a bunch of concrete implementations of this interface, like:
class IntegerUnderAdditionSemigroup : ISemigroup<int> { public int Combine(int a, int b) { return a + b; } } class IntegerUnderMultiplicationSemigroup : ISemigroup<int> { public int Combine(int a, int b) { return a * b; } } class StringSemigroup : ISemigroup<string> { public string Combine(string a, string b) { return a + b; } } class ListSemigroup<T> : ISemigroup<List<T>> { public List Combine(List a, List b) { var result = new List(); result.AddRange(a); result.AddRange(b); return result; } } class FuncSemigroup<T> : ISemigroup<Func<T, T>> { public Func<T, T> Combine(Func<T, T> f, Func<T, T> g) { return it => g(f(it)); } }
So that’s quite nice. We can rely on meaningful and precise abstractions to give us some guarantees in our programs.
There is still a small problem when working with semigroups for reduction though. What should the initial value be? We really just want to reduce a bunch of values of some type, we don’t want to be bothered with some additional value.
One approach, I guess, would be to just pick the first value and then perform reduce on the rest.
public static T Reduce(this IEnumerable<T> things, ISemigroup<T> semigroup) { return things.Skip(1).Reduce(semigroup, things.First(); }
This would work for non-empty bunches of things. But that means we’d have to check for that in some way before calling Reduce
. That’s quite annoying.
What would be useful is some sort of harmless value that we could combine with any other value and just end up with the other value. So we could just use that magical value as the initial value for our Reduce
.
Luckily, it turns out that there are such magical values for all the semigroups we’ve looked at. In fact, we’ve seen two such values already. For integers under addition, it’s zero. For lists, it’s the empty list. But there are others. For integers under multiplication, it’s one. For strings, it’s the empty string. And for functions it’s the identity function, which just returns whatever value you hand it. Now if you can provide such a value, which is called the unit value, for your semigroup, you get what the mathematicians call a monoid. It’s another intensely unfamiliar-sounding name, but again the meaning is very precise.
We can represent monoids in our programs by introducing another interface:
public interface IMonoid<T> : ISemigroup<T> { T Unit { get; } }
So there is nothing more to a monoid than exactly this: it’s a semigroup with a unit value. And the contract that the unit value operates under is this:
Compose(Unit, t) == Compose(t, Unit) == t
This just says that the unit value is magical in the sense we outlined. We can combine it with any value t
any way we want, and we end up with t
.
Now we can write a new Reduce
function that works on monoids:
public static T Reduce(this IEnumerable<T> things, IMonoid<T> monoid) { return things.Reduce(monoid, monoid.Unit); }
This is quite nice, because we don’t have to worry any more about whether or not the bunch of things is empty. We can proceed to implement concrete monoids that we might want to use.
class IntegerUnderAdditionMonoid : IntegerUnderAdditionSemigroup, IMonoid<int> { public int Unit { get { return 0; } } } class IntegerUnderMultiplicationMonoid : IntegerUnderMultiplicationSemigroup, IMonoid<int> { public int Unit { get { return 1; } } } class StringMonoid : StringSemigroup, IMonoid<string> { public string Unit { get { return ""; } } } class ListMonoid<T> : ListSemigroup<T>, IMonoid<List<T>> { public List<T> Unit { get { return new List<T>(); } } } class FuncMonoid<T> : FuncSemigroup<T>, IMonoid<Func<T, T>> { public Func<T, T> Unit { get { return it => it; } } }
And we might write a small test program to see if they work as intended.
public static void Main(string[] args) { var integers = new[] { 1, 2, 4, 8 }; var sum = integers.Reduce(new IntegerUnderAdditionMonoid()); var product = integers.Reduce(new IntegerUnderMultiplicationMonoid()); var strings = new[] { "monoids", " ", "are", " ", "nifty" }; var str = strings.Reduce(new StringMonoid()); var lists = new[] { new List { "monoids", " " }, new List { "are" }, new List { " ", "nice" } }; var list = lists.Reduce(new ListMonoid()); var str2 = list.Reduce(new StringMonoid()); var integerFunctions = new Func<T, T>[] { it => it + 1, it => it % 3 }; var intFun = integerFunctions.Reduce(new FuncMonoid()); var stringFunctions = new Func<T, T>[] { s => s.ToUpper(), s => s.Substring(0, 5) }; var strFun = stringFunctions.Reduce(new FuncMonoid()); Console.WriteLine(sum); Console.WriteLine(product); Console.WriteLine(str); Console.WriteLine(list.Count); Console.WriteLine(str2); Console.WriteLine(intFun(1)); Console.WriteLine(intFun(2)); Console.WriteLine(strFun("hello world")); Console.WriteLine(strFun(str)); }
Can you work out what the program will print? If not, you might want to try to run it.
Hopefully this post gives some indication of the flexibility and power that can come with very simple abstractions. It might even give you a creeping sensation that these Haskell heads are onto something when they claim that mathematics that studies structure and composition can be useful for programmers. At the face of things, the processes of adding up integers, concatenating strings, appending lists and composing functions seem quite different, but structurally they nevertheless share some fundamental traits that can be leveraged to great effect.