Strings dressed in tags

In a project I’m working on, we needed a simple way of wrapping strings in tags in a custom grid in our ASP.NET MVC application. The strings should only be wrapped given certain conditions. We really wanted to avoid double if checks, you know, once for the opening tag and once for the closing tag?

We ended up using a Func from string to string to perform wrapping as appropriate. By default, the Func would just be the identity function; that is, it would return the string unchanged. When the right conditions were fulfilled, though, we’d replace it with a Func that would create a new string, where the original one was wrapped in the appropriate tag.
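
Here’s a minimal sketch of that pattern (the condition and the names involved are hypothetical, not from our actual grid):

// By default, leave the string alone.
Func<string, string> transform = s => s;

// Under the right conditions, swap in a wrapping transform.
if (shouldHighlight) // hypothetical condition
{
    transform = s => "<b>" + s + "</b>";
}

string output = transform(cellValue); // cellValue: the hypothetical grid cell text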

The code I came up with lets you write transforms such as this:


Func<string, string> transform =
    s => s.Tag("a")
          .Href("http://einarwh.posterous.com")
          .Style("font-weight: bold");

Which is pretty elegant and compact, don’t you think? Though perhaps a bit unusual. In particular, you might be wondering about the following:

  1. How come there’s a Tag method on the string?
  2. Where do the other methods come from?
  3. How come the return value is a string?

So #1 is easy, right? It has to be an extension method. As you well know, an extension method is just an illusion created by the C# compiler. But it’s a neat illusion that allows for succinct syntax. The extension method looks like this:


public static class StringExtensions
{
    public static dynamic Tag(this string content, string name)
    {
        return new Tag(name, content);
    }
}

So it simply creates an instance of the Tag class, passing in the string to be wrapped and the name of the tag. That’s all. So that explains #2 as well, right? Href and Style must be methods defined on the Tag class? Well no. That would be tedious work, since we’d need methods for all possible HTML tag attributes. I’m not doing that.

If you look closely at the signature of the Tag method, you’ll see that it returns an instance of type dynamic. Now what does that mean, exactly? When dynamic was introduced in C# 4, prominent bloggers were all “oooh it’s statically typed as dynamic, my mind is blown, yada yada yada”, you know, posing as if they didn’t have giant brains grokking this stuff pretty easily? It’s not that hard. As usual, the compiler is sugaring the truth for us. Our trusty ol’ friend ILSpy is kind enough to let us figure out what dynamic really means, by revealing all the gunk the compiler spews out in response to it. You’ll find that it introduces a CallSite at the point in code where you’re interacting with the dynamic type, as well as a CallSiteBinder to handle the run-time binding of operations on the CallSite.
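
Just to give you the flavor, here’s a hand-simplified approximation of that gunk for a single call like the Href invocation above (the real output caches the call site in a static field, and the exact flags and delegate type depend on the call; this assumes System.Runtime.CompilerServices and Microsoft.CSharp.RuntimeBinder are in scope):

// Rough approximation of compiler output for: d.Href("http://..."), d being dynamic.
CallSiteBinder binder = Binder.InvokeMember(
    CSharpBinderFlags.None,
    "Href",
    null,               // no generic type arguments
    typeof(Program),    // the context the call appears in
    new[]
    {
        CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null),
        CSharpArgumentInfo.Create(
            CSharpArgumentInfoFlags.UseCompileTimeType |
            CSharpArgumentInfoFlags.Constant, null)
    });
var site = CallSite<Func<CallSite, object, string, object>>.Create(binder);
object result = site.Target(site, d, "http://einarwh.posterous.com");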

We don’t have to deal with all of that, though. Long story short, Tag inherits from DynamicObject, a built-in building block for creating types with potentially interesting dynamic behaviour. DynamicObject exposes several virtual methods that are called during run-time method dispatch. So basically, when the run-time is trying to figure out which method to invoke and how to invoke it, you’ve got these nice hooks where you can insert your own stuff. Tag, for instance, implements its own version of TryInvokeMember, which is invoked by the run-time to, uh, you know, try to invoke a member? It takes the following arguments:

  • An instance of InvokeMemberBinder (a subtype of CallSiteBinder) which provides run-time binding information.
  • An array of objects containing any arguments passed to the method.
  • An out parameter which should be assigned the return value for the method.

Here is Tag‘s implementation of TryInvokeMember:


public override bool TryInvokeMember(
    InvokeMemberBinder binder,
    object[] args,
    out object result)
{
    _props[binder.Name] = GetValue(args) ?? string.Empty;
    result = this;
    return true;
}

private string GetValue(object[] args)
{
    if (args.Length > 0)
    {
        var arg = args[0] as string;
        if (arg != null)
        {
            return arg;
        }
    }
    return null;
}

What does it do? Well, not a whole lot, really. Essentially it just hamsters values from the method call (the method name and its first argument) in a dictionary. So for instance, when trying to call the Href method in the example above, it’s going to store the value “http://einarwh.posterous.com” for the key “href”. Simple enough. And what about the return value from the Href method call? We’ll just return the Tag instance itself. That way, we get a nice fluent composition of method calls, all of which end up in the Tag‘s internal dictionary. Finally we return true from TryInvokeMember to indicate that the method call succeeded.

Of course, you’re not going to get any IntelliSense to help you get the attributes for your HTML tags right. If you misspell Href, that’s your problem. There’s no checking of anything, this is all just a trick for getting a compact syntax.

Finally, Tag defines an implicit cast to string, which explains #3. The implicit cast just invokes the ToString method on the Tag instance.


public static implicit operator string(Tag tag)
{
    return tag.ToString();
}

public override string ToString()
{
    var sb = new StringBuilder();
    sb.Append("<").Append(_name);
    foreach (var p in _props)
    {
        sb.Append(" ")
          .Append(p.Key.ToLower())
          .Append("=\"")
          .Append(p.Value)
          .Append("\"");
    }
    sb.Append(">")
      .Append(_content)
      .Append("</")
      .Append(_name)
      .Append(">");
    return sb.ToString();
}

The ToString method is responsible for actually wrapping the original string in opening and closing tags, as well as injecting any hamstered dictionary entries into the opening tag as attributes.

And that’s it, really. That’s all there is. Here’s the complete code:


using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Text;

namespace DynamicTag
{
    class Program
    {
        static void Main()
        {
            string s = "blog"
                .Tag("a")
                .Href("http://einarwh.posterous.com")
                .Style("font-weight: bold");
            Console.WriteLine(s);
            Console.ReadLine();
        }
    }

    public class Tag : DynamicObject
    {
        private readonly string _name;
        private readonly string _content;
        private readonly IDictionary<string, string> _props =
            new Dictionary<string, string>();

        public Tag(string name, string content)
        {
            _name = name;
            _content = content;
        }

        public override bool TryInvokeMember(
            InvokeMemberBinder binder,
            object[] args,
            out object result)
        {
            _props[binder.Name] = GetValue(args) ?? string.Empty;
            result = this;
            return true;
        }

        private string GetValue(object[] args)
        {
            if (args.Length > 0)
            {
                var arg = args[0] as string;
                if (arg != null)
                {
                    return arg;
                }
            }
            return null;
        }

        public override string ToString()
        {
            var sb = new StringBuilder();
            sb.Append("<").Append(_name);
            foreach (var p in _props)
            {
                sb.Append(" ")
                  .Append(p.Key.ToLower())
                  .Append("=\"")
                  .Append(p.Value)
                  .Append("\"");
            }
            sb.Append(">")
              .Append(_content)
              .Append("</")
              .Append(_name)
              .Append(">");
            return sb.ToString();
        }

        public static implicit operator string(Tag tag)
        {
            return tag.ToString();
        }
    }

    public static class StringExtensions
    {
        public static dynamic Tag(this string content, string name)
        {
            return new Tag(name, content);
        }
    }
}



Introducing μnit

Last week, I teamed up with Bjørn Einar (control-engineer gone js-hipster) and Jonas (bona fide smalltalk hacker) to talk about .NET Gadgeteer at the NDC 2012 conference in Oslo. .NET Gadgeteer is a rapid prototyping platform for embedded devices running the .NET micro framework – a scaled down version of the .NET framework itself. You can read the abstract of our talk here if you like. The talk itself is available online as well. You can view it here.

The purpose of the talk was to push the envelope a bit, and try out things that embedded .NET micro devices aren’t really suited for. We think it’s important for developers to fool around a bit, without considering mundane things like business value. That allows for immersion and engagement in projects that are pure fun.

I started gently though, with a faux-test driven implementation of Conway’s Game of Life. That is, I wrote the implementation of Life first, and then retro-fitted a couple of unit tests to make it seem like I’d followed the rules of the TDD police. That way I could conjure up the illusion of a true software craftsman, when in fact I’d just written a few tests after the implementation was done, regression tests if you will.

I feel like I had a reasonable excuse for cheating though: at the time, there were no unit testing frameworks available for the .NET micro framework. So you know how TDD opponents find it tedious to write the test before the implementation? Well, in this case I would have to write the unit testing framework before writing the test as well. So the barrier to entry was a wee bit higher.

Now in order to create the illusion of proper craftsmanship in retrospect, I did end up writing tests, and in order to do that, I did have to write my own testing framework. So procrastination didn’t really help all that much. But there you go. Goes to show that the TDD police is on to something, I suppose.

Anyways, the testing framework I wrote is called μnit, pronounced [mju:nit]. Which is a terribly clever name, I’m sure you’ll agree. First off, the μ looks very much like a u. So in terms of glyphs, it basically reads like unit. At the same time, the μ is used as a prefix signifying “micro” in the metric system of measurement – which is perfect since it’s written for the .NET *micro* framework. So yeah, it just reeks of clever, that name.

Implementation-wise it’s pretty run-of-the-mill, though. You’ll find that μnit works just about like any other xUnit framework out there. While the .NET micro framework is obviously scaled down compared to the full .NET framework, it is not a toy framework. Among the capabilities it shares with its bigger sibling is reflection, which is the key ingredient in all the xUnit frameworks I know of. Or at least I suppose it is, I haven’t really looked at the source code of any of them. Guess I should. Bound to learn something.

Anyways, the way I think these frameworks work is that you have some mechanics for identifying test methods hanging off of test classes. For each test method, you create an instance of the test class, run the method, and evaluate the result. Since you don’t want to state explicitly which test methods to run, you typically use reflection to identify and run all the test methods instead. At least that’s how μnit works.

One feature that got axed in the .NET micro framework is custom attributes, and hence there can be no [Test] annotation for labelling test methods. So μnit uses naming conventions for identifying test methods instead, just like in jUnit 3 and earlier. But that’s just cosmetics, it doesn’t really change anything. In μnit we use the arbitrary yet common convention that test methods should start with the prefix “Test”. In addition, they must be public, return void and have no parameters. Test classes must inherit from the Fixture base class, and must have a parameterless constructor. All catering for the tiny bit of reflection voodoo necessary to run the tests.
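
By way of illustration, a μnit test class might look something like this (a made-up example: the World class is a stand-in for whatever you’re actually testing, and I’m assuming an Assert helper that throws AssertFailedException on failure):

public class GameOfLifeTests : Fixture
{
    private World _world; // hypothetical class under test

    public override void Setup()
    {
        _world = new World(8, 8);
    }

    // Public, void, parameterless, prefixed with "Test":
    // exactly what the reflection code below will pick up.
    public void TestEmptyWorldStaysEmpty()
    {
        _world.Tick();
        Assert.IsTrue(_world.IsEmpty());
    }
}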

Here’s the Fixture class that all test classes must inherit from:


namespace Mjunit
{
    public abstract class Fixture
    {
        public virtual void Setup() {}
        public virtual void Teardown() {}
    }
}


As you can see, Fixture defines empty virtual methods for set-up and tear-down, named Setup and Teardown, respectively. Test classes can override these to make something actually happen before and after a test method is run. Conventional stuff.

The task of identifying test methods to run is handled by the TestFinder class.


using System;
using System.Collections;
using System.Reflection;

namespace Mjunit
{
    public class TestFinder
    {
        public ArrayList FindTests(Assembly assembly)
        {
            var types = assembly.GetTypes();
            var fixtures = GetTestFixtures(types);
            var groups = GetTestGroups(fixtures);
            return groups;
        }

        private ArrayList GetTestFixtures(Type[] types)
        {
            var result = new ArrayList();
            for (int i = 0; i < types.Length; i++)
            {
                var t = types[i];
                if (t.IsSubclassOf(typeof(Fixture)))
                {
                    result.Add(t);
                }
            }
            return result;
        }

        private ArrayList GetTestGroups(ArrayList fixtures)
        {
            var result = new ArrayList();
            foreach (Type t in fixtures)
            {
                var g = new TestGroup(t);
                if (g.NumberOfTests > 0)
                {
                    result.Add(g);
                }
            }
            return result;
        }
    }
}


You might wonder why I’m using the feeble, untyped ArrayList, giving the code that unmistakable old-school C# 1.1 tinge? The reason is simple: the .NET micro framework doesn’t have generics. But we managed to get by in 2003, we’ll manage now.

What the code does is pretty much what we outlined above: fetch all the types in the assembly, identify the ones that inherit from Fixture, and proceed to create a TestGroup for each test class we find. A TestGroup is just a thin veneer on top of the test class:


using System;
using System.Collections;
using System.Reflection;

namespace Mjunit
{
    class TestGroup : IEnumerable
    {
        private readonly Type _testClass;
        private readonly ArrayList _testMethods = new ArrayList();

        public TestGroup(Type testClass)
        {
            _testClass = testClass;
            var methods = _testClass.GetMethods();
            for (int i = 0; i < methods.Length; i++)
            {
                var m = methods[i];
                // Guard the length first: Substring would throw on
                // method names shorter than the prefix.
                if (m.Name.Length >= 4 &&
                    m.Name.Substring(0, 4) == "Test" &&
                    m.ReturnType == typeof(void))
                {
                    _testMethods.Add(m);
                }
            }
        }

        public Type TestClass
        {
            get { return _testClass; }
        }

        public int NumberOfTests
        {
            get { return _testMethods.Count; }
        }

        public IEnumerator GetEnumerator()
        {
            return _testMethods.GetEnumerator();
        }
    }
}
}


The TestFinder is used by the TestRunner, which does the bulk of the work in μnit, really. Here it is:


using System;
using System.Collections;
using System.Reflection;
using System.Threading;
using Microsoft.SPOT; // NETMF home of Debug.Print

namespace Mjunit
{
    public class TestRunner
    {
        private Thread _thread;
        private Assembly _assembly;
        private bool _done;

        public event TestRunEventHandler SingleTestComplete;
        public event TestRunEventHandler TestRunStart;
        public event TestRunEventHandler TestRunComplete;

        public TestRunner() {}

        public TestRunner(ITestClient client)
        {
            RegisterClient(client);
        }

        public TestRunner(ArrayList clients)
        {
            foreach (ITestClient c in clients)
            {
                RegisterClient(c);
            }
        }

        public bool Done
        {
            get { return _done; }
        }

        public void RegisterClient(ITestClient client)
        {
            TestRunStart += client.OnTestRunStart;
            SingleTestComplete += client.OnSingleTestComplete;
            TestRunComplete += client.OnTestRunComplete;
        }

        public void Run(Type type)
        {
            Run(Assembly.GetAssembly(type));
        }

        public void Run(Assembly assembly)
        {
            _assembly = assembly;
            _thread = new Thread(DoRun);
            _thread.Start();
        }

        public void Cancel()
        {
            _thread.Abort();
        }

        private void DoRun()
        {
            FireCompleteEvent(TestRunStart, null);
            var gr = new TestGroupResult(_assembly.FullName);
            try
            {
                var finder = new TestFinder();
                var groups = finder.FindTests(_assembly);
                foreach (TestGroup g in groups)
                {
                    gr.AddResult(Run(g));
                }
            }
            catch (Exception ex)
            {
                Debug.Print(ex.Message);
                Debug.Print(ex.StackTrace);
            }
            FireCompleteEvent(TestRunComplete, gr);
            _done = true;
        }

        private void FireCompleteEvent(TestRunEventHandler handler,
            ITestResult result)
        {
            if (handler != null)
            {
                var args = new TestRunEventHandlerArgs
                    { Result = result };
                handler(this, args);
            }
        }

        private TestClassResult Run(TestGroup group)
        {
            var result = new TestClassResult(group.TestClass);
            foreach (MethodInfo m in group)
            {
                var r = RunTest(m);
                FireCompleteEvent(SingleTestComplete, r);
                result.AddResult(r);
            }
            return result;
        }

        private SingleTestResult RunTest(MethodInfo m)
        {
            try
            {
                DoRunTest(m);
                return TestPassed(m);
            }
            catch (AssertFailedException ex)
            {
                return TestFailed(m, ex);
            }
            catch (Exception ex)
            {
                return TestFailedWithException(m, ex);
            }
        }

        private void DoRunTest(MethodInfo method)
        {
            Fixture testObj = null;
            try
            {
                testObj = GetInstance(method.DeclaringType);
                testObj.Setup();
                method.Invoke(testObj, new object[0]);
            }
            finally
            {
                if (testObj != null)
                {
                    testObj.Teardown();
                }
            }
        }

        private Fixture GetInstance(Type testClass)
        {
            var ctor = testClass.GetConstructor(new Type[0]);
            return (Fixture)ctor.Invoke(new object[0]);
        }

        private SingleTestResult TestFailedWithException(
            MethodInfo m, Exception ex)
        {
            return new SingleTestResult(m, TestOutcome.Fail)
                { Exception = ex };
        }

        private SingleTestResult TestFailed(
            MethodInfo m, AssertFailedException ex)
        {
            return new SingleTestResult(m, TestOutcome.Fail)
                { AssertFailedException = ex };
        }

        private SingleTestResult TestPassed(MethodInfo m)
        {
            return new SingleTestResult(m, TestOutcome.Pass);
        }
    }
}


That’s a fair amount of code, and quite a few new concepts that haven’t been introduced yet. At a high level, it’s not that complex though. It works as follows. The user of a test runner will typically be interested in notification during the test run. Hence TestRunner exposes three events that fire when the test run starts, when it completes, and when each test has been run respectively. To receive notifications, the user can either hook up to those events directly or register one or more so-called test clients. We’ll look at some examples of test clients later on. To avoid blocking test clients and support cancellation of the test run, the tests run in their own thread.
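
In other words, kicking off a test run might look something like this (a hypothetical usage sketch; led stands in for a MulticolorLed instance from the hardware kit, and MyTests for your own fixture):

// Register a test client and run all tests in the assembly
// containing the MyTests fixture.
var runner = new TestRunner(new LedTestClient(led));
runner.Run(typeof(MyTests));

// The run happens on a separate thread; poll Done (or hook the
// TestRunComplete event) to find out when it has finished.
while (!runner.Done)
{
    Thread.Sleep(100);
}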

As you can see from the RunTest method, each test results in a SingleTestResult, containing a TestOutcome of Pass or Fail. I don’t know how terribly useful it is, but μnit currently distinguishes between failures due to failed assertions and failures due to other exceptions. It made sense at the time.

The SingleTestResult instances are aggregated into TestClassResult instances, which in turn are aggregated into a single TestGroupResult instance representing the entire test run. All of these classes implement ITestResult, which looks like this:


namespace Mjunit
{
    public interface ITestResult
    {
        string Name { get; }
        TestOutcome Outcome { get; }
        int NumberOfTests { get; }
        int NumberOfTestsPassed { get; }
        int NumberOfTestsFailed { get; }
    }
}


Now for a SingleTestResult, the NumberOfTests will obviously be 1, whereas for a TestClassResult it will match the number of SingleTestResult instances contained by the TestClassResult, and similarly for the TestGroupResult.
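
The result classes themselves aren’t shown here, but from the way TestRunner uses it, SingleTestResult might look roughly like this (an inferred sketch, not the actual μnit source):

public class SingleTestResult : ITestResult
{
    private readonly MethodInfo _method;
    private readonly TestOutcome _outcome;

    public SingleTestResult(MethodInfo method, TestOutcome outcome)
    {
        _method = method;
        _outcome = outcome;
    }

    // Set via object initializers in TestRunner.
    public Exception Exception { get; set; }
    public AssertFailedException AssertFailedException { get; set; }

    public string Name
    {
        get { return _method.Name; }
    }

    public TestOutcome Outcome
    {
        get { return _outcome; }
    }

    public int NumberOfTests
    {
        get { return 1; }
    }

    public int NumberOfTestsPassed
    {
        get { return _outcome == TestOutcome.Pass ? 1 : 0; }
    }

    public int NumberOfTestsFailed
    {
        get { return _outcome == TestOutcome.Fail ? 1 : 0; }
    }
}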

So that pretty much wraps it up for the core of μnit. Let’s take a look at how it looks at the client side, for someone who might want to use μnit to write some tests. The most convenient thing to do is probably to register a test client; that is, some object that implements ITestClient. ITestClient looks like this:


namespace Mjunit
{
    public interface ITestClient
    {
        void OnTestRunStart(object sender,
            TestRunEventHandlerArgs args);
        void OnSingleTestComplete(object sender,
            TestRunEventHandlerArgs args);
        void OnTestRunComplete(object sender,
            TestRunEventHandlerArgs args);
    }
}


The registered test client will then receive callbacks as appropriate when the tests are running.

In order to be useful, test clients typically need to translate notifications into something that a human can see and act upon if necessary. In the .NET Gadgeteer world, that means you need to interact with some hardware.

For the Game of Life implementation (which can be browsed here if you’re interested) I implemented two test clients interacting with elements of the FEZ Spider kit: a DisplayTestClient that shows test results on a small display, and a LedTestClient that simply uses a multicolored LED light to give feedback to the user. Here’s the code for the latter:


namespace Mjunit.Clients.GHI
{
    public class LedTestClient : ITestClient
    {
        private readonly MulticolorLed _led;
        private bool _isBlinking;
        private bool _hasFailed;

        public LedTestClient(MulticolorLed led)
        {
            _led = led;
            Init();
        }

        public void Init()
        {
            _led.TurnOff();
            _isBlinking = false;
            _hasFailed = false;
        }

        public void OnTestRunStart(object sender,
            TestRunEventHandlerArgs args)
        {
            Init();
        }

        public void OnTestRunComplete(object sender,
            TestRunEventHandlerArgs args)
        {
            OnAnyTestComplete(sender, args);
        }

        private void OnAnyTestComplete(object sender,
            TestRunEventHandlerArgs args)
        {
            if (!_hasFailed)
            {
                if (args.Result.Outcome == TestOutcome.Fail)
                {
                    _led.BlinkRepeatedly(Colors.Red);
                    _hasFailed = true;
                }
                else if (!_isBlinking)
                {
                    _led.BlinkRepeatedly(Colors.Green);
                    _isBlinking = true;
                }
            }
        }

        public void OnSingleTestComplete(object sender,
            TestRunEventHandlerArgs args)
        {
            OnAnyTestComplete(sender, args);
        }
    }
}

As you can see, it starts the test run by turning the LED light off. Then, as individual test results come in, the LED light starts blinking. On the first passing test, it will start blinking green. It will continue to do so until a failing test result comes in, at which point it will switch to blinking red instead. Once it has started blinking red, it will stay red, regardless of subsequent results. So the LedTestClient doesn’t actually tell you which test failed, it just tells you if some test failed. Useful for a sanity check, but not much else. That’s where the DisplayTestClient comes in, since it actually shows the names of the tests as they pass or fail.

How does it look in practice? Here’s a video of μnit tests for Game of Life running on the FEZ Spider. When the tests all succeed, we proceed to run Game of Life. Whee!


Recursion for kids

Consider the following problem:

The field vole can have up to 18 litters (batches of offspring) each year, each litter contains up to 8 children. The newborn voles may have offspring of their own after 25 days. How many field voles can a family grow to during the course of a year?

Of course, unless you’re a native English speaker, you might wonder what the heck a field vole is. I know I did.

This a field vole:

[Image: a field vole]

I’m not really sure if it’s technically a mouse or just a really close relative, but for all our intents and purposes, it sure is. A small, very reproductive mouse.

So, do you have an answer to the problem? No?

To provide a bit of background: this problem was presented to a class of fifth graders. Does that motivate you? Do you have an answer now?

If you do, that’s great, but if you don’t, you probably have a whole litter of questions instead. That’s OK too.

You see, the father of one of those fifth graders is a friend of mine. He emailed this problem to a rather eclectic group of people (including some with PhDs in mathematics). Between us, we came up with a list of questions including these:

  • What is the distribution of sexes among the voles?
  • What is the average number of voles per litter? And the distribution?
  • How many voles are gay?
  • How many voles die before they reach a fertile age?
  • How many voles are celibate? Alternatively, how many voles prefer to live without offspring? (Given that voles don’t use prophylactics, these questions yield equivalent results.)
  • Will ugly voles get laid?
  • What is the cheese supply like?
  • Are there cats in the vicinity?

And so on and so forth. Luckily, the fifth grade teacher was able to come up with some constraints for us. Of course, they were rather arbitrary, but perhaps not completely unreasonable:

Each litter contains exactly 8 new voles, 4 females and 4 males. No voles die during the year in question.

That’s great! Given these constraints, we can get to work on a solution.

First, we make the arbitrary choice of associating the offspring with the female voles only. The male voles will be counted as having no offspring at all. While perhaps a bit old fashioned, this greatly simplifies our task. (Of course, we could just as well have opted for the opposite.)

Now we just need to count the offspring of female voles. Since we know that the offspring function is purely deterministic, this isn’t too hard. Given a certain number of days available for reproduction, a female vole will always yield the same number of offspring. (As if women were idempotent!)

To calculate an answer, we can write a small program.


using System.Collections.Generic;

public class Voles
{
    private static int _daysBeforeFirst = 25;
    private static int _daysBetween = 20;
    private static Dictionary<int, long> _cache =
        new Dictionary<int, long>();

    public static long F(int days)
    {
        if (!_cache.ContainsKey(days))
        {
            _cache[days] = F0(days);
        }
        return _cache[days];
    }

    private static long F0(int days)
    {
        int end = days - _daysBeforeFirst;
        if (end < 0)
        {
            return 1;
        }
        int start = end % _daysBetween;
        long count = 0;
        for (int d = start; d <= end; d += _daysBetween)
        {
            count += F(d) + 1;
        }
        return 1 + 4 * count;
    }
}


The F method calculates the total number of offspring for a female vole as a function of how many days it has lived. If you call F with an input of 365 days, you’ll find that the answer is 55,784,398,225. That’s a lot of voles.

How does the algorithm work, though? Well, we assume that we start with a single newborn female vole that has 365 days available to produce offspring (with the first litter arriving after 25 days). Then the number of offspring is given by:

F(365) = 1 + 4 * F(340) + 4 + 4 * F(320) + 4 + … + 4 * F(0) + 4

Of course, you can factor out all the 4’s, like so:

F(365) = 1 + 4 * (F(340) + 1 + F(320) + 1 + … + F(0) + 1)

And that’s pretty much what the code does. In addition, it uses a cache, so that it won’t have to calculate a value twice.
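
Using it is a one-liner:

// One newborn female vole with 365 days to work with.
Console.WriteLine(Voles.F(365)); // 55784398225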

As you might imagine, the kids weren’t really expected to come up with a solution to this problem. Instead, they were supposed to think about recursion and reasonable constraints. Which are noble things to teach kids, for sure. More of that, please.

Nevertheless, I still think the problem kinda sucked. Even if the kids were able to come up with reasonable constraints, they wouldn’t have the tools at hand to produce an answer. Pretty demotivating, I’d say.

My friend’s son was unfazed and cool about it, though. In fact, he was content and confident that the tree structure he started drawing would yield the correct answer, if only he had a sufficiently large piece of paper. How cool is that?


Bix-It: Pix-It in the Browser

The previous blog post introduced PixItHandler, a custom HTTP handler for ASP.NET. The handler responds to HTTP POST requests containing a JSON description of an 8-bit style image with an actual PNG image. Provided you know the expected JSON format, it’s pretty easy to use a tool like Fiddler (or cURL for that matter) to generate renderings of your favorite retro game characters. However, while you might (and should) find those tools on the machine of a web developer, they have somewhat limited penetration among more conventional users. Web browsers have better reach, if you will.

So a challenge remains before the PixItHandler is ready to take over the world. Say we wanted to include a pix-it image in a regular HTML page? That is, we would like the user to make the request from a plain ol’ web browser, and have the browser display the resulting image. We can’t just use an HTML img tag as we normally would, since it issues an HTTP GET request for the resource specified in the src attribute. Moreover, we lack a way of including the JSON payload with the request. We can use another approach though. Using jQuery, we can issue the appropriate POST request with the JSON payload to the HTTP handler. So that means we’re halfway there.

We’re not quite done, though. We still need to figure out what to do with the response. The HTTP response from the PixItHandler is a binary file – it’s not something you can easily inject into the DOM for rendering. So that’s our next challenge.

Luckily, a little-known feature called the data URI scheme comes to the rescue! Basically, data URIs allow you to jam a blob of binary data representing a resource in where you’d normally put the URI for that resource. So in our case, we can use a data URI in the src attribute of our img tag. To do so, we must base64-encode the PNG image and prefix it with some appropriate incantations identifying the text string as a data URI. Base64-encoding is straightforward to do, and there are JavaScript implementations you could steal right off the Internet. Good stuff.
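
Schematically, the result looks like this (with the base64 payload truncated): <img src="data:image/png;base64,iVBORw0KGgoAAA..." />. Incidentally, the iVBORw0KGgo prefix is just the base64 encoding of the PNG file signature, so you’ll see it at the start of any base64-encoded PNG.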

You might think I’d declare victory at this point, but there’s one more obstacle in our way. Unfortunately, it seems that jQuery isn’t entirely happy funnelling the binary response through to us. Loading up binary data isn’t really the scenario the XMLHttpRequest object was designed to support, and so different browsers may or may not allow this to proceed smoothly. I haven’t really gone down to the bottom of the rabbit hole on this issue, because there’s a much simpler solution available: do the base64-encoding server side and pass the image data as text. So I’ve written a BixItHandler which is almost identical to the PixItHandler, except it base64-encodes the result before writing it to the response stream:


private static void WriteResponse(
    HttpResponse response,
    byte[] buffer)
{
    // The body is a base64 string, so text/plain is the right MIME type.
    response.ContentType = "text/plain";
    response.Write(Convert.ToBase64String(buffer));
    response.Flush();
}

Problem solved! Now we can easily create an HTML page with some jQuery to showcase our pix-it images. Here’s one way to do it:


<html>
<head>
    <title>Invaders!</title>
    <style type="text/css">
        .invader { visibility: hidden }
    </style>
</head>
<body>
    <div class="invader">#990000</div>
    <div class="invader">#009900</div>
    <div class="invader">#000099</div>
    <script type="text/javascript"
            src="scripts/json2.js"></script>
    <script type="text/javascript"
            src="scripts/jquery-1.6.4.min.js"></script>
    <script type="text/javascript"
            src="scripts/pixit.js"></script>
    <script type="text/javascript">
        $(document).ready(PixIt.load);
    </script>
</body>
</html>


Not much going on in the HTML file, as you can see. Three innocuous-looking divs that aren’t even visible yet, that’s all. As you might imagine, they are just placeholders that our JavaScript code can work with. That’s where pixit.js comes in:


var PixIt = {
    load: function () {
        var j = {
            "pixelsWide": 13,
            "pixelsHigh": 10,
            "pixelSize": 8,
            "payload": [
                {
                    "color": '#000000',
                    "pixels": [
                        [1, 5], [1, 6], [1, 7], [2, 4],
                        [2, 5], [3, 1], [3, 3], [3, 4],
                        [3, 5], [3, 6], [3, 7], [4, 2],
                        [4, 3], [4, 5], [4, 6], [4, 8],
                        [5, 3], [5, 4], [5, 5], [5, 6],
                        [5, 8], [6, 3], [6, 4], [6, 5],
                        [6, 6], [7, 3], [7, 4], [7, 5],
                        [7, 6], [7, 8], [8, 2], [8, 3],
                        [8, 5], [8, 6], [8, 8], [9, 1],
                        [9, 3], [9, 4], [9, 5], [9, 6],
                        [9, 7], [10, 4], [10, 5],
                        [11, 5], [11, 6], [11, 7]
                    ]
                }
            ]
        };
        $('div.invader').each(function (index) {
            var inv = $(this);
            j.payload[0].color = inv.text();
            $.ajax({
                type: 'POST',
                url: "http://localhost:52984/bix.it",
                contentType: "application/json; charset=utf-8",
                accepts: "text/plain",
                dataType: "text",
                data: JSON.stringify(j),
                success: function (d) {
                    var src = "data:image/png;base64," + d;
                    inv.html('<img src="' + src + '"/>');
                    inv.css('visibility', 'visible');
                }
            });
        });
    }
}


As you can see, we define the basic outline for a space invader as static JSON data in the script. For each of the div tags, we hijack the color code inside and use that to override the color for the space invader. Then we issue the POST request to our brand new BixItHandler, which has been configured to capture requests aimed at the bix.it virtual resource. The response is a base64-encoded PNG file, which we then insert into the src attribute of an img element that we conjure up on the fly.

And how does it look?

[Image: three space invaders rendered in the browser]

Optimus Prime

Iterators, man. They’re so much fun.

I’ve messed about with IEnumerable<T> a little bit before, even in the short history of this blog, but I don’t feel like I can say I’ve done so in anger. Sort of the litmus test to see if you’ve grokked lazy evaluation is to implement an infinite sequence of some sort, wouldn’t you agree? Until you do that it’s all talk and no walk. So I thought I’d rectify that, to be a certified IEnumerable<T>-grokker. You in?

We’ll start gently. The simplest example of an infinite sequence that I can think of is this:

   1, 2, 3, 4, 5…

You’re absolutely right, it’s the sequence of positive integers!

There are actually two simple ways to implement an IEnumerable<int> that would give you that. First off, you could write the dynamic duo of IEnumerable<int> and IEnumerator<int> that conspire to let you write code using the beloved foreach keyword. As you well know, foreach over an IEnumerable<T> compiles to IL that will obtain an IEnumerator<T> and use that to do the actual iteration. (I was going to write iterate over here, but that sort of presupposes that you’ve got something finite that you’re stepping through, doesn’t it?)
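
As a refresher, here’s roughly what the compiler turns a foreach over an IEnumerable<int> into (simplified; the real expansion uses a try/finally to dispose the enumerator):

// Approximate expansion of: foreach (int i in numbers) { /* body */ }
using (IEnumerator<int> e = numbers.GetEnumerator())
{
    while (e.MoveNext())
    {
        int i = e.Current;
        // loop body goes here
    }
}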

Anyways, first implementation:


using System.Collections;
using System.Collections.Generic;

public class Incrementer : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        return new IncrementingEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class IncrementingEnumerator : IEnumerator<int>
{
    private int _n;

    public bool MoveNext()
    {
        ++_n;
        return true;
    }

    public int Current
    {
        get { return _n; }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Dispose() { }

    public void Reset()
    {
        _n = 0;
    }
}


An alternative implementation would be this:


public class ContinuationIncrementer : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        int n = 0;
        while (true)
        {
            yield return ++n;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

This is simpler in terms of lines of code, but it requires you to understand what the yield keyword does. So what does the yield keyword do? Conceptually, yield gives you what is known as a continuation. In essence, it allows you to jump right back into the code where you left off at the previous step in the iteration. Of course, the best way to find out what is going on under the hood is to look at the IL. If you do that, you’ll see that what the C# compiler actually does is conjure up an IEnumerator<int> of its own. This generated class essentially performs the same task as our handwritten IncrementingEnumerator.

Unsurprisingly, then, the end result is the same, regardless of the implementation we choose. What Incrementer gives you is an infinite sequence of consecutive positive integers, starting with 1. So if you have code like this:


foreach (int i in new Incrementer())
{
    Console.WriteLine(i);
    if (i == 10) { break; }
}


That’s going to print the numbers 1 through 10. And since there’s no built-in way to stop the Incrementer, it’s fairly important to break out of that loop!

 

That’s not terribly interesting, though. Although it might be worth noting that at least it consumes little memory, since the IEnumerable<int> only holds on to a single integer at a time. That’s good. Furthermore, we could generalize it to produce the sequence

   n, 2n, 3n, 4n, 5n…

instead (without exciting people too much, I guess). We could even provide an additional constructor to enable you to set a start value k, so you’d get the sequence

   n+k, 2n+k, 3n+k, 4n+k, 5n+k…

Still not excited? Oh well. There’s no pleasing some people.

Let’s implement it anyway. We’ll call it NumberSequence and have it use a NumberEnumerator to do the actual work, such as it is (it isn’t much):


public class NumberSequence : IEnumerable<int>
{
    private readonly int _startValue;
    private readonly int _increment;

    public NumberSequence(int startValue, int increment)
    {
        _startValue = startValue;
        _increment = increment;
    }

    public IEnumerator<int> GetEnumerator()
    {
        return new NumberEnumerator(_startValue, _increment);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class NumberEnumerator : IEnumerator<int>,
    IComparable<NumberEnumerator>
{
    private readonly int _increment;
    private int _currentValue;

    public NumberEnumerator(int startValue, int increment)
    {
        _currentValue = startValue;
        _increment = increment;
    }

    public bool MoveNext()
    {
        _currentValue += _increment;
        return true;
    }

    public int Current
    {
        get { return _currentValue; }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Dispose() { }

    public void Reset()
    {
        throw new NotSupportedException();
    }

    public int CompareTo(NumberEnumerator other)
    {
        return Current.CompareTo(other.Current);
    }
}

You’ll notice that I got a bit carried away and implemented IComparable<NumberEnumerator> as well, so that we could compare the state of two such NumberEnumerator instances should we so desire.

A completely different, but equally simple way to create an infinite sequence is to repeat the same finite sequence over and over again. You could do that completely generically, by repeating an IEnumerable<T>. Like so:

https://gist.github.com/1263173
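
(The embedded gist didn’t survive; here’s a minimal sketch of what the BrokenRecord<T> class might look like, inferred from how it’s used below. It simply replays the wrapped sequence forever.)

using System.Collections;
using System.Collections.Generic;

public class BrokenRecord<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _items;

    public BrokenRecord(IEnumerable<T> items)
    {
        _items = items;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Replay the wrapped sequence over and over again.
        // (Note: an empty sequence would make this spin forever
        // without yielding anything.)
        while (true)
        {
            foreach (var item in _items)
            {
                yield return item;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}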

So you could write code like this:


var words = new[] { "Hello", "dear", "friend" };
int wordCount = 0;
// Note the explicit type argument: constructors can't infer it.
foreach (string s in new BrokenRecord<string>(words))
{
    Console.WriteLine(s);
    if (++wordCount == 10) { break; }
}

And of course the output would be:

   Hello
   dear
   friend
   Hello
   dear
   friend
   Hello
   dear
   friend
   Hello

Now you could put a finite sequence of numbers in there, or just about any sequence you like, in fact. Including, as it were, an infinite sequence of some sort – although that wouldn’t be very meaningful, since you’d never see that thing starting over!

So yeah, we’re starting to see how we can create infinite sequences, and it’s all very easy to do. But what can you do with it?

Let’s turn our attention to an archetypical academic exercise: generating a sequence of prime numbers. Now as you well know, prime numbers aren’t as useless as they might seem to the untrained eye – there are practical applications in cryptography and what have you. But we’re not interested in that right now; we’re going for pure academic interest here. Let’s not fool ourselves into thinking we’re doing anything useful.

A well-known technique for finding prime numbers is called the Sieve of Eratosthenes (a sieve, of course, being a device that separates wanted elements from unwanted ones). In a nutshell, the Sieve of Eratosthenes works like this:

You have a finite list of consecutive integers, starting at 2. You take the first number in the list and identify it as a prime. Then you go over the rest of the list, crossing out any multiples of that prime (since obviously multiples of a prime cannot be other primes). Then you take the next number in the list that hasn’t been crossed out. That’s also a prime, so you repeat the crossing out process for that prime. And so on and so forth until you’re done.

Here’s a simple example of the primes from 2-20.

   2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

The number 2 is a prime! Cross out all the multiples of 2.

   2  3  X  5  X  7  X  9  X 11  X 13  X 15  X 17  X 19  X

The number 3 is also a prime! Now cross out all its multiples:

   2  3  X  5  X  7  X  X  X 11  X 13  X  X  X 17  X 19  X

And we keep going for 5, 7, 11, 13, 17 and 19. As it turns out, all the multiples have already been covered by other prime-multiples, so there are actually no more numbers being crossed out. But algorithmically, we obviously need to go through the same process for all the primes.
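
In code, that finite formulation might look something like the following sketch (for comparison with the infinite version we’re about to build):

// The classic, finite Sieve of Eratosthenes: all primes up to max.
static IEnumerable<int> FiniteSieve(int max)
{
    var crossedOut = new bool[max + 1];
    for (int n = 2; n <= max; n++)
    {
        if (crossedOut[n]) continue;
        yield return n; // not crossed out, so n is a prime
        // Cross out all multiples of n. (Starting at n*n is safe,
        // since smaller multiples were crossed out by smaller primes.)
        for (long m = (long)n * n; m <= max; m += n)
        {
            crossedOut[m] = true;
        }
    }
}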

Now, a problem with the Sieve of Eratosthenes as it is formulated here, is that it presupposes a finite list of numbers, and hence you get a finite list of primes as well. We want an infinite list, otherwise surely we won’t certify. Luckily, it is possible to adjust the Sieve of Eratosthenes to cope with infinite lists of numbers. There’s a very readable paper by Melissa O’Neill that shows how you could go about it. The trick is to do just-in-time elimination of prime-multiples, instead of carrying out the elimination immediately when the prime is found.

Essentially, we maintain an infinite sequence of prime-multiples for each prime we encounter. For each new number we want to check, we check to see if there are pending prime-multiples (i.e. obvious non-primes) matching the number. If there are, the number is not a prime (since we’re looking at a number that would have been eliminated in the finite Sieve of Eratosthenes). The number is discarded just-in-time, and the eliminating sequence(s) of prime-multiples are advanced to the next prime-multiple. If there is no matching prime-multiple in any of the sequences, the number is a brand new prime. This means that we need to add a new infinite sequence of prime-multiples to our collection. In other words, we’re aggregating such sequences as we go along. Luckily, they each hold on to no more than a single integer value (See, I told you that was useful. And you wouldn’t believe me!)

Now how do we keep track of our prime-multiple-sequences? A naive approach would be to keep them all in a list, and just check them all for each candidate number. That wouldn’t be too bad for a small number of primes. However, say you’re looking to see if a number n is the 1001st prime – you’d have to go through all 1000 prime-multiple sequences to see if any of them eliminate n as a candidate. That’s a lot of unnecessary work! What we really need to do is check the one(s) with the smallest pending prime-multiple. Using a priority queue to hold our sequences makes this an O(1) operation. Unfortunately, the .NET framework doesn’t contain an implementation of a priority queue. Fortunately, the C5 Generic Collection Library does. So we’ll use that.
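
As an aside, this is why NumberEnumerator implements IComparable<NumberEnumerator> above: C5’s IntervalHeap orders its elements by comparing them. A tiny illustration:

// A priority queue of prime-multiple sequences, ordered by their
// smallest pending multiple (requires "using C5;").
IPriorityQueue<NumberEnumerator> pq = new IntervalHeap<NumberEnumerator>();
pq.Add(new NumberEnumerator(4, 2)); // multiples of 2, pending: 4
pq.Add(new NumberEnumerator(9, 3)); // multiples of 3, pending: 9
var next = pq.FindMin();            // the sequence pending at 4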

Here, then, is how we could implement an IEnumerable<int> that represents an infinite sequence of primes:


using System;
using System.Collections;
using System.Collections.Generic;
using C5;

public class PrimeSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        return new SimplePrimeEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class SimplePrimeEnumerator : IEnumerator<int>
{
    private readonly IEnumerator<int> _candidates =
        new NumberSequence(1, 1).GetEnumerator();
    private IPriorityQueue<NumberEnumerator> _pq =
        new IntervalHeap<NumberEnumerator>();

    public int Current
    {
        get { return _candidates.Current; }
    }

    public bool MoveNext()
    {
        while (true)
        {
            _candidates.MoveNext();
            int n = _candidates.Current;
            if (_pq.IsEmpty || n < _pq.FindMin().Current)
            {
                // There are no pending prime-multiples.
                // This means n is a prime!
                _pq.Add(new NumberEnumerator(n * n, n));
                return true;
            }
            do
            {
                var temp = _pq.DeleteMin();
                temp.MoveNext();
                _pq.Add(temp);
            } while (n == _pq.FindMin().Current);
        }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Reset()
    {
        throw new NotSupportedException();
    }

    public void Dispose() {}
}

An interesting thing to point out is that the prime-multiple sequence doesn’t have to start until prime*prime. Why? Because smaller multiples of the prime will already be covered by previously considered primes! For instance, the prime-multiple sequence for 17 doesn’t have to contain the multiple 17*11 since the prime-multiple sequence for 11 will contain the same number.

Now this implementation is actually pretty decent. There’s just one thing that leaps to mind as sort of wasted effort. We’re checking every number there is to see if it could possibly be a prime. Yet we know that 2 is the only even number that is a prime (all the others, well, they’d be divisible by 2, right?). So half of our checks are completely in vain.

What if we baked in a little bit of smarts to handle this special case? Say we create a PrimeSequence like so:


public class PrimeSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        yield return 2;
        var e = new OddPrimeEnumerator();
        while (e.MoveNext())
        {
            yield return e.Current;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

The OddPrimeEnumerator is actually quite similar to the naive SimplePrimeEnumerator, except for three things:

  1. It needs to start at 3 instead of 2, since we already yielded 2 in the PrimeSequence.
  2. It needs to skip every other number, so it uses 2 as an increment instead of 1.
  3. The MoveNext method can no longer assume that n <= the smallest pending prime-multiple. In fact, it may very well have skipped past a prime-multiple.


public class OddPrimeEnumerator : IEnumerator<int>
{
    // Start one step before 3, since NumberEnumerator advances
    // before yielding: the first candidate produced is then 3.
    private readonly IEnumerator<int> _candidates =
        new NumberEnumerator(1, 2);
    private readonly IPriorityQueue<NumberEnumerator> _pq =
        new IntervalHeap<NumberEnumerator>();

    public int Current
    {
        get { return _candidates.Current; }
    }

    public bool MoveNext()
    {
        while (true)
        {
            _candidates.MoveNext();
            int n = _candidates.Current;
            bool crossedOut = false;
            while (!crossedOut)
            {
                if (_pq.IsEmpty || n < _pq.FindMin().Current)
                {
                    // There are no pending prime-multiples.
                    // This means n is a prime!
                    _pq.Add(new NumberEnumerator(n * n, n));
                    return true;
                }
                crossedOut = n == _pq.FindMin().Current;
                do
                {
                    var temp = _pq.DeleteMin();
                    temp.MoveNext();
                    _pq.Add(temp);
                } while ((n > _pq.FindMin().Current) ||
                         (crossedOut && n == _pq.FindMin().Current));
            }
        }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Dispose() { }

    public void Reset()
    {
        throw new NotSupportedException();
    }
}

Note that we must go out of our way a little bit to handle the case where we skip past a prime-multiple. Hence the code is microscopically uglier, but we cut our work pretty much in half. But of course, it’s tempting to go further. We check an awful lot of multiples of 3 as well, you know? And of 5? What if we could just skip those too? Turns out there’s a well-known optimization technique known as “wheel factorization” that allows us to do just that.

Here’s a 2*3*5 wheel (well, the first three layers of it, anyway). Note that it starts at 7, which is the first prime we’re not including in this wheel factorization.

[Image: the first three layers of a 2*3*5 wheel]

The green “spokes” of the wheel represent sectors where you might find a prime number. The big red areas you don’t even have to check, because they contain only multiples of the first three primes.

Obviously then, the wheel allows us to rule out a great deal of numbers right off the bat. The numbers not filtered out by the wheel are checked in the normal way. According to O’Neill, there are quickly diminishing returns in using large wheels, so we’ll restrict ourselves to a small wheel that takes out the multiples of 2, 3, and 5.

Now how do we implement this wheel in our code? Well, clearly the wheel can be represented as another infinite sequence, with the characteristic that it repeats the same pattern of numbers to skip over and over again. Well gee, that sounds almost like a broken record, doesn’t it? (You’d think I planned these things!)

Say you wanted to create an infinite skip sequence corresponding to the wheel shown above. This code would do nicely:


var skip = new[] { 4, 2, 4, 2, 4, 6, 2, 6 };
var skipSequence = new BrokenRecord<int>(skip);


Now we can use the skip sequence to create a sequence of prime candidate numbers. We’ll call it a WheelSequence for lack of a better term.


public class WheelSequence : IEnumerable<int>
{
    private readonly int _startValue;
    private readonly IEnumerable<int> _skip;

    public WheelSequence(int startValue,
        IEnumerable<int> skipSequence)
    {
        _startValue = startValue;
        _skip = skipSequence;
    }

    public IEnumerator<int> GetEnumerator()
    {
        yield return _startValue;
        var wse = new WheelSequenceEnumerator(_startValue,
            _skip.GetEnumerator());
        while (wse.MoveNext())
        {
            yield return wse.Current;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class WheelSequenceEnumerator : IEnumerator<int>
{
    private readonly IEnumerator<int> _skip;
    private int _value;

    public WheelSequenceEnumerator(int startValue,
        IEnumerator<int> skip)
    {
        _value = startValue;
        _skip = skip;
    }

    public bool MoveNext()
    {
        _skip.MoveNext();
        _value += _skip.Current;
        return true;
    }

    public int Current
    {
        get { return _value; }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Reset()
    {
        throw new NotSupportedException();
    }

    public void Dispose() { }
}

Now we can replace our original naive SimplePrimeEnumerator with one that uses wheel factorization to greatly reduce the number of candidates considered.


public class WheelPrimeEnumerator : IEnumerator<int>
{
    // The wheel starts at 7, the first prime not folded into the
    // wheel itself; the skip pattern then yields 11, 13, 17, 19, ...
    private readonly IEnumerator<int> _candidates =
        new WheelSequence(7,
            new BrokenRecord<int>(
                new[] { 4, 2, 4, 2, 4, 6, 2, 6 }
            )
        ).GetEnumerator();
    private readonly IPriorityQueue<NumberEnumerator> _pq =
        new IntervalHeap<NumberEnumerator>();

    public int Current
    {
        get { return _candidates.Current; }
    }

    public bool MoveNext()
    {
        while (true)
        {
            _candidates.MoveNext();
            int n = _candidates.Current;
            bool crossedOut = false;
            while (!crossedOut)
            {
                if (_pq.IsEmpty || n < _pq.FindMin().Current)
                {
                    // There are no pending prime-multiples.
                    // This means n is a prime!
                    _pq.Add(new NumberEnumerator(n * n, n));
                    return true;
                }
                crossedOut = n == _pq.FindMin().Current;
                do
                {
                    var temp = _pq.DeleteMin();
                    temp.MoveNext();
                    _pq.Add(temp);
                } while ((n > _pq.FindMin().Current) ||
                         (crossedOut && n == _pq.FindMin().Current));
            }
        }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public void Dispose() {}

    public void Reset()
    {
        throw new NotSupportedException();
    }
}

This particular implementation uses a wheel that pre-eliminates multiples of 2, 3, and 5, but obviously you could use any wheel you want. Note that the only difference between this implementation and the OddPrimeEnumerator is in the choice of IEnumerator<int> for prime number candidates. The rest is unchanged.

Of course, to use this thing, we must first manually yield the primes that we eliminated the multiples of. Like so:


public class WheelPrimeSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        yield return 2;
        yield return 3;
        yield return 5;
        var e = new WheelPrimeEnumerator();
        while (e.MoveNext())
        {
            yield return e.Current;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

That just about wraps it up. I should point out that the current implementation doesn’t really give you an infinite sequence of primes. Unfortunately, the abstraction is all a-leak like a broken faucet, since the pesky real world of finite-sized integers causes it to break down at a certain point. In fact, for the current implementation, that point is after the 4,792nd prime, which is 46,349. Why? Because then we start a prime-multiple sequence at 46,349 * 46,349, which won’t fit into the 32-bit integer we’re currently using to store the current value. Hence we get an overflow, the prime-multiple sequence gets a negative number, and it’s all messed up. We really should put an if-statement in there, to return false from MoveNext if and when we overflow, effectively terminating our not-so-infinite infinite sequence.
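
A sketch of such a guard, applied to NumberEnumerator (note that the n * n seeding in the prime enumerators needs a similar check, since that multiplication is where the first overflow actually happens):

public bool MoveNext()
{
    // Stop instead of silently wrapping around int.MaxValue.
    if (_currentValue > int.MaxValue - _increment)
    {
        return false;
    }
    _currentValue += _increment;
    return true;
}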

Of course we could use a 64-bit integer instead, but keep in mind that we’re really just buying time – we’re not fixing the underlying problem. C# doesn’t have arbitrary-sized integers, end of story. Nevertheless, 64-bit integers will give you primes larger than 3,000,000,000. I’d say it’s good enough for an academic exercise; or as I like to put it, large enough for all impractical purposes.

Do I certify?


Dry data

I’m a big proponent of the NoORM movement. Haven’t heard of it? That’s because it doesn’t exist. But it sort of does, under a different name. So-called “micro-ORMs” like Simple.Data, dapper and massive all belong to this category. That’s three really smart guys (unwittingly) supporting the same cause as I. Not bad. I’d say that’s the genesis of a movement right there.

Unsurprisingly, NoORM means “not only ORM”. The implication is that there are scenarios where full-blown object-relational mapper frameworks like NHibernate and Entity Framework are overkill. Such frameworks really go beyond merely addressing the infamous object/relational impedance mismatch (which is arguably irreconcilable), to support an almost seamless experience of persistent objects stored in a relational database. To do so, they pull out the heavy guns, like the Unit of Work pattern from Martin Fowler’s Patterns of Enterprise Application Architecture (One of those seminal tomes with an “on bookshelf”-to-“actually read it” ratio that’s just a little too high.)

And that’s great! I always say, let someone else “maintain a list of objects affected by a business transaction and coordinate the writing out of changes and the resolution of concurrency problems”. Preferably someone smarter than me. It’s hard to get right, and in the right circumstances, being able to leverage a mature framework to do that heavy lifting for you is a huge boon.

Make sure you have some heavy lifting to do, though. All the power, all the functionality, all the goodness supported by state-of-the-art ORMs, comes at a cost. There’s a cost in configuration, in conceptual overhead, in overall complexity of your app. Potentially there’s a cost in performance as well. What if you don’t care about flexible ways of configuring the mapping of a complex hierarchy of rich domain objects onto a highly normalized table structure? What if you don’t need to support concurrent, persistent manipulation of the state of those objects? What if all you need is to grab some data and go to town? In that case, you might be better served with something simpler, like raw ADO.NET calls or some thin, unambitious veneer on top of that.

Now there’s one big problem with using raw ADO.NET calls: repetition (lack of DRYness). You basically have to go through the same song-and-dance every time, with just minor variations. With that comes boredom and bugs. So how do we avoid that? How do we fight repetition and duplication? By abstraction, of course. We single out the stuff that varies and abstract away the rest. Needless to say, the less that varies, the simpler and more powerful the abstraction. If you can commit to some serious constraints with respect to variety, your abstraction becomes that much more succinct and that much more powerful. Of course, in order to go abstract, we first need to go concrete. So let’s do that.

Here’s a scenario: there’s a database. A big, honkin’ legacy database. It’s got a gazillion stored procedures, reams and reams of business logic written in T-SQL by some tech debt nomad who has since moved on to greener pastures. A dozen business critical applications rely on it. It’s not something you’ll want to touch with a ten-foot pole. The good thing is you don’t have to. For the scope of the project you’re doing, all you need to do is grab some data and go. Invoke a handful of those archaic stored procedures to get the data you need, and you’re done. Home free. Have a cup of tea.

Now what sort of constraints can we embrace and exploit in this scenario?

  1. Everything will be stored procedures.
  2. It’s SQL Server, and that’s not going to change.

As it turns out, the second point is not really significant, since we’ll need database-agnostic code if we’re going to write tests. The first one is interesting though. We’ll also assume that the stored procedures will accept input parameters only. That’s going to simplify our code a great deal.

Let’s start by introducing a naive client doing straight invocation of a few stored procedures in plain ol’ ADO.NET:


using System.Collections.Generic;
using System.Data;
using System.Data.Common;

public class Client1
{
    private readonly string _connStr;
    private readonly DbProviderFactory _dpf;

    public Client1(string connStr) : this(connStr,
        DbProviderFactories.GetFactory("System.Data.SqlClient"))
    {}

    public Client1(string connStr, DbProviderFactory dpf)
    {
        _connStr = connStr;
        _dpf = dpf;
    }

    private DbParameter CreateParameter(string name, object val)
    {
        var p = _dpf.CreateParameter();
        p.ParameterName = name;
        p.Value = val;
        return p;
    }

    public IEnumerable<User> GetCompanyUsers(string company)
    {
        var result = new List<User>();
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = "spGetCompanyUsers";
            var p = CreateParameter("@companyName", company);
            cmd.Parameters.Add(p);
            var reader = cmd.ExecuteReader();
            while (reader.Read())
            {
                var u = new User
                {
                    Id = (string) reader["id"],
                    UserName = (string) reader["user"],
                    Name = (string) reader["name"],
                    Email = (string) reader["emailAddress"],
                    Phone = (string) reader["cellPhone"],
                    ZipCode = (string) reader["zip"]
                };
                result.Add(u);
            }
        }
        return result;
    }

    public string GetUserEmail(string userId)
    {
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = "spGetEmailForUser";
            var p = CreateParameter("@userId", userId);
            cmd.Parameters.Add(p);
            return (string) cmd.ExecuteScalar();
        }
    }

    public void StoreUser(User u)
    {
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = "spInsertOrUpdateUser";
            var ps = new []
            {
                CreateParameter("@userId", u.Id),
                CreateParameter("@user", u.UserName),
                CreateParameter("@name", u.Name),
                CreateParameter("@emailAddress", u.Email),
                CreateParameter("@cellPhone", u.Phone),
                CreateParameter("@zip", u.ZipCode)
            };
            cmd.Parameters.AddRange(ps);
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.ExecuteNonQuery();
        }
    }
}


So as you can see, there’s a great deal of duplication going on there. And obviously, as you add new queries and commands, the amount of duplication increases linearly. It’s the embryo of a maintenance nightmare right there. But we’ll fight back with that trusty ol’ weapon of ours: abstraction! To arrive at a suitable one, let’s play a game of compare and contrast.

What varies?

  • The list of input parameters.
  • In the case of queries: the data row we’re mapping from and the .NET type we’re mapping to.
  • The names of stored procedures.
  • The execute method (ExecuteReader, ExecuteScalar, ExecuteNonQuery). We’re gonna ignore DataSets since I don’t like them. (I’ll be using my own anemic POCOs, thank you very much!).

What stays the same?

  • The connection string.
  • The need to create and open a connection.
  • The need to create and configure a command object.
  • The need to execute the command against the database.
  • The need to map the result of the command to some suitable representation (unless we’re doing ExecuteNonQuery).

There are a couple of design patterns that spring to mind, like Strategy or Template method, that might help us clean things up. We’ll be leaving GoF on the shelf next to PoEAA, though, and use lambdas and generic methods instead.

I take “don’t repeat yourself” quite literally. So we’re aiming for a single method where we’ll be doing all our communication with the database. We’re going to channel all our queries and commands through that same method, passing in just the stuff that varies.

To work towards that goal, let’s refactor into some generic methods:


public class Client2
{
    private readonly string _connStr;
    private readonly DbProviderFactory _dpf;

    public Client2(string connStr) : this(connStr,
        DbProviderFactories.GetFactory("System.Data.SqlClient"))
    {}

    public Client2(string connStr, DbProviderFactory dpf)
    {
        _connStr = connStr;
        _dpf = dpf;
    }

    private DbParameter CreateParameter(string name, object val)
    {
        var p = _dpf.CreateParameter();
        p.ParameterName = name;
        p.Value = val;
        return p;
    }

    public IEnumerable<User> GetCompanyUsers(string company)
    {
        return ExecuteReader("spGetCompanyUsers",
            new[] {CreateParameter("@companyName", company)},
            r => new User
            {
                Id = (string) r["id"],
                UserName = (string) r["user"],
                Name = (string) r["name"],
                Email = (string) r["emailAddress"],
                Phone = (string) r["cellPhone"],
                ZipCode = (string) r["zip"]
            });
    }

    public IEnumerable<T> ExecuteReader<T>(string spName,
        DbParameter[] sqlParams, Func<IDataRecord, T> map)
    {
        var result = new List<T>();
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = spName;
            cmd.Parameters.AddRange(sqlParams);
            var reader = cmd.ExecuteReader();
            while (reader.Read())
            {
                result.Add(map(reader));
            }
        }
        return result;
    }

    public string GetUserEmail(string userId)
    {
        return ExecuteScalar("spGetEmailForUser",
            new[] {CreateParameter("@userId", userId)},
            o => o as string);
    }

    public T ExecuteScalar<T>(string spName,
        DbParameter[] sqlParams, Func<object, T> map)
    {
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = spName;
            cmd.Parameters.AddRange(sqlParams);
            return map(cmd.ExecuteScalar());
        }
    }

    public void StoreUser(User u)
    {
        ExecuteNonQuery("spInsertOrUpdateUser",
            new[]
            {
                CreateParameter("@userId", u.Id),
                CreateParameter("@user", u.UserName),
                CreateParameter("@name", u.Name),
                CreateParameter("@emailAddress", u.Email),
                CreateParameter("@cellPhone", u.Phone),
                CreateParameter("@zip", u.ZipCode)
            });
    }

    public void ExecuteNonQuery(string spName,
        DbParameter[] sqlParams)
    {
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = spName;
            cmd.Parameters.AddRange(sqlParams);
            cmd.ExecuteNonQuery();
        }
    }
}


So we’ve bloated the code a little bit – in fact, we just doubled the number of methods. But we’re in a much better position to write new queries and commands. We’re done with connections and usings and what have you. Later on, we can just reuse the same generic methods.

However, we still have some glaring duplication hurting our eyes: the three execute methods are practically identical. So while the code is much DRYer than the original, there’s still some moisture in there. And moisture leads to smell and rot.

To wring those few remaining drops out of the code, we need to abstract over the execute methods. The solution? To go even more generic!


public TResult Execute<T, TResult>(string spName,
    DbParameter[] sqlParams, Func<IDbCommand, T> execute,
    Func<T, TResult> map)
{
    using (var conn = _dpf.CreateConnection())
    using (var cmd = _dpf.CreateCommand())
    {
        conn.ConnectionString = _connStr;
        conn.Open();
        cmd.Connection = conn;
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandText = spName;
        cmd.Parameters.AddRange(sqlParams);
        return map(execute(cmd));
    }
}


So basically the solution is to pass in a function that specifies the execute method to run. The other execute methods can use this to get their stuff done. Now that we have our single, magical do-all database interaction method, let’s make things a bit more reusable. We’ll cut the database code out of the client, and introduce a tiny abstraction. Let’s call it Database, since that’s what it is. In fact, for good measure, let’s throw in a new method that might be useful in the process: ExecuteRow. Here’s the code:


public class Database
{
    private readonly string _connStr;
    private readonly DbProviderFactory _dpf;

    public Database(string connStr) : this(connStr,
        DbProviderFactories.GetFactory("System.Data.SqlClient"))
    {}

    public Database(string connStr, DbProviderFactory dpf)
    {
        _connStr = connStr;
        _dpf = dpf;
    }

    public IEnumerable<T> ExecuteReader<T>(string spName,
        DbParameter[] sqlParams, Func<IDataRecord, T> map)
    {
        return Execute(spName, sqlParams, cmd => cmd.ExecuteReader(),
            r =>
            {
                var result = new List<T>();
                while (r.Read())
                {
                    result.Add(map(r));
                }
                return result;
            });
    }

    public T ExecuteRow<T>(string spName,
        DbParameter[] sqlParams, Func<IDataRecord, T> map)
    {
        return ExecuteReader(spName, sqlParams, map).First();
    }

    public T ExecuteScalar<T>(string spName,
        DbParameter[] sqlParams, Func<object, T> map)
    {
        return Execute(spName, sqlParams,
            cmd => cmd.ExecuteScalar(), map);
    }

    public void ExecuteNonQuery(string spName,
        DbParameter[] sqlParams)
    {
        Execute(spName, sqlParams,
            cmd => cmd.ExecuteNonQuery(),
            o => o);
    }

    public TResult Execute<T, TResult>(string spName,
        DbParameter[] sqlParams, Func<IDbCommand, T> execute,
        Func<T, TResult> map)
    {
        using (var conn = _dpf.CreateConnection())
        using (var cmd = _dpf.CreateCommand())
        {
            conn.ConnectionString = _connStr;
            conn.Open();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandText = spName;
            cmd.Parameters.AddRange(sqlParams);
            return map(execute(cmd));
        }
    }
}


ExecuteScalar is pretty straightforward, but there are a few interesting details concerning the others. First, ExecuteReader derives a map from IDataReader to IEnumerable<T> from the user-supplied map from IDataRecord to T. Second, ExecuteNonQuery doesn’t really care about the result from calling DbCommand.ExecuteNonQuery against the database (which indicates the number of rows affected by the command/non-query). So we’re providing the simplest possible map – the identity map – to the Execute method.

So the execution code is pretty DRY now. Basically, you’re just passing in the stuff that varies. And there’s a single method actually creating connections and commands and executing them against the database. Good stuff.
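
As an aside, the new ExecuteRow method is used the same way as the others. Given a Database instance called _db, a single-row query might look something like this (spGetUser and the trimmed-down mapping are made up for the sake of the example):


public User GetUser(string userId)
{
    return _db.ExecuteRow("spGetUser",
        new[] { new SqlParameter("@userId", userId) },
        r => new User
        {
            Id = (string) r["id"],
            Name = (string) r["name"]
        });
}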

Let’s attack redundancy in the client code. Here’s what it looks like at the moment:


public class Client4
{
    private readonly Database _db;

    public Client4(Database db)
    {
        _db = db;
    }

    public IEnumerable<User> GetCompanyUsers(string company)
    {
        return _db.ExecuteReader("spGetCompanyUsers",
            new[] { new SqlParameter("@companyName", company) },
            r => new User
            {
                Id = (string) r["id"],
                UserName = (string) r["user"],
                Name = (string) r["name"],
                Email = (string) r["emailAddress"],
                Phone = (string) r["cellPhone"],
                ZipCode = (string) r["zip"]
            });
    }

    public string GetUserEmail(string userId)
    {
        return _db.ExecuteScalar("spGetEmailForUser",
            new[] { new SqlParameter("@userId", userId) },
            o => o as string);
    }

    public void StoreUser(User u)
    {
        _db.ExecuteNonQuery("spInsertOrUpdateUser",
            new[]
            {
                new SqlParameter("@userId", u.Id),
                new SqlParameter("@user", u.UserName),
                new SqlParameter("@name", u.Name),
                new SqlParameter("@emailAddress", u.Email),
                new SqlParameter("@cellPhone", u.Phone),
                new SqlParameter("@zip", u.ZipCode)
            });
    }
}


Actually, it’s not too bad, but I’m not happy about the repeated chanting of new SqlParameter. We’ll introduce a simple abstraction to DRY up that too, and give us a syntax that’s a bit more succinct and declarative-looking.


public class StoredProcedure
{
    private readonly DbProviderFactory _dpf;
    private readonly DbCommand _sp;

    public StoredProcedure(DbCommand sp, DbProviderFactory dpf)
    {
        _sp = sp;
        _dpf = dpf;
    }

    public StoredProcedure this[string parameterName,
        object value, int? size = null, DbType? type = null]
    {
        get { return AddParameter(parameterName, value, size, type); }
    }

    public StoredProcedure AddParameter(string parameterName,
        object value, int? size = null, DbType? type = null)
    {
        var p = _dpf.CreateParameter();
        if (p != null)
        {
            p.ParameterName = parameterName;
            p.Value = value;
            if (size.HasValue)
            {
                p.Size = size.Value;
            }
            if (type.HasValue)
            {
                p.DbType = type.Value;
            }
            _sp.Parameters.Add(p);
        }
        return this;
    }
}

This is basically a sponge for parameters. It uses a little trick with a get-indexer with side-effects to do its thing. This allows for a simple fluent syntax for adding parameters to a DbCommand object. In isolation, the chaining looks something like this (with made-up parameter values):
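
// Each indexer access adds a parameter to the underlying
// command and returns the same StoredProcedure, enabling chaining.
Func<StoredProcedure, StoredProcedure> configure =
    sp => sp["@userId", "u-42"]["@zip", "0042"];

Let’s refactor the generic Execute method to use it.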


public TResult Execute<T, TResult>(string spName,
    Func<StoredProcedure, StoredProcedure> configure,
    Func<IDbCommand, T> execute,
    Func<T, TResult> map)
{
    using (var conn = _dpf.CreateConnection())
    using (var cmd = _dpf.CreateCommand())
    {
        conn.ConnectionString = _connStr;
        conn.Open();
        cmd.Connection = conn;
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandText = spName;
        configure(new StoredProcedure(cmd, _dpf));
        return map(execute(cmd));
    }
}

The refactoring ripples through to the other execute methods as well, meaning you pass in a Func instead of the parameter array. For instance, ExecuteReader might end up looking something like this (a sketch, mirroring the ExecuteReader we already have):
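
public IEnumerable<T> ExecuteReader<T>(string spName,
    Func<StoredProcedure, StoredProcedure> configure,
    Func<IDataRecord, T> map)
{
    return Execute(spName, configure, cmd => cmd.ExecuteReader(),
        r =>
        {
            var result = new List<T>();
            while (r.Read())
            {
                result.Add(map(r));
            }
            return result;
        });
}

Now the interesting part is how the new abstraction affects the client code. Here’s how: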


public class Client5
{
    private readonly Database _db;

    public Client5(Database db)
    {
        _db = db;
    }

    public IEnumerable<User> GetCompanyUsers(string company)
    {
        return _db.ExecuteReader(
            "spGetCompanyUsers",
            sp => sp["@companyName", company],
            r => new User
            {
                Id = (string) r["id"],
                UserName = (string) r["user"],
                Name = (string) r["name"],
                Email = (string) r["emailAddress"],
                Phone = (string) r["cellPhone"],
                ZipCode = (string) r["zip"]
            });
    }

    public string GetUserEmail(string userId)
    {
        return _db.ExecuteScalar(
            "spGetEmailForUser",
            sp => sp["@userId", userId],
            o => o as string);
    }

    public void StoreUser(User u)
    {
        _db.ExecuteNonQuery(
            "spInsertOrUpdateUser",
            sp => sp["@userId", u.Id]
                    ["@user", u.UserName]
                    ["@name", u.Name]
                    ["@emailAddress", u.Email]
                    ["@cellPhone", u.Phone]
                    ["@zip", u.ZipCode]);
    }
}


Which is pretty much as DRY as it gets, at least in my book. We just grab the data and go. Wheee! Where’s my tea?


UTC now!

The other day, I stumbled across this post, describing how .NET developers are allegedly misusing DateTime.Now. The gist of it is that we are using DateTime.Now in cases where DateTime.UtcNow would be more appropriate, for instance when measuring runtime performance. The difference between Now and UtcNow is that the former gives you local time, whereas the latter gives you “universal” or location-independent time. It turns out that Now is one or two orders of magnitude slower than UtcNow; the reason being that Now has to derive the local time by applying a time zone offset to universal time. Clearly, this is unnecessary work if all you’re going to do is calculate the time elapsed between two points in time. Based on this, the author of the blog post argues that using Now when you could have used UtcNow is wrong.

I’m of two minds about this issue. My gut reaction was that, “gee, that’s really unimportant”. Or to borrow a phrase from one of my co-workers, “I would look for performance optimisations elsewhere”. So Now is much slower than UtcNow. I didn’t know that, but then again I don’t think it’s a big deal. Performance doesn’t matter until it does – that is, when something is perceived as unacceptably slow by a user. Of course, the post includes the mandatory contrived example, showing how the difference between Now and UtcNow matters when current time is sampled a million times in a tight loop. Yup. EVERYTHING matters when you’re doing it a million times. (Especially if you do it on some thread where there’s a user waiting at the other end.) Nevertheless, I find it entirely plausible that in a practical scenario, there would be other things you needed to do a million times besides getting the time, which just might dwarf the demonstrated performance difference. Put another way: I estimate there’s something like a one-in-a-million chance that you’re actually going to be in a real world situation where this is going to be a problem. Now for that one guy in Duluth, Minnesota who actually has to get the current time – and nothing else – a million times in a row for some business-critical routine, it’s going to matter if he knows about the performance difference between Now and UtcNow. I hope he does. For the rest of us, not so much.

On the flip side though: now that I know, I’ll never again use Now if I can use UtcNow instead. Isn’t that ironic? I’ve just claimed that this has no practical implications whatsoever, and yet: in the future, whenever I’m doing some relative measurement of time, it’s UtcNow all the way. Unless it’s StopWatch, which is usually much more convenient (and which uses the high-resolution performance counter when one is available, falling back to DateTime.UtcNow ticks otherwise). After all, there’s no reason to waste clock cycles on purpose. Heh.
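
For reference, the two styles of relative timing look something like this (DoWork is just a stand-in for whatever you’re measuring):


static void DoWork() { Thread.Sleep(100); } // Stand-in for the thing being measured.

static void Measure()
{
    // Relative timing with DateTime.UtcNow.
    DateTime start = DateTime.UtcNow;
    DoWork();
    TimeSpan elapsed = DateTime.UtcNow - start;
    Console.WriteLine("Spent {0} ms.", (long) elapsed.TotalMilliseconds);

    // Or, more conveniently, with Stopwatch.
    var sw = Stopwatch.StartNew();
    DoWork();
    sw.Stop();
    Console.WriteLine("Spent {0} ms.", sw.ElapsedMilliseconds);
}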

Lurking underneath this rather trivial matter, though, is a somewhat deeper issue. Which is: should I have known?

Isn’t it negligence on my part that I was unaware of the performance implications of using Now instead of UtcNow? After all, I’m a professional developer, right? I’ve heard Uncle Bob‘s sermons about craftsmanship and I’m a believer. My employer pays me good money each month to produce the best code I’m able to. In turn, my employer’s customers rely on the quality of that code. And I’ve put my share of Now‘s in there, you know? Without needing to. Superfluous work. Burnt cycles. UtcNow would have been just fine. Not that it matters, performance-wise, in any way, shape or form – if and when I’ve written sluggish code, it’s not because of Now – it’s because I’ve chosen a suboptimal algorithm or data structure at some key point in the application. Still: the right thing would have been to use UtcNow unless I needed local time, dammit! And I’ve plunged ahead, unwittingly using Now instead, all due to my incomplete knowledge of the .NET framework. Isn’t that sloppy?

As you can imagine, I’m totally going to exonerate myself on this issue. No, I should not have known! I mean, I could have known, it would have been nice if I would have known, but I don’t think I should have known. You see, programmers work in a constant state of partial, incomplete understanding. This is not an anomaly, it’s the name of the game. Modern software is just too large and complex to make it feasible to grok it all unless you’re L. Peter Deutsch or something. And you probably aren’t. This certainly goes for the vast .NET framework. According to this post, version 3.5 of the .NET framework contained 11417 types and 109657 members. Complete understanding is not only unattainable, it’s also economically unsound to aim for. Since program understanding is a slow and costly process, your efforts should be focused on understanding what is needed to solve the problem at hand. You need just enough program understanding to implement a feature or fix a bug or whatever you need to do. Just-in-time program understanding is a lean and pragmatic approach that allows you to get things done. (Note that I’m not advocating programming by coincidence here. Clearly you have to understand, at some level, what the code you’re invoking does. But you do so at the appropriate level of abstraction. You shouldn’t have to fire up Reflector every time you’re using an API function.)

If anything, it’s an API design issue. (Blaming it on Microsoft is always a safe bet!) When using an API, you try to map the problem you need solved onto the members of the API. You scan the available properties and methods, and conjecture a hypothesis regarding which one will fulfill your need. In doing so, you reach for the simplest-sounding potential match first. Only when that doesn’t work for you do you consider the next most plausible candidate. You can think of this as API affordance – the API will invite you, through its vocabulary, to try out certain things first. If you need Foo to happen and the API supports methods Foo() and FrobnitzFoo(), you’re going to try Foo() first. Right? Only when the plain Foo() fails to Foo as you’d like it to will you try FrobnitzFoo() instead. I’m sure you can see the parallel to Now/UtcNow. Since Microsoft gave the straight-forward simplest name to the property that gives you local time, that’s what developers reach for, even when they just need a relative timestamp. It’s pure economics. And lo and behold! it works, because for all but the poor schmuck in Duluth, Now gets the job done just fine. The hypothesis is never falsified. (Of course, that guy from Duluth is going to end up an enlightened schmuck, because for him, the hypothesis will be falsified – he’s going to get burned by the relatively slow implementation, probe around the DateTime API, maybe even crank out Reflector, and eventually find that “hey, I should be using UtcNow instead”. And he’ll be good to go, and all smug about it to boot.)

Of course, we can speculate as to why Microsoft used the name ‘Now’ instead of something like ‘LocalNow’, which would have been more precise. Is it accidental? Due to negligence or incompetence? Probably not. Now is broadly applicable and good enough in all but the most extreme scenarios. It can be used successfully both for showing local time to the user (which is usually the right thing to do), and for measuring runtime performance (even though it’s not the optimal choice in the latter case). I think they did it very deliberately. And I don’t think it’s a bad thing. If Microsoft made a mistake, it’s that Now is a property. In general, the rule is that properties should be instantaneous. According to the MSDN documentation, a method is preferable to a property when “[it] performs a time-consuming operation, [that is] perceivably slower than the time it takes to set or get a field’s value”. Arguably, then, Now should really have been Now(). That would have provided a subtle hint that getting the current time isn’t free. And these long posts about DateTime.Now wouldn’t have been written at all.


The Indispensable IDisposable

The using statement in C# is a peculiar spoonful of syntactic sugar. It is peculiar because it’s tailor-made for a particular interface in the .NET framework (namely, IDisposable). Hence in the C# standard, you’ll find that the semantics of using is defined in terms of how it interacts with that interface, the existence of which is sort of assumed a priori. So the boundary between language and library gets really blurred.

As you well know, the purpose of using is to make it 1) more convenient for programmers to work with so-called unmanaged resources, and 2) more likely that programmers will dispose of such resources in a timely manner. That’s why it’s there.

The archetypical usage is something like:


using (var resource = new Resource())
{
    // Use the resource here.
}


This will expand to:


var resource = new Resource();
try
{
    // Use the resource here.
}
finally
{
    if (resource != null)
    {
        resource.Dispose();
    }
}

The using statement has a lot of potential use cases beyond that, though – indeed, that’s what this blog post is all about! The MSDN documentation states that “the primary use of IDisposable is to release unmanaged resources”, but it is easy and fun to come up with interesting secondary uses. Basically any time you need something to happen before and after an operation, you’ve got a potential use case for using. In other words, you can use it as sort of a poor man’s AOP.
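
To make that concrete, here’s a minimal sketch of the general pattern (the Scope type is my own invention, not something from the framework):


// A generic before/after scope: runs one action up front,
// another when the using block ends.
sealed class Scope : IDisposable
{
    private readonly Action _after;

    public Scope(Action before, Action after)
    {
        before();
        _after = after;
    }

    public void Dispose()
    {
        _after();
    }
}

// Usage:
// using (new Scope(() => Console.WriteLine("enter"),
//                  () => Console.WriteLine("exit")))
// {
//     // The operation goes here.
// }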

Some people find the secondary uses for using to be abuse, others find them artistic. The most convincing argument I’ve read against liberal use of using is Eric Lippert’s comment on this stack overflow question. Essentially, the argument is that a Dispose method should be called out of politeness, not necessity: the correctness of your code shouldn’t depend upon Dispose being called. I won’t let that stop me though! (Granted, you’d need to put 1024 me’s in a cluster to get the brain equivalent of a Lippert, but hey – he’s just this guy, you know?). After all, what does code correctness mean? If your application leaks scarce resources due to untimely disposal, it’s broken – you’ll find it necessary to explicitly dispose of them. There’s a sliding scale between politeness and necessity, between art and abuse, and it’s not always obvious when you’re crossing the line. Also, I have to admit, I have a soft spot for cute solutions, especially when they make for clean, readable code. I therefore lean towards the forgiving side. YMMV.

So with that out of the way, let’s start abusing using:

Example 1: Performance timing

In my mind, the simplest non-standard application of using is to measure the time spent doing some operation (typically a method call). A PerfTimer implementing IDisposable gives you a neat syntax for that:


class Program
{
    static void Main()
    {
        using (new PerfTimer())
        {
            // Do your thing.
        }
    }
}

class PerfTimer : IDisposable
{
    private readonly Stopwatch _ = new Stopwatch();

    public PerfTimer()
    {
        _.Start();
    }

    public void Dispose()
    {
        _.Stop();
        Console.WriteLine("Spent {0} ms.", _.ElapsedMilliseconds);
    }
}

Note that you don’t have to hold on to the PerfTimer you obtain in the using statement, since you’re not actually using it inside the scope of the using block. Obviously Dispose will be called nevertheless.

Example 2: Impersonation

Impersonation is one of my favorite using use cases. What you want is to carry out a sequence of instructions using a particular identity, and then revert to the original identity when you’re done. Wrapping your fake id up in an IDisposable makes it all very clean and readable:


class Program
{
    static void Main()
    {
        WindowsIdentity id = …;
        using (new Persona(id))
        {
            // Act as id.
        }
    }
}

class Persona : IDisposable
{
    private readonly WindowsImpersonationContext _;

    public Persona(WindowsIdentity id)
    {
        _ = id.Impersonate();
    }

    public void Dispose()
    {
        _.Undo();
    }
}

Example 3: Temporary dependency replacement

Another useful application of using is to fake out some global resource during testing. It’s really a kind of dependency injection happening in the using statement. The neat thing is that you can reinject the real object when you’re done. This can help avoid side-effects from one test affecting another test.

Let’s say you want to control time:


class Program
{
    static void Main()
    {
        Tick(); Tick(); Tick();
        DateTime dt = DateTime.Now;
        using (Timepiece.Replacement(() => dt.Add(dt - DateTime.Now)))
        {
            Tick(); Tick(); Tick();
        }
        Tick(); Tick(); Tick();
    }

    static void Tick()
    {
        Thread.Sleep(1000);
        Console.WriteLine("The time is {0}", Timepiece.Now.ToLongTimeString());
    }
}

public static class Timepiece
{
    private static Func<DateTime> _ = () => DateTime.Now;

    public static DateTime Now { get { return _(); } }

    public static IDisposable Replacement(Func<DateTime> f)
    {
        return new TempTimepiece(f);
    }

    class TempTimepiece : IDisposable
    {
        private readonly Func<DateTime> _original;

        public TempTimepiece(Func<DateTime> f)
        {
            _original = _;
            _ = f;
        }

        public void Dispose()
        {
            _ = _original;
        }
    }
}

The idea is that we eliminate uses of DateTime.Now in our code, and consistently use Timepiece.Now instead. By default, Timepiece.Now uses DateTime.Now to yield the current time, but you’re free to replace it. You can pass in your own time provider to the Replacement method, and that will be used instead – until someone calls Dispose on the TempTimepiece instance returned from Replacement, that is. In the code above, we’re causing time to go backwards for the three Ticks inside the using block. The output looks like this:

[Screenshot: console output showing the clock ticking forward, running backwards inside the using block, then ticking forward again]
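
The same trick is handy in tests, where you typically want a frozen clock rather than a backwards one. A sketch, using plain Debug.Assert for the sake of the example:


// Hypothetical test: freeze the clock at a known instant.
var frozen = new DateTime(2012, 1, 1, 12, 0, 0);
using (Timepiece.Replacement(() => frozen))
{
    // Code under test sees a constant clock.
    Debug.Assert(Timepiece.Now == frozen);
}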

Example 4: Printing nested structures

So far we’ve seen some modest examples of abuse. For our last example, let’s go a bit overboard, forget our inhibitions and really embrace using!

Here’s what I mean:


public override void Write()
{
    using (Html())
    {
        using (Head())
        {
            using (Title())
            {
                Text("Greeting");
            }
        }
        using (Body(Bgcolor("pink")))
        {
            using (P(Style("font-size:large")))
            {
                Text("Hello world!");
            }
        }
    }
}


Hee hee.

Yup, it’s an embedded DSL for writing HTML, based on the using statement. Whatever your other reactions might be – it’s fairly readable, don’t you think?

When you run it, it produces the following output (nicely formatted and everything):

[Screenshot: the generated HTML document, nicely indented according to tag depth]

How does it work, though?

Well, the basic idea is that you don’t really have to obtain a new IDisposable every time you’re using using. You can keep using the same one over and over, altering its state as you go along. Here’s how you can do it:


class Program
{
    static void Main(string[] args)
    {
        new HtmlWriter(Console.Out).Write();
    }
}

class HtmlWriter : BaseHtmlWriter
{
    public HtmlWriter(TextWriter tw) : base(tw) {}

    public override void Write()
    {
        using (Html())
        {
            using (Head())
            {
                using (Title())
                {
                    Text("Greeting");
                }
            }
            using (Body(Bgcolor("pink")))
            {
                using (P(Style("font-size:large")))
                {
                    Text("Hello world!");
                }
            }
        }
    }
}

class DisposableWriter : IDisposable
{
    private readonly Stack<string> _tags = new Stack<string>();
    private readonly TextWriter _;

    public DisposableWriter(TextWriter tw)
    {
        _ = tw;
    }

    public IDisposable Tag(string tag, params string[] attrs)
    {
        string s = attrs.Length > 0 ? tag + " " + string.Join(" ", attrs) : tag;
        Write("<" + s + ">");
        _tags.Push(tag);
        return this;
    }

    public void Text(string s)
    {
        Write(s);
    }

    private void Write(string s)
    {
        // Indent according to the depth of the tag stack.
        _.WriteLine("{0}{1}", new string(' ', 2 * _tags.Count), s);
    }

    public void Dispose()
    {
        var tag = _tags.Pop();
        Write("</" + tag + ">");
    }
}

abstract class BaseHtmlWriter
{
    private readonly DisposableWriter _;

    protected BaseHtmlWriter(TextWriter tw)
    {
        _ = new DisposableWriter(tw);
    }

    protected IDisposable Html()
    {
        return _.Tag("html");
    }

    protected IDisposable Body(params string[] attrs)
    {
        return _.Tag("body", attrs);
    }

    // More tags…

    protected string Bgcolor(string value)
    {
        return Attr("bgcolor", value);
    }

    protected string Style(string value)
    {
        return Attr("style", value);
    }

    // More attributes…

    protected string Attr(string key, string value)
    {
        return key + "=\"" + value + "\"";
    }

    protected void Text(string s)
    {
        _.Text(s);
    }

    public abstract void Write();
}


So you can see, it’s almost like you’re using an IDisposable in a fluent interface. You just keep using the same DisposableWriter over and over again! Internally, it maintains a stack of tags. Whenever you add a new tag to the writer (which happens on each new using), it writes the start tag to the stream and pushes it onto the stack. When the using block ends, Dispose is called on the DisposableWriter – causing it to pop the correct tag off the stack and write the corresponding end tag to the stream. The indentation is determined by the depth of the stack, of course.

Wasn’t that fun? There are other things you could do, too. For instance, I bet you could implement an interpreter for a stack-based language (such as IL) pretty easily. Let each instruction implement IDisposable, pop values off the stack upon instantiation, execute the instruction, optionally push a value back on upon Dispose. Shouldn’t be hard at all.
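
For instance, an addition instruction might look something like this – just a sketch of the idea, assuming a shared operand stack that gets passed around:


// Pops two operands and executes on construction,
// pushes the result when the using block ends.
class Add : IDisposable
{
    private readonly Stack<int> _stack;
    private readonly int _result;

    public Add(Stack<int> stack)
    {
        _stack = stack;
        _result = stack.Pop() + stack.Pop();
    }

    public void Dispose()
    {
        _stack.Push(_result);
    }
}

// Usage: with 1 and 2 on the stack, this leaves 3.
// using (new Add(stack)) {}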

Now if I could only come up with some neat abuses of foreach


Enumerating enumerables

You know when you’re iterating over some IEnumerable, and you need to associate the items in the IEnumerable with a sequence number?

In Python, you could do this:

[Screenshot: a Python shell session using the built-in enumerate() to pair each item with its index]

In C#, however, you’re forced to do something like this:


var items = new [] { "zero", "one", "two" };
int no = 0;
foreach (var it in items)
{
    Console.WriteLine("{0} => {1}", no, it);
    ++no;
}


Yuck. I feel dirty each time. It’s two measly lines of code, but it sure feels like I’m nailing something onto the loop that doesn’t belong there. (And that’s probably because that’s exactly what I’m doing.) It feels out of sync with the level of abstraction for the foreach statement, and it’s just plain ugly. So what I’m looking for is an approach that’s more appealing aesthetically, something a little more polished, something like:


var items = new [] { "zero", "one", "two" };
foreach (var it in items.Enumerate())
{
    Console.WriteLine("{0} => {1}", it.Number, it.Item);
}


To be sure, this is still not as clean as the Python code (for one, there’s no decomposition of tuple types).  But personally, I like it a whole lot better than the original C# version. It’s prettier, cleaner, and plugs the leaky abstraction.

As you can imagine, I’m using an extension method to pretend that IEnumerables can be, you know, enumerated. The task of the extension method is just to turn an IEnumerable<T> into an IEnumerable<Enumerated<T>>, like so:


public static IEnumerable<Enumerated<T>> Enumerate<T>(this IEnumerable<T> e)
{
    int i = 0;
    return e.Select(it => new Enumerated<T>(i++, it));
}

And Enumerated<T> is just a necessary evil to appease the C# compiler:


class Enumerated<T>
{
    private readonly int _number;
    private readonly T _;

    public Enumerated(int number, T t)
    {
        _number = number;
        _ = t;
    }

    public int Number
    {
        get { return _number; }
    }

    public T Item
    {
        get { return _; }
    }
}


It is easy to augment types with arbitrary information this way; sequence numbers are just one example. For a general solution, though, you probably wouldn’t want to keep writing these plumbing wrappers like Enumerated<T>. It’s not just that your brain would go numb; you also need something more versatile, something that’s not bound to the specific type of information you’re augmenting with. The task-specific types are an obstacle to a simple, generic and flexible implementation.

A solution is to use the Tuple<T1, T2> type introduced in .NET 4. It’s sort of a compromise, though, and I don’t quite like it. Since it is a generic tuple, the names of the properties are meaningless (Item1 and Item2), and I believe rather firmly that names should be meaningful. However, using the Tuple<T1, T2> type makes it very easy to generalize the augmentation process. Here’s how you could go about it:


public static IEnumerable<Tuple<T, T1>> Augment<T, T1>(this IEnumerable<T> e, Func<T1> aug)
{
    return e.Select(it => Tuple.Create(it, aug()));
}


You can use Augment directly, like so:


foreach (var it in items.Augment(() => Guid.NewGuid()))
{
    Console.WriteLine("{0} => {1}", it.Item2, it.Item1);
}


In this case, I’m augmenting each item with a Guid. Here’s the output:

[Screenshot: console output pairing each item with a freshly generated Guid]

This is convenient for one-off scenarios. If you’re going to augment types the same way multiple times, though, you might go through the trouble of defining some extension methods:


public static IEnumerable<Tuple<T, int>> Enumerate<T>(this IEnumerable<T> e)
{
    int i = 0;
    return Augment(e, () => i++);
}

public static IEnumerable<Tuple<T, DateTime>> WithTimestamps<T>(this IEnumerable<T> e)
{
    return Augment(e, () => DateTime.Now);
}

public static IEnumerable<Tuple<T, Guid>> WithGuids<T>(this IEnumerable<T> e)
{
    return Augment(e, Guid.NewGuid);
}


And so on and so forth, for all your clever augmentation needs.

Then your code would look like this:


foreach (var it in items.WithGuids())
{
    Console.WriteLine("{0} => {1}", it.Item2, it.Item1);
}

Which is pretty neat. If you can stomach Item1 and Item2, that is.


Patching polymorphic pain at runtime

In the last post, we saw that data binding in ASP.NET doesn’t support polymorphism. We also saw that we could mitigate the problem by using simple wrapper types. Writing such wrappers by hand won’t kill you, but it is fairly brain-dead. I mentioned that an alternative would be to generate the wrappers at runtime, using reflection. That actually sounds like a bit of fun, so let’s see how it can be done. If nothing else, it’s a nice introductory lesson in using Reflection.Emit.

Comeback of the canines

As an example, let’s revisit our two four-legged friends and one enemy from the previous post: the Dog, the Chihuahua and the Wolf. They all implement ICanine.

The canines have gained a skill since last time, though – they can now indicate whether or not they’ll enjoy a particular kind of food. The code looks like this:


public enum Food { Biscuit, Meatballs, You }

public interface ICanine
{
    string Bark { get; }
    bool Eats(Food f);
}

public class Wolf : ICanine
{
    public virtual string Bark { get { return "Aooo!"; } }
    public bool Eats(Food f) { return f != Food.Biscuit; }
}

public class Dog : ICanine
{
    public virtual string Bark { get { return "Woof!"; } }
    public virtual bool Eats(Food f) { return f != Food.You; }
}

public class Chihuahua : Dog
{
    public override string Bark { get { return "Arff!"; } }
    public override bool Eats(Food f) { return f == Food.Biscuit; }
}


What we want to do in our web application is display a grid that shows the canine’s eating preferences as well as its bark. This calls for a combination of auto-generated and custom columns: an automatic one for the Bark property, and a custom one for each kind of food.

The DataGrid is declared in the .aspx page:


<asp:DataGrid
    ID="_grid"
    runat="server"
    AutoGenerateColumns="true"
    Font-Size="X-Large"
    Font-Names="Consolas"
    HeaderStyle-BackColor="LightBlue" />


This gives us a column for the Bark out of the box.

In the code-behind, we add a column for each kind of food. We also get a list of canines, which we wrap in a BoxEnumerable<ICanine> before binding to it.


protected void Page_Load(object sender, EventArgs e)
{
    GetGridColumns().ForEach(f => _grid.Columns.Add(f));
    _grid.DataSource = new BoxEnumerable<ICanine>(GetCanines());
    _grid.DataBind();
}

private static List<DataGridColumn> GetGridColumns()
{
    return new List<DataGridColumn>
    {
        new TemplateColumn
        {
            HeaderText = "Biscuits?",
            ItemTemplate = new FoodColumnTemplate(Food.Biscuit)
        },
        new TemplateColumn
        {
            HeaderText = "Meatballs?",
            ItemTemplate = new FoodColumnTemplate(Food.Meatballs)
        },
        new TemplateColumn
        {
            HeaderText = "You?",
            ItemTemplate = new FoodColumnTemplate(Food.You)
        }
    };
}

private static IEnumerable<ICanine> GetCanines()
{
    return new List<ICanine> { new Dog(), new Wolf(), new Chihuahua() };
}


The food preference columns use an ItemTemplate called FoodColumnTemplate. It’s a simple example of data binding which goes beyond mere properties, since we’re invoking a method on the data item:


class FoodColumnTemplate : ITemplate
{
    private readonly Food _food;

    public FoodColumnTemplate(Food food)
    {
        _food = food;
    }

    public void InstantiateIn(Control container)
    {
        var label = new Label();
        label.DataBinding += OnDataBinding;
        container.Controls.Add(label);
    }

    private void OnDataBinding(object sender, EventArgs e)
    {
        var label = (Label) sender;
        var row = (DataGridItem) label.NamingContainer;
        var canine = (ICanine) row.DataItem;
        label.Text = canine.Eats(_food) ? "Yes" : "No";
    }
}

If we run the application, we get the result we wanted:

[Screenshot: the rendered grid, with the Bark column auto-generated and Yes/No columns for each kind of food]

Without the presence of the BoxEnumerable<ICanine> above, though, we’d have a runtime exception on our hands. Under the covers, BoxEnumerable<ICanine> is producing the necessary wrappers around the actual canines to keep the DataGrid happy.

How it works

Let’s see how we can do this. Here’s an overview of the different moving parts:

[Diagram: the moving parts – BoxEnumerable<T>, BoxEnumerator<T>, BoxFactory<T>, EmptyBoxFactory, BoxTypeFactory and Box<T>]

That’s a fair amount of types, but most of them have trivial implementations. Consider BoxEnumerable<T> first:


public class BoxEnumerable<T> : IEnumerable<Box<T>>
{
    private readonly IEnumerable<T> _;

    public BoxEnumerable(IEnumerable<T> e)
    {
        _ = e;
    }

    public IEnumerator<Box<T>> GetEnumerator()
    {
        return new BoxEnumerator<T>(_.GetEnumerator());
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

As you can see, it’s really the simplest possible wrapper around the original IEnumerable<T>, turning it into an IEnumerable<Box<T>>. It relies on another wrapper type, BoxEnumerator<T>:


public class BoxEnumerator<T> : IEnumerator<Box<T>>
{
    private readonly IEnumerator<T> _;
    private readonly BoxFactory<T> _factory = new BoxFactory<T>();

    public BoxEnumerator(IEnumerator<T> e)
    {
        _ = e;
    }

    public void Dispose()
    {
        _.Dispose();
    }

    public bool MoveNext()
    {
        return _.MoveNext();
    }

    public void Reset()
    {
        _.Reset();
    }

    public Box<T> Current
    {
        get { return _factory.Get(_.Current); }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }
}

That too is just a minimal wrapper. The only remotely interesting code is in the Current property, where a BoxFactory<T> is responsible for turning the T instance into a Box<T> instance. BoxFactory<T> looks like this:


public class BoxFactory<T>
{
    private readonly Box<T> _ = EmptyBoxFactory.Instance.CreateEmptyBox<T>();

    public Box<T> Get(T t)
    {
        return _.Create(t);
    }
}


This is short but a little weird, perhaps. For fun, we’re adding a dash of premature optimization here. We’re using EmptyBoxFactory to create an “empty” instance of Box<T> (that is, without an instance of T inside). The BoxFactory<T> holds on to that empty instance for the rest of its lifetime, and uses it to create “populated” boxes. In other words, the initial empty box acts as a prototype for all subsequent boxes. That way, we avoid using reflection more than once to create the boxes. This should make people who fear the performance penalty of reflection a little happier. Let’s see how the prototype creates populated boxes for the factory:


public Box<T> Create(T t)
{
    var box = (Box<T>) MemberwiseClone();
    box._ = t;
    return box;
}


Easy as pie, we’re just cloning and setting the protected T field. Doesn’t get much simpler than that.

It’s time to start worrying about the box itself, though. Of course, this is where things get both non-trivial and interesting.

So the goal is to create a type at runtime. The type should be used to wrap each item in an IEnumerable<T>, so that the control’s DataSource is set to a perfectly homogeneous IEnumerable. That is, it will only contain instances of the same concrete type. The wrapper type won’t have any intelligence of its own, it will merely delegate to the wrapped instance of T.

To support auto-generation of columns, the wrapper type must have the same public properties as T. (We won’t consider the option of masking or renaming properties – that’s a use case that goes beyond just fixing what is broken.) In the case of T being an interface, a viable option would be for the wrapper type to implement T. However, we need the wrapper to work for all kinds of T, including when T is a base class with one or more non-virtual members. In the general case, therefore, the wrapper must simply mimic the same properties, duck typing-style.

Auto-generation of columns is pretty nifty, and a property-mimicking wrapper is sufficient for that scenario. For more sophisticated data binding scenarios, though, you need to be able to call arbitrary methods on the item we’re binding to. To do so in the general case (where T might be a class), we need some way of shedding the wrapper. We can’t simply call the methods on the wrapper itself, since we don’t have access to the name of the dynamically generated wrapper type at compile time. The C# compiler wouldn’t let us (well, we could use dynamic, but then we’re giving up static typing). So we’ll be using an Unwrap method, giving us access to the bare T. (Note that we can’t use a property, since that would show up when auto-generating columns!)

Now how can we call Unwrap if the type doesn’t even exist at compile time? Well, we know that there’s a small set of core capabilities that all wrapper types are going to need: the wrapped instance of T, and a way of wrapping and unwrapping T. So let’s create an abstract base class containing just that:


abstract class Box<T>
{
    protected T _;

    public T Unwrap() { return _; }

    public Box<T> Create(T t)
    {
        // Clone the (empty) prototype and set the wrapped instance.
        var box = (Box<T>) MemberwiseClone();
        box._ = t;
        return box;
    }
}


That way, we can always cast to Box<T>, call Unwrap, and we’re good.
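
In an item template for a class T (say, a plain Dog), the unwrapping would look something like this – hypothetical, mirroring the FoodColumnTemplate from earlier:


// Shed the wrapper to call methods on the bare Dog.
var dog = ((Box<Dog>) row.DataItem).Unwrap();
label.Text = dog.Eats(_food) ? "Yes" : "No";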

Why are we calling it a “box”, by the way? It’s sort of a tip of the hat to academia, of all things. According to this paper on micro patterns, a “box” is “a class which has exactly one, mutable, instance field”. That suits our implementation to a T (hah!) so “box” it is.

The concrete box for our example should conceptually look like this:


public class BoxedICanine : Box<ICanine>, ICanine
{
    public string Bark
    {
        get { return _.Bark; }
    }

    public bool Eats(Food f)
    {
        return _.Eats(f);
    }
}


Of course, the boxes we generate at runtime will never actually have a C# manifestation – they will be bytecode only. At this point though, the hand-written example will prove useful as target for our dynamically generated type.

Note that we’re going to try to be a little clever in our implementation. In the case where T is an interface (like ICanine), we’re going to let the dynamically generated box implement the original interface T, in addition to extending Box<T>. This will allow us to pretend that the box isn’t even there during data binding. You might recall that we’re casting to ICanine rather than calling Unwrap in the FoodColumnTemplate, even though the data item is our dynamically generated type rather than the original canine. Obviously we won’t be able to pull off that trick when T is a class, since C# has single inheritance.

Looking at the bytecode for BoxedICanine in ILDASM, ILSpy or Reflector, you should see something like this (assuming you’re doing a release compilation):


.class public auto ansi beforefieldinit BoxedICanine
       extends PolyFix.Lib.Box`1<class PolyFix.Lib.ICanine>
       implements PolyFix.Lib.ICanine
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        .maxstack 8
        L_0000: ldarg.0
        L_0001: call instance void PolyFix.Lib.Box`1<class PolyFix.Lib.ICanine>::.ctor()
        L_0006: ret
    }

    .method public hidebysig newslot virtual final instance bool Eats(valuetype PolyFix.Lib.Food f) cil managed
    {
        .maxstack 8
        L_0000: ldarg.0
        L_0001: ldfld !0 PolyFix.Lib.Box`1<class PolyFix.Lib.ICanine>::_
        L_0006: ldarg.1
        L_0007: callvirt instance bool PolyFix.Lib.ICanine::Eats(valuetype PolyFix.Lib.Food)
        L_000c: ret
    }

    .method public hidebysig specialname newslot virtual final instance string get_Bark() cil managed
    {
        .maxstack 8
        L_0000: ldarg.0
        L_0001: ldfld !0 PolyFix.Lib.Box`1<class PolyFix.Lib.ICanine>::_
        L_0006: callvirt instance string PolyFix.Lib.ICanine::get_Bark()
        L_000b: ret
    }

    .property instance string Bark
    {
        .get instance string PolyFix.Lib.BoxedICanine::get_Bark()
    }
}


This, then, is what we’re aiming for. If we can generate this type at runtime, using ICanine as input, we’re good.

IL for beginners

If you’re new to IL, here’s a simple walk-through of the get_Bark method. IL is a stack-based language, meaning it uses a stack to transfer state between operations. In addition, state can be written to and read from local variables.

The .maxstack 8 instruction tells the runtime that a stack of eight elements will be sufficient for this method (in reality, the stack will never be more than a single element deep, so eight is strictly overkill). That’s sort of a preamble to the actual instructions, which come next. The ldarg.0 instruction loads argument 0 onto the stack, that is, the first parameter of the method. Now that’s confusing, since get_Bark seems to have no parameters, right? However, all instance methods receive a reference to this as an implicit 0th argument. So ldarg.0 loads the this reference onto the stack. This is necessary to read the _ instance field, which happens in the ldfld !0 instruction that follows. The ldfld !0 pops the this reference from the stack, and pushes the reference held by the 0th field (_) back on. So now we’ve got a reference to an ICanine on there. The following callvirt instruction pops the ICanine reference from the stack and invokes get_Bark on it (passing the reference as the implicit 0th argument, of course). When the method returns, it will have pushed its return value onto the stack. So there will be a reference to a string there. Finally, ret returns from the method, leaving the string reference on the stack as the return value from the method.

If you take a look at the Eats method next, you’ll notice it’s practically identical to get_Bark. That’s because we’re essentially doing the same thing: delegating directly to the underlying T instance referenced by the _ field.

Now, how can we generate stuff like this on the fly?

Creating a type at runtime

As you can see below, a .NET type lives inside a module that lives inside an assembly that lives inside an appdomain.

[Diagram: a type inside a module, inside an assembly, inside an appdomain]

So before we can start generating the actual type, we need to provide the right environment for the type to live in. We only want to create this environment once, so we’ll do it inside the constructor of our singleton EmptyBoxFactory:


private readonly ModuleBuilder _moduleBuilder;

private EmptyBoxFactory()
{
    const string ns = "PolyFix.Boxes";
    _moduleBuilder = Thread.GetDomain()
        .DefineDynamicAssembly(new AssemblyName(ns), AssemblyBuilderAccess.Run)
        .DefineDynamicModule(ns);
}

AssemblyBuilderAccess.Run indicates that we’re creating a transient assembly – it won’t be persisted to disk. We’re holding on to the module builder, which we’ll use when creating types later on. Assuming that we’ll be using the BoxEnumerable<T> in multiple data binding scenarios (for various Ts), the module will be accumulating types over time.

The public API of EmptyBoxFactory is limited to a single method, CreateEmptyBox. It uses reflection to create an instance of the appropriate type.


public Box<T> CreateEmptyBox<T>()
{
    return (Box<T>) Activator.CreateInstance(GetBoxType<T>());
}

Creating the instance is simple enough (albeit slower than newing up objects the conventional way). The real work lies in coming up with the type to instantiate, so we need to move on! GetBoxType<T> looks like this:


private Type GetBoxType<T>()
{
    var t = typeof(T);
    string typeName = t.FullName + "Box";
    foreach (var existingType in _moduleBuilder.GetTypes())
    {
        if (existingType.FullName == typeName)
        {
            return existingType;
        }
    }
    return CreateBoxType(t, typeof(Box<T>), typeName);
}


We’re still treading water, though. Specifically, we’re just checking if the module already contains the suitable box type – meaning that we’ve been down this road before. Assuming we haven’t (and we haven’t, have we?), we’ll go on to CreateBoxType. Hopefully we’ll see something interesting there.


public Type CreateBoxType(Type t, Type boxType, string typeName)
{
    var boxBuilder = _moduleBuilder.DefineType(
        typeName, TypeAttributes.Public, boxType, t.IsInterface ? new[] {t} : new Type[0]);
    var f = boxType.GetField("_", BindingFlags.Instance | BindingFlags.NonPublic);
    return new BoxTypeFactory(t, boxBuilder, f).Create();
}

Oh man, it seems we’re still procrastinating! We haven’t reached the bottom of the rabbit hole just yet. Now we’re preparing for the BoxTypeFactory to create the actual type.

Two things worth noting, though. One thing is that if t is an interface, then we’ll let our new type implement it as mentioned earlier. This will let us pretend that the box isn’t even there during data binding. The other thing is that we’re obtaining a FieldInfo instance to represent the _ field of BoxType<T>, which as you’ll recall holds the instance of T that we’ll be delegating all our method calls and property accesses to. Once we have the FieldInfo, we can actually forget all about BoxType<T>. It’s sort of baked into the TypeBuilder as the superclass of the type we’re creating, but apart from that, BoxTypeFactory is oblivious to it.

But now! Now there’s nowhere left to hide. Let’s take a deep breath, dive in and reflect:


class BoxTypeFactory
{
    private readonly Type _type;
    private readonly TypeBuilder _boxBuilder;
    private readonly FieldInfo _field;
    private readonly Dictionary<string, MethodBuilder> _specials = new Dictionary<string, MethodBuilder>();

    public BoxTypeFactory(Type type, TypeBuilder boxBuilder, FieldInfo field)
    {
        _type = type;
        _boxBuilder = boxBuilder;
        _field = field;
    }

    public Type Create()
    {
        foreach (MethodInfo m in _type.GetMethods())
        {
            if (!IsGetType(m)) CreateProxyMethod(m);
        }
        foreach (PropertyInfo p in _type.GetProperties())
        {
            ConnectPropertyToAccessors(p);
        }
        return _boxBuilder.CreateType();
    }

    private static bool IsGetType(MethodInfo m)
    {
        return m.Name == "GetType" && m.GetParameters().Length == 0;
    }

    private void CreateProxyMethod(MethodInfo m)
    {
        var parameters = m.GetParameters();
        // Create a builder for the current method.
        var methodBuilder = _boxBuilder.DefineMethod(m.Name,
            MethodAttributes.Public | MethodAttributes.Virtual,
            m.ReturnType,
            parameters.Select(p => p.ParameterType).ToArray());
        var gen = methodBuilder.GetILGenerator();
        // Emit opcodes for the method implementation.
        // The method should just delegate to the T instance held by the _ field.
        gen.Emit(OpCodes.Ldarg_0); // Load 'this' reference onto the stack.
        gen.Emit(OpCodes.Ldfld, _field); // Load 'T' reference onto the stack (popping 'this').
        for (int i = 1; i < parameters.Length + 1; i++)
        {
            gen.Emit(OpCodes.Ldarg, i); // Load any method parameters onto the stack.
        }
        gen.Emit(m.IsVirtual ? OpCodes.Callvirt : OpCodes.Call, m); // Call the method.
        gen.Emit(OpCodes.Ret); // Return from method.
        // Keep reference to "special" methods (for wiring up properties later).
        if (m.IsSpecialName)
        {
            _specials[m.Name] = methodBuilder;
        }
    }

    private void ConnectPropertyToAccessors(PropertyInfo p)
    {
        var paramTypes = p.GetIndexParameters().Select(ip => ip.ParameterType).ToArray();
        var pb = _boxBuilder.DefineProperty(p.Name, p.Attributes, p.PropertyType, paramTypes);
        WireUpIfExists("get_" + p.Name, pb.SetGetMethod);
        WireUpIfExists("set_" + p.Name, pb.SetSetMethod);
    }

    private void WireUpIfExists(string accessor, Action<MethodBuilder> wireUp)
    {
        if (_specials.ContainsKey(accessor))
        {
            wireUp(_specials[accessor]);
        }
    }
}

Oh. That’s almost anti-climactic – it’s not really hard at all. The Create method is super-simple: create proxy methods for any public methods in the type we’re wrapping, wire up any properties to the corresponding getter and/or setter methods, and we’re done! CreateProxyMethod seems like it might warrant some explanation; however, all we’re really doing is copying verbatim the IL we looked at in our walkthrough of get_Bark earlier. The wiring up of properties is necessary because a property consists of two parts at the IL level, a .property thing and a .method thing for each accessor. That, too, we saw in the IL of the hand-written class. So there’s really not much to it.

You might note that we’re explicitly not creating a proxy for the GetType method, defined on System.Object. This applies to the case where the type we’re boxing is a class, not an interface. In general, we shouldn’t proxy any non-virtual methods inherited from System.Object, but in practice that’s just GetType. So we’re taking the easy way out. (Note that the .NET runtime wouldn’t actually be fooled if we did inject a lying GetType implementation – it would still reveal the actual type of the object. Still, it’s better to play by the book.)

We will be providing proxies for virtual methods, though (e.g. Equals, GetHashCode and ToString). This makes the box as invisible as possible.
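
So, for instance, ToString on a box around a class should answer just like the instance it wraps. A sketch, reusing the BoxFactory<T> from earlier (the equality should hold, though I haven’t belabored the point):


Dog dog = new Dog();
Box<Dog> box = new BoxFactory<Dog>().Get(dog);
// ToString is virtual, so the generated proxy delegates to the wrapped dog.
Console.WriteLine(box.ToString() == dog.ToString()); // True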

Afterthought: Anonymous types

There’s actually an alternative way of getting around the problem with broken polymorphism in simple scenarios. Rather than hand-writing your own wrapper or generating one at runtime, you can have the C# compiler generate one for you at compile time, using anonymous types. In fact, you can approximate a working solution for our example just by doing this in the code-behind:


protected void Page_Load(object sender, EventArgs e)
{
    _grid.DataSource = GetCanines().Select(
        c => new {
            Biscuit = c.Eats(Food.Biscuit),
            Meatballs = c.Eats(Food.Meatballs),
            You = c.Eats(Food.You),
            c.Bark
        });
    _grid.DataBind();
}


Note that you don’t add any custom columns in this case, it’s all auto-generated. Running the application, you get this:

[Screenshot: the grid rendered from the anonymous types, with Biscuit, Meatballs, You and Bark columns]

It’s not exactly the same as before, but it’s pretty close. Unfortunately, the approach isn’t very flexible – it breaks down as soon as you want to display something that’s not just text in the grid. For instance, say you want something like this:

[Screenshot: the grid showing a drop-down list of accepted foods for each canine]

Anonymous types won’t help you, but the runtime wrapper will (as will a hand-written one, of course). You just need a suitable ITemplate:


public class FoodListColumnTemplate : ITemplate
{
    public void InstantiateIn(Control container)
    {
        var list = new DropDownList();
        list.DataBinding += OnDataBinding;
        container.Controls.Add(list);
    }

    private void OnDataBinding(object sender, EventArgs e)
    {
        var list = (DropDownList) sender;
        var row = (DataGridItem) list.NamingContainer;
        var canine = (ICanine) row.DataItem;
        Action<Food> add = food => { if (canine.Eats(food)) { list.Items.Add(food.ToString()); } };
        add(Food.Biscuit);
        add(Food.Meatballs);
        add(Food.You);
    }
}

So…

Turns out that generating types at runtime is no big deal. It provides a flexible solution to the data binding problem, without the need for mindless hand-written wrappers.

As usual, let me know if you think there’s something wrong with the approach or the implementation. Also, I’d love to hear it if you have a different solution to the problem.