Rich Newman

August 6, 2011

Closures in C#

Filed under: .net, c#, closure — Tags: , , — richnewman @ 3:08 am

Introduction

There seems to be some confusion about closures in C#: people are mystified as to what they are and there’s even an implication that they don’t work the way you’d expect.

As this short article will explain they are actually quite simple, and do work the way you’d expect if you’re an object oriented programmer.  The article will also briefly look at why there is confusion surrounding them, and discuss whether they are an appropriate tool in an object oriented program.

Closures

Wikipedia defines a closure as ‘a first-class function with free variables that can be bound in the lexical environment’.  What that means is that a ‘closure’ is a function that can access variables from the environment where it is declared without them being explicitly passed in as parameters.  In object-oriented programming a method in a class is a closure of sorts: it can access fields of the class directly without needing them to be passed in.  However, more usually in C# ‘closure’ refers to a function declared as an anonymous delegate that uses variables that are not explicitly passed into it, but are available at the point it is created.

Basic Example

Consider the code below:

        internal void MyFunction()
         {
             int x = 1;
             Action action = () => { x++; Console.WriteLine(x); };
             action();              // Outputs '2'
         }

Here we define an anonymous delegate (a function) ‘action’, and call it.  This increments x and outputs it to the console, even though x isn’t passed into it.  x is simply declared in the ‘lexical environment’ (the calling method).

‘action’ is a closure, and we can say it is ‘closed over’ x, and that x is an ‘upvalue’.  Note that it’s only a closure because of the way it uses x.  Not all anonymous delegates are closures.

By the way, in the Java community anonymous functions frequently are referred to as ‘closures’.  (The Java community has been debating whether ‘closures’ should be added to the language for some time.)

Where’s the Confusion?

The example above is pretty clear and simple.  So how has it caused confusion?

The answer is that the value of x in MyFunction after the ‘action’ call is now 2.  Furthermore, x is completely shared between MyFunction and the action delegate: any code that changes the value in one changes the value in the other.

Consider the code below:

        internal void MyFunction()
         {
             int x = 1;
             Action action = () => { x++; Console.WriteLine(x); };
             action();              // Outputs '2'
             x++;
             action();              // Outputs '4'
         }

Here we call ‘action’, increment x in the calling method (MyFunction), and then call ‘action’ again.  Overall we started with x at 1, and incremented it 3 times, twice in our ‘action’ delegate and once in the calling method.  So it’s no surprise that the shared variable ends up with a value of 4.

This shows that we can change our ‘upvalue’ in the calling method and it is then reflected in our next call to the function: x is genuinely shared in this example.

Whilst this is a little odd (see below) it’s perfectly logical: x is shared between the calling method and any calls to our ‘action’ function.  There’s only one version of x.

This isn’t the way closures work in most functional programming languages (not that you could easily implement the example above, since all variables are immutable in functional languages).  The concept of closures has come from functional languages, so many people are surprised to find them working this way in C#.  There is more on this later.

Closures and Scope

This becomes even odder if we allow the local variable x to go out of scope (which would usually lead to it being destroyed), but retain a reference to the delegate that uses it:

        internal void Run()
        {
             Action action = MyFunction();
             action();             // Outputs '5'
        }

        internal Action MyFunction()
        {
             int x = 1;
             Action action = () => { x++; Console.WriteLine(x); };
             action();              // Outputs '2'
             x++;
             action();              // Outputs '4'
             return action;
        }

Here our Run method retrieves the action delegate from MyFunction and calls it.  When it calls it the value of x is 4 (from the activity in MyFunction), so it increments that and outputs 5.  At this point MyFunction is out of scope so the local variable x would normally have been destroyed.

Again, logically this is what we’d expect, but it looks strange.

This also gives an indication that closures are hard to implement.

A Little Odd?

For an object-oriented programmer this is a little odd.  This is because we’re not used to being able to share local variables with a separate method in this way.

Normally if we want a method to act on a local variable we have to pass it in as a parameter.  If the local variable is a value type as above it gets copied, and changing it in the method will not affect its value in the calling method.  So if we wanted to use it in the calling method we’d have to pass it back explicitly as a return value or an output parameter.

Of course, there are good reasons why we don’t usually allow a method to access any variable in a calling method (quite apart from the practicalities of actually being able to do it with methods other than anonymous functions).  These are to do with encapsulation and ensuring we can maintain state in a way that’s easy to deal with.  We only really allow data to be shared at a class level, or globally if we’re using static variables, although in general we try to keep them to a minimum.

So in some ways closures of this kind break our usual object-oriented encapsulation.  My feeling is that they should be used sparingly in regular object-oriented code as a result.

Other writers have gone further than this, because if you don’t understand that an upvalue is fully shared between the anonymous function and the calling code you can get unexpected behaviour.  See, for example, this article ‘Closures in C# Can Be Evil’.

Conclusion

The concepts behind closures in C# are actually fairly straightforward.  However, if we use them it’s important we understand them and the effects on scope, or we may get behaviour we don’t expect.

Advertisements

1 Comment »

  1. […] a local variable in the calling method without explicitly passing it into our new method.  We can form a closure.  Since in this example we don’t need to pass our string in as a parameter we use a […]

    Pingback by Delegate Syntax in C# for Beginners « Rich Newman — February 7, 2012 @ 3:49 am


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: