Optimize your delegate usage

Posted by Filip Ekberg on February 15, 2013 (19 comments)

Kudos to David Fowler for spotting this! We had a chat on JabbR and David pointed out something quite odd about delegates which he had discovered while optimizing some code.

Let’s assume that we have the following code that declares a delegate and a method that uses it:

public delegate void TestDelegate();

public void Bar(TestDelegate test)
{
    test();
}

Now consider that you want to call Bar and pass it a method that matches the delegate's signature. The call will sit inside a loop that runs for 10 000 iterations.

The method we want to run is called Foo and looks like the following:

public void Foo() { }

Everything is set up, so what is there to optimize when calling this 10 000 times? Well, there are two different ways of passing the method as a delegate.

Option 1
The first option is to wrap the call in a lambda expression, like this:

for (var i = 0; i < 10000; i++)
{
    Bar(() => Foo());
}

If we compile this and open it up in Reflector, we can see what is generated. There is some other stuff generated behind the scenes as well, but this is the important part:

TestDelegate test = null;
for (int i = 0; i < 0x2710; i++)
{
    if (test == null)
    {
        test = () => this.Foo();
    }
    this.Bar(test);
}

Looks good so far, right? Let’s take a look at Option 2 and compare.

Option 2
The second option is to just write the method name (a method group) and let the compiler convert it to the delegate, as you can see here:

for (var i = 0; i < 10000; i++)
{
    Bar(Foo);
}

This one is quite common and I’ve seen it used a lot, but what happens behind the scenes here?

If we open this up in Reflector we can see that the following code was generated:

for (int i = 0; i < 0x2710; i++)
{
    this.Bar(new TestDelegate(this.Foo));
}

This is significantly different from the lambda one! Is your mind blown yet?

OK, let me break it down; it’s quite simple. Option 2 creates 10 000 instances of TestDelegate, one per iteration, and thus uses a lot more memory. So the lambda version was optimized, but the “normal” one wasn’t.
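One way to see the per-iteration instances without a decompiler is to count how many distinct delegate objects Bar receives. The following is only a minimal sketch, separate from the post's own test program: InstanceCounter, ByReference, MethodGroupInLoop and HoistedDelegate are hypothetical names introduced for illustration. The count for the method group loop reflects how current compilers are generally observed to treat an instance method, so treat it as an observation rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public delegate void TestDelegate();

// Compares delegates by object identity. Delegate.Equals compares target and
// method, so two separate instances wrapping Foo would otherwise look equal.
public sealed class ByReference : IEqualityComparer<Delegate>
{
    public bool Equals(Delegate x, Delegate y) { return ReferenceEquals(x, y); }
    public int GetHashCode(Delegate d) { return RuntimeHelpers.GetHashCode(d); }
}

public class InstanceCounter
{
    // Holds each distinct delegate object that was handed to Bar.
    private readonly HashSet<Delegate> seen = new HashSet<Delegate>(new ByReference());

    public int DistinctInstances { get { return seen.Count; } }

    public void Bar(TestDelegate test)
    {
        seen.Add(test);
        test();
    }

    public void Foo() { }

    // Option 2 from the post: a method group conversion on every iteration.
    public void MethodGroupInLoop(int iterations)
    {
        for (var i = 0; i < iterations; i++)
        {
            Bar(Foo);
        }
    }

    // Hypothetical alternative: convert once, then reuse the same instance.
    public void HoistedDelegate(int iterations)
    {
        TestDelegate cached = Foo;
        for (var i = 0; i < iterations; i++)
        {
            Bar(cached);
        }
    }
}

public static class InstanceCounterDemo
{
    public static void Main()
    {
        var perIteration = new InstanceCounter();
        perIteration.MethodGroupInLoop(10000);
        Console.WriteLine("Method group in loop: {0} distinct instances", perIteration.DistinctInstances);

        var hoisted = new InstanceCounter();
        hoisted.HoistedDelegate(10000);
        Console.WriteLine("Hoisted delegate:     {0} distinct instances", hoisted.DistinctInstances);
    }
}
```

Creating the delegate once before the loop does not rely on any compiler optimization, which makes it a reasonable fallback when the allocation actually matters.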

Let’s just verify that it actually does use a lot more memory! I’ve set the solution to compile in Release mode with Optimization turned on and I’m using the following code to test it:

using System;

public class Program
{
    public delegate void TestDelegate();

    public void Bar(TestDelegate test)
    {
        test();
    }
    public void Foo()
    { }

    public static void Main()
    {
        var program = new Program();
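        // Returns the status of a registered full-GC notification (with a timeout); it does not itself force a collection.
        GC.WaitForFullGCComplete(100000);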
        Console.WriteLine("Memory usage before Lambda version:\t{0}", GC.GetTotalMemory(false));

        program.LambdaVersion();
        Console.WriteLine("Memory usage After Lambda version:\t{0}", GC.GetTotalMemory(false));

        GC.WaitForFullGCComplete(100000);
        Console.WriteLine("Memory usage before Normal version:\t{0}", GC.GetTotalMemory(false));

        program.NormalVersion();
        Console.WriteLine("Memory usage After Normal version:\t{0}", GC.GetTotalMemory(false));

    }
    public void LambdaVersion()
    {
        for (var i = 0; i < 10000; i++)
        {
            Bar(() => Foo());
        }
    }

    public void NormalVersion()
    {
        for (var i = 0; i < 10000; i++)
        {
            Bar(Foo);
        }
    }
}

Here’s the result from that operation (in bytes):

Memory usage before Lambda version:     29460
Memory usage After Lambda version:      37652
Memory usage before Normal version:     37652
Memory usage After Normal version:      357140
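The numbers above are heap sizes reported by GC.GetTotalMemory. As a rough cross-check, allocations can also be counted directly on runtimes that provide GC.GetAllocatedBytesForCurrentThread. That is an assumption about the runtime: the post itself does not use this API, and older frameworks lack it. The sketch below compares the method group loop with a delegate created once before the loop. AllocationProbe, AllocatedBy, MethodGroupInLoop and HoistedDelegate are hypothetical names, and the exact byte counts will vary by runtime and platform.

```csharp
using System;
using System.Runtime.CompilerServices;

public delegate void TestDelegate();

public class AllocationProbe
{
    // NoInlining keeps the JIT from folding Bar into the loops below, so the
    // delegate is passed to a real call in both variants.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Bar(TestDelegate test)
    {
        test();
    }

    public void Foo() { }

    // Option 2 from the post: a method group conversion on every iteration.
    public void MethodGroupInLoop(int iterations)
    {
        for (var i = 0; i < iterations; i++)
        {
            Bar(Foo);
        }
    }

    // Hypothetical alternative: create the delegate once and reuse it.
    public void HoistedDelegate(int iterations)
    {
        TestDelegate cached = Foo;
        for (var i = 0; i < iterations; i++)
        {
            Bar(cached);
        }
    }

    // Returns the bytes allocated on the current thread while action runs.
    public static long AllocatedBy(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        var probe = new AllocationProbe();

        // Run each variant once first so one-time setup happens before measuring.
        probe.MethodGroupInLoop(1);
        probe.HoistedDelegate(1);

        long perIteration = AllocatedBy(() => probe.MethodGroupInLoop(10000));
        long hoisted = AllocatedBy(() => probe.HoistedDelegate(10000));

        Console.WriteLine("Allocated by method group in loop:\t{0}", perIteration);
        Console.WriteLine("Allocated by hoisted delegate:\t{0}", hoisted);
    }
}
```

Because this counts bytes allocated rather than current heap size, the result does not depend on whether a collection happened between the two measurements.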

Conclusion

If we use delegates “wrong”, or don’t think about what code is actually generated, we can end up with a large memory footprint. Of course you always need to think about the code you write, but in some cases you might not really know what the compiler ends up doing.

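By using the lambda version instead in this case, we’ve avoided creating a lot of new delegate instances and thus minimized the memory footprint. In the run above, the heap grew by 8 192 bytes during the lambda version and by 319 488 bytes during the “normal” version, which is roughly 32 bytes for each of the 10 000 iterations.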

Fun fact: if we compile the “normal version” using MonoDevelop and Mono (2.10.9), the result is the same, which leads me to think that this is by design. The lambda version compiles slightly differently under Mono, but not in any way that changes the behavior.

Would you say this is a bug or a feature? Did you know it behaved like this?

19 Responses to Optimize your delegate usage

  1. Moe says:

    This is a very interesting behavior and a good thing to know!

  2. Johan van der Vleuten says:

    ReSharper advises converting your option 1 (anonymous method) into option 2 (method group).

    See image: ReSharper giving wrong advice

    https://twitter.com/JohanVdVleuten/status/302417574568804352

  3. Giacomo Stelluti Scala says:

    Good to know.
    Anyway, that should be addressed by the compiler, and btw I hope it will be.

  4. Pingback: Dew Drop – February 18, 2013 (#1,500) | Alvin Ashcraft's Morning Dew

  5. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1298

  6. Damien says:

    It seems like an obvious improvement here would be for it to create the delegate once, outside the loop. However, that could subtly change some programs. You might think it’s safe since delegates are immutable, but there’s a subtle issue. Bar() might keep a reference to the delegate, and something might later lock() on it. Suddenly, the distinction between there being one instance or thousands matters.

  7. John Atten says:

    Great post!

    This is very interesting behavior. Does anyone know if there is a good reason for the method group version not to be optimized in the same manner as the anonymous version? You would think this type of thing would be a no-brainer for the compiler?

  8. Neal Gafter says:

    At the moment Roslyn duplicates the behavior of the native compiler (i.e. it has an optimization for the lambda case but not the “normal” case). Optimizing the “normal” version is on the list of things we’d like to do.

  9. NOtherDev says:

    Note that the issue is valid only when calling the lambda or method group in a loop, which is not always the case. For single calls, there’s no benefit from using a lambda, and some smart people like Eric Lippert suggest using a method group by default, as it avoids one unnecessary level of indirection.

  10. thargy says:

    Put another way this is one of the (many) reasons you should be careful not to ‘leave the monad’.

    Essentially you are mixing a structural paradigm (for) with a functional one (the lambda).

    You can stay/start in the monad by using Enumerable.Range instead of the for. Essentially, whenever you do a for loop you are implying that the contents should be repeated (so it is effectively by design). You are creating a delegate in the body, so the creation will be repeated.

    If you use Enumerable.Range you will be acting on a set of data by applying the function to each item, so only the application is repeated…

  11. Chris Marisic says:

    I will still do Bar(Foo); each and every single time.

    Even if that’s a billion times slower than Bar(x=> Foo(x)), I still expect I will have other performance concerns before changing that will matter in the slightest.

  12. Pingback: Weekly Digest 2 | chodounsky

  13. Jerome says:

    Interestingly, the lambda storage optimization does not apply when using a foreach loop… (using C# 5.0/.NET 4.5 RTM compiler)

  14. Pingback: Линкблог #15

  15. Filip Ekberg says:

    John, it seems this is according to the specification. But as Neal mentioned, this is something that the compiler team wants to fix.

  16. Daniel Rose says:

    The differences between the two caused my application’s code to break. I wanted to invoke a method on the UI thread. When I changed from lambda syntax to method group, I got an “impossible” cross-thread exception (after all, I was invoking on the correct thread). The reason was that getting the delegate is done on the target thread in the first case, but on the calling thread in the second case.

    See http://stackoverflow.com/questions/8329026/using-c-sharp-method-group-executes-code

  17. JaredPar says:

    This method group conversion behavior is actually guaranteed by the C# Language spec in section 6.6

    A new instance of the delegate type D is allocated. If there is not enough memory available to allocate the new instance, a System.OutOfMemoryException is thrown and no further steps are executed.

    The spec makes no such mention of lambdas though which is why they can be cached.

  18. Joel Lucsy says:

    If you look at the IL for the () => Foo() version it still creates a TestDelegate.
