In the previous post we looked at the documentation that comes with Roslyn and how to create your first code analyzer. Now let’s take this a step further, refactor the code, and look for more errors. Start off by creating a new solution. Don’t worry, we’re going to reuse bits of the code from the previous post, but in a refactored manner!
I’ll call my project FECodeAnalyzer.

Then I am going to rename the CodeIssueProvider to LocalDeclarationInspection and remove the body of GetIssues and replace it with return null;

Now we are ready to start thinking about how the analyzer should work. I want my GetIssues method to ask other members of the class whether certain criteria are met. In this post I’ll look at the following two:
- Can the variable be a constant value instead?
- Is the variable used somewhere in the context?
The first one is what we looked at in the previous post, but I want to make some changes and break it out of the GetIssues method. First of all, there’s one thing that I mentioned in the previous post: how we determine that the node actually is a LocalDeclarationStatementSyntax. I added an if statement to check that the type was what I wanted, but it is much faster to pass the type to the class attribute ExportSyntaxNodeCodeIssueProvider instead, like this:
[ExportSyntaxNodeCodeIssueProvider("FECodeAnalyzer",
    LanguageNames.CSharp,
    typeof(LocalDeclarationStatementSyntax))]
Since I want to delegate the work from GetIssues and check whether the current node meets all the criteria that I require, I want to initialize a few variables up front and pass them to the methods. So this is what I want to limit GetIssues to:
- Have a List<CodeIssue> that we want to fill with errors
- Load the semantic model so we don’t need to pass the document around
- Load the containing block and return null if there is none
- Measure the analysis bounds
- Create a data flow analysis, we don’t want to do this over and over again.
So this will look somewhat like this:
List<CodeIssue> issues = new List<CodeIssue>();
var localDeclaration = (LocalDeclarationStatementSyntax)node;
var semanticModel = document.GetSemanticModel();
var containingBlock = localDeclaration.FirstAncestorOrSelf<BlockSyntax>();
if (containingBlock == null) return issues;
var analysisBounds = TextSpan.FromBounds(
    start: localDeclaration.Span.End,
    end: containingBlock.Span.End);
var dataFlowAnalysis = semanticModel.AnalyzeRegionDataFlow(analysisBounds);
A side note here is that Roslyn doesn’t support yield return yet, which is why I declare and initialize a List of issues to return. The next thing I want to create is a method that checks whether the variable can be made constant. This is my method signature for it:
private bool CanBeConst(LocalDeclarationStatementSyntax localDeclaration,
ISemanticModel semanticModel,
IRegionDataFlowAnalysis dataFlowAnalysis)
I can take the code from the previous sample almost unchanged: instead of returning null I return false, and the last return is true instead of a CodeIssue object. Like this:
private bool CanBeConst(LocalDeclarationStatementSyntax localDeclaration,
    ISemanticModel semanticModel,
    IRegionDataFlowAnalysis dataFlowAnalysis)
{
    if (localDeclaration.Modifiers.Any(SyntaxKind.ConstKeyword)) return false;
    if (localDeclaration.Declaration.Variables.Any(v => v.InitializerOpt == null)) return false;
    if (localDeclaration.Declaration.Variables
        .Select(v => semanticModel.GetDeclaredSymbol(v))
        .Any(s => dataFlowAnalysis.WrittenInside.Contains(s)))
    {
        return false;
    }
    if (localDeclaration.Declaration.Variables
        .Select(v => v.InitializerOpt.Value)
        .Select(i => semanticModel.GetSemanticInfo(i))
        .Any(info => !info.IsCompileTimeConstant))
    {
        return false;
    }
    return true;
}
This means that I can write something like this in GetIssues:
if (CanBeConst(localDeclaration, semanticModel, dataFlowAnalysis))
{
    issues.Add(new CodeIssue(CodeIssue.Severity.Warning,
        localDeclaration.Span,
        string.Format("{0} can be made constant",
            localDeclaration.Declaration.Variables.First().Identifier)));
}
This is pretty similar to the previous post, except that we are not cluttering GetIssues with a lot of code. I can also add more issues to the list now! So next up I want to check if the variable is used somewhere in the context, the context being the analysis bounds.
Consider this code where x is unused:
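The original snippet is not reproduced here, so the following is only a hypothetical stand-in of the kind of code meant. The class and method names (`UnusedVariableSketch`, `Compute`) are my own and do not come from the post.

```csharp
using System;

static class UnusedVariableSketch
{
    // Hypothetical method: x is assigned once and never read afterwards,
    // while y is read by the return statement.
    public static int Compute()
    {
        int x = 10; // declared and written, but never read
        int y = 20; // read below, so it counts as used
        return y * 2;
    }

    static void Main()
    {
        Console.WriteLine(Compute());
    }
}
```

Both locals are initialized with compile-time constants and never reassigned, so an analyzer like the one in this post would presumably flag both as candidates for const, and x additionally as unused.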
As long as x is never read anywhere, it is considered unused! So we can ask our data flow analysis for all reads inside the analysis bounds and search them for the local declaration’s variable. This can be done like this:
dataFlowAnalysis.ReadInside.Contains(
semanticModel.GetDeclaredSymbol(localDeclaration.Declaration.Variables.First()))
So the method that I want to have for checking for unused variables can look somewhat like this:
public bool IsNeverUsed(LocalDeclarationStatementSyntax localDeclaration,
    ISemanticModel semanticModel,
    IRegionDataFlowAnalysis dataFlowAnalysis)
{
    if (dataFlowAnalysis.ReadInside.Contains(
        semanticModel.GetDeclaredSymbol(localDeclaration.Declaration.Variables.First())))
        return false;
    return true;
}
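For readers who no longer have the CTP installed, here is a rough sketch of the same "is it read after the declaration?" check. It assumes the current Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP API used in this post, so it is a swapped-in equivalent and not the code from the analyzer. The helper name `IsReadAfterDeclaration` and the sample source are hypothetical.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class UnusedLocalSketch
{
    // Returns true if the named local is read by any statement that follows
    // its declaration inside the same block.
    public static bool IsReadAfterDeclaration(string source, string variableName)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "Sketch",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        // Find the declaration statement that declares the variable.
        var declaration = tree.GetRoot().DescendantNodes()
            .OfType<LocalDeclarationStatementSyntax>()
            .First(d => d.Declaration.Variables.Any(v => v.Identifier.Text == variableName));
        var declarator = declaration.Declaration.Variables
            .First(v => v.Identifier.Text == variableName);

        var block = declaration.Parent as BlockSyntax;
        if (block == null) return false;

        // The region to analyze: every statement after the declaration.
        var following = block.Statements
            .SkipWhile(s => s != declaration)
            .Skip(1)
            .ToList();
        if (following.Count == 0) return false;

        var symbol = model.GetDeclaredSymbol(declarator);
        var flow = model.AnalyzeDataFlow(following.First(), following.Last());
        return flow.Succeeded && flow.ReadInside.Contains(symbol);
    }

    static void Main()
    {
        const string source =
            "class C { int M() { int x = 10; int y = 20; int z = y + 1; return z; } }";
        Console.WriteLine(IsReadAfterDeclaration(source, "x"));
        Console.WriteLine(IsReadAfterDeclaration(source, "y"));
    }
}
```

Starting the region at the statement after the declaration mirrors the analysis bounds used in the post, which begin at the end of the declaration's span.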
Which means that in my GetIssues method I can add the following to check if the node is unused and then add an error message:
if (IsNeverUsed(localDeclaration, semanticModel, dataFlowAnalysis))
{
    issues.Add(new CodeIssue(CodeIssue.Severity.Warning,
        localDeclaration.Span,
        string.Format("Variable {0} is declared but never used",
            localDeclaration.Declaration.Variables.First().Identifier)));
}
As you can see this is quite modular at the moment, we can add more of these checks to correspond with our rules! The last thing we do in the GetIssues method is to return the issue list. So all the methods will end up looking like this:
private bool CanBeConst(LocalDeclarationStatementSyntax localDeclaration,
    ISemanticModel semanticModel,
    IRegionDataFlowAnalysis dataFlowAnalysis)
{
    if (localDeclaration.Modifiers.Any(SyntaxKind.ConstKeyword)) return false;
    if (localDeclaration.Declaration.Variables.Any(v => v.InitializerOpt == null)) return false;
    if (localDeclaration.Declaration.Variables
        .Select(v => semanticModel.GetDeclaredSymbol(v))
        .Any(s => dataFlowAnalysis.WrittenInside.Contains(s)))
    {
        return false;
    }
    if (localDeclaration.Declaration.Variables
        .Select(v => v.InitializerOpt.Value)
        .Select(i => semanticModel.GetSemanticInfo(i))
        .Any(info => !info.IsCompileTimeConstant))
    {
        return false;
    }
    return true;
}

public bool IsNeverUsed(LocalDeclarationStatementSyntax localDeclaration,
    ISemanticModel semanticModel,
    IRegionDataFlowAnalysis dataFlowAnalysis)
{
    if (dataFlowAnalysis.ReadInside.Contains(
        semanticModel.GetDeclaredSymbol(localDeclaration.Declaration.Variables.First())))
        return false;
    return true;
}

public IEnumerable<CodeIssue> GetIssues(IDocument document,
    CommonSyntaxNode node,
    CancellationToken cancellationToken)
{
    List<CodeIssue> issues = new List<CodeIssue>();
    var localDeclaration = (LocalDeclarationStatementSyntax)node;
    var semanticModel = document.GetSemanticModel();
    var containingBlock = localDeclaration.FirstAncestorOrSelf<BlockSyntax>();
    if (containingBlock == null) return issues;
    var analysisBounds = TextSpan.FromBounds(
        start: localDeclaration.Span.End,
        end: containingBlock.Span.End);
    var dataFlowAnalysis = semanticModel.AnalyzeRegionDataFlow(analysisBounds);
    if (CanBeConst(localDeclaration, semanticModel, dataFlowAnalysis))
    {
        issues.Add(new CodeIssue(CodeIssue.Severity.Warning,
            localDeclaration.Span,
            string.Format("{0} can be made constant",
                localDeclaration.Declaration.Variables.First().Identifier)));
    }
    if (IsNeverUsed(localDeclaration, semanticModel, dataFlowAnalysis))
    {
        issues.Add(new CodeIssue(CodeIssue.Severity.Warning,
            localDeclaration.Span,
            string.Format("Variable {0} is declared but never used",
                localDeclaration.Declaration.Variables.First().Identifier)));
    }
    return issues;
}
And when debugging this (by pressing F5), we can see the warnings like this in a console application when the variable is both unused and can be made constant:

And it looks like this when the variable cannot be made constant but is unused:

I hope you found this interesting, if you have any thoughts please leave a comment below!
Creating a basic code analysis with Roslyn
Posted by Filip Ekberg on October 23 2011
If you’ve installed the Roslyn CTP, you can go to the installation folder and look inside the Documentation folder, there’s a lot of interesting information here that you can make use of. I’ve got my documentation here:
C:\Program Files (x86)\Microsoft Codename Roslyn CTP\Documentation
Now there’s one document here that is especially interesting, at least for me: it talks about how we can do basic code analysis with Roslyn (How to Write a Quick Fix (CSharp).docx). The basic idea is to identify whenever a variable can be made const. So for those of you that haven’t had the time to download and install Roslyn yet, I’ll show you how to do exactly that with the help of their sample. It’s essentially the same outcome and code as they use in their documentation, but I will try to explain a little bit more about each piece and add some extra things as well. But be sure to check out the documentation that comes with Roslyn as well!
However, the sample in the document has an error to it so it doesn’t run out of the box!
First thing is to open up an instance of Visual Studio and create a new Code Issue project, I’ll call it MyFirstCodeIssueFix

This project comes with some code already so that you can get started, but we’re going to start looking at this from the beginning so let’s remove everything in
public IEnumerable<CodeIssue> GetIssues(IDocument document,
CommonSyntaxNode node,
CancellationToken cancellationToken)
Don’t confuse it with the other overload, where the second argument is a CommonSyntaxToken and not a CommonSyntaxNode!
The first thing we have to do is check that the node is what we expect it to be:
if (node.GetType() != typeof(LocalDeclarationStatementSyntax)) return null;
There is another way to do this though which is what they use in the documentation, in their example they restrict the entire class to only work with LocalDeclarationStatementSyntax like this:
[ExportSyntaxNodeCodeIssueProvider("MyFirstCodeIssueFix",
    LanguageNames.CSharp,
    typeof(LocalDeclarationStatementSyntax))]
Using this will make it a bit faster, but for clarity I will not use it now. However, when you have a lot of analyses going on you might want to break it all out and have one statement syntax per file. For instance, don’t analyze using blocks and variables in the same file.
The reason we check for LocalDeclarationStatementSyntax is that we want to see if it is a local variable. This method will be invoked for each node in the source that we are analyzing, so we will see UsingDirectiveSyntax among a lot of others.
The next two things that we are going to do is to cast the node parameter to its correct type and then check if it is already a constant type, if it is we don’t need to do anything at all with it
var localDeclaration = (LocalDeclarationStatementSyntax)node;
if (localDeclaration.Modifiers.Any(SyntaxKind.ConstKeyword)) return null;
Next up we will check if we can actually retrieve the code block surrounding the variable, so that we can actually analyze this block later on. Then we check if the variable actually has an initializer
var containingBlock = localDeclaration.FirstAncestorOrSelf<BlockSyntax>();
if (containingBlock == null) return null;
if (localDeclaration.Declaration.Variables.Any(v => v.InitializerOpt == null)) return null;
Now we want to get the semantic model and this is fetched from the document argument that is passed to the method:
var semanticModel = document.GetSemanticModel();
When we have the semantic model, we can get a little bit of information from it, in this case we want to see if the variable is initialized with a constant expression. This is done by selecting the actual value of the variable and see if the value is a compile time constant
if (localDeclaration.Declaration.Variables
.Select(v => v.InitializerOpt.Value)
.Select(i => semanticModel.GetSemanticInfo(i))
.Any(info => !info.IsCompileTimeConstant))
{
return null;
}
The next thing which is almost the last thing, is that we want to check if the variable is set to another value later down in the code, so to do this we need to analyze the code block after the current variable to see if it occurs more than once.
So we define the bounds for where we want to analyze:
var analysisBounds = TextSpan.FromBounds(
start: localDeclaration.Span.End,
end: containingBlock.Span.End);
Note that if you were to set localDeclaration.Span.Start instead, we would include the current variable in the check and thus always have a true statement for our next test! So now we can create a data flow analyzer for this and search through it for any new occurrences of the variable like this:
var dataFlowAnalysis = semanticModel.AnalyzeRegionDataFlow(analysisBounds);
if (localDeclaration.Declaration.Variables
.Select(v => semanticModel.GetDeclaredSymbol(v))
.Any(s => dataFlowAnalysis.WrittenInside.Contains(s)))
{
return null;
}
So by now we’ve completed the check and if we’ve come this far, there is an error, the variable can be made constant, so what we do now is we return an error saying what is wrong
return new[]
{
    new CodeIssue(CodeIssue.Severity.Warning, localDeclaration.Span,
        string.Format("{0} can be made constant",
            localDeclaration.Declaration.Variables.First().Identifier))
};
How do we test this bad boy? If you press F5 you’ll get a new instance of Visual Studio 2010, which is exactly what we want. Now create a console application in this new instance and write the following in the main method:
And this is what you should see:

Here’s the entire GetIssues method:
public IEnumerable<CodeIssue> GetIssues(IDocument document,
    CommonSyntaxNode node,
    CancellationToken cancellationToken)
{
    if (node.GetType() != typeof(LocalDeclarationStatementSyntax)) return null;
    var localDeclaration = (LocalDeclarationStatementSyntax)node;
    if (localDeclaration.Modifiers.Any(SyntaxKind.ConstKeyword)) return null;
    var containingBlock = localDeclaration.FirstAncestorOrSelf<BlockSyntax>();
    if (containingBlock == null) return null;
    if (localDeclaration.Declaration.Variables.Any(v => v.InitializerOpt == null)) return null;
    var semanticModel = document.GetSemanticModel();
    if (localDeclaration.Declaration.Variables
        .Select(v => v.InitializerOpt.Value)
        .Select(i => semanticModel.GetSemanticInfo(i))
        .Any(info => !info.IsCompileTimeConstant))
    {
        return null;
    }
    var analysisBounds = TextSpan.FromBounds(
        start: localDeclaration.Span.End,
        end: containingBlock.Span.End);
    var dataFlowAnalysis = semanticModel.AnalyzeRegionDataFlow(analysisBounds);
    if (localDeclaration.Declaration.Variables
        .Select(v => semanticModel.GetDeclaredSymbol(v))
        .Any(s => dataFlowAnalysis.WrittenInside.Contains(s)))
    {
        return null;
    }
    return new[]
    {
        new CodeIssue(CodeIssue.Severity.Warning, localDeclaration.Span,
            string.Format("{0} can be made constant",
                localDeclaration.Declaration.Variables.First().Identifier))
    };
}
I hope you found this interesting, if you have any thoughts please leave a comment below!
Getting all methods from a code file with Roslyn
Posted by Filip Ekberg on October 21 2011
In the previous post we started looking at Roslyn and let’s continue on this topic and see what else we can get out of it! I want to take a look at how we can retrieve all methods and get some information about them. I’ve added another method to the Person-class so it looks like this now:
public class Person
{
    public string Name { get; private set; }
    public Person(string name)
    {
        Name = name;
    }
    public void Evaporate()
    {
    }
    public string Speak()
    {
        string str = "test";
        return string.Format("Hello! My name is{0}",
            Name);
    }
}
We’ve already got the tree-structure and the root node so let’s just use that. Everything is represented as a SyntaxNode so we need to get all the descending nodes that are methods, methods are declared as MethodDeclarationSyntax. So all methods are retrieved like this:
IEnumerable<MethodDeclarationSyntax> methods = tree.Root
.DescendentNodes()
.OfType<MethodDeclarationSyntax>().ToList();
Now we can just iterate over this:
foreach(var method in methods)
{
}
However, you might be a bit confused as to how you print the method name, because there’s not a Name-property on the object! Instead there is something called an Identifier that we can use:
foreach(var method in methods)
{
Console.WriteLine(method.Identifier);
}
This will print all the methods but not the constructors. If we want the constructors, we ask for ConstructorDeclarationSyntax instead of MethodDeclarationSyntax. We can get a lot of interesting things from the method object in the loop: we can ask about the parameters, the return type and a lot of other nice things.
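As an illustration of asking a method node about its return type and parameters, here is a small sketch. It assumes the current Microsoft.CodeAnalysis.CSharp NuGet package instead of the CTP, where some names differ (for example `DescendantNodes` and `GetRoot()` rather than `DescendentNodes` and `Root`). The helper name `Describe` and the shortened Person source are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class MethodListingSketch
{
    // Yields one description per method, followed by one per constructor.
    public static IEnumerable<string> Describe(string code)
    {
        var root = CSharpSyntaxTree.ParseText(code).GetRoot();

        foreach (var method in root.DescendantNodes().OfType<MethodDeclarationSyntax>())
        {
            yield return string.Format("method {0} returns {1}, {2} parameter(s)",
                method.Identifier.Text,
                method.ReturnType,
                method.ParameterList.Parameters.Count);
        }

        foreach (var ctor in root.DescendantNodes().OfType<ConstructorDeclarationSyntax>())
        {
            yield return string.Format("constructor {0}, {1} parameter(s)",
                ctor.Identifier.Text,
                ctor.ParameterList.Parameters.Count);
        }
    }

    static void Main()
    {
        const string code = @"
public class Person
{
    public string Name { get; private set; }
    public Person(string name) { Name = name; }
    public void Evaporate() { }
    public string Speak() { return Name; }
}";
        foreach (var line in Describe(code))
            Console.WriteLine(line);
    }
}
```

Methods and constructors are separate node types in both API generations, which is why the sketch needs two queries.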
I hope you found this interesting, if you have any thoughts please leave a comment below!
Using Roslyn to parse C# code files
Posted by Filip Ekberg on October 20 2011
A couple of days ago Microsoft released something called the Roslyn Project and it is now in its CTP state, just like Async! But what is Roslyn and what can it be used for? In the previous post I talked about how I wrote assembler that was generated from an application that parsed some programming language, but what I didn’t do was the actual parsing of code. Parsing code is not only relevant when you want to write a compiler, it is also useful when you want to evaluate how good a certain chunk of code is.
There is a lot of really good software out there that will help you analyze your code. Some of it analyzes the code after it has been compiled, such as NDepend, which is a really good tool. Another program that is commonly used is ReSharper. From what I know, ReSharper analyzes the code structure without actually compiling it all the way down to IL.
You can call this parsing plus evaluating. When you add the extra step that actually generates new code, I would call it a compiler. However, let’s get back to Roslyn. The project positions itself by saying that before Roslyn the C# and VB.NET compilers were just a black box with no integration capabilities. What Roslyn does is open up the black box and provide an interface between your code and the compiler.
What this means is that you can parse a code file that hasn’t been compiled yet and get a nice structure out of it that you can do whatever you like with. To get started with Roslyn, this is what you need to do:
When all this is installed, open up Visual Studio and create a new Roslyn C# Console Application

Now create a new folder called ToParse and add a class to it with some fields and methods

So now we have this Person class that we want to parse. Here’s the code so you can just copy/paste it:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace HelloWorldAnalyzer.ToParse
{
    public class Person
    {
        public string Name { get; private set; }
        public Person(string name)
        {
            Name = name;
        }
        public string Speak()
        {
            return string.Format("Hello! My name is{0}",
                Name);
        }
    }
}
Now go back to the main method and let’s get started with the parsing!
The first thing that I want to do is to just read the text inside the file, for the purpose of this example I’ll do it like this:
var code = new StreamReader("..\\..\\ToParse\\Person.cs").ReadToEnd();
We go back two folders (..\..\) because the application will run from the bin\Debug\ folder!
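The one-liner above never disposes the StreamReader. That is fine for a throwaway example, but here is a hedged sketch of two tidier ways to read the file using only System.IO. The class name `ReadSourceSketch` and the temp-file path in Main are my own, used so the sketch runs anywhere.

```csharp
using System;
using System.IO;

static class ReadSourceSketch
{
    // Same idea as in the post, but the reader is disposed when we are done.
    public static string ReadWithReader(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical stand-in for ..\..\ToParse\Person.cs.
        string path = Path.Combine(Path.GetTempPath(), "Person.cs");
        File.WriteAllText(path, "public class Person { }");

        string viaReader = ReadWithReader(path);
        string viaFile = File.ReadAllText(path); // opens, reads and closes in one call

        Console.WriteLine(viaReader == viaFile);
    }
}
```

In the console project from the post, the path argument would be the same relative path as before.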
Next up we want to get something that is called a SyntaxTree from our code, here’s a good illustration of what a SyntaxTree is:

So we get this tree by doing this:
SyntaxTree tree = SyntaxTree.ParseCompilationUnit(code);
Now we want to retrieve the root of the tree:
var root = (CompilationUnitSyntax)tree.Root;
Next up I want to print all the using-blocks in the class file that we just parsed. On the CompilationUnitSyntax instance we have a property called Usings, which we can use to get a list of UsingDirectiveSyntax:
foreach(var usingBlock in root.Usings)
{
Console.WriteLine("Using block: {0}", usingBlock.Name);
}
This will print all the using blocks and will result in something like this:
Using block: System
Using block: System.Collections.Generic
Using block: System.Linq
Using block: System.Text
Now let’s do something a bit more fun, let’s retrieve all the nodes in the syntax tree and look for a LiteralExpressionSyntax, which actually will be the string inside the Speak() method!
We do this by first getting all the descendent nodes from the root and just get the first literal expression syntax that we find:
var personSpoke = root.DescendentNodes()
.OfType<LiteralExpressionSyntax>()
.FirstOrDefault();
If we write this to the console as well we should see the following:

This just scratches the surface of what you can do with Roslyn, and there are some very interesting resources to look through. Here’s an MSDN page with a lot of documents on how you do certain things in Roslyn. Be sure to check that out! So far we’ve just done the parsing step, but you can also do compilation with it since it exposes all the different steps of the C# and VB.NET compilers.
I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!
When can knowing about IL and the internals be useful?
Posted by Filip Ekberg on October 19 2011
Lately we’ve been looking a lot at how you can use IL to create types at runtime. The same IL that we have emitted at runtime is what the C# compiler compiles C# code to. So the first thing that comes to mind when someone asks me when knowing about IL and the internals of C# can be useful is when you want to write a compiler. Let’s say that you, for educational purposes, want to create a very simple language and you want it to be statically typed and usable by other .NET languages. Then compiling the code to MSIL will allow you to do just that.
Back when I studied for my bachelor in Software Engineering I attended a course the last year that was called “Advance UNIX Programming” one of the laboratory practicals that we did in this course was to complete a compiler, and when I say to complete it I do not mean write it from scratch in this case. We had other courses that covered the depth of compiler technology, however in this particular programming course we were given an application that parsed a programming language and our assignment was to make it generate the correct assembler.
The assembler that we generated was in this case GAS and the custom programming language they called “Calc” looked like this:
a=732;
b=2684;
while(a != b) {
if(a > b) {
a=a-b;
} else {
b=b-a;
}
}
print a;
print a gcd b;
As you can see at the bottom, we also had to create some library functions such as “gcd”. Basically we only had one file that we had to edit, or to be even more specific, we did everything inside one big method that had a switch case that told us what kind of operation we were currently working with. So we had a lot of parsing done already. There are a lot of tools out there if you want to parse a programming language, and there’s something called lexical analysis that you can look into.
In this laboratory practical the purpose was to spit out correct assembler and show that we had some knowledge of how the stack worked and how to use certain operations. Here are two examples taken from that laboratory practical. The actual programming language used in the generator is C:
case '+': printf("\tpopl\t%%eax\n\tpopl\t%%ebx\n\tadd\t%%ebx, %%eax \n\tpushl\t%%eax\n"); break;
case '-': printf("\tpopl\t%%ebx\n\tpopl\t%%eax\n\tsub\t%%ebx, %%eax \n\tpushl\t%%eax\n"); break;
So in this big switch, this is what is done when it has found a + or a –. It might be a bit hard to see in that printf-statement, but here’s the generated assembler for the + case:
popl %eax
popl %ebx
add %ebx, %eax
pushl %eax
As you can see here, it’s pretty similar to MSIL. Now this might not normally be how you generate the assembler, you might have it a little bit nicer in the actual compiler than just one-liners, but I just wanted to show you some really old stuff.
I won’t go through how you write a compiler here. It’s a very broad topic and takes a very long time to master. If you like to read about deep level stuff like how the people behind the compilers think, read Eric Lippert’s blog.
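To make the stack discipline in those two printf cases concrete, here is a small sketch that evaluates the same operations with a Stack<int>. It assumes the operands arrive in postfix order (operands first, then the operator), which is my assumption since the original parser is not shown. The names `StackMachineSketch` and `Evaluate` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class StackMachineSketch
{
    // Evaluates postfix tokens the way the generated assembler would:
    // operands are pushed, operators pop two values and push the result.
    public static int Evaluate(params string[] tokens)
    {
        var stack = new Stack<int>();
        foreach (var token in tokens)
        {
            switch (token)
            {
                case "+":
                {
                    int eax = stack.Pop();  // popl %eax
                    int ebx = stack.Pop();  // popl %ebx
                    stack.Push(eax + ebx);  // add %ebx, %eax ; pushl %eax
                    break;
                }
                case "-":
                {
                    int ebx = stack.Pop();  // popl %ebx (right operand, on top)
                    int eax = stack.Pop();  // popl %eax (left operand)
                    stack.Push(eax - ebx);  // sub %ebx, %eax ; pushl %eax
                    break;
                }
                default:
                    stack.Push(int.Parse(token)); // pushl $value
                    break;
            }
        }
        return stack.Pop();
    }

    static void Main()
    {
        Console.WriteLine(Evaluate("732", "2684", "+")); // 3416
        Console.WriteLine(Evaluate("2684", "732", "-")); // 1952
    }
}
```

The pop order is swapped between the two cases because subtraction is not commutative: the right operand sits on top of the stack, so it has to land in %ebx.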
Essentially a compiler is just a program that translates your text into some other format that is either readable by another compiler or directly by a machine. In the laboratory practical that I shared with you above, we actually generated GAS and compiled it with GCC to make it executable.
So everything we’ve been looking at lately is directly applicable if you want to start off by writing a compiler for a new language. I really hope you’ve enjoyed the reflection series that has been focusing on IL generations and dynamic methods, there will most certainly be more of them around in the future.
I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!
Creating a Dynamic Method that uses a Switch
Posted by Filip Ekberg on October 18 2011
We’ve looked at some very interesting Dynamic Method creation in the last posts, so let’s keep this up! We’ve looked at how we can compare values and then jump somewhere if the comparison evaluates to true. Now let’s take a look at how we can implement a switch!
So I was thinking that I wanted to create the following method as a dynamic method:
int Calculate(int a, int b, int operation)
{
    switch(operation)
    {
        case 0:
            return a + b;
        case 1:
            return a * b;
        case 2:
            return a / b;
        case 3:
            return a - b;
        default:
            return 0;
    }
}
Basically it takes three integers: the first two are the ones that I do the mathematical operation on, and the last argument tells the switch statement which of the operations to run. We already know how to do almost everything that we are about to look at. These are the operation codes that we’ve seen before and that we’re going to see again in this post:
There are just two operations that we haven’t looked at yet that we are going to use here, and those are OpCodes.Switch and OpCodes.Br_S. We’ll see how these are used a bit further down.
But first off let’s take a look at how we get started, as we’ve done before we create our DynamicMethod instance:
Type[] methodArguments = {
    typeof(int),
    typeof(int),
    typeof(int)
};
var calculateMethod = new DynamicMethod(
    "Calculate",
    typeof(int),
    methodArguments,
    typeof(Program).Module);
The method will have three arguments and a return type of integer and we’re going to call it “Calculate”. Next up get the IL-generator:
var il = calculateMethod.GetILGenerator();
Let’s start emitting some IL!
However, before we do that, we need to define a couple of targets for our switch to jump to. If you’ve read the previous post you might remember that an if-statement evaluates something and, if it’s true, jumps into its context. It’s the same with a switch-statement.
You provide a value that the switch statement bases its evaluation on and then you give it entry points to where it can jump. Looking at the method we are trying to convert, we have four different case labels that we want to jump to and one default label. We also need a label for the end of the method, which every case jumps to once it is done.
So first of all we create our six labels like this:
var defaultCase = il.DefineLabel();
var endOfMethod = il.DefineLabel();
Label[] jumpTable = new [] { il.DefineLabel(),
    il.DefineLabel(),
    il.DefineLabel(),
    il.DefineLabel() };
Now we’ve prepared the jump table, so we can start looking at the beginning of the method. The first thing we want to do in the method is to perform a switch on our third argument, so what we need to do is load the third argument onto the evaluation stack and emit the switch operation:
// Perform switch
il.Emit(OpCodes.Ldarg_2);
il.Emit(OpCodes.Switch, jumpTable);
The next thing we need to do is, if the switch didn’t jump anywhere, we need to define the default case and that is done like this:
il.Emit(OpCodes.Br_S, defaultCase);
We emit this branch unconditionally, because if execution gets past the Switch operation, it didn’t find what it was looking for! Next up are four almost identical operations. First we mark a label saying where the current case is, then we load our first two arguments onto the evaluation stack, and then we perform the mathematical operation and jump to the end of the method. It will look something like this:
// Case 0 - Perform Add on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[0]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Add);
il.Emit(OpCodes.Br_S, endOfMethod);
Can you guess what the next case will look like?
Actually I am just going to give you the next three math operations!
// Case 1 - Perform Mul on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[1]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Mul);
il.Emit(OpCodes.Br_S, endOfMethod);
// Case 2 - Perform Div on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[2]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Div);
il.Emit(OpCodes.Br_S, endOfMethod);
// Case 3 - Perform Sub on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[3]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Sub);
il.Emit(OpCodes.Br_S, endOfMethod);
As you can see, the only difference is the operation for the math operation and the label that we mark! Now we’ve almost reached the end of the method, so we need to define what happens if it was a default value or not and then return our value, since all the above operations have put the value on the evaluation stack, we are good to go!
il.MarkLabel(defaultCase);
il.Emit(OpCodes.Ldc_I4, 0);
il.MarkLabel(endOfMethod);
il.Emit(OpCodes.Ret);
If it’s the default case, all we do is add a 4 byte integer to the evaluation stack with the value 0 and then return from the method!
Let’s try it out!
var calculate = (Func<int, int, int, int>)calculateMethod
    .CreateDelegate(typeof(Func<int, int, int, int>));
var result = calculate(1, 2, 0); // 3
result = calculate(1, 2, 1); // 2
result = calculate(1, 2, 2); // 0
result = calculate(1, 2, 3); // -1
Here’s the entire code that I used above:
Type[] methodArguments = {
    typeof(int),
    typeof(int),
    typeof(int)
};
var calculateMethod = new DynamicMethod(
    "Calculate",
    typeof(int),
    methodArguments,
    typeof(Program).Module);
var il = calculateMethod.GetILGenerator();
var defaultCase = il.DefineLabel();
var endOfMethod = il.DefineLabel();
Label[] jumpTable = new [] { il.DefineLabel(),
    il.DefineLabel(),
    il.DefineLabel(),
    il.DefineLabel() };
// Perform switch
il.Emit(OpCodes.Ldarg_2);
il.Emit(OpCodes.Switch, jumpTable);
// Default case
il.Emit(OpCodes.Br_S, defaultCase);
// Case 0 - Perform Add on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[0]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Add);
il.Emit(OpCodes.Br_S, endOfMethod);
// Case 1 - Perform Mul on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[1]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Mul);
il.Emit(OpCodes.Br_S, endOfMethod);
// Case 2 - Perform Div on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[2]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Div);
il.Emit(OpCodes.Br_S, endOfMethod);
// Case 3 - Perform Sub on Ldarg_0 and Ldarg_1
il.MarkLabel(jumpTable[3]);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldarg_1);
il.Emit(OpCodes.Sub);
il.Emit(OpCodes.Br_S, endOfMethod);
il.MarkLabel(defaultCase);
il.Emit(OpCodes.Ldc_I4, 0);
il.MarkLabel(endOfMethod);
il.Emit(OpCodes.Ret);
var calculate = (Func<int, int, int, int>)calculateMethod
    .CreateDelegate(typeof(Func<int, int, int, int>));
var result = calculate(1, 2, 3);
I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!
Creating a recursive dynamic method that calculates factorial
Posted by Filip Ekberg on October 17 2011
In the last post we looked at how we could call a dynamic method. Now let’s have a look at how we can create a recursive method that calculates the factorial of a positive integer! First of all we should take a look at what factorial means. Usually it’s written like this:
n!
Which means that it will perform this:
n × (n − 1) × (n − 2) × … × 2 × 1
So by looking at the above sequence we can see that we want to perform the multiplication operation over and over again, reducing the integer by 1 until it’s equal to one. Here’s a good explanation if you don’t know what recursion is:
Recursion is the process of repeating items in a self-similar way. For instance, when the surfaces of two mirrors are exactly parallel with each other the nested images that occur are a form of infinite recursion.
If you are having trouble understanding what recursion is, start by reading this blog post from the beginning again. When talking about recursion, you look for one or more base cases; these are the points where the recursion ends. In the above, the base case is when the integer has been reduced to 1.
Let’s start off by looking at the method signature here. The method will return an integer and take an integer as its first argument:
int Factorial(int x)
Then we want to define our base case, which in this case is when x is equal to 1. Then we want to end the recursion and return x:
if (x == 1) return x;
The last part is a bit tricky to understand: we want to multiply x by the value of Factorial(x - 1). This means that if we call Factorial(3), we expect the following to happen the first time:
return 3*Factorial(3 - 1);
In this case, we will call the method with the value 2 and then the return-statement will look like this:
return 2*Factorial(2 - 1);
But when we’re inside Factorial this time, x will be equal to 1, so we return 1. That means we multiply 2 by 1 and return that value, and back in the first call we multiply 3 by 2, which gives 6.
This is the entire recursive method that we want to produce as a dynamic method:
int Factorial(int x)
{
if (x == 1) return x;
return x*Factorial(x - 1);
}
Just as we’ve done in the previous posts, we instantiate our DynamicMethod like this:
Type[] methodArguments = { typeof(int) };
var recursiveFactorial = new DynamicMethod("Factorial", typeof(int), methodArguments, typeof(Program).Module);
var il = recursiveFactorial.GetILGenerator();
As you can see here, we expect one parameter, which is an integer, and we expect a return value which will be an integer as well. Now, there is one new operation code that we are going to look at today, and there are two other methods on the il-generator that we will explore. First of all we need to look at how we can create an if-statement and jump to somewhere in the code.
Basically what an if-statement does is evaluate whether two values conform to a certain rule (it might be equality or inequality, among others) and then tell you to go somewhere: either you enter the body or you continue after the body of the if-statement.
When we look at IL it’s not so different, although at first glance it might look that way. So first of all, we need to be able to define somewhere to go if our statement is true. This is done by calling DefineLabel() on our il-generator instance like this:
var endOfMethod = il.DefineLabel();
This will give us an instance of the class Label. If you’ve not seen labels before in any programming language, they might look like this:
someLabel:
Console.WriteLine("Goto test");
goto someLabel;
A more common scenario might be if you have nested loops, which might not be a good idea in the long term anyways but if you do, using labels will allow you to break the outer loop. In our case on the other hand, we will use the label to jump somewhere when our check evaluates to true. Evaluating if two values are equal is done by using the operation code OpCodes.Beq. It will assume that two values are pushed onto the evaluation stack and if they are equal, it will jump to the label that you’ve emitted with the operation code. Like this:
il.Emit(OpCodes.Beq, endOfMethod);
This will jump to wherever endOfMethod is if the two values on the evaluation stack are equal. So now that we’ve covered almost all of the new things, let’s get rocking with some more IL!
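To see DefineLabel, MarkLabel and OpCodes.Beq in isolation before they show up in the factorial method, here’s a minimal sketch. It builds a dynamic method that returns 1 when its argument equals 1 and 0 otherwise. The names BeqSketch, BuildIsOne, IsOne and isEqual are made up for this example.

```csharp
using System;
using System.Reflection.Emit;

public static class BeqSketch
{
    public static Func<int, int> BuildIsOne()
    {
        var method = new DynamicMethod(
            "IsOne", typeof(int), new[] { typeof(int) }, typeof(BeqSketch).Module);
        var il = method.GetILGenerator();
        var isEqual = il.DefineLabel();   // defined here, marked further down

        il.Emit(OpCodes.Ldarg_0);         // push the argument
        il.Emit(OpCodes.Ldc_I4, 1);       // push the value to compare against
        il.Emit(OpCodes.Beq, isEqual);    // pops both values, jumps if they are equal

        il.Emit(OpCodes.Ldc_I4, 0);       // not equal: return 0
        il.Emit(OpCodes.Ret);

        il.MarkLabel(isEqual);            // equal: execution continues here
        il.Emit(OpCodes.Ldc_I4, 1);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }

    public static void Main()
    {
        var isOne = BuildIsOne();
        Console.WriteLine(isOne(1)); // 1
        Console.WriteLine(isOne(5)); // 0
    }
}
```

Note that the label is defined before the branch is emitted but only marked afterwards, which is what makes a forward jump possible.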
The first thing that I want to do when my method enters is to read the value passed to my method, because I want to use this value later on:
il.Emit(OpCodes.Ldarg_0);
The second thing is to check if the parameter is equal to one. However, the branch-if-equal operation pops the two values it compares from the evaluation stack, so in order not to lose the value we just loaded, we push the argument onto the stack once more. We also push the value 1, because that’s what we want to compare against:
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, 1);
The stack will look something like this now (top of the stack first):
1
x
x
Which means that after we’ve done:
il.Emit(OpCodes.Beq, endOfMethod);
It will look like this:
x
We haven’t defined where endOfMethod is yet, we just have the label for usage. So the next thing we want to emit is what happens if the branch was not taken, and this is the subtraction. Subtraction expects two values on the evaluation stack as well. The argument is already there, so we just need to push the value 1, which is what we want to subtract:
il.Emit(OpCodes.Ldc_I4, 1);
So now the stack looks like this:
1
x
The next thing we do is to actually emit the subtraction:
il.Emit(OpCodes.Sub);
This will pop the two values off the stack and push the result, so the stack will look like this:
x - 1
When the subtraction is done, we want to call the method recursively and to do this, we just call our own dynamic method and since we have the result from the subtraction on the evaluation stack, this will be treated as the first argument to the method:
il.Emit(OpCodes.Call, recursiveFactorial);
The code after this is where we are when the recursive call actually returns. Here we want to multiply the return value by the argument that was first sent to our method. The return value from the call is already on the evaluation stack, so we just need to push the argument again and emit the multiplication operation:
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Mul);
Everything above, from the branch onwards, is what runs when we are not in the base case. Now we need to define the entry point for the end of our method, which means that we have to mark where the label should be. This is how we do that:
il.MarkLabel(endOfMethod);
However, both the base case and the recursive case end the same way, by returning a value. That value will always be on the evaluation stack, either from the multiplication above or, in the base case, from the argument that we loaded at the very start:
il.Emit(OpCodes.Ret);
This is the entire IL that I emitted:
// Either to return or send as argument to recursive call
il.Emit(OpCodes.Ldarg_0);
// Compare the argument value to 1
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, 1);
// Jump to endOfMethod if the argument value is equal to 1
il.Emit(OpCodes.Beq, endOfMethod);
// Subtract 1
il.Emit(OpCodes.Ldc_I4, 1);
il.Emit(OpCodes.Sub);
// Do recursive call
il.Emit(OpCodes.Call, recursiveFactorial);
// Multiply the return value by the argument value
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Mul);
il.MarkLabel(endOfMethod);
il.Emit(OpCodes.Ret);
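The listing above leaves out the DynamicMethod and the label definition that were shown earlier. Put together in one place, it could look like the sketch below. The wrapper class FactorialSketch and the Build method are names made up for this example, and the module is taken from that class rather than from Program. Keep in mind that the base case is x == 1, so an argument of 0 or less never reaches it and the recursion will not terminate.

```csharp
using System;
using System.Reflection.Emit;

public static class FactorialSketch
{
    public static Func<int, int> Build()
    {
        Type[] methodArguments = { typeof(int) };
        var recursiveFactorial = new DynamicMethod(
            "Factorial", typeof(int), methodArguments, typeof(FactorialSketch).Module);
        var il = recursiveFactorial.GetILGenerator();
        var endOfMethod = il.DefineLabel();

        // Either to return or send as argument to recursive call
        il.Emit(OpCodes.Ldarg_0);
        // Compare the argument value to 1
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4, 1);
        // Jump to endOfMethod if the argument value is equal to 1
        il.Emit(OpCodes.Beq, endOfMethod);
        // Subtract 1
        il.Emit(OpCodes.Ldc_I4, 1);
        il.Emit(OpCodes.Sub);
        // Do recursive call
        il.Emit(OpCodes.Call, recursiveFactorial);
        // Multiply the return value by the argument value
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Mul);
        il.MarkLabel(endOfMethod);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)recursiveFactorial.CreateDelegate(typeof(Func<int, int>));
    }

    public static void Main()
    {
        var factorial = Build();
        Console.WriteLine(factorial(1)); // 1
        Console.WriteLine(factorial(5)); // 120
    }
}
```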
To try it out, we can call it like this:
var toInvoke = (Func<int, int>)recursiveFactorial.CreateDelegate(typeof(Func<int, int>));
var fact = toInvoke(10);
And this is the result: fact has the value 3628800, which is 10!.
I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!
Calling a dynamic method from a dynamic method
Posted by Filip Ekberg on October 16 2011
Let’s take a look at how we can call a method from a dynamic method. In the last post we looked at how we called a static method in our current context, but now let’s take a look at how we can call another dynamically created method that takes an integer parameter, does some math operation on it and then returns the result.
First off we need to create this method. We’re just going to use IL that we’ve seen before, and I am going to add 2 to the input parameter. The dynamic method will look like this with the emitted IL:
var addMethod = new DynamicMethod("AddMethod", typeof(int), methodArguments, typeof(Program).Module);
var il = addMethod.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, 2);
il.Emit(OpCodes.Add);
il.Emit(OpCodes.Ret);
So we load the first argument onto the evaluation stack, then we push a 4-byte integer with the value of 2 onto the evaluation stack, and finally we emit OpCodes.Add and return.
I don’t know if you’ve noticed this before, but DynamicMethod derives from MethodInfo, which means that we can just call this method!
We’ve got the following code from the last blog post:
var mathOperation = new DynamicMethod("AdvanceMathOperationMethod", typeof(void), methodArguments, typeof(Program).Module);
il = mathOperation.GetILGenerator();
var methods = typeof(Program).GetMethods();
In both these dynamic methods, we use the variable methodArguments that we’ve also seen before:
Type[] methodArguments = { typeof(int) };
It just says that we expect an integer parameter sent to the method. Let’s take a look at the IL we’re going to emit. First of all we want to load the argument onto the evaluation stack, then we want to push the value 10, multiply the two and pass the result as a parameter to the add method.
So this code is the same from the last blog post:
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, 10);
il.Emit(OpCodes.Mul);
So now we’re prepared to call the dynamic add method we’ve created and since it derives from MethodInfo we can just do this:
il.Emit(OpCodes.Call, addMethod);
Since the result from the multiplication is already on the evaluation stack it’s also the first argument that this method gets!
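Here’s a small self-contained sketch of the same idea. To make the result easy to check, the calling method in this sketch returns the value instead of printing it, so it is a variation on the post’s code and not a copy of it. The names CallDynamicSketch, Build and MultiplyThenAdd are made up for this example.

```csharp
using System;
using System.Reflection.Emit;

public static class CallDynamicSketch
{
    public static Func<int, int> Build()
    {
        Type[] methodArguments = { typeof(int) };

        // The callee: returns its argument plus 2.
        var addMethod = new DynamicMethod(
            "AddMethod", typeof(int), methodArguments, typeof(CallDynamicSketch).Module);
        var addIl = addMethod.GetILGenerator();
        addIl.Emit(OpCodes.Ldarg_0);
        addIl.Emit(OpCodes.Ldc_I4, 2);
        addIl.Emit(OpCodes.Add);
        addIl.Emit(OpCodes.Ret);

        // The caller: multiplies its argument by 10 and hands the product to addMethod.
        var caller = new DynamicMethod(
            "MultiplyThenAdd", typeof(int), methodArguments, typeof(CallDynamicSketch).Module);
        var il = caller.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4, 10);
        il.Emit(OpCodes.Mul);             // the product is now on top of the stack
        il.Emit(OpCodes.Call, addMethod); // the product becomes addMethod's first argument
        il.Emit(OpCodes.Ret);             // addMethod's return value is returned as-is

        return (Func<int, int>)caller.CreateDelegate(typeof(Func<int, int>));
    }

    public static void Main()
    {
        var multiplyThenAdd = Build();
        Console.WriteLine(multiplyThenAdd(10)); // 102
    }
}
```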
The dynamic add method we’ve created will also leave its return value on the evaluation stack, so let’s print it with our static method from the last blog post:
il.Emit(OpCodes.Call, methods.First(x => x.Name == "PrintWithSpecificFormat"));
il.Emit(OpCodes.Ret);
And now if we invoke this:
var toInvoke = (Action<int>)mathOperation.CreateDelegate(typeof(Action<int>));
toInvoke(10);
we should see this printed:
The value is: 102
I hope you found this interesting and if you have any thoughts please leave a comment below!
Exploring OpCodes with DynamicMethod and looking at the evaluation stack
Posted by Filip Ekberg on October 16 2011
In the previous post we looked at how our DynamicMethod could pass a value to another method, let’s take a look at something a little bit more interesting! Consider that I want to have a method that takes an integer and this integer is manipulated and then printed out by this method. In this case, the manipulation is a multiplication and the second method is just a method to print the result in a nicely formatted way.
This is what we have from the previous post. First of all we have the method that prints the value in a formatted way. I’ve renamed it for clarity in this post:
public static void PrintWithSpecificFormat(int a)
{
Console.WriteLine("The value is: {0}", a);
}
The second thing we have is the DynamicMethod, however, I’ve made two changes here, instead of taking zero parameters, I am now taking an integer parameter, secondly I’ve renamed it to something more understandable:
Type[] methodArguments = { typeof(int) };
var mathOperation = new DynamicMethod("AdvanceMathOperationMethod", typeof(void), methodArguments, typeof(Program).Module);
ILGenerator il = mathOperation.GetILGenerator();
var methods = typeof(Program).GetMethods();
Now let’s start off by looking at the operations that we are going to emit. First of all we need to load the argument that we pass onto the evaluation stack. This is done by OpCodes.Ldarg_0. Before we look at the other operations, notice that I mentioned the evaluation stack here. From now on it’s important to understand a little bit about the evaluation stack. Values are pushed onto and popped off the evaluation stack, and certain operations expect values to already be on the evaluation stack.
If you don’t know what a stack is, here’s a quick little intro with a bit of help from our friend Wikipedia. Elements that are pushed onto the stack are “stacked” on top of the old values, and when you want to pop something, you always take out the last item that was added. This is called LIFO, which stands for Last In First Out.
If we now apply this knowledge and start thinking about how the evaluation stack works and why it’s important, you might see that operations such as OpCodes.Mul expect there to be two values that they can pop from the stack. Once the values have been popped, they are multiplied and the result is pushed back onto the evaluation stack for the caller to use.
This is what it looks like when we want to multiply 10 by 20 (the stack is listed with the top value first):
First the value 10 is pushed onto the stack: 10
Then the value 20 is pushed onto the stack: 20, 10
Then we call OpCodes.Mul and it starts off by popping the first value (20), which leaves: 10
Then it pops the second value (10), which leaves the stack empty
After the multiplication is done, the result is pushed onto the stack: 200
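The same walk-through can be simulated with an ordinary Stack<int>. This is only a model of the idea and not how the runtime implements the evaluation stack. StackSketch and MultiplyOnStack are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StackSketch
{
    // Simulates "push first, push second, Mul" on a plain stack.
    public static int MultiplyOnStack(int first, int second)
    {
        var stack = new Stack<int>();
        stack.Push(first);           // stack: first
        stack.Push(second);          // stack: second, first

        int right = stack.Pop();     // Mul pops the value that was pushed last
        int left = stack.Pop();      // then the one below it, leaving the stack empty
        stack.Push(left * right);    // and pushes the result back

        return stack.Peek();         // the result is the only value left
    }

    public static void Main()
    {
        Console.WriteLine(MultiplyOnStack(10, 20)); // 200
    }
}
```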

This is the general pattern used: things are pushed onto the evaluation stack, then operations pop the values they want to use and push a possible result back onto the evaluation stack. So how do we proceed now?
We already have the first value on the evaluation stack. Let’s say that whatever we pass into our dynamic method will be multiplied by 10. Since we did OpCodes.Ldarg_0, we only need to emit code that pushes our next value!
This is done exactly as we did in the last post:
il.Emit(OpCodes.Ldc_I4, 10);
If the parameter sent to the dynamic method is 20 now, the stack will look like this (top value first): 10, 20
Now let’s do the multiplication. This is done by emitting OpCodes.Mul:
il.Emit(OpCodes.Mul);
So what now? We want to pass the result from the multiplication operation to the PrintWithSpecificFormat method, and how do we pass values again?
By pushing them onto the evaluation stack! But wait a minute, the result is already on the evaluation stack, so we don’t need to do anything! We can just call the method as we like. The complete IL-emitting looks like this:
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, 10);
il.Emit(OpCodes.Mul);
il.Emit(OpCodes.Call, methods.First(x => x.Name == "PrintWithSpecificFormat"));
il.Emit(OpCodes.Ret);
Now the last thing we are going to do is creating the delegate and invoking it, we can use Action<T> as we did in the last post and then we just invoke the created delegate:
var toInvoke = (Action<int>)mathOperation.CreateDelegate(typeof(Action<int>));
toInvoke(10);
The result should look like this:
The value is: 100
I hope you found this interesting and if you have any thoughts please leave a comment below!
Calling a non-dynamic method with parameters from a dynamic method
Posted by Filip Ekberg on October 14 2011
I think it’s time that we explore some more OpCodes and in this post I will look at how we can call methods with parameters passed to them!
There are two OpCodes that we can use to execute a method. These are:
- OpCodes.Call
- OpCodes.Jmp
The difference between these two is important. The first one, OpCodes.Call, calls a method and expects that we are coming back after the OpCodes.Ret in the called method, whereas OpCodes.Jmp exits the current context.
Also the Jmp operation has a set of requirements ( from MSDN ):
- There are no stack transition behaviors for this instruction.
- The jmp (jump) instruction transfers control to the method specified by method, which is a metadata token for a method reference. The current arguments are transferred to the destination method.
- The evaluation stack must be empty when this instruction is executed. The calling convention, number and type of arguments at the destination address must match that of the current method.
- The OpCodes.Jmp instruction cannot be used to transfer control out of a try, filter, catch, or finally block.
So by the looks of it, we want to use OpCodes.Call! The method that I want to call is quite simple, it looks like this:
public static void Test(int a)
{
Console.WriteLine("Test: {0}", a);
}
The method takes a parameter and prints “Test: {The Value of a goes here}”, so if we would call Test(10) we would get this printed in the console:
Test: 10
We’ve seen how we can read arguments with operation codes, but how do we actually emit an operation code with a certain value, the value that we want to pass to the method? There’s an operation code called Ldc_I4 and it basically means the following:
Pushes a supplied value of type int32 onto the evaluation stack as an int32.
This is exactly what we are looking for! So we got the operation that we want to emit in order to solve the argument issue, but how do we actually call our method? It turns out that the operation OpCodes.Call expects a parameter passed to the emit method and in this case it’s expecting information about the method that we want to call. And how do we pass information about the method?
By using MethodInfo of course!
Let’s first of all compose the DynamicMethod before we start emitting IL. It will look somewhat like this:
var someMethod = new DynamicMethod("SomeMethod", typeof(void), null, typeof(Program).Module);
The third argument is null because I don’t want any parameters sent to my dynamic method for the time being! Okay, so we need to do two more things before we actually start emitting the IL. This includes getting the actual IL generator and all methods in our current class:
ILGenerator il = someMethod.GetILGenerator();
var methodToCall = typeof(Program).GetMethods();
Now let’s start emitting IL!
The first thing that I want to emit is the argument that I will be passing to the method and this will look like this:
il.Emit(OpCodes.Ldc_I4, 123);
So I am saying that I want to push a 32-bit integer and that the value of the integer is 123. Next up we’ll be doing the actual call, and for this I need to get the method information from the above methods collection. Then I will just return from the method:
il.Emit(OpCodes.Call, methodToCall.First(x => x.Name == "Test"));
il.Emit(OpCodes.Ret);
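Filtering GetMethods() by name works here because there is only one method called Test. If there were several overloads, one alternative is Type.GetMethod with the parameter types spelled out, as in the sketch below. This is an alternative lookup and not what the post itself uses. MethodLookupSketch and FindTest are names made up for this example.

```csharp
using System;
using System.Reflection;

public static class MethodLookupSketch
{
    public static void Test(int a)
    {
        Console.WriteLine("Test: {0}", a);
    }

    // Looks up the public Test(int) overload by name and parameter types.
    public static MethodInfo FindTest()
    {
        return typeof(MethodLookupSketch).GetMethod("Test", new[] { typeof(int) });
    }

    public static void Main()
    {
        MethodInfo test = FindTest();
        Console.WriteLine(test.Name);                   // Test
        Console.WriteLine(test.GetParameters().Length); // 1
    }
}
```

The MethodInfo returned by FindTest can be passed to il.Emit(OpCodes.Call, ...) in the same way as the one picked with First.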
Now everything is set. We have our dynamic method ready and all we need to do now is to create the delegate. But we don’t have a delegate that we’ve defined? No worries! We can use Action:
var toInvoke = (Action)someMethod.CreateDelegate(typeof(Action));
And now we can call this just like we normally do:
toInvoke();
The result will look like this:
Test: 123
I hope you found this interesting and if you have any thoughts please leave a comment below!