Getting all methods from a code file with Roslyn

Posted by Filip Ekberg on October 21 2011

In the previous post we started looking at Roslyn, so let’s continue on this topic and see what else we can get out of it! I want to take a look at how we can retrieve all methods and get some information about them. I’ve added another method to the Person class, so it looks like this now:

public class Person
{
    public string Name { get; private set; }
    public Person(string name)
    {
        Name = name;
    }
    public void Evaporate()
    {
           
    }
    public string Speak()
    {
        string str = "test";
        return string.Format("Hello! My name is{0}",
            Name);
    }
}

We’ve already got the tree structure and the root node, so let’s just use that. Everything is represented as a SyntaxNode, so we need to get all the descendant nodes that are methods. A method declaration is represented as a MethodDeclarationSyntax, so all methods are retrieved like this:

IEnumerable<MethodDeclarationSyntax> methods = tree.Root
                .DescendentNodes()
                .OfType<MethodDeclarationSyntax>().ToList();

Now we can just iterate over this:

foreach(var method in methods)
{
}

However, you might be a bit confused as to how you print the method name, because there is no Name property on the object! Instead there is something called an Identifier that we can use:

foreach(var method in methods)
{
    Console.WriteLine(method.Identifier);
}

This will print all the methods but not the constructors. If we want the constructors, we ask for ConstructorDeclarationSyntax instead of MethodDeclarationSyntax. We can get a lot of interesting things from the method object in the loop: we can ask about the parameters, the return type and a lot of other nice things.
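To make the parameters and return type concrete, here is a minimal sketch. It assumes the released Roslyn API from the Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP used in this post, so the entry points are named differently (CSharpSyntaxTree.ParseText, GetRoot() and DescendantNodes()). The MethodLister class and its helpers are names made up for this illustration, and a trimmed copy of the Person class is inlined as a string so the sample stands on its own.

```csharp
// Sketch only: targets the Microsoft.CodeAnalysis.CSharp NuGet package,
// not the 2011 CTP API used elsewhere in this post.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class MethodLister   // hypothetical helper class
{
    // Trimmed copy of the Person class, inlined to keep the sample self-contained.
    public const string PersonSource = @"
public class Person
{
    public string Name { get; private set; }
    public Person(string name) { Name = name; }
    public void Evaporate() { }
    public string Speak()
    {
        return string.Format(""Hello! My name is{0}"", Name);
    }
}";

    // Formats a parameter list as "type name, type name".
    private static string Describe(ParameterListSyntax list)
    {
        return string.Join(", ", list.Parameters.Select(p => $"{p.Type} {p.Identifier}"));
    }

    // One entry per method: "<return type> <name>(<parameters>)".
    public static List<string> Methods(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        return root.DescendantNodes()
                   .OfType<MethodDeclarationSyntax>()
                   .Select(m => $"{m.ReturnType} {m.Identifier}({Describe(m.ParameterList)})")
                   .ToList();
    }

    // Constructors have no return type, so only name and parameters are listed.
    public static List<string> Constructors(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        return root.DescendantNodes()
                   .OfType<ConstructorDeclarationSyntax>()
                   .Select(c => $"{c.Identifier}({Describe(c.ParameterList)})")
                   .ToList();
    }

    public static void Main()
    {
        foreach (var line in Methods(PersonSource)) Console.WriteLine(line);
        foreach (var line in Constructors(PersonSource)) Console.WriteLine(line);
    }
}
```

For the inlined class this lists Evaporate and Speak with their return types, followed by the Person constructor with its single string parameter.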

I hope you found this interesting, if you have any thoughts please leave a comment below!


Using Roslyn to parse C# code files

Posted by Filip Ekberg on October 20 2011

A couple of days ago Microsoft released something called the Roslyn Project, and it is now in its CTP state, just like Async! But what is Roslyn and what can it be used for? In the previous post I talked about how I generated assembler from an application that parsed a programming language, but what I didn’t do was the actual parsing of the code. Parsing code is not only relevant when you want to write a compiler; it is also useful when you want to evaluate how good a certain chunk of code is.

There is a lot of really good software out there that will help you analyze your code. Some tools analyze the code after it has been compiled, such as NDepend, which is a really good tool. Another commonly used program is ReSharper; from what I know, ReSharper analyzes the code structure without actually compiling it all the way down to IL.

You can call this parsing plus evaluating; when you add the extra step that actually generates new code, I would call it a compiler. However, let’s get back to Roslyn. The project positions itself by saying that before Roslyn the C# and VB.NET compilers were just black boxes with no integration capabilities. What Roslyn does is open up the black box and provide an interface between your code and the compiler.

What this means is that you can parse a code file that hasn’t been compiled yet and get a nice structure out of it that you can do whatever you like with. To get started with Roslyn, this is what you need to do:

When all this is installed, open up Visual Studio and create a new Roslyn C# Console Application.

Now create a new folder called ToParse and add a class to it with some fields and methods.

So now we have this Person class that we want to parse. Here’s the code so you can just copy/paste it:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace HelloWorldAnalyzer.ToParse
{
    public class Person
    {
        public string Name { get; private set; }
        public Person(string name)
        {
            Name = name;
        }
        public string Speak()
        {
            return string.Format("Hello! My name is{0}",
                Name);
        }
    }
}

Now go back to the main method and let’s get started with the parsing!

The first thing that I want to do is to just read the text inside the file. For the purpose of this example I’ll do it like this:

// File.ReadAllText (System.IO) reads the whole file and closes it again.
var code = File.ReadAllText("..\\..\\ToParse\\Person.cs");

We go back two folders ( ..\..\ ) because the application will run from the bin\Debug\ folder!

Next up we want to get something called a SyntaxTree from our code, which is a tree representation of the source text. We get this tree by doing this:

SyntaxTree tree = SyntaxTree.ParseCompilationUnit(code);

Now we want to retrieve the root of the tree:

var root = (CompilationUnitSyntax)tree.Root;

Next up I want to print all the using blocks (the using directives) in the class file that we just parsed. On the CompilationUnitSyntax instance we have a property called Usings, which we can use to get a list of UsingDirectiveSyntax:

foreach(var usingBlock in root.Usings)
{
    Console.WriteLine("Using block: {0}", usingBlock.Name);
}

This will print all the using blocks and will result in something like this:

Using block: System
Using block: System.Collections.Generic
Using block: System.Linq
Using block: System.Text

Now let’s do something a bit more fun: let’s retrieve all the nodes in the syntax tree and look for a LiteralExpressionSyntax, which will actually be the string inside the Speak() method!

We do this by first getting all the descendant nodes from the root and then taking the first literal expression syntax that we find:

var personSpoke =   root.DescendentNodes()
                        .OfType<LiteralExpressionSyntax>()
                        .FirstOrDefault();

If we write this to the console as well, we should see the string literal from the Speak() method.

This just scratches the surface of what you can do with Roslyn, and there are some very interesting resources to look through. MSDN has a page with a lot of documents on how you do certain things in Roslyn. Be sure to check that out! So far we’ve just done the parsing step, but you can also do compilation with it, since it exposes all the different steps of the C# and VB.NET compilers.
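Putting the parsing steps together, here is a minimal sketch of the walkthrough as one program. It assumes the released Roslyn API from the Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP, so SyntaxTree.ParseCompilationUnit becomes CSharpSyntaxTree.ParseText, tree.Root becomes GetRoot() and DescendentNodes() becomes DescendantNodes(). The ParseWalkthrough class is a name made up for this illustration, and a shortened Person class is inlined as a string instead of being read from disk.

```csharp
// Sketch only: targets the Microsoft.CodeAnalysis.CSharp NuGet package,
// not the CTP API used in this post.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ParseWalkthrough   // hypothetical name
{
    // Shortened copy of Person.cs, inlined to keep the sample self-contained.
    public const string Source = @"
using System;
using System.Linq;

namespace HelloWorldAnalyzer.ToParse
{
    public class Person
    {
        public string Name { get; private set; }
        public string Speak()
        {
            return string.Format(""Hello! My name is{0}"", Name);
        }
    }
}";

    // Parses the text and casts the root node, mirroring the steps in the post.
    private static CompilationUnitSyntax Parse(string source)
    {
        return (CompilationUnitSyntax)CSharpSyntaxTree.ParseText(source).GetRoot();
    }

    // Names of all using directives, in source order.
    public static List<string> UsingNames(string source)
    {
        return Parse(source).Usings.Select(u => $"{u.Name}").ToList();
    }

    // Value of the first literal in the file, or null when there is none.
    public static string FirstLiteral(string source)
    {
        var literal = Parse(source).DescendantNodes()
                                   .OfType<LiteralExpressionSyntax>()
                                   .FirstOrDefault();
        return literal?.Token.ValueText;
    }

    public static void Main()
    {
        foreach (var name in UsingNames(Source))
            Console.WriteLine("Using block: {0}", name);
        Console.WriteLine(FirstLiteral(Source));
    }
}
```

Token.ValueText gives the string without its surrounding quotes; calling ToString() on the node instead returns the literal exactly as written in the source.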

I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!


When can knowing about IL and the internals be useful?

Posted by Filip Ekberg on October 19 2011

Lately we’ve been looking a lot at how you can use IL to create types at runtime. The IL that we have emitted at runtime is the same IL that the C# compiler compiles C# code to. So when someone asks me when knowing about IL and the internals of C# can be useful, the first thing that comes to mind is writing a compiler. Let’s say that, for educational purposes, you want to create a very simple language, and you want it to be statically typed and usable from other .NET languages; compiling the code to MSIL will allow you to do just that.

Back when I studied for my bachelor’s in Software Engineering, I attended a course in the last year called “Advance UNIX Programming”. One of the laboratory practicals in this course was to complete a compiler, and when I say complete it, I do not mean write it from scratch. We had other courses that covered compiler technology in depth; in this particular programming course, however, we were given an application that parsed a programming language, and our assignment was to make it generate the correct assembler.

The assembler that we generated was in this case GAS and the custom programming language they called “Calc” looked like this:

a=732;
b=2684;
while(a != b) {
  if(a > b) {
    a=a-b;
  } else {
    b=b-a;
  }
}
print a;

print a gcd b;

As you can see at the bottom, we also had to create some library functions such as “gcd”. Basically we only had one file to edit; to be even more specific, we did everything inside one big method with a switch statement that told us what kind of operation we were currently working with. So a lot of the parsing was already done for us. There are a lot of tools out there if you want to parse a programming language, and lexical analysis is a good topic to look into.
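To give an idea of what lexical analysis means for a language like Calc, here is a minimal sketch of a tokenizer. It is not the parser we were given in the course; the CalcLexer class and its token rules are assumptions based only on the sample program above (words, integers, the != operator and a handful of single-character symbols).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CalcLexer   // hypothetical; not the lab's actual tokenizer
{
    // \G anchors each match at the current position. After optional whitespace,
    // a token is a word, a number, the two-character operator != or one symbol.
    private static readonly Regex TokenPattern =
        new Regex(@"\G\s*([A-Za-z_][A-Za-z0-9_]*|[0-9]+|!=|[=;(){}<>+\-])");

    // Splits Calc source text into a flat list of token strings.
    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();
        string text = source.TrimEnd();
        int position = 0;
        while (position < text.Length)
        {
            Match match = TokenPattern.Match(text, position);
            if (!match.Success)
                throw new FormatException("Unexpected character at position " + position);
            tokens.Add(match.Groups[1].Value);
            position = match.Index + match.Length;
        }
        return tokens;
    }

    public static void Main()
    {
        // Prints: while ( a != b ) { a = a - b ; }
        Console.WriteLine(string.Join(" ", Tokenize("while(a != b) { a=a-b; }")));
    }
}
```

A parser would then consume this token list and decide, for example, that a, - and b form a subtraction, which is the point where a switch like the one in the lab takes over.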

In this laboratory practical the purpose was to spit out correct assembler and show that we had some knowledge of how the stack worked and how to use certain operations. Here are two examples taken from that laboratory practical; the actual programming language used in the generator is C:

case '+':   printf("\tpopl\t%%eax\n\tpopl\t%%ebx\n\tadd\t%%ebx, %%eax \n\tpushl\t%%eax\n"); break;
case '-':   printf("\tpopl\t%%ebx\n\tpopl\t%%eax\n\tsub\t%%ebx, %%eax \n\tpushl\t%%eax\n"); break;

So this is what the big switch does when it finds a + or a -. It might be a bit hard to see in the printf statement, but here’s the generated assembler for the + case:

popl   %eax
popl   %ebx
add    %ebx, %eax
pushl  %eax

As you can see here, it’s pretty similar to MSIL. Now, this might not normally be how you generate the assembler; you might have it a little bit nicer in the actual compiler than just one-liners, but I just wanted to show you some really old stuff.
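To make the comparison with MSIL concrete, here is a minimal sketch that emits the IL counterpart of the + and - cases with a DynamicMethod from System.Reflection.Emit. The StackMachineDemo class and its Build method are names made up for this illustration. Where the assembler pops both operands into registers, operates on them and pushes the result, the IL add and sub instructions pop two values and push the result in one step.

```csharp
using System;
using System.Reflection.Emit;

public static class StackMachineDemo   // hypothetical name
{
    // Builds a delegate whose body is: push a, push b, <op>, return the top of the stack.
    public static Func<int, int, int> Build(OpCode op)
    {
        var method = new DynamicMethod("Calc", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push the left operand
        il.Emit(OpCodes.Ldarg_1);   // push the right operand
        il.Emit(op);                // pop both, push the result
        il.Emit(OpCodes.Ret);       // return the value on top of the stack
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Console.WriteLine(Build(OpCodes.Add)(732, 2684));   // prints 3416
        Console.WriteLine(Build(OpCodes.Sub)(2684, 732));   // prints 1952
    }
}
```

Because sub computes the value pushed first minus the value pushed second, the operand order matters here just as it does in the assembler version, where the - case pops into %ebx before %eax.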

I won’t go through how you write a compiler here; it’s a very broad topic and takes a very long time to master. If you like to read about deep-level stuff, like how the people behind the compilers think, read Eric Lippert’s blog.

Essentially a compiler is just a program that translates your text into some other format that is either readable by another compiler or directly by a machine. In the laboratory practical that I shared with you above, we actually generated GAS and compiled it with GCC to make it executable.

So everything we’ve been looking at lately is directly applicable if you want to start off by writing a compiler for a new language. I really hope you’ve enjoyed the reflection series, which has focused on IL generation and dynamic methods; there will most certainly be more posts like it in the future.

I hope you found this interesting because I had a lot of fun writing it and if you have any thoughts please leave a comment below!
