Saturday, July 19, 2014

Problem 79 - CIL, who needs high level languages?

After using Parrot Assembly and Java Bytecode, I figured I might as well continue my low level / "intermediate representation" language stint. CIL is the "Common Intermediate Language": it is essentially Java Bytecode for the .Net VM. So, programming in it was not too bad, maybe even a little bit nicer than Java Bytecode at times: a single "call" instruction was all I needed, instead of choosing between invokestatic, invokevirtual, etc. Most of the language is very similar to Java Bytecode: both are stack based, so they use very similar syntax. The problem I chose to solve was also pretty similar programmatically to problem 90, which I solved in Java Bytecode: both just involve lots of set operations, which are easy to express as ors, ands, etc. (also, my code contains an xor, so clearly I am doing fancy stuff...). CIL was also nice because, whereas Java Bytecode was designed with only Java in mind, CIL is supposed to support more than just C#, so there was no need in this case to create a class to wrap the main method in.
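For anyone who wants the set idea without the stack juggling, here is a minimal sketch in Python (not CIL) of a set of digits stored as the bits of one integer. The helper names are made up for illustration and do not appear in the program further down; each one only mirrors a short run of shl/or/and/not/xor instructions there.

```python
# A set of digits 0-9 stored as the bits of one integer.
# Helper names are hypothetical; they are not part of the CIL program.

def add(mask: int, digit: int) -> int:
    """Insert digit into the set (shl, or)."""
    return mask | (1 << digit)

def remove(mask: int, digit: int) -> int:
    """Delete digit from the set (shl, not, and)."""
    return mask & ~(1 << digit)

def missing(mask: int, digit: int) -> int:
    """1 if digit is NOT in the set, else 0 (shr, and, xor)."""
    return ((mask >> digit) & 1) ^ 1

seen = add(add(0, 3), 7)
print(bin(seen))             # 0b10001000
print(missing(seen, 3))      # 0
print(missing(seen, 4))      # 1
print(bin(remove(seen, 3)))  # 0b10000000
```

The xor with 1 only flips a membership bit, so "digit is absent" can be or-ed together with another mask and tested against zero in a single branch.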

There was, however, one very important way in which programming in CIL was far less fun than Java Bytecode: I could not find as much good documentation, and the error messages are horrible. At least using Mono (basically the open source .Net VM; Microsoft also has one) there were essentially two errors one could get: "Unrecoverable Syntax Error" if the error happens at compile time, and "Invalid IL code" if the error happens at runtime. So, even though my code was basically correct very early (a couple of ors had to be changed to ands once I had things running...), forgetting a single branch instruction to go back to the top of a loop and a single pop instruction to clear the stack caused very hard to debug runtime errors. Those took lots of "remove code until the error goes away" style debugging, which was not fun.

So basically, CIL is a fine stack-based assembly language, as long as you don't cause any errors.
Also, it is worth noting that despite the title of this post... high level languages are actually quite nice. Variable names are really nice, even if I have gotten pretty good at remembering which numbered local holds what variable.

The code below runs in 50ms on my machine.
.assembly extern mscorlib {}
.assembly e79 {}

.namespace E79 
{
    .method public static void main() cil managed
    {
        .entrypoint
        .maxstack 100
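        .locals([0] int32[],    // per digit d: bitmask of digits that must come before d
                [1] int32,      // loop index / candidate digit
                [2] int32,      // the parsed 3-digit attempt
                [3] int32,      // bitmask of digits seen and not yet output
                [4] int32,      // first digit of the attempt; later the result
                [5] int32,      // second digit; later the index in LOOP_CLEAR
                [6] int32)      // third digit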
        ldc.i4 10
        newarr int32
        stloc.0
        ldc.i4.0
        stloc.3
        ldc.i4.0
        stloc.1
LOOP_INIT:
        ldloc.1
        ldloc.0
        ldlen
        bge OVERLOOP_INIT
        ldloc.0
        ldloc.1
        ldc.i4.0
        stelem.i4
        ldloc.1
        ldc.i4.1
        add
        stloc.1
        br LOOP_INIT
OVERLOOP_INIT:

LOOP_MAIN:
        call string [mscorlib]System.Console::ReadLine()
        dup                     // one copy for the null test, one for Parse
        brnull OVERLOOP_MAIN    // end of input: the spare copy is still on the stack
        call int32 [mscorlib]System.Int32::Parse(string)
        stloc.2
        ldloc.2
        ldc.i4 100
        div
        stloc 4
        ldloc.2
        ldc.i4 10
        div
        ldc.i4 10
        rem
        stloc 5
        ldloc.2
        ldc.i4 10
        rem
        stloc 6
        ldloc.0
        ldloc 5
        ldloc.0
        ldloc 5
        ldelem.i4
        ldc.i4.1
        ldloc 4
        shl
        or
        stelem.i4
        ldloc.0
        ldloc 6
        ldloc.0
        ldloc 6
        ldelem.i4
        ldc.i4.1
        ldloc 5
        shl
        or
        ldc.i4.1
        ldloc 4
        shl
        or
        stelem.i4
        ldloc.3
        ldc.i4.1
        ldloc 4
        shl
        or
        ldc.i4.1
        ldloc 5
        shl
        or
        ldc.i4.1
        ldloc 6
        shl
        or
        stloc.3
        br LOOP_MAIN            // back to the top of the loop
OVERLOOP_MAIN:
        pop                     // drop the leftover null from the dup before brnull
        ldc.i4.0
        stloc 4
        ldc.i4.0
        stloc.1
LOOP_FINAL:
        ldloc.1
        ldc.i4 10
        bge OVERLOOP_FINAL
        ldloc.0
        ldloc.1
        ldelem.i4
        ldloc.3
        ldloc.1
        shr
        ldc.i4.1
        and
        ldc.i4.1
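        xor                     // 1 if digit [1] is not in the remaining set
        or                      // nonzero if it is absent or still has predecessors
        brzero IFPART           // zero: this digit can be output next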
ELSEPART:
        ldloc.1
        ldc.i4.1
        add
        stloc.1
        br LOOP_FINAL
IFPART:                         // append digit [1] to the result
        ldloc 4
        ldc.i4 10
        mul
        ldloc.1
        add
        stloc 4
        ldc.i4.1
        ldloc.1
        shl
        not
        ldloc.3
        and
        stloc.3
        ldc.i4.0
        stloc 5
LOOP_CLEAR:
        ldloc 5
        ldc.i4 10
        bge OVERLOOP_CLEAR
        ldloc.0
        ldloc 5
        ldloc.0
        ldloc 5
        ldelem.i4
        ldc.i4.1
        ldloc.1
        shl
        not
        and
        stelem.i4
        ldloc 5
        ldc.i4.1
        add
        stloc 5
        br LOOP_CLEAR
OVERLOOP_CLEAR:
        ldc.i4.0                // restart the scan from digit 0
        stloc.1
        br LOOP_FINAL
OVERLOOP_FINAL:
        ldloc.s 4
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
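
For comparison, here is a rough high-level restatement of what the listing above appears to do, written in Python rather than CIL. It is a sketch under assumptions, not a translation checked against Mono: the function and variable names are invented, the comments map them onto the numbered locals, and the sample attempts are made up for illustration (they are not the real keylog). It assumes each digit appears at most once in the passcode and that the attempts are consistent.

```python
def shortest_passcode(attempts):
    """Order the digits so that every 3-digit attempt is a subsequence.

    Assumes no digit repeats in the passcode and the attempts contain
    no contradictions; names here are hypothetical, not from the CIL.
    """
    before = [0] * 10   # local 0: before[d] = bitmask of digits preceding d
    seen = 0            # local 3: bitmask of digits not yet output

    for attempt in attempts:
        n = int(attempt)                                # local 2
        a, b, c = n // 100, (n // 10) % 10, n % 10      # locals 4, 5, 6
        before[b] |= 1 << a
        before[c] |= (1 << a) | (1 << b)
        seen |= (1 << a) | (1 << b) | (1 << c)

    result = 0          # local 4 is reused for the answer
    i = 0               # local 1
    while i < 10:
        absent = ((seen >> i) & 1) ^ 1
        if before[i] | absent:
            i += 1      # digit i is unused or still has predecessors
            continue
        result = result * 10 + i        # output digit i
        seen &= ~(1 << i)               # remove it from the remaining set
        for j in range(10):             # and from every predecessor mask
            before[j] &= ~(1 << i)
        i = 0                           # restart the scan
    return result


# Made-up attempts, each a subsequence of 1273.
print(shortest_passcode(["127", "273", "123", "173"]))  # 1273
```

Because the answer is accumulated in an integer, both this sketch and the CIL would drop a leading 0 if the passcode started with one.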
