Dynamic Compilation

[This is part 3 of the Writing Compilers series]

In high school I was an avid logic-puzzle solver, voraciously scouring the school and public library for new conquests.  Along the way, I came across the “Mathematical Recreations” column in Scientific American.  In particular, one article described “Core War,” a game where programs written by the players fought against each other in the memory of a virtual computer system.  The winner was the program that caused the enemy to execute an illegal instruction.

The language of Core War is called Redcode, a simplified version of assembly language, and served as my introduction to many advanced concepts including registers and addressing modes, data vs. text segments, multi-threaded concurrency, defensive programming (literally!), virtual machines, and low-level debuggers.  Truly, Core War was one of my foundational learning experiences!

I played the game for many years, contributing several articles to The Core War Newsletter (Vol 6 issues 2 & 3).  The evolution of the language itself and the strategies involved in winning offered a continuous stream of intellectual and technical challenges. The apex of my Core War experience was writing a genetic algorithm to evolve warriors and pitting them against the latest-and-greatest warriors in the competitive hills.

Generating Code In-Situ

Last time we wrote a compiler for a trivial AST consisting only of fixnums, following the back-to-front compiler methodology.  This time let’s work on dynamic compilation.

Dynamic compilation is the process of accepting a program fragment, generating the native instructions, and then transferring control to the newly generated fragment (or patching a running program with the new fragment).

To test this technique, we will need a way to call our generated code directly from the compiler – in this case from the GHC runtime.  This means we will be using GHC’s foreign function interface, which is a remarkably pleasant experience!

This post is an example of one of the reasons I want to implement this compiler.  While I’ve been using Haskell for many years, I’ve never needed to reach below the runtime.  This is a good opportunity to delve into the bits-and-bytes of the infrastructure I use every day.

I perused other Haskell projects which generate and execute arbitrary x86 code (such as Harpy), but I will need a more thorough understanding of the low-level representations and calling conventions in order to implement the compiler.  Thus, we’ll go through the process from scratch.

First, let’s pull the byte stream for the fixnum example in the last post:

$ clang -O3 fixnum.c
$ ndisasm a.out
...
00000F44  B82A000000        mov eax,0x2a
00000F4A  C3                ret
...

Here I used ndisasm instead of otool because I couldn’t find a way to get otool to display the bytes corresponding to an individual instruction.

The goal for today will be to put these 6 bytes into a buffer, then execute them.

Dynamic Compilation in C

Native instruction streams must be aligned on boundaries specified by the CPU architecture, in this case x86-64, and many modern operating systems will not allow execution of writable memory.  This requires us to allocate memory aligned to page boundaries and then protect them before execution.  Here we arbitrarily choose 4096-bytes for the page size.

   int main(int argc, char** argv)
   {
      // Allocate some memory aligned at a 'page' boundary.
      void *code = 0;
      posix_memalign( &code, PAGESIZE, PAGESIZE );

      // Let's call it data...
      unsigned char *data = (unsigned char *)code;

      // What's in that memory?
      printf("Old: %0x %0x %0x\n", data[0], data[1], data[2]);

      // Dynamically compile our function!
      data[0] = 0xB8;
      data[1] = 0x2A;
      data[2] = 0x00;
      data[3] = 0x00;
      data[4] = 0x00;
      data[5] = 0xC3;

      // Let's check that it was written..
      printf("New: %0x %0x %0x\n", data[0], data[1], data[2]);

      // Can't execute writable memory, requires protection first.
      int err = mprotect(code, PAGESIZE, PROT_READ | PROT_EXEC);
      printf("mprotect=%d\n", err);
      if (err) {
         perror("Couldn’t mprotect");
         exit(errno);
      }

      // We'll need these values later..
      printf("PROT_READ=%d\n",PROT_READ);
      printf("PROT_EXEC=%d\n",PROT_EXEC);

      // Call our dynamically generated code..
      printf("result=%d\n",((int (*)())code)());
      return 0;
   }

The code does the least amount of work I could get away with; allocate memory, fill with generated code, protect, then execute.  Along the way I print out some data to confirm my intuition of the process and check the result of calling the dynamic code.

Let’s try it:

$ clang dynamic.c
$ ./a.out
Old: 0 0 0
New: b8 2a 0
mprotect=0
PROT_READ=1
PROT_EXEC=4
result=42

It works!

Dynamic Compilation in Haskell

In order to build on my C knowledge, I decided to translate the C program directly into Haskell, allowing me to understand the lower level concepts more directly.  Once we have this example working we can abstract away from the details, incorporating the new code into the existing compiler.

The first piece we need is pointer representations in Haskell.  The GHC FFI defines a (Ptr a) type to represent a pointer to an object of type ‘a’.  The raw functions we will need to manipulate objects of this type are ‘peek’ and ‘poke,’ along with basic pointer arithmetic such as ‘plusPtr’.  These are found in the Foreign.Ptr and Foreign.Storable libraries.

We’re calling into C code for memory protection, so we need Haskell bindings for these functions.  The Foreign.C.Types library supplies Haskell types which marshal to C types losslessly, such as CInt and CSize, which combine in the obvious ways with Ptr to form (Ptr CInt), etc.  This makes importing C functions into Haskell simple, and the function declarations read quite naturally:

In C:

int mprotect(const void *, size_t, int);

In Haskell:

foreign import ccall unsafe
   mprotect :: (Ptr a) -> CSize -> CInt -> IO CInt

The last hoop is the critical type-cast from data pointer to function pointer, allowing us to enter the function.  This is handled with a type conversion operator (casting), also handled by the Foreign library.  An example:

type MyFuncTy = CInt -> CInt
foreign import ccall "dynamic"
   funPtrToMyFuncTy :: FunPtr MyFuncTy -> MyFuncTy

With this in hand, the translation of the C code is straightforward:

{-# LANGUAGE ForeignFunctionInterface #-}

    module Main where

    import Foreign
    import Foreign.Marshal.Alloc
    import Foreign.C.Types
    import Foreign.C.Error
    import Control.Monad ( when )

    type StubFn = CInt -> CInt
    foreign import ccall "dynamic" mkStub :: FunPtr StubFn -> StubFn
    foreign import ccall unsafe
       mprotect :: Ptr a -> CSize -> CInt -> IO CInt

    allocAligned :: Int -> Int -> IO (Ptr a)
    allocAligned align size = allocaBytesAligned align size $ \ptr ->
        do
            return ptr

    main = do

        let size = 0x1000
        buf

Let’s try it:

$ ghc --make Main.hs 
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
$ ./Main 
mprotect = 0
errno = 0
42

Excellent! Now we have a working dynamic ‘compiler’ in Haskell.

  • The native code we are using for the example (the fixnum() function from my previous post) was compiled as a function taking no arguments.  In Haskell this is encoded as a value of type CInt (rather than a function).  Since I’m just exploring here, I decided to cast it as CInt -> CInt instead.  x86 calling convention tends to pass the first argument in a register, so I gambled that using CInt -> CInt would work for this example.
  • I searched for projects that marshal data to and from Haskell, assuming I would be using the posix_memalign() function for allocation, and I needed that double-pointer allocated in Haskell to pass into C.  I came across a use of posix_memalign() it in the CCI library only to discover that it uses ‘alloca’ from Foreign.Marshal.Alloc.  A quick perusal led me to ‘allocaBytesAligned’.  Long story short: I should have just searched in Hoogle.

Next Steps

Next time I’ll work on integrating this new knowledge into the compiler, giving two different outputs: dynamic code generation and assembly code generation.

I’ll also begin turning this into a ‘real’ project, with test infrastructure, stack/cabal project, etc.  This will make things a bit smoother as the project grows.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s