Writing Compilers

[This is part 1 of the Writing Compilers series]

I’ve been a programmer for almost my entire life – I still remember my Dad letting us play games on his early home computer. I was in first grade in the early-80s and “home computer” was an infrequent term. I was playing a game not unlike Missile Command, becoming frustrated that the enemies were too hard to shoot.

I wanted them to move slower.  More specifically, I wanted my Dad to make them move slower.  In the naivety of youth this seemed trivial, something my all-powerful father could certainly accomplish with just a wave of his hand.  Of course I was distraught to discover that he could not alter the game’s behavior.

This led to a different kind of frustration, one which launched my career and still drives me today; HOW do those little dots move around in the first place?  What controls them, and how can *I* become their master?

Languages

As an industrial programmer, I don’t often have the freedom to choose which language I use for my work.  Among the usual C/C++, Java, and scripting languages, I have been extremely lucky over the past decade to work on professional projects in Haskell, Scheme, and other functional languages – one of which was my own design.

That is a story for another time, but it leads into the impetus for this blog; I want to build a compiler for a new language.

Why?

  • Explore the design space of computer languages
  • Practice the implementation of compiler internals
  • Refine my understanding of software engineering practice

Specifically, I want to write a self-hosting dynamic compiler for a statically-typed, pure, total-functional language.

Self-Hosting means that the compiler can compile itself.  This means the language needs to be expressive enough to write a compiler in, and that the compiler is written in its own language.  Pascal was famously written this way and distributed with a small C interpreter to bootstrap the language on new platforms.  Today it is more common to write a compiler that targets an existing runtime platform like Java, .Net, or LLVM.

A dynamic compiler is one which is built into the runtime system, always available, and able to link newly compiled code into already running software.  Dynamic compilation supports incremental and interactive programming such as editing the image you’re debugging, or patching live production systems. Dynamic compilers also support just-in-time compilation.  This combination of features generally makes the process of software engineering more pleasant.

Statically typed languages enforce invariants during compilation which eliminate some classes of errors from occurring at runtime.

Pure functional languages have no side effects; the only output of a function is its return value – if it returns.

Total functional languages are ones in which every function returns – they are not Turing complete.  This seems like an excessive restriction, but it turns out to be a very useful property for program transformation, and many common algorithms are already total.  For example, programs written in total functional languages can be executed using eager or lazy evaluation without modification, so any optimization which is valid for either one is valid for total-functional programs too.

This collection of compiler and language properties also supports my main focus of study professionally, and another motivation for this blog: fault-tolerant distributed computing.

I believe that a careful selection of the semantics of the source language will offer programmers enough expressivity to write complex software systems, yet remain easily transformed by the compiler for efficient execution.  Those same semantics should be chosen to ease the transition of software into our distributed, concurrent, and fault-tolerant clouds.

Next Steps

I’ve written compilers using the traditional batch-oriented construction, so dynamic compilation is the only part that is new to me.  I hope to write a series of articles in which I implement algorithms presented in various papers and refine them to match the needs of this compiler.

An Incremental Approach to Compiler Construction

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s