Introducing Germ Lisp
Presenting my tiny, self-hosted Lisp.
Background
For many years, I’ve been helping to make software more “bootstrappable”. This has all been done in the context of the GNU Guix project. Guix makes the data flow from source code to running software environment explicit: these sources become these binaries by way of this compiler, whose sources become binaries by way of this other compiler, etc. How deep into compilers of compilers of compilers can you go? For many programs, all the way back to a 357-byte proto-compiler (on x86 Linux). There is a bit of hand waving here, as all of the builds are scripted in Scheme and need a compiled version Guile to run. Guile is several mebibytes. We’re close to getting rid of Guile, too, but that will have to wait for another post.
The 357-byte proto-compiler, hex0, simply turns ASCII hex digits into binary form. From there, we work up to a primitive assembler called M0. Then, with M0, we can build a compiler for a C-like language (a subset of C, really). This process is iterated a few times as the tools become more sophisticated, resulting in more capable compilers, standard libraries, assemblers, linkers, and several other programs. Then, we can build Mes. Mes is a Scheme interpreter. With Mes, we can run an even more capable C compiler called MesCC and use a more complete standard library. From here, we can build TinyCC, which can build an earlier version of GCC, which can build a later version of GCC, and so on until we have built the latest GNU build system.
This is all great! Really! It’s all ready to be integrated into Guix and make the bulk of the distro bootstrappable from soup to nuts. But sometimes I find myself wondering why we go from C to Lisp and then back again? Especially in Guix, where we end up going from C to Lisp, then back to C, and then finally back to Lisp. In fact, I’ve often wondered if we could insert Lisp immediately after hex0. A primitive Lisp interpreter is not much more complicated than a primitive assembler. It’s much more powerful, though!
This question has been in the back of my mind for several years now. I’ve built a few prototypes and explored various ways to pack a lot of punch in a small Lisp interpreter. Recently, I built a functional proof of concept which I’m delighted to present to you, dear reader, in this blog post.
Germ Lisp
I call the thing Germ Lisp (think “germinate”). It is
written in assembly, runs on x86 Linux, and weighs in at about 2.25KiB. It works in two stages. The first stage
is the kernel which is a tree-walking interpreter with a garbage
collector, system interfaces, and enough of a parser to read basic
S-expressions. The second stage runs directly on that kernel, adding a
more sophisticated parser (apostrophe for quote
, hex numbers,
character literals, etc.) and a basic Lisp-style macro system. This
second stage is Germ Lisp, and if you run it, you will be inside of a
REPL, ready to send
expressions to the interpreter.
System interface
The system interface is very Spartan, but extremely capable. It includes:
- Testing pointer equality (
eq?
) - Manipulating pairs (
cons
,car
,set-cdr!
, etc.) - Arithmetic on machine integers (
combine-numbers
) - Calling the interpreter (
eval
andapply
) - Execution environment (
command-line
andenvironment-variables
) - General system calls (
syscall
withbuf-ref
andbuf-set!
)
The combine-numbers
procedure provides an indexed interface in to
the various operations on machine integers. For example, you give it
0 for addition, 1 subtraction, and so on. It exposes 12 operations in
total, including comparison. This is wonky, but saves space and goes
away with a few simple definitions like:
(define + (lambda (x y) (combine-numbers 0 x y)))
The syscall
procedure invokes the Linux kernel. In order to
manipulate C-style strings, the Germ kernel provides a fixed-size buffer
that can be read and written to using the buf-ref
and buf-set!
procedures. The syscall
procedure takes a specification string for
argument types. If the string has an “n”, the corresponding argument is
a number and should be passed to the Linux kernel as-is. If the string
has a “p” the corresponding argument is an index into the buffer and
should be converted to a pointer before going to the Linux kernel. For
example, to write the letter “A” to standard output, you could write:
(define %sys-write 4)
(define %stdout-handle 1)
(buf-set! 0 65)
(buf-set! 1 0)
(syscall "nnpn" %sys-write %stdout-handle 0 1)
Of course users of Germ wouldn’t usually use combine-numbers
or
syscall
directly. The bootstrap script renames all the basic
arithmetic procedures and provides a very primitive “ports” interface
for input/output.
Memory management
To save space, the kernel only supports three types of objects: pairs, numbers, and “special” values (sigils, essentially). Since numbers and special values are atomic, the garbage collector only has to worry about pairs. This means that strings have to be lists of numbers with a “this is a string” special value in the car of the first pair. It’s horrifyingly inefficient, but it means the garbage collector itself is tiny.
It’s just a stop-and-copy system: stop execution, copy over some root
objects, and then keep copying the objects they refer to
(transitively) until you run out of new objects. When you copy a
pair, put a “broken heart” special value in the old car, and the new
address in the old cdr. That way, you never copy the same object more
than once. The whole thing is just two subroutines: collect
which
traverses live objects and copy
which copies them.
Self-hosting
Germ can build itself. The only script I’ve written for Germ (so far)
is an x86 assembler, and a Lispified version of the Germ assembly
code. The assembler constructs an embedded domain-specific
language for writing assembly programs in Lisp. Here’s what
the exit
subroutine looks like:
;;; exit
;;; Exit the program with the code in EAX.
(: 'exit)
(mov %eax %ebx) ; Use EAX for the exit code
(movst ($ sys-exit) %eax) ; EAX := SYS_exit
(int ($ #x80)) ; Call kernel
If you’re familiar with x86, GNU assembler, and Lisp this should be
pretty easy to understand. (The movst
instruction is like an
assembler macro that does the move via the stack, which is one byte
cheaper.) Basically, each instruction is a Lisp procedure that
manipulates some global state in the assembler. When you are finished
adding instructions, you can run assemble!
to get a list of bytes
corresponding to the assembled instructions. These bytes go on to
write-elf
, which adds an ELF header to the machine code. The resulting ELF file
is bit-for-bit identical to Germ as built with the GNU assembler.
As a bonus, the assembler written in Lisp can be run by Chibi Scheme (or likely any R⁷RS-compatible system). That means there are three ways to build Germ. Using the GNU assembler, using a Scheme interpreter, or using Germ itself. They all create identical results.
Limitations
It’s small and simple; what’s the catch? There are two major design decisions that trade performance for simplicity. The first is that variable values are resolved by linear search. This means that as programs grow in complexity performance degrades. For the small assembler, this is not a big deal. Germ is orders of magnitude slower than Chibi, but manages to assemble itself in a few seconds on my aging hardware. The second limitation is the representation of strings. First of all, it wastes a lot of memory. But it also wastes a lot of time to do string manipulations like linked list manipulations. And don’t even get me started on the extra GC pressure this causes!
On the other hand, it’s not as small as I would like. When I started, I hoped to keep the thing under 2048 bytes. The current ELF file is 2313 bytes. At this point I’m torn: it might be possible to bring it down to 2048, but there are one or two features the kernel is missing that would make the bootstrap script simpler. For the moment I’m trying to avoid getting distracted by nice, round numbers.
Inspiration
Like many Lispers, I’ve built a number of interpreters over the years. (It’s hard to resist!) When I first started exploring this space, I thought that it would require some kind of fancy trick to keep things small. Maybe inventing a clever pointer tagging system or avoiding having a top-level environment or something like that. In the end, the tried-and-true approaches yielded the best results. Germ looks a lot like any other interpreter. Rather than any kind of breakthrough, it’s the result of careful and parsimonious implementations of old ideas.
There are two projects that inspired me to take this approach. They both are, in many ways, careful takes on old ideas. The first is the rather famous SectorLISP. It shows how simple memory management and reading S-expressions can be. In some ways Germ is a beefier, Linux-compatible version of SectorLISP. The other project is Ribbit – a Scheme system composed of a bytecode VM and compiler. The original system provided a REPL in only 4K, and it was then updated and expanded to provide a complete R⁴RS system in 7K. (For comparison, when you include its bootstrap script, Germ is close to 30K and offers fewer features.) While Ribbit’s architecture is not directly applicable to bootstrapping, the simplicity of its bytecode VM helped convince me to throw away unnecessary features.
What’s next?
In order to make Germ more than just a proof of concept, it would have to be integrated into a larger bootstrapping process. To do this, it would have to be expanded in three directions.
First, it would need to be expanded down in order to connect to something simpler. It’s small, but maybe not small enough to be the bootstrap seed. It would be nice to keep hex0 in that role. This means converting the Germ sources to hex0 (or hex1 or even hex2). This is not a complicated job, just a bit tedious!
Next, it needs to be expanded up in order to connect to something more powerful. For this, a second version of Germ might be necessary to remove the two performance limitations discussed above. This would allow Germ to run more complicated programs – perhaps even MesCC. This should be done by conditionally assembling features into the Germ kernel. That way, we could keep the 2.25KiB version of the kernel, and use it to build a bigger version that can support a more capable scripting environment.
Lastly, it should be expanded sideways. By this I mean ported to other systems. A good start would be a RISC-V port. This should be pretty easy, as having more registers makes hand-writing assembly with bespoke calling conventions easier. It would be interesting to explore running on other kernels, of course, but that would require significant changes.
To be clear, I’m not sure if I’ll do any of these. My guess is that
I’ll look at making Germ more capable. I’ve written an R⁷RS front end
that provides syntax-case
macros and a module system. I also have a
handful of syntax and procedures from R⁷RS lying around. If I could get
these to run (or even crawl) on a slightly expanded Germ kernel, that
would be pretty neat.
Conclusion
It’s always nice to have a project with well-defined goal. “Is it small?” and “can it build itself?” are two questions that have pretty concrete answers. Now that Germ answers both questions in the affirmative, it’s time to put it out there. Please have a look at the code and feel free to send me feedback.