This repository contains my implementation of the clox interpreter described in the book Crafting Interpreters (2021), by Robert Nystrom. It is written in C and comprises of:
- a bytecode compiler.
- a virtual machine to execute the bytecode instructions, with garbage collection.
clox interprets the programming language Lox, designed for the purposes of the book. Lox is a simple, dynamically typed, object-oriented language. Its built-in data types are booleans, numbers, strings, and nil.
common.h contains some DEBUG_ flags you can optionally enable for logging purposes.
My notes on jlox, the book's simpler and less efficient Java-based Lox interpreter, can be found here. They cover many of the high-level concepts used in clox.
To compile and run the WIP interpreter with GCC, run either of the following commands in the root repository directory:
gcc *.c && ./a.out # Launches the REPL.or
gcc *.c && ./a.out someLoxFile.lox # Runs someLoxFile.lox.Here are some technical points about the clox implementation I wanted to keep track of. However, please note that most of my implementation notes are in the form of code comments.
-
Our Lox code is converted into a series of chunks - compact sequences of bytecode instructions by the compiler.
-
The virtual machine deserialises these chunks and executes them.
- Parses using a Pratt parser. Implemented in
compiler.c.
-
Temporary variables are stored on a simple stack, implemented in
vm.c. -
All heap-allocated values are of type Obj, implemented in the
object.cmodule. We store all allocated Objs in a linked list in the VM, for simple garbage collection.
-
Each type of dynamic array we use in clox requires its own nearly-identical struct, init, write, and free functions. This isn't very elegant, but we don't need many of them. We could avoid repetition by simulating generics using preprocessor macros, but this would likely prove even more complicated.
-
OP_CONSTANTuses only a single byte for its operand, limiting a chunk to 256 different constants. This could be addressed by making it use more bytes (which would make every constant instruction take up more space), or by adding a newOP_CONSTANT_LONGinstruction for this special case. -
Our VM has a fixed stack size. We could grow this dynamically for more flexibility.
-
Our hash table in table.c only accepts string key values. We don't need to hash other types to implement clox.
-
Our VM uses string interning - we don't store duplicate strings in memory. Instead, if you try to create a string which has already been allocated in memory, clox will return the allocated string's memory address. This is a trade-off which adds overhead to string creation in exchange for fast string equality testing. To see if two strings are equal, we can just compare their memory addresses - O(1) - instead of comparing each character between both strings - O(n).