Crafting Interpreters by Bob Nystrom has become one of my favorite programming books. It walks you line-by-line through creating Lox, a fully functional object-oriented programming language. Over the past year, I’ve worked my way through the book five times, implementing the tree-walk interpreter in Java, Python, and Rust, and the byte-code virtual machine in C and Zig. My next challenge would be to write it in Pascal!
Pascal is my first love. I started learning programming in 1984 on an IBM computer in Basica. I didn’t like the way my programs scrolled onto the screen line-by-line. I wanted them to pop onto the screen the way Norton Utilities did. So, I picked up Peter Norton’s book and taught myself Assembly. A friend saw me writing a program in Assembly, and asked “What are you doing???” The next day he handed me a floppy disk labeled “Turbo Pascal 3.0”. I took one look at the example programs, and I was hooked. The code looked like stories, unlike the illegible lumps that were Basic programs and the terse syntax of Assembly. The line “SetColor(Red);” jumped out at me. My mind was blown. It was so much more expressive than “SCREEN 0: COLOR 12“. This set the tone for everything I’ve programmed since then.
My Lox port to Zig came out much better than I hoped for. I expected it to run slightly slower than in C, but when I cut a release build, the benchmarks all were as good or better. The book says “…our implementation is portable to any platform with a C compiler and is fast enough for real-world production use. We have a single-pass bytecode compiler, a tight virtual machine interpreter for our internal instruction set, compact object representation, a stack for storing variables without heap allocation, and a precise garbage collector.” Wow! This is a great place to start at. But, in Bob’s own words, Lox is still a “toy” language. So how do we turn it into a full-fledged language? Well, writing a language in it isn’t a bad way to begin. Lox in Lox. Someone’s already done that. Good idea. Instead, I will work on creating a tree-walk interpreter for Pascal, and then write Lox in it. So, JPascal and PLox.
Syntax
Changing the syntax from Lox to Pascal was the easy part. Braces change to begin/end. Operator = changes to := and == to just =, and != becomes <>. Operator ! changes to not. Add in then and change fun to procedure and function. Add in a toLower() and a few equalsIgnoreCase() and Lox is no longer case-sensitive! A few hours, and a handful of changes to the Scanner and Parser, and it is looking very much like Pascal.
Getting Started
Back to page one of Crafting Interpreters! Only this time I’m writing Lox in a Pascal dialect of Lox. Getting through the first chapter was undoubtedly the hardest part. Lox doesn’t have Lists and Maps or even Arrays. There are no Char or Integer types, and no subscript to index into the provided String. And no case statement or enums. On his website Bob provides code to his Array challenge, using a native function. List, Map, and Stack is just more of the same. Adding Char and Integers was straightforward. The subscript operator was a little harder, but once I determined the precedence, it wasn’t too hard. The case statement is nothing but a glorified if/else if/else series. Similar to what the book does for the for statement, I added “syntactic sugar” to parse in the case statement and add if statements to the AST. The same thing with enum types. Following the type-safe enumeration pattern from “Effective Java”, I wrote TokenType in Lox using classes, and added the Pascal type keyword to the Parser as more “sweetener”.
Test-Driven Development
With Scanner.pas finished and running, it dawned on me what was missing. Unit tests. I don’t code without unit tests. I just don’t. Well, I had for several hundred lines of code. Well, not really. I mean, I have unit tests for JPascal, but not for Scanner.pas. Programming at 3 levels simultaneously can be intense, like a scene from Inception.
I don’t know what a unit test is supposed to look like in Pascal, so I modeled it after Rust and Zig, with a Pascal feel:
test 'Scan Tokens'; begin var Uut := Scanner('(){},.-+;*'); Uut.ScanTokens(); AssertEqual(TOKEN_LEFT_PAREN, Uut.Tokens[0].TypeOfToken); end
I’ve already written a good suite of unit tests for RLox, all that remains is copying them over, and converting from Rust to Pascal. Of course it needs to indicate a pass or fail, so a little borrowing from Rust and Zig, and a lot from Maven:
Static Typing
Ok, now for the elephant in the room. One of Pascal’s claim-to-fame is that it is a strongly typed language, and Lox is dynamic. I actually prefer dynamic languages, but being able to see the type can be incredibly useful. I’m not going to lie, I had to think about this problem for a few days. I could go with Python’s “type hints”, but that feels unsatisfying. If I specify a type, I want that type to be enforced. In the end, I decided to go with optional but enforced typing.
Crafting Interpreters doesn’t have much to say on this other than it can be accomplished by adding a type checking pass. Purple Dragon book to the rescue! That uses static typing, so it has a good deal to say about reducing expression types. In the Parser, the type definition is optional. And, as the book suggests, I added a TypeChecker as a separate pass, based on the Resolver class. If a variable or parameter has a type, the type checker enforces it, throwing a RuntimeError for any mismatches.
To make the collections work, I had to add “generics” using the keyword of. As in “List of String”. For typecasting, I also added the keyword as.
Quality of Life
With static types in place, methods can have “signatures” allowing method overloading. Scanner’s AddToken methods are an example of this. This is not necessary but is nicer than having to come up with two separate names, like in Rust. Modifying the + operator to concatenate Strings with any other type of value was a huge improvement. And a number of native functions were needed, such as Write, WriteLn, Ord, Str, and Copy.
Imports and Use Keyword
Whew. That was alot of work for one chapter. Moving on to the next chapter. Wait, I’m not writing this in one massive file! Ok, we need to implement the use keyword. I would like to have individual class imports such as “use Token, TokenType from Token”, but for now, let’s keep it simple. A simple C style file include will work, but we can do slightly better. Automatically add the guards that every C header has to prevent circular includes.
Onward!
From chapter 2 on, things moved much more quickly. Parser, Interpreter, Resolver, and on. There were more features that needed to be added, such as the break statement, the raise statement, and try/except. 13 chapters, 404 tests, and close to 14,000 lines of code later I have a good approximation of Pascal written, and a fully functional version of Lox, running in Pascal, running in Java 🙂
Looking Forward
This is a significant milestone in the development of PRISM. So, what’s next? Some of the code in JPascal is less than perfect, and I am sure there are numerous bugs remaining. I’m not going to fix all of them. I don’t have to. This work has to be done all over again in Zig in the bye-code virtual machine.
I chose to go this route to make a concrete list of what’s needed for the foundation of Algol-24. I have 24 years of experience in Java, and 14 in recursive descent parsing, so it made sense to start there.
The 5,500 lines of code in PLox will not need to be rewritten. It can be used as it is to guide the work in Zig. I had a lot of fun. It felt really, really good to code in my “native language” after close to 30 years.
The code for PLox and jars for JPascal can be found on GitHub here:
PLox: Lox Tree-walk Interpreter, converted to “Pascal” from JLox
Next up, I will be making a “Macro Assembler” front-end for my virtual machine to create a CLR (common language runtime) to allow custom languages to be easily plugged in to PRISM.
For now, though, I’m going to take a break, and play a little. I’m going to write the 1982 game “CIA Adventure” in Pascal!
Leave a Reply