15 items tagged “compilers”
2024
Figma’s journey to TypeScript: Compiling away our custom programming language (via) I love a good migration story. Figma had their own custom language that compiled to JavaScript, called Skew. As WebAssembly support in browsers emerged and improved the need for Skew’s performance optimizations reduced, and TypeScript’s maturity and popularity convinced them to switch.
Rather than doing a stop-the-world rewrite they built a transpiler from Skew to TypeScript, enabling a multi-year migration without preventing their product teams from continuing to make progress on new features.
Ruff v0.4.0: a hand-written recursive descent parser for Python. The latest release of Ruff—a Python linter and formatter, written in Rust—includes a complete rewrite of the core parser. Previously Ruff used a parser borrowed from RustPython, generated using the LALRPOP parser generator. Victor Hugo Gomes contributed a new parser written from scratch, which provided a 2x speedup and also added error recovery, allowing parsing of invalid Python—super-useful for a linter.
I tried Ruff 0.4.0 just now against Datasette—a reasonably large Python project—and it ran in less than 1/10th of a second. This thing is Fast.
2023
Lark parsing library JSON tutorial (via) A very convincing tutorial for a new-to-me parsing library for Python called Lark.
The tutorial covers building a full JSON parser from scratch, which ends up being just 19 lines of grammar definition code and 15 lines for the transformer to turn that tree into the final JSON.
It then gets into the details of optimization—the default Earley algorithm is quite slow, but swapping that out for a LALR parser (a one-line change) provides a 5x speedup for this particular example.
Mapping Python to LLVM (via) Codon is a fascinating new entry in the “compile Python code to something else” world—this time targeting LLVM. Ariya Shajii describes in great detail how it pulls this off, including tricks such as transforming Python generators to LLVM coroutines. Codon doesn’t promise that all Python code will work—it’s best thought of as a Python-like language which can be used to create compiled modules which can then be imported back into regular Python projects.
2022
A CGo-free port of SQLite. Fascinating Go version of SQLite, which uses Go code that has been translated from the original SQLite C using ccgo, a package by the same author which “translates cc ASTs to Go source code”. It claims to pass the full public SQLite test suite, which is very impressive.
Writing a minimal Lua implementation with a virtual machine from scratch in Rust. Phil Eaton implements a subset of Lua in a Rust in this detailed tutorial.
2021
Introducing stack graphs (via) GitHub launched “precise code navigation” for Python today—the first language to get support for this feature. Click on any Python symbol in GitHub’s code browsing views and a box will show you exactly where that symbol was defined—all based on static analysis by a custom parser written in Rust as opposed to executing any Python code directly. The underlying computer science uses a technique called stack graphs, based on scope graphs research from Eelco Visser’s research group at TU Delft.
lex.go in json5-go. This archived GitHub repository has a beautifully clean and clear example of a hand-written lexer in Go, for the JSON5 format (JSON + comments + multi-line strings). parser.go is worth a look too.
2020
A hands-on introduction to static code analysis. Useful tutorial on using the Python standard library tokenize and ast modules to find specific patterns in Python source code, using the visitor pattern.
A Compiler Writing Journey (via) Warren Toomey has been writing a self-compiling compiler for a subset of C, and extensively documenting every step of the journey here on GitHub. The result is an extremely high quality free textbook on compiler construction.
2010
grammar.coffee (via) The annotated grammar for CoffeeScript, a new language that compiles to JavaScript developed by DocumentCloud’s Jeremy Ashkenas. The linked page is generated using Jeremy’s Docco tool for literate programming, also written in CoffeeScript. CoffeeScript itself is implemented in CoffeeScript, using a bootstrap compiler originally written in Ruby.
2009
Developing for the Apple iPhone using Flash. A brilliant feat of engineering: Adobe worked around Apple’s “no runtime allowed” rules by writing a compiler front end for LLVM that compiles ActionScript 3 to ARM assembly code, and apparently ported the regular Flash drawing APIs as well.
2008
Simple Top-Down Parsing in Python. Eye-opening tutorial on building a recursive descent parser for Python, in Python that uses top-down operator precedence.
python4ply tutorial. python4ply is a parser for Python written in Python using the PLY toolkit, which compiles to Python bytecode using the built-in compiler module. The tutorial shows how to use it to add support for Perl-style 1_000_000 readable numbers.
2007
NestedVM. Provides binary translation from a GCC compiled MIPS binary to a Java class file, letting you run anything supported by GCC on the JVM with no source changes.