Simon Willison’s Weblog

February 2022

Feb. 2, 2022

webvm.io (via) This is one heck of a tech demo: it’s a full copy of Debian, compiled to WebAssembly and running in your browser. It’s fully stocked with Python, Perl, Ruby, Node.js and even a working gcc compiler! The underlying technology, CheerpX, is a closed-source WebAssembly virtualization platform.

# 2:29 am / debian, webassembly

Help scraping: track changes to CLI tools by recording their --help using Git

I’ve been experimenting with a new variant of Git scraping this week which I’m calling Help scraping. The key idea is to track changes made to CLI tools over time by recording the output of their --help commands in a Git repository.
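
As a rough illustration of the idea, here is a minimal Python sketch that captures a tool's --help output into a text file. The `record_help` function and the file naming scheme are assumptions made for this example, not taken from the original post.

```python
import subprocess
from pathlib import Path


def record_help(command, out_dir):
    """Run `command --help` and save the output under out_dir.

    `command` is a list such as ["git", "commit"]. This helper and its
    file naming scheme are hypothetical, chosen only for illustration.
    """
    result = subprocess.run(
        [*command, "--help"],
        capture_output=True,
        text=True,
        check=False,  # some tools exit non-zero after printing their help
    )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Use the executable's base name so absolute paths give a valid file name
    name = "-".join([Path(command[0]).name, *command[1:]])
    path = out_dir / f"{name}.txt"
    # Some tools print help to stderr, so keep both streams
    path.write_text(result.stdout + result.stderr)
    return path
```

Running this on a schedule and committing the resulting files to a Git repository would leave a diffable history of how each tool's options change over time.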

[... 978 words]

Feb. 5, 2022

Every few weeks, someone on Twitter notices how demented the content on Facebook is. I’ve covered a lot of these stories. The quick TL;DR is that Facebook’s video section is essentially run by a network of magicians and Vegas stage performers who hack the platform’s algorithm with surreal low-value content designed to distract users long enough to trigger an in-video advertisement and anger them enough to leave a comment.

Ryan Broderick

# 10:41 pm / facebook

Feb. 7, 2022

Sha256 Algorithm Explained (via) Absolutely beautiful interactive animated explanation by Domingo Martin of the SHA256 hashing algorithm.

# 7:27 pm / algorithms, explorables

Feb. 9, 2022

Single dependency stacks (via) Brandur Leach notes that the core services at Crunchy (admittedly a PostgreSQL hosting and consultancy company) have only one stateful dependency – Postgres. No Redis, ElasticSearch or anything else. This means that problems like rate limiting and search, which are often farmed out to external services, are all handled using either PostgreSQL or in-memory mechanisms on their servers.
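
For a sense of what an in-memory mechanism for rate limiting can look like, here is a generic token bucket sketch in Python. It is not Crunchy's implementation; the `TokenBucket` class and its parameters are assumptions made for this example.

```python
import time


class TokenBucket:
    """Generic in-memory rate limiter (illustrative, not from the article).

    `rate` tokens are added per second, up to `capacity`. Each allowed
    request consumes one token.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

State kept this way is local to one process, so limits are enforced per server rather than globally.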

# 6:43 pm / postgresql, brandur-leach

Feb. 10, 2022

GitHub Burndown (via) Neat Observable notebook by Tom MacWright—give it a GitHub access token and the name of a repo and it pulls the details of every issue and plots a burndown chart over time, showing how long issues stay open for. The code is worth spending some time with—the way it fetches data from the paginated JSON API is a really great example of using generators with Observable, and the chart itself is a lovely clear example of Observable Plot.
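
The generator pattern for paginated APIs translates to other languages too. Below is a small Python sketch (the notebook itself is JavaScript) in which `fetch_page` is a hypothetical callable that returns the list of items for a given page number, and an empty list once the pages run out.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int], list], start: int = 1) -> Iterator[dict]:
    """Yield items from a paginated source one at a time.

    `fetch_page` is an assumed helper: it takes a page number and
    returns a list of items, or an empty list past the last page.
    """
    page = start
    while True:
        items = fetch_page(page)
        if not items:
            return  # no more pages
        yield from items
        page += 1
```

Because the generator is lazy, a consumer can start processing the first page of results before later pages have been requested.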

# 4:29 pm / github, observable, tom-macwright, observable-plot

lon lat lon lat lon. Tom MacWright’s definitive guide to the (latitude, longitude) vs. (longitude, latitude) debate. The answer is frustrating: both orders are used by significant software, so there’s no single answer that will satisfy everyone. I’ve recently been mostly won over to the (longitude, latitude) side, mainly because that order is a better fit for the non-geospatial x, y pattern.
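
One way to sidestep ordering mistakes in your own code is to pass coordinates by name and fix the order only at the boundary. GeoJSON is one format that uses longitude first. The `to_geojson_point` helper below is a hypothetical example, not something from the guide.

```python
def to_geojson_point(*, latitude, longitude):
    """Build a GeoJSON Point, which stores coordinates as [longitude, latitude].

    Keyword-only arguments mean callers cannot silently swap the two values.
    """
    return {"type": "Point", "coordinates": [longitude, latitude]}
```

For example, `to_geojson_point(latitude=51.5, longitude=-0.12)` puts -0.12 first in the coordinates list, matching the x, y convention.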

# 4:32 pm / geospatial, tom-macwright

Feb. 12, 2022

jless (via) A really nice new command-line JSON viewer, written in Rust, created by Paul Julius Martinez. It provides a terminal interface for navigating through large JSON files, including expanding and contracting nested objects and searching for strings or a modified form of regular expressions.

# 3:17 am / json, rust

Running C unit tests with pytest (via) Brilliant, detailed tutorial by Gabriele Tornetta on testing C code using pytest, which also doubles up as a ctypes tutorial. There’s a lot of depth here—in addition to exercising C code through ctypes, Gabriele shows how to run each test in a separate process so that segmentation faults don’t fail the entire suite, then adds code to run the compiler as part of the pytest run, and then shows how to use gdb trickery to generate more useful stack traces.
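
The process isolation idea can be sketched without pytest at all. The following Python example is an assumption-laden illustration rather than the tutorial's code: `run_isolated` and `killed_by_signal` are hypothetical helpers that run a snippet in a child interpreter so a crash in native code cannot take down the parent.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run Python source in a separate interpreter process.

    If the snippet crashes inside native code (for example through
    ctypes), only the child process dies and the caller keeps running.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


def killed_by_signal(result: subprocess.CompletedProcess) -> bool:
    """On POSIX, a negative return code means the child was killed by a signal."""
    return result.returncode < 0
```

Reading a C string from a NULL pointer, as in `run_isolated("import ctypes; ctypes.string_at(0)")`, should crash the child while the parent simply sees a non-zero return code.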

# 5:14 pm / c, ctypes, pytest

Feb. 14, 2022

Datasette table diagram using Mermaid (via) Mermaid is a DSL for generating diagrams from plain text, designed to be embedded in Markdown. GitHub just added support for Mermaid to their Markdown pipeline, which inspired me to try it out. Here’s an Observable Notebook I built which uses Mermaid to visualize the relationships between Datasette tables based on their foreign keys.
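
The same idea can be sketched in Python against a plain SQLite database. This is not the notebook's code: `mermaid_for_foreign_keys` is a hypothetical helper, and it assumes table names that are valid Mermaid node identifiers (no spaces or punctuation).

```python
import sqlite3


def mermaid_for_foreign_keys(conn: sqlite3.Connection) -> str:
    """Return a Mermaid flowchart with one edge per foreign key.

    Each edge points from the referencing table to the referenced
    table and is labelled with the referencing column.
    """
    lines = ["graph LR"]
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    for table in tables:
        # pragma_foreign_key_list is SQLite's table-valued form of the PRAGMA
        fks = conn.execute(
            'SELECT "table", "from" FROM pragma_foreign_key_list(?)', (table,)
        )
        for other_table, column in fks:
            lines.append(f"    {table} -->|{column}| {other_table}")
    return "\n".join(lines)
```

The returned text can be pasted into a Mermaid code block in Markdown to render the diagram.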

# 7:43 pm / dsl, github, visualization, datasette, observable, mermaid

Feb. 15, 2022

Using SQLite and Datasette with Fly Volumes

A few weeks ago, Fly announced Free Postgres Databases as part of the free tier of their hosting product. Their announcement included this snippet:

[... 1,463 words]

Feb. 17, 2022

redbean (via) “redbean makes it possible to share web applications that run offline as a single-file αcτµαlly pδrταblε εxεcµταblε zip archive which contains your assets. All you need to do is download the redbean.com program below, change the filename to .zip, add your content in a zip editing tool, and then change the extension back to .com”.

redbean is implemented as a single C file with a dazzling array of clever tricks—most impressively, the single executable works on Linux, macOS, Windows and various BSDs!

It embeds Lua, and in June last year it added SQLite too, so self-contained distributable web applications built with redbean can now use Lua and SQLite for dynamic scripting. Performance sounds incredible: “redbean can serve 1 million+ gzip encoded responses per second on a cheap personal computer”.

# 6:01 am / c, lua, sqlite, redbean, cosmopolitan

Feb. 18, 2022

Fullmoon (via) A “fast and minimalistic web framework” written in Lua, based on Redbean. The documentation for this is fantastic, and because it uses Redbean the development experience is to download the Redbean executable (which runs on every platform) and then drop your own Lua scripts into it using zip.

# 6:41 pm / lua, redbean, cosmopolitan

Feb. 20, 2022

Google Drive to SQLite

I released a new tool this week: google-drive-to-sqlite. It’s a CLI utility for fetching metadata about files in your Google Drive and writing them to a local SQLite database.

[... 1,221 words]

Feb. 21, 2022

[history] When I tried this in 1996 (via) “I removed the GIL back in 1996 from Python 1.4...” is the start of a fascinating (supportive) comment by Greg Stein on the promising nogil Python fork that Sam Gross has been putting together. Greg provides some historical context that I’d never heard before, relating to an embedded Python for Microsoft IIS.

# 10:43 pm / gil, history, python

Feb. 23, 2022

Support open source that you use by paying the maintainers to talk to your team

I think I’ve come up with a novel hack for the challenge of getting your company to financially support the open source projects that it uses: reach out to the maintainers and offer them generous speaking fees for remote talks to your engineering team.

[... 645 words]

Feb. 26, 2022

migra (via) This looks like a very handy tool to have around: run “migra postgresql:///a postgresql:///b” and it will detect and output the SQL alter statements needed to modify the first PostgreSQL database schema to match the second. It’s written in Python, running on top of SQLAlchemy.

# 11:23 pm / databases, migrations, postgresql

Feb. 27, 2022

Even then, what does “best” even mean? I think back then I used it a lot more just because I was writing for a food blog every day, and “best” gives you more clicks than “really good.” These days, I don’t really care about clicks, and so I very rarely say something is “best.” I generally go out of my way to say, “This is just what I felt like doing today.”

J. Kenji López-Alt

# 3:57 pm / cooking

Weeknotes: Datasette Tutorials

I published two new tutorials for Datasette this week, both aimed at end users of the web application.

[... 479 words]
