Simon Willison’s Weblog

Subscribe
Atom feed for ben-johnson

10 posts tagged “ben-johnson”

2025

Litestream v0.5.0 is Here (via) I've been running Litestream to backup SQLite databases in production for a couple of years now without incident. The new version has been a long time coming - Ben Johnson took a detour into the FUSE-based LiteFS before deciding that the single binary Litestream approach is more popular - and Litestream 0.5 just landed with this very detailed blog posts describing the improved architecture.

SQLite stores data in pages - 4096 (by default) byte blocks of data. Litestream replicates modified pages to a backup location - usually object storage like S3.

Most SQLite tables have an auto-incrementing primary key, which is used to decide which page the row's data should be stored in. This means sequential inserts to a small table are sent to the same page, which caused previous Litestream to replicate many slightly different copies of that page block in succession.

The new LTX format - borrowed from LiteFS - addresses that by adding compaction, which Ben describes as follows:

We can use LTX compaction to compress a bunch of LTX files into a single file with no duplicated pages. And Litestream now uses this capability to create a hierarchy of compactions:

  • at Level 1, we compact all the changes in a 30-second time window
  • at Level 2, all the Level 1 files in a 5-minute window
  • at Level 3, all the Level 2’s over an hour.

Net result: we can restore a SQLite database to any point in time, using only a dozen or so files on average.

I'm most looking forward to trying out the feature that isn't quite landed yet: read-replicas, implemented using a SQLite VFS extension:

The next major feature we’re building out is a Litestream VFS for read replicas. This will let you instantly spin up a copy of the database and immediately read pages from S3 while the rest of the database is hydrating in the background.

# 3rd October 2025, 3:10 pm / sqlite, fly, litestream, ben-johnson

2023

[On SQLite for production concurrent writes] In general, WAL mode “just works” as Simon said. You just need to make sure you don’t have long running write transactions, although those are somewhat problematic in any database system. Don’t do stuff like starting a write txn and then calling a remote API and then committing. That’ll kill your write throughout.

Ben Johnson

# 26th January 2023, 7:36 pm / sqlite, ben-johnson

2022

Introducing LiteFS (via) LiteFS is the new SQLite replication solution from Fly, now ready for beta testing. It’s from the same author as Litestream but has a very different architecture; LiteFS works by implementing a custom FUSE filesystem which spies on SQLite transactions being written to the journal file and forwards them on to other nodes in the cluster, providing full read-replication. The signature Litestream feature of streaming a backup to S3 should be coming within the next few months.

# 21st September 2022, 6:56 pm / databases, replication, sqlite, fly, litestream, ben-johnson

How the SQLite Virtual Machine Works. The latest entry in Ben Johnson’s series about SQLite internals.

# 7th September 2022, 8:49 pm / databases, sqlite, ben-johnson

How SQLite Scales Read Concurrency (via) Ben Johnson’s series on SQLite internals continues—this time with a detailed explanation of how the SQLite WAL (Write-Ahead Log) is implemented.

# 24th August 2022, 4:16 pm / databases, sqlite, ben-johnson

How SQLite Helps You Do ACID (via) Ben Johnson’s series of posts explaining the internals of SQLite continues with a deep look at how the rollback journal works. I’m learning SO much from this series.

# 10th August 2022, 3:39 pm / sqlite, fly, ben-johnson

SQLite Internals: Pages & B-trees (via) Ben Johnson provides a delightfully clear introduction to SQLite internals, describing the binary format used to store rows on disk and how SQLite uses 4KB pages for both row storage and for the b-trees used to look up records.

# 27th July 2022, 2:57 pm / algorithms, databases, sqlite, ben-johnson

SQLite Happy Hour—a Twitter Spaces conversation about three interesting projects building on SQLite

Yesterday I hosted SQLite Happy Hour. my first conversation using Twitter Spaces. The idea was to dig into three different projects that were doing interesting things on top of SQLite. I think it worked pretty well, and I’m curious to explore this format more in the future.

[... 1,998 words]

2021

Litestream runs continuously on a test server with generated load and streams backups to S3. It uses physical replication so it'll actually restore the data from S3 periodically and compare the checksum byte-for-byte with the current database.

Ben Johnson

# 11th February 2021, 8:50 pm / testing, litestream, ben-johnson

Why I Built Litestream. Litestream is a really exciting new piece of technology by Ben Johnson, who previously built BoltDB, the key-value store written in Go that is used by etcd. It adds replication to SQLite by running a process that converts the SQLite WAL log into a stream that can be saved to another folder or pushed to S3. The S3 option is particularly exciting—Ben estimates that keeping a full point-in-time recovery log of a high write SQLite database should cost in the order of a few dollars a month. I think this could greatly expand the set of use-cases for which SQLite is sensible choice.

# 11th February 2021, 7:25 pm / replication, scaling, sqlite, ben-johnson