31st October 2018 - Link Blog
October 21 post-incident analysis (via) Legitimately fascinating post-mortem by GitHub. They run database masters in multiple data centers with raft for leader election... but when they had an unexpected network split between east and west coast they ended up with several seconds of write that had not been correctly replicated. Cleaning up the resulting mess took the best part of 24 hours! Distributed systems are hard.
Recent articles
- Porting the Moebius 0.2B image inpainting model to run in the browser with Claude Code - 22nd June 2026
- sqlite-utils 4.0rc1 adds migrations and nested transactions - 21st June 2026
- Datasette Apps: Host custom HTML applications inside Datasette - 18th June 2026