July 2021
58 posts: 9 entries, 21 links, 2 quotes, 26 beats
July 17, 2021
Datasette downloads per day (with Observable Plot) (via) I built an Observable notebook that imports PyPI package download data from datasette.io (itself scraped from pypistats.org using a scheduled GitHub Action) and plots it using Observable Plot. Datasette downloads from PyPI apparently jumped from ~800/day in May to ~4,000/day in July—would love to know why!
It doesn’t take much public creativity to stand out as a job candidate
I’ve spent nearly twenty years blogging, giving talks and releasing open source code. It’s been fantastic for my career, and a huge amount of work. But here’s a useful secret: you don’t have to put very much work at all into public creativity in order to stand out as a job candidate.
[... 495 words]July 18, 2021
I've always believed that a book, even a technical book, should try to tell a cohesive story. The challenge is that as Python has grown in popularity, it has really turned into three different languages--each with their own story. There is a whimsical Python for scripting and tinkering, a quirky Python for asynchronous programming and networking, and a serious Python for enterprise applications. Sometimes these stories intersect. Sometimes not.
Organize and Index Your Screenshots (OCR) on macOS (via) Alexandru Nedelcu has a very neat recipe for creating an archive of searchable screenshots on macOS: set the default save location for screenshots to a Dropbox folder, then create a launch agent that runs a script against new files in that folder to run tesseract OCR to convert them into a searchable PDF.
July 19, 2021
Inserting One Billion Rows in SQLite Under a Minute (via) Avinash Sajjanshetty experiments with accelerating writes to a test table in SQLite, using various SQLite pragmas to accelerate inserts followed by a rewrite of Python code to Rust. Also of note: running the exact same code in PyPy saw a 3.5x speed-up!
toyDB: references. toyDB is a “distributed SQL database in Rust, written as a learning project”, with its own implementations of SQL, raft, ACID transactions, B+trees and more. toyDB author Erik Grinaker has assembled a detailed set of references that he used to learn how to build a database—I’d love to see more projects do this, it’s really useful.
Launch HN Instructions (via) The instructions for YC companies that are posting their launch announcement on Hacker News are really interesting to read. “As founders, you’re used to talking to users, customers, and investors. HN readers are not any of those—what they are is peers, and using any of those styles with peers feels clueless and entitled. [...] To interest HN, write in a factual, personal, and modest way about what problem you solve, why it matters, how you solve it, and how you got there.”
July 21, 2021
Weeknotes: sqlite-transform 1.1, Datasette 0.58.1, datasette-graphql 1.5
Work on Project Pelican inspires new features and improvements across a number of different projects.
[... 1,419 words]July 22, 2021
Datasette—an ecosystem of tools for working with small data
This is the transcript and video from a talk I gave at PyGotham 2020 about using SQLite, Datasette and Dogsheep to work with small data.
[... 4,655 words]July 23, 2021
The Tyranny of Spreadsheets (via) In discussing the notorious Excel incident last year when the UK lost track of 16,000 Covid cases due to a .xls row limit, Tim Harford presents a history of the spreadsheet, dating all the way back to Francesco di Marco Datini and double-entry bookkeeping in 1396. A delightful piece of writing.
July 24, 2021
How the Python import system works (via) Remarkably detailed and thorough dissection of how exactly import, modules and packages work in Python—eventually digging right down into the C code. Part of Victor Skvortsov’s excellent “Python behind the scenes” series.
July 25, 2021
July 28, 2021
The Baked Data architectural pattern
I’ve been exploring an architectural pattern for publishing websites over the past few years that I call the “Baked Data” pattern. It provides many of the advantages of static site generators while avoiding most of their limitations. I think it deserves to be used more widely.
[... 1,896 words]July 29, 2021
Clickhouse on Cloud Run (via) Alex Reid figured out how to run Clickhouse against read-only baked data on Cloud Run last year, and wrote up some comprehensive notes.
Weeknotes: datasette-remote-metadata, sqlite-transform --multi
I mentioned Project Pelican (still a codename until the end of the embargo) last week. This week it inspired a new plugin, datasette-remote-metadata.
[... 595 words]
