Simon Willison’s Weblog


November 2021

60 posts: 5 entries, 9 links, 3 quotes, 43 beats

Nov. 15, 2021

Release sqlite-utils 3.18 — Python CLI utility and library for manipulating SQLite databases

Weeknotes: git-history, created for a Git scraping workshop


My main project this week was a 90 minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.

[... 1,239 words]

Nov. 16, 2021

TIL Planning parallel downloads with TopologicalSorter — For [complicated reasons](https://github.com/simonw/datasette/issues/878) I found myself wanting to write Python code to resolve a graph of dependencies and produce a plan for efficiently executing them, in parallel where possible.
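As a rough illustration of the idea in that TIL, here is a minimal sketch using the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to group a dependency graph into batches that could run in parallel. The `plan_batches` helper and the example graph are hypothetical, made up for this illustration, and are not taken from the linked issue or the TIL itself.

```python
from graphlib import TopologicalSorter


def plan_batches(dependencies):
    """Return a list of batches; every node in a batch can run in parallel.

    `dependencies` maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Nodes whose dependencies have all been marked done
        ready = sorter.get_ready()
        batches.append(sorted(ready))  # sorted only to make output stable
        sorter.done(*ready)
    return batches


# Hypothetical graph: "report" needs "users" and "orders",
# and both of those need "config".
example = {
    "report": {"users", "orders"},
    "users": {"config"},
    "orders": {"config"},
}

plan = plan_batches(example)
print(plan)
```

Each inner list is one step of the plan: a real downloader would start everything in a batch concurrently and only call `done()` once those tasks finish.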

Nov. 17, 2021

Release asyncinject 0.1a0 — Run async workflows using pytest-fixtures-style dependency injection
Release asyncinject 0.2a0 — Run async workflows using pytest-fixtures-style dependency injection
Release datasette-graphql 2.0 — Datasette plugin providing an automatic GraphQL API for your SQLite databases

Nov. 18, 2021

Cookiecutter Data Science (via) Some really solid thinking in this documentation for the DrivenData cookiecutter template. They emphasize designing data science projects for repeatability, such that just the src/ and data/ folders can be used to recreate all of the other analysis from scratch. I like the suggestion to give each project a dedicated S3 bucket for keeping immutable copies of the original raw data that might be too large for GitHub.

# 3:21 pm / data-science, cookiecutter

TIL Using cog to update --help in a Markdown README file — My [csvs-to-sqlite README](https://github.com/simonw/csvs-to-sqlite/blob/main/README.md) includes a section that shows the output of the `csvs-to-sqlite --help` command ([relevant issue](https://github.com/simonw/csvs-to-sqlite/issues/82)).
Release csvs-to-sqlite 1.3 — Convert CSV files into a SQLite database
Release s3-credentials 0.6 — A tool for creating credentials for accessing S3 buckets

Many Web3 boosters see themselves as disruptors, but “tokenize all the things” is nothing if not an obedient continuation of “market-ize all the things”, the campaign started in the 1970s, hugely successful, ongoing. I think the World Wide Web was the real rupture — “Where … is the money?”—which Web 2.0 smoothed over and Web3 now attempts to seal totally.

Robin Sloan

# 9:55 pm / robin-sloan, web3

Nov. 19, 2021

Release sqlite-utils 3.19a0 — Python CLI utility and library for manipulating SQLite databases

Nov. 20, 2021

TIL Using build-arg variables with Cloud Run deployments — For [datasette/issues/1522](https://github.com/simonw/datasette/issues/1522) I wanted to use a Docker build argument in a `Dockerfile` that would then be deployed to Cloud Run.
TIL Assigning a custom subdomain to a Fly app — I deployed an app to [Fly](https://fly.io/) and decided to point a custom subdomain to it.
Release datasette-redirect-to-https 0.1 — Datasette plugin that redirects all non-https requests to https
Release datasette 0.59.3 — An open source multi-tool for exploring and publishing data

Nov. 21, 2021

Release sqlite-utils 3.19 — Python CLI utility and library for manipulating SQLite databases
Release git-history 0.4 — Tools for analyzing Git history using SQLite

Nov. 22, 2021

Hurl (via) Hurl is “a command line tool that runs HTTP requests defined in a simple plain text format”—written in Rust on top of curl, it lets you run HTTP requests and then execute assertions against the response, defined using JSONPath or XPath for HTML. It can even assert that responses were returned within a specified duration.

# 3:32 am / curl, http, rust

Release datasette-publish-vercel 0.12 — Datasette plugin for publishing data using Vercel

Weeknotes: Apache proxies in Docker containers, refactoring Datasette

Updates to six major projects this week, plus finally some concrete progress towards Datasette 1.0.

[... 1,630 words]

Introduction to heredocs in Dockerfiles (via) This is a fantastic upgrade to Dockerfile syntax, enabled by BuildKit and a new frontend for executing the Dockerfile that can be specified with a #syntax= directive. I often like to create a standalone Dockerfile that works without needing other files from a directory, so being able to use <<EOF syntax to populate configuration files from inline blocks of code is really handy.

# 5:01 pm / docker

htmlspecialchars was a very early function. Back when PHP had less than 100 functions and the function hashing mechanism was strlen(). In order to get a nice hash distribution of function names across the various function name lengths names were picked specifically to make them fit into a specific length bucket. This was circa late 1994 when PHP was a tool just for my own personal use and I wasn't too worried about not being able to remember the few function names.

Rasmus Lerdorf

# 7:23 pm / php, rasmus-lerdorf

Nov. 25, 2021

TIL Pausing traffic and retrying in Caddy — A pattern I really like for zero-downtime deploys is the ability to "pause" HTTP traffic at the load balancer, such that incoming requests from browsers appear to take a few extra seconds to return, but under the hood they've actually been held in a queue while a backend server is swapped out or upgraded in some way.

PHP 8.1 release notes (via) PHP is gaining “Fibers” for lightweight cooperative concurrency—very similar to Python asyncio. Interestingly you don’t need to use separate syntax like “await fn()” to call them—calls to non-blocking functions are visually indistinguishable from calls to blocking functions. Considering how much additional library complexity has emerged in Python world from having these two different colours of functions it’s noteworthy that PHP has chosen to go in a different direction here.

# 7:53 pm / async, php

Nov. 28, 2021

Release datasette-table 0.1.0 — A Web Component for embedding a Datasette table on a page
TIL Publishing a Web Component to npm — I tried this for the first time today with my highly experimental [datasette-table](https://www.npmjs.com/package/datasette-table) Web Component. Here's [the source code](https://github.com/simonw/datasette-table/tree/0.1.0) for version 0.1.0.

Nov. 29, 2021

TIL Reusing an existing Click tool with register_commands — The [register_commands](https://docs.datasette.io/en/stable/plugin_hooks.html#register-commands-cli) plugin hook lets you add extra sub-commands to the `datasette` CLI tool.

Nov. 30, 2021

Release datasette 0.59.4 — An open source multi-tool for exploring and publishing data
Release s3-credentials 0.7 — A tool for creating credentials for accessing S3 buckets
