Simon Willison’s Weblog

Subscribe

April 2021

41 posts: 6 entries, 5 links, 3 quotes, 27 beats

April 1, 2021

VIAL: Preparing for some collaborative testing

With the Airtable limits fast approaching, I’m going to start leaning heavily on people to help verify that VIAL can do the jobs that it needs to do.

[... 465 words]

If you measure things by foot traffic we [the SFO Museum] are one of the busiest museums in the world. If that is the case we are also one of the busiest museums in the world that no one knows about. Nothing in modern life really prepares you for the idea that a museum should be part of an airport. San Francisco, as I've mentioned, is funny that way.

Aaron Straup Cope

# 10:40 pm / aaron-straup-cope, museums, san-francisco

April 2, 2021

Release airtable-export 0.6 — Export Airtable data to YAML, JSON or SQLite files on disk

April 4, 2021

TIL Useful Markdown extensions in Python — I wanted to render some markdown in Python but with the following extra features:

Animated choropleth of vaccinations by US county

Visit Animated choropleth of vaccinations by US county

Last week I mentioned that I’ve recently started scraping and storing the CDC’s per-county vaccination numbers in my cdc-vaccination-history GitHub repository. This week I used an Observable notebook and d3’s TopoJSON support to render those numbers on an animated choropleth map.

[... 1,138 words]

Spatialite Speed Test. Part of an excellent series of posts about SpatiaLite from 2012—here John C. Zastrow reports on running polygon intersection queries against a 1.9GB database file in 40 seconds without an index and 0.186 seconds using the SpatialIndex virtual table mechanism.

# 4:28 pm / gis, spatialite, sqlite

TIL Language-specific indentation settings in VS Code — When I'm working with Python I like four space indents, but for JavaScript or HTML I like two space indents.

April 5, 2021

Render single selected county on a map (via) Another experiment at the intersection of Datasette and Observable notebooks. This one imports a full Datasette table (3,200 US counties) using streaming CSV and loads that into Observable’s new Search and Table filter widgets. Once you select a single county a second Datasette SQL query (this time retuning JSON) fetches a GeoJSON representation of that county which is then rendered as SVG using D3.

# 4:48 am / d3, datasette, observable

Behind GitHub’s new authentication token formats (via) This is a really smart design. GitHub’s new tokens use a type prefix of “ghp_” or “gho_” or a few others depending on the type of token, to help support mechanisms that scan for accidental token publication. A further twist is that the last six characters of the tokens are a checksum, which means token scanners can reliably distinguish a real token from a coincidental string without needing to check back with the GitHub database. “One other neat thing about _ is it will reliably select the whole token when you double click on it”—what a useful detail!

# 9:28 pm / github, security

April 8, 2021

Release airtable-export 0.7 — Export Airtable data to YAML, JSON or SQLite files on disk

April 9, 2021

Release django-sql-dashboard 0.6a0 — Django app for building dashboards using raw SQL queries
TIL Efficient bulk deletions in Django — I needed to bulk-delete a large number of objects today. Django deletions are relatively inefficient by default, because Django implements its own version of cascading deletions and fires signals for each deleted object.
Release airtable-export 0.7.1 — Export Airtable data to YAML, JSON or SQLite files on disk

April 10, 2021

Release json-post 0.1a0 — Tool for posting JSON to an API, broken into pages
TIL Using unnest() to use a comma-separated string as the input to an IN query — [django-sql-dashboard](https://github.com/simonw/django-sql-dashboard) lets you define a SQL query plus one or more text inputs that the user can provide in order to execute the query.

April 11, 2021

Release json-post 0.2a0 — Tool for posting JSON to an API, broken into pages
Release json-post 0.2 — Tool for posting JSON to an API, broken into pages

In general, relying only on natural keys is a nightmare. Double nightmare if it's PII. Natural keys only work if you are flawlessly omniscient about the domain. And you aren't.

Jacques Chester

# 10:48 pm / databases

April 12, 2021

Porting VaccinateCA to Django

Visit Porting VaccinateCA to Django

As I mentioned back in February, I’ve been working with the VaccinateCA project to try to bring the pandemic to an end a little earlier by helping gather as accurate a model as possible of where the Covid vaccine is available in California and how people can get it.

[... 2,157 words]

Release django-sql-dashboard 0.7a0 — Django app for building dashboards using raw SQL queries

April 14, 2021

TIL Using json_extract_path in PostgreSQL — The `json_extract_path()` function in PostgreSQL can be used to extract specific items from JSON - but I couldn't find documentation for the path language it uses.
Release django-sql-dashboard 0.8a0 — Django app for building dashboards using raw SQL queries
Release django-sql-dashboard 0.8a1 — Django app for building dashboards using raw SQL queries
Release django-sql-dashboard 0.8a2 — Django app for building dashboards using raw SQL queries

Why you shouldn’t use ENV variables for secret data (via) I do this all the time, but this article provides a good set of reasons that secrets in environment variables are a bad pattern—even when you know there’s no multi-user access to the host you are deploying to. The biggest problem is that they often get captured by error handling scripts, which may not have the right code in place to redact them. This article suggests using Docker secrets instead, but I’d love to see a comprehensive write-up of other recommended patterns for this that go beyond applications running in Docker.

# 6:22 pm / security

April 15, 2021

TIL Listing files uploaded to Cloud Build — Today while running `datasette publish cloudrun ...` I noticed the following:

April 18, 2021

TIL Enabling the fuzzystrmatch extension in PostgreSQL with a Django migration — The PostgreSQL [fuzzystrmatch extension](https://www.postgresql.org/docs/13/fuzzystrmatch.html) enables several functions for fuzzy string matching: `soundex()`, `difference()`, `levenshtein()`, `levenshtein_less_equal()`, `metaphone()`, `dmetaphone()` and `dmetaphone_alt()`.

country-coder (via) Given a latitude and longitude, how can you tell what country that point sits within? One way is to do a point-in-polygon lookup against a set of country polygons, but this can be tricky: some countries such as New Zealand have extremely complex outlines, even though for this use-case you don’t need the exact shape of the coastline. country-coder solves this with a custom designed 595KB GeoJSON file with detailed land borders but loosely defined ocean borders. It also comes with a wrapper JavaScript library that provides an API for resolving points, plus useful properties on each country with details like telepohen calling codes and emoji flags.

# 7:37 pm / gis, geojson

Weeknotes: The Aftermath

Some tweets that effectively illustrate my week:

[... 208 words]

April 20, 2021

TIL Usable horizontal scrollbars in the Django admin for mouse users — I got a complaint from a Windows-with-mouse user of a Django admin project I'm working on: they couldn't see the right hand columns in a table without scrolling horizontally, but since the horizontal scrollbar was only available at the bottom of the page they had to scroll all the way to the bottom first in order to scroll sideways.

2021 » April

MTWTFSS
   1234
567891011
12131415161718
19202122232425
2627282930