Simon Willison’s Weblog

Subscribe

August 2020

Aug. 1, 2020

James Bennett on why Django should not support JWT in core (via) The topic of adding JWT support to Django core comes up occasionally—here’s James Bennett’s detailed argument for not doing that. The short version is that the JWT specification isn’t just difficult to implement securely: it’s fundamentally flawed, which results in things like five implementations in three different languages all manifesting the same vulnerability. Third party modules exist that add JWT support to Django, but baking it into core would act as a form of endorsement and Django’s philosophy has always been to encourage people towards best practices.

# 12:28 am / security, django, james-bennett, jwt

The impact of crab mentality on performance was quantified by a New Zealand study in 2015 which demonstrated up to an 18% average exam result improvement for students when their grades were reported in a way that prevented others from knowing their position in published rankings.

Crab mentality on Wikipedia

# 4:25 pm / psychology

sqlite-utils 2.14 (via) I finally figured out porter stemming with SQLite full-text search today—it turns out it’s as easy as adding tokenize=’porter’ to the CREATE VIRTUAL TABLE statement. So I just shipped sqlite-utils 2.14 with a tokenize= option (plus the ability to insert binary file data from stdin).

# 9:19 pm / projects, search, sqlite, full-text-search, sqlite-utils

Aug. 2, 2020

How a Cheese Goes Extinct (via) Ruby Tandoh writes for the New Yorker about the culture, history and anthropology of cheesemaking through the lens of the British cheese industry. I learned that two of my favourite British cheeses—Tymsboro and Innes Log, have sadly ceased production. Beautifully written.

# 5:51 pm / cheese, new-yorker

When you talk with cheese aficionados, it doesn’t usually take long for the conversation to veer this way: away from curds, whey, and mold, and toward matters of life and death. With the zeal of nineteenth-century naturalists, they discuss great lineages and endangered species, painstakingly cataloguing those cheeses that are thriving and those that are lost to history.

Ruby Tandoh

# 6:57 pm / cheese

Aug. 5, 2020

Zero Downtime Release: Disruption-free Load Balancing of a Multi-Billion User Website (via) I remain fascinated by techniques for zero downtime deployment—once you have it working it makes shipping changes to your software so much less stressful, which means you can iterate faster and generally be much more confident in shipping code.

Facebook have invested vast amounts of effort into getting this right, and their new paper for the ACM SIGCOMM conference goes into detail about how it all works.

# 3:27 am / deployment, zero-downtime

Aug. 7, 2020

GraphQL in Datasette with the new datasette-graphql plugin

Visit GraphQL in Datasette with the new datasette-graphql plugin

This week I’ve mostly been building datasette-graphql, a plugin that adds GraphQL query support to Datasette.

[... 1,249 words]

Design Docs at Google. Useful description of the format used for software design docs at Google—informal documents of between 3 and 20 pages that outline the proposed design of a new project, discuss trade-offs that were considered and solicit feedback before the code starts to be written.

# 4:31 pm / google, documentation

Pysa: An open source static analysis tool to detect and prevent security issues in Python code (via) Interesting new static analysis tool for auditing Python for security vulnerabilities—things like SQL injection and os.execute() calls. Built by Facebook and tested extensively on Instagram, a multi-million line Django application.

# 8:50 pm / security, python, facebook, staticanalysis, django

Aug. 8, 2020

COVID-19 attacks our physical bodies, but also the cultural foundations of our lives, the toolbox of community and connectivity that is for the human what claws and teeth represent to the tiger.

Wade Davis

# 3:48 pm / covid19

Aug. 9, 2020

Datasette 0.46 (via) I just released Datasette 0.46 with a security fix for an issue involving CSRF tokens on canned query pages, plus a new debugging tool, improved file downloads and a bunch of other smaller improvements.

# 4:57 pm / projects, security, datasette

Aug. 13, 2020

We’re generally only impressed by things we can’t do - things that are beyond our own skill set. So, by definition, we aren’t going to be that impressed by the things we create. The end user, however, is perfectly able to find your work impressive.

@gamemakerstk

# 3:12 pm / productivity

Weeknotes: Installing Datasette with Homebrew, more GraphQL, WAL in SQLite

Visit Weeknotes: Installing Datasette with Homebrew, more GraphQL, WAL in SQLite

This week I’ve been working on making Datasette easier to install, plus wide-ranging improvements to the Datasette GraphQL plugin.

[... 1,009 words]

Aug. 19, 2020

Announcing the Consortium for Python Data API Standards (via) Interesting effort to unify the fragmented DataFrame API ecosystem, where increasing numbers of libraries offer APIs inspired by Pandas that imitate each other but aren’t 100% compatible. The announcement includes some very clever code to support the effort: custom tooling to compare the existing APIs, and an ingenious GitHub Actions setup to run traces (via sys.settrace), derive type signatures and commit those generated signatures back to a repository.

# 5:48 am / standards, data-science, github-actions, python

Aug. 21, 2020

Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database

Visit Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database

This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.

[... 1,294 words]

Why weekly? You want to keep your finger on the pulse of what’s really going on. When 1:1s are scheduled bi-weekly, and either of you have to cancel, you’ll likely be going a month between conversations and that is far too long to go without having a 1:1 with your direct report. Think of how much happens in a month. You don’t want to be that far behind!

Adrienne Lowe

# 5:02 pm / management

California Protected Areas Database in Datasette (via) I built this yesterday: it’s a Datasette interface on top of the CPAD 2020 GIS database of protected areas in California maintained by GreenInfo Network. This was a useful excuse to build a GitHub Actions flow that builds a SpatiaLite database using my shapefile-to-sqlite tool, and I fixed a few bugs in my datasette-leaflet-geojson plugin as well.

# 11:15 pm / shapefiles, github-actions, datasette, projects, california, spatialite, gis

Aug. 28, 2020

Weeknotes: California Protected Areas in Datasette

Visit Weeknotes: California Protected Areas in Datasette

This week I built a geospatial search engine for protected areas in California, shipped datasette-graphql 1.0 and started working towards the next milestone for Datasette Cloud.

[... 1,099 words]

Mexican bean shakshuka

Visit Mexican bean shakshuka

The first time I ever tried a shakshuka I cooked it from this recipe by J. Kenji López-Alt. It quickly became my favourite breakfast.

[... 448 words]

Aug. 29, 2020

airtable-export. I wrote a command-line utility for exporting data from Airtable and dumping it to disk as YAML, JSON or newline delimited JSON files. This means you can backup an Airtable database from a GitHub Action and get a commit history of changes made to your data.

# 9:48 pm / projects, json, airtable, yaml

2020 » August

MTWTFSS
     12
3456789
10111213141516
17181920212223
24252627282930
31