March 2020
March 4, 2020
Weeknotes: datasette-ics, datasette-upload-csvs, datasette-configure-fts, asgi-csrf
I’ve been preparing for the NICAR 2020 Data Journalism conference this week which has lead me into a flurry of activity across a plethora of different projects and plugins.
[... 834 words]March 5, 2020
Millions of tiny databases. Fascinating, detailed review of a paper that describes Amazon’s Physalia, a distributed configuration store designed to provide extremely high availability coordination for Elastic Block Store replication. My eyebrows raised at “Physalia is designed to offer consistency and high-availability, even under network partitions.” since that’s such a blatant violation of CAP theorem, but it later justifies it like so: “One desirable property therefore, is that in the event of a partition, a client’s Physalia database will be on the same side of the partition as the client. Clever placement of cells across nodes can maximise the chances of this.”
March 7, 2020
I called it normalization because then President Nixon was talking a lot about normalizing relations with China. I figured that if he could normalize relations, so could I.
March 9, 2020
datasette-search-all: a new plugin for searching multiple Datasette tables at once
I just released a new plugin for Datasette, and it’s pretty fun. datasette-search-all is a plugin written mostly in JavaScript that executes the same search query against every searchable table in every database connected to your Datasette instance.
[... 819 words]The unexpected Google wide domain check bypass (via) Fantastic story of discovering a devious security vulnerability in a bunch of Google products stemming from a single exploitable regular expression in the Google closure JavaScript library.
March 11, 2020
Weeknotes: COVID-19 numbers in Datasette
COVID-19, the disease caused by the novel coronavirus, gets more terrifying every day. Johns Hopkins Center for Systems Science and Engineering (CSSE) have been collating data about the spread of the disease and publishing it as CSV files on GitHub.
[... 644 words]March 12, 2020
Announcing Daylight Map Distribution. Mike Migurski announces a new distribution of OpenStreetMap: a 42GB dump of the version of the data used by Facebook, carefully moderated to minimize the chance of incorrect or maliciously offensive edits. Lots of constructive conversation in the comments about the best way for Facebook to make their moderation decisions more available to the OSM community.
New governance model for the Django project. This has been under discussion for a long time: I’m really excited to see it put into action. It’s difficult to summarize, but they key effect should be a much more vibrant, active set of people involved in making decisions about the framework.
March 18, 2020
Weeknotes: this week was absurd
As of this morning, San Francisco is in a legally mandated shelter-in-place. I can hardly remember what life was like seven days ago. It’s been a very long, very crazy week. This was not a great week for getting stuff done.
[... 246 words]March 19, 2020
datasette-publish-fly (via) Fly is a neat new Docker hosting provider with a very tempting pricing model: Just $2.67/month for their smallest always-on instance, and they give each user $10/month in free credit. datasette-publish-fly is the first plugin I’ve written using the publish_subcommand plugin hook, which allows extra hosting providers to be added as publish targets. Install the plugin and you can run “datasette publish fly data.db” to deploy SQLite databases to your Fly account.
Django: Added support for asynchronous views and middleware (via) An enormously consequential feature just landed in Django, and is set to ship as part of Django 3.1 in August. Asynchronous views will allow Django applications to define views using “async def myview(request)”—taking full advantage of Python’s growing asyncio ecosystem and providing enormous performance improvements for Django sites that do things like hitting APIs over HTTP. Andrew has been puzzling over this for ages and it’s really exciting to see it land in a form that should be usable in a stable Django release in just a few months.
March 21, 2020
hacker-news-to-sqlite (via) The latest in my Dogsheep series of tools: hacker-news-to-sqlite uses the Hacker News API to fetch your comments and submissions from Hacker News and save them to a SQLite database.
March 25, 2020
Weeknotes: Datasette 0.39 and many other projects
This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.
[... 806 words]March 26, 2020
Slack’s not specifically a “work from home” tool; it’s more of a “create organizational agility” tool. But an all-at-once transition to remote work creates a lot of demand for organizational agility.
Making Datasets Fly with Datasette and Fly (via) It’s always exciting to see a Datasette tutorial that wasn’t written by me! This one is great—it shows how to load Central Park Squirrel Census data into a SQLite database, explore it with Datasette and then publish it to the Fly hosting platform using datasette-publish-fly and datasette-cluster-map.
March 27, 2020
PostGraphile: Production Considerations. PostGraphile is a tool for building a GraphQL API on top of an existing PostgreSQL schema. Their “production considerations” documentation is particularly interesting because it directly addresses some of my biggest worries about GraphQL: the potential for someone to craft an expensive query that ties up server resources. PostGraphile suggests a number of techniques for avoiding this, including a statement timeout, a query allowlist, pagination caps and (in their “pro” version) a cost limit that uses a calculated cost score for the query.