Django admin customization, JSON in our PostgreSQL
25th February 2021
Originally posted to my internal blog at VaccinateCA
My progress slowed a bit today as I started digging into some things I’m less familiar with—but I’ve found some tricks that I think will help us out a lot.
Django admin customization
In the past I’ve used the Django admin mostly as a database debugging tool, on the basis that once you get deep into building out an interface that’s more than just a very basic CMS you’re better off rolling something from scratch.
Today my opinion changed. I think the Django admin may be the solution to a LOT of our problems, with very little extra customization.
I was looking at a simple feature request: show a summary of calls made by a specific reporter. It turns out adding a custom templated block of text to an existing Django admin “change item” page is trivial, using the pattern I wrote up here:
The implementation for this is tiny. We can use this pattern to add SO much depth to our admin pages—and Django’s default permission system is robust enough that we can give users access to these pages without them being able to make edits.
I also added a column to the reporters table showing the number of calls each reporter has made, and made that column sortable! Implementation here.
Once we’re fully on Django we’ll be able to turn around features like this in very little time. You don’t need to be a Django expert to build these either—a tiny bit of Python and HTML knowledge should be enough to productively modify this interface.
JSON in our PostgreSQL
My main goal for the day was to tighten up the call reports importer script I wrote yesterday.
My big breakthrough on this came after a long and super-valuable conversation with Nicholas Schiefer, who’s been heavily involved in the growth of our Airtable schema ever since the project started.
Our Airtable data is complicated, because the way we write data to it has constantly evolved. Before the launch of help.vaccinateca
the data was all entered through a custom Airtable app, and many of the fields we are capturing now weren’t being captured just a few weeks ago.
Since writing a one-off importer that patches over all of these differences in a single go is virtually impossible, we decided to try an alternative track: my importer now saves the entire original Airtable JSON to a PostgreSQL JSON column (using Django 3.1’s brand new JSONField).
I taught the Django Admin to pretty-print the JSON (implementation here):
The most obvious value of this is in debugging—it’s much easier to look at a record now and compare it to the Airtable version.
More importantly: if we make a mistake in the importer code today and don’t notice for six months, that’s fine! We can re-backfill against the new lessons we have learned using the Airtable JSON data that we’ve already stored.
The cost? ~20,000 database records with a few extra KB of data stored against them. That’s totally worth it.
Querying JSON
PostgreSQL has a bunch of features for querying into JSON fields which I haven’t really explored before. I decided to try them out.
They’re incredible. Here’s a query that shows the callers who have made the most calls, based on extracting the {"Reported by": {"name": "NAME"}}
nested field from that JSON column:
select
jsonb_extract_path(
airtable_json, 'Reported by', 'name'
) as name,
count(*) as n
from call_report
group by name
order by n desc;
Even more exciting... here’s a query that counts the keys that have been used in ALL of the JSON returned from Airtable:
SELECT
jsonb_object_keys(airtable_json) AS key, count(*)
FROM call_report GROUP BY key;
And here’s what it outputs:
JSON key | Times used |
---|---|
Affiliation (from Location) | 21589 |
airtable_createdTime | 21589 |
airtable_id | 21589 |
Appointments by phone? | 466 |
Appointment scheduling instructions | 2526 |
auth0_reporter_id | 1583 |
auth0_reporter_name | 1583 |
auth0_reporter_roles | 1583 |
Availability | 21589 |
County (from Location) | 21589 |
Do not call until | 3116 |
external_reports_base_external_report_id | 3139 |
Hour | 21589 |
ID | 21589 |
Internal Notes | 18258 |
is_latest_report_for_location | 21589 |
is_pending_review | 18 |
Location | 21589 |
location_id | 21589 |
location_latest_eva_report_time | 13443 |
location_latest_report_id | 21589 |
location_latest_report_time | 21354 |
Location Type (from Location) | 21588 |
Name (from Location) | 21589 |
Notes | 3359 |
Number of Reports (from Location) | 21589 |
parent_eva_report | 958 |
parent_external_report | 2410 |
Phone | 82 |
Reported by | 21589 |
report_id | 21589 |
Report Type | 21589 |
soft-dropped-column: Vaccines available? | 15948 |
time | 21589 |
tmp_eva_flips | 21589 |
Vaccine demand | 733 |
Vaccine demand notes | 4 |
This is amazingly useful data for the importer script that I’m writing!
The query also executes in about 800ms against the cheapest PostgreSQL database server that Heroku offer—doing a deep full table scan against all 22,000 imported records.
Based on how powerful is, I’m now thinking that we should go all-in on JSON in our database. Imagine if every scraper we were running dumped its full scraped JSON data into PostgreSQL—we could join arbitrary scraped data against our other tables to figure out if there are any new locations.
My biggest concern about replacing Airtable is that we’ll lose the amazing flexibility it’s given us. I think JSON columns can help bridge that gap.
Call targeting: the most interesting problem
Another topic that came out of my conversation with Nicholas: I had not seen quite how ingenious the way call targeting works is. This is SUCH a smart usage of Airtable!
The short version: call targeting (the logic that decides which number a volunteer should be asked to call) is powered by Airtable views, with really clever application of Airtable’s filters to help build up the call lists.
We need to maintain our ability to smartly target where the calls go, and ideally make it even better. This is going to be a really fun problem to solve!
More recent articles
- Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac - 12th November 2024
- Visualizing local election results with Datasette, Observable and MapLibre GL - 9th November 2024
- Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5 - 7th November 2024