Building a Covid sewage Twitter bot (and other weeknotes)
18th April 2022
I built a new Twitter bot today: @covidsewage. It tweets a daily screenshot of the latest Covid sewage monitoring data published by Santa Clara county.
I’m increasingly distrustful of Covid numbers as fewer people are tested in ways that feed into the official statistics. But the sewage numbers don’t lie! As the Santa Clara county page explains:
SARS-CoV-2 (the virus that causes COVID-19) is shed in feces by infected individuals and can be measured in wastewater. More cases of COVID-19 in the community are associated with increased levels of SARS-CoV-2 in wastewater, meaning that data from wastewater analysis can be used as an indicator of the level of transmission of COVID-19 in the community.
That page also embeds some beautiful charts of the latest numbers, powered by an embedded Observable notebook built by Zan Armstrong.
Once a day, my bot tweets a screenshot of those latest charts that looks like this:
How the bot works
The bot runs once a day using this scheduled GitHub Actions workflow.
Here’s the bit of the workflow that generates the screenshot:
- name: Generate screenshot with shot-scraper
  run: |-
    shot-scraper https://covid19.sccgov.org/dashboard-wastewater \
      -s iframe --wait 3000 -b firefox --retina -o /tmp/covid.png
This uses my shot-scraper screenshot tool, described here previously. It takes a retina screenshot just of the embedded iframe, and uses Firefox because for some reason the default Chromium screenshot failed to load the embed.
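If you would rather drive the same screenshot from a Python script than from a workflow step, a minimal sketch might look like the following. It only assembles the command shown above, using the same flags. The function name `build_screenshot_command` and its defaults are my own invention for illustration and are not part of shot-scraper.

```python
import shlex


def build_screenshot_command(url, output, selector="iframe", wait_ms=3000,
                             browser="firefox", retina=True):
    """Assemble a shot-scraper command line for a single-element screenshot.

    Hypothetical helper: it only uses the flags from the workflow step above.
    """
    command = ["shot-scraper", url,
               "-s", selector,          # CSS selector for the element to capture
               "--wait", str(wait_ms),  # milliseconds to wait before the shot
               "-b", browser]           # browser engine to use
    if retina:
        command.append("--retina")
    command += ["-o", output]
    return command


command = build_screenshot_command(
    "https://covid19.sccgov.org/dashboard-wastewater", "/tmp/covid.png"
)
print(shlex.join(command))

# To actually take the screenshot (needs shot-scraper and Firefox installed):
# import subprocess
# subprocess.run(command, check=True)
```

Keeping the command as a list rather than a single string means it can be passed straight to `subprocess.run` without any shell quoting concerns.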
This bit sends the tweet:
- name: Tweet the new image
  env:
    TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }}
    TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }}
    TWITTER_ACCESS_TOKEN_KEY: ${{ secrets.TWITTER_ACCESS_TOKEN_KEY }}
    TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }}
  run: |-
    tweet-images "Latest Covid sewage charts for the SF Bay Area" \
      /tmp/covid.png --alt "Screenshot of the charts" > latest-tweet.md
tweet-images is a tiny new tool I built for this project. It uses the python-twitter library to send a tweet with one or more images attached to it.
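To give a rough idea of what a tool like this involves, here is a sketch of sending a tweet with images and alt text through python-twitter. It is not the tweet-images source code. The helper names `credentials_from_env`, `tweet_with_images` and `main` are hypothetical, and the library calls (`twitter.Api`, `UploadMediaSimple`, `PostMediaMetadata`, `PostUpdate`) are written as I understand that library, so check its documentation before relying on them.

```python
import os

# Maps twitter.Api keyword arguments to the environment variables
# used in the workflow step above.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token_key": "TWITTER_ACCESS_TOKEN_KEY",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def credentials_from_env(environ=os.environ):
    """Collect the four credentials, failing early if any are missing."""
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {kwarg: environ[var] for kwarg, var in CREDENTIAL_VARS.items()}


def tweet_with_images(api, text, images):
    """Upload each (path, alt_text) pair, then post one tweet with them all.

    `api` is expected to behave like a python-twitter `twitter.Api` object.
    """
    media_ids = []
    for path, alt_text in images:
        media_id = api.UploadMediaSimple(path)
        if alt_text:
            api.PostMediaMetadata(media_id, alt_text)
        media_ids.append(media_id)
    return api.PostUpdate(text, media=media_ids)


def main():
    import twitter  # provided by the python-twitter package

    api = twitter.Api(**credentials_from_env())
    status = tweet_with_images(
        api,
        "Latest Covid sewage charts for the SF Bay Area",
        [("/tmp/covid.png", "Screenshot of the charts")],
    )
    print(status.id)
```

Passing the `api` object in as an argument keeps the posting logic testable with a stand-in object, so nothing is sent to Twitter until `main()` is called with real credentials.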
The hardest part of the project was getting the credentials for sending tweets with the bot! I had to go through Twitter’s manual verification flow, presumably because I checked the “bot” option when I applied for the new developer account. I also had to figure out how to extract all four credentials (with write permissions) from the Twitter developer portal.
I wrote up full notes on this in a TIL: How to get credentials for a new Twitter bot.
Datasette for geospatial analysis
I stumbled across datanews/amtrak-geojson, a GitHub repository containing GeoJSON files (from 2015) showing all of the Amtrak stations and sections of track in the USA.
I decided to try exploring it using my geojson-to-sqlite tool, which revealed a bug triggered by records with a geometry but no properties. I fixed that in version 1.0.1, and later shipped version 1.1 with improvements by Chris Amico.
In exploring the Amtrak data I found myself needing to learn how to use the SpatiaLite GUnion function to aggregate multiple geometries together. This resulted in a detailed TIL on using GUnion to combine geometries in SpatiaLite, which further evolved as I used it as a chance to learn how to use Chris’s datasette-geojson-map and sqlite-colorbrewer plugins.
This was so much fun that I was inspired to add a new “uses” page to the official Datasette website: Datasette for geospatial analysis now gathers together links to plugins, tools and tutorials for handling geospatial data.
sqlite-utils 3.26
I’ll quote the release notes for sqlite-utils 3.26 in full:
- New errors=r.IGNORE/r.SET_NULL parameter for the r.parsedatetime() and r.parsedate() convert recipes. (#416)
- Fixed a bug where --multi could not be used in combination with --dry-run for the convert command. (#415)
- New documentation: Using a convert() function to execute initialization. (#420)
- More robust detection for whether or not deterministic=True is supported. (#425)
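The first item in those notes is easiest to understand with an example. The sketch below imitates the behaviour of the new errors= parameter using only the standard library. It is not the sqlite-utils implementation: the `IGNORE` and `SET_NULL` sentinels and this `parsedate()` function are stand-ins I wrote for illustration, and the fixed `%d/%m/%Y` format is an assumption (the real recipes accept a much wider range of date formats).

```python
from datetime import datetime

# Stand-in sentinels, playing the role of r.IGNORE and r.SET_NULL
IGNORE = object()
SET_NULL = object()


def parsedate(value, fmt="%d/%m/%Y", errors=None):
    """Convert a date string to ISO format, with configurable error handling."""
    try:
        return datetime.strptime(value, fmt).date().isoformat()
    except ValueError:
        if errors is IGNORE:
            return value   # leave the unparseable value unchanged
        if errors is SET_NULL:
            return None    # replace the unparseable value with null
        raise              # default: surface the error


values = ["18/04/2022", "not a date"]
print([parsedate(v, errors=IGNORE) for v in values])
print([parsedate(v, errors=SET_NULL) for v in values])
```

With the default errors=None a single bad value stops the whole conversion, which is why the two new options are handy for messy real-world columns.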
shot-scraper 0.12
In addition to support for WebKit contributed by Ryan Murphy, shot-scraper 0.12 adds options for taking a screenshot that encompasses all of the elements on a page that match a CSS selector.
It also adds a new --js-selector option, suggested by Tony Hirst. This covers the case where you want to take a screenshot of an element on the page that cannot be easily specified using a CSS selector. For example, this expression takes a screenshot of the first paragraph on a page that includes the text “shot-scraper”:
shot-scraper https://simonwillison.net/2022/Apr/8/weeknotes/ \
--js-selector 'el.tagName == "P" && el.innerText.includes("shot-scraper")' \
--padding 15 --retina
And an airship museum!
I finally got to add another listing to my www.niche-museums.com website about small or niche museums I have visited.
The Moffett Field Historical Society museum in Mountain View is situated in the shadow of Hangar One, an airship hangar built in 1933 to house the mighty USS Macon.
It’s the absolute best kind of local history museum. Our docent was a retired pilot who had landed planes on aircraft carriers using the kind of equipment now on display in the museum. They had dioramas and models. They even had a model railway. It was superb.
Releases this week
- tweet-images: 0.1.1 - (2 releases total) - 2022-04-17
  Send tweets with images from the command line
- asyncinject: 0.3 - (5 releases total) - 2022-04-16
  Run async workflows using pytest-fixtures-style dependency injection
- geojson-to-sqlite: 1.1.1 - (11 releases total) - 2022-04-13
  CLI tool for converting GeoJSON files to SQLite (with SpatiaLite)
- sqlite-utils: 3.26 - (99 releases total) - 2022-04-13
  Python CLI utility and library for manipulating SQLite databases
- summarize-template: 0.1 - 2022-04-13
  Show a summary of a Django or Jinja template
- shot-scraper: 0.12 - (13 releases total) - 2022-04-11
  Tools for taking automated screenshots of websites
TIL this week