6th October 2025
Two of my public Datasette instances - for my TILs and my blog's backup mirror - were getting hammered with misbehaving bot traffic today. Scaling them up to more Fly instances got them running again but I'd rather not pay extra just so bots can crawl me harder.
The log files showed the main problem was facets: Datasette provides these by default on the table page, but they can be combined in ways that keep poorly written crawlers busy visiting different variants of the same page over and over again.
So I turned those off. I'm now running those instances with --setting allow_facet off (described here), and my logs are full of lines that look like this. The "400 Bad Request" means a bot was blocked from loading the page:
GET /simonwillisonblog/blog_entry?_facet_date=created&_facet=series_id&_facet_size=max&_facet=extra_head_html&_sort=is_draft&created__date=2012-01-30 HTTP/1.1" 400 Bad Request
Recent articles
- Porting the Moebius 0.2B image inpainting model to run in the browser with Claude Code - 22nd June 2026
- sqlite-utils 4.0rc1 adds migrations and nested transactions - 21st June 2026
- Datasette Apps: Host custom HTML applications inside Datasette - 18th June 2026