Simon Willison’s Weblog

Wednesday, 31st August 2022

Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator. Andy Baio and I collaborated on an investigation into the training set used for Stable Diffusion. I built a Datasette instance with 12 million image records sourced from the LAION-Aesthetics v2 6+ aesthetic score data used as part of the training process, along with a tool that lets people search and explore the data. Andy analyzed the data extensively, looking at things like the domains the images were scraped from and the names of celebrities and artists represented in it. His write-up explains our project in detail and describes some of the patterns we’ve uncovered so far.

# 2:10 am / machine-learning, ai, stable-diffusion, generative-ai, laion, training-data

Farmbound, or how I built an app in 2022. Stuart Langridge describes the architecture and decision process behind his new mobile web game, Farmbound.

# 11:23 pm / stuart-langridge, web
