Wikidata is a Giant Crosswalk File. Drew Breunig shows how to take the 140GB Wikidata JSON export, use sed 's/,$//'
to convert it to newline-delimited JSON, then use DuckDB to run queries and extract external identifiers, including a query that pulls out 500MB of latitude and longitude points.
Recent articles
- ChatGPT will happily write you a thinly disguised horoscope - 15th October 2024
- OpenAI DevDay: Let’s build developer tools, not digital God - 2nd October 2024
- OpenAI DevDay 2024 live blog - 1st October 2024