25th June 2021 - Link Blog
Querying Parquet using DuckDB (via) DuckDB is a relatively new SQLite-style database (released as an embeddable library) with a focus on analytical queries. This tutorial really made the benefits click for me: it ships with support for the Parquet columnar data format, and you can use it to execute SQL queries directly against Parquet files—e.g. “SELECT COUNT(*) FROM ’taxi_2019_04.parquet’”. Performance against large files is fantastic, and the whole thing can be installed just using “pip install duckdb”. I wonder if faceting-style group/count queries (pretty expensive with regular RDBMSs) could be sped up with this?
Recent articles
- Can coding agents relicense open source through a “clean room” implementation of code? - 5th March 2026
- Something is afoot in the land of Qwen - 4th March 2026
- I vibe coded my dream macOS presentation app - 25th February 2026