Group thousands of similar spreadsheet text cells in seconds (via) Luke Whyte explains how to efficiently group similar text columns in a table (Walmart and Wal-mart for example) using a clever combination of TF/IDF, sparse matrices and cosine similarity. Includes the clearest explanation of cosine similarity for text I’ve seen—and Luke wrote a Python library, textpack, that implements the described pattern.
Recent articles
- Exploring Promptfoo via Dave Guarino's SNAP evals - 24th April 2025
- AI assisted search-based research actually works now - 21st April 2025
- Maybe Meta's Llama claims to be open source because of the EU AI act - 19th April 2025