Simon Willison’s Weblog

Subscribe

Simple, Fast, and Scalable Reverse Image Search Using Perceptual Hashes and DynamoDB. Christopher Bong provides a clear explanation of how perceptual hashes can be used to create a string representing the visual content of an image, such that similar images can be identified by calculating a hamming distance between those hashes. He then explains how they built a large-scale system for this at Canva on top of DynamoDB, by splitting those strings into smaller hash windows and using those for efficient bulk lookups of similar candidates.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe