Simon Willison’s Weblog

Subscribe

mrjob: Distributed Computing for Everybody. Yelp use MapReduce with Hadoop (running on Amazon’s EMR service) to power all sorts of interesting features on the site, including spelling suggestions, review highlights, top searches and “people who viewed X also viewed...”. mrjob is their new open source Python framework for writing MapReduce jobs against the Hadoop streaming API.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe