9 items tagged “hadoop”
2009
Today, Facebook counts 29% of its employees (and growing!) as Hive users. More than half (51%) of those users are outside of Engineering. They come from distinct groups like User Operations, Sales, Human Resources, and Finance. Many of them had never used a database before working here. Thanks to Hive, they are now all data ninjas who are able to move fast and make great decisions with data.
Introducing Cloudera Desktop. It’s a GUI for Hadoop, and under the hood is a whole stack of open source software, including Python, Django, MooTools, Twisted, lxml, CherryPy, Mako, Java and AspectJ.
Finding similar items with Amazon Elastic MapReduce, Python, and Hadoop streaming. Tutorial for running Hadoop jobs on Elastic MapReduce using Python and the 2005 Audioscrobbler dataset.
Amazon Elastic MapReduce (via) Hadoop as a service. Basically a web based GUI around Hadoop—you could roll this yourself on EC2 but for a small markup on regular EC2 prices you get to avoid the extra work setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded in to S3 before firing off the job.
2008
Cascading. A Java API abstraction layer over Hadoop that lets developers think in terms of pipes and filters rather than map/reduce. The Cascading developers claim that this model is easier to understand and less error prone.
3 and 1/2 minutes to sort a Terabyte, and a look at Hadoop’s code structure. Bill de hÓra uses some clever static analysis tools to explore Hadoop’s 100,000+ lines of code.
Python + Hadoop = Flying Circus Elephant. Last.fm have released Dumbo, a Python module that lets you easily write Hadoop map/reduce tasks using Python and generators.
2007
Writing An Hadoop MapReduce Program In Python. Hadoop (the open source map/reduce framework) can interact with any program that reads from stdin and outputs on stdout—so it’s trivial to drop in Python scripts for the map and reduce steps.
2006
Hadoop. Open-source Google File System / map-reduce equivalent. Apparently scales amazingly well.