IT Management and Cloud Blog

Some Hadoop Links

By John | September 11, 2008

Write a Hadoop MapReduce job in any programming language

MapReduce is a method for writing software that can be parallelized across thousands of machines to process enormous amounts of data. For instance, let’s say you want to count the number of referrals, by domain, in all the world’s Apache server logs.
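
As a rough sketch of the referral-counting example, the map step can emit a (domain, 1) pair for each log line and the reduce step can sum the pairs per domain. The function names, the regular expression, and the assumption that the logs use Apache's "combined" format are illustrative choices, not taken from the linked article; run_local only imitates the shuffle-and-sort step in a single process.

```python
import re
from itertools import groupby
from urllib.parse import urlparse

# Assumption: Apache "combined" log format, where the quoted request is
# followed by the status code, the response size, and the quoted referrer.
REFERRER_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)"')


def map_line(line):
    """Map step: yield (domain, 1) if the log line carries a referrer URL."""
    match = REFERRER_RE.search(line)
    if match:
        domain = urlparse(match.group(1)).hostname
        if domain:  # a referrer of "-" has no hostname and is skipped
            yield domain, 1


def reduce_pairs(pairs):
    """Reduce step: sum counts per domain; pairs must be sorted by domain."""
    for domain, group in groupby(pairs, key=lambda pair: pair[0]):
        yield domain, sum(count for _, count in group)


def run_local(lines):
    """Imitate map -> shuffle/sort -> reduce in one process (hypothetical helper)."""
    mapped = [pair for line in lines for pair in map_line(line)]
    return dict(reduce_pairs(sorted(mapped)))
```

On a real cluster the sort between the two steps is done by the framework, which is what lets the map and reduce functions run independently on many machines.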

Cascading – Answer For The Question Few Would Ask

I came across a framework called Cascading that aims to provide an abstraction on top of Hadoop for defining complex workflows without dealing with MapReduce details. It does so by introducing a new DSL of tuples, pipes and flows that is automatically translated into MapReduce operations upon execution.
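
To make the idea of a pipe-based DSL concrete, here is a toy sketch in Python. It is not Cascading's API (Cascading is a Java library); the Pipe class, its each, group_by and every methods, and run_flow are all hypothetical names. The sketch only shows the general pattern: a pipeline is declared as a chain of steps and executed later as map-like, grouping, and aggregation stages, here inside a single process rather than as Hadoop jobs.

```python
from collections import defaultdict


class Pipe:
    """Hypothetical toy pipe: records a chain of steps to run later."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def each(self, fn):
        # fn turns one tuple into zero or more tuples (map-like step)
        return Pipe(self.steps + (("each", fn),))

    def group_by(self, key_fn):
        # key_fn picks the grouping key of a tuple (shuffle-like step)
        return Pipe(self.steps + (("group_by", key_fn),))

    def every(self, agg_fn):
        # agg_fn(key, tuples) returns one tuple per group (reduce-like step)
        return Pipe(self.steps + (("every", agg_fn),))


def run_flow(pipe, tuples):
    """Execute the recorded steps in order, in a single process."""
    data = list(tuples)
    groups = None
    for kind, fn in pipe.steps:
        if kind == "each":
            data = [out for t in data for out in fn(t)]
        elif kind == "group_by":
            groups = defaultdict(list)
            for t in data:
                groups[fn(t)].append(t)
        elif kind == "every":
            if groups is None:
                raise ValueError("every() must follow group_by()")
            data = [fn(key, ts) for key, ts in sorted(groups.items())]
            groups = None
    return data


# Word count declared as a pipeline; nothing runs until run_flow is called.
word_count = (
    Pipe()
    .each(lambda t: [(word,) for word in t[0].split()])
    .group_by(lambda t: t[0])
    .every(lambda key, ts: (key, len(ts)))
)
```

Calling run_flow(word_count, [("a b a",), ("b c",)]) walks the recorded steps; a real framework would instead plan them into one or more MapReduce jobs.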

MapReduce and the Database: Analytics in Hyperdrive

Expanding the role and reach of MapReduce technologies and methods adds a powerful new tool to the business intelligence (BI) arsenal. More data, more data types and more data sources are all rolled into an analytical framework that developers, scripters, business analysts, executives and investors can target directly.

Topics: hadoop