IT Management and Cloud Blog

mapreduce

eBay and Very Large Data Sets

Thursday, April 30th, 2009

Metrics on eBay’s main Teradata data warehouse include:

>2 petabytes of user data
Tens of thousands of users
Millions of queries per day
72 nodes
>140 GB/sec of I/O, or about 2 GB/node/sec (140 / 72 ≈ 1.9), though that may be a peak when the workload is scan-heavy
Hundreds of production databases being fed in

Metrics on eBay’s Greenplum data warehouse (or, if you like, data mart) include:

6 [...]

Erlang and Map Reduce – Awsome 4/9/2009

Tuesday, April 28th, 2009

Awsome MapReduce and Hadoop Presentation

Wednesday, March 11th, 2009

Last night at our Awsome meetup, Don Brown of Twitpay gave a great presentation on MapReduce and Hadoop.

Ultraparallel Computing

Monday, January 5th, 2009

Ultraparallel Computing

Writing An Hadoop MapReduce Program In Python

Saturday, November 29th, 2008

Here is another great tutorial on using Hadoop MapReduce.
Hadoop MapReduce Program In Python Tutorial
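
For readers who want a feel for the approach before following the link, here is a minimal word-count sketch in the style Hadoop Streaming expects: the mapper writes tab-separated key/value lines and the reducer reads them back sorted by key. It is not taken from the linked tutorial. The function names (mapper, reducer, run_local) are my own, and run_local only simulates the map, sort, and reduce steps in a single process rather than on a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would write to stdout."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each word. The input must already be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Hypothetical helper: simulate map -> sort -> reduce without Hadoop."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_local(["the quick brown fox", "the lazy dog"]):
        print(record)
```

On a real cluster the mapper and reducer would typically be two separate scripts reading from standard input, with Hadoop doing the sort between them.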

A Ruby MapReduce Framework

Sunday, June 15th, 2008

Skynet
Skynet is an open source Ruby implementation of Google’s MapReduce framework, created at Geni. With Skynet, one can easily convert a time-consuming serial task, such as a computationally expensive Rails migration, into a distributed program running on many computers.
http://skynet.rubyforge.org/doc/index.html

Cloud Talk

Thursday, March 13th, 2008


Enterprise Beware!

Wednesday, March 12th, 2008

There was an interesting article yesterday in SearchDataCenter.com about how Motorola is using Splunk to troubleshoot IT management problems. Enterprise Search technologies are being used more and more by large customers to apply Google-like search technology to traditional data center analytics. Companies like Coke and Comcast are using Enterprise Search technologies to [...]

Can Your Programming Language Do This? – MapReduce

Monday, March 10th, 2008

Can Your Programming Language Do This?
Joel on Software does a nice MapReduce tutorial.

CouchDB from 10,000 Feet

Thursday, March 6th, 2008

CouchDB from 10,000 Feet
CouchDB views allow you to filter, collate, and aggregate data. Views are powered by Map/Reduce. The map stage processes key/value pairs to produce intermediate values, and the reduce stage then combines the intermediate values for a particular key. Map/Reduce is inherently parallelizable, which makes it useful on clusters of machines.
Interesting Note:
Damien Katz, the CouchDB creator [...]
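
To make the map and reduce stages described above concrete, here is a rough Python sketch of how a view could count posts per author. CouchDB views are normally written in JavaScript, so this is an illustration of the idea, not CouchDB's API. The document shape and the names map_doc, reduce_values, and query_view are assumptions made for the example, and the reduce step is simplified to take only the list of values.

```python
from collections import defaultdict


def map_doc(doc):
    """Map stage: emit (key, value) pairs for one document. Here, (author, 1) per post."""
    if doc.get("type") == "post":
        yield doc["author"], 1


def reduce_values(values):
    """Reduce stage: combine the intermediate values emitted for one key."""
    return sum(values)


def query_view(docs, map_fn, reduce_fn):
    """Hypothetical driver: run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


# Made-up sample documents for illustration only.
DOCS = [
    {"type": "post", "author": "alice", "title": "Hello"},
    {"type": "post", "author": "bob", "title": "Views"},
    {"type": "comment", "author": "bob", "text": "Nice"},
    {"type": "post", "author": "alice", "title": "Again"},
]

if __name__ == "__main__":
    print(query_view(DOCS, map_doc, reduce_values))
```

Because each document is mapped independently and each key is reduced independently, both stages could be split across machines, which is the parallelism the excerpt refers to.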

Demystifying Clouds

Tuesday, February 5th, 2008

force ma·jeure (noun)
Etymology: French, superior force
Date: 1883
1: superior or irresistible force
2: an event or effect that cannot be reasonably anticipated or controlled (compare act of God)

I’ll admit it: I am caught up in the cloud hype. The caveat, however, is that I truly believe [...]

A Cloud Naysayer

Monday, January 28th, 2008

I posted some comments on a recent article by Robin Harris, Cloud computing – in your dreams.
Who knows who’s right? Here are my comments…
I think the differentiators are virtualization and power consumption. It is virtualization that has turned academic exercises into real business prototypes. I may be wrong but I think the brick and mortar [...]

The Big Question

Monday, January 14th, 2008

The latest buzz on cloud computing is Nicholas Carr’s new book, The Big Switch.