mapreduce
eBay and Very Large Data Sets
Thursday, April 30th, 2009
Metrics on eBay’s main Teradata data warehouse include:
>2 petabytes of user data
10s of 1000s of users
Millions of queries per day
72 nodes
>140 GB/sec of I/O, or about 2 GB/node/sec, though that may be a peak figure for scan-heavy workloads
100s of production databases being fed in
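As a quick sanity check on the per-node figure above, here is a minimal sketch that divides the quoted aggregate I/O by the node count. The variable names are illustrative, and 140 GB/sec is treated as an exact value even though the source gives it as a lower bound.

```python
# Figures quoted in the list above; 140 is a lower bound in the source (">140").
total_io_gb_per_sec = 140
node_count = 72

# Aggregate throughput spread evenly across nodes (an assumption).
per_node_gb_per_sec = total_io_gb_per_sec / node_count

print(f"{per_node_gb_per_sec:.2f} GB/node/sec")  # prints "1.94 GB/node/sec"
```

So "2 GB/node/sec" is a rounded-up figure that fits an aggregate slightly above 140 GB/sec.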
Metrics on eBay’s Greenplum data warehouse (or, if you like, data mart) include:
6 [...]
Erlang and Map Reduce – Awsome 4/9/2009
Tuesday, April 28th, 2009
Awsome MapReduce and Hadoop Presentation
Wednesday, March 11th, 2009
Last night at our Awsome meetup, Don Brown of Twitpay gave a great presentation on Map Reduce and Hadoop.
Ultraparallel Computing
Monday, January 5th, 2009
Ultraparallel Computing
Writing An Hadoop MapReduce Program In Python
Saturday, November 29th, 2008
Here is another great tutorial on using Hadoop MapReduce.
Hadoop MapReduce Program In Python Tutorial
A Ruby MapReduce Framework
Sunday, June 15th, 2008
Skynet
Skynet is an open source Ruby implementation of Google’s MapReduce framework, created at Geni. With Skynet, one can easily convert a time-consuming serial task, such as a computationally expensive Rails migration, into a distributed program running on many computers.
http://skynet.rubyforge.org/doc/index.html
Cloud Talk
Thursday, March 13th, 2008
Enterprise Beware!
Wednesday, March 12th, 2008
There was an interesting article yesterday in SearchDataCenter.com about how Motorola is using Splunk to troubleshoot IT management problems. Enterprise Search technologies are being used more and more by large customers to apply Google-like search technology to traditional data center analytics. Companies like Coke and Comcast are using Enterprise Search technologies to [...]
Can Your Programming Language Do This? – MapReduce
Monday, March 10th, 2008
Can Your Programming Language Do This?
Joel on Software does a nice MapReduce tutorial.
CouchDB from 10,000 Feet
Thursday, March 6th, 2008
CouchDB from 10,000 Feet
CouchDB views allow you to filter, collate, and aggregate data. Views are powered by Map/Reduce. The map stage processes key/value pairs to produce intermediate values, and the reduce stage then combines the intermediate values for a particular key. Map/Reduce is inherently parallelizable, making it useful on clusters of machines.
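To make the map and reduce stages concrete, here is a minimal single-process sketch in plain Python. It is not CouchDB's view API (CouchDB views are normally written in JavaScript); the documents, field names, and the `run_view` helper are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical documents; the field names are assumptions for illustration.
docs = [
    {"_id": "1", "type": "post", "author": "alice"},
    {"_id": "2", "type": "post", "author": "bob"},
    {"_id": "3", "type": "comment", "author": "alice"},
    {"_id": "4", "type": "post", "author": "alice"},
]

def map_doc(doc):
    """Map stage: filter to posts and emit one (key, value) pair per post."""
    if doc.get("type") == "post":
        yield doc["author"], 1

def reduce_values(key, values):
    """Reduce stage: combine the intermediate values for one key."""
    return sum(values)

def run_view(documents, map_fn, reduce_fn):
    """Hypothetical driver: group mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Sorting the keys mimics the collation step of a view.
    return {key: reduce_fn(key, values) for key, values in sorted(grouped.items())}

posts_per_author = run_view(docs, map_doc, reduce_values)
print(posts_per_author)  # {'alice': 2, 'bob': 1}
```

Each call to `map_doc` depends on a single document, and each reduce depends on a single key's values, which is why the work can be split across machines.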
Interesting Note:
Damien Katz, the CouchDB creator [...]
Demystifying Clouds
Tuesday, February 5th, 2008
force ma·jeure – noun
Etymology: French, superior force
Date: 1883
1 : superior or irresistible force
2 : an event or effect that cannot be reasonably anticipated or controlled (compare act of god)
I’ll admit it: I am caught up in the cloud hype. The caveat, however, is that I truly believe [...]
A Cloud Naysayer
Monday, January 28th, 2008
I posted some comments on a recent article by Robin Harris, Cloud computing – in your dreams.
Who knows who’s right? Here are my comments…
I think the differentiators are virtualization and power consumption. It is virtualization that has turned academic exercises into real business prototypes. I may be wrong but I think the brick and mortar [...]
The Big Question
Monday, January 14th, 2008
The latest buzz on cloud computing is Nicholas Carr’s new book, The Big Switch.

