BotchagalupeMarks for August 14th – 13:02
By John | August 15, 2009
These are my links for August 14th from 13:02 to 17:19:
- cascading success story: hopefully – cascading-user | Google Groups – FlightCaster story using EC2, Hadoop, and Cascading
- GFS: Evolution on Fast-forward – ACM Queue – During the early stages of development at Google, the initial thinking did not include plans for building a new file system. While work was still being done on one of the earliest versions of the company's crawl and indexing system, however, it became quite clear to the core engineers that they really had no other choice, and GFS (Google File System) was born.
- Flightcaster Using #ec2, #hadoop, #cascading, and #clojure – Flightcaster – the final talk was in some ways the most interesting to me as it was about this startup’s use of Flightcaster to do actual stuff. The idea of the company is to use public and private data sources to predict flight delays. It uses some machine learning techniques like Bayes’ law and conditional probability now, with more planned around support vector machines and other techniques in the future. The code is all in Clojure and, from what we looked at, it was pretty compact. He had built a DSL within Clojure that got decently close to what he needed for representing some of the probability stuff. They’re also using Hadoop for some map/reduce stuff.
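
For anyone who wants a feel for the "Bayes' law and conditional probability" part, here is a minimal sketch in Python. It is not FlightCaster's code (theirs is in Clojure and was not shown in detail), and the function name and every number below are hypothetical, chosen only to illustrate updating a delay probability given one piece of evidence.

```python
def posterior(prior, p_evidence_given_delay, p_evidence_given_on_time):
    """Return P(delay | evidence) using Bayes' law."""
    # Total probability of seeing the evidence at all.
    p_evidence = (p_evidence_given_delay * prior
                  + p_evidence_given_on_time * (1.0 - prior))
    return p_evidence_given_delay * prior / p_evidence


# Hypothetical inputs, invented for illustration only.
p_delay = 0.20                # assumed base rate of delayed flights
p_storm_given_delay = 0.60    # assumed: storm reported, flight was delayed
p_storm_given_on_time = 0.10  # assumed: storm reported, flight was on time

p_delay_given_storm = posterior(p_delay, p_storm_given_delay,
                                p_storm_given_on_time)
print(round(p_delay_given_storm, 2))
```

With these made-up inputs, a storm report raises the estimated chance of a delay from the assumed base rate of 0.20 to about 0.60. A real predictor would combine many such signals rather than one.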