Tuesday, June 10, 2008

Open Source EC2 - the beginning of a Scale Stack

In my last post I mentioned that what I wanted to see was the industry coalesce around an open source "scale stack" [1].

We may be seeing the beginnings of this. The High Scalability site (great blog, by the way) recently posted an entry about a new kid in town out of UC Santa Barbara called Eucalyptus. They are providing an open source implementation of an elastic compute infrastructure that is interface-compatible with Amazon's EC2, which you can take and deploy on your own hardware.

This is very encouraging, and I think it is a smart approach. Rather than trying to build a standard that gets lost in committee for years, use the de facto standard, which in this space is Amazon.

Another piece of the puzzle is Hadoop, an open source implementation of map/reduce. Hadoop also has a distributed file system - one thing that might be worth investigating is building an S3 layer on top of Hadoop's file system.
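
To make the S3-layer idea a bit more concrete, here is a minimal sketch of how buckets and keys might map onto a hierarchical file system. Everything in it is an assumption for illustration: the class name FsObjectStore and its method names are hypothetical, and a local directory stands in for Hadoop's file system, so this shows only the shape of the mapping and is not real S3 or Hadoop code.

```python
import tempfile
from pathlib import Path


class FsObjectStore:
    """Hypothetical S3-style facade: buckets and keys mapped onto a directory tree."""

    def __init__(self, root):
        # Assumption: a local directory stands in for the distributed file system.
        self.root = Path(root)

    def _path(self, bucket, key):
        # bucket -> top-level directory, key -> relative path beneath it
        return self.root / bucket / key

    def put_object(self, bucket, key, data):
        path = self._path(bucket, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get_object(self, bucket, key):
        return self._path(bucket, key).read_bytes()

    def list_objects(self, bucket, prefix=""):
        # Return every key in the bucket that starts with the given prefix.
        base = self.root / bucket
        if not base.is_dir():
            return []
        keys = (p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file())
        return sorted(k for k in keys if k.startswith(prefix))

    def delete_object(self, bucket, key):
        self._path(bucket, key).unlink()


if __name__ == "__main__":
    store = FsObjectStore(tempfile.mkdtemp())
    store.put_object("photos", "2008/june/a.jpg", b"fake image bytes")
    print(store.list_objects("photos", prefix="2008/"))
```

One wrinkle this exposes: an object store has a flat key space, so keys like "a" and "a/b" can coexist, while a file system cannot hold a file and a directory with the same name. A real layer would need some key-encoding scheme to get around that.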

What about the queuing service? Well, one possibility is to put an SQS API on top of ActiveMQ or OpenJMS.
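
As a rough sketch of what such a layer would have to provide, the snippet below models the core SQS-style cycle: send a message, receive it (which hides it for a visibility timeout), then delete it by receipt handle. The class SqsStyleQueue is hypothetical, the in-memory dictionary is only a stand-in for a real broker such as ActiveMQ or OpenJMS, and treating a stale receipt handle as invalid is this sketch's simplification, not a statement about how SQS behaves.

```python
import time
import uuid
from collections import OrderedDict


class SqsStyleQueue:
    """Hypothetical SQS-style facade over an in-memory stand-in for a broker."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock  # injectable so the timeout can be exercised in tests
        self._messages = OrderedDict()  # message id -> [body, visible_at, receipt]

    def send_message(self, body):
        message_id = str(uuid.uuid4())
        # A new message is visible immediately and has no receipt handle yet.
        self._messages[message_id] = [body, self.clock(), None]
        return message_id

    def receive_message(self):
        now = self.clock()
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message until the timeout expires; issue a fresh handle.
                entry[1] = now + self.visibility_timeout
                entry[2] = str(uuid.uuid4())
                return {"MessageId": message_id, "Body": entry[0], "ReceiptHandle": entry[2]}
        return None  # nothing visible right now

    def delete_message(self, receipt_handle):
        # Only the most recently issued handle for a message is accepted here.
        match = next(
            (mid for mid, entry in self._messages.items() if entry[2] == receipt_handle),
            None,
        )
        if match is None:
            return False
        del self._messages[match]
        return True


if __name__ == "__main__":
    queue = SqsStyleQueue(visibility_timeout=5.0)
    queue.send_message("resize image 42")
    message = queue.receive_message()
    print(message["Body"])
    queue.delete_message(message["ReceiptHandle"])
```

The interesting part is the visibility timeout: a consumer that crashes before deleting a message simply lets the timeout lapse, and the message becomes available to someone else. A JMS broker does not work quite this way out of the box, so this is probably where most of the adapter effort would go.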

Throw in CouchDB, and you're starting to get a very interesting stack indeed. I'm not sure about putting a SimpleDB interface on top of this - CouchDB is pretty darn interesting in its own right, and I think the jury is still out on SimpleDB.

[1] I am not sure if he wants me to mention his name, so I won't, but I want to acknowledge that the idea for an open source stack based on Amazon's APIs is not my own, but comes from a colleague at Sun. I think it's a great idea, and may it come to fruition.
