Friday, February 13, 2009

Setting up a CouchDB cluster fronted by memcached

While I'm looking for work, I thought I'd play around with some technologies that I've been very interested in but haven't been able to get hands-on with because of my day job.

My first project - set up a CouchDB cluster fronted by memcached and run a simple PHP application on top of that.

The architecture I'm thinking of looks something like this. I'm sure I'll refine it over time after playing around with it, but I'm starting fairly simple:
  • A web tier running Apache and PHP
  • A memcached tier
  • A CouchDB cluster tier
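To make the three tiers concrete, here is a minimal sketch of the read path I have in mind: the web tier asks the cache first and only falls back to the database on a miss. It's in Python rather than PHP just to keep it short, and `DictStore` and `fetch` are made-up names; the in-memory store is only a stand-in for real memcached and CouchDB clients, not their actual APIs.

```python
class DictStore:
    """In-memory stand-in for a memcached or CouchDB client (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts lookups so we can see which tier was hit

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def fetch(key, cache, db):
    """Return the record for key, trying the cache tier before the database tier."""
    value = cache.get(key)
    if value is not None:
        return value  # cache hit: the database is never touched
    value = db.get(key)
    if value is not None:
        cache.set(key, value)  # populate the cache for the next reader
    return value


cache, db = DictStore(), DictStore()
db.set("doc1", "hello")
first = fetch("doc1", cache, db)   # miss: read from db, then cached
second = fetch("doc1", cache, db)  # hit: served from cache
```

Writes would also need to update or invalidate the cached copy, which this sketch leaves out.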
I will distribute records across the CouchDB cluster by running a distributed hashing algorithm against the record keys, so that each key maps to a node. The ones that look most promising are Kademlia and Chord, both distributed hash table (DHT) designs. I'm leaning towards Kademlia.
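Here is a rough sketch of what Kademlia-style placement could look like, again in Python for brevity. It borrows only Kademlia's XOR distance metric: keys and node names are hashed into the same 160-bit ID space, and a key lives on the node whose ID is closest by XOR. Real Kademlia also has routing tables (k-buckets) and iterative lookups, none of which is shown here. The node names, the choice of SHA-1, and the function names are all assumptions for illustration.

```python
import hashlib


def node_id(name):
    """Hash a node name or record key into a 160-bit integer ID (SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def pick_node(key, nodes):
    """Return the node whose ID is closest to the key's ID by XOR distance."""
    key_id = node_id(key)
    return min(nodes, key=lambda n: node_id(n) ^ key_id)


# Hypothetical node names for a three-node CouchDB tier.
nodes = ["couch-a", "couch-b", "couch-c"]
keys = ["doc-%d" % i for i in range(200)]

before = {k: pick_node(k, nodes) for k in keys}

# Take one node out of the cluster and recompute the placement.
remaining = ["couch-a", "couch-b"]
after = {k: pick_node(k, remaining) for k in keys}

# Only keys that lived on the removed node have to move.
moved = [k for k in keys if before[k] != after[k]]
```

The nice property is that when a node leaves, only the keys that were on that node get reassigned; everything else stays put, which is what you want when nodes come and go.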

I think it's cool that the same qualities you need for peer-to-peer file sharing are valuable for server-side clustering - even distribution, good performance when nodes come and go, and robustness under heavy changes to node configuration.

Once I have something working, I'll let you know how it went. Then I'm interested in trying this with SimpleDB and Project Voldemort... I also want to take a look at MemcacheDB. So much interesting stuff, so little time :)

2 comments:

ndimiduk said...

So how's your CouchDB clustering coming? I'm particularly interested in your experiences with adding and removing instances from the cluster and how you maintain replication rates. :D

Unknown said...

Sigh... sad to say I got a job and haven't been able to spend any time on this. I am doing some CouchDB work on the side, but it's not to do with clustering.

It sounds like there are solutions out there - did you try emailing the couchdb-user list to see what people are up to? Clustering seems to be a common use case...

Sorry! :(