Monday, January 26, 2009

Interesting stuff from LinkedIn - Project Voldemort



Although the name is a bit odd, Project Voldemort itself looks interesting.

The folks at LinkedIn have open sourced a distributed cache/storage engine under the Apache 2.0 license. The interface looks a lot like memcached: get(key), put(key, value), delete(key). The key (haha) difference is that it is not just a cache - it also provides persistent storage.
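
To make that concrete, here is a minimal sketch in Java of the kind of interface they describe. The names and generics are mine for illustration only; Voldemort's real client API differs in the details (for one thing, values carry version metadata, which I get to below).

    // Illustrative only: a bare key-value interface in the memcached style,
    // but backed by persistent storage rather than just an in-memory cache.
    public interface SimpleKeyValueStore<K, V> {
        V get(K key);              // return the value for a key, or null if absent
        void put(K key, V value);  // store or overwrite the value for a key
        void delete(K key);        // remove the key and its value
    }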

I recommend taking a look at their design page. Here are some things that stood out to me:

No Structure, No Queries

Project Voldemort explicitly does away with the structured schemas and query language of relational databases, just like memcached. This means that if you want things like queries, joins, etc., you have to do them yourself.

It appears that one way they address this is by pre-building "answers" to queries with Hadoop jobs and then loading the results back into the storage engine en masse - much more efficient than trying to run the queries against your "live" store.
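
As a hypothetical illustration of that pattern, assume an offline job has already computed each member's suggested connections; the loader class, key format, and the use of the sketch interface above are all my own invention, not LinkedIn's:

    import java.util.List;
    import java.util.Map;

    // Hypothetical bulk load of pre-built "answers": an offline job (Hadoop, say)
    // produces memberId -> suggested-connection-ids, and each result is pushed
    // into the key-value store so the live site only ever needs to call get().
    public class PrecomputedAnswerLoader {
        public static void load(SimpleKeyValueStore<String, List<String>> store,
                                Map<String, List<String>> batchResults) {
            for (Map.Entry<String, List<String>> entry : batchResults.entrySet()) {
                store.put("suggestions:" + entry.getKey(), entry.getValue());
            }
        }
    }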


Eventual Consistency and Ordering of Versions

They also seem to be following the principles laid out by Werner Vogels and the Amazon team around providing eventual consistency.

I also particularly liked how they do versioning (a version is defined by a tuple of server numbers and version numbers - essentially a vector clock) and how they handle conflicts and repair consistency: writes are always accepted, and then when someone does a read, the various versions are compared to decide which one wins (or to detect a genuine conflict and mark it as such so the problem can be resolved manually).
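
Here is a rough sketch of that read-time comparison, assuming the version tuple behaves like a vector clock (server id mapped to a counter). The class, enum, and method names are mine, not Voldemort's:

    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Compare two versions at read time: either one strictly dominates the other
    // (and wins), they are identical, or neither dominates (a genuine conflict
    // that has to be surfaced to the reader to resolve).
    public class VersionCompare {
        public enum Occurred { BEFORE, AFTER, EQUAL, CONCURRENT }

        public static Occurred compare(Map<Integer, Long> a, Map<Integer, Long> b) {
            boolean aAhead = false, bAhead = false;
            Set<Integer> servers = new HashSet<>(a.keySet());
            servers.addAll(b.keySet());
            for (int server : servers) {
                long va = a.getOrDefault(server, 0L);
                long vb = b.getOrDefault(server, 0L);
                if (va > vb) aAhead = true;
                if (vb > va) bAhead = true;
            }
            if (aAhead && bAhead) return Occurred.CONCURRENT; // conflict: neither version wins
            if (aAhead) return Occurred.AFTER;                // a supersedes b
            if (bAhead) return Occurred.BEFORE;               // b supersedes a
            return Occurred.EQUAL;                            // same version on every server
        }
    }

If the comparison comes back CONCURRENT, the reader sees both values and has to merge them or pick one - that is the manual resolution mentioned above.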


But Does It Work?

Being a longtime database guy, I always wonder what key functionality you are giving up when you go for the simple key/value way of doing things. I know it scales, I know it is fast, and I know it avoids issues with network partitioning. But what requirements does it place on the client as a result? They mention, for instance, that this solution separates business logic from data storage, and that's a good thing. It's funny, because in my Sybase days, placing business logic close to data storage was considered the right way to go - function shipping instead of data shipping.

Anyway, it looks like another distributed key-value store has hit the streets. I have some time right now, so maybe I'll take a closer look. And I'll be doing the same thing with SimpleDB and CouchDB while I'm at it...

4 comments:

Tom White said...

Richard Jones of Last.fm did a good round-up of distributed key-value stores.

Anonymous said...

I remain disturbed by allegedly smart folk who abandoned the principles laid down by Dr. Codd for the crap that comes out of knucklehead youngsters. The relational model was devised to solve the problems that these "new" technologies are re-creating from the 1960s. Collective intelligence? Not hardly.

Anonymous said...

Hi David,

This was interesting to read about, especially after reading today about Newton, a distributed OSGi framework that moves code around the network to be executed as needed.

I wonder if any combination of these two could be a good thing?

Regards,

-james.

Anonymous said...

"I remained disturbed by allegedly smart folk who abandoned the principles laid down by Dr. Codd"

You may be disturbed, but there are reasons to abandon Dr. Codd. It's time to admit that relational databases do some things poorly:

- Schema changes: This is a big pain when you've got billions of rows and can't have any downtime. And how do you do a join across multiple versions of a table?

- Denormalization: Often it's impossible to stay normalized because of performance. The ultimate denormalization is a key-value store. If you're not doing any joins, then why use an RDBMS?

- Scalability: The only way to scale an RDBMS horizontally (to thousands of nodes) is to shard. But if you shard, you lose transactions, joins, ACID, etc. So why use an RDBMS?

- Availability: You can't have ACID if you want high availability. (Proven as the CAP Theorem (aka Brewer's Conjecture) in 2002.) An "eventually consistent" architecture seems radical, but that's how the internet works (DNS, web caching, etc.).

"crap that comes out of knucklehead youngsters"

LinkedIn is doing a social graph with billions of connections. And you are?

Let's put it this way: None of the top websites (eBay, Amazon, Google, Facebook, etc.) run on an RDBMS. (eBay does use Oracle, but they use it as a key-value store: no transactions, no joins, massive sharding, etc.)

"The relational model was devised to solve the problems [..]

Using an RDBMS is simpler, I'll give you that. But for high availability and high scalability, it's just not an option.

"[..] that these "new" technologies are re-creating from the 1960s."

You can say these are ideas from the 1960s, but that's only a half-truth. Those ideas were never done on a massively distributed scale like today. Most of this new code is dealing with partitioning, replication, and read-repair -- not just the storage and retrieval ideas from the '60s.


Sources:
http://highscalability.com/ebay-architecture
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html