Monday, October 08, 2007

Heck with SQL: is a persistent hash all you need?

Jamie Flournoy compares ActiveRecord to VB: "The easy stuff is easy, but the hard stuff is impossible," he quotes James Gosling as saying about VB.

Jamie obviously has experience with both Rails and databases, and one of the things he says in this very long, well-thought-out article gave me pause.

Jamie referred to some of the leaders of the Rails community saying that SQL is a design mistake and that most web apps only need a persistent hash.

I came across this statement shortly after reading an article by Diego Parilla about the impact of ORM on the database, and after finding out that Amazon basically uses a highly available, persistent hash for much of its database storage on the web tier.

Michael Stonebraker is even saying that relational databases cannot meet everybody's needs any more, and talks about the need for specialized data stores for specific use cases.

I actually like a lot of this. We need to simplify, simplify, when it comes to building web applications. I've been appalled for quite some time at what a huge mess it is, and have found myself avoiding building web apps if at all possible because of it. SQL also ties you to a database (yeah, I know, it's a "standard"), and it ties your application to a particular incarnation of the data model in the database. So for all these reasons, eliminating the need for SQL is goodness.

My concern is that if you're not careful, you start with a hash table, but then start implementing your own database on top of it. Jamie mentions this, and I have seen this too:

Satisfying queries is the database’s job, period. It’s just hideously slow to try and do an inner join in the application across a network link to a database. If you find yourself doing this, that’s a pretty good sign that your architecture is broken.

It's easy to think that all you need is a hash table, and when you're building simple, basic web apps, that's probably all you need. Heck, that's all Amazon needs, for the most part. But there are times when you want more. You really do need to do a complex query, or a stored procedure, or text search. You want to be able to run useful reports without having to do joins and sorts within your application.
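To make the difference concrete, here is a minimal sketch in Python using only the standard library. The customer and order data, the table names, and the function names are all made up for illustration; a plain dict stands in for the persistent hash, and an in-memory SQLite database stands in for the RDBMS.

```python
import sqlite3

# Hypothetical sample data; a plain dict stands in for a persistent hash.
customers = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
orders = {
    101: {"customer_id": 1, "total": 30.0},
    102: {"customer_id": 2, "total": 12.5},
    103: {"customer_id": 1, "total": 20.0},
}


def lookup_customer(customer_id):
    """Key lookup: the case a persistent hash handles well."""
    return customers[customer_id]


def totals_by_customer_in_app():
    """Report built in the application: a hand-rolled join, group, and sort."""
    totals = {}
    for order in orders.values():
        name = customers[order["customer_id"]]["name"]
        totals[name] = totals.get(name, 0.0) + order["total"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def totals_by_customer_in_sql():
    """Same report, with the join, grouping, and sorting left to the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(cid, c["name"]) for cid, c in customers.items()],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(oid, o["customer_id"], o["total"]) for oid, o in orders.items()],
    )
    rows = conn.execute(
        "SELECT c.name, SUM(o.total) AS customer_total "
        "FROM orders o JOIN customers c ON c.id = o.customer_id "
        "GROUP BY c.name ORDER BY customer_total DESC"
    ).fetchall()
    conn.close()
    return rows
```

Both report functions return the same rows. The point is that the first one is a tiny query engine I had to write myself, and every new report means writing another one.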

I think Rails is getting a lot of things right by hiding a lot of complexity from you (as do the Java ORM technologies).

But I agree with Jamie that for those times when you need SQL, it's nice to have the option -- at least until ORM gets so friggin' smart that writing in SQL is like writing in assembly - sure you could do it, but who wants to, and why?
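Here is a rough sketch of what I mean by having the option, again in Python with the standard library's sqlite3 module. `UserMapper`, its `find` and `find_by_sql` methods, and the `users` table are hypothetical names I made up for this example; this is not any real ORM's API, just the shape of the idea.

```python
import sqlite3


class UserMapper:
    """Hypothetical mini-mapper, not a real ORM; names are illustrative only."""

    def __init__(self, conn):
        self.conn = conn
        # Rows come back addressable by column name, so they convert to dicts.
        self.conn.row_factory = sqlite3.Row

    def find(self, user_id):
        """The easy path: the caller never sees any SQL."""
        row = self.conn.execute(
            "SELECT id, name, visits FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return dict(row) if row else None

    def find_by_sql(self, sql, params=()):
        """The escape hatch: caller supplies SQL, mapper still returns dicts."""
        return [dict(row) for row in self.conn.execute(sql, params)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, visits INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "Ann", 12), (2, "Bob", 3), (3, "Cy", 40)],
    )
    users = UserMapper(conn)
    one = users.find(2)  # everyday lookup, no SQL in sight
    busy = users.find_by_sql(  # the query the mapper did not anticipate
        "SELECT id, name, visits FROM users WHERE visits > ? "
        "ORDER BY visits DESC",
        (10,),
    )
    return one, busy
```

Most callers would only ever touch `find`. The raw-SQL method sits there for the day the mapper's built-in queries run out.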


The Narrator said...

Every other year somebody tries to get rid of the RDBMS, and every time they fail. The last screwball attempt at this was Prevayler.

With regard to speed, you can use caching with an RDBMS to speed things up, you know. In Java we do this with Terracotta, EhCache or OSCache.

Orion Letizi said...

The way I see it, you should put stuff in a relational database that you want to report on later or otherwise view in a different context.

If your data is shaped like objects, though, and you are always going to consume it in object form, flattening it out to relation form is just a waste of effort and an abuse of a relational database.

David Van Couvering said...

I totally agree; I don't believe you can get rid of the relational database.

But I can see some applications having requirements that a relational database can't meet, and I can see the rise of specialized data stores for those requirements, à la Amazon's solution.

I can also see it being possible that most application writers don't have to write or know SQL, even if they are working with a relational database. But for the foreseeable future, I believe you do need an escape hatch to write your own SQL if you need to, even with the best ORM tools out there today. Note that JDO had to add this escape hatch in version 2 due to customer requests.