Thursday, May 08, 2008

Internet content by reference, not by value

You may have noticed that I am very interested in how data is managed on the Internet as a platform, at web scale. In that light, I have been having some very illuminating conversations with an old friend and colleague, Nitin Borwankar. His ideas on data property rights and DRY data are concepts that, if implemented, could result in a major shift in how we manage data on the web.

Data property rights is about laying out a "bill of rights" for data that goes far beyond "the right to move". It also includes the rights to access, modify, remove and own your data. Too often, once you upload your content to a site, you no longer have full rights to that content, as if the act of uploading it somehow made it no longer yours. It's like living in serfdom: you do all the work to plow, seed, tend and harvest the land, but the fruit of your labor is not yours, just because you are using land that someone else owns.

DRY data is about following the principle of Don't Repeat Yourself for web content. Web applications need to start applying this principle, so that rather than you having to load copies of your content across multiple sites (and losing ownership of it in the process), you place it in one location (your "home" on the Web, as it were), and then you refer application providers to that one place. They can focus on providing added value (for instance, referring it to your friends, enabling collaboration, or helping you organize it or present it in useful ways) rather than on the overhead of building and deploying a scalable storage architecture.

Nitin calls this architecture YINAS (YINAS Is Not A Silo).
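
To make the by-reference idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not Nitin's design: the names HomeStore and ReferencingApp, and the example.org URL, are hypothetical, and a real version would involve HTTP, authentication and access control. The point it shows is that the content lives in one place, the applications keep only a reference to it, and a change or removal at the "home" is seen by every application.

```python
class HomeStore:
    """Hypothetical 'home' on the Web: the only place content is stored."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._items = {}  # name -> content, held by the owner

    def put(self, name, content):
        """Store (or update) content and return a reference (URL) to it."""
        self._items[name] = content
        return f"{self.base_url}/{name}"

    def get(self, url):
        """Resolve a reference back to the current content."""
        prefix = self.base_url + "/"
        if not url.startswith(prefix):
            raise KeyError(url)
        return self._items[url[len(prefix):]]

    def remove(self, name):
        """The owner can withdraw content; references then stop resolving."""
        del self._items[name]


class ReferencingApp:
    """Hypothetical application that keeps references, never copies."""

    def __init__(self, resolve):
        self.resolve = resolve  # function that turns a URL into content
        self.shared = []        # list of URLs only

    def share(self, url):
        self.shared.append(url)

    def render(self):
        # Content is fetched from the owner's home each time it is shown.
        return [self.resolve(url) for url in self.shared]


home = HomeStore("https://example.org/david")
ref = home.put("essay.txt", "first draft")

photo_site = ReferencingApp(home.get)
social_site = ReferencingApp(home.get)
photo_site.share(ref)
social_site.share(ref)

# One update at home; both applications see it without any re-upload.
home.put("essay.txt", "second draft")
print(photo_site.render())
print(social_site.render())
```

Both sites show the updated text because neither ever held a copy. If I call home.remove("essay.txt"), the reference stops resolving for both of them, which is roughly what the "right to remove" would look like in this toy setup.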

The value of DRY for the user is obvious - I only have to put my stuff in one place, and I get to really own my stuff, rather than the vendor owning it. DRY is also very valuable for the vendor, as they can save overhead and complexity by delegating the work of scalable storage and indexing to a "data service provider" rather than having to do it themselves. It's even good for the environment, because you need fewer disk farms sucking up power and space. I guess the only folks who would lose out are the storage and power vendors :)

It's funny, it makes so much sense, but nobody is really doing this.

I pulled Tim Bray aside at JavaOne to talk to him about these ideas after reading his blog post about changing his address, and he suggested that concepts are good, but a simple proof of concept is better. Hm... let me think about that ... :)


Nitin said...

Hi David,

Thanks much for the plug. Re: your comment on the disk and power vendors losing out. No one loses out - we just use capacity more efficiently - we are never going to reduce our appetite for computing capacity, so we might as well use it more efficiently.


Anonymous said...

David, have you come across the "communication rights" people? They are interested in similar issues. Looks like a bad site but they have an interesting discussion document to download.


Unknown said...

Thanks, Elizabeth, for the pointer. Very interesting discussion, and definitely we are seeing in the US the control of communication by the few and powerful, combined with poor education, and the resulting weakening of true democratic dialog.

It seems that "Data Property Rights" could potentially fall into this kind of discussion, as it is tied to excessive control of content by the content host rather than the content provider. But it seems like a tiny piece of a puzzle that spans issues like economic reform, political reform, and digital rights reform. But it is definitely a piece of the puzzle.