Wednesday, February 25, 2009

Hitting the scalability wall - Amdahl's Law

Author's note: I was thinking about this blog and how I didn't quite capture some things correctly, particularly around what it means for processing to become serialized.  Then I saw some comments that repeated my own concerns.  So I have done some rewriting in an attempt to explain more clearly what I think the issues are... 

When examining the challenges of throughput and scalability in the brave new world of Web Scale, one solution I've been looking at is Gigaspaces.

I recently read their article about scalability (follow the link to "The Scalability Revolution" - you may need to register to be able to see it), and they made a claim that gave me pause, to say the least.

Basically they said that if you have a multi-tier architecture, you are doomed - sooner or later you are going to hit a scalability wall, either because of cost, complexity or latency.  I think in particular they meant that if your end-to-end request/response path must travel over multiple tiers to complete, you are doomed.

They have examples to back up this claim, but what was compelling for me was their reference to Amdahl's Law.

Amdahl's Law states if any portion of your system requires processing to be serial instead of parallel  then the system is going to hit a scalability wall where it can go no further, no matter how much hardware you throw at it.   From the Wikipedia article:
The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining promising portion of 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimal execution time can not be less than that critical 1 hour. Hence the speed up is limited up to 20x.
Note that the actual numbers don't matter too much.  You can avoid hitting the wall by reducing the amount of serialization, but sooner or later, you are going to hit that wall.  And even before you hit it, the cost of getting a fraction more speedup is going to increase exponentially.  I have heard many stories to vouch for this, where getting 10% more throughput requires 100 times the hardware investment and/or complete rewrites of applications.

If your system architecture requires you to potentially wait for a shared resource within the request/response path, then you are shifting from parallel to serial execution, and you are going to hit Amdahl's Wall.   Here are some examples:
  • Reading/writing data in a shared database
  • Sending a request over a network which will block if the network is congested (the network is shared by multiple simultaneous requests)
  • Writing to a disk which may block if it is busy, where the disk is shared between multiple simultaneous requests.
Well, that does seem to throw a wrench in things.  How can a service run independently, with no database access, disk I/O or network I/O, and still be useful?

Well, the key thing is that these things can not happen within the request/response path.  They can still be done in the background, asynchronously. For example, if the user updates their profile, then this update is stored in-memory and then, in the background, this status update is stored in the database.  If you want to send a message to another service, the message is queued up in memory and then delivered to the external service asynchronously.  

Note the consistency implications of this.  You end up with a system where not everybody has a consistent view of the current state of the system.  But we already know through the CAP theorem that this is something you probably need to accept.

There is also a window of possibility where the user thinks the update succeeded but the actual persistence of this update fails or there is some kind of conflict.  This means your company and your application has to know how to apologize.

But if the benefit of this is linear scalability, these tradeoffs may be well worth it.  Better to apologize to a small subset of customers about a lost order than apologize to all of them for terrible latency or to apologize to your management about the cost and time overruns for your system.

Gigaspaces claims to have accomplished this kind of architecture with a runtime environment they call a space which provides in-memory access to all the resources you need to process a request.  The underlying Gigaspaces runtime ensures that each request is routed to the right cluster node and that the node contains all the resources needed, in memory, for that object to get its job done.   To be honest, I am still mystified how they can accomplish this, and this is something I would like to investigate further.

At any rate, I appreciate the Gigaspaces article helping me understand the implications of Amdahl's Law.  As I evaluate a scalability architecture, if something, somewhere, requires the request/response path to potentially wait on a shared resource, then I know that sooner or later we've got trouble.  It's the law.


Anonymous said...

I really liked the way you summarized the topic.
One note about our OpenSource - Most of our development framework is provided as opensource under the Apache 2.0 license (See details here). Its only the underlying core clustering that is kept under close source.

Anonymous said...

I think either you or them are mixing something up here. Or I am misunderstanding something.

1. while accessing a database or a network resource or the filesystem might be and often is the limiting factor for performance, I can't see how this relates to Amdahls Law and scalability. Waiting on such a resource doesn't keep other threads/processes from working, so it isn't a serial thing.

2. After you banished network and file I/O including database access you want to fix the problem by putting stuff into gigaspaces which seems to be after a quick glance: a distributed thingy backed by a database and distributed means network I/O

Don't get me wrong. I think there are valid points in here, but I also think they need a little more clarification what problem exactly you want to fix.

Anonymous said...

Oh, and by the way, the article you are referencing is not available (access denied) at least for me

Anonymous said...

All this does is spread FUD re scalability. Which in many cases is a FUD argument to start with.

It's really not as simple as "nothing can be serial". Do you even KNOW what is serial in your system when it comes to the underlying components, the network, the database, the threads?

Its unlikely you do. Or I do. More importantly, things that are parallel can become serial when bandwidth is exhausted in any direction.

This is the real problem with scalability, the point at which parallel behavior becomes overloaded and must be serialized.

That's the definition of falling over. Given a system, an architecture and a usage scenario, it is not really possible in a good design to decide when this will happen...So you try to horizontally expand possible bottlenecks.

Its common sense design. And there are no tricks to get around it.

Joseph Wofford said...

"While accessing a database or a network resource or the filesystem might be and often is the limiting factor for performance, I can't see how this relates to Amdahls Law and scalability. Waiting on such a resource doesn't keep other threads/processes from working, so it isn't a serial thing."

If all the threads in a cluster are blocked waiting on the same resource, it becomes a serial thing. It's called thread starvation deadlock and it's a particularly common problem in SOA architectures. Introducing timeouts can mitigate the problem at the cost of inconsistency due to partial failure, but there's no fixed timeout value that can eliminate the risk of thread starvation deadlock at all scales.

The author is perfectly correct in identifying synchronous blocking waits as the root cause of this problem. Unfortunately, fixing the problem is going to require a completely asynchronous programming model which is wildly unnatural in current mainstream programming languages.

Unknown said...

jens: I tried to clarify that it's not waiting on a disk or a network that causes serialization, but the inherent cause of the waiting, because other threads are trying to use the shared resource (disk, network, database, etc.)

Also, note that Gigaspaces is a coherent clustered cache, and thus uses the network, but it does this *outside the requests/response path*. That I think is the crucial difference.

I fixed the link, thanks.

Unknown said...

Anonymous: thanks, good points. Yes, it's doubtful you can ever know all the points where your system is requiring synchronization.

But I would argue that if you *do* find such points (e.g. your request thread is talking to the database), then you should fix it.

And yes, this may be common sense, but the majority of application architectures I see regularly assume they can scale out even though they are accessing a database or shooting messages back and forth across the network within the request/response path. Very few indeed have I seen that do everything asynchronously...