"When I hear the word performance I reach for my gun."
In the fuzzy front end of a project, when unknowns abound, people need a sense of stable footing.
Since it usually takes a long (calendar) time to understand the requirements and the domain, discussions tend to focus on more concrete things like the application architecture, and eventually everybody becomes obsessed with performance. It is a safe harbour for “productive discussions” when the waters are full of unknown monsters: fuzzy or non-existent requirements, a vague understanding of the domain and so on.
So performance discussions ensue. Often the customer wants to know the hardware requirements up front, so we order heaps of multi-CPU servers, SANs, middle-tier application servers, front-end application servers, load balancers etc.
The next natural step is to think up an intricate object or component distribution strategy to ensure “scalability”.
I don’t know of anyone who ever got fired for “designing a system for high performance and scalability” and building complex, buzzword compliant distributed application architectures.
But maybe someone should be. Just to set an example.
When I worked with the outstanding architect Bjarne Hansen he would often mutter, “Fowler, page 87”.
He was referring to Martin Fowler’s book, Patterns of Enterprise Application Architecture. Page 87 has a section title in big bold letters: “The allure of distributed objects” – and it addresses precisely that.
We worked on a system designed to deliver high performance. It had a set of multi-CPU servers, a fiber SAN, a separate multi-CPU application server, handwritten OR mappers using high-performance stored procedures and a complex distributed caching object database on top to keep all the clients in sync and take the load off the database server.
As for performance…
It took some 80 people around two years to build the system. I worked with Martin Gildenpfenning for a total of about two man-weeks to optimize away the bottlenecks in the application. At that point it ran faster on a developer desktop with all the components deployed on a single machine including client, server, database server and GIS system than on the high-powered multi-server deployment set-up.
It turned out network latency was the limiting factor.
The system had been designed to optimize a scenario that did not happen in practice.
As Donald Knuth put it, more often than not premature optimization is the root of all evil.
The optimization work we did was quite simple. We did it as bottlenecks became evident near the end of the development cycle. At that point we had a real, working application and knew the use cases to optimize. Armed with a set of profilers, we found the task quite easy.
In fact, the big lesson was that once the application and the real data are there, it is easy to find the bottlenecks, and most of them can be solved in a very local manner: most bottlenecks could be removed by refactoring a single module. Mostly it was a matter of replacing an algorithm with a better (non-quadratic!) one or introducing a bit of caching, for example so that the same object is not loaded more than once from the database during the same transaction.
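That kind of per-transaction caching is essentially an identity map. Here is a minimal sketch of the idea; all the names (`Transaction`, `fake_db_load`) are invented for illustration and are not from the project described above:

```python
# A per-transaction identity map: each object is loaded from the
# database at most once during a transaction. Names are hypothetical.

class Transaction:
    def __init__(self, load_from_db):
        self._load = load_from_db      # the real (expensive) loader
        self._cache = {}               # object id -> loaded object

    def get(self, obj_id):
        # Return the cached copy if this object was already loaded;
        # otherwise hit the database once and remember the result.
        if obj_id not in self._cache:
            self._cache[obj_id] = self._load(obj_id)
        return self._cache[obj_id]

calls = []
def fake_db_load(obj_id):
    calls.append(obj_id)               # count the real "database" hits
    return {"id": obj_id}

tx = Transaction(fake_db_load)
tx.get(42)
tx.get(42)                             # cache hit, no second load
tx.get(7)
print(len(calls))  # 2
```

The point is how local the fix is: nothing outside the loading code changes, which is why these bottlenecks could be removed one module at a time.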
Discussing performance early in a project embodies all the fallacies of the waterfall model. It does not work. It is broken. Given a good design, optimizing performance is a simple, measurement-driven task that belongs at the end of every iteration in the project.
1) “It is easier to optimize correct code than to correct optimized code.” (Bill Harlan, http://billharlan.com/pub/papers/A_Tirade_Against_the_Cult_of_Performance.html)
2) Don’t pretend you can spec the hardware for a system before you even know the requirements.
Someone said on the Ruby on Rails IRC channel:
> “PHP5 is faster than Rails”
< “Not if you’re a programmer”
David Heinemeier Hansson, recipient of Google and O’Reilly’s “Hacker of the Year” award for 2005, related how they built and deployed the first major Rails application on an 800 MHz Intel Celeron server running the full web-server, application-server and database stack. It maxed out when they had about 20,000 users on the application, and at that point it was easy to scale out.
At the same time Rails is widely claimed to “not be scalable” by the members of the J2EE “enterprise app” community. These are the same people who designed our multi-tier architecture that only maxed out the CPU idle times when it was put into production.
So, here is a checklist for your next project:
- Get some hard, measurable targets for performance and expected transaction volume.
- If performance comes up, ask for the CPU, network and I/O utilization statistics for the current system (if one exists). At least this will provide some guidance as to whether performance will be an issue.
- If you are asked to design the hardware, suggest building the app on a single server with plenty of RAM and doing some measurements later. Even if performance becomes an issue, you will benefit from some additional months of Moore’s law to get a better deal on those Itanium boxes.
- Create a simple design with no optimizations: use an off-the-shelf O/R mapper, resist the urge to build complex caches, keep everything in the same process, and fight the urge to write those “performance enhancing” stored procedures. Figure out the common use cases and implement a spike with them. Even if the tools and the simple design introduce a few milliseconds of overhead, you will have saved enough man-months of development to pay for an extra CPU to compensate.
- Scale the application out, not the components. The big lesson from the big web sites is that buying a load balancer and a rack of cheap servers each running the full application is good enough. In fact, for most applications you don’t even need more than one server.
- Measure early, measure often.
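Measuring is cheap enough that there is no excuse to skip it. As a sketch of the last point, here is the kind of session Python’s standard-library profiler gives you; the workload (a quadratic lookup and its linear replacement) is invented for illustration:

```python
# "Measure early, measure often": profile first, then make the local fix.

import cProfile, io, pstats

def slow_lookup(items, targets):
    # Quadratic: 'in' on a list scans the whole list for every target.
    return [t for t in targets if t in items]

def fast_lookup(items, targets):
    # Linear: build a set once, then each membership test is O(1).
    item_set = set(items)
    return [t for t in targets if t in item_set]

items = list(range(5000))
targets = list(range(0, 10000, 2))

profiler = cProfile.Profile()
profiler.enable()
slow = slow_lookup(items, targets)
fast = fast_lookup(items, targets)
profiler.disable()

# The report points straight at the expensive function by name.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats("lookup")
report = stream.getvalue()

assert slow == fast          # same answer, very different cost
print("slow_lookup" in report and "fast_lookup" in report)
```

With a real application the names in that report are your bottleneck list, and as argued above, each entry is usually fixable inside a single module.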