The Fastware Project: 05/01/2010

Friday, May 21, 2010

Notes for Portland Code Camp Talk Now Available

My presentation isn't until tomorrow, but I somehow managed to finish the materials for it today. If you're interested in what I have to say on the topic "CPU Caches and Why You Care," I encourage you to download the presentation materials (PDF) and take a look. If you find that you have something to say in return, I encourage you to post your comments here or to send me email directly (smeyers@aristeia.com).

I'd like to say a lot more than is in the presentation (and certainly there's a lot more to be said), but the length of the presentation is only 75 minutes, and I'm shooting to talk for only about an hour to allow time for questions.

I'll face a similar constraint when I write about hardware caches in Fastware!, because I can devote only part of one chapter to the topic, and, as Ulrich Drepper demonstrated in What Every Programmer Should Know About Memory, there's an enormous number of things that can be said. If you have comments on what developers really need to understand (because it is likely to affect how they design and implement their systems), please let me know.

Scott

Thursday, May 13, 2010

A Thought About Minimizing System Latency

When programmers think about optimizing things, they often think about optimizing code. After all, code is where they live. It's what they do. If it's not the code that's to be optimized, what is? I confess that this was my way of thinking for many years. Many years. Want to know how to make things faster? Fire up the profiler!

I no longer think that way. Now I think about optimizing the system. (This recognition, in fact, has convinced me that I'm going to have to give Fastware! a new subtitle. It's currently "Straight Talk about Fast Code," which I think has a nice ring to it, but that's too codecentric. I'll have to find a way to say something catchy that corresponds to "Straight Talk about Fast Systems." But I digress.)

In rare cases, the system is a single executable over which you have complete control, so, sure, fire up the profiler. More commonly, the system consists of multiple cooperating components, often broken across multiple executables. Web-based systems, for example, may have different components handling network traffic, business logic, database queries, etc. In that case, if the system is slow, it's the system that's slow, so before you fire up the profiler, you need to know where to point it.

At first glance, the general way you approach this problem is to start a timer when an event brings the system into action (e.g., a browser sends a request to your web site), stop the timer when the system produces the appropriate response (e.g., packets are sent back to the browser), then look at how much time was taken in the various parts of the system. The problem is that in many cases, this is the wrong way to think about things.

Steve Souders has done a great job of demonstrating one aspect of this idea as it applies to web sites, describing in various forms (book, article, video) how he derived what he calls the Performance Golden Rule:

Only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components in the page.

His approach takes the view that you start the timer when the initial request for an HTML page comes in, but you don't stop it until everything on the page has been rendered in the browser. From a user's point of view, this is much more reasonable. After all, until the page has been displayed, he or she is still waiting, even if the web site itself thinks the job was finished long ago.

One of the most interesting implications of this observation is that a critical part of the user's perception of the system's performance is determined by software (in this case, the browser) over which "the system" has no control. For example, a big part of how fast a site like Yahoo (where Souders worked when he wrote High Performance Web Sites; he's at Google now) seems to be is determined by the user's browser, and of course Yahoo has no control over which browser the user has chosen to use. The idea that the performance of a system is influenced by software over which it has no control generalizes. Even native applications, for example, are typically, to some degree, at the mercy of the libraries they link with and the operating system they run on.

But that's an issue for another day. What I want to pose now is the idea that when determining the latency of an interactive system, the timer should not be started when something triggers the system; it should be started when the person using that system decides that they want to do something. Once you've decided you want to do something, everything between then and when you actually get it done is wait time, even if during that time you're typing in commands or pulling down menus or wading through search results, etc. One of the nice things about thinking about things this way is that it offers a framework for thinking about such disparate performance issues as UI design (minimize the time needed to get from deciding what you want to do to expressing it) and prefetching and speculative execution (both of which entail satisfying requests that have not yet been expressed).
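
To make the three choices of timer endpoints concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Interaction record, its field names, and the timestamps are hypothetical, and the numbers were picked so that the HTML download falls inside the 10-20% range of Souders's rule. It is not a measurement of any real site.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """Timestamps (in seconds) for one user interaction. All values are hypothetical."""
    intent: float         # the user decides what they want to do
    request_sent: float   # the user has finished expressing it; the request leaves the browser
    html_received: float  # the HTML document arrives from the web site
    page_rendered: float  # every component on the page has been downloaded and displayed


def latency_views(i: Interaction) -> dict:
    """Return the latency of the same interaction under three choices of timer endpoints."""
    return {
        # Timer runs from the triggering request to the HTML response.
        "system_view": i.html_received - i.request_sent,
        # Souders's view: the timer keeps running until the page is rendered.
        "page_view": i.page_rendered - i.request_sent,
        # The view proposed here: the timer starts when the user forms the intent.
        "user_view": i.page_rendered - i.intent,
    }


if __name__ == "__main__":
    # Hypothetical interaction: 4 s spent typing and clicking before the request goes out.
    sample = Interaction(intent=0.0, request_sent=4.0, html_received=4.5, page_rendered=7.0)
    for name, seconds in latency_views(sample).items():
        print(f"{name}: {seconds:.1f} s")
```

With these made-up numbers, the web site would report half a second, the browser-level view three seconds, and the user seven. The four seconds between intent and request are invisible to both of the first two measurements, and that is the interval UI design and prefetching try to shrink.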

Scott

Fastware!-Related Talk at Portland Code Camp

After a year-long digression to work on C++0x, I'm finally able to get back to Fastware!, and on May 22 (a week from Saturday), I expect to have something to show for it. I'll be giving a talk at Portland Code Camp, in Portland, Oregon. The topic will be CPU Caches and Why You Care, and the material will be primarily taken from the Fastware! chapter on hardware, although I also plan to touch on the impact of cache considerations on algorithm and data structure design.

Like all Code Camps, Portland Code Camp is free, so if you live near Portland, Oregon, and don't mind devoting a Saturday to all things code-related, I encourage you to register, then come by my session for a crash course in CPU caches. It should be interesting to see how it goes, given that the talk currently exists only in my head. But in my head, it's really good :-)

Scott

Wednesday, May 5, 2010

"New Publishing Project" Unveiled

In February, I said that I'd been spending a lot of time on a new publishing project and that I'd say more here when there was more to say. There's now more to say, but, because I recently said it to my mailing list (which I've since migrated to a blog for my professional announcements), I'll make this posting brief and simply refer you to my longer posting there.

I've started publishing annotated versions of my training materials on selected topics. Currently there are materials on C++0x and on making effective use of C++ in embedded systems, and soon my materials on improving software quality (regardless of your programming language) will become available. Viewed as a traditional publication, the C++0x materials are, as Pete Isensee described them, "not a book, and not a slide show -- it's something in between," but there are two untraditional characteristics of these materials that I hope will make them appealing:

No DRM, plus a very flexible license. Make as many copies as you want, mark them up in any way you like, print them till your toner runs dry, that's all fine. Pretty much the only restriction is that the materials are for your personal use only, so please, no sharing.
Free updates for life. I maintain my training materials for my own use, and you're entitled to every revised version I come out with. There will never be a "second edition" to buy anew. Pay once, and you get free updates as long as I produce them.

There's more to the sales pitch, but you can get the full spiel here. The complete background story is available here.

From a Fastware! point of view, the fact that I can now put development of C++0x materials and preparing them for publication behind me means I can finally get back to work on Fastware!, and in fact I've already done so. I'm currently working on a presentation entitled "CPU Caches and Why You Care," which is directly tied to material I plan to put in Fastware!.

Scott
