The Fastware Project

Monday, July 19, 2010

Blog End

When I started this blog, my short-term goal was to have a place to think out loud about the uncertainty I felt as an author of technical material in a publishing world that was on the brink of making the transition from black-ink-on-white-paper to whatever comes next. I can't say I'm now feeling a lot less uncertain, and in fact the pace of change in the publishing world has only accelerated, as electronic publication assumes increasing importance, but I no longer feel as angsty about it. This counts as progress.

My longer-term goal was to engage in a dialogue with people interested in the production of fast software systems such that I could do a better job with the content of Fastware!. Doing that, however, requires that I write up reasonable initial blog posts to spur discussion, and I've found that this is not something I enjoy. To be honest, I view it as overhead. Given a choice between doing background research to learn more about a topic (typically reading something, but possibly also viewing a technical presentation, listening to a technical podcast, or exchanging email with a technical expert) or writing up a blog entry to open discussion, I find myself almost invariably doing the research. One reason for this is that I feel obliged to have done some research before I post, anyway, and I typically find that once I'm done with the research, writing something up as a standalone blog entry is an enterprise that consumes more time than I'm willing to give it. It's typically easier to write the result up in the form of a technical presentation, then give the presentation and get feedback that way. This, for example, is what I did with my work on CPU caches.

My work on Fastware! will continue, but I don't expect to make any more blog entries about it here. Instead, I'll fall back on my usual policy of not making an announcement about anything until whatever it is I have to announce is fairly well-baked. If you're interested in announcements of that type, I suggest you follow my "Professional Announcements" blog.

Scott

Friday, May 21, 2010

Notes for Portland Code Camp Talk Now Available

My presentation isn't until tomorrow, but I somehow managed to finish the materials for it today. If you're interested in what I have to say on the topic "CPU Caches and Why You Care," I encourage you to download the presentation materials (PDF) and take a look. If you find that you have something to say in return, I encourage you to post your comments here or to send me email directly (smeyers@aristeia.com).

I'd like to say a lot more than is in the presentation (and certainly there's a lot more to be said), but the length of the presentation is only 75 minutes, and I'm shooting to talk for only about an hour to allow time for questions.

I'll face a similar constraint when I write about hardware caches in Fastware!, because I can devote only part of one chapter to the topic, and, as Ulrich Drepper demonstrated in What Every Programmer Should Know About Memory, there's an enormous number of things that can be said. If you have comments on what developers really need to understand (because it is likely to affect how they design and implement their systems), please let me know.

Scott

Thursday, May 13, 2010

A Thought About Minimizing System Latency

When programmers think about optimizing things, they often think about optimizing code. After all, code is where they live. It's what they do. If it's not the code that's to be optimized, what is? I confess that this was my way of thinking for many years. Many years. Want to know how to make things faster? Fire up the profiler!

I no longer think that way. Now I think about optimizing the system. (This recognition, in fact, has convinced me that I'm going to have to give Fastware! a new subtitle. It's currently "Straight Talk about Fast Code," which I think has a nice ring to it, but that's too codecentric. I'll have to find a way to say something catchy that corresponds to "Straight Talk about Fast Systems." But I digress.)

In rare cases, the system is a single executable over which you have complete control, so, sure, fire up the profiler. More commonly, the system consists of multiple cooperating components, often broken across multiple executables. Web-based systems, for example, may have different components handling network traffic, business logic, database queries, etc. In that case, if the system is slow, it's the system that's slow, so before you fire up the profiler, you need to know where to point it.

At first glance, the general way you approach this problem is to start a timer when an event brings the system into action (e.g., a browser sends a request to your web site), stop the timer when the system produces the appropriate response (e.g., packets are sent back to the browser), then look at how much time was taken in the various parts of the system. The problem is that in many cases, this is the wrong way to think about things.
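For concreteness, here is a minimal Python sketch of that timer-based approach. The stage names (network, business_logic, database) and the sleep calls standing in for real work are assumptions made purely for illustration; a real system would wrap its own components.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage. The stage names used below are
# hypothetical placeholders, not taken from any particular system.
timings = {}

@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the with-block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed

def handle_request():
    # Each sleep is a stand-in for the real work done by that component.
    with timed("network"):
        time.sleep(0.01)
    with timed("business_logic"):
        time.sleep(0.02)
    with timed("database"):
        time.sleep(0.03)

handle_request()

# Report the stages from slowest to fastest, with each one's share of the total.
total = sum(timings.values())
for stage, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{stage:15s} {seconds * 1000:7.1f} ms  {seconds / total:6.1%}")
```

A breakdown like this tells you where to point the profiler, but notice that the clock only starts once the request has reached the system.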

Steve Souders has done a great job of demonstrating one aspect of this idea as it applies to web sites, describing in various forms (book, article, video) how he derived what he calls the Performance Golden Rule:

Only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components in the page.

His approach takes the view that you start the timer when the initial request for an HTML page comes in, but you don't stop it until everything on the page has been rendered in the browser. From a user's point of view, this is much more reasonable. After all, until the page has been displayed, he or she is still waiting, even if the web site itself thinks the job was finished long ago.

One of the most interesting implications of this observation is that a critical part of the user's perception of the system's performance is determined by software (in this case, the browser) over which "the system" has no control. For example, a big part of how fast a site like Yahoo (where Souders worked when he wrote High Performance Web Sites; he's at Google now) seems to be is determined by the user's browser, and of course Yahoo has no control over which browser the user has chosen to use. The idea that the performance of a system is influenced by software over which it has no control generalizes. Even native applications, for example, are typically to some degree at the mercy of the libraries they link with and the operating system they run on.

But that's an issue for another day. What I want to pose now is the idea that when determining the latency of an interactive system, the timer should not be started when something triggers the system; it should be started when the person using that system decides that they want to do something. Once you've decided you want to do something, everything between then and when you actually get it done is wait time, even if during that time you're typing in commands or pulling down menus or wading through search results, etc. One of the nice things about thinking about things this way is that it offers a framework for thinking about such disparate performance issues as UI design (minimize the time needed to get from deciding what you want to do to expressing it) and prefetching and speculative execution (both of which entail satisfying requests that have not yet been expressed).
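One way to make this view of latency concrete is to model an interaction with three timestamps instead of two. The sketch below is only an illustration: the Interaction class, its field names, and all of the numbers are invented for the example, and in practice the moment of decision can rarely be observed directly.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """Timestamps (in seconds) for one user interaction. All names are hypothetical."""
    decided_at: float    # the user decides what they want to do
    requested_at: float  # the request has been fully expressed to the system
    completed_at: float  # the result is in front of the user

    @property
    def system_latency(self) -> float:
        # What a conventional timer sees: trigger to response.
        return self.completed_at - self.requested_at

    @property
    def user_latency(self) -> float:
        # What the user experiences: decision to result.
        return self.completed_at - self.decided_at

# Made-up numbers for three variants of the same task.
baseline = Interaction(decided_at=0.0, requested_at=4.0, completed_at=5.0)
better_ui = Interaction(decided_at=0.0, requested_at=1.5, completed_at=2.5)
prefetched = Interaction(decided_at=0.0, requested_at=4.0, completed_at=4.25)

for name, it in [("baseline", baseline), ("better UI", better_ui), ("prefetched", prefetched)]:
    print(f"{name:10s} system={it.system_latency:.2f}s  user={it.user_latency:.2f}s")
```

With these numbers, the baseline and the better UI have identical system latency (1.0 second), yet the user waits 5.0 seconds in one case and 2.5 in the other. Prefetching shrinks the system's share to 0.25 seconds, but the time spent expressing the request still dominates what the user experiences.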

Scott

Fastware!-Related Talk at Portland Code Camp

After a year-long digression to work on C++0x, I'm finally able to get back to Fastware!, and on May 22 (a week from Saturday), I expect to have something to show for it. I'll be giving a talk at Portland Code Camp, in Portland, Oregon. The topic will be CPU Caches and Why You Care, and the material will be primarily taken from the Fastware! chapter on hardware, although I also plan to touch on the impact of cache considerations on algorithm and data structure design.

Like all Code Camps, Portland Code Camp is free, so if you live near Portland, Oregon, and don't mind devoting a Saturday to all things code-related, I encourage you to register, then come by my session for a crash course in CPU caches. It should be interesting to see how it goes, given that the talk currently exists only in my head. But in my head, it's really good :-)

Scott

Wednesday, May 5, 2010

"New Publishing Project" Unveiled

In February, I said that I'd been spending a lot of time on a new publishing project and that I'd say more here when there was more to say. There's now more to say, but, because I recently said it to my mailing list (which I've since migrated to a blog for my professional announcements), I'll make this posting brief and simply refer you to my longer posting there.

I've started publishing annotated versions of my training materials on selected topics. Currently there are materials on C++0x and on making effective use of C++ in embedded systems, and soon my materials on improving software quality (regardless of your programming language) will become available. Judged as a traditional publication, the C++0x materials are, as Pete Isensee described them, "not a book, and not a slide show -- it's something in between," but there are two untraditional characteristics of these materials that I hope will make them appealing:

No DRM, plus a very flexible license. Make as many copies as you want, mark them up in any way you like, print them till your toner runs dry, that's all fine. Pretty much the only restriction is that the materials are for your personal use only, so please, no sharing.
Free updates for life. I maintain my training materials for my own use, and you're entitled to every revised version I come out with. There will never be a "second edition" to buy anew. Pay once, and you get free updates as long as I produce them.

There's more to the sales pitch, but you can get the full spiel here. The complete background story is available here.

From a Fastware! point of view, the fact that I can now put development of C++0x materials and preparing them for publication behind me means I can finally get back to work on Fastware!, and in fact I've already done so. I'm currently working on a presentation entitled "CPU Caches and Why You Care," which is directly tied to material I plan to put in Fastware!.

Scott

Thursday, February 18, 2010

Updated Fastware!-Related Information Sources

Almost exactly one year ago, I uploaded an Excel spreadsheet summarizing sources of information I'd consulted as research for what I still claim will become a book. The blog has been silent since my promise last July to return to work on the book later that month, and while it's true that other projects have soaked up most of my time since then, I have managed to add some sources to the spreadsheet. I've uploaded the revised version, and, as before, you'll find a link to that spreadsheet at the Fastware! web site.

I still plan to get back to Fastware! as my primary project soon, but that's been the case for months, so I'm not going to make any predictions regarding when it will actually take place. I will say that one of the things taking up a lot of my time has been a new publishing project that I expect to be able to announce within the next few weeks. It's not specifically related to Fastware!, but it does relate to the publishing-related issues I've raised in this blog.

When there's more to say, I'll say it here.

Monday, July 13, 2009

Fastware! has to Time-Share

For better or for worse, my work on Fastware! must sometimes give way to other duties I have, and one of the obligations I signed up for at the beginning of the year was development of a new training course on C++0x. I expected that to take about a month. Oops. Development of a full draft (draft!) turned out to take closer to three months, which is why this blog has been silent since mid-April. I expect to get back to Fastware! later this month, so worry not, the project is still very much alive.

Should you happen to have an interest in the C++0x training course I'm finishing up, feel free to visit its web page.

Sunday, April 12, 2009

TOC Talk Now Available Online

The talk I gave at the Tools of Change in Publishing Conference in February is now available online. It runs about 46 minutes. If you've been reading this blog, you won't get anything new from the talk, but if you're interested in the "live" presentation, you can find it here.

Wednesday, March 18, 2009

Jakob Nielsen Post on Kindle-Friendly Content

The TOC Blog pointed me to Jakob Nielsen's recent post, "Kindle Content Design." Nielsen raises a number of interesting issues, and in general I think we're in violent agreement, but his post ends with this:

Since I started writing Alertbox in 1995, it's been a recurring theme to design for the medium. In the beginning, this meant "don't design your website like a glossy brochure." (I.e., print design is different than online design.)
For Kindle, it's certainly unacceptable to simply repurpose print content. But you can't repurpose website content, either. For good Kindle usability, you have to design for the Kindle. Write Kindle-specific headlines and create Kindle-specific article structures.

I fully agree with the part about designing for the medium, but I don't think Kindle is a medium in and of itself. There are a number of other dedicated book-reading devices, and my guess is that they are similar enough to one another that it's possible to design for the entire category. As an author, I don't want to try to create content on a device-by-device basis; I want to do it on a device-category-by-device-category basis. That may not yield optimal content presentation for every device, but it should yield acceptable presentation for devices I know about as well as devices I don't know about (e.g., that are introduced after I've finished writing). Yes, my software development roots keep showing: I approach cross-platform content authoring the same way I approach cross-platform software authoring.

Tuesday, March 3, 2009

Saving Myself from Physical Formatting

I've been reviewing my FrameMaker source files for Fastware!, looking out for single-sourcing no-nos. High on the list is the use of physical styles instead of logical ones, the poster child for which is the use of italics to style text instead of a logical style like "Book Title." The problem is that the use of italics can be for emphasis, for the title of a publication, for the introduction of a new term, and more. A well-styled manuscript will use only logical style names -- never raw italics.
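The payoff of logical styles is easiest to see in code. The following Python sketch is an illustration only, not a description of how FrameMaker or any other tool works: the style names, the RENDERERS table, and the render function are all hypothetical. The text is tagged once with logical styles, and each output target decides on the physical formatting.

```python
import html

# Logical style -> physical rendering, one table per output target.
# The style names and the choice of targets are hypothetical.
RENDERERS = {
    "html": {
        "Emphasis": "<em>{}</em>",
        "Book Title": "<cite>{}</cite>",
        "New Term": "<dfn>{}</dfn>",
    },
    "plain": {
        "Emphasis": "*{}*",
        "Book Title": "_{}_",
        "New Term": "{}",
    },
}

def render(runs, target):
    """Render a list of (text, logical_style_or_None) pairs for one target."""
    styles = RENDERERS[target]
    pieces = []
    for text, style in runs:
        if target == "html":
            text = html.escape(text)  # escape &, <, > before wrapping in tags
        pieces.append(styles[style].format(text) if style else text)
    return "".join(pieces)

runs = [
    ("Souders ", None),
    ("really", "Emphasis"),
    (" makes the case in ", None),
    ("High Performance Web Sites", "Book Title"),
    (".", None),
]

print(render(runs, "html"))
print(render(runs, "plain"))
```

Had both runs simply been tagged "italic," there would be no way for the HTML target to tell the emphasis from the book title, which is the information single-sourcing depends on.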

Alas, nearly every authoring system I know goes out of its way to make it easy to italicize text. WYSIWYG systems (at least those on Windows) define ^I to mean "italicize the selection", HTML offers <i>, LaTeX has \it, reST offers *text*, etc. It's a similar situation for making things bold or underlined, both of which are also physical styles and should thus be avoided.

From what I can tell, there is no notion of "italicization" in the DocBook or DTBook schemas, and the screen shot for oXygen's XML Author for DocBook shows a toolbar button only to emphasize text, not italicize it. Score one for XML.

I've decided to use FrameMaker, not XML, so I'll be working in WYSIWYGland. I'm still committed to using only logical styles, and although I'm a pretty disciplined guy, I have no illusions that I'm perfect. If I don't find a way to keep myself from doing it, I know I'll occasionally type ^I (or ^B for bold) in FrameMaker without thinking about it. I won't notice my formatting faux pas (note use of italics to indicate a foreign expression, which is quite different from using them to emphasize something or to indicate a book title), because everything in my WYSIWYG world will look right. The situation would likely be the same in an italics-supporting markup-based world such as LaTeX or reST.

Fortunately, FrameMaker's keyboard and menu command set is determined by configuration files, so, with some guidance from the kind folks at the FrameMaker forum, I was able to excise keyboard and toolbar commands for italics, bold, and underlining. Now, should I foolishly try to italicize something via the errant ^I, nothing happens.