Wednesday, March 18, 2009

Jakob Nielsen Post on Kindle-Friendly Content

The TOC Blog pointed me to Jakob Nielsen's recent post, "Kindle Content Design." Nielsen raises a number of interesting issues, and in general I think we're in violent agreement, but his post ends with this:
Since I started writing Alertbox in 1995, it's been a recurring theme to design for the medium. In the beginning, this meant "don't design your website like a glossy brochure." (I.e., print design is different than online design.)

For Kindle, it's certainly unacceptable to simply repurpose print content. But you can't repurpose website content, either. For good Kindle usability, you have to design for the Kindle. Write Kindle-specific headlines and create Kindle-specific article structures.

I fully agree with the part about designing for the medium, but I don't think Kindle is a medium in and of itself. There are a number of other dedicated book-reading devices, and my guess is that they are similar enough to one another that it's possible to design for the entire category. As an author, I don't want to create content on a device-by-device basis; I want to do it on a device-category-by-device-category basis. That may not yield optimal content presentation for every device, but it should yield acceptable presentation for devices I know about as well as devices I don't know about (e.g., those introduced after I've finished writing). Yes, my software development roots keep showing: I approach cross-platform content authoring the same way I approach cross-platform software authoring.

Tuesday, March 3, 2009

Saving Myself from Physical Formatting

I've been reviewing my FrameMaker source files for Fastware!, looking out for single-sourcing no-nos. High on the list is the use of physical styles instead of logical ones, the poster child for which is the use of italics to style text instead of a logical style like "Book Title." The problem is that italics can signify emphasis, the title of a publication, the introduction of a new term, and more, and once all of these are marked up as plain italics, there's no way to tell them apart when a different output calls for different treatment. A well-styled manuscript will use only logical style names -- never raw italics.

Alas, nearly every authoring system I know goes out of its way to make it easy to italicize text. WYSIWYG systems (at least those on Windows) define ^I to mean "italicize the selection", HTML offers <i>, LaTeX has \it, reST offers *text*, etc. It's a similar situation for making things bold or underlined, both of which are also physical styles and should thus be avoided.

From what I can tell, there is no notion of "italicization" in the DocBook or DTBook schemas, and the screen shot for oXygen's XML Author for DocBook shows a toolbar button only to emphasize text, not italicize it. Score one for XML.
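
The payoff of logical styles can be illustrated with a minimal sketch. Everything in it is hypothetical: the style names, the two output targets, and the renderings chosen for each are assumptions made for illustration, not anything taken from FrameMaker, DocBook, or DTBook.

```python
# Hypothetical logical style names and per-target renderings. None of
# these come from FrameMaker, DocBook, or any other real tool.
RENDERINGS = {
    "print": {
        "Emphasis": "italic",
        "BookTitle": "italic",
        "NewTerm": "italic",
        "ForeignPhrase": "italic",
    },
    "ereader": {
        "Emphasis": "bold",
        "BookTitle": "italic",
        "NewTerm": "underline",
        "ForeignPhrase": "italic",
    },
}


def render(spans, target):
    """Resolve (text, logical_style) pairs to (text, physical_style) pairs."""
    table = RENDERINGS[target]
    # Unknown logical styles fall back to plain text.
    return [(text, table.get(style, "plain")) for text, style in spans]


manuscript = [
    ("faux pas", "ForeignPhrase"),
    ("Effective C++", "BookTitle"),
    ("really", "Emphasis"),
]

print(render(manuscript, "print"))
print(render(manuscript, "ereader"))
```

In the print target all three spans come out italic, so a manuscript that had stored only "italic" could not be restyled for the second target without someone rereading every italicized phrase.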

I've decided to use FrameMaker, not XML, so I'll be working in WYSIWYGland. I'm still committed to using only logical styles, and although I'm a pretty disciplined guy, I have no illusions that I'm perfect. If I don't find a way to keep myself from doing it, I know I'll occasionally type ^I (or ^B for bold) in FrameMaker without thinking about it. I won't notice my formatting faux pas (note use of italics to indicate a foreign expression, which is quite different from using them to emphasize something or to indicate a book title), because everything in my WYSIWYG world will look right. The situation would likely be the same in an italics-supporting markup-based world such as LaTeX or reST.

Fortunately, FrameMaker's keyboard and menu command set is determined by configuration files, so, with some guidance from the kind folks at the FrameMaker forum, I was able to excise keyboard and toolbar commands for italics, bold, and underlining. Now, should I foolishly try to italicize something via the errant ^I, nothing happens.

Experience Report: My C++ Books on Kindle

I recently announced to my mailing list that, unbeknownst to me (and, as it turns out, my editor), two of my C++ books have been made available on Kindle:
Two of my books -- Effective C++ and Effective STL -- are now available for Amazon's Kindle. I haven't seen these editions myself, so I can't tell you anything about them. Since I don't have a Kindle, this is unlikely to change anytime soon. If you try these editions, please let me know what you think of them. You'll find links to the Kindle editions of these books at http://www.aristeia.com/books.html .
Shortly thereafter, Herb Sutter wrote me as follows:
I downloaded a Kindle sample of one of your books, which included enough to see some source code examples. In general it looks good, except for one thing: Source code is rendered badly. The text is clear, but the two problems are:

a) Line length (though not as bad as narrow magazine columns & what iPhone would be like): Medium-long lines have bad wraps that make the examples a pain to read. But line length is probably going to be an issue for all small-form-factor devices.

b) Proportional font => comments don’t line up. It’s possible to get fixed-width fonts on Kindle but you have to try hard and use the <pre> tag explicitly. For more, see Larry O’Brien’s recent blog note about targeting technical docs to Kindle, including the comment about getting Greek text to work by explicitly using UTF-8.
Herb included a couple of quick-and-dirty screen shots, including this one:

The code examples, which I'd carefully formatted for the font and page width of the printed book, were apparently moved to the Kindle without any reformatting consideration, so the Kindle tossed in new line breaks wherever it felt they were needed. The result, unsurprisingly, is awful. This is consistent with my belief that the chances of writing a book with anything beyond straight prose that looks good on multiple devices are close to zero. The more special formatting that's needed (e.g., for recipes, poetry, or code), the more care an author (and his or her publisher) is likely to need to take.
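
The effect of device-inserted line breaks on a code listing can be approximated with a short sketch. It assumes a renderer that simply breaks at whitespace once a column limit is reached, which may not match what the Kindle actually does, and the 30-column width is an arbitrary stand-in rather than a measured device width. The function names are made up for this illustration.

```python
import textwrap


def reflow(listing, width):
    """Wrap each line of a code listing the way a naive reflowing
    renderer might: break at whitespace once `width` is exceeded."""
    out = []
    for line in listing.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blanks explicitly.
        out.extend(textwrap.wrap(line, width=width) or [""])
    return out


def overlong_lines(listing, width):
    """Return (line_number, length) for each line longer than `width`."""
    return [
        (number, len(line))
        for number, line in enumerate(listing.splitlines(), start=1)
        if len(line) > width
    ]


sample = (
    "int x = 0;            // initialize the counter\n"
    "std::vector<int> v;   // holds the results\n"
)

for line in reflow(sample, 30):
    print(line)
print(overlong_lines(sample, 30))
```

A check like `overlong_lines` could be run over a manuscript's listings before conversion to flag the lines that would need manual reformatting for a narrower display.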

Bearing in mind that my recent C++ books use two ink colors (black plus a red highlight color for places where I want to focus readers' attention), Herb wrote:
Some text that I think you had as red or some other color looks gray on Kindle. It’s still readable, if a little less distinct. Probably the best that can be done given that Kindle 1 only has a 4-level gray scale. It’d probably look slightly better on a Kindle 2 with a 16-level gray scale, but only slightly since you do have to make it gray enough to be distinct and so I’d imagine you couldn’t just use levels 14 and 16 for example.
I discussed the problem of writing for devices with different capabilities such as color in an earlier post.

Herb concluded:
If you’re serious about thinking of targeting devices like Kindle or iPhone, it’s worth having one.
I replied:
Practically speaking, I'm sure you're right, but this is the kind of thing I'd like my publisher to address for me. I want to focus on content, not device-dependent presentation issues. One of the things my publisher should do is investigate the landscape of output devices, then give me advice on what I should or should not do in my ms to keep it cross-platform-friendly.
I thank Herb for his mini-review of my books on Kindle and for his permission to use his material here.

Monday, March 2, 2009

IO

When I first started thinking about Fastware!, I knew I'd have a chapter on IO, but I was concerned that it would take me a while to have much to say about the topic, and I expected it to be one of the last chapters I attacked. Since then, my view has changed, and in part due to the excitement I got from reading Tom Leighton's article in CACM (and ACM Queue -- sometimes recycling isn't so great, sigh) about improving performance using the Internet, I'm now ready to draft a chapter on IO.

It's an interesting question to try to define IO. Technically, IO is probably anything that communicates with something off-chip, so accessing main memory can be considered IO, although I don't plan to treat it that way. My current plan is to focus on disk and network issues, but even there things are beginning to get fuzzy, because solid state drives use a traditional disk API, but don't have anything akin to rotational latency. At some point I may decide to create a chapter focusing on storage (cache, memory, and "disks," regardless of technology) and another on network IO, but for now, the traditional view that IO largely means disk and network access seems reasonable to me. I welcome comments on the best way to organize these issues.

So what are the topics relevant to the creation of sizzling disk and network IO? Here are the main ones on my list:
  • Prefetching, buffering, and caching (by hardware, drivers, OSes, language runtime systems, and applications)
  • Asynchronous and concurrent reads/writes (including disk striping)
  • Memory-mapping files
  • Avoiding disk fragmentation
  • Network protocol choices (e.g., TCP vs. UDP)
  • Doing IO on deltas instead of full data sets
  • Reducing network distances (e.g., via CDNs)
  • Data compression and "bundling" (e.g., CSS sprites)
  • Latency vs. bandwidth
I welcome suggestions for issues related to the design and implementation of low-latency IO. Because Fastware! is a language-independent book, I expect to make at most passing references to language-specific techniques (e.g., using C++ rdbuf or istreambuf_iterators), but I'm still interested in hearing about them, because it's not uncommon for different languages to have their own approaches to a more general issue, and I want to include discussions of as many general issues as I can.
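
As one illustration of the buffering item in the list above, here is a minimal sketch that counts how many read requests it takes to consume the same file with different chunk sizes. It is written in Python purely for brevity; the helper names, file size, and chunk sizes are arbitrary assumptions. It counts requests rather than measuring time, because timings depend heavily on the OS cache and the storage device.

```python
import os
import tempfile


def make_sample_file(size):
    """Create a temporary file holding `size` bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


def count_reads(path, chunk_size):
    """Read the whole file unbuffered; return (bytes_read, read_calls)."""
    total = 0
    calls = 0
    # buffering=0 disables Python's own buffer, so each read() call here
    # corresponds roughly to one read request to the operating system.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
    return total, calls


path = make_sample_file(4096)
try:
    small = count_reads(path, 16)
    large = count_reads(path, 4096)
    print("16-byte chunks:  ", small)
    print("4096-byte chunks:", large)
finally:
    os.remove(path)
```

Both runs transfer the same bytes, but the small-chunk run issues far more requests. Reducing the number of round trips for a fixed amount of data is the general idea behind buffering at every layer, from drivers up to applications.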