Saturday, December 27, 2008

Notes from Effective C++ CD

During 1998, much of my time was devoted to designing and helping supervise the implementation of an electronic version of two of my books. The result was Effective C++ CD, an HTML implementation of the books and some associated magazine articles, where links connected everything on the CD. In our work, we addressed some content issues and many presentation issues, and we produced enough innovations to merit an article for practitioners in Microsoft Internet Developer and a paper for academics in Proceedings of the 5th Conference on Human Factors & the Web.

I recently found myself in the directory with the CD's files, and I took the opportunity to review the notes I'd made regarding how we could improve the CD were we to undertake the project again (e.g., as a second edition). Most of the comments were specific either to the content of the CD or to the web-browser-as-book-viewer decisions we made (hence not germane to my work on Fastware!), but some remain relevant today. Here they are (in no particular order):
  • Text in graphics should be visible to search engines. (This generalizes: text in figures, tables, diagrams, animations, etc., should be visible to search engines.)

  • Electronic versions of books are essentially software and, like contemporary software, they should be updatable via the net. When bugs are fixed in an electronic version of a book, owners of that book have a right to expect to be able to incorporate those bug fixes into the book they've purchased.

  • The books on my CD are organized into either 35 or 50 "Items," which are essentially technical essays. Each book has an extensive set of cross references among its Items, so Item 22 might refer to Item 21 and Item 8. Within a book, this was fine, but when I added links between the books' Items (a feature exclusive to the CD), I had to make clear which book had the Item I was referring to. That is, "Item 22" was unambiguous within a print book, but it wasn't unambiguous on a CD with two books, each of which had such an Item. I addressed this by prepending a book-specific letter to the Item number for Items outside the current book, e.g., "Item 22" is in the current book, but "Item M22" or "Item E22" is in the other book.

    A fair number of people found this confusing. One way to address this problem would have been to always use the E or M prefix on all Item cross references, but this would have introduced syntactic noise and led to the electronic books not looking the same as their print versions.

    The real problem, I think, is trying to figure out how to write for something that might stand alone (as, e.g., a print book) but that might also be part of a collection of interlinked documents. Readers of the standalone version shouldn't be bothered with cross reference disambiguation overhead that's needed only in non-standalone environments, but books they know from their standalone versions should look essentially the same as the versions they encounter in linked environments. (For a related discussion, consult my vision for electronic books.)

  • Because link text generally looks different from non-link text, it calls attention to itself. That's the point of it looking different: to communicate, "Hey, I'm clickable!" Unfortunately, when too much text tries to get your attention at the same time, it intrudes on the reading experience. For example, contrast this, where every reference to Amazon is linked,
    You can buy lots of stuff at Amazon. That's because Amazon sells lots of stuff. Amazon customers expect lots of stuff at Amazon, because that's what Amazon is known for. Yay, Amazon!
    with this, where only the first reference is linked:
    You can buy lots of stuff at Amazon. That's because Amazon sells lots of stuff. Amazon customers expect lots of stuff at Amazon, because that's what Amazon is known for. Yay, Amazon!
    Declining to link text that was made a link just a moment earlier is, to me, akin to using pronouns to refer to nouns that were recently introduced. Pronouns make text more interesting and less repetitive, but harder to understand out of context. Leaving repeated references unlinked is similar: it avoids visual repetition, but it makes the text harder to understand out of context.

    There are two additional issues relating to whether repeated text should be made active at each point of repetition. The first has to do with consistency. More than one reader of my CD complained that I was inconsistent about what text was linked and what was not. These readers seemed to expect all naturally linkable text to be linked, no matter how many times that text occurred, even within a short space.

    The other additional issue concerns search engines, which can plop you down in an arbitrary location in an arbitrary document. If you start reading and you encounter a pronoun, you naturally scan backwards looking for the antecedent. But if you encounter text that seems like it should be a link, my guess is that you don't scan backwards looking for the same text in link form. Rather, you get annoyed at the author for failing to make the text you're looking at a link. That leads to the challenge: how do you avoid the visual clutter that accompanies making every occurrence of naturally linkable text into a link while also meeting the link-related expectations of readers who use search engines to take them to the point in a document where they start reading?

  • Naturally linkable text like "Item 5," "Section 3.5.1," or "Chapter 4" generally points to a target that has a title, and some readers want to see that title without having to traverse the link. Instead of
    As you'll see in Chapter 4, ...
    they'd prefer to see:
    As you'll see in Chapter 4 ("Giant Anteaters"), ...
    Some authors do this in print, but I find it distracting as a reader. As the author of an electronic book, I can offer the title as an option by, e.g., displaying it when the mouse hovers over the link text. But that means I have to make sure that capability is provided when my book is prepared for electronic publication. (A similar capability can be used to avoid making readers turn to a glossary to see a term's definition.)

  • If I'm looking at an nth-level index entry, it would be nice to have an easy way to get to the n-1st level, i.e., essentially a way to move to the parent entry for a child index entry.

  • One reader wrote:
    It would be great if you can have a table of content showing in the navigation area, and there is a toc synchronization function (much like Microsoft Workshops), so that the readers will have a better idea of where they are in the book.
    This is one way to address the "where am I?" problem that can arise as the result of a search or when following a link from one part of a document to another (or from one document to another).

Monday, December 15, 2008

An Introduction to Fastware!

All my blog entries to date have been about issues related to authoring: things that affect my choices among writing tools and my strategies for effectively conveying the information I want to get across to my readers. (As I've noted before, the term "reader" is misleading, because one of the forms in which I'd like Fastware! to be usable is audible. The proper term is probably "content consumer," but I'll stick with "reader," in part because it's a lot less ugly, in part because it better reminds me that I'm primarily writing for humans, not machines.) This blog entry is a bridge between authoring concerns and content issues, because it touches both.

Experienced authors and publishers will tell you that you usually write a book's introduction last, because you can't really know what needs to be introduced until you've written it. When working on my past books, I used the placeholder introductory chapter as a dumping ground for terms that needed to be defined, assumptions that needed to be explained, conventions that needed to be described, etc. When the book proper was done, I'd go back and sift through the debris that had made its way to what was to become the Introduction, take a deep breath, and do my best to make a coherent narrative out of the odds and ends I found there.

For Fastware!, I chose a different approach. My experience has been that the need for a book on how to write software that runs quickly is not self-evident to many people. That bothered me. I view the case as overwhelmingly strong, and I felt compelled to make that case right away. As a result, I wrote Fastware!'s Introduction first, and I've now made a draft available at the book's web site. In its current form, the chapter is more manifesto than Introduction, and I know I'll have to add more material once I've written the rest of the book, but it should give you a good idea of what I envision the book to ultimately be.

There are two parts to that vision, content and presentation, and the Introduction should give you a glimpse of both. (If you've been following this blog, you know that I believe that content and presentation are not really separable. If you haven't been following the blog and are interested in this view, check out this and this.) The content should be self-explanatory. If it's not, I've botched my job; please let me know about it, either as comments on this blog or as email to smeyers@aristeia.com.

Regarding the draft presentation, here are a few things I think worth pointing out:
  • The book is about speed, and visually, it should come across that way. One way I've tried to convey this is the use of italics in the chapter title, section and sidebar heads, and the footer. Like runners striving to move faster, italic letters lean forward. Another way is the fireball behind the chapter numbers. This is cheesy in its current form, but I have no illusions that I'm an artist; the fireball is a placeholder. My original idea was to have flames shooting out the back of the page numbers, and I'd ultimately like to do something more like that. Another problem with the fireball is that it's too prominent, but that can be toned down in various ways (e.g., increase the transparency of the image). The main thing is to come up with subtle ways to suggest movement -- fast movement -- through the book's layout and formatting.
  • "Voice of Experience" sidebars reinforce material in the chapter they accompany. This is one of the ideas for Fastware! I'm particularly enthusiastic about, and it straddles the line between content and presentation. The book will contain lots of suggestions about how to write fast software, and after a while, I expect readers to roll their eyes and mutter, "Yeah, yeah, yeah..." Some of the suggestions may strike some readers as less important than I know them to be, and I worry that such readers will skip the muttering and simply roll their eyes.

    Some authors, to reinforce the points they make, offer fictional examples demonstrating how things could play out in practice. Other authors give real examples from their own experience. Few authors have the background to personally vouch for the full range of topics I'll cover in Fastware!, and, alas, I'm not one of them. The "Voice of Experience" sidebars are my way of bringing in guest speakers who, in their own words, can back up what Fastware! tells its readers. My plan is to have two sidebars per chapter, although I currently have only one in the draft Introduction.

    The sidebars are designed to have a different look to them, and not just to make it clear that they are sidebars. For readers reading straight through, I want them to pop up from time to time as visual and semantic treats. For readers flipping through the book, I want them to stand out as easy-to-find nuggets that stand on their own and provide useful "from the factory floor" information.
  • Color output is the default. Especially as time goes on, I expect more and more readers to experience Fastware! on a color-capable device, so the primary presentation format should take advantage of color. In the draft Introduction, I sometimes use color to bring out semantics (e.g., for clickable URLs and email addresses, although they are not active in the PDF I posted, sorry). In other cases, I use color simply to make the work more visually engaging. Some will pooh-pooh this use of color, but it has as great an impact on a prospective reader's evaluation of a book as do things like font choices, interline leading, footnotes vs. endnotes, etc. Black electronic text on a white electronic page looks as anachronistic to contemporary readers as black and white TV shows do to contemporary TV viewers. In my draft introduction, I use color in a number of ways to enliven the visual effect: for section headers, for bold-faced text, for sidebar backgrounds, in the line above the footer, in the page number fireball, in "The Voice of Experience" photographs. My goal is to produce a book that looks somewhat less like a traditional book and somewhat more like magazines and web pages.
If you have any comments on my vision for Fastware! (content or presentation) or about the draft Introduction, please let me know, either as comments on this blog or via email to smeyers@aristeia.com.

Friday, December 5, 2008

XML, Structure, and DocBook

In my last post, I mentioned that all I really need to be able to produce is PDF and XML. PDF can be generated from suitable XML, and a colleague who works extensively with the publishing industry remarked that my leaning towards LaTeX was out of step with the industry's move towards XML-based content representation. In and of itself, that didn't strike me as terribly interesting. As I wrote to my colleague:
I don't doubt your observation that publishers are converging on XML as a representation format, but I don't think it's very meaningful. XML is just text in a particular syntactic format, and anything can be translated into XML: FrameMaker, Word, LaTeX, reST, HTML, you name it. Making sense of an XML document requires knowing the schema that assigns semantics to the document's elements, and my sense is that the publishing world is not converging on a common schema. O'Reilly uses DocBook. The Pragmatic Programmers use PML. Presumably Pearson uses something else. If I take my book in DocBook/XML format and give it to somebody whose tool chain expects XML using a different schema, that tool chain will be unable to do anything interesting with the document until it's been translated from DocBook XML to OtherSchema XML. Such a translation may be easy or difficult, depending on how well the semantic elements of the two schemas correspond.
Still, I started poking into information about generating XML from FrameMaker (what I've used for my previous books), and that led into a detour about the difference between Unstructured FrameMaker (the variant I've been using, where there is no document schema) and Structured FrameMaker (the variant that uses document schemas). Both can generate XML, but then I read an article at scriptorium.com that yielded an XML epiphany. The XML generated by Unstructured FrameMaker consists of a flat sequence of paragraphs identified by their styles, e.g.,
Book Title
Book Author
Book Chapter
Chapter Intro
Chapter Section
Section Para
Section Para
Chapter Section
Section Para
Book Chapter
Chapter Intro
...
For purposes of generating PDF, this is fine, because all we need to know is how to format each paragraph style. But the flat sequence of paragraphs fails to reflect the underlying structure of the document. That looks more like this:
Book Title
Book Author
Book Chapter
    Chapter Intro
    Chapter Section
        Section Para
        Section Para
    Chapter Section
        Section Para
Book Chapter
    Chapter Intro
    ...
The structural information isn't needed for typesetting, but it's present in my head as I write, and it's reflected in the eventual formatting (e.g., chapter titles are typeset bigger than section titles, which are typeset bigger than subsection titles, etc.), so having it present in the XML seems like a pretty reasonable notion. Furthermore, XML schemas used by the publishing industry for book representation are doubtless going to contain such information, so if I want to facilitate transformation of my book's XML into whatever XML a publisher might want, my XML needs to have the structural information the target XML will require.

In short, there's XML and there's XML, and XML without structural information about the book content it represents almost certainly imposes serious restrictions on what can be done with it. Going down that road seems foolish.

My need, then, is to be able to generate XML that reflects the logical structure of my book. I thus need an XML schema that defines that structure. I could come up with one from scratch, but I'm not so naïve as to believe that that's a simple task, or, more precisely, a simple task to do well. Call me a reuse buff, but I want to pick up a pre-fab book schema, assume that the people who developed it knew what they were doing, and get on to the real work of producing content. Which pretty much takes me back to DocBook and the search for a DocBook-aware XML editor.
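
For concreteness, here's a rough sketch of how the structured outline above might be expressed in DocBook XML. The element names (book, title, chapter, section, para) are real DocBook; the chapter and section titles and the text are placeholders I've invented, and things like author information and index markup would have elements of their own:

  <book>
    <title>Fastware!</title>
    <chapter>
      <title>Introduction</title>
      <para>Chapter intro text ...</para>
      <section>
        <title>A placeholder section title</title>
        <para>First paragraph ...</para>
        <para>Second paragraph ...</para>
      </section>
      <section>
        <title>Another placeholder section title</title>
        <para>Another paragraph ...</para>
      </section>
    </chapter>
    <chapter>
      <title>Another placeholder chapter title</title>
      <para>...</para>
    </chapter>
  </book>

Producing something like that, rather than a flat run of styled paragraphs, is the whole point of using a structured schema.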

Thursday, December 4, 2008

Reining in Requirements

In an earlier post regarding support for custom color combinations, I wrote:
Being a software person, I'm going to give in to my inclination to generalize and assume that any given set of color choices is going to be problematic for some portion of my readership.
No problem there: I still think that's reasonable. The problem is this part:
Being a software person, I'm going to give in to my inclination to generalize....
A friend who's been following this blog, who's an author who has prepared CRC (camera-ready copy) for several books, and who's currently working on a book where he'd like to satisfy many of the same requirements I would, suggested I consider -- I am not making this up -- Microsoft Word. Hope springing eternal, I posted a message to a couple of Word newsgroups asking whether Word was up to the task of producing multiple PDFs from a single document, where each PDF might have custom page dimensions and custom style definitions. To date I've seen no useful responses, but it occurred to me sometime after I'd posted my query that I'd made a classic software development mistake: I'd generalized my problem to the point where what I said I wanted to do was almost certainly beyond anything I'd ever need to do.

Sure, I want to produce content that can be viewed on devices of different physical sizes, and certainly that will result in different page sizes, but does that require different PDFs? I had the nagging feeling that it might not, and a little research into the formats employed by various electronic reading devices (e.g., Kindle, Sony Reader, iPhone, etc.) revealed that all break lines dynamically. Such "reflowable text" is a foundation of the epub standard. It's also at odds with the idea behind PDF, which inherently assumes the existence of physical pages with a fixed size. Unsurprisingly, electronic reading devices hate PDF. The Mobipocket Developer Center, for example, lists, in order of desirability, six formats that can be used as the basis for importing content. PDF is number six.

Ebook devices tend to prefer some flavor of XML (often XHTML) as their content format, and that means that what I really need is a way for my authoring toolchain to produce two things: a single fixed-dimension PDF for print publication (which, with some minor processing, can perform double-duty as a directly-consumable eformat) and XML (or something directly convertible to XML). That combination of requirements is much less demanding than what I'd been thinking I needed.

In fact, now that I'm in a "what do I really need" mood, I can probably dump the per-copy custom-color requirement, too. In a perfect world, yes, I'd offer each reader their choice of color combinations. In practice, I can probably make almost everybody happy by offering (1) a color document for color output devices and for people who don't suffer from any kind of color blindness and (2) a monochrome document for monochrome output devices and for people with any kind of color blindness. Where the color document uses color, the monochrome document could (for text) use underlining or changes in font face and (for diagrams) use different line thicknesses and fill styles. If Fastware! unexpectedly turns out to have nontrivial code examples where syntax coloring would be useful, I could offer variants using the most commonly employed color combinations, thus yielding, e.g., (1a) color using Eclipse syntax highlighting, (1b) color using Visual Studio syntax highlighting, etc. As I said, I'd like to offer color customization on a per-copy basis, but that's not a requirement, it's a desideratum. Those are different things. It's important that I not confuse them.
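
For what it's worth, this kind of two-variant output is easy to arrange in a markup-based toolchain. Here's a minimal LaTeX sketch of the idea; the \keyterm macro name and the color value are placeholders I've made up, and it assumes the xcolor package:

  \usepackage{xcolor}
  \newif\ifcolorversion
  \colorversiontrue   % flip to \colorversionfalse for the monochrome build

  \ifcolorversion
    \definecolor{termcolor}{RGB}{0,0,128}
    \newcommand{\keyterm}[1]{\textcolor{termcolor}{#1}}   % color build: key terms get color
  \else
    \newcommand{\keyterm}[1]{\underline{#1}}              % monochrome build: underline instead
  \fi

Diagrams could key off the same switch to choose between colored fills and the line-thickness/fill-style treatment described above.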

Tuesday, December 2, 2008

A Look at reStructuredText

Several comments on this blog referred me to reStructuredText (reST) as a markup alternative to LaTeX and DocBook. reST is part of the Docutils project, which, probably because I am not a Python programmer, I had not heard of.

I've now spent a little time reading about reST, examining some of its output, looking at some document source files, writing and processing some trivial examples, and posting questions to the Docutils mailing list. On a scale of knowledgeability about reST that runs from 1 to 100, that puts me at about 1.1. Still, I've come to a few conclusions:
  • For common things like numbered or bulleted lists, reST markup is less verbose (hence less intrusive) than LaTeX's. (There's a short side-by-side sketch of this and the following point after this list.)
  • For inline style-based markup, they seem to be about the same. reST's :foo:`Text` is LaTeX's {\foo Text}.
  • The LaTeX user community seems to be larger than reST's, and while there are several books on LaTeX, there don't seem to be any dedicated to reST or Docutils.
  • reST documents are typically viewed as HTML, LaTeX documents as PDF. This is noteworthy, because I currently expect PDF generation to be more important for Fastware! than HTML generation.
  • reST more rigorously separates content from presentation.
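To make the first two points concrete, here's roughly the same fragment in both notations; the text is invented:

  In reST:

    - Keep data structures small.
    - Mind the cache, especially in *hot* loops.

  In LaTeX:

    \begin{itemize}
      \item Keep data structures small.
      \item Mind the cache, especially in \emph{hot} loops.
    \end{itemize}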
That last point, the stricter separation of content and presentation, may be the most interesting. As part of kicking reST's tires, I looked at some documents I'd written using LaTeX, trying to figure out how I'd be able to achieve the same effect in reST. I generally had little trouble, but then I noticed a table I'd included in an IEEE Software paper from long ago:

Each entry here is centered both horizontally and vertically, and it occurred to me that I'd never noticed such layout in tables generated from reST. I started googling around for centering support in reST, and, thanks to help from the Docutils mailing list, I eventually came to understand that there is no notion of "centering" in reST. Whether something is centered isn't content, it's presentation, and presentation decisions are made downstream from reST, e.g., by CSS style sheets for presentation via HTML.

Something else that became apparent is that while the above table could be produced from reST by slapping an appropriate "centering" attribute on the entire table, reST doesn't really have a way to express metadata (e.g., presentation information) on a cell-by-cell basis. So if I wanted some cells' content to be centered and others' to be, say, left-justified, reST isn't up to that. I can't think of a case where I'd want to do that, but I know of lots of cases where I create spreadsheets where some rows or columns use different justification settings. Here's part of the spreadsheet I've been using to compare link-related features of various electronic books:
Note that some columns are horizontally left-justified, while others are horizontally center-justified.

This takes us back to a topic I covered in one of my earliest posts: the lack of strict separation of content and presentation. The way information is placed in a table can help comprehension or it can hurt it, and as an author, I want to make sure that the presentation helps. Certainly the proper presentation of table information can vary from row to row and column to column within a table or between tables, and the fact that both Excel and LaTeX offer per-cell formatting support strongly suggests that there are situations where content creators feel that such control is useful.
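
For comparison, here's a minimal LaTeX sketch of that kind of per-cell control; the table content is made up. The \multicolumn trick overrides the alignment of a single cell without affecting the rest of its column:

  \begin{tabular}{l r}
    Format & Links per page          \\
    PDF    & 12                      \\
    HTML   & \multicolumn{1}{c}{n/a} \\   % this one cell is centered instead of right-justified
  \end{tabular}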

In response to one of my requests for information on the Docutils mailing list, David Goodger commented:
In terms of expressive power, LaTex > reST. In terms of readability and convenience, reST > LaTeX. Take your pick. If you're picky about the formatting details, reST may not be for you.
Alas, I am picky about formatting details, in part because I'm a control freak, but in part because I believe that formatting is related to comprehensibility, and, fundamentally, comprehensibility is pretty much the only thing that matters. (Okay, accuracy is kind of important, too.) An author's job is to convey his or her message as effectively as possible. That requires an expressive medium in which to represent that message. My concern is not so much that reST is less expressive than LaTeX; it's that it's less expressive than what I think I might reasonably want. I don't need the most expressive book-writing system available; I just need one that's adequately expressive. If reST doesn't offer a way for me to produce tables in a form I already know I employ, that's a problem.

reST looks to be a nice markup language, easy to learn and use for many purposes, especially the production of web pages. I was impressed with the low barrier to entry: I downloaded and installed Python and docutils and was producing HTML from reST in under an hour. It's not hard to find impressive-looking, decidedly nontrivial web pages generated from reST (e.g., the Python multiprocessing documentation pointed out by David Niergarth or the pages at Saifoo). Still, I can't shake the doubt that if I go with reST, I'll eventually bump into something I want to be able to express, but can't. I'm therefore still leaning towards LaTeX.

Besides, I still have that friend who's offered to be my personal LaTeX consultant :-)

Saturday, November 29, 2008

Leaning Towards LaTeX

About three weeks ago I wrote that one of my goals was to make it easy to generate versions of my book using custom colors, in part because approximately 10% of the male population suffers from some kind of color blindness. Since that time, I find that achieving this goal has become increasingly important to me. Most of the world's problems are beyond my control, but this I can do something about. I concluded before that wanting to offer color customization on a per-copy basis pretty much commits me to markup-based authoring, a commitment I don't have much enthusiasm for, but the inconveniences associated with markup-based authoring can't compare with the inconveniences associated with color blindness, so I'm in no position to whine. (Not that it means I won't.) I suspect that the use of markup will ultimately allow me to do other things I'll find useful, but at the end of the day, I find that I can't get away from the idea that if I can make it possible for color blind people to see things in a way that works for them and isn't too onerous for me, I have a moral obligation to do it. I guess it's my way of adapting the ADA's "reasonable accommodation" rule to publishing.

As I wrote before, the choices in MarkupLand seem to boil down to XML+DocBook and LaTeX. I was leaning towards DocBook, but then two things happened. The first is that I went to my local technical bookstore to get a book on DocBook, and it didn't have any. Bad sign, very bad, especially since my local technical bookstore is Powell's, which is not a small store. That Powell's didn't have any books on DocBook gave me the kind of "you're in this alone" feeling that's anything but warm and fuzzy. (Yes, there are lots of online resources, and yes, the two big books on DocBook are online, but (1) I like physical books and (2) I can't shake the feeling that the availability of physical books tends to correlate with the size of the user community.)

The second thing that happened is that a friend of mine who's been pushing me to choose LaTeX agreed to act as my personal LaTeX consultant. That's key. Having a person I know agree to help me is much more reassuring than having only the faceless masses of the Internet to turn to. I have a lot of respect for those faceless masses, and I've found them tremendously helpful on many occasions, but I find that the more specialized my interests, the less likely I am to get help online. When I was working with OpenOffice Writer, for example, it didn't take much time for me to go beyond what I was able to find online, and it didn't take much longer than that for me to start stumping the consultant I was working with. "Huh, nobody has ever wanted to do that before..." is not really the kind of feedback I want to start getting on a regular basis.

And then there's this one silly thing: because DocBook is XML, every paragraph has to be started with <para> and ended with </para>. This is inhuman. Literally. It's fine for machines, but not for humans. LaTeX, in contrast, while not without its own syntactic idiosyncrasies, interprets a blank line as a paragraph separator. That's vastly more natural for a writer. Working with XML requires that I be aware that I'm building a tree structure that happens to have nodes of text in it. Judging from the online demos for WYSIWYG DocBook editors like oXygen and XMLMind, this is the case even if you're not typing the markup in manually. When I write, I want to think in terms of sequences of paragraphs, not nodes in trees. LaTeX lets me. I'm concerned that DocBook won't.
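
To see the difference, here are two placeholder paragraphs expressed both ways; in LaTeX, the blank line is the markup:

  In DocBook:

    <para>This is the first paragraph.</para>
    <para>This is the second paragraph.</para>

  In LaTeX:

    This is the first paragraph.

    This is the second paragraph.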

My current plan, then, is to take my draft Introduction for Fastware! and translate it into LaTeX. We'll see how it goes.

EPub State of the Practice, Part 3: Special Features

[Here "EPub" refers to electronic publishing in general, not specifically to the epub format for electronic documents. At the time I wrote this blog entry, I was not aware that "epub" already had a meaning for many people in the world of, well, EPub.]

Fundamentally, live links and reasonable applications of color -- the topics of my last two posts -- are both "Well, duh" features in electronic publication. Along with full-text searching, the ability to print excerpts and copy text, and, for PDF documents, the ability to add comments and define new bookmarks, they should surprise readers only if they are absent. That they are missing from many of the books I examined is a commentary on how badly the publishing industry trails what readers have a right to expect from ebooks.

However, I found a few features in a few books that take advantage of the capabilities of epublication in interesting ways -- features that go beyond the "Well, duh" threshold.

The first of these is one-click access to code examples. In an earlier post, I remarked that CodeProject articles allow code to be copied to the Windows clipboard in a single click (for IE users; the option is unavailable for FF users, sigh). This is one way to give readers access to code, but it's not the only way. C++ in Action offers single-click downloads of zip files that contain one or more source files, and several titles from The Pragmatic Programmers (e.g., Programming Erlang and Google Maps API V2) open web browser windows on code examples when the corresponding download links are clicked. These are steps in the right direction, but I think what readers would really like would be single-click access to ready-to-run code examples in their preferred development environment. Selecting a code example in an ebook might, for example, open Eclipse or Visual Studio or, for throwbacks like me, Emacs (or, God forbid, vi) on the code. The "ready-to-run" proviso means that if the code requires auxiliary code or other scaffolding before it's ready to execute, that code or scaffolding would automatically be provided. (For my C++ books, virtually every code example would require such scaffolding, because I almost always show code fragments that won't compile by themselves.)

An interesting feature offered by Programming in Scala is that clicking on the introduction of a term (indicated to readers both by the traditional italicization of the term and by the application of link color to it) whisks the reader to the definition of that term in the glossary. I think a better approach would be a pop-up window containing the definition (such as I've seen in some Windows applications), but the notion that readers of ebooks shouldn't have to waste time looking up term definitions strikes me as a good one.
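
If the production toolchain were LaTeX-based, I believe something along these lines would do it, though I haven't tried it; the term and its definition here are invented, and I'm assuming the pdfcomment package's \pdftooltip command works as advertised (viewer support for PDF tooltips varies):

  \usepackage{pdfcomment}   % assumed: provides \pdftooltip for hover text in the generated PDF

  ... an \pdftooltip{\emph{actor}}{actor: an object that processes messages asynchronously}
  receives messages rather than sharing state ...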

Programming in Scala also has a set of navigation links at the bottom of each page, making it easy to get to the beginning of the book, a coarse-grained TOC, a fine-grained TOC, the glossary, and the index. This is similar to the navigation links typically found on web pages, but I didn't see a similar feature on any of the other PDFs I looked at. I might quibble with the choice of including a link to the cover of the book, because my PDF viewer (Acrobat) already gives me such a link for all PDF files, but, well, there's a reason I say it'd be a quibble.

Programming in Scala includes two other links at the bottom of each page, one of which is also essentially present in the newer titles I saw from The Pragmatic Programmers: a one-click mechanism for offering page-specific feedback. Clicking on "Suggest" (for Programming in Scala ) or "Report erratum" (for the PragProg titles) opens a web form prefilled with book information (title and version) and the page number on which you're offering feedback. Nifty! The PragProg form even lists the errata that have already been reported for that page, a feature that cuts both ways: it can save readers the trouble of reporting a problem that's already known, but it also performs a subtle shift of the burden of filtering out duplicate reports from the author/publisher to the readers. (The author(s) and publisher have to do the filtering, anyway, because there's no guarantee that readers will read the list of known errata before filing a report, but with the way the web form is currently formatted -- known errata above the fields for filing a new report -- I'm reminded of how I feel when I call tech support and have to wait on hold and be chided by a disembodied voice that I should check their online FAQ and knowledge base before trying to actually talk to them.) It's possible that Programming in Scala behaves the same way, but I don't know. I never saw that behavior, but maybe I just didn't try it on a page with known errata.

I will say that I prefer the more generic "Suggest" link from Programming in Scala to the PragProg "Report erratum" link. I've been getting feedback from my readers since 1992, and while most comments come in the form of bug reports, not all do. Making it easy for readers to offer page-specific feedback seems like a great idea, but I don't think it should be limited to errata reports.

Friday, November 28, 2008

EPub State of the Practice, Part 2: Color

[Here "EPub" refers to electronic publishing in general, not specifically to the epub format for electronic documents. At the time I wrote this blog entry, I was not aware that "epub" already had a meaning for many people in the world of, well, EPub.]

As I noted in my previous blog entry, I've been looking at electronic versions of several books, and because I'm viewing them on a device offering color (my computer monitor), I expect the books to use color where color makes sense. Such use is one of the "Well, duh" features I mentioned last time.

One can argue about when using color makes sense, but I hope we can agree that one place where it does is screen shots, and I was surprised to find that three of the nine books where I found screen shots showed them only in monochrome. I hope we can also agree that most programmers these days are used to seeing their code syntax-highlighted (i.e., in color), and of the 13 books that show code, only four show syntax highlighting. The intersection of these sets -- those books that show screen shots in color and that also use color to syntax-highlight code -- is only three books, all of which were published by The Pragmatic Programmers in the last two years.

I also checked for the use of color for live links. Because links on web pages are traditionally rendered in a special color (typically blue, in contrast to non-link text, which is typically black), I believe that the active behavior of link text is likely to be overlooked by many readers if it looks the same as regular text. I thus believe that link text should be distinguished in some visual way. This point could be argued, i.e., one might claim that with a little experience, readers would begin to intuit which text is "linky" (e.g., cross-references, URLs, initial definitions of terms, etc.) and which text is not, and that visually distinguishing linky text wouldn't really do anything except add visual noise. One could also argue that even if linky text is to be visually distinguished, color isn't necessarily the best way to do it. (One could use, e.g., underlining or a different font face instead.) Rather than argue the point one way or the other, I'll simply remark that among the books I looked at were several that do use a special color for link text as well as several that do not.

Mine do, and I spent a lot of time trying to come up with a color that was visually recognizable while at the same time being unobtrusive. My goal was to employ something that looked and acted more or less like a web page for people who were scanning for links, while simultaneously looking and acting more or less like a standard book for people who were reading straight through. I ultimately chose a dark blue for links, which is, I hope, different enough from the surrounding black text to be distinguishable, but similar enough to it to recede into the background. Here's a sample from Effective C++. There are three links in the text shown. (The red text is used to focus readers' attention on the topic at hand and has nothing to do with links).
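
For those curious about mechanics: in a LaTeX/hyperref toolchain (not what I used for my books, which are FrameMaker-based), link coloring along these lines might be configured roughly like this; the RGB value is a placeholder, not the actual color I chose:

  \usepackage{xcolor}
  \definecolor{linkblue}{RGB}{0,0,130}   % a dark blue: distinguishable, but close to black
  \usepackage[colorlinks=true,
              linkcolor=linkblue,        % internal cross-references
              urlcolor=linkblue]{hyperref}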
A different approach is taken in Agile Web Development with Rails 2E, where link text is a shade of red or pink, depending on whether the link is an internal cross-reference or a URL:
One of my concerns is that when you combine such use of color with other uses, such as syntax-colored code, the result can be chromatically rather busy. For example, here's a page from Programming Erlang where, in addition to black, text appears in pink, grey, blue, brown, and red:
I'm not saying that's too much, but I remember the horrors that arose when multiple font faces became available, so I worry about analogous things happening with font colors.

Incidentally, both Agile Web Development with Rails 2E and Programming Erlang are published by The Pragmatic Programmers which, from what I can tell, is leading the pack in thinking about ways to adapt conventional book authoring and publishing for a mixed-delivery-mechanism world. When I discuss their use of red and pink for links and their use of many colors on a page, I'm not criticizing them, I'm taking advantage of the fact that they're doing things that nobody else seems to be. The publishing world doesn't have a lot of experience with books as electronic entities, so the fact that I chose dark blue for my links and they chose pink and red says nothing about which is better. Five years from now, maybe it will be obvious that pink and red is the better choice. Or that blue is. Or that both are inferior to something else. The only way to find out is to try various options and see which ones work best.

Another color-related aspect of the ebooks I examined was whether they use color in display elements like figures, tables, and sidebars. I found that many books use it cosmetically, but far fewer use it semantically. That is, the use of color to visually set display elements off from the primary text flow or to make the display elements more visually attractive was common, but the use of color to help readers understand the information in the display elements was a lot less common. To my surprise, some of the best examples of such use came from SOA Principles of Service Design, a book whose application of semantically-meaningful color in figures contrasts sharply with its rejection of color for links, screen shots, or code fragments. Here's a sample figure from the book:

One of my goals is to learn how to take advantage of color in figures, diagrams, tables, etc., to help get my information across to my readers. Whether I do the work myself (as I've done in the past) or have a professional illustrator do it, I need to break out of a monochrome way of thinking about depicting information. A good way to do that, I hope, is to pay attention to what others are doing and then shamelessly apply the sincerest form of flattery to the matter :-)

Thursday, November 27, 2008

EPub State of the Practice, Part 1: Links

[Here "EPub" refers to electronic publishing in general, not specifically to the epub format for electronic documents. At the time I wrote this blog entry, I was not aware that "epub" already had a meaning for many people in the world of, well, EPub.]

I've spent a fair amount of time recently playing around with various electronic versions (primarily PDF) of technical books in an attempt to get a handle on the special "beyond print" features that are offered. In addition to the PDF versions of my books, I looked at the following books (listed in no particular order):
All of these are in PDF format except for C++ in Action, which is published on the web.

There was nothing scientific about this selection. Some books are available for free, some I had purchased for my own use, and some were given to me by the publisher for one reason or another. I choose to assume they are representative of what is available, though that may not be true. Publication years range from 2001 to 2008, with newer titles generally offering more features. If you know of other books I should take a look at, please let me know.

My primary interest in looking at these books was to get a sense for the "reader experience," with a focus on (1) things I need to keep in mind as an author in order to be able to offer the feature and (2) "obvious" features that were missing, i.e., things I want to be sure to avoid. An example of (1) would be initial definitions of terms that link to glossary entries (much easier to create when writing the book than to go back and add later). An example of (2) would be URLs in the book that aren't live links.

I generally refer to "obvious" features -- the things readers should be able to expect -- as the "Well, duh" features. At the top of the list is that "linky" things should be live links. That includes TOC and index entries, cross-references within a book, and URLs and email addresses. Readers should be able to take for granted that they can click on all these things and be magically whisked to the right place. As a general rule, most books make most of these things links, but I was surprised that only my books make the "see" and "see also" entries in the index live. Color me picky, but I can't understand why, if I see this in the index,
  transaction, see atomic transaction
I should have to manually slog my way back to the beginning of the index to find the entry for "atomic transaction". Hello, computer! You know what entry I want to look at. Freakin' take me there! (Having gone through the process of making these links live in books never written for electronic publication, I can tell you that it was a lot of work, but failing to support the "Well, duh" epub features would be embarrassing, and I hate being embarrassed.)
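
(An aside for markup-based toolchains: in LaTeX, a "see" entry is typically generated with the |see modifier, and hyperref's hyperindex option makes the page numbers in the index live, but as far as I know nothing out of the box makes the "see" text itself jump to the target entry; that's the part that takes extra work. A sketch, using the entries above:)

  % where atomic transactions are discussed:
  \index{atomic transaction}

  % anywhere, to generate the "see" cross-reference in the index:
  \index{transaction|see{atomic transaction}}

  % preamble: with hyperref loaded, hyperindex links the index's page numbers
  \usepackage[hyperindex=true]{hyperref}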

I was also surprised to find that only my books make email addresses live. I don't get it. What's the point of putting an email address in a book but not making it clickable so readers can send mail to it?

As a general rule, page numbers in the indices were live, but they were treated in different ways. If, in the index, you see that "transaction" is discussed on page 44 and you click on the 44, where do you expect to be taken? Some books take you to the top of page 44. Some books take you to the beginning of the content of page 44, meaning that the running header on the page has just scrolled off the top of the window. Still others take you to the precise point on page 44 where "transaction" is actually discussed. (To be more precise, they take you to the point on the page where the metadata causing the index entry to be generated is located.) From personal experience I can tell you that where you go when you click on a page number in an index can be determined by the author, the indexer, the software used to write the book, the PDF generation process, or some combination of these. What I don't know is what the "proper" behavior is. Jumping to the point on the page where the metadata is located sounds like it's the most accurate, but making sense of the text you find there generally requires backing up to at least the beginning of the paragraph to pick up the context of the use. Physical books have this problem, too: if the index refers you to page n, but the discussion on page n is at the top of the page, you often have to start reading on page n-1 in order to make sense of what you read. If you have ideas on how links from index page numbers should behave, please let me know.

I checked only four of the books (beyond mine) to see how footnotes were handled, in part because it's generally not that easy to find footnotes (especially if the book doesn't have any). Of the four, only one made the footnote number a live link, and none had a back link from the footnote text to the point of reference. (My books don't make footnotes live in either direction.) For books with static page breaks (such as PDF), non-live footnotes may be defensible, but where page breaks are dynamically determined and pages may be long (e.g., web pages), links in both directions (a la Wikipedia) seem like a more useful approach. I'm inclined to think that bidirectional footnote links should be provided. After all, if they're not intrusive, readers who don't want to use them can just ignore them. If the links are missing, of course, and a reader wants them, that reader is out of luck, and I have a strong disinclination to render my readers unlucky.

Indices in Ebooks

I've been looking through a number of ebooks recently, and one of the things I've been examining has been each book's index. Assuming the book has an index. Some don't. Which raises the question: given full-text search (e.g., via book-viewing software or desktop search tools), do indices continue to be useful? Or are they an artifact of print technology that makes little sense in an electronic environment?

I'm committed to producing indices for my books, because one of the output media I target is print. Indices in print are of proven value, so I'm on the hook for them regardless. But I think they make sense even for electronic-only publication. The reason is that good indices reference not just text, but concepts. Fastware!, for example, is about how to write software that runs quickly, but there are lots of words that correspond to that idea: speed, efficiency, performance, scalability, responsiveness, latency, etc. If you're interested in reducing memory latency, and I happen to say something important in a passage where I'm discussing how to improve the hit rate of the instruction cache, you want the index entry under "latency" to point you to that passage (or, more accurately, I want the index entry under "latency" to point you to that passage), even if I somehow never manage to use the word "latency" in the passage.

Many technical books have awful indices, a situation I attribute to the facts that (1) most indices are prepared by professional indexers, who typically have no understanding of the book's content -- they literally don't know what many of the nouns and verbs mean; (2) these indexers are paid unbelievably badly (typically only a few hundred dollars to index technical books of up to a thousand pages), so they have little incentive to do more than a cursory job; and (3) the quality of a book's index doesn't seem to affect sales, so there is no economic incentive to change the situation. My guess is that as ebooks become more common, indices will fall by the wayside, because it will be easy for authors and publishers to reason that textual searches obviate the need for separate indices, and given the sorry state of most indices, this will probably be true. I'm an old-fashioned guy, however, and I think that a good index improves the usability of a book, and I also believe that the whole point of a book is to serve the interests of its readers, so for the foreseeable future, I plan to produce indices for my books, even though index preparation is, to be honest, probably the single most unpleasant part of writing a book.

So I'm going to produce an index for Fastware!, and that takes me back to page numbers. In an earlier blog entry I worried about the problem of referring to page numbers in a book that may be published in multiple forms, hence have multiple sets of page numbers. Two people commented that the solution is simple: refer to something like section numbers or paragraph numbers instead of page numbers. This is clearly the correct approach, but think of what this means for an index. A single index entry often corresponds to multiple locations in the book, which are traditionally represented as a list of page numbers. If page numbers go away, and if we assume that locations in a book are represented in the form c.p, where c is the chapter number and p is the paragraph number within that chapter, we end up with index entries that might look like this:
  containers, standard
      C++   4.3, 4.55, 5.18-22, 7.23
      C#    4.4, 4.80-99, 7.65
      Java  4.3, 4.60, 5.22-25
It looks a bit odd to me, but I can't think of a reason why there is anything wrong with it. Can you?

Of course, we could also use the form c:p, which would make references look somewhat biblical, hence possibly enhancing their appearance of authority :-)

Friday, November 14, 2008

Beyond Static, Passive Presentation

I use Firefox for browsing, falling back on IE only when I find a site that doesn't seem to work properly with FF. I happened to be using IE the other day when I visited Jim Crafton's CodeProject article on using DocBook on Windows, and I noticed something I'd never seen before: the ability to copy code examples from the article to the clipboard with a single mouse click:

The "Copy Code" option doesn't exist under FF, alas, but the idea of making it easy for readers to work with an electronic book in a natural fashion is a good one.

In fact, it's an example of a more general idea: presentation of a book's content should take advantage of the natural capabilities of the presentation medium. Another example is also shown in the image above: the ability to dynamically collapse and expand a code fragment. Such presentation capabilities don't affect the fundamental content of a book, but authors do need to keep them in mind, because authors can often do a better job of specifying the natural "copy to clipboard" or "collapse/expand" chunks. (At CodeProject, the chunks are presumably determined by the content in some HTML block.)

The fundamental idea is this: although the content of a book is essentially static, the presentation of that content need not be.

Sunday, November 9, 2008

What can go in a Book?

I explained in my last entry that an ink-on-paper book (i.e., a bookp) is simply a physical manifestation of the book content (i.e., bookc) an author has produced. Other manifestations offer different characteristics. Publication as PDF (to be viewed on a traditional monitor) allows the use of multiple colors with no greater cost than the use of black only. Publication as an audio stream eliminates concerns about page breaks, but makes display elements like tables, figures, and code listings problematic. Publication as a web page makes pretty much anything possible: dynamically generated content, full-motion animations and video, interactive elements, etc.

What does it mean to write a book (i.e., a bookc), given that you can't assume it will be packaged as a bookp? The question is important, because I can't very well author a book if I don't know which content forms I'm permitted to include and which I'm not.

My answer to this question, perhaps counterintuitively, is based on a bookp. Whatever it means to write a "book," the result should be recognizable as what we currently understand a book to be. We could call TV "radio with pictures," but we don't, and we could call theatrical plays "live TV," but we don't do that, either. I don't see any sense in defining "book" such that somebody familiar with a bookp wouldn't be able to see the connection between what they know and what I've defined.

So here's my initial working definition of a book (i.e., a bookc): it can be reasonably represented as a bookp. That is, if I say "I'm writing a book," you can assume that whatever I produce can be reasonably represented in ink-on-paper form. So we're talking static content. All the usual book stuff is included, i.e., text, diagrams, tables, pictures, etc. Dynamically generated content is out. So are interactive elements. But audio, video, and animations may make the cut, depending on the form they take.

A video of a talking head, for example, can be represented in a bookp as a frame from the video (i.e., a photo of the speaker) accompanied by a transcript of what the speaker says. Readers lose the sound and cadence, etc., of the speaker's voice, and they're deprived of seeing how the speaker's face moves as he or she talks, but -- assuming such information was never the point of the video -- the essential content has been preserved in ink-on-paper form. Talking head videos are thus permissible in a bookc (and would be delivered as such on output devices where that's possible).

Similarly, transcripts of the audio-only equivalent of talking heads ("speaking voices?") make such audio permissible in a bookc. Many podcasts could thus be considered bookc manifestations. (If books in recorded form (i.e., audiobooks) are still books, then textual representations of speech are still speech ("textaudio"?), and since textual representations of speech can be published as recognizable bookps, recorded speech is legit in a bookc.) The way people express things in spoken versus written form typically differs qualitatively, so transcribing spoken audio and publishing it in book form is likely to yield a lousy book, but my goal here is to figure out what's in my author's toolbox and what's not. If something's in, part of my job as an author is to make sure I don't just use it willy-nilly; I'm responsible for using it well.

I find animations to be a particularly interesting case. As a book author, may I include animations? Consider, for example, David Howard's animation of the behavior of a red-black tree. Can such an animation be part of a bookc?

I'm inclined to think that it can. Books have long contained "before" and "after" diagrams to help explain how a transformation between two states occurs. You'll find several in the Wikipedia entry for red-black trees. (Well, you will if you look right now. What the page will look like if much time has elapsed between when I write this and you read it is anybody's guess.) It's easy to imagine such diagrams being frames extracted from an animation. If, as an author, I have enough conditional content control to be able to say
  if (rendering for a device that can show animation)
      show this animation along with this explanatory text
  else
      show these animation frames along with this other explanatory text
then at least some animations are within the purview of a bookc.
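
In a markup-based toolchain, the switch itself is easy. Here's a minimal LaTeX sketch; the file names are placeholders, it assumes the graphicx package for the static frames, and how the animation itself gets embedded depends on the output target:

  \newif\ifanimcapable
  % \animcapabletrue   % set this when building for a target that can show animation

  \ifanimcapable
    % embed the animation here, using whatever mechanism the target supports,
    % along with the explanatory text written for the animated version
  \else
    \includegraphics{rbtree-frame1}   % selected frames extracted from the animation
    \includegraphics{rbtree-frame2}
    % along with the alternate explanatory text written for the static frames
  \fi

The interesting work, of course, is in writing the two versions of the explanatory text, not in the switch.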

The set of acceptable entities in a bookc is thus a superset of those that are directly expressible in a bookp. The information content of everything in a bookc must be representable in a bookp, but if a trivial transformation needs to be applied (e.g., video is replaced by a selected frame, audio of speech is replaced by a transcript) or even if a nontrivial-but-straightforward transformation needs to be applied (e.g., animation + description replaced by animation frames + alternate description), such content is still valid, hence part of my toolkit as a book author. Writing a multiple-platform bookc thus gives me more choices for expressing myself than I'd have if I restricted myself to bookp publication.

Saturday, November 8, 2008

Bookp versus Bookc

Consider these two statements:
  • I wrote a book.
  • The book I wrote is on my shelf.
Book does not mean the same thing in these two statements, and increasingly I feel that this ambiguity causes problems. Technological changes are causing the meanings to drift further apart, so I think it's important to clarify the two meanings that book can have.

I'll call the meanings bookp and bookc. A bookp is a physical object. It consists of pages of paper with ink on them. The pages are bound together and held between two covers. Bookps are what bookstores and libraries are filled with.

A bookc is not a physical object. The c stands for content, and if we assume that the content of a book is a simple stream of text (which is essentially true for most novels), bookc is that stream of text. The text might get printed on pages that are bound together, thus yielding a bookp, but the text might also get spoken aloud and recorded as an audiobook. It might get distributed over a set of web pages, thus forming a web site. The content of a book is independent of its packaging, and in fact packaging is what the p in bookp stands for. A bookp is simply one of many different ways of packaging a bookc.

Authors don't generally write bookps, although I suppose those who self-publish and keep boxes of books in their garage do. Rather, authors write bookcs. The semantics of the statements above, then, can be depicted this way:
  • I wrote a bookc.
  • The bookp I wrote is on my shelf.
With this distinction in mind, consider Doug McCune's statement that he doesn't read books or Joel Spolsky's thesis that programmers seem to have stopped reading books (both of which I found out about thanks to Jeff Atwood's blog). I think these statements refer to bookps, not bookcs, and that opens the door to the possibility that people who claim to not read books or people who appear to not read books actually do read them; they just don't read them in bookp form.

Like Mulder, I want to believe, because I happen to like writing books (i.e., bookcs). Other entries in this blog make clear that I think a lot about bookps, but that's simple pragmatics. The publishing industry is changing, but it's currently set up to produce, market, distribute, and sell books in printed form. Remaining mindful of authoring constraints arising from printing considerations is no different from remaining mindful of software constraints arising from Windows considerations (assuming, in both cases, you want to maximize the number of platforms on which you can deliver what you produce).

Authors who leave all the layout decisions to their publishers (i.e., most of them) worry only about bookc considerations. Authors such as me who can't keep themselves from delving into layout matters have to keep bookp issues in mind, but it doesn't change the fact that, fundamentally, writing a book means writing a bookc.

The Post-Publication Page Break Problem

Traditionally, books are published, and that's that. As I've noted before, however, I typically modify my books for new printings, so the content changes over time. Each change is small and localized: an added, removed, or rewritten sentence here; a touched-up code example there; the addition of an occasional footnote; etc. From a software development point of view, new printings are the publishing equivalent of bug fix releases.

Sometimes even small, localized changes can have extensive implications. Adding or removing a sentence on a page can cause the page breaks for that and subsequent pages to change, and when page breaks change, TOC and index entries can change, too; content that used to be on page n might now find itself on page n-1 or n+1. (There are scenarios where it can change even more than that, but we don't need to worry about those.)

Assuming the existence of a single-source automated build process for the book, each printing should be completely consistent (i.e., all TOC and index entries should be correct), but even overlooking the fact that a fully automatic pagination process might yield unfortunate page breaks, we still have a problem.

Readers shouldn't have to worry about which printing of a book they have. If Person A has a copy of Fastware! and Person B also has a copy of Fastware!, they have the same book, as far as they're concerned. The fact that there might be minor differences between the two because Person A happens to have the first printing and Person B has the tenth shouldn't matter to them. Heck, people have enough trouble remembering that different editions look different. It's not reasonable to ask them to remember that different printings of the same edition might, too.

Given that they think they have the same book (even though they might not), I want to maximize the likelihood that if Person A says something like, "As you can see on page 44, where Meyers brilliantly demonstrates that a cache-unfriendly traversal can have a significant impact on performance...," Person B can go to page 44 and find the passage Person A is referring to. The way to maximize that likelihood is to ensure that once a book is published, page breaks change as little as possible. So if between the first and tenth printings, I removed some text from, say, page 18, it's better to let page 18 be a little short (or increase the interparagraph or interline spacing on the page) than to have some text from page 19 move onto page 18 (plus the associated potential cascading text movement on subsequent pages). Similarly, if I add text to a page, it's best to prevent any existing text from moving across a page boundary.

Practically speaking, once the first printing of a book goes out, I want the page breaks to remain static, even if I tweak the content of the book such that the page breaks would fall in different locations if I were to repaginate from scratch. Not only would this help preserve the illusion that Person A's first printing and Person B's tenth printing are the same book, it would also preserve any hand-tweaking of page breaks that had been done prior to initial publication (as I discussed in my earlier blog entry on page breaks).
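
I don't have a mechanism for enforcing this, but at a minimum, I'd want an automated build to tell me when a content change has pushed something across a page boundary. Here's a crude sketch of such a check; it's hypothetical, and nothing in my current tool chain does anything like it:

  #include <algorithm>
  #include <cstddef>
  #include <iostream>
  #include <string>
  #include <vector>

  // Hypothetical: the identifier of the first paragraph (or other anchor)
  // on each page of a given printing.
  using PageAnchors = std::vector<std::string>;

  // Flag every page whose first anchor differs between two printings, i.e.,
  // every place where content has drifted across a page boundary and a TOC
  // entry, an index entry, or a reader's "see page 44" could now be wrong.
  void reportDriftedPages(const PageAnchors& oldPrinting,
                          const PageAnchors& newPrinting) {
      std::size_t pages = std::min(oldPrinting.size(), newPrinting.size());
      for (std::size_t p = 0; p < pages; ++p) {
          if (oldPrinting[p] != newPrinting[p]) {
              std::cout << "Page " << p + 1
                        << " no longer starts with the same content\n";
          }
      }
      if (oldPrinting.size() != newPrinting.size()) {
          std::cout << "Page count changed from " << oldPrinting.size()
                    << " to " << newPrinting.size() << '\n';
      }
  }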

Which leads to the inevitable question: how can I preserve initial-publication page breaks across multiple publication platforms (i.e., print, epub, etc.), use an automatic build process, and still retain the ability to modify content for new printings?

Friday, November 7, 2008

XML ¬⇒ ¬LaTeX!

One of the attractions of LaTeX is that it produces wonderful output. In all likelihood, TeX employs the best line/paragraph/page layout algorithms in the business. Furthermore, LaTeX is terrifically expressive, something I know even though my familiarity with it is limited. (I did a lot of writing with LaTeX in my academic days, but I tended to learn only enough to do what I wanted to do. Unlike many Computer Science graduate students, I never really sat down and studied it.) I know, for example, that LaTeX allows authors to float displays (e.g., figures, tables, listings, etc.) to the top of a page, the bottom of a page, or both. In fact, this is the default, meaning that unless authors disable such flexibility, LaTeX will float displays to the next "good" location. (Experienced LaTeXies know I'm lying a bit, but it's close to the truth, and the actual truth doesn't add anything here.) This behavior seemed so natural and obvious to me that I took it for granted for many years. Only when I started using less capable systems (among them FrameMaker, OpenOffice Writer, and, from what I can tell from various comments on the web, Microsoft Word) that offered much more limited support for floats did I realize I'd been spoiled by LaTeX.

Because LaTeX is so expressive and its output is so good, I was reluctant to dismiss it in favor of an XML-based approach such as DocBook. But then, in one of those Duh! moments that happen now and again, I realized that as long as there is a way to take a DocBook document and transform it into a LaTeX document, I could use XML as my document representation and LaTeX as my print rendering engine. In fact, I'm pretty sure that this is what the Pragmatic Programmers do: write books in PML (their book schema), then transform them into LaTeX source for ink-on-paper rendering.

Assuming this line of reasoning is valid, DocBook becomes a pretty appealing option, because it means I can take advantage of all the work done by and tools developed for the DocBook and XML communities, but I can still hold out hope for the layout quality of LaTeX. Such hope can exist only if DocBook offers sufficient expressiveness, however, because there's nothing to be gained by the possibility of using LaTeX as a rendering engine if I can't express what I want it to do via DocBook. My next step, therefore, is to install DocBook and see if it will let me say what I want to say. I already have a pretty good idea of what I want Fastware! to look like, because I've done preliminary page layout in both OpenOffice and FrameMaker. If I can get DocBook to produce the output I want, I can then try translating my recent C++ article into DocBook, which will be a good way to see if its support for floating displays, cross-references, and bibliographies is up to snuff. If so, DocBook will look like a reasonable book representation format, and I can move on to seeing whether I can find a decent WYSIWYG front end for it.

LaTeX, DocBook, or Something Else?

Some publishers already produce books for multiple platforms (i.e., print form as well as at least one electronic format) from a single master source, and in some cases I know (or at least believe I know) the format:
  • O'Reilly: XML using the DocBook schema.
  • Pragmatic Programmers: XML using the PML ("pragmatic markup language," presumably) schema.
  • Artima Press: LaTeX.
I'd be very interested to know of decisions other publishers have made (even if the "publisher" is an individual) for master document source formats for multiple-platform publication. I'd also be very interested in comments on the advantages and disadvantages of any specific formats, including DocBook, LaTeX, and any others you have experience with.

Thursday, November 6, 2008

Conditional Formatting

Because I want to write for multiple output devices, some of which support color and some of which do not, I'd like to be able to specify conditional formatting as I write. In a recent article, for example, I use blue as my standard code color with red as a highlight color:
  void g(MakeFeatures<tepsafe>::type features)
  {
    int xVal, yVal;
    ...
    f(xVal, yVal, features);
    ...
  }

If I were writing for a device with no color support, I'd probably use black as my standard code color and something like bold as a highlighting technique:
  void g(MakeFeatures<tepsafe>::type features)
  {
    int xVal, yVal;
    ...
    f(xVal, yVal, features);
    ...
  }
I don't know whether traditional WYSIWYG word processors support this kind of thing. As far as I know, FrameMaker doesn't, but perhaps there is a way to coax it into exhibiting this behavior.
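
What I want is essentially a level of indirection: the master source tags text with a logical style name ("code highlight," say), and each output target supplies its own definition of that style. Here's a tiny C++-flavored sketch of the idea; the names and types are invented for illustration, not drawn from any existing product:

  #include <map>
  #include <string>
  #include <utility>

  // Illustrative only: one logical style can have different concrete
  // definitions on different output targets.
  struct StyleDefinition {
      std::string color;  // "red" on color targets, "black" on monochrome ones
      bool bold;          // bold serves as the highlight where color isn't available
  };

  enum class Target { ColorPrint, MonochromePrint, Screen };

  // (logical style name, output target) -> concrete definition
  using StyleTable = std::map<std::pair<std::string, Target>, StyleDefinition>;

  StyleDefinition resolve(const StyleTable& styles,
                          const std::string& logicalStyle, Target target) {
      auto it = styles.find({logicalStyle, target});
      return it != styles.end() ? it->second
                                : StyleDefinition{"black", false};  // plain fallback
  }

With a table like that, the color rendering and the black/bold rendering above would both come from the same tagged source.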

More problematic, I think, is conditional formatting in figures and diagrams. Here's a figure from the same article:

If I wanted to express the same highlighting information in a black and white presentation, I could "bold-face" the lines:

The problem is that I don't know of any software that will let me specify named line styles and define them differently for different contexts. In the figure above (there's only one figure; it's just expressed in two different ways), I want to define and apply a "highlight graphic" line style for some arrows and rectangles, using different definitions of this style in different contexts.

I can think of a couple of workarounds. I could create different versions of the diagram and use conditional content constructs to choose the one I want. The problem with that approach is that it violates my single-source constraint: I don't want to maintain multiple copies of the same figure with different formatting any more than I want to maintain multiple copies of the same text with different formatting.

Another workaround would be to have a single diagram that uses, say, lines that are both colored and dashed, so they'd be distinguishable on both color and non-color output devices. (On non-color devices, they'd simply look dashed.) For text, I could use both coloring and underlining. (On non-color devices, such text would simply be underlined.) This approach has the advantage that it works, but it really strikes me as kind of a hack.

How do you suggest I handle the problem of conditional formatting?

Single-Source, Automatic Building is Essential

When I say a book is ready to be published, there are, as far as I know, no errors in it. Actually, that's not true. I'm sure there are errors in it, but I don't know what they are. (If I knew, I'd fix them.) Shortly after publication, the situation changes, because readers tell me about mistakes I didn't know about. The problems might be factual, they might be grammatical, they might be expository (e.g., I might have written something that can be interpreted differently from what I intended). I collect the problems I know about in errata lists, and when my publisher tells me that a new printing is planned, I modify the book to eliminate as many errors as I can. I then deliver fresh camera-ready copy to my publisher.

I write my books with a goal of their remaining useful for at least five years, and there are generally at least one or two reprints each year, so camera-ready copy for one of my books typically has to be produced at least 10 times. It's often more than that. More Effective C++, which I wrote in 1996, is now in its 26th printing.

Until the recent release of the PDF versions of my books, my books had been laid out for only a single output device: ink on paper. One book thus yielded one PDF. The ebook PDFs double that to two PDFs per book. For Fastware!, I hope to produce not just an ink-on-paper version, but also multiple electronic versions. To keep these various versions consistent when I make updates, it's crucial that I have a single master source for each book, and it's also crucial that the various target versions of the book can be automatically built from the single master source. If this sounds like the usual requirement for cross-platform software development, it should, because that's exactly how I think of it.
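
In software terms, the "build" I have in mind is nothing more exotic than the following sketch. The targets and the renderFor function are placeholders for whatever tool chain ends up doing the real work:

  #include <iostream>
  #include <string>
  #include <vector>

  // Placeholder targets; the real list depends on what formats get published.
  enum class Target { InkOnPaperPDF, EbookPDF, Epub, Html };

  void renderFor(const std::string& masterSource, Target /*t*/) {
      std::cout << "rendering " << masterSource << " for one target\n";  // stub
  }

  // One master source, one loop over targets: the publishing analogue of a
  // cross-platform software build.
  void buildAll(const std::string& masterSource) {
      const std::vector<Target> targets = {
          Target::InkOnPaperPDF, Target::EbookPDF, Target::Epub, Target::Html};
      for (Target t : targets) renderFor(masterSource, t);
  }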

The Page Break Problem

One of the last things I do before finalizing a book's camera-ready copy is walk through the book looking for bad page breaks. Within a printed book, there are two kinds of page breaks: those between facing pages (i.e., between a left and right page) and those between non-facing pages (i.e., between a right and left page). There are probably official terms for these different kinds of breaks, but I don't know what they are, and at any rate, they probably won't take me where I want to go. I'm going to call the breaks between facing pages easy and breaks between non-facing pages difficult. The names are motivated by the amount of trouble they cause me as an author.

Some kinds of text are naturally split across lines, yet semantically belong together. Examples include mailing addresses, lists of ingredients in recipes, and, in programming texts, function and class definitions. If you're reading this blog, you probably have some programming experience, so consider this:
  template<typename IterT, typename DistT>
  void doAdvance(IterT& iter, DistT d,
                 std::input_iterator_tag)
  {
    if (d < 0) {
      throw std::out_of_range("Negative distance");
    }

    while (d--) ++iter;
  }
If this code happens to occur near the bottom of a page, it might be broken across two pages. Suppose it happens to get broken as follows:
  template<typename IterT, typename DistT>
  void doAdvance(IterT& iter, DistT d,
                 std::input_iterator_tag)
  {
    if (d < 0) {
      throw std::out_of_range("Negative distance");
    }

  [------------------------ Page Break Here ------------------------]

    while (d--) ++iter;
  }
This is fairly grotesque, but it will serve as an example.

If the page break is easy, it means that the reader can still see everything at once, because easy breaks are between facing pages. The result of this particular break is ugly, but it doesn't really prevent a reader from understanding whatever it is that's being discussed. As an author evaluating the break, I might, depending on how tired I am at the time, simply roll my eyes and let it go.

If it's a difficult break (i.e., across non-facing pages), eye-rolling won't suffice. Making sense of the function requires being able to see the declaration of the parameters while looking at the function body. Asking a reader to flip a page back and forth to see the whole function is unacceptable. They're already working hard to understand the material, and at any rate, making the stuff easy to follow is what they pay me for when they buy the book. If this is a difficult page break, I have to intervene.

I might manually move the break so that the entire function fits on the second page. I might rewrite some text on the first page so that the break moves to an acceptable location. I might move the bottom page margin down on the first page so that the entire function fits. There are several options. Torturing my readers by doing nothing is not one of them.

As I mentioned in an earlier blog entry, I want to write for multiple output devices, of which ink on paper is only one. For some of those devices, all page breaks are easy. For others, all are difficult. For still others, it depends on the configuration of the software being used to access the book:
  • Hardcopy books: As explained above, some page breaks are easy, some are difficult.
  • Kindle: My understanding is that Kindle shows only one page at a time and that scrolling is not supported, so all page breaks are difficult. Whether this applies to other dedicated ebook-reading devices, I don't know.
  • PDF on a computer monitor: Using Acrobat Reader, documents can be viewed as facing pages (in which case some page breaks are easy and some are difficult), as single pages (in which case all page breaks are difficult), or as a continuous stream of pages (in which case all page breaks are easy).
  • Audio stream: There are no pages, so the issue doesn't really arise, but one can think of all page breaks as being easy.
So here's the problem: I want to write a book to be viewed on multiple output devices; different devices have different characteristics regarding the existence of difficult page breaks; different devices have different page sizes, so it is, in general, not possible for me to know the location of all page breaks; and I want to never have text break unacceptably across a difficult page break.
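
Expressed a bit more formally (this merely restates the constraints -- all the names are made up, and it doesn't solve anything):

  // Facing vs. non-facing pages, or their analogues on other devices.
  enum class BreakKind { Easy, Difficult };

  struct DeviceTraits {
      bool paginated;     // false for audio streams and continuously scrolled views
      bool facingPages;   // true when two pages are visible at once
  };

  BreakKind classifyBreak(const DeviceTraits& d, bool betweenFacingPages) {
      if (!d.paginated) return BreakKind::Easy;               // no pages, no problem
      if (d.facingPages && betweenFacingPages) return BreakKind::Easy;
      return BreakKind::Difficult;
  }

  // The requirement: a block that semantically belongs together (a function
  // definition, an address, a list of ingredients) may straddle an easy
  // break, but never a difficult one.
  bool breakIsAcceptable(bool blockMustStayTogether, BreakKind kind) {
      return !blockMustStayTogether || kind == BreakKind::Easy;
  }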

How do I achieve that?

A Vision for Ebooks (circa 2007)

I recently blogged that we've released PDF versions of my books. These PDFs are essentially replacements for the CD version of two of my books that came out in 1999. When the idea of updating the CD arose in early 2007, I sent the following to my publisher. It sketches a vision for ebooks being accessed by readers using conventional computer systems, i.e., it doesn't consider issues that would arise for devices like Kindle, a mobile phone, etc.

The PDFs we ultimately produced don't follow the vision outlined below, but I think the ideas are still interesting, so, for the record, here they are.
As a reader:
  • I want access to my ebook at all times. This means an internet-only approach (such as Safari or Amazon Upgrade) is unacceptable.
  • I want access to the latest version of the ebook, including all changes made since I bought the book.
  • I want access to a complete changelog of everything that has been modified in the book since it was published, ideally along with a rationale for each change. (This should sound like an errata list, which many authors already maintain.)
  • I want to be able to make bookmarks and add comments to the book.
  • I want to be able to see comments others have made about the book, and I want to be able to see others' comments on my comments. I want to be able to comment on their comments. (If this sounds like reader comments on blog entries or like newsgroup discussions, that's not an accident.)
  • I want to be able to view ebook content in a way that suits me on a device that suits me, which means I want control over text and image size, etc., and I want line breaks to be determined dynamically.
  • I want to be able to print chunks of the ebook.
  • I want to be able to perform full-text searches with at least the power of Google.
  • I want all cross-references to be links or similar (e.g., words in the glossary might have their definition pop up if I hold the mouse over them).
  • I want to be able to use the ebook as effortlessly as a real book or web page (unlike my current CD, which has a LONG introduction explaining how the darn thing works).
  • I want to be able to buy access to the ebook for a modest additional fee if I've already shelled out for the hardcopy book.
  • I want to be able to easily take an example or code fragment and get access to it in such a way that I can play around with it, e.g., copy a fully compilable version into my development environment.

That's not all I want as a reader, but it's enough to get us started.

The current CD has two books and some magazine articles. In terms of content, one of the things the CD offers that the paper books do not is links among the books and the articles. That's because the paper books are standalone, but on the CD, I know that readers have both books and a bunch of articles, so it's safe to add references among them. Adding such references is not a lot of work for me, and I think it makes the CD more attractive.

Eclipse is a framework whose sole job is to allow functional components to be plugged in. By itself, it does nothing, but it facilitates the cooperation of other independently developed components. I think the same architecture is a good way to approach ebook publication. Create a framework into which ebooks can be plugged. Use a web browser as the UI, because everybody already has one and knows how to use it. Also make it possible for comments to be plugged into the framework, i.e., provide a way to specify "put this comment at this location in this book." A "comment" is a general notion, one that can be simple text, but can also contain a link ("link to this URL or this location in this other book (or the same book) using this text") and could even invoke a program ("run this flash animation explaining how this code works when this text is clicked on"). The framework takes the ebooks and the comments and produces the HTML that the user views, typically omitting comments that refer to books they don't own. So if the user owns EC++/3E (Effective C++, 3rd Edition) and MEC++ (More Effective C++) but not ESTL (Effective STL), they see electronic versions of those two books, any comments within or between those two books, comments from those books to the internet in general, but no comments leading to or from ESTL. If they then buy ESTL, shazaam!, comments involving ESTL suddenly appear in their UI. (They were probably always there -- they shipped with the books they owned -- but they were not shown, at least not by default.)
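
If it helps to see the idea in data-structure terms, here's roughly the kind of record I imagine the framework trafficking in, along with the visibility rule just described. This is a sketch only; none of it corresponds to an actual design:

  #include <set>
  #include <string>

  // Illustrative only: one plugged-in comment.
  struct Comment {
      std::string inBook;       // book (or article) the comment is attached to
      std::string location;     // where in that book it appears
      std::string text;         // simple text; could also carry a link or an action to run
      std::string linksToBook;  // another book, if this is a cross-book link; empty otherwise
      std::string author;       // supports filtering ("no comments by that Meyers guy")
  };

  // Show a comment only if the book it's attached to -- and, for cross-book
  // links, the book it points into -- is among the books the reader owns.
  bool isVisible(const Comment& c, const std::set<std::string>& ownedBooks) {
      if (ownedBooks.count(c.inBook) == 0) return false;
      if (!c.linksToBook.empty() && ownedBooks.count(c.linksToBook) == 0) return false;
      return true;
  }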

As an author, the only additional work I have to do to prepare a book for such a framework is to create a file of comments, and if I don't want to do that for some reason, that's okay; the book will still show up in the framework. But if somebody else wants to create comments to or from my book, users can install those comments, and shazaam!, they'll see them. The framework will probably have to offer a way to filter comments, e.g., "I only want to see comments approved by AW" or "I don't want to see any comments by that Meyers guy".

One of the nice things about this approach is that any document can be plugged in, so the framework can support free books, online articles, etc., and they can coexist with ebooks that have been sold.

Another nice thing about this framework is that the more content it has, the more useful it is (like a Wiki), so readers are implicitly encouraged to buy more books to fill out the framework. This would especially be the case if links to uninstalled content were shown (which should be optionally possible), because then readers would be constantly reminded of all the good stuff they'd have access to if they'd just plunk down a bit more cash.

The ebook content would normally be stored on an online server (as with Safari and Amazon Upgrade), but local copies could also be cached, so if internet access were unavailable, the cached copies would be used. This solves the "access anywhere" problem and reinforces that when readers buy an ebook, they own an actual copy, not just access to a copy stored elsewhere. In fact, you can separate payment for the two: for, say, $n you can buy an ebook (assuming you've already purchased the paper version), and for an additional $m/year or $n for your lifetime you can have online access to the very latest version, comments from the author, etc.