Saturday, November 8, 2008

The Post-Publication Page Break Problem

Traditionally, books are published, and that's that. As I've noted before, however, I typically modify my books for new printings, so the content changes over time. Each change is small and localized: an added, removed, or rewritten sentence here; a touched-up code example there; the addition of an occasional footnote; etc. From a software development point of view, new printings are the publishing equivalent of bug fix releases.

Sometimes even small, localized changes can have extensive implications. Adding or removing a sentence on a page can cause the page breaks for that and subsequent pages to change, and when page breaks change, TOC and index entries can change, too; content that used to be on page n might now find itself on page n-1 or n+1. (There are scenarios where it can change even more than that, but we don't need to worry about those.)
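To make the cascade concrete, here is a minimal sketch of a greedy paginator. It is a toy model, not how any real typesetting system works: paragraphs are assumed to be unsplittable, lengths are measured in whole lines, and the names `paginate` and `page_numbers` are invented for illustration.

```python
def paginate(paragraph_lines, lines_per_page):
    """Greedy pagination: start a new page when the next paragraph won't fit.

    Assumption for this sketch: a paragraph never splits across pages.
    """
    pages, current, used = [], [], 0
    for index, length in enumerate(paragraph_lines):
        if current and used + length > lines_per_page:
            pages.append(current)
            current, used = [], 0
        current.append(index)
        used += length
    if current:
        pages.append(current)
    return pages


def page_numbers(pages):
    """Map each paragraph index to its 1-based page number."""
    return {para: number
            for number, page in enumerate(pages, start=1)
            for para in page}


# First printing: nine 4-line paragraphs on 12-line pages.
first_printing = page_numbers(paginate([4] * 9, 12))

# Later printing: one line added to the very first paragraph.
later_printing = page_numbers(paginate([5] + [4] * 8, 12))

# Paragraphs whose page number changed between printings.
moved = [p for p in first_printing if first_printing[p] != later_printing[p]]
print(moved)  # [2, 5, 8]
```

In this toy case a one-line addition on page 1 pushes a paragraph off every subsequent page, which is exactly the kind of ripple that invalidates TOC and index entries.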

Assuming a single-source automated build process for the book, each printing should be internally consistent (i.e., all TOC and index entries should be correct). But even overlooking the fact that fully automatic pagination might yield unfortunate page breaks, we still have a problem.

Readers shouldn't have to worry about which printing of a book they have. If Person A has a copy of Fastware! and Person B also has a copy of Fastware!, they have the same book, as far as they're concerned. The fact that there might be minor differences between the two because Person A happens to have the first printing and Person B has the tenth shouldn't matter to them. Heck, people have enough trouble remembering that different editions look different. It's not reasonable to ask them to remember that different printings of the same edition might, too.

Given that they think they have the same book (even though they might not), I want to maximize the likelihood that if Person A says something like, "As you can see on page 44, where Meyers brilliantly demonstrates that a cache-unfriendly traversal can have a significant impact on performance...," Person B can go to page 44 and find the passage Person A is referring to. The way to maximize that likelihood is to ensure that once a book is published, page breaks change as little as possible. So if between the first and tenth printings, I removed some text from, say, page 18, it's better to let page 18 be a little short (or increase the interparagraph or interline spacing on the page) than to have some text from page 19 move onto page 18 (plus the associated potential cascading text movement on subsequent pages). Similarly, if I add text to a page, it's best to prevent any existing text from moving across a page boundary.
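The policy described above can be sketched as a check that keeps the first printing's page assignments fixed and reports how much room each page has after the content changes. This is only an illustration under simplifying assumptions: paragraphs are atomic, lengths are in lines, and `frozen_page_slack` is a hypothetical helper, not part of any real build tool.

```python
def frozen_page_slack(frozen_pages, paragraph_lines, lines_per_page):
    """Report spare lines per page when page breaks are held fixed.

    frozen_pages: the paragraph indexes on each page, as laid out in
        the first printing (assumed to be recorded somewhere).
    paragraph_lines: current length in lines of each paragraph.

    Positive slack: the page runs short (acceptable, or pad the spacing).
    Negative slack: the page is overfull and needs tighter spacing or
        a rewording, since text is not allowed to cross the boundary.
    """
    return {
        number: lines_per_page - sum(paragraph_lines[p] for p in page)
        for number, page in enumerate(frozen_pages, start=1)
    }


# Page assignments recorded from the first printing.
frozen_pages = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Revised lengths: paragraph 0 grew by a line, paragraph 4 shrank by one.
revised_lines = [5, 4, 4, 4, 3, 4, 4, 4, 4]

slack = frozen_page_slack(frozen_pages, revised_lines, 12)
print(slack)  # {1: -1, 2: 1, 3: 0}
```

Here page 2 is simply left a line short, while page 1 is flagged as needing attention; no paragraph changes pages either way.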

Practically speaking, once the first printing of a book goes out, I want the page breaks to remain static, even if I tweak the content of the book such that the page breaks would fall in different locations if I were to repaginate from scratch. Not only would this help preserve the illusion that Person A's first printing and Person B's tenth printing are the same book, it would also preserve any hand-tweaking of page breaks that had been done prior to initial publication (as I discussed in my earlier blog entry on page breaks).

Which leads to the inevitable question: how can I preserve initial-publication page breaks across multiple publication platforms (i.e., print, epub, etc.), use an automatic build process, and still preserve the ability to modify content for new printings?

2 comments:

Keith Fahlgren said...

"""
Which leads to the inevitable question: how can I preserve initial-publication page breaks across multiple publication platforms (i.e., print, epub, etc.), use an automatic build process, and still preserve the ability to modify content for new printings?
"""

You can't and you shouldn't. If you're really concerned about folks referring to a page in the printed version, number the sections at whatever granularity you want and then every person on every device using every rendering will be able to talk about the same chunk of content by the section number.

Anonymous said...

Why reference page numbers? You could use chapter, section, and paragraph numbers, and put the paragraph numbers in the margin in light gray or something. If you make changes, insert 1.1.2.a.
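A rough sketch of how such stable paragraph numbers might behave is below. The comment only specifies the `1.1.2.a` form; the choice of `.b`, `.c`, and so on for later insertions, the placement of new IDs, and the name `insert_after` are all assumptions made for illustration.

```python
import string


def insert_after(ids, anchor):
    """Return a new ID list with a fresh paragraph ID added after `anchor`.

    Existing IDs are never renumbered. The new ID is the anchor plus the
    first unused lowercase letter (an assumed convention), and it is placed
    after any paragraphs previously inserted under the same anchor.
    """
    position = ids.index(anchor)
    for letter in string.ascii_lowercase:
        candidate = f"{anchor}.{letter}"
        if candidate not in ids:
            break
    else:
        raise ValueError(f"no free suffix left under {anchor}")

    insert_at = position + 1
    while insert_at < len(ids) and ids[insert_at].startswith(anchor + "."):
        insert_at += 1
    return ids[:insert_at] + [candidate] + ids[insert_at:]


first_printing = ["1.1.1", "1.1.2", "1.1.3"]
second_printing = insert_after(first_printing, "1.1.2")
third_printing = insert_after(second_printing, "1.1.2")
print(third_printing)
# ['1.1.1', '1.1.2', '1.1.2.a', '1.1.2.b', '1.1.3']
```

Every ID from the first printing still points at the same paragraph in the third, so a reference like "paragraph 1.1.3" works regardless of printing or rendering.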