Thursday, November 6, 2008

The Page Break Problem

One of the last things I do before finalizing a book's camera-ready copy is walk through the book looking for bad page breaks. Within a printed book, there are two kinds of page breaks: those between facing pages (i.e., between a left and right page) and those between non-facing pages (i.e., between a right and left page). There are probably official terms for these different kinds of breaks, but I don't know what they are, and at any rate, they probably won't take me where I want to go. I'm going to call the breaks between facing pages easy and breaks between non-facing pages difficult. The names are motivated by the amount of trouble they cause me as an author.

Some kinds of text are naturally split across lines, yet semantically belong together. Examples include mailing addresses, lists of ingredients in recipes, and, in programming texts, function and class definitions. If you're reading this blog, you probably have some programming experience, so consider this:
    template<typename IterT, typename DistT>
    void doAdvance(IterT& iter, DistT d,
                   std::input_iterator_tag)
    {
      if (d < 0) {
        throw std::out_of_range("Negative distance");
      }

      while (d--) ++iter;
    }
If this code happens to occur near the bottom of a page, it might be broken across two pages. Suppose it happens to get broken as follows:
    template<typename IterT, typename DistT>
    void doAdvance(IterT& iter, DistT d,
                   std::input_iterator_tag)
    {
      if (d < 0) {
        throw std::out_of_range("Negative distance");
      }

[------------------------ Page Break Here ------------------------]

      while (d--) ++iter;
    }
This is fairly grotesque, but it will serve as an example.

If the page break is easy, it means that the reader can still see everything at once, because easy breaks are between facing pages. The result of this particular break is ugly, but it doesn't really prevent a reader from understanding whatever it is that's being discussed. As an author evaluating the break, I might, depending on how tired I am at the time, simply roll my eyes and let it go.

If it's a difficult break (i.e., across non-facing pages), eye-rolling won't suffice. Making sense of the function requires being able to see the declaration of the parameters while looking at the function body. Asking a reader to flip a page back and forth to see the whole function is unacceptable. They're already working hard to understand the material, and at any rate, making the stuff easy to follow is what they pay me for when they buy the book. If this is a difficult page break, I have to intervene.

I might manually move the break so that the entire function fits on the second page. I might rewrite some text on the first page so that the break moves to an acceptable location. I might move the bottom page margin down on the first page so that the entire function fits. There are several options. Torturing my readers by doing nothing is not one of them.
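
The first of those options, moving the break so the whole function lands on the second page, amounts to a simple keep-together rule. Here is a minimal sketch of that rule. The names `Placement` and `placeBlock` are hypothetical, invented for illustration, and the sketch assumes the block is short enough to fit on a single page:

```cpp
// Hypothetical helper types and names; not taken from any real typesetting tool.
struct Placement {
    bool pushToNextPage;  // true if the block should start at the top of the next page
    int  blankLinesLeft;  // lines left empty at the bottom of the current page
};

// Decide where a block that belongs together (e.g., a function definition) should start.
// Assumption: blockLines is no larger than a full page, so pushing it always helps.
Placement placeBlock(int linesLeftOnPage, int blockLines, bool breakIsDifficult)
{
    const bool fits = blockLines <= linesLeftOnPage;
    if (fits || !breakIsDifficult) {
        // Either nothing gets split, or the split is across facing pages and tolerated.
        return {false, 0};
    }
    // Splitting across a difficult break is unacceptable: start the block on the next page.
    return {true, linesLeftOnPage};
}
```

For the eight-line function above with only five lines left before a difficult break, `placeBlock(5, 8, true)` says to push the block and leave five lines blank. The other options (rewriting text, adjusting the bottom margin) trade that blank space for other kinds of work.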

As I mentioned in an earlier blog entry, I want to write for multiple output devices, of which ink on paper is only one. For some of those devices, all page breaks are easy. For others, all are difficult. For still others, it depends on the configuration of the software being used to access the book:
  • Hardcopy books: As explained above, some page breaks are easy, some are difficult.
  • Kindle: My understanding is that Kindle shows only one page at a time and that scrolling is not supported, so all page breaks are difficult. Whether this applies to other dedicated ebook-reading devices, I don't know.
  • PDF on a computer monitor: Using Acrobat Reader, documents can be viewed as facing pages (in which case some page breaks are easy and some are difficult), as single pages (whereby all page breaks are difficult), or as a continuous stream of pages (in which case all page breaks are easy).
  • Audio stream: There are no pages, so the issue doesn't really arise, but one can think of all page breaks as being easy.
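
The list above can be summarized as a small classification function. This is only a sketch: `Medium`, `BreakKind`, and `classifyBreak` are names made up for illustration. It also assumes that odd-numbered pages are right-hand pages, so the break after an even-numbered page falls between facing pages. A particular book or PDF viewer setting may pair pages differently.

```cpp
// Hypothetical names throughout; none of this comes from a real library.
enum class Medium {
    Hardcopy,
    Kindle,
    PdfFacingPages,
    PdfSinglePage,
    PdfContinuous,
    Audio
};

enum class BreakKind { Easy, Difficult };

// Classify the page break that follows page `pageBeforeBreak` (1-based).
// Assumption: odd-numbered pages are right-hand pages, so a break after an
// even-numbered (left-hand) page is between facing pages.
BreakKind classifyBreak(Medium medium, int pageBeforeBreak)
{
    switch (medium) {
    case Medium::Hardcopy:
    case Medium::PdfFacingPages:
        // Some breaks are easy, some are difficult, depending on page parity.
        return (pageBeforeBreak % 2 == 0) ? BreakKind::Easy
                                          : BreakKind::Difficult;
    case Medium::Kindle:
    case Medium::PdfSinglePage:
        // Only one page is visible at a time.
        return BreakKind::Difficult;
    case Medium::PdfContinuous:
    case Medium::Audio:
        // Content before and after the break is available without flipping.
        return BreakKind::Easy;
    }
    // Not reached for valid enumerators; treat anything unknown conservatively.
    return BreakKind::Difficult;
}
```

Under these assumptions, the break between pages 99 and 100 of a printed book is difficult (`classifyBreak(Medium::Hardcopy, 99)`), while the break between pages 100 and 101 is easy.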
So here's the problem: I want to write a book to be viewed on multiple output devices; different devices have different characteristics regarding the existence of difficult page breaks; different devices have different page sizes, so it is, in general, not possible for me to know the location of all page breaks; and I want to never have text break unacceptably across a difficult page break.

How do I achieve that?

1 comment:

Anonymous said...

The point about page breaks is key. IIRC, the page break from page
99 to 100 in K&R (2e) has been termed one of the most unfortunate page
breaks ever and is an example of formatting leading to alternative
interpretations. In this case it is more than just aesthetics! And it
is a challenging issue to try to fix, as it is so closely tied to the
semantics of the text.