Saturday, July 24, 2010

The Curious Case of eCapitalization

An unresolved problem faced by all technology writers is what to do with creative capitalization. When you want to lead off a sentence with a word like "iPad" or "eBook", how do you capitalize it? Do you go with "Ipad" and "Ebook"? Or perhaps "IPad" and "EBook"? Do you stay with "iPad" and "eBook" and consider them to be capitalized versions of "ipad" and "ebook"? Horror of horrors – you could put in a dash. Maybe you just finesse the issue by changing your sentence around to avoid having a camelCased word leading off the sentence. Even then, you have the problem of what to do if the word is in the title of your article, for which you probably use Title Case unless you're a cataloging librarian, in which case you use Sentence case, not that the problem goes away! If you're using an iPhone, you know it has a mind of its own about the first letter of your email address being capitalized.

The practice of capitalizing titles presents issues, particularly when the titles are transported into new contexts, for example via an RSS feed or search engine harvest. ALL CAPS MIGHT LOOK OK AS A <TITLE> ON YOUR WEB PAGE, but a search engine might hesitate to scream at people.

This is not by any means a new problem, but it's one that changes from era to era because of the symbiotic relationship between language and printing technology. Here's what Charles Coffin Jewett wrote in 1853 when discussing how libraries should record book titles:
The use of both upper-case and lower-case letters
in a title-page, is for the most part a matter of the printer's taste,
and does not generally indicate the author's purpose.  To copy them in a
catalogue with literal exactness would be exceedingly difficult, and of
no practical benefit.  In those parts of the title-page which are
printed wholly in capitals, initials are undistinguished.  It would be
unsightly and undesirable to distinguish the initials where the
printer had done so, and omit them where he had used a form of letter
which prohibited his distinguishing them.  It would teach nothing to
copy from the book the initial capitals in one part of the title, and
allow the cataloguer to supply them in other parts.
The standard practice of libraries in English-speaking countries has been to record book titles in Sentence case, in which the first word of the title is capitalized and the rest of the words are capitalized only if the language demands it (unless the first word is an article like "A", in which case the second word is also capitalized). The argument for this is that this capitalization style allows for the most meaning to be transmitted; a reader can tell which words of a title are proper names or other words that are capitalized. Which begs two questions: Why are libraries alone in presenting titles this way? Why do libraries persist in this practice when no one in recorded history has ever asked for sentence case titles?

In German and other languages, nouns are capitalized; this used to be true of English (take a look at the US Constitution).  In German, it's easy to tell nouns from verbs, which might be very useful if we still had it in English. Still, I enjoy being able to write that something is A Good Thing. It gives me a way to intone my text with an extra bit of information.

The rules for how English should be capitalized have become quite complicated. Here and here are two web pages I found devoted to collecting capitalization rules. Some of them are pretty arcane.

It's fun to speculate on the future of capitalization. In the late 19th century, there was a fashion to simplify spelling, grammar and capitalization, led by people like Melvil Dewey. I'm guessing part of the reason was the annoyance of needing to press a shift key on those newfangled typewriters. But spelling and capitalization reform didn't get very far. Perhaps they tried to publish articles and got stopped in their tracks by a unified front of copy editors.

If anything, the current trend is in the direction of making capitalization even more idiosyncratic. In addition to a proliferation of Product names like iPod and eBay that have crossed over into the language mainstream,  the shift from print to electronic distribution of text does a better job of preserving the capitalization chosen by the author, thus allowing it to better transmit additional meaning.

The ability to increase the information density in text is useful in a wide range of situations, for example, when you have only 140 characters to work with, or when you want a meaningful function name, like toUpperCase(). If your family name is McDonald, you probably have strong feelings on the issue.

My guess is that life will become increasingly case sensitive. You may already be aware that it takes 8 seconds, not one, to transmit a 1 GB file over a 1 Gb/s link. And that the SI unit Mg is a billion times the mass of a mg. If you are a Java programmer who doesn't know the difference between an integer and an Integer, you'll quickly learn about NullPointerExceptions.
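The arithmetic behind those two unit examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: decimal prefixes (1 GB taken as 10^9 bytes), no protocol overhead, and hypothetical names such as `transfer_seconds` and `SI_PREFIX` that exist only for this sketch.

```python
from fractions import Fraction

BITS_PER_BYTE = 8
GIGA = 10**9  # assumption: decimal (SI) giga, not 2**30


def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> Fraction:
    """Ideal transfer time: convert bytes (B) to bits (b), then divide by the bit rate."""
    return Fraction(size_bytes * BITS_PER_BYTE, link_bits_per_second)


# SI prefixes where the letter's case changes the meaning (factors relative to the base unit).
SI_PREFIX = {
    "M": Fraction(10**6),     # mega
    "m": Fraction(1, 10**3),  # milli
}

if __name__ == "__main__":
    # 1 GB file over a 1 Gb/s link
    print(transfer_seconds(1 * GIGA, 1 * GIGA))  # 8
    # How many mg fit in one Mg
    print(SI_PREFIX["M"] / SI_PREFIX["m"])       # 1000000000
```

Exact fractions are used instead of floats so that the ratio between the two prefixes comes out as a whole number.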

The shift from ASCII to Unicode has made it much easier to cling to language-specific capitalization rules. Did you know that there are a small number of characters that are different in "upper case" than in title case? They are: LETTER DZ, LETTER DZ WITH CARON, LETTER LJ, and LETTER NJ. The lower case versions are dz, dž, lj, and nj; the upper case versions are DZ, DŽ, LJ, and NJ; and the title case versions are Dz, Dž, Lj, and Nj. The same goes for the ligatures fi, fl, ffi, ffl, ſt, st. And don't forget your Armenian ligatures, ﬓ, ﬔ, ﬕ, ﬖ, ﬗ. For this reason, being "case insensitive" is poorly defined: two strings that are equal when you've changed both to uppercase are not necessarily equal after you've changed them to lower case!
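That last claim is easy to check. The sketch below is a minimal illustration, assuming Python 3's built-in string methods (which apply Unicode's full case mappings); the helper names and the sample strings are my own choices for the example, and the German sharp s is used as the test case because it is the most familiar one.

```python
# Characters are written as escapes so the example survives any encoding mishap.
SHARP_S = "\u00df"         # LATIN SMALL LETTER SHARP S
DZ_CARON_LOWER = "\u01c6"  # LATIN SMALL LETTER DZ WITH CARON (one code point)
DZ_CARON_UPPER = "\u01c4"  # LATIN CAPITAL LETTER DZ WITH CARON
DZ_CARON_TITLE = "\u01c5"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
FI_LIGATURE = "\ufb01"     # LATIN SMALL LIGATURE FI


def equal_when_uppercased(a: str, b: str) -> bool:
    return a.upper() == b.upper()


def equal_when_lowercased(a: str, b: str) -> bool:
    return a.lower() == b.lower()


def equal_when_casefolded(a: str, b: str) -> bool:
    # casefold() is the mapping Unicode intends for caseless comparison.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Equal in upper case ("SS" both times), different in lower case.
    print(equal_when_uppercased(SHARP_S, "ss"))  # True
    print(equal_when_lowercased(SHARP_S, "ss"))  # False
    print(equal_when_casefolded(SHARP_S, "ss"))  # True

    # Upper case and title case are different operations for these characters.
    print(DZ_CARON_LOWER.upper() == DZ_CARON_UPPER)  # True
    print(DZ_CARON_LOWER.title() == DZ_CARON_TITLE)  # True
    print(FI_LIGATURE.upper(), FI_LIGATURE.title())  # FI Fi
```

If you need a caseless comparison in practice, `casefold()` is usually a safer starting point than either `upper()` or `lower()`, though it still doesn't know anything about language-specific rules.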

So what do I do when I write about ebooks? I don't use a dash. When the word appears in a title, I capitalize the "B". I can't wait till they translate this rule into Armenian.


  1. Also, capitalization rules are different for Turkish, where uppercase i is not I. Instead, uppercase i is İ and uppercase ı is I.

  2. Eric, Enjoyed all the background you had in this post. I have a framed set of test catalog cards that we first printed in the early 1980s to ensure we got the diacritics in the right places.

    And as for capitalization signifying a scream, after John Irving's A prayer for Owen Meany: a novel, the practice will always mean THE VOICE...

  3. Wow, did you answer the question posed at the beginning of the article, should "ipad" be capitalized at the beginning of a sentence? I read the entire article and still don't have an answer to my question.

    1. IPad or not iPad, that is the question. Whether 'tis nobler to just put another word at the start of the sentence?

      I Pad; I Paid; I Pandered.

      (with sincere apologies!)