Ebook Formats - The Tower of Babel

Let’s say you’ve finished your online novel. You now want to turn that story into a download, and perhaps make it available for purchase from the store section of our site. At this point, however, you’re met with two problems: 1) you’ll have to convert your text to an appropriate ebook format; and 2) you’ll have to decide which one.
The ebook format fiasco is sometimes called ‘the tower of eBabel’, and for good reason: there are too many formats. But because we deal in digital books, and because ebooks are fast becoming a viable model of distribution, we need to consider the sticky question of which ebook format to use, and why. This page attempts to answer that question. (Note that this is difficult to answer without looking into the future, simply because it is unclear whether there’s ever going to be a victor in the ebook format wars. But I’ll get back to that in a bit.)
Context
E-book formats are no longer created from scratch. In most cases, the ebook maker – regardless of whether it’s a vendor or an open-source project – will decide either to adapt an existing format or to build on an underlying markup language that makes the format easier to define. Today, that language is often XML, or eXtensible Markup Language. Before we talk about the various ebook formats proper, it’ll be good to talk a little about XML, and why it’s so popular as an underlying language.
The answer to that lies in XML’s name. ‘Markup’ and ‘Language’ are pretty self-explanatory: they tell us that XML is a language (a markup language, strictly speaking, rather than a programming language) that consists primarily of markup tags, much like HTML.
In fact, an XML document looks pretty much like any HTML page. But unlike HTML, XML is extensible, which makes it powerful enough to define and shape other languages. This means that XML allows you to define and create your own tags. For example, if I were an e-book-format creator, I could easily create and define <title> as a tag describing the title of an e-book. <title> doesn’t actually exist in XML. However, because XML is extensible, I can create what is effectively a whole new vocabulary for my e-book format, and it’ll contain <title>, and whatever other tags I see fit to use. All I have to do is define them, so that my ebook reader will understand which bits are which, and treat those sections accordingly.
You can tell that XML is useful precisely for this flexibility of form and function. The language is now used for many, many things, sometimes even as the foundation for web services to send requests and responses behind the scenes, server to server. And if you take a look at even the simplest of RSS feeds, you’ll find a language that is defined, and made possible, through XML.
Most of the major ebook formats today are built upon some foundation of XML. The ePub format, widely tipped (http://toc.oreilly.com/2008/04/keep-your-eye-on-the-epub-ball-but-play-nice.html) to become widespread, is built on a strong XML base. The Amazon Kindle format is built on a modified version of the Mobipocket ebook platform, which is in turn built on XHTML (with a dash of JavaScript/frame support). So is the format used by the new Sony Reader, though that’s known as the Sony BBeB. The conclusion you can take away from this is that sooner or later, XML will become a major part of your workflow regardless of which ebook format ends up as the eventual winner of eBabel. There’s no running away from it.
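To make the idea of defining your own tags concrete, here is a minimal sketch in Python. The tag names (book, title, author, chapter), the name attribute, and the function book_to_html are all invented for illustration and do not belong to any real ebook format; the only thing the sketch relies on is Python’s standard xml.etree.ElementTree module. It reads a small document written in this made-up vocabulary and, playing the part of a very simple ebook reader, treats each section accordingly by turning it into HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up ebook document. None of these tags are part of XML itself;
# this hypothetical format simply decides that they exist.
BOOK_XML = """<book>
  <title>The Tower of eBabel</title>
  <author>A. Writer</author>
  <chapter name="One">It was a dark &amp; stormy night.</chapter>
  <chapter name="Two">The formats multiplied.</chapter>
</book>"""


def book_to_html(xml_text: str) -> str:
    """Render the hypothetical <book> vocabulary as a small HTML fragment."""
    root = ET.fromstring(xml_text)

    # The "reader" knows what each custom tag means and acts on it.
    title = root.findtext("title", default="Untitled")
    author = root.findtext("author", default="Unknown")
    parts = [f"<h1>{escape(title)}</h1>", f"<p>by {escape(author)}</p>"]

    # Each <chapter> becomes a heading plus a paragraph.
    for chapter in root.findall("chapter"):
        heading = chapter.get("name", "")
        body = chapter.text or ""
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append(f"<p>{escape(body)}</p>")

    return "\n".join(parts)


if __name__ == "__main__":
    print(book_to_html(BOOK_XML))
```

Running the script prints a heading for the title, a byline, and a heading plus paragraph for each chapter. A real format would need far more than this (metadata, styling, a table of contents), but the principle is the same: the format defines the tags, and the reading software decides what to do with them.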
The good news, however, is that XML is a remarkably convertible format. It’s going to get easier and easier to work with as most major software vendors make the jump to XML-based files. Case in point: Microsoft Word’s new docx format is built on XML, and it’s not very hard to convert XML to other formats, say PDF, HTML, or an XML-based ebook format of your choice.
The e-book Formats
So let’s get started. The following are the e-book formats in use today, ones that I believe still have a fighting chance of becoming the one format of the known universe.
1. Amazon Kindle’s AZW. The Kindle uses Amazon’s proprietary AZW format, but can read unprotected Mobipocket e-books, HTML, Word documents and plain text (.txt) files. You convert to AZW using Amazon’s online Digital Text Platform (https://dtp.amazon.com/mn/signin), and you format your e-book (http://johnaugust.com/archives/2009/kindle-formatting-for-web-geeks) using rudimentary HTML. AZW supports DRM (unfortunately) and is built around the Mobipocket format, though, confusingly, DRM-protected Mobipocket files cannot be read on the Kindle, because the two formats are not exactly one and the same.
Is it worth it? Publishing your work in the AZW format grants you immediate access to the Amazon online store, where a number of online writers (http://johnaugust.com/archives/2009/spelunking-the-kindle-market) have been making a decent sum selling their work, some of whom have been regularly hitting the top 10 bestseller lists for Kindle e-books. So, yes, it’s worth it.
2. Sony Reader’s BBeB, which stands for Broadband eBooks, is perplexing: Sony does not offer any tools to convert to the format, making the Sony Reader a closed medium to all but the biggest of publishers. In fact, the only way to publish for the Reader is via RTF or PDF, but XML to PDF conversions aren’t solid at the moment, and RTF limits your formatting options (it’s hardly better than a .txt file, to be honest). There is at least one unofficial converter to BBeB (http://code.google.com/p/bbebinder/), but Sony’s lack of support for writer releases is discouraging at best. Is it worth it? No.
3. Mobipocket (also known as mobi). The Mobipocket format was originally created in 2000 by Mobipocket SA, a French company that Amazon bought in 2005. It’s been around for quite a while, and it’s probably the only ebook-ish format at the moment that can claim full multi-platform compatibility. It runs on just about everything: the Kindle, Palm OS, Symbian, Windows, Mac, and the iPhone, where the Stanza reader (http://www.lexcycle.com/) allows you to read Mobi books, though Stanza was recently bought by Amazon and is now in a vague sort of flux. It is, however, not very popular, and there doesn’t seem to be a captive audience or a community built around the format. A quick snoop around the official Mobipocket site (http://www.mobipocket.com/en/HomePage/default.asp?Language=EN) confirms this. Why? I’m not sure at the moment (and I’m still looking for proper mobi-related numbers), but a surprising number of traditional publishers offer their ebooks in a mobi format. Is it worth it? This is hard to say. On one hand, the Mobipocket software suite is completely free, and it’s old enough to make conversion and formatting very easy on the writer. But the truth is that it’s not an exciting format to talk about, and this lack of excitement can probably be attributed to a lack of Mobipocket users, even with free software for just about every platform.
And if you’re not likely to get serious ebook readers on Mobipocket (and you can’t sell mobi ebooks on Amazon for Kindle, anyway), then I guess it’s not worth spending so much time and energy on a format not many people would use in the first place.
4. ePub originally started off as the OEB (Open eBook) initiative. ePub is currently tipped to be the next big ebook format, if only because it’s backed by a loose consortium of publishers, writers, and programmers, who are tied together in the IDPF (http://www.openebook.org/), or what is known as a ‘standards and trade organization for the digital publishing industry’. As mentioned earlier in this article, ePub is built on XML, and the IDPF leaders are currently trying to push it as a distribution standard for e-books (http://toc.oreilly.com/2008/04/keep-your-eye-on-the-epub-ball-but-play-nice.html). This means a couple of very interesting things. If the ePub people have their way, publishers will no longer have to produce e-books in different formats for different e-book vendors; they publish in just ePub, and demand that everyone else (say, Amazon) convert ePub to their own proprietary format. And it’s really simple to do that, primarily because ePub is built on a nearly 100% XML base, itself a highly convertible format.
Is it worth it? In late 2008, Sony announced that its Reader would support the ePub format, and publishers (or at least the ones who have a vested interest in a digital book future) have been relatively supportive of ePub over other formats. If the IDPF people get their way and ePub becomes the industry standard (or even if it becomes just a distribution standard), ePub would be well worth it. I’m fairly optimistic that ePub will win (at the very least, I want it to win), but the road to that future is far from clear-cut: Amazon has yet to announce any plans for ePub compatibility. They’re the one major player who’s yet to come around to ePub, and for what it’s worth, I think that it’s going to take a bit of time, some elbow grease, and a lot of arm wrestling to get them to see things from the publisher’s point of view. But give it time. It should happen, eventually.
5. Adobe’s PDF format is probably the best known of the e-book formats I’ve discussed so far [3] (http://www.novelr.com/2009/06/30/the-novelr-guide-to-ebook-formats#ebookformats_footnotes3). There’s not much to talk about: PDFs are simple, familiar, and easy to use regardless of medium, plus they’ve been around long enough for everyone to know, more or less, what a PDF file looks like.
And because the PDF format is so old, it’s not likely that you’ll ever meet anyone with a computer that can’t read it. Is it worth it? Hell, yes.
The Format That Wins
I want to make a case here that the primary ebook format we’re going to work with is probably going to be whichever ebook format wins on the iPhone. The Apple developer conference, WWDC, happened not very long ago, and several very interesting things became clear during that conference, most of them worrying news to the rest of the mobile phone industry, but good news for the rest of us. Here’s what Daring Fireball’s John Gruber has to say (http://daringfireball.net/2009/06/wwdc09_wrapup):
“On the whole, there was a palpable sense that the iPhone is a peer to the Mac in Apple’s eyes. This isn’t about counting how many sessions were devoted to each. Nor is it an indication that the Mac as a platform is slowing. Quite the opposite in fact – Apple is selling more Macs than ever, and, knock on wood, there’s a strong consensus amongst developers that Snow Leopard is going to be the best release of Mac OS X yet. It’s simply that for however fast the Mac is growing, the iPhone is growing far faster.
“But the two platforms are symbiotically intertwined. The Monday schedule at WWDC is static. In the morning comes the keynote, which the press attends and where all public announcements are made. After lunch, though, there comes what is effectively a second keynote, this time with material aimed squarely at developers. A technical keynote, as compared to the morning’s marketing keynote, if you will. This technical keynote has for as long as I can remember been titled ‘Mac OS X State of the Union’. This year the title changed to ‘Core OS State of the Union’.
“Hence the symbiosis: Apple now has two full-fledged developer platforms, Mac OS X and iPhone OS, derived from one core system. Neither felt more important than the other this year at WWDC, which is remarkable considering that one of them hadn’t even shipped two years ago.
“But look at their vectors – their relative rates of growth – and ponder how much longer until WWDC begins to feel like an iPhone developer conference with a Mac developer track. My answer: next year. In other words, I think it will have taken just three years for the iPhone to supplant the Mac as Apple’s primary platform. By 2011 it will be obvious.
“It’s simply a matter of users. During Phil Schiller’s keynote, he showed a graph of the ‘OS X’ user base over time, with steady growth over the first part of this decade followed by a sharp jump from 25 to 75 million over the past two years. This figure was widely mis-cited, however, as showing growth in ‘Mac OS X’ users. It did not. The graph said ‘OS X’, not ‘Mac OS X’, and what Apple meant to show were the combined number of users of Mac OS X and iPhone OS. It was a very misleading and poorly-designed chart.”
This doesn’t prove anything on its own, but stick with me for a bit.
I’ve been seeing several articles (http://loopinsight.com/2009/06/is-att-afraid-of-iphone-users-mms-and-tethering/, http://daringfireball.net/linked/2009/06/16/att-dalrymple) arguing that AT&T isn’t providing immediate MMS and tethering support for fear that their network would crash the very instant a million or so iPhone users decide to connect their devices. And I’ve noticed that the iPhone is itself a remarkably tactile platform, one perfect for reading books, and that we’ve already seen a number of apps (http://blog.sangsara.net/2009/05/comparing-ebooks-classics-stanza-and.html) showing us just that: reading on your iPhone is one hell of a revelatory experience. We’ve also been introduced to the Apple iPad, with all the touchy goodness associated with Apple’s current multi-touch technology, and having that released means bringing the tactile interface to a fully-fledged operating system.
And lastly, I’ve noticed that all those people connecting to an online network on such a small device will be a community of captive, fanatical users limited by the processing capabilities of their phones, but not by their phone’s features, making the iPhone at once better than any ebook reader out there (cough, the Kindle, cough) and perfect for reading text on the go.
But all of the above are small, fragmented pieces of information, hardly worth talking about individually. It’s when you look at them from a broader perspective that things begin to become a lot more exciting, particularly from a digital-fiction point of view. Allow me to pull it all together for you: Apple sees the iPhone as a peer to their traditional Mac platform; the iPhone is a superior tactile device perfect for on-screen reading; the iPhone has a fanatical userbase that is connected to the Internet, one that downloads and consumes content through the iPhone itself; and Apple is a master at enabling 3rd-party (software) innovation (http://www.codinghorror.com/blog/archives/001280.html).
Put two and two together and you’d realize that this platform is ready for just the right ebook app to come along, and whichever one it is – be it Amazon’s Kindle app, or a Eucalyptus-type reader, or even one that we’ve never heard about – that app will be the turning point that defines our industry. Want to know which format you should end up supporting? Watch the iPhone and iPad, and watch them closely.
Article first published 2009 by novelr.com (http://www.novelr.com)
© 2010 ebooksgold.com