Wednesday, August 26, 2009

Download Over a Million Public Domain Books from Google Books in the Open EPUB Format



Over the years, we've heard a lot from people who've unearthed hidden treasures in Google Books: a crafter who uncovered a forgotten knitting technique, a family historian who discovered her ancestor once traveled the country with a dancing, roller-skating bear. The books they found were out of copyright and in the public domain, which meant they could read the full text and even download a PDF version of the book.

I'm excited to announce that starting today, Google Books will offer free downloads of these and more than one million more public domain books in an additional format, EPUB. By adding support for EPUB downloads, we're hoping to make these books more accessible by helping people around the world find and read them in more places. More people are turning to new reading devices to access digital books, and many of these devices, such as phones, netbooks, and e-ink readers, have smaller screens that don't readily render the image-based PDF versions of the books we've scanned. EPUB is a lightweight text-based digital book format that allows the text to automatically conform (or "reflow") to these smaller screens. And because EPUB is a free, open standard supported by a growing ecosystem of digital reading devices, works you download from Google Books as EPUBs won't be tied to or locked into a particular device. We'll also continue to make these books available in the popular PDF format so you can see images of the pages just as they appear in the printed book.

To get started, just find any public domain book on Google Books and click on the Download button in the toolbar.


Of course, these public domain books weren't born in EPUB format--or even in digital format at all. Let's say you download a free EPUB copy of Treasure Island. You're taking the final step in a long process that takes a physical copy of Robert Louis Stevenson's book and transforms it into something you can download for your iPhone. The process begins with a book that has been preserved by one of our library partners from around the world. Google borrows the book from that partner, much as you might borrow one from your local library. Before returning the book in undamaged form, we take photographs of the pages. Those images are then stitched together and processed in order to create a digital version of the classic book. This includes the difficult task of performing Optical Character Recognition on the page images in order to extract a text layer we can transform into HTML or other text-based file formats like EPUB (if you're interested, you can read more about this process here).

Digitizing books allows us to provide more access to great literature for a wider set of the world's population. Before physical books were invented, thoughts were constrained by both space and time. It was difficult for humans to share their thoughts and feelings with people far from their physical location. Printed books changed that by allowing authors to record their experiences in a medium that could be shipped around the world. Similarly, the words written down could be preserved through time. The result was an explosion in collaboration and creativity. Via printed books, a 17th-century physicist in Great Britain could build on the work of a 16th-century Italian scholar.

Of course, it can be difficult and costly to reproduce and transport the information that older physical books contain. Some can't afford these works. Others who might be able to afford to purchase them can't unless they can find a physical copy available for sale or loan. Some important books are so limited in quantity that one must fly around the world to find a copy. Access to other works is only available to those who attend certain universities or belong to certain organizations.

Once we convert atoms from physical books into digital bits, we can begin to change some of that. While atoms remain fairly expensive, digital bits are becoming ever cheaper to produce, transport, and store. For example, providing every student in a school district with a paper copy of Shakespeare's Hamlet might cost thousands of dollars. Yet if those same students already have cell phones, laptops, or access to the Internet, then they can access a digital copy of Hamlet for just a fraction of the cost. Often, public domain texts in digital form are more affordable and accessible to the public than their physical parents.

All of this, of course, assumes that a digital version of the book exists. I love going into work each morning knowing that we're working to convert atoms into bits and that by doing so, we hope to make knowledge more accessible. In a world where educational opportunities are often disproportionately allocated, it's exciting to think that today anyone with an Internet connection can download any of over one million free public domain books from Google Books. Who knows? Maybe some kid will read Treasure Island on their phone and be inspired to write their own great novel someday.

4 comments:

  1. If this idea is able to make available literature such as my generation were afforded, then an endless panoply of magnificence will be opened for those who, otherwise, would never have the mind-numbing opportunity to immerse themselves in such a wonderland.

  2. Anonymous 5:23 PM

    It once worked but it doesn't any longer.

    Find any "Free" PUBLIC DOMAIN book, then click on the indicated "Download PDF" button -- and nothing happens. One cannot download the PDF.

    Search around enough and you might stumble upon another "Download PDF" button, hidden away at the bottom of a page totally unrelated to the idea of downloading a PDF of a PUBLIC DOMAIN BOOK.

    That button "works" in that it will begin the download. But it invariably stops shortly thereafter; one is NOT able to download a PDF of such "free" PUBLIC DOMAIN texts.

    I complained about this long enough ago for it to be fixed by now; but it is not: nothing has changed: it is still exactly the same: the promise that one can download PDFs of PUBLIC DOMAIN books is a flat out lie.

    The "feature" -- promise -- is nothing more than a deliberately-difficult run-around -- as is the real-time "response" by Google to this complaint -- intended foremost to frustrate so one either buys a copy of the book, or stumbles around until one finds that one CANNOT download a copy of the "free" PUBLIC DOMAIN book.

    Replies
    1. Sheogorath 9:18 PM

      I had your problem, so I Googled 'free Public Domain epub downloads', discovering manybooks.net as a result. You gotta love the irony of Google pushing me into using their search engine to find a rival for another of their 'services'!

  3. @Anonymous: Can you provide some specific examples where a public domain book can be read, but not downloaded?

    Only books in the public domain -- books whose copyrights have expired -- are available for free download or full view. For users in the United States, this typically means books published before 1923. For users outside the U.S., we make determinations based on appropriate local laws.

    As with all of our decisions related to Google Books content, we’re conservative in our reading of both copyright law and the known facts surrounding a particular book.



    Certain public domain works may not yet be viewable in full or available for download. We're working to enable these features for all public domain works as quickly as possible. To report a book you believe to be in the public domain, please visit: http://books.google.com/support/bin/answer.py?&answer=180577&hl=en
