Happy Gutenberg Holiday (or, how I found the spirit of the season in online proof-reading)

Before the Google Books project … before there was a Google … heck, before Larry Page and Sergey Brin were even a gleam in their parents’ eyes -- there was Project Gutenberg.

In the ancient times of 1971, Michael Hart and a couple of buddies got a large block of time on a mainframe computer at the University of Illinois. Hart quickly became rather obsessed with the idea that the best value from computing would be to search and share books from libraries.

Getting books and documents online and searching through digital versions of them is nothing unusual … today. But in those pre-PC, pre-World Wide Web, pre-mobile device days it was a rather, uhm, unwieldy and somewhat goofy futuristic idea.

And yet, there was something there that caught the imagination of enough network users that the process of turning public works into freely sharable ASCII text, aka etext, began. And continues.

For decades, an ongoing online community has been scanning, turning scans into ASCII text, proofreading, and formatting out-of-copyright books. Between Project Gutenberg (http://www.gutenberg.org) and its sister projects around the world, more than 100,000 books are currently available free in searchable text form, with more being added all the time.

Yesterday, A Christmas Carol by Charles Dickens was downloaded 1,060 times.

Myself, I opened the HTML version of Egyptian Ideas of the Future Life by Sir E. A. Wallis Budge (3rd edition dated 1908). I’m kinda in an ancient Egypt phase at the moment and it was simply irresistible, not only to me but to another 780 people who also opened it up.

The Project Gutenberg site is not exactly a model of cutting-edge web design. The ASCII text base files are, well, basic. The HTML versions of the files are equally basic.

But there’s a reason for the basics: one of the project’s underlying philosophies is to keep the technology as simple as possible so that the result stays universally accessible, and, well, ASCII text is about as universal as it gets. No matter what the program, it can probably open and display an ASCII text file.

As the project itself explains, “Alice in Wonderland, the Bible, Shakespeare, the Koran and many others will be with us as long as civilization … an operating system, program, a markup system … will not.”

It is a never-ending process, one that offers no returns to investors. It is essentially an open source project, created by the community and shared by all. Which is where the Distributed Proofreaders effort comes in.

I just began a little volunteering as a distributed proofreader (http://www.pgdp.net). Oddly, I feel warmer and fuzzier proofreading random digital pages from books for future use than I do writing guilty checks in response to the holiday nonprofit beg-a-thon currently underway. (Honestly, if I get one more “we need your help” request, I shall scream. They don’t want my help – they want my checkbook! But I digress…)

So earlier this week I logged on to the Distributed Proofreading project and worked on a page here and there, filling those little moments when I needed a breather from something else.
The pages came from a children’s book of an era past. It was a beginner’s proofreading project, and I mostly deleted extra spaces added by the scanner or restored the letter ‘y’ to its rightful place in a word. For some reason, the type in the original book had a ‘y’ that the OCR choked on.

It was a little thing, but when I realized that I was reading non-consecutive pages because someone else, somewhere out there in the world, was proofing the same book as me … it was, I don’t know, it was like there was magic to it.

I don’t know what other word one can use to describe this ability, this collaboration across time and space, with someone I didn’t know and never will know, to produce something that people who might not even be born yet might access in the future … I think magic is as good a word to use as any other.

You see, the most amazing thing about technology is the way it creates connections and communities of the most unlikely types. The process by which an available work becomes a Project Gutenberg etext is a perfect example of that power.

Here’s how it works. Volunteer coordinators maintain a database of books. Individual volunteers offer to scan a selected book and upload the scanned images to a central database. Proofreaders (that’s what I’ve been doing) choose a book and work on pages of it in parallel, comparing the ASCII text produced from each scan against the original page image and fixing errors such as extra spaces, garbled words, and my friend the mangled letter Y. Others insert basic formatting to match the original, making a word italic, for example. Software compiles the text into one book file. The book is then made available to anyone at no cost via Project Gutenberg.
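For the curious, here is a minimal Python sketch of the last two steps: tidying a page of OCR text and stitching proofed pages into one book file. It is only my illustration, not the project's actual software; the function names `tidy_page` and `compile_book` and the sample pages are made up, and a human proofreader does far more than collapse extra spaces.

```python
import re


def tidy_page(ocr_text):
    """Collapse runs of spaces or tabs and trim each line (hypothetical helper)."""
    cleaned = []
    for line in ocr_text.splitlines():
        cleaned.append(re.sub(r"[ \t]+", " ", line).strip())
    return "\n".join(cleaned)


def compile_book(pages):
    """Join proofed pages in page-number order, separated by a blank line.

    `pages` maps a page number to that page's text. Pages may be finished
    in any order, because volunteers work on them in parallel.
    """
    return "\n\n".join(tidy_page(pages[number]) for number in sorted(pages))


# Made-up sample pages, proofed out of order by different volunteers.
pages = {
    2: "and  that was   the end of it. ",
    1: "It was  a very   fine day,",
}

book = compile_book(pages)
print(book)
print("Plain ASCII:", book.isascii())
```

Sorting by page number is the whole trick: it does not matter who finished which page first, the book still comes out in order.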

There are no screaming attorneys. No lobbying effort. No one turf-sitting.

You can do it, or not do it. No one is haranguing you over a deadline or pushing you to compete with someone else. Anything you do, even proofing one page, is appreciated and adds to the value of the whole.

No, proofing a few pages out of hundreds so that a copy of The Wouldbegoods gets into the Gutenberg database isn’t going to change the world. At least I don’t think it will, although the strangest things can happen!
It won’t create world peace, feed the hungry, or save the whales, either. There’s nothing dramatic or earth-shaking here.
But in a tiny way, by being part of a global effort with no agenda other than to create ASCII text files of out-of-copyright books for others to share, I know that I’m part of some larger thing, part of a global connection, and part of passing along a bit of history and human culture into the future.

In a season when we talk a lot about keeping the traditions and being people of goodwill and all that, I like the feeling of being part of something bigger than me, of being one voice in a song about us. It is nice to do something that isn’t about winning and doesn’t have the frenetic activity that this week too often possesses.

A wonderful quote about the dynamics of a group working together is attributed to Paul McCartney. It describes the process better than I can, and so I think I’ll end this week’s column with it:

“I love to hear a choir. I love the humanity … to see the faces of real people devoting themselves to a piece of music. I like the teamwork. It makes me feel optimistic about the human race when I see them cooperating like that.”

Happy Holidays, one and all!
