Online librarian plans giveaway with no returns

WIRED ON FRIDAY/From Silicon Valley: The Internet Archive has collected more than 100 terabytes, or five times the size of the American Library of Congress, with its 20 million books

It's a two-storey white wooden house in the middle of the Presidio, San Francisco's curiously refined ex-military base. It sits alone, humble but aloof: it looks like a schoolhouse from a settlement in Little House on the Prairie. Next to it is a quaint little painted sign. It says, simply, The Internet Archive. Brewster Kahle, chief archivist, meets you at the door and waves you in. Inside are books.

Every month millions enter the Internet Archive through a different door: a site called the Wayback Machine, hosted at http://web.archive.org/. Pick your website. Now pick a date from the last seven years. The Wayback Machine will present you with what that site contained, as close to that day as it can manage.
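The "as close to that day as it can manage" step can be pictured as a nearest-date lookup. The sketch below is only an illustration under assumptions: the function name `closest_snapshot` and the capture dates are made up, and nothing here describes how the Internet Archive's own software works.

```python
from datetime import date

def closest_snapshot(snapshots, wanted):
    """Return the capture date nearest to `wanted` (hypothetical helper)."""
    if not snapshots:
        raise ValueError("no snapshots stored for this site")
    # If an earlier and a later capture are equally close, the one listed first wins.
    return min(snapshots, key=lambda captured: abs((captured - wanted).days))

# Made-up capture dates for one imaginary site.
captures = [date(1997, 1, 12), date(1999, 6, 3), date(2001, 11, 20)]
print(closest_snapshot(captures, date(1999, 9, 1)))  # prints 1999-06-03
```

Ask for a date before the first capture and the earliest copy comes back, which matches the reader's experience of being handed the nearest page available rather than nothing.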

The Internet Archive is an incredibly ambitious project: to store and preserve the whole of the Net, through the whole of time. Already it has collected over 100 terabytes, or five times the size of the American Library of Congress, with its 20 million books.
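The comparison implies a rough size for a single book. The arithmetic below takes the article's three figures at face value and assumes decimal units (one terabyte as a million megabytes) and that the Library of Congress figure counts only the text of its books; the variable names are illustrative.

```python
ARCHIVE_TB = 100       # size of the Internet Archive, from the article
RATIO = 5              # "five times the size" of the Library of Congress
BOOKS = 20_000_000     # books in the Library of Congress, from the article

library_tb = ARCHIVE_TB / RATIO                # implied size of the library
mb_per_book = library_tb * 1_000_000 / BOOKS   # assumes 1 TB = 1,000,000 MB

print(f"Implied library size: {library_tb:.0f} TB")   # prints 20 TB
print(f"Implied size per book: {mb_per_book:.0f} MB") # prints 1 MB
```

In other words, the comparison works out to about a megabyte per book.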

Brewster Kahle has even more ambitious plans for this wooden house - and every other library in the world.

If we entered the Wayback Machine ourselves and travelled back a million Net years to 1983, Brewster Kahle would already be a famous face. He was one of the original employees of Thinking Machines, the legendary start-up that designed innovative and, for their time, incredibly powerful supercomputers. One planned application for those supercomputers was an early search engine, scanning through text at lightning speed.

To Kahle's disappointment, his search feature proved to be less than useful. Not because of any limitations in Thinking Machines' technology, but because there was nothing for it to search. The digital world, at the time, was mostly unoccupied. Most of the world's knowledge still sat on library shelves, unviewable by any computer.

So Kahle set about sucking up the knowledge of the real world and populating these empty data wastelands with it. He started his own company, WAIS, in 1989, and made deals with publishers like the Wall Street Journal and the New York Times to release their contents in machine-readable form.

Kahle sold WAIS to AOL in 1995. With the rise of the Web, he no longer needed to seed the digital world with copies of the New York Times. But the Web proved a poor guardian of its treasures.

A Web page in 1996 lasted on average 75 days, then evaporated. As fast as the Web grew, its receding edge was deleted, caught up in a perpetual cultural revolution.

Kahle set out to preserve what would otherwise be lost. He designed systems that would go out, grab every page they could find on the Net, and squirrel them away on their local hard drives. In the midst of the dotcom boom, such a repository had commercial value, and in 1999 Kahle sold Alexa, the company he had built around it, to Amazon for an undisclosed sum. As part of the deal, he was allowed to continue with the altruistic side-project for which he had specifically created Alexa: the Internet Archive.

Now Kahle's non-profit group shows the Net its history - and has been adding more to its digital collection: public domain films from the 1950s and 1960s, e-texts, and software. Anything, in short, that Kahle can get his hands on, cheaply digitise and freely distribute.

Kahle distributes himself a little freely too. As well as the Internet Archive, he's been working to set up a free wireless network across most of San Francisco. He's given a free copy of the entire archive to the new Library of Alexandria in Egypt. He's dedicated the archive to scanning in and converting to text a million pre-1926 books by 2005.

This September saw him drive a converted van with a satellite dish on its roof across the breadth of the United States, stopping at schools and libraries. There, he'd let children browse the thousands of books the Internet Archive holds online - books like Alice in Wonderland, the works of Dickens, and photo-perfect facsimiles of The Wizard of Oz. If the children saw a book they liked, Kahle let them print, cut and bind it themselves, and take it home. It was a mobile library filled with thousands of books, and a librarian who never wanted any of them back.

Kahle's plan is to turn that theory into practice. Having worked to digitise and preserve the cutting edge of information, he is now turning back the Wayback Machine, and scrabbling to absorb and digitise the bulk of past human knowledge. By combining the infinite "reproducibility" of digital media with new ways to distribute and reproduce books in printed form, he aims to give every child on the planet access to the largest library the world has ever known.

He sees librarians of the future sharing their works not only with their local community but with the world: by scanning their own public domain holdings and uploading them to the Net, each would add an imperfect but infinitely lendable copy to a worldwide library.

Brewster Kahle's new job, in a nutshell: to turn libraries into places that give books away, and never ask for them back.