Net results: The beautiful complexity of internet search

It’s an utterly banal observation to state that internet search has become a ubiquitous part of most people’s lives.

But it is still a kind of miracle.

Almost all of us with an internet connection of any sort surely use a search engine daily, most likely Google, so popular and so much a part of life that it has become a small "g" verb. People even use search engines as multifunctional tools: they're spellcheckers, measurement and currency converters, calculators (yes, they do that too) and more.

However, it’s the search function that still astonishes, if you take a moment to think about it.

Have a question, any question, from the simple to the most infernally complex? The answers are all there, even though getting them has become such an everyday bit of magic that those of us who grew up in an analogue, internetless world forget the labour and time it took to get information pre-web.

How to build an engine
But search engines did not spring into being, fully formed, at the dawn of the internet age.

Back in the early days, when there were far fewer internet sites – let alone those new, visual representations of information online known as web pages that began to emerge in the 1990s – it was very, very hard to find things online.

You had to already know what you were looking for, and where on the internet it was located. It was as if you had to know, in advance, the exact title and location of a book you were looking for in a library. You couldn’t just arrive and search for suggested books on the topic of sailing, or the 1916 Rising, or growing roses.

A Dublin audience was brought back to those early days in a fascinating panel discussion that kicked off the recent 36th annual Sigir (Special Interest Group on Information Retrieval) conference held at Trinity College.

Among the panelists for the keynote session in the Mansion House was Jonathan Fletcher. He is now working in Hong Kong in the banking information technology sector but, back in the day, he was the fellow who set up Jumpstation, widely credited as the first web search engine.

“I had one problem. I knew nothing about information retrieval, but like a lot of coders, I had an itch to scratch – so you write code. What I wrote was very, very simple. I would go visit a couple of URLs, and record that I’d collected them,” he told the audience.

He hand-entered information about each site – “I settled for the [web page] titles and headings” – and created a search engine that let people search by either titles or headings. Simple, but revolutionary. Within six months, by the end of 1993, he’d documented 25,000 pages in this way. A year later, he’d done 280,000. It was an extraordinary labour of love, done for free to help others, like so many of the web’s best features and tools.
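
For the technically curious, the idea of a search engine that looks only at titles and headings can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Fletcher's code: the page contents and example.org addresses are invented, the pages are supplied as text rather than fetched from the network, and the titles and headings are pulled out automatically with Python's standard HTML parser.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleHeadingParser(HTMLParser):
    """Collect the text inside the <title> and <h1>..<h6> tags of one page."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.titles = []
        self.headings = []
        self._current = None  # the title/heading tag we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADING_TAGS:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current is None:
            return  # body text and whitespace are not indexed
        if self._current == "title":
            self.titles.append(text)
        else:
            self.headings.append(text)


def build_index(pages):
    """Map each lowercase word to the URLs whose title or headings contain it."""
    index = {"title": defaultdict(set), "heading": defaultdict(set)}
    for url, html in pages.items():
        parser = TitleHeadingParser()
        parser.feed(html)
        for word in " ".join(parser.titles).lower().split():
            index["title"][word].add(url)
        for word in " ".join(parser.headings).lower().split():
            index["heading"][word].add(url)
    return index


def search(index, field, word):
    """Return the sorted URLs whose `field` ("title" or "heading") has `word`."""
    return sorted(index[field].get(word.lower(), set()))


# Hypothetical pages, invented for this example.
PAGES = {
    "http://example.org/sailing": (
        "<html><head><title>Sailing Basics</title></head>"
        "<body><h1>Knots and Rigging</h1><p>Body text is ignored.</p></body></html>"
    ),
    "http://example.org/roses": "<title>Growing Roses</title><h1>Pruning Basics</h1>",
}

index = build_index(PAGES)
print(search(index, "title", "basics"))    # ['http://example.org/sailing']
print(search(index, "heading", "basics"))  # ['http://example.org/roses']
```

The same word finds different pages depending on whether the title index or the heading index is searched, which is the either-or choice described above. Body text never enters the index, so a search for "ignored" finds nothing.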

But of course, pre-search engines, there was no easy way for an internet user to even find out that Jumpstation existed.

Like everyone else interested in promoting their new website, he submitted it to the team behind Mosaic, the first widely popular web browser, who listed it on their “What’s New” page (anyone else remember that? I recall how exciting it was to check the page for the latest websites that had come into being – so many, so quickly, on so many subjects).

It featured there for only a day, but the power of the crowd was already emerging as an internet phenomenon, and word of mouth ensured internet users worldwide began to use Jumpstation.

The site quickly faded away, though, as Fletcher took a job in Hong Kong and found he didn't have time to work and have a personal life and also maintain the site. But it seeded the ground for what was to come – index sites such as Yahoo, early search engines such as Ask Jeeves, Excite and AltaVista, and ultimately Google, Bing and new challengers such as WolframAlpha and DuckDuckGo.

Future of searching
Towards the end of the discussion, Fletcher nailed the biggest challenge for today's search engines: "The amount of signal that everyone else is scrambling for is very small." Among the billions of web pages now online, the many junk ones are noise, obscuring the "signal" pages that contain good information in response to a query.

He believes the web is simply today’s presentation format for information online, in the same way that the command line interface, nothing but text on the screen, was once how people accessed the internet before the era of the world wide web.

“In my opinion, the web isn’t going to last forever,” he said. “But the problem of finding information is.”