Revving Google's search engine

Now a dominant player in the internet space, Google is turning its attention to mobile phone search and social media services, writes KARLIN LILLINGTON in Mountain View, California

EVERY TIME you initiate a search on Google, chances are Ben Gomes has had something to do with the replies you get.

A distinguished engineer at Google and a lead on its search projects, Gomes was born in Tanzania, graduated in computer science from the University of California, Berkeley, and joined Google 11½ years ago. At the time the company was “just search” and he has worked on most of the developments that have taken place in the area since then.

With Google today known as one of the dominant companies in the internet space – and increasingly moving into other areas, with its Android mobile platform and its Chrome browser and operating system – it’s easy to forget that it was once just a start-up with a new take on how to search the fast-growing web.

“One of the first challenges was just crawling the web adequately,” says Gomes. Initially, Google’s search “spiders” had to tackle what then seemed a vast 30 to 40 million webpages. “Now it’s hundreds of billions. We do in hours what used to take us weeks.”

At the time, Google’s now-famous PageRank algorithm and other developments introduced major innovations in how a webpage was ranked in search returns.

More recent innovations that Gomes says have increased search relevance include Google’s ability to return keywords in context – those highlighted words contained in a snippet from the webpage that enable a person to gauge quickly if a page is worth viewing.

Gomes says the snippets have made search seem “more alive” but initially posed a daunting storage and management challenge to overcome behind the scenes because an entire copy of the page has to be stored, and returned immediately as a snippet.
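
To illustrate the idea of keywords in context, here is a minimal Python sketch. It is not Google's implementation: the function name `make_snippet`, the `width` parameter and the asterisk highlighting are all assumptions made for this example. It shows why a stored copy of the page text is needed, since the excerpt is cut from that copy around the first matching term.

```python
import re


def make_snippet(page_text, query, width=60):
    """Return an excerpt around the first query term, with matches marked by asterisks.

    Hypothetical helper for illustration only; matching is by simple substring.
    """
    terms = [t for t in query.lower().split() if t]
    if not terms:
        return page_text[:width]

    # One pattern that matches any of the query terms, ignoring case.
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    match = pattern.search(page_text)
    if match is None:
        # No term found: fall back to the start of the page.
        return page_text[:width]

    # Cut a window of roughly `width` characters around the first match.
    start = max(0, match.start() - width // 2)
    end = min(len(page_text), match.end() + width // 2)
    excerpt = page_text[start:end]

    # Mark every matching term inside the excerpt.
    return pattern.sub(lambda m: "*" + m.group(0) + "*", excerpt)


page = "Initially the spiders had to crawl 30 to 40 million webpages to build the search index."
print(make_snippet(page, "search index", width=20))
```

A real engine would cut on word boundaries and choose the best passage rather than the first match; this sketch only shows the basic mechanics.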

Google increasingly also understands complex word contexts, as the algorithm evolves from simply returning keywords to actually understanding documents. “Give me what I mean, not what I just said,” is how Gomes sums up the challenge.

“It was a very hard problem, but it made searching more effective.”

As any clumsy typist knows, Google also added in elements such as a spelling correction algorithm a few years ago, which may seem a trivial if useful addition.

However, Gomes recalls that this seemingly subtle change posed a major technical challenge.

Before the spelling correction algorithm was folded in, a person who mistyped “Britney Spears” ended up “in this weird sort of purgatory of all the people who have misspelled it”, rather than the pages with the correct spelling. Now, the engine automatically returns results for correct spellings, too.
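
One way to picture the idea is that past searches themselves reveal the usual spelling. The sketch below is an assumption-laden illustration, not Google's algorithm: the `QUERY_COUNTS` table and its numbers are invented, and the standard library's `difflib.get_close_matches` stands in for whatever similarity measure a real engine uses. It picks the most frequently searched query among those that look similar to what was typed.

```python
import difflib

# Hypothetical counts of how often each query appeared in past search logs.
QUERY_COUNTS = {
    "britney spears": 9000,
    "brittany spears": 120,
    "britney speers": 40,
    "britain spares": 3,
}


def suggest(query, counts=QUERY_COUNTS, cutoff=0.8):
    """Return the most popular logged query that closely resembles `query`.

    If nothing in the log is similar enough, the query is returned unchanged.
    """
    query = query.lower().strip()
    candidates = difflib.get_close_matches(query, list(counts), n=5, cutoff=cutoff)
    if not candidates:
        return query
    # Among similar-looking queries, prefer the one people type most often.
    return max(candidates, key=counts.get)


print(suggest("brittney spears"))
print(suggest("weather dublin"))
```

The design choice worth noting is that no dictionary is involved: the correction comes from what other searchers typed, which is why proper names can be handled at all.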

How are decisions made to alter how the algorithm works or to add new features? Gomes says it’s a mixture of people within Google coming up with ideas, and watching how users utilise the search engine and the types of problems they encounter.

For example, the spelling correction addition happened because engineers examined the session logs for search sessions “and were seeing that these are problems people are having”.

The company also encourages people to spend time coming up with fresh ideas, holding regular in-house demonstration days and allowing anybody to work on so-called “20 per cent projects” – giving 20 per cent of their time to an idea that is not part of their normal day-to-day job.

“Google Goggles”, a search feature that lets people use their mobile to snap an image which can then be sent to and analysed in various ways by Google, is an example of a 20 per cent project.

The goal is “to stimulate engineers, but also to have them be guided by the user experience.”

While many new features of the engine are formally announced or quickly noticed by users and get a lot of press and attention – such as Google’s new “+1” social media service that is currently being rolled out – a vast number of tweaks, additions, and changes are quietly tested throughout the year.

More than 20,000 potential changes are evaluated by Google annually. Of these, about 6,000 go into live experiments on the site, with engineers closely watching the results. About 50 to 200 experiments might be ongoing at any one time, Gomes says.

During the course of the year, more than 500 changes of some sort are made to the search algorithm.

“We may not announce them, but suddenly your search got better,” he says.

However, some users may feel that search doesn’t always get better – generally, if it causes a change in the ranking of their own websites. A vast amount of web discussion ensues every time a noticeable tweak is made to the algorithm. The company is well aware that there is a whole science – or perhaps many would say, alchemy – centred around people trying to figure out how to get their websites ranked higher in Google, a technique known as SEO, search engine optimisation. A number of elements are known to affect ranking, including how words are used in the titles of a website, the overall structure of the site, the actual content, and how many other sites link to a given site.

But many also try to “game the system”, notes Gomes, most often in order to push up spam-filled sites that hope to make money off of Google’s Adwords programme or sites filled with malware. These have been a constant headache for Google, as the company tries to keep a step ahead.

Past techniques that Google has tried to tackle include “keyword stuffing”, where words that feature in popular searches but have nothing to do with the actual webpage might be added in white type on a white webpage background so that they are invisible to viewers, but picked up by the search engine; and “cloaking”, where the search engine thinks it is analysing one page, but when users click on the link, they are served up a completely different page.
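
The white-on-white trick can be made concrete with a small Python sketch. This is a simplified illustration and not how Google detects such pages: the class `HiddenTextFinder` and the function `find_hidden_text` are hypothetical names, the page background is assumed to be a known colour, and only inline `style` attributes are inspected (real pages also use stylesheets and scripts).

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class HiddenTextFinder(HTMLParser):
    """Collect text whose inline text colour equals the assumed page background."""

    def __init__(self, background="#ffffff"):
        super().__init__()
        self.background = background.lower()
        self.stack = []        # one flag per open element: is its text hidden?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Parse the inline style into property/value pairs.
        style = dict(attrs).get("style") or ""
        declarations = {}
        for part in style.split(";"):
            if ":" in part:
                name, value = part.split(":", 1)
                declarations[name.strip().lower()] = value.strip().lower()
        if "color" in declarations:
            hidden = declarations["color"] == self.background
        else:
            # No colour of its own: inherit from the enclosing element.
            hidden = self.stack[-1] if self.stack else False
        self.stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html, background="#ffffff"):
    """Return the pieces of text drawn in the same colour as the background."""
    finder = HiddenTextFinder(background)
    finder.feed(html)
    return finder.hidden_text


sample = (
    '<body><p>Visible text</p>'
    '<div style="color: #FFFFFF">cheap flights <b>celebrity news</b></div></body>'
)
print(find_hidden_text(sample))
```

Text that a reader cannot see but a crawler can is the signal here; what a search engine then does with that signal is outside the scope of this sketch.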

Most recently, Google has tried to take on so-called “content farms”, sites filled with poor quality content on all sorts of topics designed to game the algorithm, get high Google rankings, and make money off of Adwords.

Ongoing challenges for Google are to incorporate social media effectively in search returns, and to develop an effective social media service itself, having failed with its first two tries, Wave and Buzz. Its latest attempt is “+1”, alongside Google+.

Translation and speech recognition are also up-and-coming areas for search, says Gomes – a speech recognition element already exists and works on desktops and mobile phones, but is expected to have greater impact on mobiles.

Search should also become increasingly intelligent, with a major goal being to have users able to simply ask a question rather than focus on keywords.

Making search more local is also in Google’s sights – not least because the company realises mobile search is probably going to overtake desktop search in the future, with a whole different range of user needs to address.

“We have to think of it as a spectrum of search,” says Gomes. “It’s so important to ease that connection between mind and machine.”