The Third Generation of Search

By Greg McNevin

January/February Edition, 2008: IDM kicked off 2007 with a look at Wikia’s new social search engine project, at the time informally dubbed Wikiasari, which was taking its first steps into a brave new world of search. Now, one year later, an Alpha version of Wikia Search has appeared, but it is not alone in its push to become the third generation search engine of choice…

On January 7, 2008, Wikipedia founder Jimmy Wales’ much-anticipated search engine project Wikia Search finally launched, giving us a glimpse into one possible future for search – even if it was greeted with a chorus of criticism rather than applause.

Like the community-edited Wikipedia, Wikia Search is an attempt to bring the power of open source and social networks to the world of search, giving users the power to rank results against search terms rather than letting an algorithm decide what is suitable.

The engine still uses established web crawling techniques to create its index, with the Alpha version sporting a 50-100 million page “placeholder” index according to Wales. This is where criticism of the project has begun, with some commentators claiming search results are disappointing at best.

Because improvement comes via community vetting of search results, the current lack of community involvement naturally leaves results - already limited by the incomplete index - hamstrung. Wales himself notes that Wikia Search is far from prime time, saying it’s a project to build a search engine, not a real search engine – yet.

“We want to make sure people understand that it's in its very early days,” says Wales, adding that he expects it to be up to two years until the engine is producing results rivalling the accuracy of other industry heavyweights.

Because Wikia Search becomes more accurate the more it is used, it could eventually give the Google empire a run for its money. The question, however, is whether Wikia’s vision will end up being the next generation of search technology.

First generation search engines crawled pages and stored keywords in databases, matching search terms with these keywords to deliver results to users. AltaVista was catapulted to worldwide popularity in this way. However, it was Google’s innovative algorithm that took keyword search a step further, prioritising results based on how many other sites link back to a particular page.

This assumption that more linkbacks make a page more useful quickly saw Google overtake AltaVista and ushered in a second generation of intelligent search that the company has been quietly refining since.
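The two generations described above can be illustrated with a minimal sketch. It assumes a tiny, made-up corpus (the `PAGES` dictionary and its `.example` addresses are hypothetical) and ranks keyword matches purely by the number of other pages linking to them. That is the simplified idea as described here, not Google’s actual algorithm.

```python
from collections import defaultdict

# Hypothetical toy corpus: page -> (page text, pages it links to).
PAGES = {
    "a.example": ("cheap gold rings for sale", ["b.example"]),
    "b.example": ("gold rings and silver rings guide", []),
    "c.example": ("history of gold mining", ["b.example", "a.example"]),
}

def build_keyword_index(pages):
    """First generation: map each keyword to the pages that contain it."""
    index = defaultdict(set)
    for url, (text, _links) in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def count_inbound_links(pages):
    """Count how many other pages in the corpus link to each page."""
    inbound = defaultdict(int)
    for url, (_text, links) in pages.items():
        for target in set(links):
            if target != url and target in pages:
                inbound[target] += 1
    return inbound

def search(query, pages):
    """Match every query word, then put the most linked-to pages first."""
    words = query.lower().split()
    if not words:
        return []
    index = build_keyword_index(pages)
    inbound = count_inbound_links(pages)
    matches = set.intersection(*(index.get(word, set()) for word in words))
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(matches, key=lambda url: (-inbound[url], url))
```

Calling `search("gold rings", PAGES)` matches two pages on keywords alone; the link count then decides which of them is listed first.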

According to the analysts at Comscore, around 60% of searches worldwide are conducted via Google. Yahoo! is its closest competitor with a paltry 14% in comparison, while Microsoft hangs even further back near the horizon with 4%.

Now that Google has highlighted the astronomical potential of search, there is of course no shortage of competitors jostling for attention behind the big three. Many have amazingly innovative search platforms; however, the killer feature to define the next generation of search is still to be decided.

Some are betting that even smarter word search will provide the leap, others are looking at search focusing technology and guided queries to increase accuracy, and some, like Wikia, have decided that marrying the mind and the machine is the best way to go.

For smarter word search, engines such as Xerox’s FactSpotter and the Sydney-based Lexxe are attempting to improve results via natural language processing technologies, improving keyword search by looking at search terms contextually. In the case of Xerox’s engine, the meanings of words can be analysed and searches phrased in everyday language are accepted. For example, a search for documents that reference a person, say Larry Page, will also return results where the pronoun "he" is used instead of Page's full name.

Xerox claims the technology takes advantage of the way humans think, speak and ask questions, going beyond today's typical keyword search or data-mining programs and extracting the concepts and relationships among terms.

When it comes to focussing results and guiding queries, engines such as Spock, Yahoo! and Ask.com are working to improve on Google’s “search everything” way of scouring the web, delivering search results based on smaller, topic-sensitive indexes. They are also working on improved search term suggestions, enabling users to steer searches in particular directions. For example, if a user searched for “Rings”, results could be steered towards jewellery or movies rather than being returned as one dense listing of every match.
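The “Rings” example can be sketched as follows. This is only an illustration of the general idea: the topic names, index contents and function names are all hypothetical, and none of the engines named above is known to work this way internally.

```python
# Hypothetical topic-sensitive indexes: topic -> {search term -> pages}.
TOPIC_INDEXES = {
    "jewellery": {
        "rings": ["jeweller.example/wedding-rings"],
        "necklaces": ["jeweller.example/necklaces"],
    },
    "movies": {
        "rings": ["films.example/lord-of-the-rings"],
        "trailers": ["films.example/trailers"],
    },
    "astronomy": {
        "saturn": ["space.example/saturn"],
    },
}

def suggest_topics(term, topic_indexes):
    """Offer the user every topic whose index knows the search term."""
    term = term.lower()
    return sorted(topic for topic, index in topic_indexes.items() if term in index)

def search_in_topic(term, topic, topic_indexes):
    """Search only the smaller index for the topic the user picked."""
    return list(topic_indexes.get(topic, {}).get(term.lower(), []))
```

A search for “Rings” would first offer the topics returned by `suggest_topics`, and the user’s choice then decides which index `search_in_topic` consults.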

Finally, social search engines like Sproose and Wikia Search are pushing for a future where humans and their infinitely more complex brains work with sophisticated algorithms, voting on results and hopefully improving search results. This method of humanising results appears to be nosing ahead in some respects, particularly considering Google has itself been playing with the concept in its Google Labs testing grounds.

Rather than standing around waiting for an upstart to take its crown, Google has an army of software engineers tirelessly tinkering with its engine. It has added many features similar to those some third generation engines have brought to the table, the most obvious being spelling and search refinement suggestions, result filtering based on past clicks (provided you have signed up to have your results monitored), specialised searching in a host of niches including government sites, movies and images, and even image results limited to human faces thanks to some sophisticated imaging technology.

Beyond these incremental improvements, the idea of social networking and search focussing appears to be gaining traction with the search giant. Google Labs has recently tested a voting system that gives registered users some of the same features Wikia Search is attempting to imbue in its engine. Results could be voted on for usefulness, and when the same search was conducted again, the well-rated results gained more weight while the poorly rated ones no longer appeared on the main page.
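A voting scheme of this kind can be sketched in a few lines. The function name, the vote store keyed by query and page, and the page size are assumptions made for illustration; the details of the Google Labs experiment are not described here.

```python
def rerank_with_votes(query, baseline, votes, page_size=10):
    """Reorder algorithmic results using net user votes for this query.

    baseline: pages in the order the algorithm ranked them.
    votes: hypothetical store mapping (query, page) -> up votes minus down votes.
    Returns (main_page, later_pages).
    """
    def net(page):
        return votes.get((query, page), 0)

    # Pages voted down overall are kept off the main page entirely.
    kept = [page for page in baseline if net(page) >= 0]
    demoted = [page for page in baseline if net(page) < 0]

    # sorted() is stable, so pages with equal votes keep their algorithmic order.
    kept = sorted(kept, key=lambda page: -net(page))

    return kept[:page_size], kept[page_size:] + demoted
```

With votes of +5 for one page and -2 for another, the first rises to the top of the main page and the second is pushed to the later pages, while unvoted results keep the order the algorithm gave them.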

While this socialised Google was a short-lived experiment, the company’s Co-op program, a customisable version of Google’s engine introduced in mid-2006, has been quite successful.

The service enables registered users to create their own niche search engine and incorporate it into any page. Results are restricted to sites specified by the owner, or weighted to preference certain sites or pages over others in general searches. This kind of specialised search has been hailed as the future by some, who claim that it will replace web directories.
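The two modes described above, restricting results to chosen sites and weighting some sites above others, can be sketched like this. It is an illustration only and does not use Google’s actual service or API; the function names, relevance scores and weights are hypothetical.

```python
from urllib.parse import urlparse

def host_of(url):
    """Return the host name of a URL, or an empty string if it has none."""
    return urlparse(url).hostname or ""

def restrict_to_sites(results, allowed_sites):
    """Keep only results from the sites the engine's owner has specified."""
    return [(url, score) for url, score in results if host_of(url) in allowed_sites]

def boost_sites(results, site_weights):
    """Multiply each score by its site's weight (default 1.0), then re-sort."""
    rescored = [
        (url, score * site_weights.get(host_of(url), 1.0))
        for url, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

Here `results` is assumed to be a list of `(url, score)` pairs from a general search. A niche engine would pass it through `restrict_to_sites`, while a weighted one would pass it through `boost_sites` so that preferred sites rise without the others being removed.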

The real third generation of search is already out there, with many of the engines mentioned above already dubbing themselves just that. However, as AltaVista defined the first generation, and Google the second, there is currently no single third generation engine that can best Google as it bested AltaVista to usher in the next era of search.

There are plenty of possibilities, but with Google itself experimenting with social search, the smart money may still be on Wikia Search’s community-based effort, despite the initial criticism. After all, everyone loves an underdog.
