Yahoo! Birth of a New Machine

By Chris Sherman, Executive Editor, February 18, 2004

Yahoo is rolling out a brand new search engine today, with its own index and ranking mechanisms, casting aside its long-standing use of Google-powered search results. The move is bound to roil the industry and sets in motion a new race for the claim of web search champion.

A longer version of this article that goes into more detail about the new Yahoo search engine, including details about its indexing process, paid inclusion programs, and other details of importance to webmasters, is available to Search Engine Watch members.

Ever since Yahoo's acquisition of Inktomi nearly a year ago, speculation has focused on when the company would replace its Google-powered search results with those from the Inktomi index.

In a surprising move, Yahoo isn't replacing Google with Inktomi. Rather, the company has developed a brand new search engine, drawing on the lessons learned from what the company calls the "critical mass" of search engineering talent that it has brought together through hiring and acquisitions, as well as investment in infrastructure and product quality.

"High quality, talented search engineers are in very short supply these days," said Jeff Weiner, Yahoo's senior vice president of search and marketplace. "Regardless of how good your planning process is, at the end of the day it comes down to people and chemistry."

Weiner said Yahoo has waited until now to make the switch from Google to be certain users would have the best experience possible after the transition. "It was absolutely essential to us that we had a roadmap in place that not only let us sustain our quality, but build on it."

Although the switch to self-powered search results is radical, Yahoo has steadily made incremental improvements in its search capabilities for more than a year. In October 2002, the company made the most significant change to its operation since its birth, replacing its human-compiled directory listings with Google search results.

Then in April of last year, the company rolled out its new Yahoo Search, introducing a streamlined search page. It also added new tabs to search result pages offering access to its directory listings, news, images, and yellow pages.

Today's launch is the beginning of a progressive rollout that will take place over the next few weeks. It is also the beginning of numerous planned enhancements focusing on web search, personalization and vertical search.

It's important to note that the new search engine is for web results only. Image search is still powered by Google, and News search is still a combination of Yahoo's own editorial and technological resources.

How does the new Yahoo search engine differ from Google? The presentation of the results is very similar. Yahoo has wisely opted to keep things looking mostly the same, with a few exceptions. There's a link to the cached copy of each indexed page -- now being served from Yahoo, not Google. Just about everything else on search result pages looks the same.

How much the actual results returned by Google and Yahoo differ depends on the query. For popular or common queries, there seemed to be very little difference between the two engines in the top few results. But once you get past those, the results tend to diverge dramatically. And for less common or non-popular queries, Yahoo results look quite different from Google results.

While Yahoo and Google are likely using similar algorithms, one reason for the differences in what's displayed is that Yahoo's email and search teams are now working together to leverage what they've learned about spam. Since Yahoo Mail processes billions of email messages, this knowledge is likely quite helpful in providing Yahoo with a much deeper understanding of the characteristics of spam -- and helping keep the nasty stuff out of the web page index.

Bottom line: I'm impressed with the quality of results that Yahoo is delivering. It's a very viable alternative to Google and the other "last engine standing," Ask Jeeves/Teoma.

What's Being Indexed?

The Yahoo Search index is capturing the full text of web pages, up to a 500K limit. This is greater than the 101K maximum indexed by Google. A broad range of file types, including HTML, PDF, and Microsoft Office documents, is also included in the mix.
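
To make the effect of a per-page cap concrete, here is a minimal sketch of how such a limit might be applied. It is an illustration only, not Yahoo's or Google's actual code: it assumes "500K" and "101K" mean kibibytes of raw page data and that text past the cap is simply dropped. The function names `indexable_text` and `fraction_indexed` are hypothetical.

```python
# Hypothetical illustration of a per-document indexing cap.
# Assumption: "500K" and "101K" are read as KiB of raw page bytes.
YAHOO_LIMIT_BYTES = 500 * 1024
GOOGLE_LIMIT_BYTES = 101 * 1024


def indexable_text(page_bytes: bytes, limit: int = YAHOO_LIMIT_BYTES) -> str:
    """Return the part of a page that falls under the byte cap."""
    truncated = page_bytes[:limit]
    # errors="ignore" drops a multi-byte character cut in half at the cap.
    return truncated.decode("utf-8", errors="ignore")


def fraction_indexed(page_bytes: bytes, limit: int) -> float:
    """Share of the page (by bytes) that would be captured under the cap."""
    if not page_bytes:
        return 1.0
    return min(len(page_bytes), limit) / len(page_bytes)


if __name__ == "__main__":
    page = b"a" * (600 * 1024)  # a synthetic 600 KiB page
    print(f"{fraction_indexed(page, YAHOO_LIMIT_BYTES):.0%} under a 500K cap")
    print(f"{fraction_indexed(page, GOOGLE_LIMIT_BYTES):.0%} under a 101K cap")
```

Under these assumptions, words that appear only near the end of a very long page would be findable with the larger cap but not with the smaller one.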

How big is Yahoo's index? They aren't saying, despite Google's announcement yesterday that it has expanded its index to nearly 4.3 billion documents (6 billion, if you count images and newsgroup postings, as Google does).

Interestingly, in almost all of my tests with random queries, Yahoo reports more results found than Google. Does this mean that Yahoo's index is bigger? Perhaps -- but reported results are estimates, not exact counts. They also can include factors other than keyword matches and so are notoriously unreliable measures of overall index size. Suffice it to say that Yahoo's index is comparable to Google's for most queries.

"We're very confident in the quality and size of our index, and we think the results speak for themselves," said Weiner.

What About AltaVista and AlltheWeb?

Last year, before Yahoo acquired Overture, Overture itself was busy acquiring AltaVista and AlltheWeb. Speculation at the time was that Overture would kill off AltaVista's technology, and power both search sites using the AlltheWeb index.

To the contrary, both search engines continued to maintain their own independent indexes. Then, in July 2003, Yahoo bought Overture. Less than a month later, Search Engine Watch editor Danny Sullivan and I visited AltaVista and AlltheWeb, and learned that the plan was to unify the two search engines, keeping the strongest technologies from both.

That was exciting news. But then nothing seemed to change. Today, both AltaVista and AlltheWeb continue to maintain separate indexes, and Yahoo isn't saying publicly whether this will change with the introduction of the new Yahoo Search Technology index.

What's Coming Next

In addition to continually working to improve the quality of its web search results, Yahoo plans to put particular emphasis in the coming months on personalization and vertical search. The company's My Yahoo portal already offers extensive content customization options.

The newly released SmartSort option in Yahoo Shopping, which provides very specific product advice for digital cameras, MP3 players, computers and other electronic devices based on criteria you enter, is one example. The ability to add RSS feeds to your My Yahoo page is another.

"Ultimately we want to understand the intention of the user, and I think we're going to get closer to that through personalization," said Weiner.

In the vertical search arena, Yahoo plans to focus on local, travel, personals, and its HotJobs search portal.

But these moves are clearly just the beginning of many more to come at Yahoo. "Over time you're going to see Yahoo extend our search technology, and ultimately into our media properties," said Weiner. "To a large extent that will help drive our growth."

And give Google, Ask Jeeves, and Microsoft's fledgling web search initiative good reason to be even more attentive to the quality of their search results. The coming year promises to be a very good one for searchers.


