Search Engine Watch Reader Q&A: June 2003

Answers to questions from Search Engine Watch readers.

Q. I work for a UK company and would like to optimize our search engine performance in both the UK and US. However, I’m unclear as to whether I need to pay inclusion fees and paid listing fees to both the US and UK versions of the companies listed. For example, would I need to pay for Yahoo Express twice, once in the US and once in the UK, so that I was listed in both the UK and the US?

A. Someday, I’ll finish writing a piece I once started that explores the basics of US/UK submission in more depth. It can be very complicated. In general, for crawler-based search engines, getting into them once anywhere is enough. If you are in Inktomi, you’re in Inktomi everywhere. But paid inclusion can help get you into UK-specific versions of Inktomi, for when people do country-specific filtering. Similarly, Ask Jeeves recently began using what should be a UK-specific version of its Teoma-owned results. As for Yahoo, my advice in the past has been that if you target a global audience, the US version is the place to be. It gets you into any English-language version of Yahoo. If you just want the UK, then do Yahoo UK. It’s cheaper, plus you do actually get included in the US version, though as a regional listing. I’d also recommend taking a look at the UK-focused submission guide put together by Martin Belam.


Q. Is there a search engine that would allow me to see web sites as they were presented in the past? I want to see a website as it was displayed in 1999 or 2000.

A. Yes, there is! The Internet Archive Wayback Machine provides this type of access. You won’t find that every page has a “history,” because some site owners have opted out over time. But it is very good.


Q. I recognize that search engines look at text and where it falls in the HTML code so that it can be assessed for prioritization. We use .jsp pages, and our HTML coding doesn’t start until line 89, with JavaScript preceding it. Consequently, when one loads up our homepage and does a view source from a browser, there is a lot of white space before one actually sees the HTML code. Is this affecting my text prioritization and if so, is there anything I can do about it?

A. White space doesn’t matter, so don’t worry about that. As for your JavaScript code, search engines might accidentally read this code and assume it is body text. If they do this, then yes, your “real” body text might be seen as further down on the page and perhaps not ranked as highly. It’s probably not a huge issue for you, but moving the code into external .js files should help solve this problem and give you extra peace of mind, if you can do it. The Hiding JavaScript page covers this and other JavaScript issues in more detail.
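To make the point about external .js files concrete, here is a minimal Python sketch. The two page sources, the file name rollover.js and the helper offset_of_body_text are all made up for illustration. It only measures how many characters of source come before the first body text, which is a rough stand-in for "how far down the page" a crawler that reads script code as text might think your copy is. It does not model how any particular search engine weighs position.

```python
# Hypothetical page with the JavaScript written inline in the head.
INLINE_PAGE = """<html><head><title>Widgets</title>
<script type="text/javascript">
function rollover(id) { document.getElementById(id).className = "on"; }
function rollout(id) { document.getElementById(id).className = "off"; }
</script>
</head><body><p>Hand-made widgets</p></body></html>"""

# The same hypothetical page with the code moved to an external file.
EXTERNAL_PAGE = """<html><head><title>Widgets</title>
<script type="text/javascript" src="rollover.js"></script>
</head><body><p>Hand-made widgets</p></body></html>"""


def offset_of_body_text(source, phrase):
    """Return how many characters of source precede the given body phrase."""
    return source.find(phrase)


inline_offset = offset_of_body_text(INLINE_PAGE, "Hand-made widgets")
external_offset = offset_of_body_text(EXTERNAL_PAGE, "Hand-made widgets")
print("inline:", inline_offset, "external:", external_offset)
```

With the code in an external file, the body copy starts earlier in the source, so there is less for a confused crawler to wade through first.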


Q. In order to help search engines crawl our web site, will it help us to spell out keywords in URLs fully or with a dash or underscore? For example, what’s best:

A. Using dashes or underscores won’t help with crawling. However, they might — might — give you a tiny bit of weight in terms of ranking well. Very tiny. If it makes sense to use them for users, go ahead — and I find that dashes are more readable. Also be sure to read the ABCs and URLs page for more about this issue.


Q. If I accidentally let a domain expire and then repurchase it, what do I have to do to get Google to count the old links to my site again?

A. As explained in the Coping With GDS, The Google Dance Syndrome article, expired domains lose whatever link “credits” they’ve built up if purchased by someone else. And even though you are the same owner, it’s still likely Google won’t realize this automatically and reestablish those credits. Instead, you’ll have to write to Google and ask them to do it manually.


Q. Lycos InSite advertises that they distribute to foreign language sites, such as T-online. Do they do localization or page translation for me? Or is this just a case of a company advertising something that it really doesn’t provide but has allegiances with in other regards?

A. Lycos InSite Paid Inclusion is a service that makes it easy to get your pages listed in the Inktomi and AllTheWeb (formerly FAST) indexes, through paid inclusion. These indexes are used by a variety of search engines worldwide. So when you get in them, your pages possibly will show up in front of users searching at these different search engines.

For instance, Lycos InSite tells you that being listed with AllTheWeb (they still say FAST) provides, “submission to all these great search engines: Dogpile, Metacrawler, T-online, BT: Looksmart, ItaliaOnline,, Lycos, AllTheWeb, Excite, Infospace, Webcrawler, HotBot and more.”

So what’s this T-Online? It’s a major ISP in Germany, sort of like AOL is in the US. And those using T-Online may use the search engine offered on its home page. That search engine has paid listings from Overture Germany and unpaid results that come from AllTheWeb. So if you are listed in AllTheWeb, your page is available to those searching at T-Online.

Of course, many of those searching at T-Online might be searching in German. Thus, if your page isn’t written in German, then you may get little benefit from being included with this particular portal. In addition, other sites using AllTheWeb or Inktomi results might weight the results to favor pages from a particular country or in a particular language. For example, the BBC search engine uses Inktomi results weighted to favor UK sites.

Overall, it’s nice to be available worldwide, but unless you’ve specifically created content targeting non-US and perhaps non-English speakers, don’t expect to necessarily rank well in some of these other places. The Country And Languages page provides tips about country and language targeting.


Q. What are the best resources for finding information about the technology behind the leading web search engines?

A. I keep a guide to resources and search tech-oriented articles on the Search Engine Technology page.


Q. When a search engine spider crawls across your site, does it count as one unique visitor or multiple unique visitors? We’ve seen some large increases in site visitors in recent weeks, which coincide with a spate of submissions of our URLs to search engines.

A. This will largely be determined by the log analysis or other measuring tools that you use. Let’s say a spider comes in and grabs some pages over a five-minute period, then leaves and comes back the next day for another five minutes. A log tool might consider each session to be from a different “visitor,” so you’d have two unique visitors in all. This is especially so if the spider comes in the second time using a different IP address. But if it kept the same IP, then some tools would be smart enough to know that these were two visits from the same unique “person.” Alternatively, some tools might mistakenly consider each page request to be from a different visitor. It’s best to follow up with your log or measuring tool vendor to determine exactly how they define a visitor. Plus, look at a list of those visitors. If they are from search engines, you may be able to tell this from the IP address.
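Here is a rough Python sketch of how the same spider activity can produce three different “visitor” counts. Everything in it is an assumption for illustration: the log records, the IP address, the 30-minute session gap and the three counting functions are hypothetical, and your own log tool may use different rules entirely.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (ip_address, timestamp, path).
# One spider grabs pages for a few minutes, then returns the next day.
REQUESTS = [
    ("10.0.0.1", datetime(2003, 6, 1, 9, 0), "/"),
    ("10.0.0.1", datetime(2003, 6, 1, 9, 2), "/products.html"),
    ("10.0.0.1", datetime(2003, 6, 1, 9, 4), "/contact.html"),
    ("10.0.0.1", datetime(2003, 6, 2, 9, 0), "/"),
    ("10.0.0.1", datetime(2003, 6, 2, 9, 3), "/products.html"),
]

# Assumed rule: a pause longer than this starts a new session.
SESSION_GAP = timedelta(minutes=30)


def count_by_request(requests):
    """Naive tool: every page request is treated as a new visitor."""
    return len(requests)


def count_by_ip(requests):
    """Tool that treats each distinct IP address as one unique visitor."""
    return len({ip for ip, _, _ in requests})


def count_by_session(requests):
    """Tool that counts a new visitor each time an IP returns after a long gap."""
    sessions = 0
    last_seen = {}
    for ip, when, _ in sorted(requests, key=lambda record: record[1]):
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_GAP:
            sessions += 1
        last_seen[ip] = when
    return sessions


print(count_by_request(REQUESTS), count_by_ip(REQUESTS), count_by_session(REQUESTS))
```

The same five requests come out as five, one or two visitors depending on the rule, which is why it is worth asking your vendor which rule applies.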


Q. Do frames still matter given that “paid listings” are the norm these days? In other words, should I still avoid using frames when designing web sites?

A. If you can eliminate frames, you’ll increase the odds that more of your pages will get indexed by search engines for free, likely providing you with more free traffic. So yes, avoiding them can still be helpful. Of course, you are correct that with paid listings you can simply buy your presence on search engines, if frames are making you invisible. However, if you run short of money, your presence will end. That’s why I’d advise doing both things: dropping frames to try to get as much free traffic as is likely to come to you organically or naturally, while also running a paid campaign to help plug holes where you don’t get a free presence. Be sure to see my frames tutorial for more help with frames.


Q. I’m in the process of trying to find a company that can provide search engine software, which will allow me to set up my own search engine to search about 500 websites (about 300,000 pages and 40 million words).

A. You might find some resources at the SearchTools web site and perhaps among the articles listed on the Search Engine Software page I maintain. Most material is about enterprise search solutions, however.


Q. Is labeling graphics with applicable keywords considered spam?

A. ALT text is a way for you to associate words with graphics. This is something that definitely should be done to help those who cannot view images. However, sometimes people have tried to “stuff” ALT text with repetitive keywords, in hopes of increasing search engine rankings. Some search engines that index ALT text may view this as spamming. In short, if you have a picture of a dog and you write ALT text that says, “Picture of a dog,” no problem. But if you say, “dog dog dog dog dog dog dog dog,” then you might get penalized.
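If you want a quick sanity check on your own ALT text, something like the following Python sketch could flag obviously repetitive strings. The function looks_stuffed and its thresholds (max_share and min_words) are invented for this example. No search engine has published a rule like this, so treat it as a rough self-check and not as a model of how penalties are decided.

```python
from collections import Counter
import re


def looks_stuffed(alt_text, max_share=0.5, min_words=4):
    """Flag ALT text in which a single word makes up most of the words.

    max_share and min_words are arbitrary, assumed thresholds.
    """
    words = re.findall(r"[a-z0-9']+", alt_text.lower())
    if len(words) < min_words:
        return False  # Too short to judge either way.
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words) > max_share


print(looks_stuffed("Picture of a dog"))
print(looks_stuffed("dog dog dog dog dog dog dog dog"))
```

The descriptive ALT text passes, while the repeated keyword is flagged.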


Q. Lately I’ve been getting this error whenever I submit a site to DMOZ: “Internal Server Error.” I’m not sure why, and all my emails to DMOZ are going unanswered. Do you have any idea what might be up?

A. The Open Directory is a volunteer project that gets minimal resources from its parent company, AOL Time Warner. The error you are getting means that the system is likely too busy to handle your submission request. I’d keep trying, and you might also seek help from the Open Directory Public Forum. A quick visit there myself found a thread that confirms what I suspected — the servers are busy.


Q. I was looking at your article on German Search engines, and I wondered if you could help me. I am looking for information regarding the best pay per click search engines to target in Germany. I was thinking of something like the “Overture” or “Google AdWords” of the German search engine world.

A. I don’t know the German market in depth, which is why I’d encourage you to explore some of the resources that were listed in that article. These are sites that watch German search engines very closely. However, I can say that buying Overture Germany makes sense, as the company has important distribution partnerships in the German market. Similarly, Google and Google Germany both have a large number of German and German-speaking users, so purchase Google AdWords at Google and target the ads at Germany and German speakers. Finally, I would also consider Espotting Germany, as they also have important distribution in Germany.


Q. I’m doing a site where 95 percent of the content is generated by JavaScript. The only HTML text is in the title and meta description tag. Will the crawlers “see” the JavaScript-generated HTML or only the JavaScript source?

A. Some spiders may not see any of the JavaScript at all, since they may ignore it if it is hidden within comment tags. However, the Hiding JavaScript page explains how this isn’t always perfect. So, there is a chance that they will index some of the JavaScript code as if it were ordinary text. For a human visitor, that code will execute to bring copy into the page itself. For a crawler, this almost certainly won’t happen. In other words, while they may index the code, it won’t execute, so the copy it is supposed to call into the page will remain invisible.
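The Python sketch below shows the difference in a toy form. It uses the standard library’s html.parser module to read a made-up page the way a simple text-only crawler might. The class TextOnlyCrawler, the helper visible_text, the index_scripts switch and the sample page are all hypothetical, and real spiders are certainly more involved. The point is only that nothing here runs the script, so the heading it would write never exists as text on the page.

```python
from html.parser import HTMLParser


class TextOnlyCrawler(HTMLParser):
    """Toy crawler that collects text and never executes JavaScript."""

    def __init__(self, index_scripts=False):
        super().__init__()
        self.index_scripts = index_scripts  # True mimics a spider that reads code as text.
        self.in_script = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script and not self.index_scripts:
            return  # Script source is skipped entirely.
        if data.strip():
            self.text.append(data.strip())


# Hypothetical page whose only body copy is generated by JavaScript.
PAGE = """<html><head><title>Acme Widgets</title></head>
<body>
<script type="text/javascript">
document.write("<h1>Hand-made widgets since 1950</h1>");
</script>
</body></html>"""


def visible_text(source, index_scripts=False):
    """Return the text chunks the toy crawler would collect from source."""
    parser = TextOnlyCrawler(index_scripts=index_scripts)
    parser.feed(source)
    parser.close()
    return parser.text


print(visible_text(PAGE))
print(visible_text(PAGE, index_scripts=True))
```

With scripts skipped, only the title text is collected. With index_scripts turned on, the raw document.write code is collected as if it were copy, which is the accidental indexing described above. In neither case does the crawler see a rendered heading.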


Q. One of my clients has signed with a traffic-boosting company that posted 20 copies of each web page contained in her site (renamed slightly) to the server as the primary method for boosting traffic. Is this a legitimate practice in the eyes of the search engine community?

A. Posting duplicate or near-duplicate “mirror” pages in volume like this, for only the purposes of getting traffic from search engines, would be deemed as spam by most major search engines.
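To give a feel for how near-duplicate pages can be spotted, here is a small Python sketch that compares pages by overlapping three-word sequences (often called shingles) and a Jaccard overlap score. This is a generic, well-known technique that I am using for illustration, not a description of what any particular search engine does. The sample sentences and the function names shingles and jaccard are made up.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(a), shingles(b)
    if not set_a and not set_b:
        return 1.0  # Two texts too short to shingle are treated as identical.
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical page copy: an original, a slightly renamed mirror, an unrelated page.
ORIGINAL = "we sell hand made widgets and ship them worldwide every day"
MIRROR = "we sell hand made widgets and ship them worldwide every week"
UNRELATED = "our family bakery opened its doors in a small town long ago"

print(jaccard(ORIGINAL, MIRROR))
print(jaccard(ORIGINAL, UNRELATED))
```

A lightly edited mirror page scores close to the original, while an unrelated page scores at or near zero, so 20 renamed copies of the same page would be easy to group together.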


Q. Can you help me find an “open source search engine”? Could you tell me where I can get search engine software for Linux?

A. The best resource for help with site-specific search engines is the excellent web site run by Avi Rappoport. Search Engine Watch also provides a list of past product reviews.

