The Google Dance finally explained: This is an awfully interesting look at the “dance” Google does to index all the pages on the Internet. I had no idea there were secondary and tertiary databases, nor that it made several successive crawls.
“Each ‘dance’ begins with Google making a major, deep crawl. Let’s call it Crawl A. What it does is it spiders the whole web, over 3.4 billion pages at last count. Google uses over 15,000 inexpensive PCs (actually, conventional desktop computers) spread all over the world, located in different data centers. It sends Googlebot (or DeepBot) out to spider the current sites within its database, as well as to find new websites that have recently been launched on the web. Once Google has completed this Crawl A, effectively caching all of these web pages for its next update, there will be a second update afterwards, roughly two weeks later.”
Interesting as this is, the article goes on to explain how to “time” a site launch to hit the right crawl of the GoogleBot or the DeepBot. Now, this is all well and good, but maybe it’s a little too intense. My theory on search engine marketing has always been that all the tricks in the world won’t help a fundamentally bad site, and a good, solid site will shine through with no tricks at all. Make a good site, and they will come.
One other point from this article, though, really caught my eye:
“…a webmaster can install the Google Toolbar and then visit his or her own website through the toolbar. Since mid-2002, there have been countless reports of a direct correlation between a website’s inclusion in the Google database and a visit through the Google Toolbar.”