Google and PageRank Revisited

By Deane Barker on January 17, 2005

Last year, I theorized that since blogs allowed people other than the site owner to enter hyperlinks (by adding a comment), this fundamentally hosed the concept of PageRank.

Remember that the entire point behind PageRank is that a link from Site A to Site B is a “vote” for the content of Site B by the owner of Site A. But when the owner of Site B can add a link from Site A all by himself, the idea behind PageRank breaks down.

Back then, I wondered:

This begs another question — would Google ever allow you to designate links as self-service? By using a special comment or something, you could declare to the GoogleBot that you didn’t add these links yourself; instead, they were added by the link’s target.

Well, according to several sites, this is exactly what Google is going to do. The rumor is that “rel=’nofollow’” on A tags will tell Google not to count that link in its PageRank algorithm. Whether or not this will improve the accuracy of PageRank is debatable.
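
If the rumor pans out, the markup itself is simple. Here’s a rough sketch of what a comment link flagged this way might look like (the URL and link text are placeholders, not anything Google has published):

    <!-- A commenter-supplied link; rel="nofollow" would tell GoogleBot
         not to count it as a PageRank "vote" for the target. -->
    <a href="http://example.com/" rel="nofollow">the commenter's site</a>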

It probably doesn’t matter, since Slashdot is theorizing that Google is about to take over all the data communications of the world anyway.

Another one via Joseph Scott.


Comments

  1. I followed through some links in your post, and I tend to agree with this cat:

    “Are you kidding ? It’s Google[‘s] problem to keep their indexes valuable. It’s your problem to keep your blog free from spam.”

    Also, would adding that markup to an anchor tag make pages Google-specific, or would it be added to a W3C reco somewhere? What would this do to validators? I’m a rookie…

  2. The “rel” attribute is a legal attribute for an A tag — always has been.

    “Defines the formal relationship between the target link and the current page, indicating the relationship from the source document (the current page) to the target document (the linked page).”

  3. But how can a search engine filter a link unless it knows the context? And HTML was designed to give us a way to provide this context on every link.

    The REL attribute has been in the HTML spec for years, and we all technically should be using it on every link anyway (in a perfect world…). Google is just saying that if we do, they’ll take this into account. They’re really just asking us to put metadata into our HTML the way the spec intended us to in the first place (see the sketch after these comments).

  4. If it’s Google’s search results that are getting tainted, wouldn’t it be more feasible for them to maintain their own blacklist of bad guys instead of expecting the world to maintain millions of splintered blacklists? Or to expect the world to add new markup to millions of pages? They seem to have no problem gobbling up new “good guy” sites and pages daily, so why can’t they do the same, only in reverse? (And why isn’t all the damn porno in a .xxx TLD anyway?)

    “But how can a search engine filter a link unless it knows the context?”

    Somehow, we’ve managed to figure out the “context” of porno/drug/casino links, you’d think the many-headed beast that is Google could do the same.
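
For what it’s worth, here’s a small sketch of the kind of REL metadata comment 3 describes, with a couple of long-standing, spec-defined link types next to the new Google-specific value (all URLs and link text here are purely illustrative):

    <!-- Link types that have been in the HTML spec for years -->
    <a href="/page2.html" rel="next">Next page</a>
    <a href="/page1.html" rel="prev">Previous page</a>
    <!-- The rumored new value: don't count this link as a PageRank "vote" -->
    <a href="http://example.com/comment-link" rel="nofollow">a commenter's site</a>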

Comments are closed. If you have something you really want to say, email editors@gadgetopia.com and we’ll get it added for you.