WordLog

A weblog authored by Carthik about the latest in the WordPress world.

Monday, May 14, 2007

Supplemental Results and WordPress

Filed under: — Carthik @ 7:46 am

I happened upon a curious trick to find all of your pages listed as “supplemental results” in Google, along with some other associated supplemental result tricks. A lot of you might already know these tricks, but I think reading the rest of this article might get you thinking about supplemental results in a new way. I spent a good part of 1.5 hours playing with this stuff and reading up on it, and I try to summarize what I found here.

To start off, let us look at the tricks.

Finding all supplemental results for your blog

The trick is to search Google for the string “site:wordlog.com *** -spght”. That gives you all the pages on your WordPress blog listed as supplemental results. Each result that Google returns will have “Supplemental Result” in the text that follows the URL and the short excerpt, and as you can see, every result for the query above carries that label. The “spght” can be changed to some other random characters – it doesn’t matter.

Finding all results that are not supplemental results

The following query will show all results that are not supplemental results:
“site:wordlog.com -allinurl:wordlog.com”.
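
The two queries above differ only in the domain, so they are easy to generate for any blog. Here is a minimal sketch; the function names are my own, and the search URL assumes Google's usual "q" query parameter:

```python
from urllib.parse import quote_plus


def supplemental_query(domain: str, junk: str = "spght") -> str:
    """Query from the post that lists only supplemental results.

    `junk` is the random string to exclude; any random characters will do.
    """
    return f"site:{domain} *** -{junk}"


def main_index_query(domain: str) -> str:
    """Query from the post that lists only non-supplemental results."""
    return f"site:{domain} -allinurl:{domain}"


def search_url(query: str) -> str:
    """Build a search URL for a query (assumes the standard q= parameter)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for query in (supplemental_query("wordlog.com"),
                  main_index_query("wordlog.com")):
        print(query, "->", search_url(query))
```

Swap in your own domain in place of wordlog.com to get the matching pair of queries for your blog.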

So, for wordlog.com, there are 227 non-supplemental results and 196 supplemental results. However, a search for “site:wordlog.com” returns 325 results, fewer than 196 + 227 = 423, so the three counts do not add up. What I can say is that the results returned for “site:wordlog.com” include supplemental results: at the time this article was written, page 25 of the results had two supplemental results right at the top.

What are Supplemental Results?

According to Google,

A supplemental result is just like a regular web result, except that it’s pulled from our supplemental index.

and, additionally, Google maintains that

…the index in which a site is included is completely automated; there’s no way to select or change the index in which a site appears. Please also be assured that the index in which a site is included doesn’t affect its PageRank.

So we know that there is no way to formally request supplemental index pages to be moved to the main results pages. However, one thing bothers me, sort of.

Most of the non-supplemental results for wordlog.com are the archive and category pages. I believe the individual posts should be there instead. I have noticed, many times, that when I search for a term, I am most often led to the category or date-based archives of a blog. Then I have to manually search for the term again in Firefox, and since many themes display only excerpts on these pages, I have to click through to the article to get the information I need. This is annoying, to say the least.

Fixing the supplemental results problems

There is a duplicate content cure plugin for WordPress that promises to reduce the duplicate content that Google indexes by way of your archive and category pages. It does so by adding meta tags to the page headers that tell Google not to index archive and category pages. One would think this would cure the supplemental results problem too, and make all your blog posts preferred over the archive pages.
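
I have not reproduced the plugin's actual output here, so the following is only a sketch of the general idea, using the standard robots meta tag. The page type labels and the function name are hypothetical, and the real plugin may well decide differently:

```python
def robots_meta(page_type: str) -> str:
    """Return a robots meta tag for a page, or "" to leave it indexable.

    `page_type` is a hypothetical label: "post", "archive" or "category".
    """
    # Assumption: archive and category pages are the ones that duplicate
    # the content of the individual posts.
    if page_type in {"archive", "category"}:
        # "noindex,follow" asks crawlers to skip the page itself but
        # still follow the links on it to the individual posts.
        return '<meta name="robots" content="noindex,follow" />'
    return ""


if __name__ == "__main__":
    for page_type in ("post", "archive", "category"):
        print(page_type, "->", robots_meta(page_type) or "(no tag)")
```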

As a small experiment to test the theory that the duplicate content cure plugin will help alleviate the supplemental index problem, I did searches for supplemental and non-supplemental results for seologs.com, the site that published the plugin. Amazingly, seologs has 385 supplemental results and 243 non-supplemental results! So now it appears that the plugin is not the silver bullet for the problem. However, as promised by the plugin, the archive pages are missing from the pages indexed by Google. Is this a good thing, though? If the number of indexed, non-supplemental pages is the metric, then it is not. Without the plugin, all of wordlog’s archives are indexed and probably will be returned as search results for some terms. The duplicate content cure plugin prevents some pages from being indexed at all – it would be nice if it did not do that, really. It is better to have visitors find useful content via your archives than not find it at all, even if a direct link to the relevant article would be better still.
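
To put the two sites side by side, here is a quick calculation of what share of each site's results is supplemental. It uses only the counts quoted in this post; the function name is my own:

```python
def supplemental_share(supplemental: int, main: int) -> float:
    """Fraction of a site's results that are supplemental."""
    return supplemental / (supplemental + main)


# (supplemental, non-supplemental) counts quoted in this post
counts = {
    "wordlog.com": (196, 227),
    "seologs.com": (385, 243),
}

if __name__ == "__main__":
    for site, (supplemental, main) in counts.items():
        share = supplemental_share(supplemental, main)
        print(f"{site}: {share:.1%} supplemental")
```

That works out to roughly 46% for wordlog.com and 61% for seologs.com, so by this measure the site running the plugin has the larger share of supplemental results.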

Ideally, I would love for the archives pages to be indexed too, with the blog posts being indexed in the main index. Heck, I would love to have all the pages in the supplemental index be in the main index instead. There are lots of suggested tricks to avoid the supplemental index. I suspect archive pages in WordPress blogs are indexed more prominently because WordPress blogs have link elements pointing to the archives pages, which look like the following if you look into the source of a page:

"<link rel='archives' title='May 2007' href='http://wordlog.com/archives/2007/05/' />"

In addition to this, you also have links to the archives from the sidebar, which is probably displayed on all pages of your site. The indexing robots should think these pages are really important, since you seem to link to them from every page on your site.
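
If you want to see which archive links your own pages advertise this way, a small script can pull them out of the page source. This is a minimal sketch using Python's standard html.parser module; the class and function names are my own:

```python
from html.parser import HTMLParser


class ArchiveLinkFinder(HTMLParser):
    """Collect (title, href) pairs from <link rel='archives'> elements."""

    def __init__(self):
        super().__init__()
        self.archive_links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are routed here as well.
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "archives":
            self.archive_links.append(
                (attributes.get("title"), attributes.get("href"))
            )


def find_archive_links(page_source: str):
    """Return the archive links found in a page's HTML source."""
    finder = ArchiveLinkFinder()
    finder.feed(page_source)
    return finder.archive_links


if __name__ == "__main__":
    sample = ("<link rel='archives' title='May 2007' "
              "href='http://wordlog.com/archives/2007/05/' />")
    print(find_archive_links(sample))
```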

So, a simple way to fix the problem, or at least to try to get some pages into the main index, might be to have a sitemap containing each and every post on every page of your blog. That would make the pages huge! An alternative would be to have an HTML sitemap and link to it from the sidebar or footer. You could also link to posts you think are important from the sidebar. The important things to remember are that:
1) It’s better to have a page in the main index than the supplemental index.
2) It’s better to have a page in the supplemental index than to not have the page indexed at all!
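
The HTML sitemap idea mentioned above is simple enough to sketch. The function name and the sample posts below are hypothetical; on a real blog the list of titles and URLs would come from WordPress itself:

```python
from html import escape


def html_sitemap(posts):
    """Build an HTML list that links to every post.

    `posts` is an iterable of (title, url) pairs.
    """
    items = [
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in posts
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Hypothetical posts, for illustration only.
    sample_posts = [
        ("First post", "http://example.com/archives/first-post/"),
        ("Tips & tricks", "http://example.com/archives/tips-and-tricks/"),
    ]
    print(html_sitemap(sample_posts))
```

The resulting list would live on a single page, linked from the sidebar or footer, so that every post is one click away from every page without bloating each page with the full list.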

I have a couple of ideas floating around in my brain that I will implement to accomplish item #1 above without violating item #2. I will try them out and let you know if the results are worth mentioning. Do you have any ideas that have worked, that can be verified in a straightforward manner? Blame it on what I do for a living, but I have come to trust verifiable results over speculation and hypothesis.

 
