Wednesday, July 09, 2008

Google Webmaster Tools And Label Searches

As the publisher of a Blogger blog, you have to do more than publish articles; you have to know how those articles are being read. Just as you need to know who is reading your blog, you need to know who is indexing it. The Google Search Engine, which feeds 2/3 of the known major search engines, provides Google Webmaster Tools, so we can monitor how well our blogs are being indexed.

If you're going to use Google Webmaster Tools effectively, though, you need to know which reports show problems needing attention, and which reports merely advise us of problems avoided. Representative of the latter, we have a message in the list of URLs restricted by robots.txt, which can be reached from a link in "Overview", or from "Diagnostics" - "Web Crawl".

http://blogging.nitecruzr.net/search/label/Search Engines URL restricted by robots.txt


This list, when run against a blog which has a lot of labels, will be rather long. This will cause panic in the hearts of many. Fortunately, this is unnecessary panic.

The keyword here is
/search/label/
which identifies a label search. Blogger blocks label searches intentionally, so the search engines won't follow them. Were they to index a label search, they would be indexing the same posts in the blog that they already find from the sitemap. They would then penalise the blog, in their indexing, for "duplicate content".
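
If you want to see the block for yourself, here is a minimal sketch using Python's standard urllib.robotparser module. It assumes the blog's robots.txt contains the rules "User-agent: *" and "Disallow: /search"; your own file may differ. The function name is_crawlable and the example post URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules for a Blogger blog; check your own file,
# because the real contents may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A label search URL, like the one shown in the report.
    label_url = "http://blogging.nitecruzr.net/search/label/Search%20Engines"
    # A hypothetical post URL, used only for comparison.
    post_url = "http://blogging.nitecruzr.net/2008/07/example-post.html"

    for url in (label_url, post_url):
        status = "allowed" if is_crawlable(url) else "restricted by robots.txt"
        print(url, "->", status)
```

Under those assumed rules, the label search is reported as restricted while the post itself stays crawlable, which is the same picture that the Webmaster Tools report gives.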

So, when you see "... /search/label/ ... URL restricted by robots.txt" in your access report, don't panic. This restriction is to your advantage.

13 comments:

Ananda said...

I was about to panic, but found the "/search/label/" in my warning messages. Thanks a bunch! (Great blog by the way! ... now I am going to look for something on changing a domain from Google Apps to Blogger here to see if you can help me again. :-))

Laurie said...

Oh My Gosh, thank you so much for your help. You have been the only one that I found that was able to answer this issue and responded so quickly.

Patt said...

Thank you, it's really helpful. I was very nervous that this might cause limited indexing and dropped traffic.

top said...

So, this means that it is okay if there are crawl errors regarding /search/label?

and this contained in robots.txt is normal?
User-agent: *
Disallow: /search
Noindex: /feedReaderJson

Chuck said...

Hey Top,

See What's In My "robots.txt" File?.

Vinit said...

Oh good.......I was about to panic, but it seems like the
"Disallow: /search" line in the default robots.txt is doing the trick.

Rico said...

Excellent information...i wish I would understand better SEO...but I seem to get along better with food...yall invited to visit me.

http://ricocoffeeshop.blogspot.com

Owlman said...

I'm still a little confused. I thought that the tags were good and helped the Google bots to categorize your posts, so why would they then be blocked?

Chuck said...

Owlman,

When you set up a sitemap for your blog, the posts are indexed as the sitemap is processed. Indexing the posts by tag / label would give the same posts that are already indexed through the sitemap. The search engines would see this as duplicate content.

AJ @ A Little Bit Nutty said...

Thanks for explaining that. I had no clue what to do. Now I know...nothing.

Chuck said...

AJ,

Knowing that you know ...nothing is the first step to appreciating when you learn something.

andy in china said...

Cheers mate, was worried a bit then....kept getting a lot of error messages...

akumaukerja said...

Oh God, thanks! I was very worried and panicked when I checked my sitemap. But no more, because now I know, thanks to your post. Thank you very much!