Controlling The Search Engine Spiders

Most, but not all, blog owners eagerly await the arrival of the search engine spiders at their blogs.

The spiders come to index your blog once the search engines recognise its existence in the Blogosphere.

For those blog owners who don't want their blogs indexed, Blogger provides two settings, under Settings - Basic - Privacy: "Add your blog to our listings?" (for internal Blogger spiders, and Blogger links) and "Let search engines find your blog?" (for external spiders).

The privacy settings control the content of the "robots.txt" file in the blog.

The spiders, when well behaved, read "robots.txt" for instructions.

You can modify "robots.txt" using the dashboard "Search preferences" wizard.
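As a rough illustration of what a well behaved spider does before it requests a page, here is a minimal sketch using Python's standard "urllib.robotparser" module. The "robots.txt" text, the "ExampleSpider" user agent name, and the "may_crawl" helper are all assumptions made up for this example; they are not what Blogger necessarily generates, nor how any particular search engine is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a blog whose owner does not want it indexed.
# The exact text that Blogger writes is an assumption here.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite spider would skip the page when this prints False.
    print(may_crawl(BLOCK_ALL, "ExampleSpider", "http://myblog.blogspot.com/"))
```

The check runs inside the spider itself. Nothing on the blog enforces the answer, so a spider that never asks is not stopped by the file.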

You make the privacy settings changes, and let Blogger maintain "robots.txt" on your behalf. Alternatively, if you are daring, you can edit the file yourself, using "Settings" - "Search preferences".

These settings are at the blog level - one setting affects the entire blog. If you would like only a part of the blog protected, make a second blog (blogs are free), and include one blog in the other. If you would like specific URLs protected, use the URL removal tool in Google Webmaster Tools.

Read "Search Console" reports carefully, and learn the meanings.

If you use Google Webmaster Tools / Search Console, perhaps to add a sitemap or otherwise analyse or maintain your blog's search engine relationship, you may see some interesting details:
URL restricted by robots.txt (http://myblog.blogspot.com/search/label/mylabel)
and you'll generally have one of these notations for each label in the blog.

Those restrictions are normal. All label searches are restricted, so the search engines won't detect the label searches as containing duplicate content. Your blog shouldn't depend upon label searches for your readers to find each post.
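To see why a label search is reported as restricted while the post itself is not, the following sketch feeds a set of assumed rules to Python's standard "urllib.robotparser" module. The rules are modelled on the report shown above; the post URL, the "ExampleSpider" user agent, and the "restricted_urls" helper are hypothetical, so compare against your own blog's "robots.txt" before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: everything under /search (which includes label searches)
# is off limits, and the rest of the blog is open to all spiders.
LABEL_RULES = """\
User-agent: *
Disallow: /search
Allow: /
"""

parser = RobotFileParser()
parser.parse(LABEL_RULES.splitlines())


def restricted_urls(urls, user_agent="ExampleSpider"):
    """Return the URLs that the assumed rules ask spiders not to fetch."""
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    candidates = [
        "http://myblog.blogspot.com/search/label/mylabel",  # label search
        "http://myblog.blogspot.com/2008/07/my-post.html",  # hypothetical post URL
    ]
    for url in restricted_urls(candidates):
        print("URL restricted by robots.txt:", url)
```

Under these assumed rules, only the label search URL is listed. The post remains reachable through its own URL, which is the one you want indexed.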

Some search engines will index private blogs.

Interestingly, I note that the search engines can have access to blogs that require permission to read. Private blogs can't have blog feeds, but the search engines can still index them. The "robots.txt" file is advisory only; search engines may honour the file's directives, or they may ignore them.

Comments

Aussie Golfing said…
Thanks for the answer
How can I block all those visits from "next blog"? Those visitors can't read my blog at all, and I get a false picture of my real visitors.
Nitecruzr said…
Pratik,

Your posts are indexed using the main page / post pages URLs.

If posts were also indexed using label searches, you would have the same content being indexed under two different URLs. This would look like duplicated content, to the search engines. Both the indexing using main page / post pages, and using label searches, would be penalised.

Do not remove the "robots.txt" code - that code prevents indexing using label searches - and that is to your benefit.

http://blogging.nitecruzr.net/2008/07/google-webmaster-tools-and-label.html
