Empty Or New Blogs Can Be Classified As Spam Hosts

We're seeing a few recent reports in Blogger Help Forum: Something Is Broken, from owners of empty or new blogs, about spurious spam classification.

Very few owners of empty or new Blogger blogs understand why the Blogger anti-spam processes would classify their blogs as possible spam hosts. Most blog owners seem to think that only blog content is considered in spam classification.

Spam classification considers many characteristics of a Blogger blog, not just its content.

Blog post content is an obvious - but not the only - factor in spam classification. Given previously observed behaviour of spammers, many characteristics of blogs, besides the extracted and analysed content, are used in classifying possible spam blogs. Using fuzzy spam classification techniques, one might also consider:
  • Accessories and decorations, on the blog.
  • Addresses used, in setting up multiple blogs.
  • Overall behaviour in Google, by the owner.
  • Past posting habits of the owner.
  • Previous classification of similar blogs.
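
To make the idea of "fuzzy" classification concrete, here is a minimal sketch of how several weak signals might be combined into one score. The signal names, weights, and threshold are all hypothetical, invented for illustration from the list above; Google does not publish how Blogger actually classifies blogs.

```python
# Hypothetical signal names and weights, chosen only for illustration.
# They mirror the list above; they are NOT Blogger's real features or values.
WEIGHTS = {
    "suspicious_accessories": 0.15,   # accessories and decorations on the blog
    "shared_setup_address": 0.25,     # addresses used in setting up multiple blogs
    "abusive_google_activity": 0.20,  # overall behaviour in Google, by the owner
    "past_spam_posting": 0.25,        # past posting habits of the owner
    "similar_blogs_flagged": 0.15,    # previous classification of similar blogs
}


def spam_score(signals):
    """Combine per-signal strengths (each 0.0 to 1.0) into one weighted score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Missing signals count as 0.0; out-of-range values are clamped.
        strength = min(max(signals.get(name, 0.0), 0.0), 1.0)
        total += weight * strength
    return total


def is_possible_spam_host(signals, threshold=0.5):
    """Flag a blog when its combined score reaches the (assumed) threshold."""
    return spam_score(signals) >= threshold


# An empty blog: no posts at all, yet three non-content signals are strong.
empty_blog = {
    "suspicious_accessories": 1.0,
    "shared_setup_address": 1.0,
    "similar_blogs_flagged": 1.0,
}
print(is_possible_spam_host(empty_blog))
```

Note that post content does not appear in this sketch at all, which is the point: a blog with nothing published can still cross the threshold on its other characteristics.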

Some spammers obscure their activity, with gratuitous Google activity.

Some spammers are active in multiple Google activities and features. It's possible that some spammers even try to make their spam in Blogger harder to detect, by spreading their activity across other Google services - even when non-abusive in other Google services.

Some spammers may not realise that activity in other Google areas can be tracked, and included by Blogger spam classification processes.

Blog publishers, enjoying similar activities, may be spuriously classified.

Blog owners who use one or more techniques involved in setting up and maintaining spam blog farms may be classified as spammers - and their blogs then classified as possible spam hosts.

The term "blog owner" must itself be considered fuzzily, as some spam blog farms contain blogs owned by multiple Blogger accounts and profiles. Some Blogger accounts, similar to known spammer accounts - even with no blogs owned, or with empty blogs published - may then appear as possible spam hosts.

Empty blogs may still provide clues, which suggest spammy purpose.

Empty blogs that appear similar to blogs in known spam farms may themselves be classified as possible spam hosts. Various details such as blog name, blog design, and use of various blog accessories and features, can make a blog appear similar to known spam blogs - even with no post content.
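
One simple way to picture "appears similar to known spam blogs" is set overlap. The sketch below uses Jaccard similarity (size of the intersection divided by size of the union) over a set of blog traits. This is a stand-in technique chosen for illustration, not a description of what Blogger does, and the trait labels are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two trait sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # two empty sets: treat as no evidence of similarity
    return len(a & b) / len(a | b)


def max_similarity(blog_traits, known_spam_blogs):
    """Highest similarity between one blog and any known spam blog."""
    return max((jaccard(blog_traits, known) for known in known_spam_blogs),
               default=0.0)


# Hypothetical trait labels: blog name pattern, design, accessories.
known_spam_blogs = [
    {"name:keyword-stuffed", "template:default", "gadget:ads", "gadget:link-list"},
]

# An empty blog with no posts, but the same design and accessories.
empty_blog = {"template:default", "gadget:ads", "gadget:link-list"}

print(max_similarity(empty_blog, known_spam_blogs))  # 0.75
```

Three of the four traits match, so the empty blog scores 0.75 against the known spam blog without a single post being examined.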

As long as spammers exist, some Blogger blogs will be spuriously classified.

Given the need for Google to reduce the volume of spam blogs, fuzzy spam classification might use any of the above details, in classifying any empty or new blog as a possible spam blog. This will, unfortunately, lead to spurious spam blog detection.

The possibility of spurious classification cannot prevent classification, in general.

Given the ability of spammers to publish multiple blogs in the Blogger name space, classifying empty and new blogs is necessary if Google is to keep up with spammy activity. This technique enables publishers of genuine blogs to have their blogs viewed as righteous Internet content - not as one more blog in a sea of spam.