This year, we've seen improvement in the automated spam classification process, as suggested by a noticeable reduction in spam review requests overall, and by a reduction in false positive classifications. During the past few months, blogs submitted for review have been 2 or 3 times more likely to be confirmed as genuine spam blogs (true positives), compared to this time last year.
Of the blogs confirmed to be spam hosts, when I am able to examine cached copies of the content, 3 out of 4 appear to contain material scraped or syndicated from other blogs or websites.
- Content scraped (stolen), or syndicated (copied, with permission), from other blogs / websites.
- Content scraped or syndicated to other blogs / websites.
Google describes the problem quite simply, in Blogger Help: "Spam, phishing, or malware on Blogger".
Spam blogs cause various problems, beyond simply wasting a few seconds of your time when you happen to come across one. They can clog up search engines, making it difficult to find real content on the subjects that interest you. They may scrape content from other sites on the web, using other people's writing to make it look as though they have useful information of their own. And if an automated system is creating spam posts at an extremely high rate, it can impact the speed and quality of the service for other, legitimate users.
Long ago, many spam blogs were part of large well named spam blog farms.
Long ago, spam blogs were first encountered as startup components in large spam blog farms.
Later, we explored the involvement of various "get rich quick" schemes, and of affiliate marketing.
- Content or links which reference referral-based activities such as GPT ("Get Paid To"), MLM ("Multi-Level Marketing"), MMF ("Make Money Fast"), MMH ("Make Money from Home"), PTC ("Pay To Click"), or PTS ("Pay To Surf").
- Affiliate marketing (Please, don't confuse this with "affiliate networking"!).
One common characteristic of many spam blogs was lack of uniqueness.
Of these three broad descriptions of confirmed spam blog content (spam blog startups, get rich quick schemes, and affiliate marketing), the one common feature in most of the blogs confirmed as spam hosts seems to be the lack of unique content. One of the features of the Panda update to Google Search was described as "content quality" in search results.
The past year's tuning of Blogger spam classification appears to be in keeping with Panda, in that it targets blogs which rely upon content intentionally replicated from blog to blog - whether "scraped" (stolen, without permission), or "syndicated" (copied, with permission).
If your blog is to avoid spurious classification, it needs unique content.
The end result here is that Blogger blogs, to avoid spurious spam classification, need to contain as much informative, interesting, and unique material as possible. Clever technique is not helpful.
While some amount of quotation from other blogs and websites is beneficial, the majority of blog content needs to be written by the blog owners and contributors, and properly targeted to the reader population. This helps the blog get better SERP position - and more search engine related traffic.
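For blog owners who want a rough sense of how much a post overlaps with a source it quotes, one simple approach is to compare word "shingles" (short runs of consecutive words) using Jaccard similarity. The sketch below is only an illustration: it is not how Blogger or Google Search classifies content, and the function names `shingles` and `jaccard_similarity`, along with the shingle size of 3 words, are assumptions chosen for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) found in text."""
    # Lowercase and keep only simple word tokens, so punctuation
    # and capitalisation differences do not hide copied passages.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Share of k-word shingles that two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        # Texts shorter than k words have no shingles to compare.
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    source = "Spam blogs can clog up search engines and hide real content."
    copied = "Spam blogs can clog up search engines and hide real content."
    own_words = "My garden tomatoes ripened early this summer after the rain."
    print(jaccard_similarity(source, copied))     # identical text
    print(jaccard_similarity(source, own_words))  # unrelated text
```

A score near 1.0 suggests a post is mostly replicated from the source; a score near 0.0 suggests it is mostly original wording. Any threshold between the two is a judgment call, not a rule taken from Blogger.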