Taking Content Generation to the Next Level
A discussion on how best to generate content to pass human inspection.
I just read two blog articles echoing the same sentiment.
One by Mark on Digerati, and one by SlightlyShady. Both articles highlighted the most obvious epiphany one could glean from the Google Spam Docs: the algorithm cannot be perfect. Google needs a huge team to catch the spam that fools the algo. They need humans. And as SlightlyShady wrote, "Humans are easy to fool."
There are many aspects of a website that might indicate it is spam: design & layout, imagery, links (and linking patterns), site architecture, age, TLD, and of course, content. Content is probably the greatest stumbling block when it comes to creating sites that can pass human inspection.
We know how to create content that can fool the algorithm, but how do we create content that can fool Google's army of Monkeys at typewriters?
We can't expect to get away with Markov content or synonym replacement; not for any respectable period of time, anyway. And once our sites rank high enough to warrant human intervention, even scraped content is easy to detect as duplicate.
A basic directive of the Google Spam Wranglers guidelines is to pare away all the scraped content, and if whatever is left is just ads, then it's most likely spam.
So how do we take content generation to the next level?
We need to create legible, syntactically correct content that a human can read and make sense of. The key to this is taking small, distinct chunks of data and splicing them together with joiner words or phrases. Yes, I'm talking about madlibbing.
A madlib script can create legible content and have thousands of different iterations. It takes a lot more creativity to create a madlib script than it does to set up some feed scraper blog, but the extra time is an investment in the future. This content has great staying power over the long term, and if executed properly, it will never result in your site getting banned.
If you are really worried about the time and creativity required to write madlib scripts, hire a writer. Think of it like this: you could pay an article writer $5 to write a great article that you can use once, or you could pay an article writer $10 to write a great madlib script that you can use to create 1000 different articles.
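To make the madlib idea above concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the template, the slot names, the filler phrases, and the helper names fill and count_variations are all assumptions, not anything taken from a real script. It only shows the general shape: one sentence template, a few interchangeable chunks per slot, and joiner words splicing them together.

```python
import random

# Hypothetical sentence template; each {slot} is filled from SLOTS below.
TEMPLATE = "{opener} {place} is {fact}, {joiner} {detail}."

# Hypothetical filler chunks. In practice these would come from a writer
# or from a database of facts and stats.
SLOTS = {
    "opener": ["Few people realise that", "It is worth noting that", "Travellers often find that"],
    "place": ["Iceland", "Portugal", "New Zealand"],
    "fact": ["popular with hikers", "easy to get around", "busiest in summer"],
    "joiner": ["and", "while", "though"],
    "detail": ["the local food is a highlight", "most visitors stay a week", "the coast is close to everything"],
}


def fill(template, slots, rng):
    """Pick one filler per slot at random and splice them into the template."""
    choices = {name: rng.choice(options) for name, options in slots.items()}
    return template.format(**choices)


def count_variations(slots):
    """Number of distinct outputs: the product of the option counts per slot."""
    total = 1
    for options in slots.values():
        total *= len(options)
    return total


if __name__ == "__main__":
    rng = random.Random()
    print(fill(TEMPLATE, SLOTS, rng))
    print(count_variations(SLOTS), "possible variations of this one sentence")
```

The variation count is just the product of the number of options in each slot, so five slots with three options each already give 243 versions of a single sentence. That multiplication is where a figure like 1000 different articles from one script comes from.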
Of course, there are more things than content to consider when planning to create spam that passes human inspection.
How about page design?
You always need a template to spin blogs from, but what you really need is an `un-template`: something highly versatile. WordPress does this well.
WordPress works well because you can easily switch themes. Create a WordPress install package that you can use on all your servers, and include a whole bunch of different themes. Also think about including a plugin like this one for rotating header images (hxxp://mhough.com/wordpress/2007/header-image-rotator-plugin/). This way, you can always have a different image that is not the theme's default image. Be sure to include a whole bunch of different images in your package!
I imagine the truly intrepid among you will rewrite the code for the default blogroll links in your install package.
Remember Google's directive of paring back content to see if just ads are left over? Well, here's a revelation: how about you don't include ads? The decision is really up to you, of course. You have to ask yourself why you are creating these sites: ad money or linking power?
And don't forget linking.
To quote an old post Birds, Bimbos, and Blog Networks:
I'd suggest breaking your network of blogs into chunks of say 5-10 blogs, assigning an independent IP to each chunk. Consider interlinking the blogs in a chunk if each part of the chunk seems relevant to the others. Obviously, don't link to other chunks/other IP ranges. If you want to take caution a step further, be wary of cross-promoting links on each chunk; by back-searching a site's links, your network can be laid open to public view.
Now that I read this quote, I'd also add that sometimes you actually want to link 4 or 5 chunks to the same URL, because you are going to need more linking power than just one chunk can provide. As long as you are sure that you aren't interlinking your chunks en masse, you can quarantine off parts of your network for other uses.
At any rate, these are just a few extraneous thoughts off the top of the dome. What I'm really focused on right now is the content angle, not so much the other factors. I have many more thoughts on the subject of creating content, and specifically on how to use the technique to maximum advantage. For now, though, I want to ask you guys: What do you think is the best way to create content that passes human inspection?
--Rob
Website: http://www.nickycakes.com
Comment: while i disagree with that linking strategy, and with the idea that you can only put 5 blogs on 1 ip, i like the article. spot on as usual with most of it =)
Comment:
What?!?!
You disagree with me on my own blog????????!
I'm so fucking offended right now.
Website: http://www.nickycakes.com
Comment:
Haha, you know I love you. Yeah, as I was saying, I've tried interlinking spammy blogs and what ends up happening is, if one gets shut down, they all end up getting shut down. You would do better not interlinking them and using whatever strategies you can to build inbound links from external sources, and then linking them individually to money sites, not each other.
As for IPs, you can pack em in pretty good on shared hosting, and you can get a good number of accounts for the price of a dedi.
Website: http://blackhatdigest.com
Comment:
I think I disagree with you on the linking as well. You can certainly put more than 5 blogs on one IP. If you're concerned with things, just make 5 blogs on topic 1, 5 blogs on topic 2, etc.
Most of the shared hosts give you so much space and bandwidth these days you can fit 40 blogs on some of them with zero problems (even semi-popular ones).
If you interlink and get busted, you can usually get away with it on the other topics. At least from what's happened to me lately. I don't do much interlinking sitewide at all anymore, so when I do interlink it's from deep pages and just a few instead of hundreds of pages. Might make a difference if you're linking sitewide though.
Website:
Comment: Yeah just generate original content on the spam blogs and let them be for a while. They get good rankings.
Website: http://www.inok.su
Comment: I'm interested in buying a database, but the question is how many times do you sell one? How is it possible to import one into WordPress so it is readable as articles or posts?
Website: http://www.buytravelinsurance.me.uk
Comment:
I also agree on the more than 5 per IP, but then with my dedi setup I will be maxing out at 10 per IP. I don't really do spam as I hate getting de-indexed, so I'm more about the scrape and madlib. Country specific sites are easy to madlib with if you can get a good DB of facts and stats about different countries / states.
If I interlink all my scraped sites that have been scraped from lesser sites, i.e. sites with fewer links but older domains, what are the chances I get de-indexed? I have yet to have a scraped site banned, so I am pushing forward with my scraped sites hoping they will love me long time.
