Transcraping - Translation Scraping
Why settle for just English-language content?
I want to introduce a new topic to you guys: Translation Scraping. Nowadays you see lots of scraper sites that scrape RSS feeds and republish the content on AdSense-laden sites. Well, that's all well and good, but clearly we have other tools in our arsenal to monetize scraper splogs... we have the ability to translate on the fly.
Consider this: a simple script that takes a keyword, does a Google Blog Search for that keyword, collects all the URLs that come up as a match, passes each URL to an online translator, and then posts the translated content to a blog via XML-RPC.
I mean, why not? If you are going to scrape sites in the same language, you might as well cover your bases and give'er in other languages too! Come on, show a little multiculturalism for christ's sake...
Here is a little example I hacked up using a post from my good friend Eli over at Blue Hat SEO. From my experience, I happen to know that scraped splogs of his content convert really well in the Russian market... ( sorry buddy :P )
This is programmed in Ruby and uses Mechanize and the XML-RPC client library.
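require 'cgi'
require 'xmlrpc/client'  # shipped as the xmlrpc gem since Ruby 2.4
require 'mechanize'

# Thin wrapper around the MetaWeblog XML-RPC API.
module MetaWebLogAPI
  class Client
    def initialize(server, url_path, blogid, username, password)
      @client   = XMLRPC::Client.new(server, url_path)
      @blogid   = blogid
      @username = username
      @password = password
    end

    # content is a hash with 'title' and 'description' keys;
    # publish is true to publish right away, false to save a draft.
    def newPost(content, publish)
      @client.call('metaWeblog.newPost', @blogid, @username,
                   @password, content, publish)
    end
  end
end

# Mechanize 1.0 and later dropped the old WWW:: namespace.
agent = Mechanize.new
agent.user_agent_alias = 'Mac Safari'
agent.set_proxy('localhost', 8118)  # local proxy; remove if you don't run one

source = 'http://www.bluehatseo.com/followup-seo-empire-part-1/'
# Escape the source URL so its own characters can't break the query string.
url = "http://www.online-translator.com/url/tran_url.asp?lang=en&url=#{CGI.escape(source)}&direction=er&template=General&cp1=NO&cp2=NO&autotranslate=on&psubmit2.x=40&psubmit2.y=7"

# Fetch the translated page and pull out the post title and body.
doc   = agent.get(url)
title = doc.search('p.post-info').inner_text
guts  = doc.search('div.post-content').inner_text

# Post the translated content to the blog (blog id 1).
client = MetaWebLogAPI::Client.new('bingobango.wordpress.com', '/xmlrpc.php', 1, 'bingobango', 'password')
blogpost = { 'title' => title, 'description' => guts }
client.newPost(blogpost, true)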
And you can stroll on over to http://bingobango.wordpress.com/ to see the results of our handiwork.
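One fragile spot in a script like this is the translator URL: if the source address is dropped into the query string raw, any `&` or `?` it contains gets read as part of the translator's own parameters. Here is a minimal sketch of building that URL with proper escaping. The endpoint and parameter names are simply copied from the script above and are not verified against the translator's current site; `translator_url` and `TRANSLATOR_ENDPOINT` are names I made up for illustration.

```ruby
require 'uri'

# Endpoint copied from the script above (an assumption that it still applies).
TRANSLATOR_ENDPOINT = 'http://www.online-translator.com/url/tran_url.asp'

# Build the translator request URL for a source page.
# direction defaults to 'er', the value used in the post.
def translator_url(source, direction: 'er')
  query = URI.encode_www_form(
    'lang'          => 'en',
    'url'           => source,      # escaped by encode_www_form
    'direction'     => direction,
    'template'      => 'General',
    'cp1'           => 'NO',
    'cp2'           => 'NO',
    'autotranslate' => 'on',
    'psubmit2.x'    => 40,
    'psubmit2.y'    => 7
  )
  "#{TRANSLATOR_ENDPOINT}?#{query}"
end

puts translator_url('http://www.bluehatseo.com/followup-seo-empire-part-1/')
```

The returned string can be handed straight to `agent.get`, and swapping `direction` is all it should take to target another language pair, assuming the translator supports it.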
The really cool thing about this is that you can create a spider that automates these procedures indefinitely. So create a script that monitors a group of keywords, set up a few blogs (depending on the size of the keyword niche and how much content you are dealing with), and have your spider automatically translate and post new content as it comes in.
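The monitoring loop described above could be sketched roughly like this. It is only an outline under some assumptions: the `Transcraper` class and its `search`, `translate`, and `publish` callables are hypothetical names, not part of any library, and you would plug in your own search, translation, and XML-RPC posting code. The one real piece of logic is the set of already-seen URLs, which keeps the spider from reposting the same content on every pass.

```ruby
require 'set'

# Hypothetical sketch: each step is injected as a callable, so the loop
# itself never touches the network.
class Transcraper
  def initialize(search:, translate:, publish:)
    @search    = search     # keyword -> array of source URLs
    @translate = translate  # source URL -> {'title' => ..., 'description' => ...}
    @publish   = publish    # (blog, post hash) -> anything
    @seen      = Set.new    # URLs already posted; kept in memory only
  end

  # Run one pass over a { keyword => blog } map and return the URLs posted.
  def run_once(keyword_to_blog)
    posted = []
    keyword_to_blog.each do |keyword, blog|
      @search.call(keyword).each do |url|
        next unless @seen.add?(url)  # add? returns nil if url was already seen
        @publish.call(blog, @translate.call(url))
        posted << url
      end
    end
    posted
  end
end
```

Call `run_once` from cron or from a loop with a `sleep` between passes. Since the seen set lives in memory, you would want to write it to disk if the spider needs to survive restarts.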
--Rob
Comment: Good luck man! If you convert it and feel like posting up your script for others to see, feel free to leave it here in the comments!
Website: http://www.adeptmarketingconcepts.com
Comment: My mind is racing with ideas here. It's time to "recycle" some content making it fresh and clean for Google ;)
