July 08, 2009

Google Rank Extractor

Posted at July 8, 2009 04:18 AM

UPDATES:
July 15, 2009 - I've released the first public beta version. The announcement post is here and the download zip file has been updated.

August 15, 2009 - The GRE tool has been re-coded to be an Ajax application, removing the hard and fast requirement that the site's pages be php based. The announcement post of this update is here and I've also set up a dedicated page to the GRE project, which can be found here.

I've got a new little tool for you today. Yippee!

[If you want to skip the story, the download is here in zip format.]

Okay, what's all this Rank Extractor thing about? First, a bit of history.

Forever people have been running (usually automated) queries against Google's search engine in an effort to try to figure out where their site ranks for its search phrases. These automated queries have always been against Google's Terms of Service.

Once upon a time Google would let you do this if you had applied for and used an API License Key in whatever automated rank reporting tool you used. Technically this is still the case, with one rather large caveat. The API License Key that's valid for this sort of thing is what is commonly known as the old style SOAP license. And Google stopped issuing new SOAP style API licenses some years ago now.

So basically if you weren't already doing this stuff years ago you had no way to get an API License to collect any ranking data for your site with regard to Google, while still remaining within the constraints of their TOS.

That's the quick history. Basically unless you were doing this stuff several years ago, you were SOL. Thankfully, all of that is about to change. And change for the better I might add.

A quick hit on the dirty details.

Though most haven't even noticed, Google began changing what shows up in the Referer (sic) area of hits a few months ago. The change was confirmed and announced on the Official Google Analytics Blog back in mid-April.

The gist of it, and all you really need to know if you're not a tech junkie, is that Google is changing their referral string from the old style that looked something like:

http://www.google.com/search?hl=en&q=flowers&btnG=Google+Search

To a new style that embeds a lot more information. The new style referral string looks something like:

http://www.google.com/url?sa=t&source=web&ct=res&cd=7&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&ei=0SjdSa-1N5O8M_qW8dQN&rct=j&q=flowers&usg=AFQjCNHJXSUh7Vw7oubPaO3tZOzz-F-u_w&sig2=X8uCFh6IoPtnwmvGMULQfw

The important part for our purposes, and something they didn't mention in the official release document, is that little bit in the referrer string that says cd= some number.

The really important part is that the "some number" value of this cd variable is the Ranking Position of your site for that phrase in Google at the time the link was clicked.
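For the tech junkies, here is a minimal sketch of pulling the search phrase and the cd value out of a referrer string. It is written in Python purely for illustration (the tool itself is PHP), and the function name extract_rank and the host-matching pattern are assumptions made for this example, not the tool's actual code.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed pattern for Google hosts such as www.google.com or www.google.co.uk.
GOOGLE_HOST = re.compile(r"^(?:www\.)?google\.[a-z]{2,3}(?:\.[a-z]{2})?$")

def extract_rank(referrer):
    """Return (phrase, position) for a new-style Google referrer, else None."""
    parts = urlparse(referrer)
    host = (parts.hostname or "").lower()
    if not GOOGLE_HOST.match(host):
        return None  # not a Google referral at all
    params = parse_qs(parts.query)
    cd = params.get("cd", [""])[0]
    phrase = params.get("q", [""])[0]
    # Old-style referrers carry q= but no cd=, so there is no rank to report.
    if not (cd.isascii() and cd.isdigit()) or not phrase:
        return None
    return phrase, int(cd)

new_style = (
    "http://www.google.com/url?sa=t&source=web&ct=res&cd=7"
    "&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&rct=j&q=flowers"
)
print(extract_rank(new_style))  # ('flowers', 7)
```

An old-style referrer such as the first example above has no cd value, so the same function simply returns None for it.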

Say what!?!?! They're finally going to tell us exactly where we rank, for every single phrase that sends us traffic from Google.com? They're going to give us access to more information than we could have ever gotten from running some automated queries against their search engine, because there is no way we could ever extract every phrase that sends us traffic via Google, then guess whether what we saw was the same as what that user saw at the moment they clicked through?

Well, I'm here to tell ya that Google are going to do exactly that! And this is sooooo cool. It gives you relatively easy access to tons of information you've never had available to you before. And all you need to gain access to this new, quite important information is a little tool that extracts the information being sent to you during a Google referral hit.

Thus my new little tool, which I've been testing on some of my sites for the last 6 weeks or so and have dubbed Google Rank Extractor.

It's a really simple tool when you break down the code. All it does is review the referral string data, extract the bits and pieces we want to record for posterity (namely the search phrase used, the ranking position, the version of Google searched, the date and time of the hit and the page of our site the user landed on), then drop it all into either a flat file or a MySQL database.
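The flat file flavor of that flow can be sketched roughly as follows. Again this is illustrative Python rather than the tool's PHP, and the field names, the CSV layout and the helpers build_record and append_record are all assumptions invented for the example.

```python
import csv
import io
import re
from datetime import datetime
from urllib.parse import urlparse, parse_qs

# Assumed host pattern and column order; neither comes from the real tool.
GOOGLE_HOST = re.compile(r"^(?:www\.)?google\.[a-z]{2,3}(?:\.[a-z]{2})?$")
FIELDS = ["phrase", "position", "google_version", "timestamp", "landing_page"]

def build_record(referrer, landing_page, when):
    """Collect the five recorded fields, or return None for non-Google hits."""
    parts = urlparse(referrer)
    host = (parts.hostname or "").lower()
    if not GOOGLE_HOST.match(host):
        return None  # bail out early so ordinary hits cost almost nothing
    params = parse_qs(parts.query)
    cd = params.get("cd", [""])[0]
    phrase = params.get("q", [""])[0]
    if not (cd.isascii() and cd.isdigit()) or not phrase:
        return None
    return {
        "phrase": phrase,
        "position": int(cd),
        "google_version": host,
        "timestamp": when.isoformat(),
        "landing_page": landing_page,
    }

def append_record(out, record):
    """Write one record as a CSV row to an open text file or buffer."""
    csv.writer(out).writerow([record[name] for name in FIELDS])

buf = io.StringIO()  # stands in for a flat file opened in append mode
rec = build_record(
    "http://www.google.com/url?sa=t&cd=7&q=flowers",
    "/mypage.htm",
    datetime(2009, 7, 8, 4, 18),
)
if rec is not None:
    append_record(buf, rec)
print(buf.getvalue().strip())
```

Swapping the buffer for a file opened with mode "a" gives the flat file variant; a MySQL variant would replace append_record with an INSERT.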

Simple really. But also extremely useful.

I've not yet gotten around to developing a front end to help with the sorting of the data, but I can tell you already that I'm seeing all sorts of phrases I didn't even realize my site ranked well for show up with #1 or similar rankings. And apparently my little sites are kicking butt and taking names on some of the regional versions of Google (eg .co.uk, .com.au, etc) because on some of them my most competitive phrases in the worldwide search are consistently sitting at the #1 spot.

Cool beans!
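Until a proper front end exists, a few lines of code are enough to see how phrases fare on each regional version of Google. The sketch below is illustrative Python with made-up sample rows; the record layout (phrase, position, google_version) is an assumption, not the tool's actual storage format.

```python
from collections import defaultdict

def best_positions(records):
    """Map each Google version to {phrase: best (lowest) position seen}."""
    best = defaultdict(dict)
    for rec in records:
        by_phrase = best[rec["google_version"]]
        phrase, pos = rec["phrase"], rec["position"]
        if phrase not in by_phrase or pos < by_phrase[phrase]:
            by_phrase[phrase] = pos
    return dict(best)

# Hypothetical rows, as they might be read back from the flat file.
sample = [
    {"phrase": "flowers", "position": 7, "google_version": "www.google.com"},
    {"phrase": "flowers", "position": 1, "google_version": "www.google.co.uk"},
    {"phrase": "flowers", "position": 5, "google_version": "www.google.com"},
    {"phrase": "red roses", "position": 3, "google_version": "www.google.com"},
]

for host, phrases in sorted(best_positions(sample).items()):
    for phrase, pos in sorted(phrases.items(), key=lambda item: item[1]):
        print(host, phrase, pos)
```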

And the Google Rank Extractor tool is incredibly easy to install if you have a .php based site. Especially if like me you use common header or footer files to drop some parts of your html template into each and every page. And to add to the fun, the tool is so lightweight that there is no discernible lag when the pages load. In fact, basically nothing in the php fires when it's not a Google referral hit.

Even though Google has not yet changed this referrer string in all of their data centers (I'm only seeing it on about 5% of my Google hits so far), I can't stress enough how much data the tool will collect, nor how valuable the data can be to both your SEO and Marketing efforts.

Personally, I've found all sorts of nuggets that I've been able to leverage. Especially for the longer tail phrases where users are getting incredibly specific because they know what they want. Those are people ready to buy, if you just make sure you tell them you have what they want.

So I would encourage you to install the Google Rank Extractor on your PHP site. Let it run for a few days, then start reviewing the information. And check back here (you might want to bookmark the page, though it'll be mentioned in the files above too) in a few weeks. If nobody else gets to it first I plan on building a little front end that will help out considerably in sorting the data being collected so that it's both easier to consume and more actionable.

And if you're a developer please, please, please consider helping out. There are several ways to do so. You could build the front end, which shouldn't take long but my free time is pretty limited currently. Or you could port the code to other languages so that more people can make use of it. Or anything else that pops into your head. The GRE is released under a standard GNU/GPL license just like all of the other tools I release. So please run with it and improve it!


Comments

Randy this sounds awesome, something I've always wanted, and by what you say above it will allow me to see my popular keywords and maybe capitalise on them. I will take a look at this and give it a go!

Posted by dentist at July 16, 2009 01:03 PM


Awesome - well done, just trying to think if there is a way of putting this into a JavaScript version that would work on all sites and of course then be useful for all our clients!!! Will have to drop you an email if I manage to do it...

Posted by Gerry White at July 23, 2009 03:16 AM

Hi Randy,

what's the difference between your nifty little tool and the Webmaster Tools 'Top search queries' page?

By the way your photoshop tutorial is excellent, that should be a model answer in the PS user guide.

Regards,

Nick

Posted by Nick at July 29, 2009 06:18 AM

I uploaded all the files and went to the install file like it says in the instructions and it wanted to download the file rather than install anything

Help!?

Posted by cherry at July 30, 2009 03:04 PM

Gerry: Hold that thought for a moment. I have a new version coming out as soon as I get it packaged up that works off of the Javascript method. Technically Ajax, but all it requires on the front end is a <script reference to a javascript file. ;)

It does still require PHP and MySQL be available on the server, but completely removes the necessity of using a php include or of the original source files being php based.

Nick: The main difference is where and when the information is collected. And how it's presented as well.

Google doesn't tell us when or where they collect their page ranking data, however one would assume it's pre-search or at the time of search. GRE collects it after the search, when the user clicks through to the site.

Google WMT also doesn't break out regional Google search data (eg Google.com vs Google.co.uk vs Google.com.au), while the GRE tool does. In fact GRE even tells you if the searcher has selected a Global type of search or if they selected "Show sites from XYZ Country" before searching.

The WMT also doesn't give you things like date/time stamps or tell you which page or pages ranked for the term. GRE does.

Similar in some ways, but definitely not the same.

Cherry: Does your server support PHP? From your description it sounds like it may not. You'll want to check with your host on that one.

Posted by Randy at August 10, 2009 10:50 AM


Nice program Randy.

Have you thought about utilizing Google's GeoIP lookup services so we can see any ranking differences between states?

I would've posted on the hr forum but apparently my acct 'dhthwy' has been disabled from using it. Weird.

Posted by Don Hathaway at August 20, 2009 04:00 PM

I've considered lots of additions Don. Sadly however lack of time prevents further development for me at this time. Plus I'm not sure knowing the state of the hit is going to be all that critical for most users. Not in conjunction with the data GRE collects anyway.

This sort of localized data is probably better dealt with by other, more robust analytics.

Posted by Randy at November 2, 2009 11:15 AM
