A couple of months ago I stumbled on a really interesting dataset, derived from the awesome Common Crawl.

That dataset was host-level PageRank from the Common Search project. This is a 112M-host file with, as the name suggests, PageRank computed from the June 2016 Common Crawl. It’s great data, but it’s not hugely accessible to most people – opening a 3.5GB file is not something you can do in Excel and remain sane.

So I set to work over a couple of weekends to turn it into something a bit more usable, and am now pleased to introduce OpenRank.

What is it?

OpenRank is a Domain Authority tool returning a simple 0-100 score. It can be used as a free (albeit less accurate) alternative to Moz DA, Ahrefs Rank and Majestic Trust Flow, but there are a few things that you probably need to know:

  • OpenRank merges the 112M host-names into ~40M registrable root domain names.
  • OpenRank fits a 100-point logarithmic scale to return a simple 0-100 score.
  • There is no de-spamming step and so you should expect inflated numbers for bad domains.
  • A host-level webgraph is going to be inherently less accurate than a page-level webgraph.
  • Sites that block crawlers will have inflated PageRank (due to PR hoarding).
  • OpenRank will be updated whenever a new PageRank dataset is released by Common Search, but is likely to be a few months out of date at any one time.
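
To make the first two points concrete, here is a minimal sketch of merging host-level scores into root domains and fitting them to a 0-100 logarithmic scale. It is not OpenRank's actual code: the function names, the tiny suffix set (a stand-in for the full Public Suffix List), the choice to sum host scores, and the min/max normalisation are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical, tiny stand-in for the Public Suffix List. A real tool would
# need the full list to find registrable domains correctly.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au"}


def registrable_domain(host):
    """Reduce a host name to its registrable root domain (naive version)."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def merge_hosts(host_scores):
    """Sum host-level PageRank into one value per root domain.

    Summing is an assumption; the post does not say how hosts are merged.
    """
    merged = defaultdict(float)
    for host, score in host_scores.items():
        merged[registrable_domain(host)] += score
    return dict(merged)


def log_scale(domain_scores):
    """Map raw scores onto 0-100 logarithmically (scores must be positive)."""
    logs = {d: math.log10(s) for d, s in domain_scores.items()}
    lo, hi = min(logs.values()), max(logs.values())
    if hi == lo:
        return {d: 100 for d in logs}
    return {d: round(100 * (v - lo) / (hi - lo)) for d, v in logs.items()}


if __name__ == "__main__":
    # Made-up scores, purely to exercise the functions.
    hosts = {
        "www.example.com": 0.5,
        "blog.example.com": 0.5,
        "shop.example.co.uk": 0.001,
        "tiny.org": 0.000001,
    }
    print(log_scale(merge_hosts(hosts)))
```

The logarithm matters because raw PageRank is extremely skewed: a linear scale would leave almost every domain sitting at or near zero.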

As I already mentioned, OpenRank is completely free to use, but due to costs on my side there will be rate limits in place. You can make up to 1,000 requests per day per IP to the HTTP service, and if you sign up for a free API key you can make up to 10,000 requests per day (and return up to 50 domains per request).
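
If you are planning a bulk lookup, the limits above translate into a simple batching calculation. The sketch below only does the arithmetic and the chunking. It does not call the service, and the constant and function names are my own, chosen for illustration.

```python
import math

# Limits quoted above for a free API key.
MAX_DOMAINS_PER_REQUEST = 50
DAILY_REQUESTS_WITH_KEY = 10_000


def batches(domains, size=MAX_DOMAINS_PER_REQUEST):
    """Yield consecutive chunks of at most `size` domains."""
    for start in range(0, len(domains), size):
        yield domains[start:start + size]


def requests_needed(domain_count, size=MAX_DOMAINS_PER_REQUEST):
    """Number of requests required to look up `domain_count` domains."""
    return math.ceil(domain_count / size)


def fits_daily_limit(domain_count, daily_limit=DAILY_REQUESTS_WITH_KEY):
    """True if the whole lookup fits within one day's request allowance."""
    return requests_needed(domain_count) <= daily_limit


if __name__ == "__main__":
    # Placeholder domain names, purely for illustration.
    domains = [f"site{i}.com" for i in range(120)]
    print([len(chunk) for chunk in batches(domains)])
    print(requests_needed(len(domains)), fits_daily_limit(len(domains)))
```

At 50 domains per request and 10,000 requests per day, a free key covers up to 500,000 domains per day.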

If you need higher rate limits, then just drop me a DM on Twitter: @bmilleare

Why not just use the free version of Moz API?

For detailed (e.g. page-level) analysis, the Moz API is definitely the way to go and their metrics are some of the best in the industry. However, the free version of their API has very restrictive rate limits, which makes it difficult to pull mass data quickly.

What next?

There are a few things I’d like to do to expand OpenRank’s usefulness – if you sign up for an API key you’ll hear about them first!

Google TKOs Adwords API developers

In what can only be described as a blatant attempt to control Adwords data, Google has started to delete Adwords API developer tokens en masse, based on thresholds and reasons seemingly known only to them.

I received the following email this afternoon, which I initially thought was an error (emphasis mine):

As stated in the AdWords API Terms and Conditions (Section II.4), we periodically review AdWords API activity. We noticed that there has been low usage of the AdWords API developer token associated with your My Client Center (MCC) manager ID XXXXXXXXXX in the last 30 days. For the purpose of ensuring quality, improving Google products and services and compliance with AdWords API Terms and Conditions, we have disabled this token.

If you wish to re-apply for the token, please visit the AdWords API Center in your account. Remember to answer the following in detail if you re-apply for the token:

  1. Describe the uses of your API application or tool with specific examples. For instance, account management or bid optimization.
  2. Who is or will be using your API application or tool? For example, colleagues in your company or advertisers or agencies to whom you are selling the tool.
  3. Please attach screenshots of your API application or tool. If the application or tool is yet to be developed, please provide relevant design documentation.
  4. Please provide a list of clients that will be using your API application or tool in an automated way.

Please know that we will take between 5 and 6 weeks to process all developer token re-applications.

The AdWords API Team

TL;DR – We got screwed for ‘low volume’ usage, we can request re-inclusion but it will take 5-6 weeks.

It would be great to know what Google considers to be ‘low volume’. The developer token in question is used by our research tools (which matches the original application description) and makes under 500,000 API calls per month. Most of our usage comes in a single burst (pulling fresh data), plus ad-hoc calls throughout the month as new data is requested (sure, these are mostly KeywordEstimator queries). The thing is, there are people on the Adwords support forums who are making in excess of 8 million API calls per month and have also been disabled – so what exactly constitutes low volume, and how much data do we need to pull in order to get re-included?

My theory is that these blanket rejections actually have nothing at all to do with volume of usage. Instead, I think they are entirely about specific usage patterns. Google doesn’t like third-party apps using their keyword volumes for automated keyword research, as they see this as ‘gaming’ – instead they want to control the spread of this data through captcha-controlled properties of their own.

This is, of course, great news for tools like Wordtracker who, once they get their own access back (and they will), will likely clean up in the SEO market as everyone seeks an alternative way of accessing Adwords data outside of the Google API.