Category Archives: SEO


A couple of months ago I stumbled on a really interesting dataset, derived from the awesome Common Crawl.

That dataset was host-level PageRank from the Common Search project. This is a 112M-host file containing, as the name suggests, PageRank computed from the June 2016 Common Crawl. It’s great data but it’s not hugely accessible to most people – opening a 3.5GB file is not something you can do in Excel and remain sane.

So I set to work over a couple of weekends to turn it into something a bit more usable and am now pleased to introduce OpenRank.

What is it?

OpenRank is a Domain Authority tool returning a simple 0-100 score. It can be used as a free (albeit less accurate) alternative to Moz DA, Ahrefs Rank and Majestic Trust Flow; but there are a few things that you probably need to know:

  • OpenRank merges the 112M host-names into ~40M registrable root domain names.
  • OpenRank fits a 100-point logarithmic scale to return a simple 0-100 score.
  • There is no de-spamming step and so you should expect inflated numbers for bad domains.
  • A host-level webgraph is going to be inherently less accurate than a page-level webgraph.
  • Sites that block crawlers will have inflated PageRank (due to PR hoarding).
  • OpenRank will be updated whenever a new PageRank dataset is released by Common Search, but is likely to be a few months out of date at any one time.
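
To make the first two points a bit more concrete, here is a rough Python sketch of the general idea: fold host-level scores into registrable domains, then fit a 0-100 logarithmic scale. This is not the actual OpenRank code. Summing host scores, the min/max log normalisation, the tiny hard-coded suffix set (a stand-in for the real Public Suffix List) and every function name below are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

# ASSUMPTION: a tiny stand-in for the Public Suffix List, for illustration only.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au"}


def registrable_domain(host: str) -> str:
    """Reduce a host name to its registrable root domain."""
    labels = host.lower().rstrip(".").split(".")
    # Keep three labels when the last two form a known multi-label suffix.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def merge_hosts(host_scores: dict) -> dict:
    """Merge host-level scores into one score per registrable domain.

    ASSUMPTION: scores are summed; the post does not say how hosts are merged.
    """
    merged = defaultdict(float)
    for host, score in host_scores.items():
        merged[registrable_domain(host)] += score
    return dict(merged)


def log_scale(scores: dict) -> dict:
    """Map positive raw scores onto 0-100 using a logarithmic scale.

    ASSUMPTION: simple min/max normalisation of log(score).
    """
    logs = {domain: math.log(score) for domain, score in scores.items()}
    lowest, highest = min(logs.values()), max(logs.values())
    span = highest - lowest
    if span == 0:
        # Every domain has the same raw score, so give them all the top mark.
        return {domain: 100 for domain in logs}
    return {domain: round(100 * (value - lowest) / span)
            for domain, value in logs.items()}


# Made-up PageRank values, just to exercise the functions.
hosts = {
    "www.example.com": 0.001,
    "blog.example.com": 0.009,
    "news.example.co.uk": 0.0001,
    "tiny.org": 0.000001,
}
print(log_scale(merge_hosts(hosts)))
```

The log step is what stops a handful of enormous sites from squashing everyone else down to zero, which is why a logarithmic rather than linear scale makes sense for this kind of data.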

As I already mentioned, OpenRank is completely free to use, but due to costs on my side there will be rate limits in place. You can make up to 1,000 requests per day per IP to the HTTP service, and if you sign up for a free API key you can make up to 10,000 requests per day (and return up to 50 domains per request).
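
If you are planning a bulk lookup with an API key, the limits above translate into some simple batching arithmetic. The sketch below only splits a list of domains into groups of 50 and checks the total against the daily quota. It deliberately makes no HTTP calls, and the helper names are hypothetical rather than part of any OpenRank client.

```python
import math
from typing import Iterator, List

# Limits quoted in the post for requests made with a free API key.
MAX_DOMAINS_PER_REQUEST = 50
DAILY_REQUEST_LIMIT = 10_000


def batches(domains: List[str],
            size: int = MAX_DOMAINS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` domains."""
    for start in range(0, len(domains), size):
        yield domains[start:start + size]


def requests_needed(domain_count: int) -> int:
    """Number of requests needed to look up `domain_count` domains."""
    return math.ceil(domain_count / MAX_DOMAINS_PER_REQUEST)


def fits_daily_quota(domain_count: int) -> bool:
    """True if the whole lookup fits inside one day's request allowance."""
    return requests_needed(domain_count) <= DAILY_REQUEST_LIMIT


# Placeholder domain names, purely for illustration.
domains = [f"site{i}.com" for i in range(120)]
print([len(chunk) for chunk in batches(domains)])
print(requests_needed(len(domains)), fits_daily_quota(len(domains)))
```

At 50 domains per request and 10,000 requests per day, a key covers up to 500,000 domains a day.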

If you need higher rate limits, then just drop me a DM on Twitter: @bmilleare

Why not just use the free version of Moz API?

For detailed (e.g. page-level) analysis, the Moz API is definitely the way to go and their metrics are some of the best in the industry. However, the free version of their API has very restrictive rate limits, which makes it difficult to pull mass data quickly.

What next?

There are a few things I’d like to do to expand OpenRank’s usefulness – if you sign up for an API key you’ll hear about them first!

SEO Sunday: Oct 16 2011

I know, I know – 3 weeks in and I already missed a couple of posts. In my defence, I’ve been pretty damn busy on a Sunday night but hey ho. As always, this entry is from a series of weekly posts highlighting a handful of interesting links I bookmarked over the past week (or, this time, weeks). For the full archives visit the SEO Sunday category page.

SEO Sunday: Sep 25 2011

This post is from a series of weekly posts highlighting a handful of interesting links I bookmarked in the past 7 days. For the full archives (only two weeks so far) visit the SEO Sunday category page.

  • Engeeno
    What happens when two ex-Google spam-team engineers (@pedrodias and @ArielL) leave and create their own startup? We don’t know yet, but I’m on the list and you should probably be too.
  • 62 steps to the definitive link building campaign
    Great post on wordtracker from Mark Nunney covering everything you need to know for link building. Well, maybe not everything, but it’s a big chunk of knowledge anyhow.
  • Google News Ranking Factors 2011
    I’m too lazy to find out who’s directly behind this site – but if you have a strategy that involves targeting screen space in Google News (which you should) then this is probably one for the bookmarks.
  • Step by Step Guide to Spying on the Competition with KeywordSpy
    So this one isn’t strictly SEO – but if data on a competitor site’s PPC strategy isn’t something you’d sit up and take notice of then you aren’t doing your job justice. We’ve all heard of KeywordSpy but SEER has gone the extra mile and even provided a downloadable Excel file to make sense of all the raw data.
  • Basic SEO for Facebook Pages
    I haven’t delved too much into social media profile optimisation as my focus is heavily on enterprise SEO and technical issues but this is a great starting point for anyone looking at giving it a go.
  • How to Game Klout
    Who knew Klout was so easily gamed? And more importantly, who knew that with a high Klout score you could get all sorts of perks in Vegas? Time to knuckle down and then book some flights methinks…
  • Google -50 Penalty Research
    Some good discussion on WebmasterWorld of a -50 penalty dissection. Certainly worth a read if you ever have potential leads contacting you over similar issues.

SEO Sunday: Sep 18 2011

This post is from a series of weekly posts highlighting a handful of interesting links I bookmarked in the past 7 days. For the full archives (only two weeks so far) visit the SEO Sunday category page.

  • Tracking the KPIs of Social Media
    You rarely see a poor post by @randfish over on SEOmoz and this one is no different. If you’re from the agency world then you’ve no doubt had a conversation with a client who is scared of dabbling in social because “there’s no way to track it”. Rand covers some great advice on how to get a bit more insight into the impact of being friendly.
  • Using Google Docs To Generate Hot Content Strategies [Tool]
    Very nice tool from ex-HP SEO @DBSEO that takes a little bit of pain out of the process of content strategy research. This triggered a few extra ideas in my head too, which is always nice.
  • Google+ Developer Platform
    Awesome! It’s finally here – the Google+ developer API! Oh… no, hold up… let me step back a second here. It turns out it’s pretty much a one-way API right now, so essentially just a data export tool. Looks like all of those spam bots will have to be put back on hold for the time being.
  • A Guide to Long Tail Link Building
    Yet another great post by the famous @rishil, this time tackling how to create sub-sets of your head terms and build those all-important long-tail anchor text variations. A seemingly simple guide but one I’m sure many SEOs don’t action often enough.
  • Quora: Kevin Lacker’s Answers about Web Search
    I think it was Rand Fishkin that tweeted this out originally but I’m too lazy to go and check. Anyway – this guy “wrote search algorithms at Google” so he knows his shit. There are some really interesting answers that any hardcore SEO will enjoy reading.
  • Introducing Twitter Web Analytics
    About freakin’ time. Twitter finally launch an official analytics tool. Well, I say launch but it’s still closed to a selected beta group for now. Hopefully it will roll out more widely pretty soon.
  • Blekko Web Grep
    Blekko certainly get the kudos for empowering the modern SEO with decent tools. This one does exactly what it says on the tin – allows you to essentially grep the web using strings or patterns. This could be awesome for certain niche research tasks.

SEO Sunday: Sep 11 2011

This is the first of a series of weekly posts I plan to do as a kind of ‘best of the week’ run down of what I found interesting on the web. If you too find it interesting and valuable, I’d love to know in the comments!

  • Sneaky Keyword Research
    A great little 5-slide deck from @rishil on some outside-the-box techniques for gaining some valuable keyword research/traffic data.
  • The Reason GA Launched Multi-Channel Funnels
    Ever had a client ask why their AdWords data doesn’t match what GA tells them? Just point them at this fantastic post.
  • Hire a botnet! (or just loads of cheap proxies)
    I first saw this via a tweet from @richardbaxter of SEOgadget. It looks pretty dodgy if this is anything to go by, but I can think of TONNES of uses for a service like this where you need access to a large pool of IPs for one-off tasks (did someone say search volume manipulation? [yeah, it happens]).
  • Narrative Science creates machine-written copy that looks human
    A good piece from the NYTimes covering a startup that’s using AI to create unique copy with machine-chosen ‘angles’ that is virtually indistinguishable from human copy. Certainly of interest to anyone in the SEO space.
  • Winning at SEO with duplicate content
    I wasn’t at BrightonSEO but by all accounts this preso went down well. It covers an interesting topic and one that definitely warrants a bit of attention and research time in the near future.
  • How I wrote 500,000 unique GoogleBase Descriptions in 2 hours
    Another post tackling dupe content, but from a completely different angle. Anyone who works on enterprise sites will have hit this problem on multiple occasions and not just for GoogleBase purposes – mostly for repetitive manufacturer blurb on retail products. I took a similar route recently on a client site, albeit with a slightly different process and implementation.