SEO APIs: Ranked and Reviewed

David Krevitt

Lover of laziness, connoisseur of lean-back capitalism. Potentially the #1 user of Google Sheets in the world.

There’s an SEO tool for pretty much everything these days, giving you access to the data you need to run your campaigns.

These tools provide dashboards to make data analysis simple (e.g. Google Search Console’s performance data).

These dashboards are great, but they’re limited. If you’re looking for deeper insights you’ll want to access the raw data.

With the raw data we can combine data sources to get insights no tool in the market can provide.

For example, instead of just looking at keyword rankings in Google Search Console, we can crosswalk data from Google AdWords to see which keywords are driving paid and organic clicks.
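As a rough illustration of that crosswalk, here is a minimal Python sketch that joins organic and paid click counts on keyword. The field names (`keyword`, `clicks`) and the idea that both exports have already been reduced to simple rows are assumptions for illustration, not the actual export schema of either tool.

```python
def crosswalk(organic_rows, paid_rows):
    """Join organic and paid click counts on keyword (case-insensitive)."""
    combined = {}
    for source, rows in (("organic_clicks", organic_rows), ("paid_clicks", paid_rows)):
        for row in rows:
            key = row["keyword"].strip().lower()
            entry = combined.setdefault(key, {"organic_clicks": 0, "paid_clicks": 0})
            entry[source] += row["clicks"]
    return combined


def keywords_with_both(combined):
    """Keywords that drive at least one organic click and one paid click."""
    return sorted(
        keyword
        for keyword, clicks in combined.items()
        if clicks["organic_clicks"] > 0 and clicks["paid_clicks"] > 0
    )
```

Keywords that only show up on one side keep a zero on the other, which makes gaps between paid and organic coverage easy to spot.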

To access and analyze raw data, we have 2 options:

  1. Pulling data manually from the tool’s dashboard via CSV exports (aka, the slow way)
  2. Pulling the data automatically via API (aka, the fast way)

You’re here because you want to learn more about the fast way: pulling data using a tool’s API. The remainder of this article dives deep into the best APIs available for SEO professionals, along with my personal reviews and ratings.

 

What is an API?

In layman’s terms, an API (aka Application Programming Interface) is a way to access a tool’s data and pull it into some form of database (whether that’s Google Sheets or a ‘real’ SQL-style database like Google BigQuery).

 

Why use APIs?

Using APIs allows you to avoid a few data analysis traps:

1. Click and Type Overload

Spending time poking around each tool’s dashboard to export reports is a bummer, any way you slice it.

 

2. Missed opportunities for learning

If you’re exporting data from a tool into CSVs, you’ll eventually end up with folders stuffed full of strangely named export files collecting dust.

These tools can be quite expensive (multiple $100s per month), so you’ll want to wring as much value as you can out of the data. Pulling data from an API into some form of database makes that possible.

 

3. Repeatable Analysis

Every SEO we’ve ever worked with combines data from two or more of these tools – mashing up SEMrush keyword rankings data with Google Analytics traffic with Majestic backlinks.

The manual exports from each service often change format, which breaks the spreadsheet formulas you might’ve configured to mash up data (trust us – maintaining this SEO Content Audit template that uses manual exports has been a bear).

Pulling data from APIs into a standard ‘database’ format (whether in Sheets or SQL) allows you to configure a standardized recipe for your data analysis that can scale to be used across your team.

Internally at CIFL, we do this through our Agency Data Pipeline process, which allows an SEO audit-style analysis to be implemented consistently by an entire agency team.

 

How can I evaluate APIs for SEO?

Let’s dive into how you can evaluate an SEO API and implement it in your analysis process.

We’ve used these tools *a lot* at CIFL, so we’ll also share our personal opinion in a review of the API for each tool listed above (Ahrefs, DeepCrawl, Google Search Console, Majestic, Moz and SEMrush).

Before you pay up for any SEO tool’s API subscription, there are three tires to kick:

1. Data accessibility

For any API you’re considering, you’ll want to first pick out *how* you’d be able to pull the data.

Generally there are four ways to pull data from an API – ranked in order of ease of use:

[Image: import json]

With both options 1 and 2, our next move is usually to push data from Sheets up to BigQuery using the CIFL Sheets <> BigQuery connector.

We’ve built a Sheets template that comes pre-loaded with API connectors + BigQuery configuration for Google Search Console, Moz, Majestic and SEMrush, which you can grab from the Template Vault here.

 

2. Data freshness and integrity

Data coverage across SEO APIs can differ widely, depending on the size of the site you’re analyzing, and how often its underlying keyword rankings or backlinks are indexed.

At the end of the day, choosing which APIs you trust and prefer for a given dataset is really based on feel.

Some SEOs will only use Majestic for backlinks data, where others find Ahrefs or SEMrush data to be sufficient for sites they’re analyzing – given the difference in indexation frequency between domains, it’s impossible to issue a blanket statement “X API is better than Y API for backlinks data.”

We recommend playing around with a trial account of each service you’re considering and analyzing data integrity by hand before making a decision.

 

3. Price

We recommend considering *total price* of the APIs your SEO analysis package requires, rather than the individual price of each API.

That’s because many of these services overlap – Ahrefs, Majestic, Moz and SEMrush all provide some form of backlinks data.

So if you need keyword rankings + backlinks data, you could use:

Data accessibility and integrity are non-negotiables when working with APIs – so you’d likely choose the APIs whose data you trust and can access easily (via Supermetrics or otherwise).

Minimizing your total cost of analysis is more a matter of selecting from your menu of options once you decide which APIs will get the job done.
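To make the ‘total price’ idea concrete, here is a minimal Python sketch that lists every combination of APIs covering the datasets you need, cheapest first. The prices are the base monthly prices quoted in the reviews in this article (SEMrush credits come on top), and the dataset coverage is as described in this article. The function name and structure are our own illustration, not a pricing tool from any vendor.

```python
from itertools import combinations

# Base monthly prices quoted in this article (USD). SEMrush credits are extra.
BASE_PRICE = {
    "Google Search Console": 0,
    "SEMrush": 399,
    "Majestic": 399,
    "Ahrefs": 500,
    "Moz": 250,
}

# Datasets each API provides, per this article.
PROVIDES = {
    "Google Search Console": {"keyword rankings"},
    "SEMrush": {"keyword rankings", "backlinks"},
    "Majestic": {"backlinks"},
    "Ahrefs": {"keyword rankings", "backlinks"},
    "Moz": {"backlinks"},
}


def packages(needed, trusted=None):
    """Return (total_price, apis) for every combination covering `needed`, cheapest first.

    `trusted` optionally restricts the search to APIs whose data you trust.
    """
    candidates = sorted(trusted or PROVIDES)
    results = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            covered = set().union(*(PROVIDES[api] for api in combo))
            if needed <= covered:
                results.append((sum(BASE_PRICE[api] for api in combo), combo))
    return sorted(results)


if __name__ == "__main__":
    for price, apis in packages({"keyword rankings", "backlinks"})[:3]:
        print(price, apis)
```

Passing `trusted` reflects the order of decisions described above: first narrow to the APIs whose data you trust and can access, then pick the cheapest package from what’s left.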

 

Reviewing the APIs

I’m not going to review every SEO API on the market – just the ones we use and recommend.

I’ll be reviewing how each SEO API stacks up against the 3 criteria above: accessibility, freshness and integrity, and price.

Let’s go!

1. DeepCrawl

Use cases

Accessibility

DeepCrawl’s API is currently not integrated by any 3rd-party provider (Supermetrics, Stitch) to allow you to fetch data without writing code.

Their API is well-documented though, so it’s straightforward to roll your own script integration if you have a developer on your team (this is how we connect to DeepCrawl internally at CIFL as part of the Agency Data Pipeline service).

Freshness + Integrity

DeepCrawl runs a live crawl on your site, so the data is accurate at the time you kick off the crawl.

Price

DeepCrawl prices on a sliding scale ($14 and up) based on the number of projects (sites) and URLs crawled for the month ($62 per month for 3 projects + 40,000 URLs).

For the most part, if you’re using DeepCrawl with more than 3 sites they’ll likely end up building a custom plan for you.

API access is included in each plan, and there is no difference between in-app usage and API usage – which in our opinion is the way life should be.

Given the breadth of datapoints provided by DeepCrawl, we’d say their pricing is completely fair.

They include Majestic backlink count for each page crawled ($399 / month for your own API subscription).

We also use DeepCrawl’s regex crawl functionality, which allows you to pluck out specific pieces of HTML on a page to identify which type of page it is – for example, crawling this course page for the number of ‘$’ present on the page helps us identify it as a product page.

 

2. Google Search Console

Use cases

The most commonly-used endpoint is the Query, which returns the impressions, clicks and average position for a given URL and search keyword combination.
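If you go the script route, a Query request and its response can be sketched roughly as below. The field names (`startDate`, `endDate`, `dimensions`, `rowLimit`, and `keys` / `clicks` / `impressions` / `position` on each row) follow Google’s Search Analytics query format as we understand it; authentication and the HTTP call itself are left out, and the helper names are our own.

```python
def build_query(start_date, end_date, row_limit=1000):
    """Request body asking for one row per URL + search keyword combination."""
    return {
        "startDate": start_date,  # YYYY-MM-DD
        "endDate": end_date,      # YYYY-MM-DD
        "dimensions": ["page", "query"],
        "rowLimit": row_limit,
    }


def flatten_rows(response, dimensions=("page", "query")):
    """Turn the API's nested rows into flat dicts ready for Sheets or SQL."""
    flat_rows = []
    for row in response.get("rows", []):
        # `keys` holds one value per requested dimension, in the same order.
        flat = dict(zip(dimensions, row["keys"]))
        flat["clicks"] = row["clicks"]
        flat["impressions"] = row["impressions"]
        flat["position"] = row["position"]
        flat_rows.append(flat)
    return flat_rows
```

Flattening up front means every downstream step (Sheets, BigQuery, or otherwise) sees the same column layout.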

Accessibility

API access is available openly via almost any method (Supermetrics or script), although we generally use Supermetrics to pull it internally at CIFL, then push it up to BigQuery using our Sheets to BigQuery Connector Add-on.

Since their API is popular, if you’re looking to go the custom script route, there are plenty of examples out there for pushing data from Search Console up to your database of choice.

Freshness + Integrity

Google made a *huge* improvement when they opened up Search Console data availability to the previous 16 months (was previously limited to 90 days).

The only downside of the Search Console API is that Google samples data when it’s returned at a keyword level – i.e. it may not return results (or complete results) for *every single keyword* that your site is ranking for.

For this reason, summing keyword-level data from GSC won’t add up to the totals displayed in your GSC dashboard.

In the opinion of CIFL, this isn’t the end of the world – all of these APIs return approximate data in some form, rather than absolute truth.
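One way to see how much the keyword-level sampling matters for a given site is to compare the keyword-level sum to the dashboard total. A minimal Python sketch, assuming you have the keyword rows in hand and have read the total clicks off the dashboard yourself (the function name is ours):

```python
def keyword_coverage(keyword_rows, dashboard_total_clicks):
    """Share of dashboard clicks accounted for by keyword-level rows (0.0 to 1.0)."""
    if dashboard_total_clicks == 0:
        return 0.0
    keyword_clicks = sum(row["clicks"] for row in keyword_rows)
    return keyword_clicks / dashboard_total_clicks
```

A coverage well below 1.0 simply tells you how approximate the keyword-level view is for that site.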

Price

Free!  Can’t beat it.

 

3. SEMrush

Use cases

Accessibility

SEMrush data is accessible via almost any method – it returns data in a CSV format from a URL that includes your API key:

https://api.semrush.com/?type=domain_rank&key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&export_columns=Dn,Rk,Or,Ot,Oc,Ad,At,Ac&domain=codingisforlosers.com&database=us

Meaning it can be pulled into Sheets via Supermetrics, via the IMPORTDATA function, or with a simple custom script that pings your URL and returns data.

Their API URL format also allows you to return only specific columns in your response, so you can avoid returning unnecessary data.
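The ‘simple custom script’ route can be sketched in Python as below: one helper builds the URL with only the columns you want, another parses the returned text. The semicolon delimiter is an assumption on our part (check a real response and adjust), and the helper names are our own. Fetching the URL is left to whatever HTTP client you prefer.

```python
import csv
import io
from urllib.parse import urlencode


def build_semrush_url(api_key, domain, columns, report_type="domain_rank", database="us"):
    """Build a request URL that returns only the listed export columns."""
    params = {
        "type": report_type,
        "key": api_key,
        "export_columns": ",".join(columns),
        "domain": domain,
        "database": database,
    }
    # safe="," keeps the column list readable instead of encoding commas as %2C.
    return "https://api.semrush.com/?" + urlencode(params, safe=",")


def parse_semrush_response(text, delimiter=";"):
    """Parse the delimited text response into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    print(build_semrush_url("XXXX", "codingisforlosers.com", ["Dn", "Rk", "Or"]))
```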

Freshness + Integrity

SEMrush’s indices are updated once a month, which is generally sufficient for keyword rankings, where intra-month moves are mostly spiky noise.

They also allow you to specifically query historical data for a previous month (with the query string &display_date=YYYYMM15), which costs a higher number of credits but can be useful when looking at a site’s data for the first time.
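When backfilling history for a new site, you’d typically want one `display_date` value per month. A small Python sketch that generates them in the YYYYMM15 format mentioned above (the helper name is ours, and each value still costs credits when you actually query it):

```python
from datetime import date


def display_dates(start, months):
    """Return `months` consecutive display_date values (YYYYMM15), starting at `start`'s month."""
    year, month = start.year, start.month
    values = []
    for _ in range(months):
        values.append(f"{year:04d}{month:02d}15")
        month += 1
        if month > 12:  # roll over into the next year
            month = 1
            year += 1
    return values
```

Each value can then be appended to a request URL as `&display_date=...`.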

SEMrush offers a ton of datapoints via their API outside of search keywords – some higher quality than others.

For example, their Domain Organic Search Keywords generally has solid coverage across domains we’ve analyzed – but endpoints like Related Keywords or Backlinks can be spotty for some domains.

Price

$399 per month for their ‘Business’ plan, plus the cost of credits.

They used to offer an introductory $15 per month API plan, which allowed lots of SEOs to get started using SEMrush data in Google Sheets – but unfortunately they discontinued that plan a while back.

This ‘base price + credit’ setup does make it difficult to estimate your total cost of ownership for the month, and you have to keep an eye on how quickly you’re burning through your credits.

The upside though, is that SEMrush’s dashboard does provide a lot of functionality that your team likely already uses – so the cost isn’t just attributed to the cost of fetching raw data.

 

4. Majestic SEO

Use cases

Accessibility

Majestic’s API needs nothing more than an API key for authentication, meaning you can pass a URL request containing that key:

https://api.majestic.com/api/json?app_api_key=API_KEY&cmd=GetBackLinkData&item=majestic.com&Count=5&datasource=fresh

And return data in JSON format to pretty much anywhere – via Supermetrics’ JSON connector, a custom Sheets formula like IMPORTJSON, or a Python script run on the command line.

At CIFL, we’ve built an internal template that behaves much like IMPORTJSON, but allows us to return only specific columns from the result (saving you a lot of space against Google Sheets’ 5 million cell limit) – you can grab that template from the CIFL Vault here.
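The ‘return only specific columns’ idea is easy to sketch in Python. This is not our Sheets template, just an illustration of the same approach: build the request URL, then keep only the fields you care about from each row. The column names in the usage note are hypothetical placeholders, and where the rows sit inside the JSON response is something to confirm against a real response.

```python
from urllib.parse import urlencode


def build_majestic_url(api_key, item, count=5, datasource="fresh"):
    """Build a GetBackLinkData request URL like the one shown above."""
    params = {
        "app_api_key": api_key,
        "cmd": "GetBackLinkData",
        "item": item,
        "Count": count,
        "datasource": datasource,  # "fresh" or "historic"
    }
    return "https://api.majestic.com/api/json?" + urlencode(params)


def select_columns(rows, columns):
    """Keep only `columns` from each row; missing fields come back as None."""
    return [{column: row.get(column) for column in columns} for row in rows]
```

For example, `select_columns(rows, ["SourceURL", "AnchorText"])` would shrink each backlink row to two cells, which is where the space savings come from.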

Freshness + Integrity

Majestic provides freshness explicitly by providing two separate indices: fresh and historic (denoted by the “&datasource=” section of your query string).

Fresh are backlinks crawled within the last 90 days, and historic includes all-time data – their historic index goes back to 2012, about as long as you’d humanly need given how much the internet has changed in the last 5 years.

Price

Majestic’s pricing is based on a sliding scale of “analysis units” ($399 per month for 100 million units, up to $2,999 per month for 3 billion units). In our experience, we’ve never seen an agency run out of units in a month on the lowest tier API plan ($399 per month).

 

5. Ahrefs

Use cases

Accessibility

Ahrefs recently released their open API, which we’re excited about (previously it could only be accessed via an apps marketplace).

It returns data as JSON, and has straight token-based authentication, meaning you can pull backlinks data in the same ways you can from Majestic (the IMPORTJSON function in Sheets, the Supermetrics JSON connector, or a command-line script):

https://apiv2.ahrefs.com?token=(ENTER-TOKEN-HERE)&target=example.net&limit=1000&output=json&from=ahrefs_rank&mode=subdomains

Freshness + Integrity

Ahrefs is known for having a frequently updated, rich index of backlinks, which lots of our members over at the Blueprint Training use and love.

Price

The Ahrefs API is a bit more expensive than the other backlinks providers, with pricing starting at $500 per month and going up based on volume. As of this moment (March 5, 2020), their base plan is priced about like Moz’s Low Volume plan.

 

6. Moz

Use cases

Accessibility

Moz’s API used to be accessible via pretty much any connection method – but since they made a change to their authentication setup, it’s currently not available in our starter template (we’d recommend using Supermetrics).

Freshness + Integrity

Backlinks data is only updated once per month, which is too slow for many SEOs we work with (if we’re wrong on this, please Tweet at us).

And given that domain authority is a proprietary algorithm, we have no choice but to take their word for it on integrity.  It’s nice that folks generally accept DA + PA as a standard metric of authority, but as far as we know there’s no way to vet it.

Price

Ranges from $250 per month for 120,000 rows, up to $10,000 per month for 40 million rows (a row is equal to one backlink or Moz metrics for one URL).

In our experience, the 120,000 rows would likely be eaten up by pulling backlinks for a medium-sized agency – so the all in cost-of-use is likely to match up with that of Ahrefs’ ($500) base plan, Majestic’s base plan ($399 / month), or SEMrush ($399) unless you’re working at a relatively small scale.

 

How to use SEO APIs

Now it’s time to put this data to work – what can we do with all this glorious data? The options are limitless, but here are a few ways we leverage them for our clients at Coding is for Losers.

Scrape schema types from pages

Using the DeepCrawl API and DeepCrawl’s “custom extractions” feature, we pull in schemas present on each of a site’s pages.

We use this as part of our Website Quality Audit BigQuery Recipe, in order to generate recommendations about which schemas should be present on a given page:

[Screenshot: website quality audit schema recommendations]

To pull that via the API (or the DeepCrawl dashboard), we pass the following regex custom extraction (under Advanced Settings -> Custom Extractions) when setting up the crawl:

['\"]@type['\"]\s?:\s?['\"]([^\"']*)|(?<=schema.org\/)(\w+)

Which returns each of the schemas for our pages like so:

[Screenshot: deepcrawl custom extraction schema type]

This saves us *a ton* of time when making schema recommendations.
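If you want to sanity-check the extraction pattern before kicking off a crawl, you can run it locally. Here is a minimal Python sketch using the same regex written with straight quotes; the sample HTML in the usage note is made up, and Python’s `re` engine may not behave identically to the crawler’s regex engine in every edge case.

```python
import re

# Same pattern as the custom extraction above, written with straight quotes.
# Branch 1 captures JSON-LD "@type" values; branch 2 captures the type name
# that follows "schema.org/" (e.g. in microdata itemtype attributes).
SCHEMA_TYPE = re.compile(r"""['\"]@type['\"]\s?:\s?['\"]([^\"']*)|(?<=schema.org\/)(\w+)""")


def schema_types(html):
    """Return every schema type found in `html`, in document order."""
    found = []
    for match in SCHEMA_TYPE.finditer(html):
        # Exactly one of the two capture groups participates in each match.
        value = match.group(1) if match.group(1) is not None else match.group(2)
        found.append(value)
    return found
```

Feeding it a page containing a JSON-LD block with `"@type": "Product"` and a tag with `itemtype="https://schema.org/Article"` should yield both type names.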

Keyword rank tracking

A bunch of APIs provide keyword-level ranking data: SEMrush, Ahrefs and, of course, Google Search Console.

Generally what we’ll do with this data is build a database (either in Sheets or BigQuery) to pull in keyword rankings each month, so that we can see progress over time.

We use this data downstream as part of our Monthly SEO Report at the Blueprint Training, which pulls monthly keyword data into a Google Data Studio report.

Search Console is, of course, the cheapest way to access this data, and you can pull up to 16 months of history at a time – but we still like to use either SEMrush or Ahrefs as a secondary check on those average position numbers.
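The monthly rankings database described above can be prototyped in a few lines. This sketch uses Python’s built-in SQLite as a stand-in for Sheets or BigQuery (that swap is ours, purely to keep the example self-contained), and the table and column names are illustrative.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a rankings database with one row per month + keyword."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS rankings (
               month TEXT NOT NULL,      -- e.g. '2020-02'
               keyword TEXT NOT NULL,
               position REAL NOT NULL,   -- average position for that month
               PRIMARY KEY (month, keyword)
           )"""
    )
    return conn


def store_month(conn, month, rows):
    """Insert (keyword, position) pairs for a month; re-running a month overwrites it."""
    conn.executemany(
        "INSERT OR REPLACE INTO rankings VALUES (?, ?, ?)",
        [(month, keyword, position) for keyword, position in rows],
    )
    conn.commit()


def position_change(conn, earlier, later):
    """Return {keyword: later - earlier position}; negative means the keyword moved up."""
    cursor = conn.execute(
        """SELECT a.keyword, b.position - a.position
           FROM rankings a JOIN rankings b ON a.keyword = b.keyword
           WHERE a.month = ? AND b.month = ?""",
        (earlier, later),
    )
    return dict(cursor.fetchall())
```

Keying on month + keyword means a re-pull for the same month replaces rows rather than duplicating them, which keeps the month-over-month comparison clean.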

Tagging posts with content topics

When you’re doing a content audit, or analyzing your internal link graph, it’s critical to have an understanding of what topic each page on your site covers.

In our Internal Linking Optimization Sheets template over at the Blueprint Training, we do this by:

  1. Pulling in Search Console keyword data using Supermetrics
  2. Setting a content topic hierarchy, mapping key phrases to content topics (e.g. BigQuery -> “Data Pipeline”)
  3. Using a regexmatch formula to tag each page’s top keyword with a topic

At the end of the day, this makes it very easy to match up potential internal link pairs, since we know roughly which pages are relevant for each topic the site focuses on:

[Screenshot: search console content topic tagging]
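The tagging step can be sketched outside of Sheets too. Here is a minimal Python version of the regexmatch idea: an ordered list of topic patterns, where the first pattern that matches a page’s top keyword wins. The BigQuery -> “Data Pipeline” mapping comes from the example above; the second topic and the `Untagged` fallback are hypothetical.

```python
import re

# Ordered topic hierarchy: earlier entries win when several patterns match.
TOPIC_PATTERNS = [
    ("Data Pipeline", re.compile(r"bigquery|data pipeline", re.IGNORECASE)),
    ("Google Sheets", re.compile(r"sheets|spreadsheet", re.IGNORECASE)),  # hypothetical topic
]


def tag_topic(keyword, patterns=TOPIC_PATTERNS, default="Untagged"):
    """Return the first topic whose pattern appears anywhere in the keyword."""
    for topic, pattern in patterns:
        if pattern.search(keyword):
            return topic
    return default
```

Because order matters, a keyword like “bigquery sheets connector” lands under “Data Pipeline”; reorder the list if you’d rather it resolve the other way.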

 

What’s next?

Ready to take the next step? I’ve got 2 options for you:

  1. Build something yourself.  Hopefully this SEO API starter template will help cut your ‘time to glory’ with some of these APIs – pick it up from the Template Vault on Trello here.
  2. Work with us to build something. Our team is standing by to help you wrangle the data and build something amazing. Drop us a note if you want to chat.

As always, drop us a note on Twitter (@losersHQ) if you have any questions.
