If you’re reading this, it’s already too late to do your work without the glorious SEO tools available to you today.
The evolution of SEO data services means that all the data you’d ever need to run an SEO campaign is readily available (click directly on a tool to read our API review):
- Backlinks data from services like Ahrefs, Majestic, Moz and SEMrush
- Keyword rankings and search volume data from Google Search Console and SEMrush
- Proprietary SEO rankings like Moz’s domain authority, or Majestic’s trust flow
- Site crawl data from tools like DeepCrawl
But as we all know, there’s a big difference between having *access* to data and using it effectively.
The traditional way to use data from these services is through their dashboard (here’s Google Search Console for example):
It’s got some pretty pictures there, but if we’re looking to do real analysis we’ll want to access the raw data.
Then we can look at trends for specific target keywords + pages, and mash up that raw data with data from other services (like Google Analytics traffic + conversions, or Majestic backlinks).
If we want raw data from a tool like Majestic, there are only two ways to get it: the Slow Way and the Fast Way.
The Slow Way means pulling data manually from the tool’s dashboard via CSV exports.
The Fast Way means pulling data via the tool’s API (Application Programming Interface) into some form of database (whether that’s Google Sheets or a ‘real’ SQL-style database like Google BigQuery).
If you’re ready to dive right into SEO APIs, grab this Sheets template from the Template Vault on Trello here. It comes pre-loaded with API connectors and BigQuery configuration for APIs like Google Search Console, Majestic, Moz and SEMrush.
Using APIs allows you to avoid a few data analysis traps:
Click and Type Overload
Spending time poking around each tool’s dashboard to export reports is a bummer, any way you slice it.
Missed opportunities for learning
If you’re exporting data from a tool into CSVs, you’ll eventually end up with folders stuffed full of strangely named export files collecting dust.
These tools can be quite expensive (several hundred dollars per month), so you’ll want to wring as much value as you can out of the data. Pulling data from an API into some form of database makes that possible.
Every SEO we’ve ever worked with combines data from two or more of these tools - mashing up SEMrush keyword rankings data with Google Analytics traffic with Majestic backlinks.
The manual exports from each service often change format, which breaks the spreadsheet formulas you might’ve configured to mash up data (trust us - maintaining this SEO Content Audit template that uses manual exports has been a bear).
Pulling data from APIs into a standard ‘database’ format (whether in Sheets or SQL) allows you to configure a standardized recipe for your data analysis that can scale across your team.
Internally at CIFL, we do this through our Agency Data Pipeline process, which allows an SEO audit-style analysis to be implemented consistently by an entire agency team.
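To make the ‘standardized recipe’ idea concrete, here’s a minimal Python sketch of mashing up per-URL rows from several tools into one table. The function name `merge_by_url`, the field names and the sample rows are all hypothetical - they only illustrate the shape of the join, not our actual pipeline.

```python
from collections import defaultdict


def merge_by_url(**sources):
    """Combine per-URL rows from several tools into one row per URL.

    Each keyword argument is a list of dicts that all contain a "url" key.
    Metric names are prefixed with the source name to avoid collisions.
    """
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            url = row["url"]
            for field, value in row.items():
                if field != "url":
                    merged[url][f"{source_name}_{field}"] = value
    # One output row per URL, sorted so the result is stable between runs.
    return [{"url": url, **metrics} for url, metrics in sorted(merged.items())]


# Illustrative rows only; real API responses carry many more columns.
rankings = [{"url": "/pricing", "position": 4}]
traffic = [{"url": "/pricing", "sessions": 120}, {"url": "/blog", "sessions": 45}]
backlinks = [{"url": "/blog", "links": 7}]

combined = merge_by_url(semrush=rankings, analytics=traffic, majestic=backlinks)
```

Because every source is keyed on the same `url` column, adding another tool is just one more keyword argument rather than a new set of spreadsheet formulas.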
APIs to the Rescue
Let’s dive into how you can evaluate an SEO API and implement it in your analysis process.
We’ve used these tools *a lot* at CIFL, so we’ll also share our personal opinion in a review of the API for each tool listed above (Ahrefs, DeepCrawl, Google Search Console, Majestic, Moz and SEMrush).
Before you pay up for any SEO tool’s API subscription, there are three tires to kick:
Data Accessibility
For any API you’re considering, you’ll want to first pick out *how* you’d be able to pull the data.
Generally there are four ways to pull data from an API - ranked in order of ease of use:
- Using Supermetrics to pull data directly into Sheets.
- Using Google Sheets functions like IMPORTJSON and IMPORTDATA to pull data directly into Sheets.
- Writing Python scripts to pull data from the command line (using a framework like Singer). Custom scripting is generally a no-go unless you have a quality developer on your team.
- Via data pipelining (ETL) tools like Stitch, which provide a UI to push data from an API up to your database of choice. Unfortunately, Stitch has limited coverage for SEO APIs, so we don’t use them much for SEO, but they may cover them in the future.
With the first two options, our next move is usually to push data from Sheets up to BigQuery using the CIFL Sheets <> BigQuery connector.
We’ve built a Sheets template that comes pre-loaded with API connectors + BigQuery configuration for Google Search Console, Moz, Majestic and SEMrush, which you can grab from the Template Vault here.
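If you go the custom-script route instead, the ‘load it into a database’ step might look something like the sketch below. It uses Python’s built-in `sqlite3` purely as a local stand-in for a SQL-style database like BigQuery; the `load_rows` helper, the table name and the sample rows are hypothetical.

```python
import sqlite3


def load_rows(conn, table, rows):
    """Create the table if needed and append rows (a list of dicts with identical keys)."""
    columns = list(rows[0].keys())
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" ({column_sql}) VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()


# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
load_rows(conn, "keyword_rankings", [
    {"keyword": "seo api", "url": "/blog/seo-apis", "position": 3},
    {"keyword": "backlink data", "url": "/blog/backlinks", "position": 8},
])

# Once the data is in SQL, analysis is a query rather than a spreadsheet formula.
best = conn.execute(
    "SELECT keyword, position FROM keyword_rankings ORDER BY position LIMIT 1"
).fetchone()
```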
Data Freshness and Integrity
Data coverage across SEO APIs can differ widely, depending on the size of the site you’re analyzing, and how often its underlying keyword rankings or backlinks are indexed.
At the end of the day, choosing which APIs you trust and prefer for a given dataset is really based on feel.
Some SEOs will only use Majestic for backlinks data, while others find Ahrefs or SEMrush data to be sufficient for the sites they’re analyzing. Given the difference in indexation frequency between domains, it’s impossible to issue a blanket statement that “X API is better than Y API for backlinks data.”
We recommend playing around with a trial account of each service you’re considering and analyzing data integrity by hand before making a decision.
Price
We recommend considering the *total price* of the APIs your SEO analysis package requires, rather than the individual price of each API.
That’s because many of these services overlap - Ahrefs, Majestic, Moz and SEMrush all provide some form of backlinks data.
So if you need keyword rankings + backlinks data, you could use:
- SEMrush or Search Console for keyword rankings
- Majestic, Moz or SEMrush for backlinks
Data accessibility and integrity are non-negotiables when working with APIs - so you’d likely choose the APIs whose data you trust and can access easily (via Supermetrics or otherwise).
Minimizing your total cost of analysis is more a matter of selecting from your menu of options once you decide which APIs will get the job done.
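As a rough illustration of ‘total price’ thinking, the sketch below enumerates every combination from the menu above and counts each subscription only once. The prices are the base monthly figures quoted in the reviews in this post (SEMrush credits are extra and not modeled); the `cheapest_package` helper is hypothetical.

```python
from itertools import product

# Base monthly prices in USD as quoted in this post; SEMrush credits not included.
MONTHLY_PRICE = {"Search Console": 0, "SEMrush": 399, "Majestic": 399, "Moz": 250}

# Which tools can supply each dataset we need.
OPTIONS = {
    "keyword rankings": ["SEMrush", "Search Console"],
    "backlinks": ["Majestic", "Moz", "SEMrush"],
}


def cheapest_package(options, prices):
    """Return (set of tools, total monthly cost) for the cheapest valid combination."""
    best_tools, best_cost = None, None
    for combo in product(*options.values()):
        tools = set(combo)  # a tool covering two datasets is only paid for once
        cost = sum(prices[tool] for tool in tools)
        if best_cost is None or cost < best_cost:
            best_tools, best_cost = tools, cost
    return best_tools, best_cost


tools, cost = cheapest_package(OPTIONS, MONTHLY_PRICE)
```

Price is only the tiebreaker here: you’d trim `OPTIONS` down to the APIs whose data you trust and can access before running anything like this.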
Sizing up the APIs
Let’s dive into how each SEO API stacks up against those 3 criteria: data accessibility, data freshness, and price. We’ll rank each aspect out of 3 possible claps.
Ahrefs for Backlinks
View their API docs.
Zero claps 🙁 - unfortunately Ahrefs does not allow open access to data through their API, so we’re not able to use it.
This is a bummer, as many SEOs consider their backlinks data to be the best around - drop them a note on Twitter to request that they open it up!
Freshness + Integrity 👏👏👏
Ahrefs is known for having a rich, frequently updated index of backlinks.
Zero claps 🙁 - since they don’t have an open API for us to use, I guess it’s technically priceless?
DeepCrawl for Site Crawls
View their API docs - we generally only use the endpoints to create and get Projects, Crawls and Reports.
DeepCrawl’s API is currently not integrated by any 3rd-party provider (Supermetrics, Stitch) to allow you to fetch data without writing code.
Their API is well-documented though, so it’s straightforward to roll your own script integration if you have a developer on your team (this is how we connect to DeepCrawl internally at CIFL as part of the Agency Data Pipeline service).
Freshness + Integrity 👏👏👏
DeepCrawl runs a live crawl on your site, so the data is accurate at the time you kick off the crawl.
DeepCrawl prices on a sliding scale ($79 - $199 and up for enterprise plans) based on the number of URLs crawled for the month (100,000 for $79, 500,000 for $199).
API access is included in each plan, and there is no difference between in-app usage and API usage - which in our opinion is the way life should be.
Given the breadth of datapoints provided by DeepCrawl, we’d say their pricing is completely fair.
They include a Majestic backlink count for each page crawled (a Majestic API subscription of your own would run $399 / month).
We also use DeepCrawl’s regex crawl functionality, which allows you to pluck out specific pieces of HTML on a page to identify which type of page it is - for example, crawling this course page for the number of ‘$’ present on the page helps us identify it as a product page.
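The ‘$’-counting idea can be sketched in a few lines of Python. This is only an illustration of the regex logic, not the crawler’s own configuration; the function names and the `min_matches` threshold are hypothetical.

```python
import re


def count_dollar_signs(html):
    """Count literal '$' characters in a page's HTML."""
    return len(re.findall(r"\$", html))


def looks_like_product_page(html, min_matches=1):
    """Flag a page as a likely product page if it shows enough prices."""
    return count_dollar_signs(html) >= min_matches


# Made-up HTML snippet standing in for a crawled page.
sample = '<div class="price">$49</div><div class="price">$99</div>'
dollar_count = count_dollar_signs(sample)
```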
Google Search Console for Keyword Rankings + Search Volume
View their API docs - the most commonly-used endpoint is the Query, which returns the impressions, clicks and average position for a given URL and search keyword combination.
API access is available openly via almost any method (Supermetrics or script), although we generally use Supermetrics to pull it internally at CIFL.
Since their API is popular, if you’re looking to go the custom script route there are plenty of examples out there for pushing data from Search Console up to your database of choice.
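For the custom script route, here’s a minimal sketch of the request body for a Search Analytics query by page and keyword. The body fields (`startDate`, `endDate`, `dimensions`, `rowLimit`) follow Google’s documented request format as we understand it; the `build_query_body` helper and the dates are hypothetical, and authentication is left out entirely.

```python
from datetime import date


def build_query_body(start, end, dimensions=("page", "query"), row_limit=1000):
    """Build the JSON body for a Search Analytics query request."""
    return {
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "dimensions": list(dimensions),
        "rowLimit": row_limit,
    }


body = build_query_body(date(2024, 1, 1), date(2024, 1, 31))

# With google-api-python-client and valid credentials, the body would typically
# be sent with something like:
#   service.searchanalytics().query(siteUrl="https://example.com/", body=body).execute()
# (not executed here, since it needs an authorized service object).
```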
Freshness + Integrity 👏👏
Google made a *huge* improvement when they opened up Search Console data availability to the previous 16 months (was previously limited to 90 days).
The only downside of the Search Console API is that Google samples data when it’s returned at a keyword level - i.e. it may not return results (or complete results) for *every single keyword* that your site is ranking for.
For this reason, summing keyword-level data from GSC won’t add up to the totals displayed in your GSC dashboard.
In the opinion of CIFL, this isn’t the end of the world - all of these APIs return approximate data in some form, rather than absolute truth.
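If you want to see how big the gap is for your own site, a quick check like the one below compares summed keyword-level clicks against the dashboard total. The rows mimic the shape of a Search Analytics response (`keys` plus metrics), but the numbers are made up and `keyword_coverage` is a hypothetical helper.

```python
def keyword_coverage(rows, dashboard_total_clicks):
    """Share of dashboard clicks that is accounted for by keyword-level rows."""
    if not dashboard_total_clicks:
        return 0.0
    keyword_clicks = sum(row["clicks"] for row in rows)
    return keyword_clicks / dashboard_total_clicks


# Made-up keyword-level rows and a made-up dashboard total of 100 clicks.
rows = [
    {"keys": ["/pricing", "seo api"], "clicks": 40, "impressions": 900},
    {"keys": ["/blog", "backlink data"], "clicks": 25, "impressions": 400},
]
coverage = keyword_coverage(rows, dashboard_total_clicks=100)
```

A coverage well below 1.0 is expected; tracking it over time mostly tells you how much to trust keyword-level sums for a given site.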
Free! Can’t beat it.
Majestic for Backlinks
View their API docs - our most used endpoint is GetBackLinkData, which provides a list of backlinks for a domain or individual page, including their proprietary rankings of Trust Flow and Citation Flow.
Majestic’s API is openly available with simple key-based access, meaning you can pass a URL request containing an API key:
And return data in JSON format to pretty much anywhere - via Supermetrics’ JSON connector, a custom Sheets formula like IMPORTJSON, or a Python script run on the command line.
At CIFL, we’ve built an internal template that behaves much like IMPORTJSON, but allows us to return only specific columns from the result (saving you a lot of space against Google Sheets’ 2 million cell limit) - you can grab that template from the CIFL Vault here.
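As a sketch of what such a URL request can look like, the snippet below assembles a Majestic query string in Python. The base URL, the command name and the parameter names (`app_api_key`, `cmd`, `item`, `Count`, `datasource`) are our best recollection of Majestic’s docs rather than anything guaranteed, so verify them before relying on this; `build_majestic_url` itself is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed JSON endpoint; confirm against Majestic's API documentation.
MAJESTIC_ENDPOINT = "https://api.majestic.com/api/json"


def build_majestic_url(api_key, item, cmd="GetBackLinkData",
                       datasource="fresh", count=100):
    """Build a request URL that carries the API key in the query string."""
    params = {
        "app_api_key": api_key,
        "cmd": cmd,
        "item": item,              # domain or individual page to look up
        "Count": count,            # number of backlinks to return
        "datasource": datasource,  # "fresh" or "historic" index
    }
    return f"{MAJESTIC_ENDPOINT}?{urlencode(params)}"


url = build_majestic_url("YOUR_API_KEY", "example.com")
```

The resulting URL is what you’d hand to a JSON connector in Sheets or fetch from a script.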
Freshness + Integrity 👏👏👏
Majestic provides freshness explicitly by providing two separate indices: fresh and historic (denoted by the “&datasource=” section of your query string).
Fresh are backlinks crawled within the last 90 days, and historic includes all-time data - their historic index goes back to 2012, about as long as you’d humanly need given how much the internet has changed in the last 5 years.
Majestic’s pricing is based on a sliding scale of “analysis units” ($399 per month for 100 million units, up to $2,999 per month for 3 billion units). In our experience, we’ve never seen an agency run out of units in a month on the lowest tier API plan ($399 per month).
Moz for Backlinks and Domain Authority
Moz’s API used to be accessible via pretty much any connection method - but since they made a change to their authentication setup, it’s currently only accessible via Supermetrics.
Freshness + Integrity 👏
Backlinks data is only updated once per month, which is too slow for most SEOs we work with.
And given that domain authority is a proprietary algorithm, we have no choice but to take their word for it :/.
Ranges from $250 per month for 120,000 rows, up to $10,000 per month for 40 million rows (a row is equal to one backlink or Moz metrics for one URL).
In our experience, the 120,000 rows would likely be eaten up by pulling backlinks for a medium-sized agency - so the all-in cost of use is likely to match that of Majestic’s base plan ($399 / month) unless you’re working at a relatively small scale.
SEMrush for Keyword Rankings and Search Volume
View their API docs. Our favorite endpoints are Domain Overview History, which returns a monthly review of keyword count and search volume, as well as Domain Organic Search Keywords, which returns specific keyword rankings for a given domain.
SEMrush data is accessible via almost any method - it returns data in a CSV format from a URL that includes your API key:
Meaning it can be pulled into Sheets via Supermetrics, via the IMPORTDATA function, or with a simple custom script that pings your URL and returns data.
Their API URL format also allows you to return only specific columns in your response, so you can avoid returning unnecessary data.
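Here’s a rough Python sketch of that flow: build the URL with only the columns you want, then parse the semicolon-delimited response. The parameter names (`type`, `key`, `domain`, `database`, `export_columns`, `display_limit`) and column codes reflect our understanding of SEMrush’s docs and should be double-checked; the helper names and the sample response are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed API base URL; confirm against SEMrush's API documentation.
SEMRUSH_ENDPOINT = "https://api.semrush.com/"


def build_semrush_url(api_key, domain, report="domain_organic", database="us",
                      columns=("Ph", "Po", "Nq"), limit=10):
    """Build a report URL that returns only the requested columns."""
    params = {
        "type": report,
        "key": api_key,
        "domain": domain,
        "database": database,
        "export_columns": ",".join(columns),  # keyword, position, search volume
        "display_limit": limit,
    }
    # safe="," keeps the column list readable instead of percent-encoding commas.
    return f"{SEMRUSH_ENDPOINT}?{urlencode(params, safe=',')}"


def parse_semrush_csv(text):
    """Parse a semicolon-delimited response into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


url = build_semrush_url("YOUR_API_KEY", "example.com")

# Made-up response text in the semicolon-delimited shape we expect back.
sample_response = "Keyword;Position;Search Volume\nseo api;3;880\n"
rows = parse_semrush_csv(sample_response)
```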
Freshness + Integrity 👏👏
SEMrush’s indices are updated once a month, which is usually sufficient for keyword rankings, where intra-month moves are mostly spiky noise.
They also allow you to specifically query historical data for a previous month (with the query string &display_date=YYYYMM15), which costs a higher number of credits but can be useful when looking at a site’s data for the first time.
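The `YYYYMM15` value is easy to get wrong by hand, so a tiny helper like this hypothetical one can generate it from any date in the month you’re after:

```python
from datetime import date


def semrush_display_date(month):
    """Format a date as YYYYMM15, the historical-month format described above."""
    return f"{month:%Y%m}15"


display_date = semrush_display_date(date(2023, 4, 1))  # "20230415"
```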
SEMrush offers a ton of datapoints via their API outside of search keywords - some higher quality than others.
For example, their Domain Organic Search Keywords generally has solid coverage across domains we’ve analyzed - but endpoints like Related Keywords or Backlinks can be spotty for some domains.
$399 per month for their ‘Business’ plan, plus the cost of credits.
They used to offer an introductory $15 per month API plan, which allowed lots of SEOs to get started using SEMrush data in Google Sheets - but unfortunately they discontinued that plan a while back.
This ‘base price + credit’ setup is pretty lame - it leaves the burden on us, the user, to calculate our total cost of ownership for the month.
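To take some of the sting out of that calculation, here’s a back-of-the-envelope sketch. Only the $399 base price comes from this post; the rows pulled, credits per row and price per credit are placeholder values, so plug in the numbers from your own plan.

```python
def monthly_api_cost(base_price, rows_pulled, credits_per_row, price_per_credit):
    """Total monthly cost = base subscription + credits consumed by API rows."""
    credit_cost = rows_pulled * credits_per_row * price_per_credit
    return round(base_price + credit_cost, 2)


# Placeholder usage figures, NOT real SEMrush credit pricing.
total = monthly_api_cost(
    base_price=399,            # 'Business' plan price quoted above
    rows_pulled=20_000,        # hypothetical rows fetched per month
    credits_per_row=10,        # hypothetical credits charged per row
    price_per_credit=0.00005,  # hypothetical USD price per credit
)
```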
This lameness, as well as their thrashing on pricing, are why we give their pricing a single 👏.
The upside though, is that SEMrush’s dashboard does provide a lot of functionality that your team likely already uses - so the cost isn’t just attributed to the cost of fetching raw data.
Diving Into Working with APIs
These APIs can be a lot to navigate!
Hopefully this starter template will help cut your ‘time to glory’ with some of these APIs - pick it up from the Template Vault on Trello here.
As always, drop us a note on Twitter (@losersHQ) if you have any questions.