In our industry, it’s easy to fall into the trap of headline reading.
We’re constantly bombarded with new content giving us that “FOMO” feeling, making us feel like we need to learn a new skill in order to keep pace.
I see this happening right now with Python and SEO.
Python is a coding language that helps to automate tasks, which [in theory] is helpful for speeding up tasks like keyword intent tagging.
I love Python – I’ve been using it for a decade to automate tedious tasks for our clients.
However, if there’s one thing we want to avoid as technical marketers, it’s the rabbit hole of shiny object syndrome.
Shiny object syndrome pulls us deep into learning a new skill, but during the process we lose sight of why we were learning it in the first place.
In other words, the goal of Python is to speed up your workflow…But often times, it ends up slowing you down.
Friends, I’ve spent weeks lost in these rabbit holes chasing shiny objects, wasting valuable time I’ll never get back.
This article will discuss our thoughts on learning Python for SEO analysis and some alternatives that will keep you far away from that rabbit hole.
Here at CIFL, we don’t believe in wasting time learning skills that are not critical to success.
We use 3 simple questions to qualify if a project is worth undertaking and focus on building the minimum viable technical stack.
These questions identify how much “coding” is necessary – and whether that means spreadsheet formulas, SQL statements or a Python script.
Our projects generally revolve around analyzing marketing data, so throughout this post we’ll focus on a technical SEO example that’s close to our hearts.
Tagging an organic search keyword with ‘searcher intent.’
Python has been a popular choice to automatically tag intent – but is it the best option?
It’s an interesting data challenge, because this tagging can be done in literally any stack I mentioned above.
But whether you’ll ultimately solve it in Google Sheets, SQL or Python is highly dependent on scale.
Let’s dive into how we’d lay out the project, using those 3 questions as our guideposts.
To avoid mistakenly careening into a rabbit hole, it helps to get a 2nd opinion.
Sharing your work early with clients / colleagues will help you validate that what you’re building is at least going in the right direction (aka you’re not in the wrong rabbit hole).
Let’s get into our example – the task of tagging SEO keywords by “intent.”
When going through the keyword research process, it’s best practice to tag each keyword by “intent”. Generally speaking, that’s broken down into buckets (see graphic below for 2 types).
Understanding why people are coming to our site helps design the site properly to match their expectations – answering key questions:
If we apply this concept to CIFL, “google sheets tutorial” would be classified as an informational keyword. “Google sheets custom dashboard” would be classified a transactional keyword.
Knowing this can help you focus your content on what searchers are looking for, rather than what you think they’re looking for.
So the key technical challenge here is:
Like I mentioned above, this challenge can be solved handily in either Google Sheets, Apps Script, SQL, Python (really any language).
No matter what stack we choose, we’ll ultimately probably use something called regex (for regular expressions) to apply keyword intent tags.
Regex can be run in Sheets, SQL, or Python – so a quick prototype in Sheets can be a foundation we can build on.
This is the case for almost any marketing data analysis – you can prove the concept at a small / medium scale with Google Sheets formulas, then scale up from there if need be.
For example, if we wanted to tag keywords for a 1 million page site, Sheets is unlikely to handle the entire dataset gracefully.
But it can help you nail down your business logic on a 10,000 page sample, which you can then port into a database as you scale up.
The basic logic we’d apply for tagging a keyword as “informational” in Sheets would be:
That formula tags any keyword containing either “how to” or “tutorial” (which makes the regexmatch true) as having an “informational” intent:
Having a prototype in Sheets allows you to at least validate your intent tagging logic on a segment of ~100k keywords, to make sure the output will be helpful to your team.
So do we stop at Google Sheets after validating our prototype, or do we push onwards to SQL or Python?
It all depends on the size of data we’re working with.
Regardless of whether we choose to stay in Sheets or move to a database, the important thing is getting all of our data in one place, so that we can analyze it consistently.
The deepest rabbit hole you can fall into is not being able to repeat your work at will – chasing ghosts as you try to put the pieces back together.
So if we’re working in Sheets, we’d use Add-ons like Blockspring or Supermetrics to pull in keyword data from Google Search Console directly into our Sheet.
If we’re working in BigQuery, we use our own BigQuery Recipes internal tool to pull data from the GSC API into our project.
Just for fun, here’s how we’d execute that same keyword intent tagging Regex in BigQuery SQL:
CASE
WHEN regexp_contains(search_query, {{informational_intent_regex}}) THEN ‘informational’
WHEN regexp_contains(search_query, {{transactional_intent_regex}}) THEN ‘transactional’
ELSE ‘untagged’ END as keyword_search_intent
Note: If you’ve been eager to learn SQL in BigQuery, pick up our free course here.
Please, my friends – if you’re writing a Python script that you plan on sharing with your team, please set it to run on your database!
That way, your team can more easily reproduce your results, vs having to assemble your dataset from the ground up before running the script.
You can easily use Google Cloud Functions to run a script on your BigQuery project.
If you’re curious, we’d write a similar Regex matching script in Python like so:
informational_regex = re.compile(‘informational_regex’)
transactional_regex = re.compile(‘transactional_regex’)
for search_query in search_queries:
if informational_regex.match(search_query):
intent = “informational”
if transactional_regex.match(search_query):
intent = “transactional”
These approaches (Sheets, SQL and Python scripting) all output the exact same results, and are roughly the same number of lines of code / formulas.
So it all comes down to data size, as well your personal comfort with each toolbelt.
Because you still need to ask yourself…
As we like to say at CIFL: “SOP or it didn’t happen.”
See our Website Quality Audit for an example of how we write data analysis SOPs.
If you can’t clearly delineate what’s happening under the hood of your data analysis process, you’re in trouble.
So we must ask…Can the rest of your team reproduce the analysis on their own, without you?
Does your Sheet have a ‘Getting Started’ tab, that lays out the exact steps someone can take to reproduce your analysis?
Does it have a ‘Summary’ tab, that contains everything folks truly need to know about the results?
We highly recommend using a framework like DBT when building SQL models.
DBT allows you to store your SQL files in neat folders of a Github repo, that can be accessed by anyone that needs them.
No more storing SQL queries in random files on your desktop!
And, most importantly, if you’re using a database like BigQuery, it’s super easy to produce reports with a visualization tool like Data Studio:
Just like SQL models built with DBT, we highly recommend that any Python scripts you write live on Github, rather than on your local machine.
Your local machine is an intellectual cul-de-sac where good scripts turn bad.
Publishing a Git repo forces you to write a Readme (which can contain key links to results), and generally makes you appear to your team / clients that you’ve got your act together.
If you think you’re going into a rabbit hole right now, hang up and dial 911.
OK, maybe don’t do that – but definitely step back for a moment, take a deep breath or a walk around the block, and then come back to your keyboard.
If you’re not rushing your work, you’re at a much lower risk
But to wrap up, for the maximum Shiny-object-syndrome risk deterrent, ask yourself these four questions before kicking off a project:
Hope these help you keep your head out of technical rabbit holes!
If SQL ends up being your path of choice, our Getting Started with BigQuery course is here to help.
And if you’re curious to see how our team builds these tools internally, inquire about working with CIFL.
Get off the ground quickly, with customizable data pipeline Recipes for BigQuery.
GET COOKING