9th June 2017
It’s painful to see an important ranking drop at the best of times. It’s worse when you learn the drop was avoidable. Whether the client changed a tag last week and didn’t tell you, or your internal team pushed an unexpected update, it’s easy for content updates to slip through the cracks, whatever the size of your team. But, clearly, you can’t spend your whole week trawling through changelogs. What if you could be notified of important changes automatically and prevent the problem in the first place? Enter Python, a simple programming language, to make your life a lot easier.
Anita Valentinova gave a great talk on using Python to automate checks at a recent BrightonSEO, and she also prepared these really helpful Python scripts, which we’ll be walking through in this post. (We’ve also added some troubleshooting tips and customisation notes at the end.) Once set up, Anita’s scripts will compare a list of your expected page titles and meta descriptions against the live versions of these tags, record any changes in your spreadsheet, and then send you an email to let you know. Handy, right?
Step 1: Write Your List
First, let’s prepare our spreadsheet of URLs to crawl. For this, we need a list of addresses and their expected page titles and meta descriptions. (If you haven’t got a URL list already, you can crawl your site with Screaming Frog SEO Spider.) Then add this information to your sheet, where column A is for the URL, B for the Page Title and C for the Meta Description.
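To make the column layout concrete, here is a minimal sketch of how rows in that shape could be checked against live values. It is not taken from Anita’s scripts: the names `ExpectedRow` and `find_changes` and the sample URLs are made up for illustration, and the live values are hard-coded rather than crawled.

```python
from typing import NamedTuple


class ExpectedRow(NamedTuple):
    """One spreadsheet row: columns A, B and C."""
    url: str
    title: str
    description: str


def find_changes(expected_rows, live_tags):
    """Return (url, field, expected, live) for every tag that differs.

    live_tags maps a URL to the (title, description) found on the live page.
    """
    changes = []
    for row in expected_rows:
        live_title, live_description = live_tags[row.url]
        if live_title != row.title:
            changes.append((row.url, "title", row.title, live_title))
        if live_description != row.description:
            changes.append(
                (row.url, "description", row.description, live_description)
            )
    return changes


# Hypothetical data standing in for the spreadsheet and the crawl results.
expected = [
    ExpectedRow("https://example.com/", "Home | Example", "Welcome to Example."),
    ExpectedRow("https://example.com/about", "About | Example", "Who we are."),
]
live = {
    "https://example.com/": ("Home | Example", "Welcome to Example."),
    "https://example.com/about": ("About us | Example", "Who we are."),
}

for change in find_changes(expected, live):
    print(change)
```

Only the second URL is reported here, because its live title no longer matches column B.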
Step 2: Get Started with Python
Now, it’s time to get started with Python. If this is your first time using Python don’t worry! It’s not too complicated and you won’t have to change anything once you’ve set it up; you’ll just benefit from the results and look great to your team and clients! Here we go:
- Install the latest version of Python – make sure to tick the box in the installer that says ‘Add Python to PATH’
- Create a new folder in a memorable place called ‘Scripts’
- Download Anita’s scripts and extract them to your new ‘Scripts’ folder. There are 2 files: the main script (‘meta.py’) and a configuration script (‘config_script_check.py’), which holds information about where your spreadsheet is stored and where to send the email
- Download the latest version of Chromedriver and save chromedriver.exe in your ‘Scripts’ folder. Chromedriver lets Python navigate web pages.
Before we can run Anita’s scripts, we must install a few extra Python modules. (Essentially, these are groups of functions that mean you don’t have to write your own code.) These are:
- datetime – tells the script the current date and time, necessary for recording updated values in your spreadsheet (this one ships with Python as part of the standard library)
- selenium – this operates Chromedriver to pull the data from your URLs
- config – a configuration module
- openpyxl – allows Python to read and write spreadsheets
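To give a feel for what ‘pulling the data’ from a page involves, here is a small sketch that picks the title and meta description out of a piece of HTML. Note the swap: Anita’s script uses selenium and Chromedriver for this, whereas the sketch uses `html.parser` from the standard library so that it runs without a browser. The class name `TagExtractor` and the sample HTML are invented for the example.

```python
from html.parser import HTMLParser


class TagExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_tags(html):
    """Return (title, description) for a string of HTML."""
    parser = TagExtractor()
    parser.feed(html)
    return parser.title.strip(), parser.description.strip()


# Hypothetical page source standing in for a live URL.
sample = """<html><head>
<title>Home | Example</title>
<meta name="description" content="Welcome to Example.">
</head><body></body></html>"""

print(extract_tags(sample))
```

A real browser, as driven by selenium, also runs JavaScript before the tags are read, which a plain parser like this one does not.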
Installing these modules is really easy; with Python installed, just open the Command Prompt and type ‘pip install’ followed by the package name, then press enter. (If you have problems with pip, see ‘Troubleshooting’ below.) It will then download and install the necessary files, e.g.:
pip install openpyxl
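If you’re not sure which modules are already in place, a quick check along these lines can tell you. This is just a sketch using the standard library’s `importlib`; the helper name `missing_modules` is made up, and the module names are the ones listed above.

```python
import importlib.util

# Module names taken from the list above.
REQUIRED = ["datetime", "selenium", "config", "openpyxl"]


def missing_modules(names):
    """Return the names Python cannot find on this machine."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for name in missing_modules(REQUIRED):
    print("Not installed yet, try: pip install " + name)
```

If nothing is printed, Python can find every module in the list.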
Step 3: Configure and Run the Scripts
Next, we need to prepare Anita’s ‘config_script_check.py’ file; go ahead and open this in your preferred text editor (I’m using Brackets). The file is made up of 2 parts:
- Line 2 describes where our Excel sheet is stored, so replace this with the path for your file. For example:
EXCEL_FILE = 'C:\\Username\\Documents\\input.xlsx'
- Lines 4-9 are for setting up email notifications; we’re going to set up a new Gmail address to act as our SEO bot, but you can use any email system that uses SMTP. If using Gmail, just make sure that your Google account’s setting ‘Allow less secure apps’ is set to ‘ON’ from the Connected Apps & Sites page. Next, simply add your new email address, its password and a recipient email address to ‘config_script_check.py’:
EMAIL_FROM = 'firstname.lastname@example.org'
EMAIL_PASSWORD = 'password'
EMAIL_SERVER = 'smtp.gmail.com'
EMAIL_PORT = 587
EMAIL_TO = 'email@example.com'
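For the curious, here is roughly how settings like these can be used to send a notification. It’s a sketch built on the standard `smtplib` and `email` modules, not the code from ‘meta.py’; the function names `build_message` and `send_message`, the subject line and the sample change text are all assumptions.

```python
import smtplib
from email.message import EmailMessage

# Placeholder values in the same shape as the config file.
EMAIL_FROM = "firstname.lastname@example.org"
EMAIL_PASSWORD = "password"
EMAIL_SERVER = "smtp.gmail.com"
EMAIL_PORT = 587
EMAIL_TO = "email@example.com"


def build_message(sender, recipient, changes):
    """Create a plain-text email listing one change per line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "{} meta tag change(s) detected".format(len(changes))
    msg.set_content("\n".join(changes))
    return msg


def send_message(msg, server, port, user, password):
    """Send over SMTP, upgrading the connection with STARTTLS first."""
    with smtplib.SMTP(server, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


# Hypothetical change text; nothing is sent until send_message is called.
message = build_message(
    EMAIL_FROM, EMAIL_TO, ["https://example.com/about: title changed"]
)
print(message["Subject"])
# send_message(message, EMAIL_SERVER, EMAIL_PORT, EMAIL_FROM, EMAIL_PASSWORD)
```

Port 587 is the usual port for SMTP with STARTTLS, which is why the sketch calls `starttls()` before logging in.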
And we’re done! Save and close that config file (make sure your spreadsheet is closed too), then double click on ‘meta.py’ to run the script! It should look something like this:
There you have it – check your inbox to see if anything’s changed. All updates are also recorded in a new sheet in your Excel file for you. This is just one use for Python in SEO and there are plenty more, so get involved and automate your checks.
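As an illustration of what a recorded update might look like, the sketch below stamps each change with the current date and time using `datetime`. The column order and the helper name `to_log_rows` are assumptions rather than the layout Anita’s script actually writes; with openpyxl, each resulting row could be passed to a worksheet’s `append` method.

```python
from datetime import datetime


def to_log_rows(changes, now=None):
    """Prefix each change with a timestamp so it can be added to a sheet."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M")
    return [[stamp, url, field, old, new] for url, field, old, new in changes]


# Hypothetical change: (URL, field, expected value, live value).
changes = [
    ("https://example.com/about", "title", "About | Example", "About us | Example"),
]

for row in to_log_rows(changes):
    print(row)
```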
Troubleshooting & Customisation
Here are a few tips for getting more out of this script, and some things to try in case it breaks:
- ‘pip install’ not working – Type this instead, followed by the package name:
py -m pip install
If it still does not work, uninstall Python and select ‘Custom install’ when re-installing and select the ‘environment variable’ checkbox (don’t forget to check Add Python to PATH too).
- Non-ASCII encoding error – If you get an error that looks something like this when the script attempts to send an email:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 2: ordinal not in range(128)
Navigate to your Python install folder, open the Lib folder and locate ‘smtplib.py’. Open this in your text editor and change line 854 (the exact line number may differ between Python versions) to read:
msg = _fix_eols(msg).encode('ascii', errors='ignore')
This should fix the problem 🙂
- Personalise your email message – To adjust this, open ‘meta.py’ in your text editor and scroll to line 89. Here, you can customise the email message to be whatever you want. At atom42, we host most of our documents in the cloud, so you can paste a document link here if you don’t have a shared drive.
- Customise your ‘From’ name – e.g. so that the email comes from ‘MattBot’ rather than ‘firstname.lastname@example.org’. To do this, replace line 101 in ‘meta.py’ with:
'From:' 'YOURNAME<' + email_from + '>,'
- Automate the script to run daily – Open Windows Task Scheduler and select ‘Create Task’. Give your task a name, switch to the ‘Triggers’ tab, add a new trigger based on a time and select a recurring frequency. Go to the ‘Actions’ tab and add a new action. This should be ‘Start a program’. Type ‘meta.py’ in the ‘Program/script’ box and paste the folder path to your Scripts folder in the ‘Start in’ box. Finally, go to the ‘Conditions’ tab and uncheck the ‘Start the task only if the computer is on AC power’ box.
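Two of the tips above concern the email itself: the encoding error and the ‘From’ name. The sketch below shows why the error happens, what the `errors='ignore'` fix does to the text, and one possible alternative that keeps special characters intact. It is an assumption-laden example, not a drop-in change for ‘meta.py’: the sample text, subject and recipient address are invented, and only standard library calls are used.

```python
from email.message import EmailMessage
from email.utils import formataddr

# Hypothetical notification text containing a non-ASCII character.
text = "Title changed to: Zürich city guide"

# Plain ASCII encoding fails on 'ü', which is what the error reports.
try:
    text.encode("ascii")
except UnicodeEncodeError as error:
    print("ASCII failed:", error.reason)

# errors='ignore' avoids the error by silently dropping the character.
print(text.encode("ascii", errors="ignore"))

# An EmailMessage declares UTF-8 instead, so the character survives,
# and formataddr builds a 'Name <address>' style From header.
msg = EmailMessage()
msg["From"] = formataddr(("MattBot", "firstname.lastname@example.org"))
msg["To"] = "email@example.com"
msg["Subject"] = "Meta tag changes"
msg.set_content(text)

print(msg["From"])
print(msg.get_content())
```

The trade-off is that `errors='ignore'` is a quick fix but loses characters from the message, while building the email with an explicit UTF-8 body keeps them and doesn’t require editing a file inside the Python install.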