If you have dipped your toes into the extensive waters that are SEO, you may know that it can be broken down into three key areas: on-page SEO, off-page SEO and, lesser known to the average Joe, technical SEO. But what are the differences between the three? Put simply, on-page SEO is the practice of optimising individual pages to increase their visibility in search and drive more traffic (MOZ). Off-page SEO is, as the name indicates, any work done outside of a website to increase its visibility and traffic (Ahrefs).
Lastly, technical SEO focuses on optimising a website as a whole for crawling and indexing. More broadly, technical SEO can also include any process intended to improve visibility in search (MOZ).
While on-page and technical SEO are similar in the sense that the work is generally completed on the website itself, if a website’s technical SEO elements are not optimised, specific on-page efforts will probably be less successful (or not recognised at all by search engines!). With this in mind, I think we can agree that conducting a technical SEO audit and fixing issues relating to crawlability and indexability should be the first port of call when working on a website. This post covers some of the key elements of a technical SEO audit and the tools you can use. Stay tuned for a deep dive on each of these audit elements in future posts.
The Technical SEO Audit – What Do I Need To Look At?
There are many elements involved in a technical SEO audit – it can be difficult to know where to start! Read on to learn about several of the key tasks that should be completed during a technical SEO audit:
Check your website’s robots.txt file
According to Google, a robots.txt file is a set of instructions that tells web crawlers which pages, images or other files can or can’t be requested from your website to show in search results. The crawl instructions in this file are specified with ‘allow’ and ‘disallow’ rules. To check your website’s robots.txt file, enter your domain in a browser’s address bar, followed by /robots.txt (e.g. https://www.example.com/robots.txt).
When entered correctly, your robots.txt file will appear in text format like in Image 1 below. Every website’s robots.txt instructions are different – the allow and disallow rules for an eCommerce store will differ from those of a service-based website. It is important to note that if your website displays ‘Disallow: /’ under a user-agent that applies to search engines, your robots.txt file is telling crawlers not to crawl your entire website – which will usually keep it out of search results. See Image 1 below for an example of a robots.txt file that blocks crawlers from an entire website, as well as one that allows crawlers to access all content on a website.
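If you’d like to check rules programmatically rather than by eye, Python’s built-in robotparser can do it. This is a minimal sketch – the domain and the rules below are hypothetical, purely for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration only
robots_txt = """User-agent: *
Disallow: /cart/
Allow: /"""

parser = RobotFileParser()
# In practice you would fetch the live file instead:
#   parser.set_url("https://www.example.com/robots.txt"); parser.read()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("*", "https://www.example.com/products/"))       # True
print(parser.can_fetch("*", "https://www.example.com/cart/checkout"))   # False
```

Running this against a live robots.txt quickly tells you whether an important URL is accidentally blocked from crawling.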
Check your website’s sitemap to ensure all important pages are present
A website sitemap is a file that lists all the important pages, videos and other files on a website (Google). A sitemap can generally be viewed by adding /sitemap.xml or /sitemap_index.xml to the end of your website’s domain (e.g. https://www.example.com/sitemap.xml). Be aware that the appearance of a sitemap varies from website to website – some sites contain a single list of URLs, while others have sitemap categories within their main sitemap file. See Images 2 and 3 below for examples of what a sitemap can look like. If you are using a WordPress website, Yoast SEO is a popular plugin for keeping your sitemap in check.
Keep in mind that having a sitemap doesn’t mean search engines won’t crawl and index resources that are not in the file. If pages, images and files can be found and crawled on a website, they often will be. By having a sitemap, however, you’re not only letting crawlers know what is important, but also making them aware of anything that is not easily found (e.g. a page with no links pointing to it) and helping them understand your website architecture better (Yoast, MOZ).
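To audit a sitemap against the pages you expect to rank, you can pull out every URL it lists. Here’s a minimal sketch using Python’s standard XML parser – the sitemap content is a hypothetical snippet; in a real audit you would fetch your live /sitemap.xml and parse the response body:

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap.xml content for illustration
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about/</loc></url>
  <url><loc>https://www.example.com/blog/technical-seo/</loc></url>
</urlset>"""

# Sitemap files use the sitemaps.org namespace, so register it for lookups
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap_xml)
urls = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]

for url in urls:
    print(url)  # compare this list against the pages you expect to rank
```

Diffing this list against a full site crawl is a quick way to spot important pages missing from the sitemap (or stale URLs that shouldn’t be there).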
Check your website’s speed
Having a fast website, particularly the mobile version, is critical. As of July 1, 2019, mobile-first indexing has been enabled by default in Google for all new websites, with older and existing websites being assessed against Google’s best practice standards until they are recognised as ready. You will know your website has transitioned to mobile-first indexing when you log into Google Search Console, as you’ll see a notification appear like Image 4 below.
But is website speed really that important? Think With Google research has found that as a page’s load time increases, the bounce rate (the rate of users leaving your website without taking any action) rises significantly. A page load time increasing from just 1 second to 3 seconds raises the probability of a user bouncing by 32%! Image 5 below provides more insight into how longer load times increase the probability of a user bouncing. Needless to say, because of its impact on user experience, page speed is considered an important ranking factor by search engines and is also taken into account by Google Ads (MOZ, Google).
So, how can you test your website’s speed? There are a number of tools that are great for looking at speed and identifying issues that need to be fixed. Helpful speed tools include:
Google PageSpeed Insights
An online tool to test both your desktop and mobile speed, Google PageSpeed Insights uses data analysed by Lighthouse, an open-source tool that reviews web page performance, particularly speed. Lighthouse is also available as a free Chrome extension.
Pingdom Website Speed Test
The Pingdom Website Speed Test can be used to test a website’s page load time and uncover any issues that require attention. You can choose the test location – be sure to test from either the location that is closest to your server, or from where the majority of your web traffic is coming from. You can find out this information from your Google Analytics account.
WebPageTest
Webpagetest.org can be used to test website speed from different locations and browsers. When selecting a browser, choose Google Chrome or the browser most commonly used by your page visitors – or test more than one! Browser usage data can also be found in Google Analytics.
Check your website’s HTTP response codes for errors and other issues
I think we can all agree that one of the worst things for website user experience is landing on an error page, particularly one that doesn’t direct the user back to a relevant page on the website. Therefore, it’s important to check your website for errors during a technical SEO audit. When auditing your website for errors, you’re reviewing your web page’s response codes. Response codes let you know what happens when a page, image or other file is requested from your website’s server. Common response codes you will come across include:
1xx – Informational responses. Status codes like 100, 101, 102 and 103 mean that the request is still being processed by your website’s server.
2xx – Success responses. The request has been processed successfully by your website’s server and the resource has been returned. The code you’ll most often encounter is 200, which means the page is okay.
3xx – Redirection responses. A 3xx response code means that the page, image or other file requested has been redirected to another resource. The two most common 3xx response codes are 301 and 302. A 301 means the page has been redirected to another location permanently; a 302 means the redirect is temporary. A temporary redirect should only be used if you plan to bring the page back eventually – for example, during new page testing or while a website update or redesign is taking place. When a 302 is applied, the original page typically remains indexed in search results, and 302s have traditionally been thought to pass little or no link authority to the destination URL.
4xx – Client error responses. A 4xx response code means that the page, image or other file requested from the server could not be reached or is no longer valid. The most common 4xx code you’ll come across is 404 – page not found. When completing an audit, compile a list of 4xx pages and, for each one, either redirect it to a relevant location, remove the links pointing to it (an internal link from a blog post, for example) or ensure there is a custom 404 error page in place that is helpful to website users. Custom 404 pages are particularly important for eCommerce stores, where products run out of stock.
5xx – Server error responses. A 5xx response code means that your website’s server has failed to complete the request that has been made for a page or your website. Common 5xx error codes include:
- 500 – Internal server error. There is an error with the server that is preventing the successful request of a page, image or other resource.
- 501 – Not implemented. This means that your website server does not recognise the request method and cannot fulfill the request.
- 502 – Bad gateway. A 502 occurs when a website sits behind a ‘gateway’ (also known as a proxy server) and that gateway receives an invalid response – or none at all – from the upstream server when a request is made to your website.
- 503 – Service unavailable. This error code means that your website’s server has been overloaded by other requests (e.g. a spike in traffic to your website, other tools or bots crawling your website or even DoS and DDoS cyber attacks) or your website is undergoing planned maintenance downtime.
- 504 – Gateway timeout. Similar to a 502, except that the upstream server’s response takes too long to arrive rather than never arriving.
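When working through a long crawl export, it helps to bucket each code into the audit action it calls for. Here’s a small sketch of that triage logic – the URLs and codes in the usage example are illustrative; in practice the codes would come from a crawler such as Screaming Frog:

```python
def classify_status(code: int) -> str:
    """Map an HTTP response code to the audit action suggested above."""
    if 100 <= code < 200:
        return "informational - request still being processed"
    if 200 <= code < 300:
        return "success - no action needed"
    if 300 <= code < 400:
        return "redirect - check whether 301 vs 302 is intentional"
    if 400 <= code < 500:
        return "client error - redirect, remove link, or serve a custom 404"
    if 500 <= code < 600:
        return "server error - investigate hosting/server configuration"
    return "unknown"

# Illustrative audit rows: (URL, status code from a crawl export)
for url, code in [("/old-page", 301), ("/missing", 404), ("/", 200)]:
    print(url, code, "->", classify_status(code))
```

Grouping a crawl export this way gives you an immediate to-do list: one pile of redirects to verify, one pile of 4xx pages to fix and one pile of server errors to escalate.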
Check your website’s architecture
Website architecture refers to the structure of your site’s pages and how they are linked together. Often overlooked, good website structure is important for both user experience and web crawlers – crawl bots might not crawl and index pages that are several clicks away from your home page, or that are not linked to from any other pages on your website (Backlinko).
Optimised website architecture is logical, avoids excessive menu headings, uses effective internal linking and has a shallow navigation structure – generally, a site where every page can be reached in three or fewer clicks (Search Engine Journal).
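Crawl depth is easy to reason about as a breadth-first search from the home page: a page’s depth is the minimum number of clicks needed to reach it. The sketch below runs that search over a hypothetical internal-link graph (all page paths are made up) and flags orphan pages that can’t be reached at all:

```python
from collections import deque

# Hypothetical internal-link graph: page -> pages it links to
links = {
    "/": ["/services/", "/blog/"],
    "/services/": ["/services/seo/"],
    "/blog/": ["/blog/post-1/"],
    "/services/seo/": [],
    "/blog/post-1/": ["/deep-page/"],
    "/deep-page/": [],
    "/orphan/": [],  # exists on the site but nothing links to it
}

def crawl_depths(start: str = "/") -> dict:
    """Breadth-first search from the home page: depth = clicks from home."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

depths = crawl_depths()
for page in links:
    d = depths.get(page)
    print(page, "unreachable (orphan)" if d is None else f"{d} clicks from home")
```

Pages missing from the result, or sitting at a depth of four or more, are the ones the three-click guideline says to surface with better internal linking.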
How can you review your website’s architecture? Screaming Frog, SEMrush and Ahrefs are all great resources for checking your website’s structure and internal linking. See below for what each tool can help with:

Screaming Frog
- Viewing the crawl depth of pages
- Creating web crawl visualisations. Web crawl visualisations show how Screaming Frog’s spider has crawled your website’s pages by the shortest path from the starting URL provided.
- Creating directory tree visualisations. Directory tree visualisations show the URL structure of a website through the use of nodes. The nodes don’t always reflect resolving URLs and the connecting lines do not represent hyperlinks (Screaming Frog).

SEMrush
- Viewing your website’s link architecture
- Viewing your website’s internal link rank (ILR). SEMrush’s ILR measures the importance of pages in terms of link architecture.
- Identifying internal linking issues
- Identifying pages that pass on the most link value
- Viewing the crawl depth of your site’s pages in an easily digestible graph
- And more!

Ahrefs
- Viewing the crawl depth of your site’s pages
- Identifying internal linking issues
- Viewing pages with no incoming links
- Viewing the internal links pointing to important and unimportant pages
- Viewing internal links pointing to redirected pages
- And more!
Watch this space for a deep dive into each of these SEO tools and how each one can benefit you!
Check your website for duplicate content
Google defines duplicate content as substantial blocks of content, within or across domains, that either completely match other content or are appreciably similar. Duplicate content can be anything from two identical pages on your website, to multiple pages using the same blocks of text, to matching content across different domains.
Duplicate content can be considered problematic for a number of reasons:
- Search engines don’t know which version of the content is the most important to index and show in the SERP (search engine results page). In most instances, they will only show one version of the content – it might not be the intended version, or even your website’s version.
- Search engines don’t know how to distribute link metrics to the duplicate pages – meaning that trust, authority and link equity may be given to the page deemed most relevant to the search engine, or split across the duplicate pages, diluting the strength of the page and its ability to rank as it normally would in search results (MOZ).
How can you check your website for duplicate content? SEO tools like SEMrush, Ahrefs, Screaming Frog and Siteliner are great for detecting duplicate site-specific content issues across areas such as content, title tags, meta descriptions, heading ‘h’ tags and URLs. Copyscape is a free online tool that can be used to check for duplicate copies of individual URL content across the web.
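The simplest form of duplicate detection – the exact-match case – can be sketched in a few lines: normalise each page’s copy, hash it, and group pages whose hashes collide. The page copy below is hypothetical; in a real audit the text would come from a crawl of your site. Note this only catches identical copy, not the “appreciably similar” pages that tools like Siteliner detect with similarity scoring:

```python
import hashlib
from collections import defaultdict

# Hypothetical page copy keyed by URL, standing in for crawled body text
pages = {
    "/shirts/blue": "Our classic shirt is made from 100% organic cotton.",
    "/shirts/red": "Our classic shirt is made from 100% organic cotton.",
    "/about": "We are a small team of makers based in Melbourne.",
}

def fingerprint(text: str) -> str:
    """Normalise case and whitespace, then hash so identical copy collides."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode()).hexdigest()

groups = defaultdict(list)
for url, text in pages.items():
    groups[fingerprint(text)].append(url)

duplicates = [urls for urls in groups.values() if len(urls) > 1]
print(duplicates)  # groups of pages sharing identical body copy
```

Each group in the output is a candidate for consolidation – typically by canonicalising or redirecting to one preferred version.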
Website protocol – is your website served in HTTPS?
When you visit a website, you’ll notice that the address in the URL bar begins with either http:// or https://.
For those not familiar, a website served over HTTPS is considered secure. This means an SSL certificate has been installed and information passed between the browser and the website’s server is encrypted, keeping it private from eavesdroppers (Search Engine Watch).
An HTTP website, on the other hand, is ‘insecure’, meaning that information is not encrypted. Attackers can intercept information in its raw form, which can then be used in malicious ways. It is critical that an SSL certificate is installed, particularly on websites where personal information such as names, addresses, phone numbers and credit card details is collected (websites with contact forms, eCommerce websites).
Not only is a secure website considered a ranking factor by Google, but users are more likely to stay on and engage with a secure website. HubSpot’s 2017 consumer survey even found that 82% of respondents would not continue browsing a website they did not see as secure!
The above is not an exhaustive list of everything you need to look at during a technical SEO audit, however, by completing these tasks you will be on the right track to ensuring your website is easily crawlable and indexable by search engines – providing a strong foundation for your targeted on-page SEO efforts.