Log file analysis involves reviewing the data stored by a website’s servers in the form of log files, which record every request made to the site. This process is an essential part of technical SEO.
In SEO, log file analysis provides valuable insights into how Googlebot and other web crawlers interact with a website. By examining log files, you can identify problematic pages, understand the crawl budget, and gain other critical information related to technical SEO.
To better understand log file analysis, it’s important to first know what log files are. These records are created by servers and contain data about each request made to the site, including the IP address of the requesting client, the type of request, the user agent, a timestamp, the requested resource URL path, and the HTTP status code.
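To make those fields concrete, here is a minimal sketch of parsing a single entry in the widely used Apache "combined" log format. The sample line below is hypothetical, and real log formats vary by server configuration, so treat the regex as illustrative rather than universal.

```python
import re

# Regex for the Apache "combined" log format (a common default).
# Captures: client IP, timestamp, request method/path, status, size,
# referrer, and user agent.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# A hypothetical sample entry recording a Googlebot request.
sample = ('66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] '
          '"GET /blog/post HTTP/1.1" 200 5120 "-" '
          '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')

match = LOG_PATTERN.match(sample)
if match:
    entry = match.groupdict()
    print(entry["ip"], entry["path"], entry["status"])  # 66.249.66.1 /blog/post 200
```

Each named group maps directly to one of the fields listed above, which is what makes log data straightforward to filter and aggregate once parsed.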
Log files contain a wealth of information, but they are typically stored for a limited time, depending on the website’s traffic and data accumulation rate.
Log file analysis is critical in technical SEO because it provides valuable insights into how Google and its crawlers interact with your website. By examining log files, you can track:
Log file analysis provides answers to important questions related to search engine crawling behaviors and helps you make informed decisions about website optimization. By understanding which content is being crawled and how often, you can improve your website’s visibility and performance in search engine results.
You’ll find a high-level overview of the steps needed to complete a log file analysis below.
Log files are kept on the server, so you will need access to download a copy. The most common way of accessing the server is via an FTP client - such as FileZilla, which is free and open source - but you can also do it through the file manager in your server’s control panel.
The actual steps you will have to take to access the log files depend on the web hosting solution you are using.
Keep in mind that there are certain issues you might encounter when trying to access log files:
Once connected to the server, you can retrieve the log files you’re interested in analyzing, which will most likely be the logs from search engine bots. Do note that you may have to parse the log data and convert it into the correct format before proceeding to the next step.
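As a sketch of that filtering step, the snippet below keeps only log lines whose user agent claims to be a search engine bot. The bot tokens and sample lines are illustrative; also note that user agents can be spoofed, so verifying a crawler by reverse DNS lookup of its IP is advisable for serious analysis.

```python
# Substrings that commonly identify major search engine crawlers.
# Illustrative only - extend the tuple for other bots you care about.
BOT_TOKENS = ("Googlebot", "bingbot")

def is_bot_request(line: str) -> bool:
    """Return True if the raw log line's user agent mentions a known bot."""
    return any(token in line for token in BOT_TOKENS)

# Hypothetical raw log lines: one Googlebot hit, one regular browser hit.
raw_lines = [
    '66.249.66.1 - - [...] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [...] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

bot_lines = [line for line in raw_lines if is_bot_request(line)]
print(len(bot_lines))  # 1
```

Filtering this way before any further processing keeps the dataset focused on crawler behavior rather than regular visitor traffic.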
You could simply import the data to Google Sheets. However, the data can quickly add up, even if you filter it for requests from Googlebot within a limited time frame - and you will likely have to comb through tens, if not hundreds, of thousands of rows.
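As a middle ground between a spreadsheet and dedicated software, a short script can summarize the rows for you. A sketch, assuming the log lines have already been parsed into (path, user agent) pairs, that counts Googlebot requests per URL path:

```python
from collections import Counter

# Hypothetical pre-parsed rows: (requested path, user agent).
rows = [
    ("/", "Googlebot/2.1"),
    ("/pricing", "Googlebot/2.1"),
    ("/", "Googlebot/2.1"),
    ("/old-page", "bingbot/2.0"),
]

# Count how often Googlebot requested each path.
hits = Counter(path for path, ua in rows if "Googlebot" in ua)
for path, count in hits.most_common():
    print(path, count)
# / 2
# /pricing 1
```

A summary like this turns hundreds of thousands of raw rows into a short, sortable table of crawl frequency per URL.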
The more efficient - and most certainly less time-consuming - option would be to use specialized software designed to do the manual work for you.
Here are a few examples of tools you can use for log file analysis:
You can also use Ahrefs’ Site Audit to get more data and then combine it with the log file data. Combining data from two different sources and including information about your website’s traffic, crawl depth, indexability, internal links, and status codes will provide in-depth insights.
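To show what combining the two sources might look like, here is a sketch of joining crawl-audit data with log-derived hit counts on the URL. The field names and values are illustrative assumptions, not Ahrefs’ actual export format.

```python
# Hypothetical log-derived data: Googlebot hits per URL.
log_hits = {"/": 120, "/pricing": 8, "/old-page": 0}

# Hypothetical crawl-audit data: crawl depth and status code per URL.
audit = {
    "/": {"depth": 0, "status": 200},
    "/pricing": {"depth": 1, "status": 200},
    "/old-page": {"depth": 4, "status": 404},
}

# Merge on URL so each row carries both audit fields and bot activity -
# useful for spotting, e.g., deep pages that still consume crawl budget
# or error pages that bots keep requesting.
combined = {
    url: {**audit.get(url, {}), "googlebot_hits": log_hits.get(url, 0)}
    for url in set(log_hits) | set(audit)
}
for url, row in sorted(combined.items()):
    print(url, row)
```

Once merged, a simple filter (for example, status 404 with nonzero hits, or depth above 3 with few hits) surfaces the pages worth investigating first.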
Here are some things to pay attention to as you go over your data: