You may want to use OnCrawl's SEO log monitoring solution without sharing all of your server log data with OnCrawl. Not a problem! You can filter your log lines and provide OnCrawl with only the log lines that are useful for SEO analysis.

This also has the advantage of reducing the size of your log files, making it faster and easier to transfer them to OnCrawl and to process them.

Here's how it works.

Filtering the right lines for SEO analysis

SEO analysis concentrates on two types of log lines:

  1. Search engine bot hits. In this case, these are bots with "google" or "Googlebot" in the full User Agent string. (We'll take care of verifying that these are actually Googlebots.)
  2. Organic user hits (SEO visits). These lines contain "google" (as in "https://www.google.com" or "https://www.google.ca") in the Referer field.

Make sure your filtered lines include both of these types of lines.
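As a minimal sketch of a filter that keeps both types of lines, the Python below checks the Referer and User Agent fields separately. It assumes each line ends with the quoted Referer and User Agent fields, as in the Apache/Nginx "combined" log format; your log format may differ. The function name and the sample lines are made up for illustration.

```python
import re

# Assumption: each line ends with "<referer>" "<user agent>",
# as in the Apache/Nginx "combined" log format.
TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def is_seo_line(line: str) -> bool:
    """Return True for Google bot hits and for visits referred by Google."""
    match = TAIL.search(line)
    if match is None:
        return False  # line does not look like the assumed format
    is_bot_hit = "google" in match.group("agent").lower()
    is_organic_visit = "google" in match.group("referer").lower()
    return is_bot_hit or is_organic_visit


# Made-up sample lines (documentation IP addresses, not real traffic).
bot_hit = ('192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" 200 2326 '
           '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')
organic_visit = ('203.0.113.7 - - [10/Oct/2023:13:56:02 +0000] "GET /product HTTP/1.1" 200 5120 '
                 '"https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"')
other_visit = ('198.51.100.4 - - [10/Oct/2023:13:57:41 +0000] "GET /about HTTP/1.1" 200 1024 '
               '"https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"')

kept = [line for line in (bot_hit, organic_visit, other_visit) if is_seo_line(line)]
```

Here `kept` holds the bot hit and the organic visit, and the third line is dropped.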

You may want to use a broader filter than necessary, and keep any and all lines that contain "google" anywhere in them.

Ideally, this filter should not be case sensitive.
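The broad, case-insensitive filter described above could look like the following sketch. The file names in the usage note are placeholders, and the function names are illustrative rather than part of any OnCrawl tool.

```python
import re
from typing import Iterable, Iterator

# Broad filter: keep any line that mentions "google", in any letter case.
GOOGLE = re.compile(r"google", re.IGNORECASE)


def filter_google_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that contain "google" anywhere in them."""
    for line in lines:
        if GOOGLE.search(line):
            yield line


def filter_log_file(source_path: str, target_path: str) -> None:
    """Copy the matching lines from one log file to another."""
    with open(source_path, encoding="utf-8", errors="replace") as source, \
            open(target_path, "w", encoding="utf-8") as target:
        target.writelines(filter_google_lines(source))
```

For example, `filter_log_file("access.log", "access.filtered.log")` would write only the matching lines to the second file, which is the one you would then transfer.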

Filtering the right lines for Ads bots and vertical bots

If you're also including lines for SEA and vertical bots, use a filter that is not case sensitive. This is particularly important, as these bots have varied naming conventions.

Example image bot:
Googlebot-Image/1.0 contains "Google" with an uppercase G only, so a case-sensitive filter for "google" would miss it.

Example GoogleAds bot:
Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html) contains both "Google" and "google".

Example "standard" (SEO) Googlebot:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) contains both "Google" and "google".
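To see why case sensitivity matters, here is a small sketch that runs a case-sensitive and a case-insensitive check for "google" against the three example user agents above. The variable names are illustrative only.

```python
# The three example user agent strings from this article.
user_agents = [
    "Googlebot-Image/1.0",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 "
    "(KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 "
    "(compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]

# Case-sensitive check: only matches a lowercase "google".
case_sensitive = ["google" in ua for ua in user_agents]

# Case-insensitive check: lowercase the string before searching.
case_insensitive = ["google" in ua.lower() for ua in user_agents]

print(case_sensitive)    # [False, True, True]
print(case_insensitive)  # [True, True, True]
```

The image bot is only caught by the case-insensitive check, because its user agent contains no lowercase "google".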

Going further

If you're interested in selecting certain bots only, you'll need to use more specific filters that directly target the specific bots you're interested in.
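As a rough sketch of a more specific filter, the Python below keeps only lines that mention a chosen set of bot tokens. The tokens are taken from the example user agents in this article and the selection is hypothetical; check the tokens against Google's published user agent strings before relying on them. Note that a filter like this keeps bot hits only, so organic visits would still need the Referer-based filter.

```python
import re

# Hypothetical selection: tokens copied from the example user agents above.
WANTED_TOKENS = ["Googlebot-Image", "AdsBot-Google-Mobile"]

# One case-insensitive pattern that matches any of the wanted tokens.
SPECIFIC_BOTS = re.compile(
    "|".join(re.escape(token) for token in WANTED_TOKENS),
    re.IGNORECASE,
)


def is_wanted_bot(line: str) -> bool:
    """Return True if the line mentions one of the selected bots."""
    return SPECIFIC_BOTS.search(line) is not None
```

With this selection, a line from Googlebot-Image/1.0 is kept, while a line from the standard Googlebot/2.1 is not, since "Googlebot/2.1" matches neither token.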

To filter for specific bots, refer to the user agent strings provided by Google in the Search Console Help.
