Segmentation is the key to meaningful analysis.

Analysis at the level of the entire website only shows average tendencies, whereas analysis at a local level gives you actionable SEO data.

In OnCrawl, the default Segmentation is based on URL patterns (e.g. oncrawl.com/solutions/, oncrawl.com/blog/).
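As an illustration of how URL-pattern Segmentation works, here is a minimal sketch in Python. The segment names and patterns are hypothetical examples, not OnCrawl's actual implementation: each segment is simply a regular expression tested against the URL path, and the first match wins.

```python
import re

# Hypothetical segment rules, mirroring a default URL-pattern Segmentation.
# Each segment is (name, compiled pattern tested against the URL path).
SEGMENT_RULES = [
    ("Solutions", re.compile(r"^/solutions/")),
    ("Blog", re.compile(r"^/blog/")),
]

def segment_for(path: str) -> str:
    """Return the name of the first segment whose pattern matches the path."""
    for name, pattern in SEGMENT_RULES:
        if pattern.search(path):
            return name
    return "Other"  # pages matching no rule fall into a catch-all group

print(segment_for("/blog/seo-tips"))  # Blog
print(segment_for("/solutions/"))     # Solutions
print(segment_for("/about/"))         # Other
```

Because the rule only looks at the URL string, the same grouping can be applied to any dataset that contains URLs.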

URL patterns are the only information available from every data source: the Crawl, Google Analytics, the Logs, and any third-party datasets uploaded to OnCrawl.

This means that URL-based Segmentation applies to all of your reports (Crawl Reports, Logs Monitoring, SEO Impact Reports, Ranking Reports…).

On the other hand, there are as many ways to create Segmentations as there are SEO metrics in OnCrawl, such as:

  • Word count
  • Inlinks range
  • SEO Sessions
  • Google bot hits
  • Page positions…

And there are as many ways to create Segmentations as there are pieces of information on your pages:

  • Google Tags
  • Article publication dates
  • Sold-out product pages…

So Segmentation can be defined on any relevant metric among OnCrawl's 450 SEO metrics, and on any relevant information found in your page content.

But this doesn't mean that all of these metrics and pieces of information are present in OnCrawl by default.

For instance:

  • Word count and Inlinks range are computed from the Crawl
  • SEO Sessions come from Google Analytics
  • Google bot hits come from the Logs
  • Page positions come from Google Search Console
  • Google Tags, article publication dates and sold-out product pages come from Data Scraping (using RegEx)
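To make the last point concrete, here is a minimal sketch of the kind of RegEx used for Data Scraping. The HTML snippet and the `article:published_time` meta tag are illustrative assumptions, not a guaranteed page structure; the idea is simply that a capture group pulls the publication date out of the page source.

```python
import re

# Hypothetical HTML snippet; in practice this would be the crawled page source.
html = '<meta property="article:published_time" content="2023-05-17T09:00:00Z">'

# Capture the YYYY-MM-DD publication date from the meta tag.
match = re.search(
    r'property="article:published_time"\s+content="(\d{4}-\d{2}-\d{2})',
    html,
)
publication_date = match.group(1) if match else None
print(publication_date)  # 2023-05-17
```

Once scraped, a field like `publication_date` can serve as the basis for a custom Segmentation (e.g. grouping pages by publication year).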

Want to learn more about Segmentation using RegEx?

  • How to use RegEx to create filters
  • How to segment groups of pages based on your Data Layer

Before creating a custom Segmentation, make sure that the metric or information needed to define the Segmentation's rules exists in OnCrawl (either from the Crawl or from third-party datasets uploaded or scraped to enrich the analysis).

Orphan page exception:

To crawl a website, OnCrawl's bots follow all the links from a start URL, just as Google does. So OnCrawl's bots won't crawl orphan pages, because these pages are no longer linked in the website structure.

It also means that OnCrawl won't retrieve any information about orphan pages, and that orphan pages cannot be segmented using crawl data, only using third-party datasets (GA, GSC, Logs…).
