How to define rules of inclusion/exclusion for URL patterns in your crawl setup

How to tell the crawler what part of your site to analyze by including/excluding groups of URLs

Updated over a week ago

When you set up a crawl, you can define which URLs the crawler should or should not crawl by matching them against URL patterns.

To do so, go to the URL pattern filtering section of your crawl configuration.

Click on this section and then check Enable these filters.

Next, click the Add rules button to add regex rules that include or exclude the URLs matching them.

  1. A new window opens where you can enter the different inclusion or exclusion rules.

  2. Click the + icon to add another rule, or the - icon to remove an existing one.

  3. You can then provide a sample (a list of URLs) to check how your rules will work.

  4. Once you have listed some URLs, click the Check filters button. This shows the result for each sample URL: one icon indicates that the URL is taken into account in the crawl, and a different icon indicates that it is ignored.
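
Before pasting a regex into the rule window, it can help to try it locally against a few URLs. The sketch below is an illustration only: the pattern, the sample URLs, and the use of `re.search` (which matches anywhere in the URL) are assumptions, and the crawler's own matching behavior may differ.

```python
import re

# Hypothetical rule: match any URL that carries a numeric "page" query parameter.
pattern = re.compile(r"[?&]page=\d+")

# Hypothetical sample URLs, similar to the list you would provide in step 3.
sample_urls = [
    "https://www.example.com/blog/",
    "https://www.example.com/blog/?page=2",
    "https://www.example.com/shop/shoes?color=red&page=10",
]

# Record, for each URL, whether the pattern matches somewhere in it.
matches = {url: bool(pattern.search(url)) for url in sample_urls}

for url, matched in matches.items():
    print("match   " if matched else "no match", url)
```

If a URL you expected to match does not, check for unescaped special characters such as `?` or `.` in the pattern.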

When you define the rule list, keep in mind that the system can evaluate the exclusions first:

  1. A URL will be included in this crawl if no rules exclude it.

  2. If you provide any include rules, then the URL must also match an include rule.
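
The two conditions above can be sketched as a small function. This is a minimal illustration, not the crawler's actual implementation: the function name `is_crawled`, the example rules, and the use of `re.search` are all assumptions made for the example.

```python
import re

def is_crawled(url, include_rules, exclude_rules):
    """Return True if the URL would be kept under the two conditions above."""
    # Exclusions are checked first: any matching exclude rule rejects the URL.
    if any(re.search(rule, url) for rule in exclude_rules):
        return False
    # With no include rules, every URL that is not excluded is kept.
    if not include_rules:
        return True
    # Otherwise the URL must also match at least one include rule.
    return any(re.search(rule, url) for rule in include_rules)

# Hypothetical rules: keep the blog, but skip its tag pages.
include_rules = [r"/blog/"]
exclude_rules = [r"/blog/tag/"]

urls = [
    "https://www.example.com/blog/my-post",      # included, not excluded
    "https://www.example.com/blog/tag/seo",      # excluded
    "https://www.example.com/products/widget",   # matches no include rule
]

results = [is_crawled(u, include_rules, exclude_rules) for u in urls]
print(results)
```

Note that a URL matching both an include rule and an exclude rule is ignored in this sketch, since a matching exclusion always rejects it.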
