How to check URLs in a sitemap
Here is how to make sure that your sitemaps are taken into account during the analysis.

Sitemap report: are URLs in sitemaps in the site structure?

Following Google's recommendation, the URLs of your sitemap files should be declared in your robots.txt file.

If they are declared there, no extra configuration is needed. Oncrawl automatically uses your sitemap files to compare the URLs found in the sitemaps with the URLs found in the crawl.
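To see which sitemaps a robots.txt file declares, a minimal sketch along these lines may help. It uses Python's standard urllib.robotparser module (site_maps() requires Python 3.8 or later). The function name and the example.com URLs are hypothetical, and this is not how Oncrawl itself reads the file.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared_in_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared on 'Sitemap:' lines of a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file has no Sitemap line.
    return parser.site_maps() or []


# Hypothetical robots.txt content, used only for illustration.
example_robots = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-news.xml
"""

declared = sitemaps_declared_in_robots(example_robots)
print(declared)
```

An empty list means that no sitemap is declared in the file.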

If your sitemap URLs are not declared in the robots.txt file, you must specify them in the crawl settings, in the Sitemaps section.

Here are the steps to follow:

  • List the URLs of your sitemap files

  • Edit (or create) your crawl configuration by clicking Launch crawl on your project home page

In the crawl settings, find the Sitemaps section, check the Specify sitemaps URLs box, and add all your sitemap or sitemap index URLs.

Then save and launch the crawl.
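For the first step above, listing your sitemap URLs, a sitemap index file can serve as a starting point because it references every child sitemap. The sketch below reads the loc entries from such a file with Python's standard XML parser. It assumes the standard sitemaps.org namespace, and the function name and example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_locs(xml_text: str) -> tuple[str, list[str]]:
    """Return the root element name and every <loc> value in a sitemap file."""
    root = ET.fromstring(xml_text)
    # Strip the '{namespace}' prefix: 'sitemapindex' or 'urlset'.
    kind = root.tag.rsplit("}", 1)[-1]
    locs = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return kind, locs


# Hypothetical sitemap index, used only for illustration.
example_index = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.example.com/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>https://www.example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>
"""

kind, locs = list_locs(example_index)
print(kind, locs)
```

The same function also works on a regular sitemap (root element urlset), in which case it returns the page URLs rather than the child sitemap URLs.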

When the crawl is complete, you will find the detailed report under Crawl Report > Indexability > Sitemaps.

This report is enabled by default, but you can disable it when setting up a crawl. To disable it, uncheck the Enable sitemaps cross-data analysis box under Sitemaps in your crawl settings.

When it is disabled, the Sitemaps dashboard does not appear in your crawl report.
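Conceptually, the comparison behind this report can be pictured as a set comparison between the URLs listed in the sitemaps and the URLs found in the site structure. The sketch below only illustrates that idea. It is not Oncrawl's implementation, and the function name, result keys, and example.com URLs are all hypothetical.

```python
def compare_sitemap_and_crawl(sitemap_urls, crawled_urls):
    """Split URLs into three groups based on where they were found."""
    sitemap_set = set(sitemap_urls)
    crawl_set = set(crawled_urls)
    return {
        # Listed in a sitemap and reachable in the site structure.
        "in_both": sorted(sitemap_set & crawl_set),
        # Listed in a sitemap but not found by following links.
        "only_in_sitemaps": sorted(sitemap_set - crawl_set),
        # Found by following links but missing from the sitemaps.
        "only_in_structure": sorted(crawl_set - sitemap_set),
    }


# Hypothetical URL lists, used only for illustration.
result = compare_sitemap_and_crawl(
    sitemap_urls=[
        "https://www.example.com/",
        "https://www.example.com/orphan-page",
    ],
    crawled_urls=[
        "https://www.example.com/",
        "https://www.example.com/unlisted-page",
    ],
)
print(result)
```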

Checking URLs in a sitemap by crawling them

Alternatively, you can crawl the URLs in a sitemap to obtain further data about them. Note that this approach does not show which URLs are part of your site structure.

To crawl all URLs in a sitemap, return to the crawl settings. Change the Start URL crawl mode to Sitemaps.
When you launch the crawl, Oncrawl will discover and crawl the URLs in your sitemaps.

