How to explore all of the URLs in my sitemap

Sometimes you just want data on the URLs in your sitemap. You can get it by having Oncrawl crawl a list of URLs.


Oncrawl will analyze and crawl any list of URLs you provide. If you want to crawl the URLs in your sitemap (and only these URLs), the only thing you need to do is provide your sitemap as a list of URLs.

Note: Do not use your XML sitemap as your start URL. The Oncrawl bot does not know how to explore a .xml file. The steps below explain how to give the bot something it can explore.

Step 1: Extract the URLs from your sitemap

First, you will need to extract the URLs from your sitemap.

There are many ways to do this, from online tools and desktop converters to bash scripts and other command-line tools.

You will want to create one or more files in .txt or .csv format with one URL per line.

If you need help with the format of the file to create, you'll find additional information here.
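If you prefer to script the extraction, here is a minimal Python sketch. It assumes your sitemap follows the standard sitemaps.org format and has already been saved locally. The function names and the file names sitemap.xml and sitemap_urls.txt are placeholders chosen for this example, not anything Oncrawl requires. It does not handle sitemap index files or compressed (.gz) sitemaps.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by sitemaps that follow the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return the page URLs listed in a sitemap, in document order.

    Only <loc> elements that sit directly inside a <url> entry are kept,
    so image or video <loc> tags from sitemap extensions are ignored.
    """
    root = ET.fromstring(sitemap_xml)
    urls = []
    for url_entry in root.iter(f"{SITEMAP_NS}url"):
        loc = url_entry.find(f"{SITEMAP_NS}loc")
        if loc is not None and loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


def write_url_list(urls, path):
    """Write the URLs to a text file, one URL per line."""
    Path(path).write_text("\n".join(urls) + "\n", encoding="utf-8")
```

For example, write_url_list(extract_urls(Path("sitemap.xml").read_text(encoding="utf-8")), "sitemap_urls.txt") would produce a .txt file with one URL per line, assuming a sitemap.xml file exists in the current folder.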

Step 2: Zip your files

Zip your files into a .zip compressed archive. You can do this using a free tool like 7-Zip (Windows) or by right-clicking the files and selecting "Compress" (Mac).
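If you created your URL files with a script, you can also build the archive the same way. The sketch below uses Python's standard zipfile module. The function name and the file names in the usage note are placeholders for this example.

```python
import zipfile
from pathlib import Path


def zip_url_files(file_paths, archive_path):
    """Bundle the given .txt or .csv files into a single .zip archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in file_paths:
            file_path = Path(file_path)
            # Store each file at the top level of the archive, without its folder path.
            archive.write(file_path, arcname=file_path.name)
    return Path(archive_path)
```

For example, zip_url_files(["sitemap_urls.txt"], "sitemap_urls.zip") would create sitemap_urls.zip next to your script, assuming sitemap_urls.txt exists.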

Step 3: Upload your .zip files to Oncrawl

From the project home page, click on Add data sources, then on URL files.

Drag your .zip file and drop it within the dotted box.

Step 4: Launch a crawl in URL list mode

From the project home page, choose + Set up new crawl.

Click on Start URL to expand the section.

Choose the option List of URLs.

In the drop-down menu, select the .zip file you uploaded in the previous step. 

Launch your crawl.
