OnCrawl will analyze and crawl any list of URLs you provide. If you want to crawl the URLs in your sitemap (and only these URLs), the only thing you need to do is provide your sitemap as a list of URLs.
Note: Do not enter your XML sitemap as your start URL. The OnCrawl bot does not know what to do with an .xml file. We'll explain below how to give the bot something it knows how to explore.
Step 1: Extract the URLs from your sitemap
First, you will need to extract the URLs from your sitemap.
There are many ways to do this, from online tools and desktop converters to bash scripts and command-line executables.
You will want to create one or more files in .txt or .csv format with one URL per line.
If you need help with the file format to use, you'll find additional information here.
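As one possible way to do the extraction, here is a minimal Python sketch. It assumes a standard XML sitemap that follows the sitemaps.org protocol (a urlset element containing url/loc entries) and has already been downloaded as text. The function names extract_urls and write_url_list are hypothetical names chosen for this example; they are not part of OnCrawl. Sitemap index files, which list other sitemaps instead of pages, are not handled here.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

# Namespace defined by the sitemaps.org protocol for standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> List[str]:
    """Return the page URLs found in the <url><loc> entries of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements and trim surrounding whitespace.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


def write_url_list(urls: Iterable[str], path: str) -> None:
    """Write the URLs to a text file, one URL per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
```

For example, calling write_url_list(extract_urls(xml_text), "sitemap_urls.txt") would produce a .txt file with one URL per line, where xml_text holds the contents of your sitemap and sitemap_urls.txt is a placeholder file name.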
Step 2: Zip your files
Zip your files into a .zip compressed archive. You can do this using free tools like 7-Zip (Windows) or by right-clicking the files and selecting "Compress" (Mac).
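If you prefer to script this step, the archive can also be created with Python's standard zipfile module. This is only a sketch: the function name zip_url_files and the file names in the usage note are placeholders chosen for this example, not names required by OnCrawl.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def zip_url_files(file_paths: Iterable[str], archive_path: str) -> str:
    """Bundle the given .txt or .csv URL files into one .zip archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in file_paths:
            path = Path(file_path)
            # Store each file under its bare name so the archive has no folders.
            archive.write(path, arcname=path.name)
    return archive_path
```

For example, zip_url_files(["sitemap_urls.txt"], "sitemap_urls.zip") would create sitemap_urls.zip next to your script, ready to upload.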
Step 3: Upload your .zip files to OnCrawl
From the project home page, click on "Add data sources", then on "URL files".
Drag your .zip file and drop it within the dotted blue box.
Step 4: Launch a crawl in URL list mode
From the project home page, choose "+ Set up new crawl".
Click on Start URL to expand the section.
Choose the option "List of URLs".
In the drop-down menu, select the .zip file you uploaded in the previous step.
Launch your crawl.
If you still have questions, drop us a line at @oncrawl_cs or click on the Intercom button at the bottom right of your screen to start a chat with us.
This article can also be found by searching for:
crawler les URLs dans mon sitemap, explorer un sitemap, rastrear las URLs en un sitemap, mapa del sitio