Sometimes you need to run a crawl at a time when you won't be in front of your computer. Maybe you need to crawl a site every Sunday morning.
You might also be using Oncrawl for monitoring, with automated alerts set up for when a crawl finds something unexpected, or for tracking changes over time. In both cases, you'll need crawls that run automatically.
Whatever the reason, here's how to schedule a crawl and, if you want, set it to repeat.
Scheduling a crawl
From the project page, click on the Scheduled crawls tab at the top of the crawl monitoring box.
If you already have scheduled crawls planned, they will be listed here:
Click on Schedule crawl to plan a new crawl.
You will then need to choose:
The crawl profile you want to use for the future crawl
The date, time (in 24-hour format) and time zone (by city) when you want the crawl to run. If this is a repeated crawl, this is the date and time when you want the first crawl to run.
Whether or not you want the crawl to repeat.
If this crawl might end up running at the same time as another scheduled crawl on the same site, you can decide whether or not you want both crawls to run at the same time. By default, the scheduled crawl will not run if other crawls using the same profile are already running.
Indicate that you want both crawls to run by ticking the checkbox under Allow simultaneous crawls.
Finally, click Create at the top right of the screen.
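To illustrate how the date, time and time zone settings interact for a repeated crawl, here is a minimal Python sketch. It is not part of Oncrawl: the function name `weekly_runs`, the weekly interval and the example city are assumptions chosen to match the "every Sunday morning" case. It shows that a crawl scheduled at 06:00 in a city's time zone stays at 06:00 local time even when daylight saving time changes the UTC offset.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def weekly_runs(first_run_local, city_zone, count):
    """Return `count` weekly run times as time-zone-aware datetimes.

    first_run_local: naive datetime for the first run (24-hour clock).
    city_zone: IANA time zone name chosen by city, e.g. "Europe/Paris".
    The weekly interval is an assumption made for this example.
    """
    first = first_run_local.replace(tzinfo=ZoneInfo(city_zone))
    # Adding a timedelta to an aware datetime keeps the local wall-clock
    # time, so each run stays at the same hour in the chosen city.
    return [first + timedelta(weeks=i) for i in range(count)]


# Hypothetical schedule: first run on Sunday 24 March 2024 at 06:00, Paris time.
runs = weekly_runs(datetime(2024, 3, 24, 6, 0), "Europe/Paris", 3)
for run in runs:
    print(run.isoformat())
```

In this example the second run falls after the switch to summer time in Paris, so its UTC offset differs from the first run's while the local hour remains 06:00.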
Managing scheduled crawls
Your crawl is now scheduled and will run at the time and date you indicated.
You can keep track of crawls that are scheduled to run in the Scheduled crawls tab in the crawl monitoring box on the project's home page.
To make changes to a scheduled crawl, click Show/Edit details.
Consult the crawl profile associated with the scheduled crawl by clicking on the name of the profile under Configuration.
If you need to completely cancel a scheduled crawl, you can delete it by clicking on the three dots at the right of the row, and choosing Delete.
Running multiple crawls at once
If, for example, you've chosen a crawl speed that is close to the maximum that your server can handle, there is a risk that running two crawls at once will impact your website. This is because two crawls running at the same time make twice as many requests to your web server as a single crawl.
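The arithmetic behind this risk can be sketched in a few lines of Python. The numbers and the names `combined_request_rate` and `exceeds_capacity` are hypothetical and only serve to illustrate the point: crawl speeds add up when crawls of the same site overlap.

```python
def combined_request_rate(crawl_speeds):
    """Total requests per second sent to the server by all running crawls."""
    return sum(crawl_speeds)


def exceeds_capacity(crawl_speeds, server_max_rate):
    """True if the overlapping crawls ask for more than the server can handle."""
    return combined_request_rate(crawl_speeds) > server_max_rate


# Hypothetical values: each crawl runs at 8 URLs per second,
# and the server can comfortably serve 10 requests per second.
one_crawl = [8]
two_crawls = [8, 8]

print(combined_request_rate(one_crawl), exceeds_capacity(one_crawl, 10))
print(combined_request_rate(two_crawls), exceeds_capacity(two_crawls, 10))
```

With these assumed values, a single crawl stays under the server's limit, but two overlapping crawls do not.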
You can choose, for each scheduled crawl, whether or not it should run when there are other scheduled crawls already running.
The limit on the number of crawls you can have running at the same time also applies to scheduled crawls. If you already have crawls running when the scheduled crawl is planned to start, the crawls already underway will continue.
If starting the scheduled crawl would result in too many simultaneous crawls, the scheduled crawl will not run.
An alert in the crawl monitoring box will let you know what happened.
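The rules in this section can be summarized as a small decision function. This is only an illustrative sketch of the behavior described above, not Oncrawl's actual implementation: the names `ScheduledCrawl`, `should_start`, `running_profiles` and `max_simultaneous` are hypothetical, and the sketch assumes that the default check compares crawl profiles.

```python
from dataclasses import dataclass


@dataclass
class ScheduledCrawl:
    profile: str
    # Mirrors the "Allow simultaneous crawls" checkbox (unticked by default).
    allow_simultaneous: bool = False


def should_start(crawl, running_profiles, max_simultaneous):
    """Decide whether a scheduled crawl starts, and explain why.

    running_profiles: profiles of the crawls already underway.
    max_simultaneous: assumed limit on crawls running at the same time.
    Crawls already underway are never stopped; only the new one is skipped.
    """
    if len(running_profiles) + 1 > max_simultaneous:
        return False, "too many simultaneous crawls"
    if not crawl.allow_simultaneous and crawl.profile in running_profiles:
        return False, "a crawl using the same profile is already running"
    return True, "crawl can start"


# Hypothetical situation: one crawl with the "weekly" profile is running.
print(should_start(ScheduledCrawl("weekly"), ["weekly"], 3))
print(should_start(ScheduledCrawl("weekly", allow_simultaneous=True), ["weekly"], 3))
```

In both skipped cases, the returned reason plays the role of the alert shown in the crawl monitoring box.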