When a bot visits a website, it reads the HTML of each page and follows the links it finds. If your website is built with JavaScript (whether the whole site, or only parts of the content and navigation), the main content of a page may not be present in the HTML, but constructed dynamically after the HTML is fetched. In that case, the bot will not see the content or links on the page, and will not be able to discover the rest of your site.
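To illustrate, here is a minimal sketch of what this looks like from a crawler's point of view. The HTML snippets and the naive `extractLinks` helper are ours, for illustration only; real crawlers use a proper HTML parser:

```javascript
// Hypothetical helper: pull href values out of raw HTML with a naive regex.
function extractLinks(html) {
  return [...html.matchAll(/href="([^"]+)"/g)].map((m) => m[1]);
}

// What the server sends: an empty application shell.
const rawHtml =
  '<html><body><div id="app"></div><script src="app.js"></script></body></html>';

// What the browser builds after executing app.js.
const renderedHtml =
  '<html><body><div id="app">' +
  '<a href="/products">Products</a><a href="/blog">Blog</a>' +
  '</div></body></html>';

console.log(extractLinks(rawHtml));      // [] — a non-rendering bot finds no links
console.log(extractLinks(renderedHtml)); // [ '/products', '/blog' ]
```

Without rendering JavaScript, the crawler sees the first version and finds nothing to follow; with rendering (or pre-rendering), it sees the second.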

You can solve this by pre-rendering the JavaScript. It’s easy to do in OnCrawl!

Top three reasons to render JavaScript for a crawl

  1. Some sites can't be crawled at all without JavaScript enabled. You still need information on how the website performs, and you can only get that information from a crawl.
  2. JavaScript can alter content and links by adding information. OnCrawl helps check that the final version of your website behaves as intended.
  3. JavaScript can be used to redirect one URL to another. Crawling with JavaScript helps you make sure you've audited all of your redirected pages.

Enable the Crawl JS option

On the Crawl settings page, enable JavaScript crawling:

  1. At the top of the page, make sure the extra settings are shown. If the toggle button is gray, click "Show extra settings" to display them.
  2. Scroll down to the "Extra settings" section and click on "Crawl JS" to expand the section.
  3. Tick "Crawl the website as a JavaScript website".
  4. Choose either “My site is not pre-rendered.” or “My site is already pre-rendered.”

If you don't know whether your site is pre-rendered or not, you should ask OnCrawl to use the pre-rendering feature. (Note: this feature is not included by default in standard plans. Please contact the OnCrawl business team to discuss adding JavaScript pre-rendering to your plan.)

For more information on pre-rendering, see prerender.io or www.seo4ajax.com.

Once the “Crawl JS” option is enabled, you can launch the crawl as usual.

Best practices for crawling JS

  • If your site is built using JavaScript, it may still be worth it to see what your site looks like with JavaScript disabled. Run your crawl with and without JavaScript enabled to compare the results.
  • The JavaScript crawl is not compatible with authentication by username and password. If you're working with a pre-prod website built with JavaScript, you may need to plan accordingly.
  • Check your quotas before running a JavaScript crawl. JavaScript crawls use more resources than normal crawls, so crawling a page in JavaScript will consume 10 URLs in your quota.
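The quota math is worth working out before you launch. A minimal sketch, based on the 10-URLs-per-page cost above (the function name is ours):

```javascript
// Per the note above, each JavaScript-rendered page consumes 10 URLs of quota.
const JS_COST_PER_PAGE = 10;

// Quota a JavaScript crawl of `pages` pages will use.
function quotaNeeded(pages) {
  return pages * JS_COST_PER_PAGE;
}

// Example: a 5,000-page site crawled with JavaScript.
console.log(quotaNeeded(5000)); // 50000 URLs of quota
```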

Going further

If you still have questions about crawling a website that uses JavaScript, feel free to drop us a line at @oncrawl_cs or click on the Intercom button at the bottom right of your screen to start a chat with us.

Happy crawling!

