Reducing the size of access log files before sending them to ftp.oncrawl.com makes them much faster to transfer and to process. This can be achieved with compression and pre-filtering.
Tools like "grep google" and "gzip" are your friends.
On average, compressing text files reduces their size by 85%.
OnCrawl can read multiple compression formats:
- tar (+gz)
- bzip2 (bz2)
However, in some cases, certain compression options within these formats may not be supported.
If you're encountering problems, please contact us so that we can help you out. You can reach us by clicking on the blue Intercom icon at the bottom right corner of your screen.
You can use your favorite compression program to compress access log files before sending them. Please note that RAR files are NOT supported.
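The compression step can be sketched in shell as follows. This is a minimal example that assumes `gzip` is installed; the helper names `compress_log` and `size_bytes` are hypothetical and only serve the illustration.

```shell
# compress_log FILE: write FILE.gz next to FILE and keep the original.
# Hypothetical helper, not part of any OnCrawl tooling.
compress_log() {
  # -c sends the compressed data to standard output, so FILE stays untouched
  gzip -c "$1" > "$1.gz"
}

# size_bytes FILE: print the size of FILE in bytes.
size_bytes() {
  # tr removes the padding that some wc implementations add
  wc -c < "$1" | tr -d ' '
}
```

Running `compress_log MY_LOG_FILE` creates `MY_LOG_FILE.gz`, and comparing `size_bytes MY_LOG_FILE` with `size_bytes MY_LOG_FILE.gz` shows how much space was saved on your own data.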
On average, pre-filtering reduces log files by 90%, and sometimes by as much as 99%. Access logs contain a lot of information that OnCrawl discards because it is not relevant for SEO analysis.
Pre-filtering is as easy as keeping only the lines that contain the word "google", in lowercase or uppercase.
If you work on Linux or macOS, this can be achieved with grep:
grep google MY_LOG_FILE > MY_FILTERED_LOG_FILE
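If your server rotates its logs, some of them may already be gzipped. The sketch below shows one way to filter plain and gzipped files in a single pass. The helper name `filter_google` is hypothetical, and the `-i` option (case-insensitive matching) is an addition to the basic command, so that uppercase spellings are kept as well.

```shell
# filter_google FILE...: print only the lines that mention "google",
# in any letter case. Hypothetical helper for illustration.
filter_google() {
  for f in "$@"; do
    case "$f" in
      *.gz) gzip -dc "$f" ;;  # decompress rotated logs on the fly
      *)    cat "$f" ;;       # plain-text logs are read as-is
    esac
  done | grep -i google
}
```

For example, `filter_google access.log access.log.1.gz > MY_FILTERED_LOG_FILE` (the input names are placeholders) writes the matching lines from both files into one filtered file.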
When compression and pre-filtering are combined, the file size is usually reduced by 95%. That's why we recommend using both.
On Linux or macOS, run:
grep google MY_LOG_FILE | gzip > MY_SUPER_FILTERED_LOG_FILE
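Before uploading, it can be worth checking that the resulting file is a valid gzip archive and seeing how much smaller it is. The following is a minimal sketch under the assumption that `gzip` is available; `check_archive` is a hypothetical helper name, and the percentage uses integer arithmetic, so it is rounded.

```shell
# check_archive ORIGINAL FILTERED_GZ: verify the compressed file and
# report how much smaller it is. Hypothetical helper for illustration.
check_archive() {
  # gzip -t tests the archive's integrity without extracting it
  gzip -t "$2" || return 1
  original=$(wc -c < "$1" | tr -d ' ')
  reduced=$(wc -c < "$2" | tr -d ' ')
  lines=$(gzip -dc "$2" | wc -l | tr -d ' ')
  echo "lines kept: $lines"
  echo "size: $original -> $reduced bytes"
  # avoid dividing by zero when the original file is empty
  if [ "$original" -gt 0 ]; then
    echo "reduction: $(( 100 - 100 * reduced / original ))%"
  fi
}
```

Calling `check_archive MY_LOG_FILE MY_SUPER_FILTERED_LOG_FILE` prints the number of lines kept and the size before and after, and returns a non-zero status if the archive is damaged.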