Reducing the size of access log files before sending them to ftp.oncrawl.com makes them much faster to transfer and process. This can be achieved with compression and pre-filtering.

In short: "grep google" and "gzip" are your friends.

Compression

On average, compressing text files reduces their size by 85%.

OnCrawl can read multiple compression formats:
 - zip
 - gzip
 - tar (+gz)
 - bzip2
 - 7z
 - xz
 
You can use your favorite compression program to compress access log files before sending them. Please note that RAR files are NOT supported.
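
As a minimal sketch of the compression step, assuming gzip is installed: the commands below create a small hypothetical sample file (the log line and the scratch directory are illustrative only), compress it while keeping the original, and check the resulting archive. The .gz suffix is a common convention, not a stated OnCrawl requirement.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Hypothetical one-line sample; replace MY_LOG_FILE with your own access log.
printf '%s\n' \
  '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"' \
  > MY_LOG_FILE

# -c writes the compressed data to standard output, so the original file is kept.
gzip -c MY_LOG_FILE > MY_LOG_FILE.gz

# -t tests the integrity of the archive without extracting it.
gzip -t MY_LOG_FILE.gz && echo "archive OK"
```

Writing to standard output with -c is a deliberate choice here: plain "gzip MY_LOG_FILE" replaces the original file with the compressed one.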

Pre-Filtering

On average, pre-filtering reduces log files by 90%, and sometimes by as much as 99%. Access logs contain a lot of information that OnCrawl discards because it is not relevant for SEO analysis.

Pre-filtering is as easy as keeping only the lines that contain the word:
google
in lower case.

If you work on Linux or macOS, this can be achieved with grep:
grep google MY_LOG_FILE > MY_FILTERED_LOG_FILE
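
To see what the filter keeps, here is a small sketch run against a hypothetical three-line log (the sample lines and the scratch directory are assumptions for illustration). grep is case-sensitive by default, and the lower-case word still matches Googlebot hits because the Googlebot user agent contains the URL http://www.google.com/bot.html.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: one Googlebot hit, one browser hit, one bingbot hit.
printf '%s\n' \
  '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"' \
  '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"' \
  '157.55.39.1 - - [10/Oct/2023:13:55:44 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"' \
  > MY_LOG_FILE

# Keep only the lines that contain the lower-case word "google".
grep google MY_LOG_FILE > MY_FILTERED_LOG_FILE

# Compare line counts before and after filtering.
wc -l MY_LOG_FILE MY_FILTERED_LOG_FILE
```

Only the Googlebot line survives in this sample; the browser and bingbot lines are dropped.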

Combining both

When compression and pre-filtering are combined, the file size is usually reduced by 95%. That's why we recommend using both.

On Linux or macOS, just run:
grep google MY_LOG_FILE | gzip > MY_SUPER_FILTERED_LOG_FILE
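
Before uploading, it can help to confirm that the combined output is a valid gzip stream and still contains the expected lines. The sketch below does this on a hypothetical two-line sample; the sample lines, the scratch directory, and the .gz suffix on the output name are assumptions for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: one Googlebot hit and one bingbot hit.
printf '%s\n' \
  '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"' \
  '157.55.39.1 - - [10/Oct/2023:13:55:44 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"' \
  > MY_LOG_FILE

# Filter and compress in one pass; no intermediate file is written.
grep google MY_LOG_FILE | gzip > MY_SUPER_FILTERED_LOG_FILE.gz

# Decompress to standard output and count the lines that were kept.
gzip -dc MY_SUPER_FILTERED_LOG_FILE.gz | wc -l

# Compare byte sizes; the gain is only meaningful on real, large logs.
wc -c MY_LOG_FILE MY_SUPER_FILTERED_LOG_FILE.gz
```

Here gzip -dc is used rather than zcat because zcat on macOS expects a .Z suffix.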


Happy Uploading
