If you've previously redirected URLs, you may still have links to the old URL on your website. If you're looking for a list of all of those links, the current page they point to (the old URL), and the page they should point to (the new URL), you're in the right place.

The method below may be a useful first step to updating internal links to the old URLs.

Updating old links so that they point directly to the new URLs removes a redirect hop, which can help pages load faster and make crawls of your site more efficient.

Why this request isn't as straightforward as it sounds

This request mixes data from different sources: page data and links data. At the moment, it isn't possible to request information from both sets of data at the same time.

That's okay: you can still obtain the data you want from each set of data, and then use different tricks to join the results together.

How to get a list of all pages that have been redirected and the URL they are redirected to

Navigate to the Data Explorer under "Tools" in the crawl results.

Use the OnCrawl Query Language to set a filter for all pages with a 301 status code.

Remove all columns except "URL" and "Redirect location". (If "Redirect location" isn't present, you can add it using the "Add columns" button.)

Click "Export data" at the top of the page.

How to get a list of all links to pages that have been redirected

Navigate to the Data Explorer under "Tools" in the crawl results.

Change the dataset to "Links".

Click on the Quickfilter labeled "Pages pointing to 3xx errors".

Remove all columns except "Link origin" and "Link target".

Click "Export data" at the top of the page.

How to merge the files

Using VLOOKUP in Excel

Open a new Excel workbook.

Paste the contents of your first file (the pages) into the first sheet. Name the sheet "Pages".

Create a new sheet. Paste the contents of your second file (the links) into a second sheet and name it "Links".

Add a column to the "Links" sheet and name the column "New location".

In cell C2 of the "Links" sheet, use VLOOKUP to search for the link target shown in B2 in the list of pages on the "Pages" sheet.

Here is the full formula:

=VLOOKUP(B2;Pages!A$1:B$1000;2;FALSE)

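Make sure that the "1000" in B$1000 is the number of the last row in the list of pages. If Excel rejects the formula, your regional settings may expect commas instead of semicolons between the arguments.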

Copy cell C2 and paste it into the rest of the cells in column C.

Don't forget to save the Excel file.
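If it helps to see what the VLOOKUP step does, here is a minimal Python sketch of the same exact-match lookup. The URLs and the function name add_new_location are made up for illustration; they are not part of OnCrawl or Excel.

```python
# Hypothetical sample rows standing in for the two sheets.
pages = [
    # (URL, Redirect location)
    ("https://example.com/old-a", "https://example.com/new-a"),
    ("https://example.com/old-b", "https://example.com/new-b"),
]
links = [
    # (Link origin, Link target)
    ("https://example.com/", "https://example.com/old-a"),
    ("https://example.com/blog", "https://example.com/old-b"),
    ("https://example.com/about", "https://example.com/old-c"),
]


def add_new_location(links, pages):
    """Append a 'New location' value to each link row, like an exact-match VLOOKUP."""
    redirect_for = {}
    for url, location in pages:
        # Keep the first match for a URL, as VLOOKUP does.
        redirect_for.setdefault(url, location)
    # Where Excel would show #N/A (no exact match), this sketch uses None.
    return [(origin, target, redirect_for.get(target)) for origin, target in links]


rows = add_new_location(links, pages)
for row in rows:
    print(row)
```

The third sample link has no match in the list of pages, so its new location stays empty, just as the corresponding cell in column C would show #N/A.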

Using the csvjoin function in the csvkit Python package

Download and install the csvkit Python package, available here on GitHub.

Full documentation for the csvjoin function is also available.

You'll need to ask csvjoin to join the two files by matching the second column of the first file you give it (the links export, where the second column is the URL linked to) with the first column of the second file (the pages export, where the first column is the URL that was redirected).

You'll need:

  • The location of the first file, which will look something like:
    ~/Downloads/export-5975c7e1451c953ed90d7b7c-custom_query.csv
  • The location of the second file, which will look something like:
    ~/Downloads/export-5975c7e1451c953ed90d7b7c-custom_query\ \(1\).csv

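Note the extra backslash \ before the space and before the opening and closing parentheses. These characters have a special meaning on the command line, so they need to be escaped.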
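If you'd rather not escape the characters by hand, wrapping the file name in single quotes works too. As a small sketch, Python's standard shlex module can produce a safely quoted name for you. The file name below is the example from above, without the folder. Keep the ~/Downloads/ part outside the quotes, because a quoted ~ is not expanded to your home folder.

```python
import shlex

# Example export name containing a space and parentheses.
path = "export-5975c7e1451c953ed90d7b7c-custom_query (1).csv"

# shlex.quote wraps the name in single quotes so the shell reads it as one argument.
quoted = shlex.quote(path)
print(quoted)
```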

Use the option -c 2,1 to tell csvjoin which columns to match: column 2 of the first file and column 1 of the second file.

Here is the full command line:

csvjoin -c 2,1 ~/Downloads/export-5975c7e1451c953ed90d7b7c-custom_query.csv ~/Downloads/export-5975c7e1451c953ed90d7b7c-custom_query\ \(1\).csv > results_origin_target_location.csv

This may take a while, depending on the number of lines in the first file: each one needs to be looked up in the second file.

This will create a file called "results_origin_target_location.csv" with a line for each link. Each line lists the link origin, the page to which it links, and the page it is redirected to.
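If you can't install csvkit, a similar merge can be sketched with Python's built-in csv module. This is not a reimplementation of csvjoin. It assumes comma-separated exports with one header row, keeps only links whose target appears in the pages file, and writes its own column headers. The function name join_links_to_pages and the sample data are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def join_links_to_pages(links_path, pages_path, out_path):
    """Match column 2 of the links file to column 1 of the pages file; return rows written."""
    redirects = {}
    with open(pages_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) >= 2:
                redirects.setdefault(row[0], row[1])  # old URL -> redirect location

    written = 0
    with open(links_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        next(reader, None)  # skip the header row
        writer.writerow(["Link origin", "Link target", "Redirect location"])
        for row in reader:
            if len(row) >= 2 and row[1] in redirects:
                writer.writerow([row[0], row[1], redirects[row[1]]])
                written += 1
    return written


# Demo with tiny made-up exports written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    links_csv = Path(tmp) / "links.csv"
    pages_csv = Path(tmp) / "pages.csv"
    out_csv = Path(tmp) / "results_origin_target_location.csv"
    links_csv.write_text(
        "Link origin,Link target\n"
        "https://example.com/,https://example.com/old\n"
        "https://example.com/blog,https://example.com/other\n",
        encoding="utf-8",
    )
    pages_csv.write_text(
        "URL,Redirect location\n"
        "https://example.com/old,https://example.com/new\n",
        encoding="utf-8",
    )
    matched = join_links_to_pages(links_csv, pages_csv, out_csv)
    with open(out_csv, newline="", encoding="utf-8") as f:
        result_rows = list(csv.reader(f))

print(matched, result_rows)
```

To run it on your own exports, call join_links_to_pages with the path of the links export, the path of the pages export, and the name of the file to create. Reading the pages into a dictionary first means each link is looked up directly instead of scanning the whole pages file.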


You can also find this article by searching for:
301 todos los enlaces, página redirigida y página final
tous les liens vers une page en 301 avec la page redirigée et la page finale
