Resources · Automations · n8n FREE · 2026

Automatic Broken Link Checker | n8n

Broken links are website killers. They tank your search rankings, destroy user trust, and make your site look abandoned. The problem? Finding them manually is a nightmare. You’d need to click through every page, check every link, and document each failure. For a site with hundreds of pages, that’s days of mind-numbing work.

Overview · 22 steps

Automatic Broken Link Checker with n8n | Find & Fix 404 Errors Automatically (Free Workflow + Video + Tutorial + Download)

Before you start

Requirements: n8n instance & API keys.

You'll need
- A self-hosted n8n instance with terminal access.
- API credentials for the services used in this workflow.

Step 01 → 22

n8n workflow breakdown.

22 steps across two workflows, with a single short JavaScript Code node. Here is exactly what runs under the hood.

Step 01
Schedule Trigger - Set Up Daily Automation.
The Schedule Trigger kicks off your workflow automatically without any manual intervention. This node acts as the heartbeat of your broken link checker, ensuring your site gets scanned consistently.

Setting this to run daily at midnight means you'll always have fresh data waiting for you each morning. You can adjust the timing based on your site's traffic patterns—some prefer running during low-traffic hours to minimize any potential impact.

💡 Tip: If you manage multiple websites, consider staggering the trigger times by 30-60 minutes to avoid overwhelming your n8n instance with concurrent executions.
Parameters
- Trigger Interval: Days - Sets the workflow to run on a daily schedule rather than hourly or weekly
- Days Between Triggers: 1 - The workflow executes every single day without skipping
- Trigger at Hour: Midnight - Runs at 00:00 in your n8n instance's timezone
- Trigger at Minute: 0 - Starts exactly at the beginning of the hour
Step 02
Set XML Sitemap URL and Domain - Define Your Target Website.
This Set node establishes the foundation for your entire workflow by defining which website to scan. You'll configure your sitemap URL and domain here, and these values propagate through every subsequent node.

The sitemap URL points to your XML sitemap file—most websites generate these automatically through CMS plugins or server configurations. The domain value helps the workflow identify which links are internal versus external.

💡 Tip: Not sure where your sitemap lives? Try appending /sitemap.xml, /sitemap_index.xml, or /page-sitemap.xml to your domain. WordPress sites with Yoast or RankMath typically use the latter format.
Parameters
- Mode: Manual Mapping - Allows you to define specific field names and values
- site_map_url (String): [YOUR_SITEMAP_URL] - Enter your full sitemap URL (e.g., https://yourdomain.com/sitemap.xml or https://yourdomain.com/page-sitemap.xml)
- domain (String): [YOUR_DOMAIN] - Enter your domain without protocol (e.g., yourdomain.com)
- Include Other Input Fields: OFF - Only passes the fields you explicitly define
Step 03
Create Google Sheet Report - Generate Your Daily Report File.
Each workflow execution creates a fresh Google Sheets document named with the current date. This approach gives you a clean historical record—you can look back at any day to see what broken links existed at that moment.

The dynamic title uses n8n's date formatting to generate names like "01-17-2026", making it easy to sort and find specific reports in your Google Drive folder.

💡 Tip: The spreadsheet ID generated by this node gets passed to later steps. Make sure this node executes successfully before the workflow continues, as subsequent nodes depend on its output.
Parameters
- Credential to connect with: Select your configured Google Sheets credential
- Resource: Document - Creates a new Google Sheets document rather than modifying existing sheets
- Operation: Create - Generates a fresh document for this execution
- Title: {{ $now.toFormat('MM-dd-yyyy') }} - Dynamically names the file with today's date in month-day-year format
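To make the title expression concrete, here is a plain JavaScript sketch that builds the same month-day-year string outside n8n. Inside n8n the expression is evaluated through Luxon; the `formatReportTitle` helper below is a hypothetical stand-in, not part of the workflow.

```javascript
// Hypothetical helper: mimics the MM-dd-yyyy report title outside n8n.
function formatReportTitle(date) {
  const mm = String(date.getMonth() + 1).padStart(2, "0"); // getMonth() is 0-based
  const dd = String(date.getDate()).padStart(2, "0");
  const yyyy = String(date.getFullYear());
  return `${mm}-${dd}-${yyyy}`;
}

console.log(formatReportTitle(new Date(2026, 0, 17))); // 01-17-2026
```

One thing to keep in mind: month-first names sort chronologically within a year but not across years. If that matters for your Drive folder, a year-first pattern such as yyyy-MM-dd is a possible alternative.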
Step 04
Set Broken Link Data - Prepare the Data Structure.
This Set node defines the schema for your broken link data. It creates two string fields that will store the source page URL and the broken link URL when issues are detected later in the workflow.

Think of this as setting up your database columns before populating them with data. The structure ensures consistency across all broken link records.
Parameters
- Mode: Manual Mapping - Defines fields explicitly rather than passing through existing data
- source_url (String): Empty - Will hold the page where the broken link was found
- broken_link (String): Empty - Will hold the actual broken URL
- Include Other Input Fields: OFF - Creates a clean data structure with only these two fields
Step 05
Append Row - Configure Sheet Columns for Logging.
The Append Row node writes broken link data into your Google Sheets report. It's configured to add new rows to the spreadsheet created earlier, mapping your data fields to specific columns.

This node connects to the report using the spreadsheet ID generated by the Create node, ensuring data lands in the correct document each day.
Parameters
- Credential to connect with: Select your configured Google Sheets credential
- Resource: Sheet Within Document - Targets a specific sheet inside the spreadsheet
- Operation: Append Row - Adds new rows without overwriting existing data
- Document: By ID → {{ $('Create Report').item.json.spreadsheetId }} - References the newly created spreadsheet dynamically
- Sheet: By ID → 0 - Targets the first (default) sheet in the document
- Mapping Column Mode: Map Each Column Manually - Gives precise control over which data goes where
- Values to Send: A1, B1 columns mapped (values populated by the sub-workflow)
Step 06
Move File - Organize Reports in Dedicated Folder.
After the report is created, this node moves it from the Google Drive root to a dedicated folder. Keeping reports organized prevents your Drive from becoming cluttered with daily files scattered everywhere.

Create a folder in your Google Drive called "Broken Link Checker" (or whatever name you prefer) before running the workflow for the first time.

💡 Tip: Create the destination folder in Google Drive before your first workflow run. The node needs the folder to exist—it won't create it automatically.
Parameters
- Credential to connect with: Select your configured Google Drive credential
- Resource: File - Specifies we're moving a file rather than a folder
- Operation: Move - Relocates the file without copying
- File: By ID → {{ $('Create Report').item.json.spreadsheetId }} - References the spreadsheet by its ID
- Parent Drive: From list → My Drive - Selects your primary Google Drive
- Parent Folder: From list → [YOUR_FOLDER_NAME] - Choose your dedicated reports folder
Step 07
Fetch Sitemap XML - Download Your Website's URL List.
This HTTP Request node downloads your XML sitemap file. The sitemap contains a structured list of all pages you want search engines to index—and conveniently, all pages you want to check for broken links.

The URL comes from the Set node you configured earlier, making it easy to change target websites without modifying multiple nodes.
Parameters
- Method: GET - Standard HTTP method for retrieving data
- URL: {{ $('Set Domain').item.json.site_map_url }} - Pulls the sitemap URL from your configuration node
- Authentication: None - Most sitemaps are publicly accessible
- Send Query Parameters: OFF
- Send Headers: OFF
- Send Body: OFF
Step 08
XML to JSON - Convert Sitemap to Workable Data.
XML sitemaps aren't directly usable in n8n workflows—they need conversion to JSON format first. This node parses the XML structure and transforms it into JavaScript objects that subsequent nodes can easily manipulate.

The conversion preserves the hierarchical structure of your sitemap, including all URL entries and their metadata.

💡 Tip: If your sitemap is compressed (gzip) or stored as a binary file, you'll need to add an "Extract from File" node before this conversion step.
Parameters
- Mode: XML to JSON - Converts from XML format to JSON objects
- Property Name: data - The field name containing the XML content to convert
Step 09
Split Out - Extract Individual Page URLs.
Your sitemap contains multiple URL entries nested inside an array structure. This Split Out node extracts each URL entry as a separate item, allowing the workflow to process pages individually.

The field path "urlset.url" corresponds to the standard XML sitemap structure where URLs are stored inside the urlset element.
Parameters
- Fields To Split Out: urlset.url - The JSON path to the array of URL entries in your parsed sitemap
- Include: No Other Fields - Only extracts the URL data, discarding sitemap metadata
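To illustrate the "urlset.url" path, the sketch below shows what a converted sitemap might look like and how its entries come out one by one. The sample data and the `splitOutUrls` helper are assumptions for illustration; the exact shape depends on your sitemap and on the converter's options.

```javascript
// Assumed shape of a sitemap after XML-to-JSON conversion (illustrative data).
const parsedSitemap = {
  urlset: {
    url: [
      { loc: "https://yourdomain.com/", lastmod: "2026-01-10" },
      { loc: "https://yourdomain.com/about", lastmod: "2026-01-12" },
    ],
  },
};

// Hypothetical helper: returns one item per <url> entry,
// similar in spirit to splitting out "urlset.url".
function splitOutUrls(sitemap) {
  const entries = sitemap?.urlset?.url ?? [];
  // A sitemap with a single <url> element may parse to an object, not an array.
  return Array.isArray(entries) ? entries : [entries];
}

for (const entry of splitOutUrls(parsedSitemap)) {
  console.log(entry.loc);
}
```

Note that a sitemap index file (for example /sitemap_index.xml) uses a `sitemapindex` root with `sitemap` entries instead of `urlset`, so this path would find nothing there. Pointing the workflow at a concrete sitemap such as /page-sitemap.xml avoids that.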
Step 10
Split In Batches - Process Pages One at a Time.
This node creates a processing loop that handles one sitemap URL at a time. While n8n processes items automatically, explicit batching provides better control over execution flow and error handling.

Processing one page per batch prevents memory issues on large sites and makes debugging easier when something fails.

💡 Tip: For sites with thousands of pages, you might increase the batch size to 5-10 to speed up processing, but watch your memory usage.
Parameters
- Batch Size: 1 - Processes one sitemap URL per loop iteration
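As a rough picture of what the batch size controls, this sketch splits a list of pages into consecutive batches. The `toBatches` helper is hypothetical and only models the grouping; the node itself also handles looping back for the next batch.

```javascript
// Hypothetical helper: splits a list into consecutive batches of `size` items.
function toBatches(items, size = 1) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const pages = ["/", "/about", "/blog", "/contact", "/pricing"];
console.log(toBatches(pages, 1).length); // 5 loop iterations, one page each
console.log(toBatches(pages, 2).length); // 3 loop iterations: 2, 2, and 1 pages
```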
Step 11
HTTP Request - Fetch Page Content for Link Extraction.
For each page in your sitemap, this node downloads the full HTML content. The next step will parse this HTML to extract all internal links that need testing.

The URL comes dynamically from the current sitemap entry being processed, using the standard loc field from XML sitemaps.
Parameters
- Method: GET - Retrieves the full page content
- URL: {{ $json.loc }} - The page URL from the current sitemap entry
- Authentication: None
- Send Query Parameters: OFF
- Send Headers: OFF
- Send Body: OFF
Step 12
Extract Internal Links - Parse HTML for All Links.
This Code node contains the intelligence of your broken link checker. It parses the HTML content, extracts all URLs using regex pattern matching, and filters out CDN resources, API endpoints, and static assets that don't need checking.

The JavaScript code handles deduplication and ensures only relevant internal links pass through to the testing phase.

Code Logic:
- Extracts the HTML from $input.item.json.data
- Gets your domain from the Set Domain node
- Uses regex to find all URLs in href and src attributes
- Deduplicates results using a Set
- Filters out CDN patterns (cloudflare, cloudfront, googleapis, etc.)
- Excludes static assets (.jpg, .png, .css, .js, etc.)
💡 Tip: You can customize the cdnPatterns array in the code to add additional patterns specific to your site's architecture.
Parameters
- Mode: Run Once for All Items - Processes the entire HTML content in a single execution
- Language: JavaScript
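The workflow's actual code is not reproduced here, so the following is only a minimal sketch of the logic listed above, written as a standalone function. The `extractInternalLinks` name, the regex, and the contents of `cdnPatterns` and `staticAssetPattern` are assumptions; the `url` and `sourcePage` output fields follow the names referenced later in the sub-workflow.

```javascript
// Illustrative patterns; adjust them to your site's architecture.
const cdnPatterns = ["cloudflare", "cloudfront", "googleapis"];
const staticAssetPattern = /\.(jpe?g|png|gif|svg|webp|css|js|ico|woff2?)$/i;

// Hypothetical helper: returns unique internal links found in href/src attributes.
function extractInternalLinks(html, domain, sourcePage) {
  const attrPattern = /(?:href|src)\s*=\s*["']([^"']+)["']/gi;
  const seen = new Set(); // deduplicates resolved URLs

  for (const match of html.matchAll(attrPattern)) {
    let url;
    try {
      // Resolve relative links against the page they were found on.
      url = new URL(match[1], sourcePage);
    } catch {
      continue; // skip values that cannot be parsed as URLs
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") continue;
    url.hash = ""; // a fragment points at the same resource
    seen.add(url.href);
  }

  return [...seen]
    .filter((href) => {
      const { hostname, pathname } = new URL(href);
      // Subdomains such as www.yourdomain.com count as internal.
      const isInternal = hostname === domain || hostname.endsWith(`.${domain}`);
      const isCdn = cdnPatterns.some((pattern) => href.includes(pattern));
      return isInternal && !isCdn && !staticAssetPattern.test(pathname);
    })
    .map((url) => ({ url, sourcePage }));
}

const html = `
  <a href="/pricing">Pricing</a>
  <a href="https://yourdomain.com/pricing#plans">Plans</a>
  <img src="/logo.png">
  <a href="https://cdn.cloudflare.com/lib.js">CDN</a>
  <a href="mailto:hi@yourdomain.com">Mail</a>
`;
console.log(extractInternalLinks(html, "yourdomain.com", "https://yourdomain.com/"));
// One result: https://yourdomain.com/pricing, found on https://yourdomain.com/
```

Inside the Code node, the HTML would come from $input.item.json.data and the domain from the Set Domain node, as described in the list above. Regex-based extraction is a pragmatic shortcut; it can miss links built by JavaScript at runtime.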
Step 13
Merge - Combine All Extracted Links.
After links have been extracted from all pages, this node combines everything into a single list. Despite the "Merge" name, the parameters below are those of an Aggregate node. The consolidated list is then sent to the sub-workflow for the actual link testing.
Parameters
- Aggregate: All Item Data (Into a Single List) - Combines all items into one array
- Put Output in Field: data - Names the output array "data"
- Include: All Fields - Preserves all extracted link information
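The effect of these settings can be pictured with a small sketch: many items go in, one item with a "data" array comes out. The `aggregateItems` helper and the sample items are hypothetical and only model the resulting shape.

```javascript
// Hypothetical helper: collects every item's JSON into one list under `fieldName`.
function aggregateItems(items, fieldName = "data") {
  return { [fieldName]: items.map((item) => item.json) };
}

const extractedItems = [
  { json: { url: "https://yourdomain.com/pricing", sourcePage: "https://yourdomain.com/" } },
  { json: { url: "https://yourdomain.com/blog", sourcePage: "https://yourdomain.com/about" } },
];

const aggregated = aggregateItems(extractedItems);
console.log(aggregated.data.length); // 2
```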
Step 14
Send to Webhook - Trigger the Link Testing Sub-Workflow.
This HTTP Request node sends all extracted links to a separate webhook-based sub-workflow that handles the actual link testing. Splitting the architecture this way keeps the main workflow clean and allows the testing logic to run independently.

The payload includes both the spreadsheet ID (for logging results) and the complete list of links to test.

💡 Tip: Replace the webhook URL with your own n8n instance URL. The path after "/webhook/" must match the Path configured in your sub-workflow's Webhook node ("brokenlinkchecker" in Step 15).
Parameters
- Method: POST - Sends data to the webhook
- URL: [YOUR_N8N_WEBHOOK_URL]/webhook/brokenlinkchecker - Your sub-workflow's webhook endpoint
- Authentication: None
- Send Query Parameters: OFF
- Send Headers: OFF
- Send Body: ON
- Body Content Type: JSON
- Specify Body: Using JSON
- JSON: {{ { spreadsheet_id: $('Create Report').item.json.spreadsheetId, data: $json.data } }} - Passes the report ID and link list
Step 15
Brokenlinkchecker Webhook - Receive Links for Testing.
This Webhook node starts the sub-workflow that actually tests each link. It receives the data from the main workflow and kicks off the link-checking process.

The webhook listens for POST requests and passes incoming data to the subsequent nodes for processing.

💡 Tip: The Test URL shown in the interface is for development testing. In production, use the Production URL which becomes active when the workflow is activated.
Parameters
- HTTP Method: POST - Accepts POST requests from the main workflow
- Path: brokenlinkchecker - The URL path for this webhook endpoint
- Authentication: None - The main workflow sends requests without authentication
- Respond: Using 'Respond to Webhook' Node - Delays response until processing completes
Step 16
Split Binary Data - Extract Link Data from Payload.
This node, configured like a Split Out node, pulls the link array out of the webhook payload. It isolates the "body.data" field, which contains all URLs that need testing, and emits one item per link.
Parameters
- Fields To Split Out: body.data - Extracts the data array from the incoming webhook body
- Include: No Other Fields - Only passes the link data forward
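The sketch below shows an assumed webhook payload, shaped like the JSON body sent in Step 14, and what splitting out "body.data" produces. The sample values and the `splitOutBodyData` helper are illustrative only.

```javascript
// Assumed incoming webhook item (illustrative values).
const webhookItem = {
  body: {
    spreadsheet_id: "example-spreadsheet-id",
    data: [
      { url: "https://yourdomain.com/pricing", sourcePage: "https://yourdomain.com/" },
      { url: "https://yourdomain.com/blog", sourcePage: "https://yourdomain.com/about" },
    ],
  },
};

// Hypothetical helper: one output item per entry in body.data, no other fields kept.
function splitOutBodyData(item) {
  const links = item?.body?.data;
  return Array.isArray(links) ? links.map((link) => ({ json: link })) : [];
}

for (const linkItem of splitOutBodyData(webhookItem)) {
  console.log(linkItem.json.url, "found on", linkItem.json.sourcePage);
}
```

Because no other fields are carried forward, the spreadsheet ID is no longer on each item after this step. That is why the logging node later reads it from the webhook node's output instead.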
Step 17
Split In Batches - Process Links Individually.
Each link needs individual testing, so this node creates a loop that processes one URL at a time. This prevents timeout issues and allows for proper error handling on each link check.
Parameters
- Batch Size: 1 - Tests one link per iteration
Step 18
HTTP Request - Test Each Link with HEAD Request.
This is where the actual link testing happens. The node sends HTTP HEAD requests to each URL—a lightweight method that checks if a resource exists without downloading its content.

HEAD requests are perfect for link checking because they're fast and consume minimal bandwidth while still returning the status code you need.

💡 Tip: The "Ignore SSL Issues" option is enabled so that sites with expired or self-signed certificates don't cause the request itself to fail. The trade-off is that links with certificate problems are treated as working, so the checker can miss SSL-related breakage.
Parameters
- Method: HEAD - Lightweight request that only retrieves headers
- URL: {{ $json.url }} - The current link being tested
- Authentication: None
- Send Query Parameters: OFF
- Send Headers: ON (empty headers configured)
- Send Body: OFF
- Ignore SSL Issues (Insecure): ON - Prevents SSL certificate errors from blocking tests
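For readers who want to see the idea outside n8n, here is a minimal sketch of a HEAD check using the fetch API. The `checkLink` helper is hypothetical; it assumes Node.js 18+ for the global fetch, and it does not replicate the "Ignore SSL Issues" setting.

```javascript
// Hypothetical helper: sends a HEAD request and reports the status code.
// `fetchImpl` defaults to the global fetch available in Node.js 18+.
async function checkLink(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url, { method: "HEAD", redirect: "follow" });
    return { url, statusCode: response.status };
  } catch (error) {
    // Network-level failures (DNS, timeout, TLS) have no HTTP status.
    return { url, statusCode: null, error: String(error.message ?? error) };
  }
}

// Usage (performs a real network request):
// checkLink("https://yourdomain.com/pricing").then(console.log);
```

Some servers reject HEAD with 405 Method Not Allowed even though the page loads fine with GET. If your report shows many such entries, retrying those URLs with GET is one possible refinement.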
Step 19
IF Broken - Filter Non-200 Status Codes.
This conditional node examines each HTTP response. If the status code isn't 200 (OK), the link is considered broken and routes to the logging path. Links returning 200 skip the logging step entirely.

The node catches 404s (not found), 500s (server errors), 301/302s (redirects that might indicate issues), and any other non-success status codes.
Parameters
- Condition: {{ $json.statusCode }} is not equal to 200
- Convert types where required: ON - Handles string/number type differences automatically
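The condition can be sketched as a small predicate. `isBrokenStrict` mirrors the rule described above, including the string-to-number conversion; `isBrokenLenient` is a hypothetical alternative, not part of the workflow, that accepts any 2xx response.

```javascript
// Mirrors the IF condition: anything other than 200 counts as broken.
function isBrokenStrict(statusCode) {
  return Number(statusCode) !== 200;
}

// Hypothetical looser variant: treat every 2xx response as healthy.
function isBrokenLenient(statusCode) {
  const code = Number(statusCode);
  return !(code >= 200 && code < 300);
}

for (const code of [200, "200", 204, 301, 404, 500]) {
  console.log(code, isBrokenStrict(code), isBrokenLenient(code));
}
```

The two functions differ only for success codes other than 200, such as 204: the strict rule logs them as broken, the lenient one does not. Which behavior you want depends on how your site responds to HEAD requests.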
Step 20
Google Sheets1 - Log Broken Links to Report.
When a broken link is detected, this node appends a row to your Google Sheets report. Each row contains the source page where the broken link was found and the actual broken URL.

The spreadsheet ID comes from the webhook payload, ensuring results land in the correct daily report.
Parameters
- Credential to connect with: Select your configured Google Sheets credential
- Resource: Sheet Within Document
- Operation: Append Row
- Document: By ID → {{ $('Receive Link').item.json.body.spreadsheet_id }} - Uses the ID passed through the webhook
- Sheet: By ID → 0 - First sheet in the document
- Mapping Column Mode: Map Each Column Manually
- source_url: {{ $('Loop Over Items').item.json.sourcePage }} - The page containing the broken link
- broken_link: {{ $('Loop Over Items').item.json.url }} - The broken URL itself
Step 21
Merge Items - Aggregate Processing Results.
After all links have been tested, this node consolidates the results into a single output. This aggregated data gets returned to the webhook caller.
Parameters
- Aggregate: All Item Data (Into a Single List)
- Put Output in Field: data
- Include: All Fields
Step 22
Respond to Webhook - Complete the Request Cycle.
This final node sends a response back to the main workflow, confirming that link testing has completed. It closes the HTTP connection initiated by the webhook request.
Parameters
- Respond With: First Incoming Item - Returns the first item from the aggregated results

Why this matters

Why Automating Broken Link Detection is a Game-Changer for SEO Professionals

Maintaining website health isn't optional; it's essential for anyone serious about search rankings and user experience. Broken links accumulate silently over time as pages get deleted, URLs change, and external resources disappear.

Common problems with manual link checking:
- Time-consuming process that requires clicking through every page
- Easy to miss links buried deep in site architecture
- No historical record of when issues first appeared
- Inconsistent checking schedules lead to prolonged damage
- Human error means broken links slip through unnoticed

Benefits of automated broken link detection:
- Daily scans catch issues within 24 hours of occurring
- Complete coverage of every page in your sitemap
- Organized Google Sheets reports for easy prioritization
- Historical records show link health trends over time
- Zero manual effort once the workflow is configured
- Scalable to any website size without additional work

By automating this process, you transform reactive firefighting into proactive maintenance. Instead of discovering broken links through user complaints or ranking drops, you catch them immediately and fix them before they cause damage. Tools like Semrush or Ahrefs can complement this workflow by providing deeper SEO insights.