How Can You Use the Report to Find and Fix Pages That Are Not Indexed Due to Crawl Errors?

Summary

Using webmaster tools like Google Search Console, you can identify and resolve crawl errors that prevent pages from being indexed. This involves analyzing the Coverage report, fixing the underlying issues, and requesting a re-crawl. The steps below walk through the process.

Access Google Search Console

Google Search Console (GSC) is a free tool provided by Google that helps you monitor, maintain, and troubleshoot your site's presence in Google Search results. To begin, log in to your Google Search Console account. If you haven't set up GSC for your website yet, follow Google's setup guide.

Locate the Coverage Report

In the Google Search Console dashboard, navigate to the "Coverage" section, which is found under the "Index" menu. (In current versions of Search Console, this report appears as "Pages" under "Indexing".) The report shows which pages have been indexed successfully and which have issues that need to be addressed.

Understanding the Coverage Report

  • Errors: Pages that couldn't be indexed due to critical issues.
  • Valid with Warnings: Pages that are indexed but might have some issues you should be aware of.
  • Valid: Successfully indexed pages.
  • Excluded: Pages that weren't indexed for various reasons, including intentional exclusions in your robots.txt file.

Crawl Errors

Focus on the "Errors" and "Excluded" sections. Click on the specific error type to view the list of pages affected and detailed error messages. Common crawl errors include:

  • 404 Not Found: The page cannot be found, likely due to broken links or incorrect URLs.
  • Server Errors (5xx): Issues on your server, such as downtime or server overloads.
  • Redirect Errors: Problems with redirection, such as redirect loops or incorrect redirect URLs.
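To confirm what the report shows before fixing anything, you can re-check the affected URLs yourself. The following is a minimal sketch, not part of Search Console: the function names (classify_status, fetch_status, triage) and the bucket labels are illustrative assumptions, and it uses only Python's standard library. It requests each URL without following redirects and groups the results into the three error types listed above.

```python
from urllib import error, request


def classify_status(status: int) -> str:
    """Map an HTTP status code to the crawl-error buckets described above."""
    if status == 404:
        return "not_found"
    if 500 <= status <= 599:
        return "server_error"
    if 300 <= status <= 399:
        return "redirect"
    if 200 <= status <= 299:
        return "ok"
    return "other"  # e.g. 403 or 410; review these by hand


class _NoRedirect(request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx responses stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the status code of one GET request, without following redirects."""
    opener = request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except error.HTTPError as exc:
        return exc.code


def triage(urls, fetch=fetch_status):
    """Group URLs by bucket; `fetch` can be swapped out for offline testing."""
    report = {}
    for url in urls:
        report.setdefault(classify_status(fetch(url)), []).append(url)
    return report
```

For example, triage(["https://example.com/old-page"]) would make one live request to that placeholder URL and return a dictionary keyed by bucket name. A server may respond differently to this script than to Googlebot, so treat the result as a first check rather than a replacement for the report.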

Fixing Crawl Errors

404 Not Found

To fix 404 errors, first identify the broken or incorrect URLs and correct any links that point to them. If a page was removed by mistake, restore it. If it was removed intentionally, either create a 301 redirect to a closely related resource or update the internal links that still point to the removed page.
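Finding the internal links that still point to a removed page can be partly automated. The sketch below assumes you already have a page's HTML and a list of URLs reported as 404; the function name links_to_broken and the example URLs are hypothetical. It resolves every anchor href against the page's own URL and returns the ones that match a known broken URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_broken(page_url, html, broken_urls):
    """Return the broken URLs that `html` (served at `page_url`) links to."""
    collector = _LinkCollector()
    collector.feed(html)
    broken = set(broken_urls)
    # Resolve relative hrefs so "/old-page" and absolute links compare equal.
    resolved = (urljoin(page_url, href) for href in collector.hrefs)
    return sorted({url for url in resolved if url in broken})
```

Running this over each page of the site gives a list of pages whose links need updating. It compares URLs as exact strings, so variants such as a trailing slash or a query string would need normalizing first.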

Server Errors (5xx)

Server errors need to be addressed promptly and usually require server-side changes. These may include adjusting the server configuration, making sure the server is up and running, or improving server performance so that it can handle the crawl load.
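Server access logs can show which URLs return 5xx responses to Google's crawler. The sketch below is an illustration under several assumptions: the log uses the common "combined" access log layout, crawler requests are recognized only by the substring "Googlebot" in the line (user-agent strings can be spoofed), and the function name server_errors_by_path is hypothetical.

```python
import re
from collections import Counter

# Matches the request and status fields of a combined-format log line,
# e.g. "GET /products HTTP/1.1" 503
REQUEST_AND_STATUS = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def server_errors_by_path(log_lines, user_agent_hint="Googlebot"):
    """Count 5xx responses per path for lines mentioning `user_agent_hint`."""
    counts = Counter()
    for line in log_lines:
        if user_agent_hint not in line:
            continue
        match = REQUEST_AND_STATUS.search(line)
        if match and match.group("status").startswith("5"):
            counts[match.group("path")] += 1
    return counts
```

Paths with the highest counts are a reasonable place to start investigating, since repeated failures on the same URL suggest a persistent problem rather than brief downtime.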

Redirect Errors

Redirect errors occur when redirect rules are faulty, for example when they form a loop or a long chain. Make sure every redirect points directly to its final destination, and remove any rules that send a URL back to itself.
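Chains and loops are easy to spot once the redirect rules are written down as a source-to-target mapping. The following sketch assumes such a mapping is available as a Python dictionary; the function name trace_redirects, the verdict labels, and the max_hops default are illustrative choices, not values taken from Google's documentation.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow `start` through a {source: target} redirect map.

    Returns (path, verdict), where verdict is one of:
    "ok" (zero or one hop), "chain" (several hops), "loop", or "too_long".
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, "loop"  # we came back to a URL already visited
        seen.add(current)
        if len(path) - 1 > max_hops:
            return path, "too_long"
    hops = len(path) - 1
    verdict = "chain" if hops > 1 else "ok"
    return path, verdict
```

With rules = {"/a": "/b", "/b": "/c"}, calling trace_redirects("/a", rules) returns (["/a", "/b", "/c"], "chain"), which indicates that "/a" should be pointed straight at "/c".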

Requesting a Re-crawl

After fixing the errors, you can ask Google to re-crawl the affected pages so that the corrected URLs are indexed sooner:

  1. URL Inspection Tool: In GSC, use the URL Inspection Tool to enter each corrected URL.
  2. Request Indexing: Click on "Request Indexing" to prompt Google to re-crawl the page.

For more detailed instructions, see [Requesting Indexing, 2023].

Monitoring and Prevention

Regularly monitoring your site's performance and crawl status helps you catch issues early, before they affect indexing. Set up email notifications for crawl errors in Google Search Console and routinely check the Coverage report.
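Between checks of the Coverage report, a simple comparison of two status snapshots can highlight URLs that have started failing. This is a sketch under stated assumptions: each snapshot is a dictionary of URL to HTTP status code that you collect yourself, and the function name newly_failing is hypothetical.

```python
def newly_failing(previous, current):
    """Return URLs that fail now (status >= 400) but did not fail before.

    `previous` and `current` map URL -> HTTP status code. A URL missing
    from `previous` is treated as previously healthy, so new pages that
    already fail are reported too.
    """
    def failing(status):
        return status >= 400

    return sorted(
        url
        for url, status in current.items()
        if failing(status) and not failing(previous.get(url, 200))
    )
```

Running the comparison on a schedule and reviewing only the returned URLs keeps routine monitoring short, while the Coverage report remains the source of truth for what Google has actually indexed.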

Conclusion

With tools like Google Search Console, you can identify and fix the crawl errors that prevent your pages from being indexed. Regular monitoring and timely fixes are key to keeping the site healthy and as many of its pages indexed as possible.

References