The Bulk Redirect Checker is a comprehensive Python tool designed for URL redirect analysis. It offers detailed insights into redirect chains, canonical mismatches, and more.
- Diverse Redirect Detection: Identifies a range of redirects, including HTTP to HTTPS, WWW to Non-WWW, and several others.
- Canonical URL Analysis: Detects and reports canonical mismatches in URLs.
- Redirect Chain Mapping: Traces the complete redirect chain of a URL.
- Checkpoint Functionality: Allows processing to be paused and resumed, preventing data loss.
- Logging Capability: Maintains detailed logs for analysis and debugging.
- Flowchart Visualization: (In conjunction with
index.html
) Presents a graphical representation of redirect chains. - Command Line Flexibility: Supports various command-line options for versatile operations.
- Output Customization: Generates detailed CSV reports of redirect analysis.
- Clone the Repository:
git clone [repository-url]
- Navigate to the Directory:
cd path/to/bulkredirectchecker
- Install Dependencies:
pip install -e .
Process a list of URLs from a CSV file and generate a CSV output file.
bulkredirectchecker input_file.csv -o output_file.csv
Analyze a single URL.
bulkredirectchecker -u [single-url-to-check]
Enable logging to record the process details.
bulkredirectchecker input_file.csv -o output_file.csv -l
Use checkpointing to resume processing from the last processed URL.
bulkredirectchecker input_file.csv --checkpoint
The checkpoint feature is designed to save the progress of URL processing. It ensures that in case of an interruption, the tool can resume processing from the last saved state. This is particularly useful for large datasets where processing is time-consuming. The tool writes the state to a checkpoint.json
file after processing each URL. When resumed, it skips URLs already processed, ensuring efficiency and saving time.
input_file
: Specifies the CSV file containing the list of URLs to process.-o
,--output_file
: Determines the name of the output CSV file.--checkpoint
: Activates checkpointing functionality.-u
,--url
: Specifies a single URL for analysis.-l
,--log
: Enables logging of the process.
Each URL is analyzed for its complete redirect path. This includes detection of the redirect type, status code, and the final URL after all redirects. The tool categorizes redirects into types like HTTP to HTTPS, WWW to Non-WWW, etc.
The tool checks the final URL of a redirect chain for canonical tags. It identifies any mismatches between the final URL and the canonical URL, highlighting potential SEO issues.
The generated CSV file contains detailed information about each URL, including:
- Redirect chain
- Final URL
- Status codes for each redirect
- Canonical URL
- Error messages (if any)
Using index.html
, the tool visualizes the redirect chain as a flowchart. This provides an intuitive graphical representation of the redirect path, making it easier to understand complex redirect chains.
Contributions to enhance the tool's functionality or to improve its efficiency are welcome.
This tool is available under the MIT License.