This script extracts domains (including subdomains) from a CSV file and filters out email addresses. It's designed to be run in a Linux terminal environment, with support for colored output to differentiate between newly added domains and duplicates.
- Processes CSV files to extract domains.
- Filters out email addresses and HTML tags.
- Supports colored terminal output for added domains (green) and skipped duplicates (red).
- Prints the total count of unique domains processed to the console.
- Allows user interaction for selecting the CSV file and specifying the output file name.
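
The features above can be illustrated with a minimal sketch. It is not the script's actual implementation. It assumes a simple regular expression as a stand-in for tldextract, and raw ANSI escape codes as a stand-in for termcolor, so that it runs with the standard library alone. The names `extract_domains`, `report`, and the three patterns are hypothetical.

```python
import csv
import io
import re

# Hypothetical patterns for illustration; the real script relies on tldextract.
TAG_RE = re.compile(r"<[^>]+>")        # HTML tags such as <b> or </a>
EMAIL_RE = re.compile(r"\S+@\S+")      # anything that looks like an email address
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)

# ANSI colour codes, used here in place of termcolor (an assumption).
GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def extract_domains(csv_text):
    """Return (unique, duplicates): domains in first-seen order, and repeats."""
    seen = set()
    unique, duplicates = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            # Strip HTML tags first, then drop email addresses entirely.
            cell = TAG_RE.sub(" ", cell)
            cell = EMAIL_RE.sub(" ", cell)
            for match in DOMAIN_RE.findall(cell):
                domain = match.lower()
                if domain in seen:
                    duplicates.append(domain)
                else:
                    seen.add(domain)
                    unique.append(domain)
    return unique, duplicates


def report(unique, duplicates):
    """Print added domains in green, skipped duplicates in red, then the total."""
    for domain in unique:
        print(f"{GREEN}added: {domain}{RESET}")
    for domain in duplicates:
        print(f"{RED}skipped duplicate: {domain}{RESET}")
    print(f"Total unique domains: {len(unique)}")


if __name__ == "__main__":
    sample = (
        "name,site\n"
        "Alice,<b>shop.example.com</b>\n"
        "Bob,bob@mail.example.org\n"
        "Carol,shop.example.com\n"
    )
    report(*extract_domains(sample))
```

A regular expression cannot tell a registered domain from its public suffix (for example `example.co.uk`), which is the kind of distinction tldextract is built to handle.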
- Python 3.x
- tldextract library
- termcolor library
- Ensure Python 3.x is installed on your system.
- Install the required Python libraries:
pip install tldextract termcolor
- Place the script in the same directory as your CSV files.
- Run the script using Python:
python3 domain_extractor.py
- Follow the prompts to select a CSV file and specify the name of the output file.
- If you want verbose output, include the -v or --verbose flag when running the script:
python3 domain_extractor.py --verbose
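
A flag like this is commonly wired up with `argparse` from the standard library. The sketch below shows one possible way to do it. The helper names `build_parser` and `log` are hypothetical, and what the script actually prints in verbose mode is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that accepts -v / --verbose (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="domain_extractor.py",
        description="Extract unique domains from a CSV file.",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",  # False unless the flag is given
        help="print per-domain details (assumed behaviour)",
    )
    return parser


def log(message, verbose):
    """Print message only when verbose mode is on; return whether it printed."""
    if verbose:
        print(message)
    return verbose


if __name__ == "__main__":
    args = build_parser().parse_args()
    log("Verbose mode enabled.", args.verbose)
```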
Contributions are welcome! Please feel free to submit a pull request or open an issue for any bugs or feature requests.
This script is provided under the MIT License. See the LICENSE file for more details.