This Python program checks URLs for safety against two sources:
- the Google Safe Browsing API
- the Abuse.ch database
For demonstration purposes, it fetches the top 100 URLs from Wikipedia and evaluates each one for threats such as malware, social engineering, unwanted software, and potentially harmful applications.
- Integration with Google Safe Browsing API.
- Extraction of URLs from an HTML table.
- Assessment of URLs for various threat types.
- Reporting on potential threats or confirming safety.
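The URL extraction step listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the names `TableLinkParser` and `extract_urls_from_table` are hypothetical, and it assumes the URLs of interest are absolute links inside `<table>` elements.

```python
from html.parser import HTMLParser


class TableLinkParser(HTMLParser):
    """Collect href values from <a> tags that appear inside a <table>."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0  # greater than 0 while inside a <table>
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth > 0:
            href = dict(attrs).get("href")
            # Keep absolute links only; relative links are skipped.
            if href and href.startswith(("http://", "https://")):
                self.urls.append(href)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1


def extract_urls_from_table(html, limit=100):
    """Return up to `limit` unique URLs in document order."""
    parser = TableLinkParser()
    parser.feed(html)
    unique = list(dict.fromkeys(parser.urls))  # de-duplicate, keep order
    return unique[:limit]


sample = """
<p><a href="https://outside.example/">not in a table</a></p>
<table>
  <tr><td><a href="https://example.com/">Example</a></td></tr>
  <tr><td><a href="/wiki/Relative">Relative</a></td></tr>
  <tr><td><a href="https://example.org/">Example Org</a></td></tr>
</table>
"""
print(extract_urls_from_table(sample))
```

Links outside the table and relative links are ignored, so only the two absolute links in the sample table are returned.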
Requirements: Python 3.11.
- Clone the repository:
git clone https://github.com/BenderScript/rag_poison
- Navigate to the cloned directory:
cd rag_poison
- Install the required packages using pip3:
pip3 install -r requirements.txt
- Obtain a Google Safe Browsing API key and place it in a .env file as GOOGLE_SAFE_BROWSING_API_KEY.
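As a rough sketch of the last step, the key could be read from the .env file as shown below. The project may well use a library such as python-dotenv instead; `read_env_file` and `load_api_key` are hypothetical helpers that assume simple `KEY=value` lines, and `your-key-here` is a placeholder.

```python
import os
import tempfile


def read_env_file(path):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional quotes.
            values[key.strip()] = value.strip().strip("\"'")
    return values


def load_api_key(path=".env"):
    """Prefer a real environment variable, then fall back to the file."""
    key = os.environ.get("GOOGLE_SAFE_BROWSING_API_KEY")
    if not key and os.path.exists(path):
        key = read_env_file(path).get("GOOGLE_SAFE_BROWSING_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_SAFE_BROWSING_API_KEY is not set")
    return key


# Demonstration with a throwaway .env file and a placeholder key.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as handle:
        handle.write("# Google Safe Browsing\n")
        handle.write("GOOGLE_SAFE_BROWSING_API_KEY=your-key-here\n")
    print(read_env_file(env_path))
```

Failing early with a clear error when the key is missing avoids sending unauthenticated requests later in the run.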
Run the program with Python 3:
python3 main.py
- check_url_with_google_safe_browsing(url, api_key): Checks a URL against Google's Safe Browsing API and returns matches.
- run(): Main function for URL extraction, API key loading, and URL safety checking.
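One way the lookup function might be written is sketched below against the Safe Browsing v4 `threatMatches:find` endpoint, using only the standard library. This is an assumption about the shape of the call, not the repository's implementation: `build_payload`, the `clientId`, and the `clientVersion` values are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://safebrowsing.googleapis.com/v4/threatMatches:find"

# The four threat categories the program reports on.
THREAT_TYPES = [
    "MALWARE",
    "SOCIAL_ENGINEERING",
    "UNWANTED_SOFTWARE",
    "POTENTIALLY_HARMFUL_APPLICATION",
]


def build_payload(url):
    """Build the JSON request body for a single URL lookup."""
    return {
        # clientId and clientVersion are illustrative values.
        "client": {"clientId": "url-safety-demo", "clientVersion": "0.1"},
        "threatInfo": {
            "threatTypes": THREAT_TYPES,
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": url}],
        },
    }


def check_url_with_google_safe_browsing(url, api_key):
    """Return the list of threat matches; an empty list means none found."""
    query = urllib.parse.urlencode({"key": api_key})
    request = urllib.request.Request(
        f"{API_URL}?{query}",
        data=json.dumps(build_payload(url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The API answers with an empty object when nothing matches.
    return body.get("matches", [])


# Inspect the request body without touching the network.
print(json.dumps(build_payload("http://example.com/"), indent=2))
```

Keeping the payload construction separate from the network call makes the request body easy to inspect and test without an API key.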
The program handles two failure cases:
- Non-200 responses from the API.
- Exceptions raised when a request fails.
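The two failure cases can be illustrated with a small wrapper. In this sketch the transport is passed in as `send`, a hypothetical callable returning a status code and a response body, so the logic can be exercised without network access; `check_url_safely` and the stand-in senders are not part of the project.

```python
import json


def check_url_safely(url, api_key, send):
    """Return a list of matches, or None if the check could not complete.

    `send` is a hypothetical callable: (url, api_key) -> (status, body_text).
    """
    try:
        status, body = send(url, api_key)
    except OSError as exc:
        # urllib.error.URLError and ConnectionError are OSError subclasses.
        print(f"Request failed for {url}: {exc}")
        return None

    if status != 200:
        print(f"API returned status {status} for {url}")
        return None

    try:
        return json.loads(body).get("matches", [])
    except json.JSONDecodeError as exc:
        print(f"Unreadable response for {url}: {exc}")
        return None


# Stand-in transports that simulate each outcome.
def flagged_sender(url, api_key):
    return 200, '{"matches": [{"threatType": "MALWARE"}]}'


def clean_sender(url, api_key):
    return 200, "{}"


def forbidden_sender(url, api_key):
    return 403, '{"error": "forbidden"}'


def failing_sender(url, api_key):
    raise ConnectionError("network unreachable")


for sender in (flagged_sender, clean_sender, forbidden_sender, failing_sender):
    print(sender.__name__, check_url_safely("http://example.com/", "key", sender))
```

Returning `None` for a failed check, rather than an empty list, keeps "could not verify" distinct from "no threats found".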
Feel free to contribute! Open an issue or submit a pull request on GitHub.
This project is licensed under the Apache License.