A website scraper that extracts page text and converts it to Markdown (.md) files. Each file is placed in a directory named after the directory it was found under, so the output replicates the site's file structure.
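To illustrate how a page URL could map onto a mirrored directory layout, here is a minimal sketch. It is not taken from this repository's code: the function name `url_to_markdown_path`, the `output` root directory, and the choice of `index.md` for directory-style URLs are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url_to_markdown_path(url: str, output_root: str = "output") -> PurePosixPath:
    """Map a page URL to a .md path that mirrors the site's directories.

    Hypothetical helper for illustration; the real tool may name files differently.
    """
    parsed = urlparse(url)
    # Split the URL path into its non-empty segments.
    parts = [segment for segment in parsed.path.split("/") if segment]
    # A bare domain or a trailing-slash URL is treated as a directory index page.
    if not parts or parsed.path.endswith("/"):
        parts.append("index")
    # Swap the page's extension (if any) for .md.
    stem = PurePosixPath(parts[-1]).stem or "index"
    return PurePosixPath(output_root, parsed.netloc, *parts[:-1], stem + ".md")


if __name__ == "__main__":
    print(url_to_markdown_path("https://example.com/docs/intro.html"))
```

Under these assumptions, a page found under `/docs/` on the site ends up under a `docs` directory in the output.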
Use with caution
git clone https://github.com/johnconnor-sec/scrapedown
cd scrapedown
pip install poetry
poetry shell
poetry install
python3 main.py
The tool now also includes the links gathered from the site and produces cleaner Markdown output.
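For a rough idea of what gathering links from a page involves, the sketch below uses only the Python standard library. It is an assumption about the general technique, not this project's implementation; `LinkCollector` and `collect_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute, de-duplicated href targets from <a> tags (illustrative only)."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                absolute = urljoin(self.base_url, value)
                if absolute not in self.links:
                    self.links.append(absolute)


def collect_links(html: str, base_url: str) -> list[str]:
    """Return the links found in `html`, in order of first appearance."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="https://other.org/x">Other</a>'
    for link in collect_links(page, "https://example.com/docs/"):
        print(link)
```

Resolving every link against the page URL keeps relative and absolute links comparable, which makes de-duplication straightforward.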
This is completely free to anyone who thinks it's cool. If anything, I think it could work for gathering data for LLMs, note-taking, or finding interesting endpoints.
Just clone it, install the dependencies, and run python3 main.py. Watch it work.
If you'd like to make this project better, please show me what you have made!