Kankuro-Crawler is a web crawler developed in NodeJS that navigates web pages and collects their most important features (requests, screenshots, frames, WebSockets, files...). The purpose of this project is to serve as a crawler template for different kinds of purposes. It is built using chrome-remote-interface to drive the navigation.
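To illustrate how chrome-remote-interface drives a page, here is a minimal sketch (not the project's actual code; the URL and output file name are placeholders) that navigates, logs requests, and saves a screenshot:

```js
// Minimal chrome-remote-interface sketch: navigate a page, log its
// requests, and save a screenshot. Assumes a Chrome instance started
// with --remote-debugging-port=9222 (the default port CDP connects to).
const CDP = require('chrome-remote-interface');
const fs = require('fs');

CDP(async (client) => {
  const { Network, Page } = client;
  await Network.enable();
  await Page.enable();

  // Log every outgoing request (the "Requests" feature).
  Network.requestWillBeSent(({ request }) => console.log(request.url));

  await Page.navigate({ url: 'https://example.com' }); // placeholder URL
  await Page.loadEventFired();

  // Page.captureScreenshot returns the image Base64-encoded
  // (the "Screenshot" feature).
  const { data } = await Page.captureScreenshot({ format: 'png' });
  fs.writeFileSync('screenshot.png', Buffer.from(data, 'base64'));

  await client.close();
}).on('error', console.error);
```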
This section is an introduction to the usage of the tool, a "Getting Started".
> git clone https://github.com/AlbertoFDR/Crawler.git
> cd Crawler/
> chmod +x install.sh
> ./install.sh
- Screenshot: Takes a screenshot of the page.
- Frames: Gathers the frames in the page.
- Requests: Collects the requests made by the page.
- WebSockets: Gathers the WebSocket communication, if any.
- Files: Downloads the page's files (HTML, JS, CSS, Wasm...).
- Automatic handling of any interstitial page, JavaScript-created dialogs, and any opened window (see the sketch after this list).
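As a hedged sketch of how features like frame gathering and dialog handling can be implemented on top of chrome-remote-interface (illustrative only, not Kankuro-Crawler's actual source; the URL is a placeholder):

```js
// Sketch: gather frames and auto-dismiss JavaScript dialogs via CDP.
const CDP = require('chrome-remote-interface');

CDP(async (client) => {
  const { Page } = client;
  await Page.enable();

  // Dismiss alert/confirm/prompt dialogs so the crawl never blocks
  // (the JavaScript-created dialogs feature).
  Page.javascriptDialogOpening(async () => {
    await Page.handleJavaScriptDialog({ accept: false });
  });

  await Page.navigate({ url: 'https://example.com' }); // placeholder URL
  await Page.loadEventFired();

  // Page.getFrameTree returns the page's frame hierarchy
  // (the "Frames" feature).
  const { frameTree } = await Page.getFrameTree();
  console.log(frameTree.frame.url,
              (frameTree.childFrames || []).map((f) => f.frame.url));

  await client.close();
}).on('error', console.error);
```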
- WebSockets test server: There are two versions, one with plaintext communication and the other with Base64-encoded communication. Sudo is needed because the script uses port 80.
> pip3 install websockets
> sudo ./script.sh
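On the crawler side, the WebSocket traffic can be observed through the CDP Network domain; text frames arrive verbatim in `payloadData`, while binary frames arrive Base64-encoded, which matches the plaintext/Base64 distinction above. A minimal sketch (placeholder URL, not the project's actual code):

```js
// Sketch: capture WebSocket frames with chrome-remote-interface.
const CDP = require('chrome-remote-interface');

CDP(async (client) => {
  const { Network, Page } = client;
  await Network.enable();
  await Page.enable();

  // Text frames carry payloadData verbatim; binary frames are Base64-encoded.
  Network.webSocketFrameReceived(({ response }) => {
    console.log('WS <-', response.payloadData);
  });
  Network.webSocketFrameSent(({ response }) => {
    console.log('WS ->', response.payloadData);
  });

  // Placeholder: a page served by the test server on port 80.
  await Page.navigate({ url: 'http://localhost/' });
  await Page.loadEventFired();
}).on('error', console.error);
```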
- Alberto FDR: Web