A Reference Framework for the Automated Exploration of Web Applications.
It provides a set of common web features so you can test crawlers in a well-defined environment.
First, clone the repository and cd into it.
Using Docker
- Clone repository
- Install Docker
- Install docker-compose
- Build and use the docker image with docker-compose
cd crawler-benchmark
cp .env.example .env  # then edit with desired credentials
docker-compose up -d
When it's done, you can visit the app running at localhost:8080
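Once the app is up, a crawler can be pointed at it. The following is a minimal sketch of such a crawler using only the Python standard library, not a tool shipped with this repository. The names `BASE_URL`, `LinkExtractor`, `fetch_page`, `extract_links` and `crawl` are hypothetical, and the sketch assumes the app serves plain HTML pages linked through `<a href>` tags at localhost:8080.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Assumption: the benchmark is reachable here after `docker-compose up -d`.
BASE_URL = "http://localhost:8080/"


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", "replace")


def extract_links(html, page_url):
    """Return absolute, same-host links found in `html`, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    host = urlparse(page_url).netloc
    seen, result = set(), []
    for href in parser.links:
        # Resolve relative links and drop the fragment part.
        absolute = urljoin(page_url, href).split("#", 1)[0]
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


def crawl(start_url, max_pages=20, fetch=fetch_page):
    """Breadth-first crawl from `start_url`; returns the visited URLs in order."""
    queue, visited = [start_url], []
    while queue and len(visited) < max_pages:
        url = queue.pop(0)
        if url in visited:
            continue
        visited.append(url)
        try:
            html = fetch(url)
        except OSError:
            continue  # unreachable page: skip it and keep crawling
        for link in extract_links(html, url):
            if link not in visited and link not in queue:
                queue.append(link)
    return visited
```

Calling `crawl(BASE_URL)` returns the list of pages reached. The `fetch` parameter exists so the crawl logic can be exercised against canned pages without a running container, and `max_pages` keeps the sketch from looping forever on generated links.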
To run the test suite with a coverage report:
docker-compose run --rm website bash -c 'pytest --cov --cov-report term:skip-covered'
We use grunt to automatically compile SCSS files into CSS, and we may add more tasks in the future. npm dependencies are specified in package.json.
Install sass from the command line (you may need sudo privileges):
gem install sass
npm install
npm run grunt
- Build frontend using webpack and load pure.scss from node_modules
- Publish docker image so the world can spin this up
- Add nodejs docker support
- Add link to home page (from title)
- Add new features!
  - Robots.txt validation
  - Visited URLs
  - Provide an API
  - Website navigation generation from model
- Improve settings
  - Import
  - Export
  - JSON? YAML?
- Spread the word, make the application known by crawler authors
  - Put online
  - Get crawled by general crawlers like Googlebot
  - Share results with the public