This repository contains the source code of an AI-based structured web data extractor.
- Author: Jan Joneš
- Thesis: PDF, assignment, submission, slides
- Demo: live, Docker Hub, examples below
- Data: SWDE with visuals

The repository is organized as follows:

- `awe/`: Python module (data manipulation and machine learning). See `awe/README.md`.
- `js/`: Node.js app (visual attribute extractor and inference demo). See `js/README.md`.
- `docs/`
  - `dev/`
    - `data.md`: dataset preparation.
    - `extractor.md`: running the visual extractor.
    - `train.md`: training instructions.
    - `release.md`: release instructions.
  - `demo/`

To run the demo locally with Docker:

    docker pull janjones/awe-demo
    docker run --rm -it -p 3000:3000 janjones/awe-demo

Then open a web browser and navigate to http://localhost:3000/.
For more details, see `docs/demo/run.md`.
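
The container may need a moment before it starts answering requests, so a small readiness check can be handy before opening the browser. The following is a minimal Python sketch and not part of this repository: the helper name `wait_for_server` and its `timeout` and `interval` parameters are hypothetical, and the only assumption is that the demo answers plain HTTP requests at the URL given above.

```python
import time
import urllib.error
import urllib.request

# The demo runs on the local machine, so skip any configured HTTP proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass.

    Hypothetical helper for illustration; not part of the awe repository.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # An HTTP error status still means a server is listening.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the demo container running and port 3000 published as shown above, `wait_for_server("http://localhost:3000/")` should return `True` once the server responds, and `False` if nothing answers within the timeout.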

To train the model on the SWDE dataset, first start a shell in the `janjones/awe-gradient` container:

    docker pull janjones/awe-gradient
    docker run --rm -it -v awe:/storage -p 3000:3000 janjones/awe-gradient bash

Then, run inside the Docker container:

    git clone https://github.com/jjonescz/awe .
    git clone https://github.com/jjonescz/swde-visual data/swde
    python -m awe.training.params
    python -m awe.training.train
    # Model is trained, now you can run the demo.
    cd js
    pnpm install
    DEBUG=1 pnpm run server

For more details, see
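
The in-container steps listed above can also be chained from a single script that stops at the first failure. This is only a sketch under stated assumptions: `Step`, `STEPS`, `run_steps` and the `runner` parameter are hypothetical names introduced here, not part of the repository, and the script assumes it is started inside the container in an empty working directory (the first `git clone` targets `.`).

```python
import os
import subprocess
from typing import Callable, Dict, List, NamedTuple, Optional


class Step(NamedTuple):
    argv: List[str]                       # command and its arguments
    cwd: Optional[str] = None             # working directory (None = current)
    env: Optional[Dict[str, str]] = None  # extra environment variables


# The same commands as in the listing above, in the same order.
# `cd js` becomes cwd="js"; the `DEBUG=1` prefix becomes env={"DEBUG": "1"}.
STEPS = [
    Step(["git", "clone", "https://github.com/jjonescz/awe", "."]),
    Step(["git", "clone", "https://github.com/jjonescz/swde-visual", "data/swde"]),
    Step(["python", "-m", "awe.training.params"]),
    Step(["python", "-m", "awe.training.train"]),
    Step(["pnpm", "install"], cwd="js"),
    Step(["pnpm", "run", "server"], cwd="js", env={"DEBUG": "1"}),
]


def run_steps(steps: List[Step], runner: Callable[..., object] = subprocess.run) -> None:
    """Run each step in order, stopping at the first command that fails."""
    for step in steps:
        # Merge the extra variables into a copy of the current environment.
        env = {**os.environ, **(step.env or {})}
        print("$", " ".join(step.argv))
        # With subprocess.run, check=True raises CalledProcessError
        # when a command exits with a non-zero status.
        runner(step.argv, cwd=step.cwd, env=env, check=True)
```

Calling `run_steps(STEPS)` would execute the whole sequence. The last step starts the demo server, so the call presumably does not return until the server is stopped. Passing a different `runner` makes it possible to print or log the commands without executing them.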
Generated by the live demo.