Scan is an easy-to-use server that lets you score essays automatically. Follow the quickstart instructions to get everything up and running.
Note: This is a relatively new project. It has good unit test coverage and has had some manual testing (not just by me), but please test it yourself before using it in anything critical.
The easiest way to get started is with a Vagrant virtual machine:
First, install VirtualBox.
Next, install Vagrant.
Then clone this repo. If you are unfamiliar with git, please first install git and then look at the basics of cloning a repo.
git clone https://github.com/VikParuchuri/scan.git
Then navigate to the directory and start Vagrant from the command line:
cd scan
vagrant up
This should take 20-30 minutes to download and install dependencies on newer machines.
Congrats! Visiting http://127.0.0.1:5000 in your browser will now let you use Scan.
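Because provisioning can take a while, it may help to check from a script whether the server is answering yet. The following is a minimal sketch, not part of Scan: `wait_for_server` is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, attempts=30, delay=2.0, opener=urllib.request.urlopen):
    """Poll `url` until it answers or `attempts` run out; return True on success.

    `opener` defaults to urllib's urlopen and is a parameter only so the
    function can be exercised without a real server.
    """
    for _ in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                # Any non-5xx response means the web server is listening.
                if response.status < 500:
                    return True
        except urllib.error.HTTPError as err:
            # HTTPError is handled first because it subclasses URLError.
            # A 4xx reply still proves the server is up.
            if err.code < 500:
                return True
        except (urllib.error.URLError, OSError):
            pass  # Nothing is listening yet; try again after the delay.
        time.sleep(delay)
    return False
```

Calling `wait_for_server("http://127.0.0.1:5000")` after `vagrant up` finishes returns `True` once the address above responds, and `False` if it never does within the allotted attempts.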
If the virtual machine runs out of memory (for example, if models fail to build), increase the available memory by editing this line in the Vagrantfile:
v.memory = 2048
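With the VirtualBox provider the value is in megabytes, and a running machine only picks up the change after `vagrant reload`. If you prefer to script the edit, here is a minimal sketch; `set_vm_memory` is a hypothetical helper, and it assumes the file contains a line of exactly the form shown above.

```python
import re

# Matches the assignment shown above, keeping everything before the number.
MEMORY_LINE = re.compile(r"^(\s*v\.memory\s*=\s*)\d+", re.MULTILINE)


def set_vm_memory(vagrantfile_text, megabytes):
    """Return the Vagrantfile text with the `v.memory` value replaced."""
    new_text, count = MEMORY_LINE.subn(
        lambda match: f"{match.group(1)}{megabytes}", vagrantfile_text
    )
    if count == 0:
        raise ValueError("no `v.memory = <number>` line found")
    return new_text
```

To apply it, read the Vagrantfile into a string, pass it through `set_vm_memory(text, 4096)` (4096 is only an example value), write the result back, and run `vagrant reload`.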
Linux is currently the best supported platform, but it is also possible to install on Windows. On Linux, run:
sudo xargs -a apt-packages.txt apt-get install -y
pip install -r pre-requirements.txt
pip install -r requirements.txt
On Windows:
- Install the SciPy stack from here.
- Install scikit-learn from the same place. Full install instructions are here.
- pip install -r requirements.txt
Running the web server:
python app.py
Running the task worker, which handles jobs like model creation and scoring:
celery -A app.celery worker --loglevel=debug -B
Running tests. Test coverage is high for the core algorithm, but not for the web portion.
nosetests --with-coverage --cover-package="core" --logging-level="INFO"
Once you have everything set up and running, here are some steps to try:
- Create an account
- Log in with your account
- Click on "questions" to get to the questions list
- Create a question using the form.
- Click on "view essays" under the question to see more details.
- Add essays using a CSV file upload (there are two sample sets of essays at data/test/censorship to use. train_2.csv will take some time to make a model, and train_2_short will be much faster.)
- Click on "create model and score essays"
- It may take some time to create the model, but you will get a status prompt that auto-refreshes.
- Add more essays and score them using the "score essay" button on each essay. You will have to manually refresh the page after doing this to see the score.
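Before uploading a CSV in the steps above, it can be useful to look at its header and size. This sketch makes no assumption about which columns Scan expects; `summarize_csv` is a hypothetical helper and simply reports what is in the file.

```python
import csv


def summarize_csv(path, preview_rows=3):
    """Return (header, first few data rows, total number of data rows)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # Empty list if the file has no rows.
        preview = []
        total = 0
        for row in reader:
            if total < preview_rows:
                preview.append(row)
            total += 1
    return header, preview, total
```

For example, `summarize_csv("data/test/censorship/train_2.csv")` (assuming the sample file sits at that path inside the repo) shows the column names and how many essays the file holds, which gives a rough idea of how long model creation will take.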
Note: You will want to test results on essays that do not have an actual score. Predicted scores on essays that have a score entered upfront will be misleadingly high!
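One way to follow this advice is to hold back some scored essays, upload them without their scores, and compare the predictions with the hidden scores afterwards. The sketch below only illustrates that idea: the `(text, score)` pair layout and both function names are assumptions made for the example, not Scan's data model or API.

```python
import random


def split_holdout(essays, holdout_fraction=0.2, seed=0):
    """Split (text, score) pairs into training data and a held-out set.

    Returns (train, unscored, hidden_scores): `train` keeps its scores,
    `unscored` holds the held-out essays as (text, None) pairs, and
    `hidden_scores` holds their true scores in the same order.
    """
    shuffled = list(essays)
    random.Random(seed).shuffle(shuffled)  # Seeded so the split is repeatable.
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]
    unscored = [(text, None) for text, _ in holdout]
    hidden_scores = [score for _, score in holdout]
    return train, unscored, hidden_scores


def exact_agreement(predicted, actual):
    """Fraction of essays where the predicted score equals the true score."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)
```

Build the model from `train`, add the `unscored` essays and score them, then pass the predicted scores and `hidden_scores` to `exact_agreement` for an estimate that is not inflated by essays whose scores were entered upfront.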
Enjoy!
Contributions are very welcome. Please fork and pull request to contribute.
Please open a GitHub issue if you see a bug. If you have a general question, feel free to contact vik.paruchuri@gmail.com.