README.md file for CSE_517 assignment 4: Structured Learning with Feature-Rich Models.
###Run Best Model and print result
- Run
source best_model.sh
- Run
source eval.sh
- Run
source read.sh
###Data Location
- First run
cp config.ini.template config.ini
- Change the
data_dir
attributes inconfig.ini
file to the data location you place. - To simplified the procedure above, I have put the data dir inside this folder.
###Usage
- Use
python main.py --help
to see argument list you can use. - Use
--lrn-rate [LEARN_RATE]
to specify learning rate eta of Perceptron. - Use
--epoch [EPOCH_TIME]
to specify how many epoch the algorithms will run through. --no-ctoken
,--no-ptoken
,--no-ftoken
,--no-cpos
,--no-ppos
,--no-fpos
,--no-cchunk
,--no-pchunk
and--no-fchunk
means no current token, no previous token, no future token, no current pos tag, no previous pos tag, no future pos tag, no current syntactic chunk, no previous syntactic chunk and no future syntactic chunk as features, respectively.- Use
--test
to train the model with "train+dev" set and evaluate the model with "test" set. If not specified, it will be trained on "train" and evaluate the model with "dev" set. - When
--small
is specified, the learning algorithms use the truncated (small) dataset.
###Output
-
The output files will be located in the output folder with the filename:
[dev|train]_eta[LEARN_RATE]_epoch_[EPOCH_TIME]_[FEATURE_CODE].txt
-
Use
./conlleval.txt < filename
to eval the result. -
FEATURE_CODE
is a sequence of digits of length six. It's the true-false value of--no-ctoken
,--no-ptoken
,--no-ftoken
,--no-cpos
,--no-ppos
,--no-fpos
,--no-cchunk
,--no-pchunk
and--no-fchunk
. Where 1 means True and 0 means False.