Issues
How are non-entity tokens handled?
#110 opened by pzdkn - 0
Access to OCR outputs
#109 opened by vishekau - 1
Issue writing the dataset as parquet in add_features
#105 opened by radkoff - 3
No token file for ...
#104 opened by zhxgj - 0
Update wand version 0.9 -> 0.10
#92 opened by moredatarequired - 0
Hand check 2020 sample data, all fields
#61 opened by jstray - 0
Fix 2012 duplicate data problems
#46 opened by jstray - 0
Create infer.py
#49 opened by jstray - 1
Load 1000 random 2020 documents into Overview
#48 opened by jstray - 0
Train on combined 2012 and 2014 data
#55 opened by jstray - 0
Run totals model on 2020 data
#50 opened by jstray - 0
Modify create_training_data.py to create labels for advertiser and contract number
#47 opened by jstray - 0
Merge fuzzy-matching code into infer.py
#58 opened by jstray - 0
Continuous 2020 downloading and inference
#59 opened by jstray - 0
Hand-check 2020 test totals
#57 opened by jstray - 0
Merge 2012 and 2014 training data
#54 opened by jstray - 0
Create 2014 tokens.csv
#53 opened by jstray - 0
Generate start and end date labels from 2014 data
#52 opened by jstray - 0
Match output token more intelligently
#18 opened by moredatarequired - 1
Pull PDFs on demand for annotation
#16 opened by moredatarequired - 0
Add license
#17 opened by moredatarequired - 0
Create test version of sweep
#22 opened by moredatarequired