Text Splitting
Closed this issue · 0 comments
sap218 commented
- split by thread, using all comments in each thread (not by user)
- sort each thread into one of three bins: short, medium or long
- hold out 20% of each bin as the test data set: store it separately and leave it untouched
- use the remaining 80% of each bin for training
- extract synonyms for the ontology
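
The binning and 80/20 split could be sketched roughly as below. This is only an illustration, not the project's actual code: the issue does not say how thread length is measured, so the sketch assumes it is the number of comments, and the bin thresholds (`short_max`, `medium_max`), the function names and the `threads` dict layout are all hypothetical.

```python
import random
from collections import defaultdict


def bin_for(n_comments, short_max=5, medium_max=20):
    """Assign a thread to a length bin (thresholds are hypothetical)."""
    if n_comments <= short_max:
        return "short"
    if n_comments <= medium_max:
        return "medium"
    return "long"


def split_threads(threads, test_fraction=0.2, seed=0):
    """Split thread ids into train/test, stratified by length bin.

    `threads` is assumed to map a thread id to its list of comments.
    """
    rng = random.Random(seed)  # fixed seed so the held-out set is reproducible

    # Group thread ids by bin, using comment count as the length measure.
    bins = defaultdict(list)
    for thread_id, comments in threads.items():
        bins[bin_for(len(comments))].append(thread_id)

    train, test = [], []
    for name in ("short", "medium", "long"):
        ids = sorted(bins[name])  # sort first so the shuffle is deterministic
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])   # 20% of this bin is held out
        train.extend(ids[n_test:])  # the remaining 80% is for training
    return train, test
```

Calling `train, test = split_threads(threads)` once and writing `test` to its own file keeps the held-out threads separate from anything used during training.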