Update the MASC dataset to the version used in Recipe 7
francojc opened this issue · 2 comments
francojc commented
The transformed dataset from Recipe 7 is cleaner.
To remove non-words:
- pos: CD, FW, LS, SYM
- lemma: ^\W$
Line 267 in b149ec7
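For reference, the two criteria above could be sketched roughly as follows. This is only an illustration in Python, not the code from the repository: the function names and the sample tokens are hypothetical, and it assumes each token is a record with `pos` and `lemma` fields and that the tags are Penn Treebank tags.

```python
import re

# POS tags named in this issue. Assumed to be Penn Treebank tags:
# CD (cardinal number), FW (foreign word), LS (list item marker), SYM (symbol).
NON_WORD_POS = {"CD", "FW", "LS", "SYM"}

# Lemma pattern named in this issue: exactly one non-word character.
NON_WORD_LEMMA = re.compile(r"^\W$")


def is_word(token):
    """Return True if the token passes both filters (hypothetical helper)."""
    if token["pos"] in NON_WORD_POS:
        return False
    if NON_WORD_LEMMA.match(token["lemma"]):
        return False
    return True


def remove_non_words(tokens):
    """Keep only the tokens that are not flagged as non-words."""
    return [t for t in tokens if is_word(t)]


# Made-up sample rows, for illustration only.
sample = [
    {"lemma": "the", "pos": "DT"},
    {"lemma": "3", "pos": "CD"},
    {"lemma": ",", "pos": ","},
    {"lemma": "dog", "pos": "NN"},
    {"lemma": "%", "pos": "SYM"},
]

kept = remove_non_words(sample)
print([t["lemma"] for t in kept])
```

Note that the lemma pattern only catches single-character lemmas, so multi-character punctuation would pass through unless the pattern is widened.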
francojc commented
Also, I don't think it is necessary to show or describe the filtering process in detail. Just get to the analysis.
francojc commented
Addressed