useR2018_datawrangling

useR2018 presentation document

2018-07-13 10:50 Working with text /P10/ From poor data to training data /data mining, applications, text analysis/NLP

Imperfect data is everywhere. User feedback is not trustworthy, and implicit data is unlabelled and hard to wrangle, which makes it hard to use for machine learning and for many other purposes. But we can still use such data by changing how we think about it and by wrangling it. In this presentation, I suggest some new ideas for wrangling data for use in machine learning and show our case studies.