Grenoble Data Science Meetup

Tutorial on single-label, multi-class text classification for the session of February 23rd

The tutorial will cover:

  • Data loading
  • Vectorization and Transformations
  • Application of different families of classification algorithms
  • Evaluation and metrics
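
To make the four steps concrete, here is a minimal, dependency-free sketch of the whole pipeline on a toy corpus. It is only an illustration under stated assumptions: the sparse "label feature:count" line layout, the toy documents, the helper names, and the nearest-centroid classifier are all stand-ins chosen for brevity, not the notebook's actual data or methods.

```python
import math
from collections import Counter, defaultdict

# Toy corpus in an assumed sparse layout: "label feature_id:count ...".
TRAIN = [
    "1 10:2 11:1",
    "1 10:1 12:1",
    "2 20:3 21:1",
    "2 20:1 22:2",
    "3 30:2 31:1",
]
TEST = ["1 10:1 11:2", "2 20:2 22:1", "3 30:1 31:1"]


def load(lines):
    """Data loading: parse each line into a label and a {feature_id: count} dict."""
    labels, docs = [], []
    for line in lines:
        label, *pairs = line.split()
        labels.append(int(label))
        docs.append({int(f): float(v) for f, v in (p.split(":") for p in pairs)})
    return labels, docs


def fit_idf(docs):
    """Vectorization: smoothed inverse document frequency for each feature."""
    df = Counter(f for doc in docs for f in doc)
    n = len(docs)
    return {f: math.log((1 + n) / (1 + c)) + 1 for f, c in df.items()}


def tfidf(doc, idf):
    """Transformation: L2-normalised TF-IDF weights; unseen features are dropped."""
    vec = {f: tf * idf[f] for f, tf in doc.items() if f in idf}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {f: w / norm for f, w in vec.items()}


def fit_centroids(vectors, labels):
    """Classification (training): sum the document vectors of each class."""
    centroids = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for f, w in vec.items():
            centroids[y][f] += w
    return centroids


def predict(vec, centroids):
    """Classification (prediction): pick the class whose centroid is most similar."""
    def cosine(c):
        dot = sum(w * c.get(f, 0.0) for f, w in vec.items())
        norm = math.sqrt(sum(w * w for w in c.values())) or 1.0
        return dot / norm
    return max(centroids, key=lambda y: cosine(centroids[y]))


def evaluate(y_true, y_pred):
    """Evaluation: accuracy and macro-averaged F1 over the true classes."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1_scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


y_train, docs_train = load(TRAIN)
y_test, docs_test = load(TEST)
idf = fit_idf(docs_train)
centroids = fit_centroids([tfidf(d, idf) for d in docs_train], y_train)
y_pred = [predict(tfidf(d, idf), centroids) for d in docs_test]
accuracy, macro_f1 = evaluate(y_test, y_pred)
print(f"accuracy={accuracy:.2f} macro-F1={macro_f1:.2f}")
```

Macro-averaged F1 is reported next to accuracy because, with many classes of uneven size, accuracy alone can hide poor performance on the rare ones.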

The methods will be applied to Wikipedia data available at http://lshtc.iit.demokritos.gr/ and described in: LSHTC: A Benchmark for Large-Scale Text Classification, Ioannis Partalas, Aris Kosmopoulos, Nicolas Baskiotis, Thierry Artieres, George Paliouras, Eric Gaussier, Ion Androutsopoulos, Massih-Reza Amini, Patrick Galinari, CoRR abs/1503.08581, 2015

*Please unzip the data before using it with the notebook.
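
If you would rather unzip from Python than from a file manager, the standard-library `zipfile` module is enough. The sketch below is only a suggestion: the helper name `unzip_data`, the archive name in the usage comment, and the `data` target directory are hypothetical and should be replaced with the file you actually downloaded.

```python
import zipfile
from pathlib import Path


def unzip_data(archive_path, target_dir="data"):
    """Extract every member of the archive into target_dir and return their names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Hypothetical usage (the archive name is an assumption):
# unzip_data("wikipedia_data.zip", target_dir="data")
```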