/stf-topic-model

Topic Modelling Brazilian Supreme Court Lawsuits

This repo holds the source code for the work described in the paper below:

@InProceedings{luz_etal_jurix2020,
          author = {Pedro H. {Luz de Araujo} and Te\'{o}filo E. {de Campos}},
          title = {Topic Modelling Brazilian Supreme Court Lawsuits},
          booktitle = {International Conference on Legal Knowledge and Information Systems ({JURIX})},
          publisher = {IOS Press},
          series = {Frontiers in Artificial Intelligence and Applications},
          pages = {113--122},
          year = {2020},
          month = {December 9-11},
          address = {Prague, Czech Republic},
          doi = {10.3233/FAIA200855},
          url = {http://ebooks.iospress.nl/volumearticle/56168},
}

The sections below describe the requirements and the files.

We kindly ask that you cite our paper in any publication that results from the use of our work.

Requirements

  1. Python 3.6
  2. Gensim
  3. pyLDAvis
  4. Pandas
  5. Scikit-learn
  6. XGBoost
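
As a quick sanity check before opening the notebooks, the sketch below reports which of the listed packages cannot be found in the current environment. It is not part of the repository. The import names (for example `sklearn` for Scikit-learn) and the helper `missing_requirements` are assumptions made for illustration.

```python
import importlib.util
import sys

# Assumed import names for the packages listed above (hypothetical mapping,
# not taken from the repository).
REQUIRED_MODULES = {
    "Gensim": "gensim",
    "pyLDAvis": "pyLDAvis",
    "Pandas": "pandas",
    "Scikit-learn": "sklearn",
    "XGBoost": "xgboost",
}


def missing_requirements(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    # The README lists Python 3.6; print the running version for comparison.
    print("Python", sys.version.split()[0])
    missing = missing_requirements()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up without importing them, so it runs quickly and has no side effects.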

Files

  • lda_explore.ipynb: notebook for topic model training.
  • train_xgboost_tfidf.ipynb: notebook for classifier training using TF-IDF values or word counts.
  • train_xgboost_topics.ipynb: notebook for classifier training using topic distributions.
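
To illustrate the difference between the two feature types named for train_xgboost_tfidf.ipynb, the sketch below computes raw word counts and TF-IDF weights for a few toy documents using only the standard library. It is not the notebooks' code. The weighting `tf * log(N / df)` is the plain textbook variant and differs from the smoothed default in Scikit-learn, and the token lists and function names are made up for illustration.

```python
import math
from collections import Counter


def word_counts(docs):
    """Return one Counter of raw term counts per tokenised document."""
    return [Counter(doc) for doc in docs]


def tfidf(docs):
    """Weight raw counts by log(N / df), the textbook TF-IDF variant."""
    counts = word_counts(docs)
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc_counts in counts for term in doc_counts)
    return [
        {term: tf * math.log(n_docs / df[term]) for term, tf in doc_counts.items()}
        for doc_counts in counts
    ]


# Toy, made-up token lists (assumption: not data from the paper).
docs = [
    ["recurso", "extraordinario", "recurso"],
    ["agravo", "recurso"],
    ["agravo", "regimental"],
]

counts = word_counts(docs)
weights = tfidf(docs)
```

With word counts, a term's value depends only on how often it occurs in the document. With TF-IDF, terms that appear in many documents are down-weighted, so a term that occurs in every document gets a weight of zero under this variant.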