Machine Learning for Natural Language Processing

A collection of tasks on machine learning for natural language processing, based on CS475 (Fall 2022) at KAIST.

About the Course and the Tasks

Please refer to the course homepage or instructions for each task.

An extensive examination of tokenization metrics. [Pull Request]

Build an RNN module and tune its hyperparameters for the best performance. [Pull Request]

Implement and evaluate pooling for token-level representation in BERT. [Report]
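
As a rough illustration of what pooling over token-level representations can look like, here is a minimal sketch in plain Python. It assumes the encoder output for one sentence is a list of per-token vectors plus a 0/1 attention mask marking real tokens. The names `mean_pool` and `max_pool` and the toy data are hypothetical and are not taken from the course materials or the report.

```python
from typing import List


def _masked(hidden: List[List[float]], mask: List[int]) -> List[List[float]]:
    """Keep only the vectors at positions where the mask is 1."""
    kept = [vec for vec, m in zip(hidden, mask) if m == 1]
    if not kept:
        raise ValueError("mask selects no tokens")
    return kept


def mean_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Average the unmasked token vectors dimension by dimension."""
    kept = _masked(hidden, mask)
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def max_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Take the per-dimension maximum over the unmasked token vectors."""
    kept = _masked(hidden, mask)
    dim = len(kept[0])
    return [max(vec[d] for vec in kept) for d in range(dim)]


# Toy input: 4 positions with hidden size 2; the last position is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(mean_pool(hidden, mask))  # [3.0, 2.0]
print(max_pool(hidden, mask))   # [5.0, 4.0]
```

Excluding padded positions through the mask keeps the padding vector from skewing either statistic. With a real BERT model the same idea would typically be applied to the final hidden states as tensors rather than Python lists.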

Review Ahn & Oh, Mitigating Language-Dependent Ethnic Bias in BERT (EMNLP 2021), and extend the fill-mask task on BERT.
Explore religious bias based on ethnicity in English and Spanish. [Report]

Google Colaboratory is recommended. Alternatively, the dependencies for each task are listed in its requirements.txt.