UN-Parallel-Corpora-Analysis

This is Kinan Al-Mouk's term project for LING1340 Data Science for Linguists at the University of Pittsburgh.


United Nations Six-Way Parallel Corpora Analysis

May 1, 2022, by Kinan Al-Mouk

Goal: Explore the linguistic elements of the six official languages of the United Nations: English, Spanish, French, Russian, Arabic, and Mandarin Chinese.

Data Source: United Nations, Department for General Assembly and Conference Management: UN Parallel Corpora

Summary

This is my term project submission for LING1340 Data Science for Linguists, taught by Na-Rae Han at the University of Pittsburgh. All data was obtained from the UN website and processed using NLTK and spaCy.
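As a rough illustration of the kind of per-language comparison a parallel corpus allows, here is a minimal sketch that counts word tokens and distinct word types for one aligned sentence. It is not the project's actual pipeline: the sample sentences, `tokenize`, and `token_stats` are hypothetical, and a plain standard-library regex stands in for the NLTK and spaCy tokenizers so the snippet runs without extra installs. The regex splits on whitespace and punctuation, so it would not segment Mandarin Chinese, which needs a dedicated word segmenter.

```python
import re

# Hypothetical sample: one aligned sentence per language.
# These are illustrative only and are not taken from the project's data files.
SAMPLE = {
    "en": "All human beings are born free and equal in dignity and rights.",
    "fr": "Tous les êtres humains naissent libres et égaux en dignité et en droits.",
    "es": "Todos los seres humanos nacen libres e iguales en dignidad y derechos.",
}

# Assumption: a simple regex tokenizer (runs of word characters, or runs of
# punctuation) used here in place of an NLTK or spaCy tokenizer.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return TOKEN_RE.findall(text.lower())


def token_stats(text):
    """Return token count, type count, and type/token ratio, ignoring punctuation."""
    words = [tok for tok in tokenize(text) if tok[0].isalnum()]
    types = set(words)
    ratio = len(types) / len(words) if words else 0.0
    return {"tokens": len(words), "types": len(types), "ttr": ratio}


if __name__ == "__main__":
    for lang, sentence in SAMPLE.items():
        stats = token_stats(sentence)
        print(f"{lang}: {stats['tokens']} tokens, {stats['types']} types, "
              f"TTR={stats['ttr']:.2f}")
```

To move closer to the tools named above, `tokenize` could be swapped for a tokenizer from NLTK or spaCy while leaving `token_stats` unchanged.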

Directory

Guestbook