
Projects by students in the current run of the Huawei NLP Course. Previous projects can be accessed here.

| Name | Description | Team | Repository |
| --- | --- | --- | --- |
| Building a language model from user reviews of goods and using it for contextual advertising | Creating a word corpus from a website with product reviews. Training a classification model on these reviews. Detecting the context of a dialog and the product name mentioned in it, in order to show ads that match that context. | @akalend @rehcoeg | |
| Movie Poster Caption Generation | A particular case of the task of generating text from image embeddings. | @kazzand | |
| A Russian Question Answering System for Inclusive Education | A closed-domain question-answering model for Russian, built with transfer learning techniques. The model is fine-tuned on a custom dataset collected with the methodology described in the original SQuAD paper. | @vifirsanova | |
| Indonesian-Russian Machine Translation | The Indonesian-Russian translation pair is currently quite weak, even in the Yandex and Google translation systems. I try to extend the known ind-ru corpora by mapping ind-en and en-rus corpora. The main goal of this project is to practice with Transformer-based models and to build a machine translation system that achieves a decent BLEU score. | @minakyan | |
| Sentiment Analysis and AutoML | The goal is to perform sentiment analysis on the amazon-fine-food-reviews dataset and compare different hyperparameter search engines: hyperopt, BOHB, and Optuna. | @aazarov | |
| English sentiment analysis | I will try to solve a common NLP problem related to sentiment analysis. The data is taken from Twitter and needs to be pre-processed because the texts are very raw. Also, since the classes are unbalanced, I will try to apply data augmentation. | @slavkostrov | |
| Topic modeling and news classification in Hebrew | The goal is to gather data and build a classification model for Hebrew, which is a low-resource language, and also to perform topic modeling. | @imvladikon | |
| Russian Text Sentiment Transfer | Transforming a sentence to alter its sentiment while preserving the content. | @orzhan | |
| Domain adaptation of a transformer model for improving semantic search | I am exploring domain adaptation of a transformer model to improve search relevance. | @algis | |
| Predicting user ratings from a movie's description | Predicting a number from text is a classical task, but the existing solutions for TMDB website data on a similar dataset rely not only on the description but also on a large number of other fields. In this task we will try to use only the description to achieve a similar or better result. | @aakzn | |
| Topical extractive summarization | Topical extractive summarization aims to extract the sentences most relevant to a given topic. | @pacifikus | |