NLP-Final-Project

CSCI-SHU 376 Natural Language Processing | Spring 2021 | Final Project

By Tinglong Liao (tl2564), Xue Bai (xb347), Alison Yao (yy2564)

"The purpose of the final project is two-fold. First, it will give you the opportunity and incentive to study in-depth a topic that interests you. Second, it will test your ability to use NLP algorithms to solve a problem within this topic. The project definition is purposefully open-ended. The goal is for you to be able to spend time thinking deeply about NLP and how to best apply it in a real-life scenario."

Dataset

Please check the TableQA directory.

GitHub Source: https://github.com/ZhuiyiTechnology/TableQA

Paper detailing the dataset: https://arxiv.org/abs/2006.06434

The data was used for a competition: https://tianchi.aliyun.com/competition/entrance/231716/introduction
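
For orientation, here is a minimal sketch of how one NL2SQL label could be split into its SELECT and WHERE parts, which is the split the model variants below are named after. The record layout (`sel`, `agg`, `cond_conn_op`, `conds`), the operator lists, the sample question, and the `render_sql` helper are all assumptions made for illustration; please verify them against the dataset repository before relying on them.

```python
import json

# Assumed operator vocabularies (hypothetical; check the dataset docs).
AGG_OPS = ["", "AVG", "MAX", "MIN", "COUNT", "SUM"]
COND_OPS = [">", "<", "==", "!="]
CONN_OPS = ["", "AND", "OR"]


def render_sql(record, headers):
    """Render an assumed TableQA-style label as a readable SQL-like string.

    `record` is one parsed JSON line; `headers` is the list of column names
    of the table the question refers to. Both layouts are assumptions.
    """
    sql = record["sql"]

    # SELECT part: each selected column index is paired with an aggregation index.
    select_parts = []
    for col, agg in zip(sql["sel"], sql["agg"]):
        name = headers[col]
        select_parts.append(f"{AGG_OPS[agg]}({name})" if AGG_OPS[agg] else name)

    # WHERE part: each condition is a [column index, operator index, value] triple.
    where_parts = [
        f"{headers[col]} {COND_OPS[op]} '{value}'" for col, op, value in sql["conds"]
    ]

    query = "SELECT " + ", ".join(select_parts)
    if where_parts:
        # Fall back to AND if no connector is given for several conditions.
        connector = CONN_OPS[sql["cond_conn_op"]] or "AND"
        query += " WHERE " + f" {connector} ".join(where_parts)
    return query


# A made-up record in the assumed one-JSON-object-per-line layout.
line = (
    '{"table_id": "t1", "question": "What is the average price of books by Author A?", '
    '"sql": {"sel": [1], "agg": [1], "cond_conn_op": 0, "conds": [[0, 2, "Author A"]]}}'
)
record = json.loads(line)
print(render_sql(record, ["author", "price"]))
```

Under these assumptions, a SELECT-only model would predict `sel` and `agg`, while a WHERE-only model would predict `cond_conn_op` and `conds`.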

Project Implementation

Please check NLP_final_paper.pdf.

All implementation code is in the code directory.

Single-BERT Model

Variant 1 – SB_multitask: all_task_7 final.ipynb

Variant 2 – SB_SELECT: all_task_only_sel_final.ipynb

Variant 3 – SB_WHERE: all_task_only_where_final.ipynb

Double Tower Model

Variant 1 – DT_multitask: double_tower_single_task.ipynb

Variant 2 – DT_SELECT & Variant 3 – DT_WHERE: double_tower_multi_task.ipynb

Trained models

We trained the models on a Google Colab GPU to speed things up. The trained models are available at the following link.

Baidu Cloud Link: https://pan.baidu.com/s/12Qj_9CC5PvyXcgz_r_2nWw Password: 4imj

References

https://github.com/BaeSeulki/NL2LF/blob/master/README.md

Baseline

The baseline is provided by the dataset providers: https://github.com/ZhuiyiTechnology/nl2sql_baseline

Improvement

No.1 Model: https://tianchi.aliyun.com/forum/postDetail?spm=5176.12586969.1002.6.694d1ca2X5AViB&postId=78781

PPT: https://github.com/nudtnlp/tianchi-nl2sql-top1/blob/master/天池NL2SQL冠军方案.pdf

Pretrained BERT Embeddings: https://github.com/ymcui/Chinese-BERT-wwm

No.3 Model: https://github.com/lifloveyou/tianchi_nl2sql