This is Assignment 1 of DS-GA 1011 Natural Language Processing and Representation
- This is the additional exploration of batch size:
- Note: There are two notebooks in this repo. [Bag of N-Gram Document Classification.ipynb](https://github.com/hb1500/NLP-Movie-Sentiment-Analysis/blob/master/Bag%20of%20N-Gram%20Document%20Classification.ipynb) is used to generate results for all the hyperparameters except the tokenization scheme. [Bag-of-Word_Tokenization_Without_Clearning .ipynb](https://github.com/hb1500/NLP-Movie-Sentiment-Analysis/blob/master/Bag-of-Word_Tokenization_Without_Clearning%20.ipynb) runs tokenization without any cleaning of the tokens.