
For word embedding, because the experimental dataset is too small, I merged train_data and test_data and added the TED dataset to form the word embedding training data.

Also because the experimental dataset is small, the model uses min_count=30 so that rare words do not degrade semantic and syntactic accuracy. The number of epochs is set to 10 to get good results while avoiding overfitting.

For the Word2Vec Skip-gram model, comparison experiments were run with window sizes of [2,3,5,7,9] and word vector dimensions of [10,20,50,100,200].
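The comparison grid described above could be set up roughly as follows. This is a minimal sketch, not the original training script: it assumes gensim 4.x (where the parameters are named `vector_size` and `epochs`; gensim 3.x used `size` and `iter`), and `build_grid`, `train_all`, and the `sentences` argument (a list of tokenized sentences from the merged corpus) are hypothetical names introduced for illustration.

```python
from itertools import product

# Values taken from the text above.
WINDOW_SIZES = [2, 3, 5, 7, 9]
VECTOR_DIMS = [10, 20, 50, 100, 200]


def build_grid(min_count=30, epochs=10):
    """Return one Word2Vec keyword-argument dict per (window, dimension) pair."""
    return [
        {
            "sg": 1,  # 1 selects Skip-gram (0 would select CBOW)
            "window": window,
            "vector_size": dim,
            "min_count": min_count,  # ignore words seen fewer than 30 times
            "epochs": epochs,
        }
        for window, dim in product(WINDOW_SIZES, VECTOR_DIMS)
    ]


def train_all(sentences):
    """Train one model per grid entry (requires gensim, imported lazily).

    `sentences` is assumed to be a list of token lists, for example
    [["this", "talk", "was", "great"], ...].
    """
    from gensim.models import Word2Vec  # third-party dependency

    models = {}
    for params in build_grid():
        key = (params["window"], params["vector_size"])
        models[key] = Word2Vec(sentences=sentences, **params)
    return models
```

Each trained model can then be scored on whatever semantic and syntactic evaluation is in use, and the scores compared across the 25 configurations.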

The results show that the Word2Vec Skip-gram model achieves higher semantic accuracy, which better matches the requirements of sentiment analysis. A window size of 3 with a word vector dimension of 50 is the best choice.


For the sequence model, I tried two models, Bi-RNN and Bi-LSTM. Since the amount of data is small and the selected word vector dimension is only 50, I set the hidden size to 64 and used a single RNN or LSTM layer. The experimental results are shown in result. The Bi-LSTM model performed better, with a higher F1 score; the best results came from training for 300 epochs with a learning rate of 0.005.
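The architecture described above might look roughly like the following in PyTorch. This is a sketch under stated assumptions rather than the original model code: the choice of PyTorch, the number of output classes (`num_classes=2`), the use of the final hidden states as the sentence representation, and the names `CONFIG`, `encoder_output_size`, and `build_model` are all assumptions made for illustration.

```python
# Hyperparameters taken from the text above.
CONFIG = {
    "embedding_dim": 50,   # selected word vector dimension
    "hidden_size": 64,     # features in the hidden layer
    "num_layers": 1,       # a single RNN or LSTM layer
    "bidirectional": True,
    "learning_rate": 0.005,
    "epochs": 300,
}


def encoder_output_size(cfg):
    """Feature count after concatenating the final forward and backward states."""
    return cfg["hidden_size"] * (2 if cfg["bidirectional"] else 1)


def build_model(cfg, num_classes=2, cell="lstm"):
    """Build a Bi-LSTM (cell="lstm") or Bi-RNN (cell="rnn") classifier.

    Requires PyTorch, imported lazily. num_classes is an assumed value.
    """
    import torch
    from torch import nn

    rnn_cls = nn.LSTM if cell == "lstm" else nn.RNN

    class SeqClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.rnn = rnn_cls(
                input_size=cfg["embedding_dim"],
                hidden_size=cfg["hidden_size"],
                num_layers=cfg["num_layers"],
                bidirectional=cfg["bidirectional"],
                batch_first=True,
            )
            self.fc = nn.Linear(encoder_output_size(cfg), num_classes)

        def forward(self, x):
            # x: (batch, seq_len, embedding_dim) tensor of word vectors
            _, state = self.rnn(x)
            # nn.LSTM returns (h_n, c_n); nn.RNN returns h_n only
            h_n = state[0] if isinstance(state, tuple) else state
            # h_n: (num_layers * num_directions, batch, hidden_size)
            if cfg["bidirectional"]:
                features = torch.cat([h_n[-2], h_n[-1]], dim=1)
            else:
                features = h_n[-1]
            return self.fc(features)

    return SeqClassifier()
```

Because the layer is bidirectional, the classifier head receives 2 × 64 = 128 features per sentence. An optimizer such as `torch.optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])` could drive training, though the optimizer actually used is not stated here.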
