
IMDB-Dataset-Classification-using-Pre-trained-Word-Embedding-with-GloVec-6B

In this project, I worked with a small corpus of simple sentences. I tokenized the words using n-grams from the NLTK library and performed both word-level and character-level one-hot encoding. I also used the Keras Tokenizer to tokenize the sentences and implemented word embedding with the Embedding layer. For sentiment analysis, I used the IMDB dataset, a popular benchmark for binary sentiment classification. To improve the model's performance, I leveraged the pre-trained GloVe 6B word embeddings (100-dimensional), which provide useful semantic information for the words in the corpus.

Overall, the project focuses on text preprocessing, tokenization, and the use of pre-trained word embeddings for sentiment analysis, and demonstrates how these techniques can be combined to build effective machine learning models for classification tasks.
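The n-gram tokenization and word-level one-hot encoding steps can be sketched in plain Python. This is a minimal illustration, not the notebook's code: `ngrams` mirrors what `nltk.ngrams` returns, and `one_hot_encode` follows the common convention of reserving index 0 (as Keras does); both function names are mine.

```python
def ngrams(tokens, n):
    # Slide a window of size n over the token list (same tuples as nltk.ngrams)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def one_hot_encode(samples):
    # Build a word index starting at 1; index 0 is reserved, as in Keras
    token_index = {}
    for sample in samples:
        for word in sample.split():
            if word not in token_index:
                token_index[word] = len(token_index) + 1
    # One-hot tensor: samples x max_length x (vocabulary size + 1)
    max_length = max(len(s.split()) for s in samples)
    results = [[[0] * (len(token_index) + 1) for _ in range(max_length)]
               for _ in samples]
    for i, sample in enumerate(samples):
        for j, word in enumerate(sample.split()):
            results[i][j][token_index[word]] = 1
    return token_index, results
```

Character-level one-hot encoding works the same way, with the vocabulary built from individual characters instead of whitespace-split words.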
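The Keras Tokenizer assigns integer indices to words by frequency and converts sentences to index sequences. A minimal pure-Python mimic of that behavior (the class `TinyTokenizer` is illustrative, not a Keras API) makes the mechanics explicit:

```python
from collections import Counter

class TinyTokenizer:
    """Minimal mimic of the Keras Tokenizer: fit_on_texts / texts_to_sequences."""

    def __init__(self, num_words=None):
        self.num_words = num_words  # optional cap on vocabulary size, as in Keras
        self.word_index = {}

    def fit_on_texts(self, texts):
        counts = Counter(w for t in texts for w in t.lower().split())
        # The most frequent word gets index 1; index 0 is reserved for padding
        for rank, (word, _) in enumerate(counts.most_common(), start=1):
            self.word_index[word] = rank

    def texts_to_sequences(self, texts):
        limit = self.num_words or float("inf")
        # Keep only words whose index is below num_words; unknown words are dropped
        return [[self.word_index[w] for w in t.lower().split()
                 if self.word_index.get(w, limit) < limit] for t in texts]
```

In the notebook itself, the real `keras.preprocessing.text.Tokenizer` plays this role, and the resulting integer sequences are padded and fed into an `Embedding` layer that maps each index to a dense vector.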
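Wiring the GloVe 6B vectors into the model amounts to parsing the embeddings file and building a matrix whose row i holds the vector for the word with index i. A sketch under the usual assumptions (the helper name `build_embedding_matrix` is mine; the file format "word v1 v2 ... v100" is that of the GloVe 6B release):

```python
def build_embedding_matrix(glove_lines, word_index, embedding_dim=100,
                           max_words=10000):
    # Parse GloVe lines of the form "word v1 v2 ... v_dim"
    embeddings_index = {}
    for line in glove_lines:
        values = line.split()
        embeddings_index[values[0]] = [float(v) for v in values[1:]]
    # Row i holds the vector for the word with index i; words missing
    # from GloVe (and the reserved row 0) stay all-zero
    matrix = [[0.0] * embedding_dim for _ in range(max_words)]
    for word, i in word_index.items():
        if i < max_words and word in embeddings_index:
            matrix[i] = embeddings_index[word]
    return matrix
```

In Keras, such a matrix is typically loaded into the `Embedding` layer with `layer.set_weights([embedding_matrix])` and frozen with `layer.trainable = False`, so the pre-trained semantics are preserved during training.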