This project demonstrates how to generate corruption-related text with a Recurrent Neural Network (RNN) trained on the Corruption Perceptions Index (CPI) time-series data from 2012 to 2021.
The RNN model is built with TensorFlow and Keras. It uses LSTM (Long Short-Term Memory) layers and learns to generate text from the input data, which contains the corruption scores of various countries over the years. Generated text follows the pattern "Country has a corruption score of X in Year."
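As a rough illustration of the sentence pattern above, the sketch below turns score records into training sentences. It assumes a wide layout with one row per country and one column per year. The column name `Country`, the function name `rows_to_sentences`, and the sample rows are all hypothetical placeholders; the real dataset's headers and values may differ.

```python
# Hypothetical wide-format rows: one row per country, one column per year.
# Names and scores are made-up placeholders, not real CPI values.
rows = [
    {"Country": "Country A", "2020": 70, "2021": 71},
    {"Country": "Country B", "2020": 35, "2021": None},
]


def rows_to_sentences(rows, country_key="Country"):
    """Build one sentence per (country, year, score) triple."""
    sentences = []
    for row in rows:
        country = row[country_key]
        for column, score in row.items():
            year = str(column)
            # Assumption: every purely numeric column name is a year.
            if column == country_key or not year.isdigit():
                continue
            if score is None:
                continue  # skip missing scores
            sentences.append(
                f"{country} has a corruption score of {score} in {year}."
            )
    return sentences


sentences = rows_to_sentences(rows)
for sentence in sentences:
    print(sentence)
```

If the data is loaded with Pandas, rows in this shape can be obtained with `DataFrame.to_dict("records")`.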
- Python 3.6 or above
- TensorFlow 2.0 or above
- Pandas
- NumPy
To install the required packages, run:
pip install numpy pandas tensorflow
- Clone the repository to your local machine.
git clone https://github.com/nadinejackson1/corruption-text-generation.git
cd corruption-text-generation
- Run the Jupyter notebook containing the complete code.
jupyter notebook corruption-text-generation.ipynb
- Go through the notebook step by step. It contains the following sections:
  - Importing necessary libraries
  - Loading and preprocessing the data
  - Tokenizing and padding the text
  - Creating the RNN model
  - Training the model
  - Generating text using the trained model
- After training the model, call the generate_text() function with a seed text and the trained model to generate corruption-related text.
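The tokenizing, padding, and generation steps listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: the helper names, the `generate_text` signature, and the `predict_fn` parameter are hypothetical, and a simple bigram counter stands in for the trained LSTM so that the example runs without TensorFlow. In the notebook, the Keras tokenizer, padding utility, and the model's `predict` method would presumably fill these roles.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def build_vocab(sentences):
    """Map each token to an integer id; id 0 is reserved for padding."""
    vocab = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(text, vocab):
    """Convert text to ids, dropping tokens missing from the vocabulary."""
    return [vocab[w] for w in tokenize(text) if w in vocab]


def prefix_sequences(sentences, vocab):
    """Every prefix of length >= 2; the last id is the prediction target."""
    sequences = []
    for sentence in sentences:
        ids = encode(sentence, vocab)
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    return sequences


def pad_left(sequence, length):
    """Keep at most the last `length` ids and left-pad with zeros."""
    sequence = sequence[-length:]
    return [0] * (length - len(sequence)) + sequence


def generate_text(seed_text, next_words, predict_fn, vocab, input_len):
    """Greedy decoding: repeatedly append the highest-scoring next word."""
    id_to_word = {i: w for w, i in vocab.items()}
    words = seed_text.split()
    for _ in range(next_words):
        padded = pad_left(encode(" ".join(words), vocab), input_len)
        scores = predict_fn(padded)  # one score per id; index 0 is padding
        best_id = max(range(1, len(scores)), key=scores.__getitem__)
        words.append(id_to_word[best_id])
    return " ".join(words)


def bigram_predictor(sentences, vocab):
    """Stand-in for a trained model: scores next ids by bigram counts."""
    size = len(vocab) + 1
    counts = [[0] * size for _ in range(size)]
    for sentence in sentences:
        ids = encode(sentence, vocab)
        for prev, nxt in zip(ids, ids[1:]):
            counts[prev][nxt] += 1
    return lambda padded: counts[padded[-1]]


# Placeholder sentence, not a real CPI value.
corpus = ["Country A has a corruption score of 70 in 2020."]
vocab = build_vocab(corpus)
sequences = prefix_sequences(corpus, vocab)
input_len = max(len(s) for s in sequences) - 1  # inputs exclude the target

predict_fn = bigram_predictor(corpus, vocab)
print(generate_text("corruption score", 4, predict_fn, vocab, input_len))
```

Passing the predictor in as a function keeps the decoding loop independent of the model, so a wrapper around a trained Keras model could replace `bigram_predictor` without changing `generate_text`.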
This project is licensed under the MIT License. See the LICENSE file for details.
The project is based on the Corruption Perceptions Index Timeseries 2012 - 2021 dataset.