This is a simple application of an LSTM to a text classification task in PyTorch, using Bayesian Optimization for hyperparameter tuning. The dataset used is the Yelp 2014 review data, which can be downloaded from the Yelp Open Dataset page[1]. Detailed instructions are given below.

You can set various hyperparameters in the `src/constants.py` file. The description of each variable is as follows. Note that for Bayesian Optimization, a hyperparameter to be tuned should be passed as a tuple. So each argument can be set either as a tuple or as a single value: the former means the argument is included in the Bayesian Optimization search, and the latter means it is fixed. (See the sketch after the table below.)
Argument | Type | Description | Default |
---|---|---|---|
`device` | `torch.device` | The device type (CUDA or CPU). | `torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')` |
`learning_rates` | `tuple (float, float)` or `float` | The range of learning rates (or a single value). | `(0.0001, 0.001)` |
`batch_sizes` | `tuple (int, int)` or `int` | The range of batch sizes (or a single value). | `(16, 128)` |
`seq_len` | `tuple (int, int)` or `int` | The range of maximum sequence lengths (or a single value). | `512` |
`d_w` | `tuple (int, int)` or `int` | The range of word embedding dimensions (or a single value). | `256` |
`d_h` | `tuple (int, int)` or `int` | The range of hidden state dimensions in the LSTM (or a single value). | `256` |
`drop_out_rate` | `tuple (float, float)` or `float` | The range of dropout rates (or a single value). | `0.5` |
`layer_num` | `tuple (int, int)` or `int` | The range of LSTM layer counts (or a single value). | `3` |
`bidirectional` | `bool` | Whether the LSTM is bidirectional or not. | `True` |
`class_num` | `int` | The number of classes. | `5` |
`epoch_num` | `tuple (int, int)` or `int` | The range of total training epochs (or a single value). | `10` |
`ckpt_dir` | `str` | The path for saved checkpoints. | `../saved_model` |
`init_points` | `int` | The number of initial points to start Bayesian Optimization. | `2` |
`n_iter` | `int` | The number of iterations for Bayesian Optimization. | `8` |
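For reference, here is a minimal sketch of what `src/constants.py` could look like with the defaults above (a sketch, not the repository's exact file):

```python
# A minimal sketch of src/constants.py with the defaults listed above.
# Tuples mark hyperparameters to be searched by Bayesian Optimization;
# single values are fixed.
import torch

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

learning_rates = (0.0001, 0.001)  # tuned: searched over [0.0001, 0.001]
batch_sizes = (16, 128)           # tuned: searched over [16, 128]
seq_len = 512                     # fixed maximum sequence length
d_w = 256                         # fixed word embedding dimension
d_h = 256                         # fixed LSTM hidden state dimension
drop_out_rate = 0.5               # fixed dropout rate
layer_num = 3                     # fixed number of LSTM layers
bidirectional = True              # bidirectional LSTM
class_num = 5                     # number of output classes
epoch_num = 10                    # fixed number of training epochs
ckpt_dir = '../saved_model'       # checkpoint directory
init_points = 2                   # initial random probes for Bayesian Optimization
n_iter = 8                        # Bayesian Optimization iterations
```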
- Install all required packages.

  ```shell
  pip install -r requirements.txt
  ```
- Download the dataset and extract it. Of course, you can use another text classification dataset, but make sure that the file formats/names are the same as those of the Yelp 2014 review dataset. (See the next step.)
- Make a directory named `data`. Get the files named `train.txt`, `test.txt`, `dev.txt`, and `wordlist.txt` from `yelp14` and put them into `data`. The directory structure should be as follows.

  ```
  - data
    - train.txt
    - test.txt
    - dev.txt
    - wordlist.txt
  ```
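  If you like, a quick sanity check (not part of the repository) can verify that the files are in place before training:

  ```python
  import os

  # The four files the data pipeline expects inside data/.
  required = ['train.txt', 'test.txt', 'dev.txt', 'wordlist.txt']
  missing = [f for f in required if not os.path.isfile(os.path.join('data', f))]
  assert not missing, f'Missing files in data/: {missing}'
  ```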
- Execute the command below to train the model.

  ```shell
  python src/main.py --mode='train'
  ```

  - `--mode`: This specifies the running mode. The mode can be either `train` or `test`.
  Bayesian Optimization is used for hyperparameter tuning in this task. You can add to or modify the list of hyperparameters to tune in `main.py`.

  ```python
  self.pbounds = {
      'learning_rate': learning_rates,
      'batch_size': batch_sizes
  }
  self.bayes_optimizer = BayesianOptimization(
      f=self.train,
      pbounds=self.pbounds,
      random_state=777
  )
  ```

  Currently, only the batch size and the learning rate are subject to tuning. If you want to modify `self.pbounds`, add the desired hyperparameter and change its value in `src/constants.py` into a tuple of two values, minimum and maximum, in that order. Then add that hyperparameter as an additional parameter of the `train` function, like `batch_size` and `learning_rate`, as sketched below.
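  For example, here is a minimal sketch of tuning the dropout rate as well (this assumes `drop_out_rate` in `src/constants.py` has been changed to a tuple such as `(0.1, 0.5)`; the exact `train` signature in the repository may differ):

  ```python
  # Hypothetical extension: include the dropout rate in the search space.
  self.pbounds = {
      'learning_rate': learning_rates,  # e.g. (0.0001, 0.001)
      'batch_size': batch_sizes,        # e.g. (16, 128)
      'drop_out_rate': drop_out_rate    # e.g. (0.1, 0.5) in src/constants.py
  }
  self.bayes_optimizer = BayesianOptimization(
      f=self.train,
      pbounds=self.pbounds,
      random_state=777
  )

  # Every key in pbounds must appear as a parameter of the objective.
  # Note that bayes_opt passes floats, so integer-valued hyperparameters
  # (e.g. batch_size) are typically rounded inside train.
  def train(self, learning_rate, batch_size, drop_out_rate):
      ...

  # The search runs init_points random probes followed by n_iter
  # optimization steps (both set in src/constants.py).
  self.bayes_optimizer.maximize(init_points=init_points, n_iter=n_iter)
  ```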
- After training, you can test the model on the test data with the following command.

  ```shell
  python src/main.py --mode='test' --model_name=MODEL_NAME --inference_batch_size=BATCH_SIZE
  ```

  - `--model_name`: This is the file name of the trained model you want to test. The model is located in the `saved_model` directory if you didn't change the checkpoint directory setting. (default: `None`)
  - `--inference_batch_size`: This is the batch size for the inference step. It is independent of `batch_size` in `src/constants.py`, since that argument might be subject to the Bayesian Optimization process; this lets you set a separate batch size just for inference. (default: `128`)
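  For reference, a minimal sketch of how `main.py` might parse these flags with `argparse` (an assumption; the actual parsing code in the repository may differ):

  ```python
  import argparse

  parser = argparse.ArgumentParser()
  # Running mode: either 'train' or 'test'.
  parser.add_argument('--mode', type=str, required=True, choices=['train', 'test'])
  # File name of the trained model to test (test mode only).
  parser.add_argument('--model_name', type=str, default=None)
  # Batch size used only at inference time, independent of the tuned batch_size.
  parser.add_argument('--inference_batch_size', type=int, default=128)
  args = parser.parse_args()
  ```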
[1] Yelp Open Dataset. (https://www.yelp.com/dataset)