ml_experiments

This is a generic logger module for recording machine learning experiment results.

Primary language: Jupyter Notebook · License: GPL-3.0

Logger

In machine learning you run many experiments before settling on a good model. Along the way you accumulate many checkpoints, visualizations, results, etc.

The logger helps you organize and keep track of:

  • Experiment results
  • TODO: The model checkpoints
  • TODO: Different visualizations and curves

Requirements

pip install -r requirements.txt

Install

! pip install --upgrade git+https://github.com/ahmadelsallab/ml_logger.git
Collecting git+https://github.com/ahmadelsallab/ml_logger.git
  Cloning https://github.com/ahmadelsallab/ml_logger.git to /tmp/pip-req-build-ccrtkamb
Building wheels for collected packages: mllogger
  Running setup.py bdist_wheel for mllogger ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-swi794aj/wheels/87/e8/54/b5d82d55496a377ebe30a4b436616fe2bb006e9fc9055c6003
Successfully built mllogger
Installing collected packages: mllogger
  Found existing installation: mllogger 1.0
    Uninstalling mllogger-1.0:
      Successfully uninstalled mllogger-1.0
Successfully installed mllogger-1.0

Alternatively, clone the repo inside your project:

!git clone https://github.com/ahmadelsallab/ml_logger.git
!cd ml_logger && pip install .
Cloning into 'ml_logger'...
remote: Enumerating objects: 74, done.
remote: Counting objects: 100% (74/74), done.
remote: Compressing objects: 100% (46/46), done.
remote: Total 74 (delta 44), reused 49 (delta 25), pack-reused 0
Unpacking objects: 100% (74/74), done.
Processing /home/ahmad/Work/Logger/ml_logger/tests/ml_logger
Building wheels for collected packages: mllogger
  Running setup.py bdist_wheel for mllogger ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-0mg36fxm/wheels/4f/c8/a5/a2a66360be84688ab6df5f949420a229abe1d786979b84bfe3
Successfully built mllogger
Installing collected packages: mllogger
  Found existing installation: mllogger 1.0
    Uninstalling mllogger-1.0:
      Successfully uninstalled mllogger-1.0
Successfully installed mllogger-1.0

Usage

More use cases can be found under tests/test_experiment.py

Log new experiment result

In general any experiment is composed of:

  • meta_data: name, purpose, file, commit, etc.
  • config: mostly the hyperparameters, plus any other configuration such as the features used. For deep learning, config can be further divided into: data, model, optimizer, and learning hyperparameters
  • results: metrics, best model file, comment, etc.

Suppose all your previous records are in 'results_old.csv', and you now want to log a new experiment.

from mllogger.experiments import Experiment

exp_meta_data = {'name': 'experiment_1',
                 'purpose': 'test my awesome model',
                 'date': 'today',
                 }

exp_config = {'model_arch': '100-100-100',
              'learning_rate': 0.0001,
              'epochs': 2,
              'optimizer': 'Adam',
              }

exp_results = {'val_acc': 0.95,
               'F1': 0.92,
               'Comment': 'Best model'}

experiment = Experiment(csv_file='results_old.csv', meta_data=exp_meta_data, config=exp_config, results=exp_results)

Note that

you can add or remove experiment parameters. If you add a parameter, old records will have NaN for it. If you remove a parameter, it remains in the old records but will be NaN in the newly logged one.
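This column-merging behavior matches how pandas aligns columns when concatenating frames with different schemas. A minimal sketch in plain pandas (an illustration of the behavior, not mllogger's internals, which may differ):

```python
import pandas as pd

# Old record logged with a 'learning_rate' column
old = pd.DataFrame([{'name': 'exp_0', 'learning_rate': 0.001}])

# New record drops 'learning_rate' and adds 'optimizer'
new = pd.DataFrame([{'name': 'exp_1', 'optimizer': 'Adam'}])

# Rows are aligned on the union of columns
merged = pd.concat([old, new], ignore_index=True, sort=False)

# The old row gets NaN for the added column,
# and the new row gets NaN for the removed one.
print(merged)
```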

Now write the results to CSV:

experiment.to_csv('results.csv')

If you want to see the whole record:

experiment.df
(Output: the full log as a DataFrame of 10 rows × 32 columns. The nine old records keep their original columns — Name, Purpose, Description, Run file, Commit, Features, and so on — with NaN in the newly logged columns; the new row, index 9, fills date, name, purpose, learning_rate, model_arch, optimizer, F1, and val_acc, and has NaN in the old-only columns.)
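Since experiment.df is a plain pandas DataFrame, the usual pandas API applies for slicing the log. A sketch with a toy frame (the names here are made up; only the val_acc column mirrors the real log):

```python
import pandas as pd

# Toy stand-in for experiment.df with a 'val_acc' column
df = pd.DataFrame({'name': ['exp_a', 'exp_b', 'exp_c'],
                   'val_acc': [0.90, 0.95, 0.88]})

# Keep only runs at or above a threshold, best first
best = df[df['val_acc'] >= 0.90].sort_values('val_acc', ascending=False)
print(best)
```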

Alternatively, you can init the Experiment with the old records and log one or more experiments later:

from mllogger.experiments import Experiment

# Load the old records
experiment = Experiment(csv_file='results_old.csv')

# TODO: perform your experiment

# Now log the new experiment data
exp_meta_data = {'name': 'experiment_1',
                 'purpose': 'test my awesome model',
                 'date': 'today',
                 }

exp_config = {'model_arch': '100-100-100',
              'learning_rate': 0.0001,
              'epochs': 2,
              'optimizer': 'Adam',
              }

exp_results = {'val_acc': 0.95,
               'F1': 0.92,
               'Comment': 'Best model'}

experiment.log_experiment(meta_data=exp_meta_data, config=exp_config, results=exp_results)

# Export the whole result
experiment.to_csv('results.csv')

experiment.df
(Output: the same 10 rows × 32 columns DataFrame, with the new experiment appended as row 9 and NaN filling the non-overlapping columns.)

You can init an empty experiment, or one with a certain CSV, and later add or change the old-records CSV.

In this case, however, the records are replaced, not appended or updated.

from mllogger.experiments import Experiment
# Init empty experiment
experiment = Experiment() # or Experiment(csv_file="another_results.csv")

# Update with another
experiment.from_csv(csv_file='results_old.csv')

# Now log the new experiment data
exp_meta_data = {'name': 'experiment_1',
                 'purpose': 'test my awesome model',
                 'date': 'today',
                 }

exp_config = {'model_arch': '100-100-100',
              'learning_rate': 0.0001,
              'epochs': 2,
              'optimizer': 'Adam',
              }

exp_results = {'val_acc': 0.95,
               'F1': 0.92,
               'Comment': 'Best model'}

experiment.log_experiment(meta_data=exp_meta_data, config=exp_config, results=exp_results)

# Export the whole result
experiment.to_csv('results.csv')

experiment.df
/home/ahmad/anaconda3/lib/python3.6/site-packages/mllogger/experiments.py:33: UserWarning: No old experiments records given. It's OK if this is the first record or you will add later using from_csv or from_df. Otherwise, old records they will be overwritten
  warnings.warn(UserWarning("No old experiments records given. It's OK if this is the first record or you will add later using from_csv or from_df. Otherwise, old records they will be overwritten"))
(Output: the same 10 rows × 32 columns DataFrame as in the previous examples.)

Other use cases

  • You can load old records from a pandas.DataFrame instead of a CSV, using orig_df in the Experiment constructor:

import pandas as pd

df = pd.read_csv('results_old.csv')
experiment = Experiment(orig_df=df)

  • You can log experiments using YAML files, either at init or via the from_yaml method
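The YAML layout is not documented here, so the structure below is an assumption: a file mirroring the three dicts (meta_data, config, results) used throughout, parsed here with PyYAML purely for illustration — check tests/test_experiment.py for the format mllogger actually expects:

```python
import yaml  # PyYAML

# Hypothetical experiment file mirroring the meta_data / config / results dicts
doc = """
meta_data:
  name: experiment_1
  purpose: test my awesome model
config:
  learning_rate: 0.0001
  optimizer: Adam
results:
  val_acc: 0.95
  F1: 0.92
"""

sections = yaml.safe_load(doc)
print(sections['config']['optimizer'])
```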

Known issues

https://github.com/ahmadelsallab/logger/issues

Future developments

  • JSON support
  • XLSX support
  • The model checkpoints
  • Different visualizations and curves
  • Upload the result file to gdrive for online updates and sharing