microsoft/BatteryML

Using the model to predict after training and evaluation

Opened this issue · 5 comments

Hi,

I was wondering if anyone could give instructions for using the model after it has been trained?

I would like to test 2 trained models against each other using the same dataset, after they have been trained.

For example, train a few models on CLO and then test them using the HUST/CALCE data, etc.

We are sorry that you ran into this problem. Our current code supports this kind of operation; you can refer to the following code to complete your experiments:

# import pipeline
from batteryml.pipeline import Pipeline

# create first pipeline to train model on CLO
pipeline1 = Pipeline(config_path='configs/baselines/sklearn/variance_model/clo.yaml', workspace='workspaces')

# train model on CLO dataset
model, dataset = pipeline1.train(device='cuda', skip_if_executed=False)

# Then, you will find a new .ckpt file under your workspaces folder

# create second pipeline to evaluate model on HUST
pipeline2 = Pipeline(config_path='configs/baselines/sklearn/variance_model/hust.yaml', workspace='workspaces')

# evaluate the model on the HUST dataset; replace ckpt_to_resume with the
# checkpoint produced by pipeline1 to get the RMSE on the HUST test data
pipeline2.evaluate(ckpt_to_resume='./workspaces/20240423132818.ckpt', skip_if_executed=False)
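
As a side note, if you are not sure which checkpoint file was just written, here is a minimal sketch (assuming timestamped .ckpt files under the workspace folder, as in the example above) for picking up the most recent one:

# pick up the most recently written checkpoint in the workspace
import glob
import os

ckpt_files = glob.glob('./workspaces/**/*.ckpt', recursive=True)
latest_ckpt = max(ckpt_files, key=os.path.getmtime)  # newest .ckpt by modification time
print(latest_ckpt)  # pass this path as ckpt_to_resume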

I hope my answer solves your problem; if you have any questions, feel free to leave a comment.

We've been working on the documentation lately, and we'd love to hear more feedback from you!

Hi @agiamason,

I tried your implementation but I got the following error:

Seed is set to 0.
Reading train data: 0it [00:00, ?it/s]
Reading test data: 0it [00:00, ?it/s]
Extracting features: 0it [00:00, ?it/s]

RuntimeError Traceback (most recent call last)
in <cell line: 4>()
2
3 # evaluate model on HUST dataset, replace the ckpt_to_resume with the model checkpoint of pipeline1, and you will get the RMSE on HUST test data
----> 4 pipeline2.evaluate( ckpt_to_resume='/content/BatteryML/workspaces/rf_clo_new/20240424213213.ckpt', skip_if_executed=False)

3 frames
/content/BatteryML/batteryml/feature/base.py in __call__(self, cells)
18 for i, cell in enumerate(pbar):
19 features.append(self.process_cell(cell))
---> 20 features = torch.stack(features)
21 return features.float()
22

RuntimeError: stack expects a non-empty TensorList

It looks like your HUST data has not been successfully converted to features. Could you check whether your HUST data has been downloaded and preprocessed successfully, like the CLO data?
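
As a quick sanity check, you can count the preprocessed cell files before building the pipeline. The data/processed/HUST path below is only an assumption; point it at whatever directory your HUST config actually reads from:

# count preprocessed HUST cell files; an empty folder means preprocessing has not run
from pathlib import Path

processed_dir = Path('data/processed/HUST')  # assumed location, adjust to match your config
cell_files = sorted(processed_dir.glob('*')) if processed_dir.exists() else []
print(f'{len(cell_files)} processed HUST cell files found in {processed_dir}')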

Hi @agiamason,

I have tried your implementation and it seems to have worked; however, when I print the new prediction vs. ground truth graph, it looks identical to what was displayed before, so I don't think I have done this correctly. I have a single PLSR model that was trained on the HUST and CLO datasets.

# import pipeline
from batteryml.pipeline import Pipeline

# create first pipeline for the PLSR model trained on HUST
pipeline1 = Pipeline(config_path='configs/baselines/sklearn/plsr/hust.yaml', workspace='workspaces')

# train model on HUST dataset (already trained, so this is commented out)
# model, dataset = pipeline1.train(device='cuda', skip_if_executed=False)

# Then, you will find a new .ckpt file under your workspaces folder

# create second pipeline to evaluate the model on CLO
pipeline2 = Pipeline(config_path='configs/baselines/sklearn/plsr/clo.yaml', workspace='workspaces')

# evaluate the model on the CLO dataset, resuming from the checkpoint of pipeline1,
# to get the RMSE on the CLO test data
pipeline2.evaluate(ckpt_to_resume='/content/BatteryML/workspaces/plsr/20240501030343.ckpt', skip_if_executed=False)

# get raw data from pipeline
train_cells, test_cells = pipeline2.raw_data['train_cells'], pipeline2.raw_data['test_cells']
prediction2 = model.predict(dataset, data_type='test').to('cpu')
ground_truth2 = dataset.test_data.label.to('cpu')
plot_result(ground_truth2, prediction2,'plsr')
result.append([method, train_loss, test_loss])

[Screenshot 2024-05-01: prediction vs. ground truth plot]

Ideally, I want to be able to load the pretrained models from my Google Drive and test them after the initial training so I can compare them, but the code above just gives me the same graph as the model trained on HUST, instead of the updated prediction and ground truth values from the CLO data.
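
Roughly, what I am trying to end up with is something like the sketch below, reusing only the Pipeline calls from above; the checkpoint paths are just placeholders for files copied over from my Drive:

# compare two pretrained checkpoints on the same CLO evaluation config
from batteryml.pipeline import Pipeline

# both pipelines share the same config, so both checkpoints see the same CLO test data
pipe_hust_ckpt = Pipeline(config_path='configs/baselines/sklearn/plsr/clo.yaml', workspace='workspaces')
pipe_clo_ckpt = Pipeline(config_path='configs/baselines/sklearn/plsr/clo.yaml', workspace='workspaces')

# evaluate the HUST-trained checkpoint on CLO (placeholder path)
pipe_hust_ckpt.evaluate(ckpt_to_resume='/content/drive/MyDrive/checkpoints/plsr_hust.ckpt', skip_if_executed=False)

# evaluate the CLO-trained checkpoint on CLO for comparison (placeholder path)
pipe_clo_ckpt.evaluate(ckpt_to_resume='/content/drive/MyDrive/checkpoints/plsr_clo.ckpt', skip_if_executed=False)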

Thanks for any help you can give.

however when I tried to print the subsequent new prediction ground truth graph it looks identical to what was displayed before.

Can you provide more details?