Training time
DanLesman opened this issue · 3 comments
I was hoping to use CellOT on full scRNA-seq data and was wondering what the training times should look like, and whether there is any way to accelerate training. I'm currently running scGen to get the autoencoder embeddings, and I'm seeing predicted runtimes of 594 hrs on 1 GPU for 20k genes in 3k cells, versus 8 hrs for 1k genes in 3k cells.
Thank you!
This was a simple fix: I added the code below to the train_auto_encoder function.
import torch

# Run the autoencoder on the GPU when one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device)
Hi, adding that to the train_auto_encoder function causes an error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)
To follow up on DanLesman's answer above: you also need to move the data onto the GPU by adding a call to .to(device) after each call to the dataloader. The RuntimeError about tensors expected to be on the same device occurs when the model and the data sit on different devices, so both must be moved to the GPU.
Code:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# after every call to next(iterator), move the batch onto the model's device
target = target.to(device)
source = source.to(device)
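Putting both fixes together, here is a minimal sketch of what the patched training loop amounts to. The toy model, data, and loss below are stand-ins so the sketch runs on its own; CellOT's actual train_auto_encoder, model class, and iterators differ:
import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-ins for illustration only (not the CellOT classes)
model = nn.Sequential(nn.Linear(1000, 50), nn.ReLU(), nn.Linear(50, 1000))
loader = DataLoader(torch.randn(3000, 1000), batch_size=256)

# Move the model to the GPU *before* constructing the optimizer
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for batch in loader:
    batch = batch.to(device)          # every batch must land on the model's device
    optimizer.zero_grad()
    recon = model(batch)
    loss = F.mse_loss(recon, batch)   # plain reconstruction loss; CellOT's objective may differ
    loss.backward()
    optimizer.step()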
Another tip: make sure to move the networks to the GPU before the optimizer is constructed. This matches the guidance in the torch.optim docs, which recommend moving a model to the GPU before building optimizers for it, since the optimizer holds references to the parameters as they exist at construction time. For example, in cellot.py I run:
f = f.to(device)
g = g.to(device)
opts = load_opts(config, f, g)
so that the f and g networks are on the GPU before the Adam optimizer is initialized over their parameters.
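As a quick sanity check before training starts, you can assert that every parameter the optimizers will update is already on the GPU (a minimal sketch, assuming f and g are standard torch.nn.Modules):
import torch

if torch.cuda.is_available():
    # Every parameter the Adam optimizers update should already report is_cuda
    assert all(p.is_cuda for p in f.parameters())
    assert all(p.is_cuda for p in g.parameters())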
I can confirm that for full-gene scRNA-seq data, GPU acceleration makes a large difference in the estimated training time. Hope this helps!