DPO training issues
DonStroganotti opened this issue · 1 comments
DonStroganotti commented
I wanted to give DPO training a try, but there seem to be multiple issues with the PairedDataset.
`reso` and `interp` are not assigned to `self` in `__init__`, but `self.reso` is used later in the file:
Line 54 in af5c34b
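A minimal sketch of the kind of fix I would expect, assuming the constructor receives `reso` and `interp` as arguments. The class name `PairedDatasetSketch`, the default values, and the `target_size` helper are hypothetical and only illustrate the pattern; the real `PairedDataset` signature may differ.

```python
class PairedDatasetSketch:
    """Hypothetical stand-in for PairedDataset; real arguments may differ."""

    def __init__(self, reso=1024, interp="bilinear"):
        # Store the constructor arguments on the instance so that
        # later methods can read self.reso and self.interp.
        self.reso = reso
        self.interp = interp

    def target_size(self):
        # Hypothetical helper: works only because __init__ set self.reso.
        return (self.reso, self.reso)


ds = PairedDatasetSketch(reso=512)
size = ds.target_size()
```

Without the two assignments in `__init__`, any method that reads `self.reso` raises an `AttributeError`.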
Getting this error when using the default train_dpo.yaml config:
```
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/workspace/naifu/data/paired_wds.py", line 63, in __getitem__
    example = self.preprocess_train(example)
  File "/workspace/naifu/data/paired_wds.py", line 33, in preprocess_train
    images = [
  File "/workspace/naifu/data/paired_wds.py", line 34, in <listcomp>
    Image.open(io.BytesIO(im_bytes)).convert("RGB")
TypeError: a bytes-like object is required, not 'int'
```
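One possible cause, which I have not confirmed: if the image field holds a single `bytes` object rather than a list of them, iterating over it yields integers, and `io.BytesIO` rejects an `int` with exactly this message. The sketch below shows a guard for that case. The helper name `as_bytes_list` is hypothetical and not part of naifu.

```python
import io


def as_bytes_list(value):
    """Normalize a field to a list of bytes objects, or raise a clear error."""
    if isinstance(value, (bytes, bytearray)):
        # A single encoded image: wrap it so iteration does not yield ints.
        return [bytes(value)]
    items = list(value)
    for item in items:
        if not isinstance(item, (bytes, bytearray)):
            raise TypeError(f"expected image bytes, got {type(item).__name__}")
    return [bytes(item) for item in items]


single = b"\x89PNG"
# Iterating over a bytes object yields ints, which io.BytesIO cannot accept.
first = next(iter(single))
# After normalization, every element is safe to pass to io.BytesIO.
buffers = [io.BytesIO(b) for b in as_bytes_list(single)]
```

If the field turns out to contain something else entirely (for example an index or a label), the guard at least fails with a message naming the unexpected type.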