KimMeen/Time-LLM

i have a question in your code about data_loader.py

Closed this issue · 3 comments

You use this code.

def __len__(self):
    return (len(self.data_x) - self.seq_len - self.pred_len + 1) * self.enc_in

I changed the code above to

def __len__(self):
    return (len(self.data_x) - self.seq_len - self.pred_len + 1)

Then the batch counts change as shown below: the first screenshot is from the original code, the second from my version.
I used this code to print the settings:
print("Data Set Settings:")
print(f"Root Path: {data_set.root_path}")
print(f"Data Path: {data_set.data_path}")
print(f"Features: {data_set.features}")
print(f"Target: {data_set.target}")
print(f"Time Encoding: {data_set.timeenc}")
print(f"Frequency: {data_set.freq}")
print(f"Percent: {data_set.percent}")

print("\nData Loader Settings:")
print(f"Batch Size: {data_loader.batch_size}")
print(f"Number of Workers: {data_loader.num_workers}")
print(f"Drop Last: {data_loader.drop_last}")

total_samples = len(data_set)
total_batches = len(data_loader)
print(f"\nTotal number of samples in dataset: {total_samples}")
print(f"Total number of batches: {total_batches}")

Screenshot 2024-05-15 013429

Screenshot 2024-05-15 013352

Hi, they "flatten" the input features into one column. I haven't worked with the electricity dataset myself, but with your change it seems you only count the windows of a single feature instead of all of them. Check their __getitem__ function :)
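To see why the `* self.enc_in` factor matters, here is a minimal sketch of the channel-flattening pattern (class and variable names are illustrative, not the repo's exact code): `__len__` multiplies the number of sliding windows by `enc_in`, and `__getitem__` decodes the flat index back into a (channel, window) pair, so every channel is treated as its own univariate series.

```python
class FlattenedDataset:
    """Sketch of a channel-independent dataset: one sample per (window, channel) pair."""

    def __init__(self, data_x, seq_len, pred_len):
        self.data_x = data_x                  # list of rows, each row holds enc_in values
        self.seq_len = seq_len
        self.pred_len = pred_len
        self.enc_in = len(data_x[0])
        # number of sliding windows available per channel
        self.tot_len = len(data_x) - seq_len - pred_len + 1

    def __len__(self):
        # flattening: windows per channel times number of channels
        return self.tot_len * self.enc_in

    def __getitem__(self, index):
        feat_id = index // self.tot_len       # which channel this flat index belongs to
        s_begin = index % self.tot_len        # which time window within that channel
        s_end = s_begin + self.seq_len
        seq_x = [row[feat_id] for row in self.data_x[s_begin:s_end]]
        seq_y = [row[feat_id] for row in self.data_x[s_end:s_end + self.pred_len]]
        return seq_x, seq_y

# toy example: 10 time steps, 4 channels
data = [[t * 10 + c for c in range(4)] for t in range(10)]
ds = FlattenedDataset(data, seq_len=3, pred_len=2)
print(len(ds))            # (10 - 3 - 2 + 1) * 4 = 24
x, y = ds[7]              # 7 // 6 = channel 1, 7 % 6 = window 1
print(x, y)               # [11, 21, 31] [41, 51]
```

Dropping `* self.enc_in` from `__len__` makes the dataset report only one channel's worth of windows, so the DataLoader iterates over a fraction of the samples the `__getitem__` indexing scheme expects.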

Hi, it seems that I haven't fully understood your question. Could you please specify what your issue is? It appears to be similar to this issue; you might want to take a look for reference.

Thanks for your replies, I now understand why the dataloader 'flattens' the channels.
But training ECL with features 'M' seems to take too long on two A6000s.