Split0 Data Format

Question

Hi illidanlab,

Thank you for sharing the T-LSTM code! I really need this model to finish my master's thesis.

Why do the number of samples and the dimensionality of elapsed_train.pkl and data_train.pkl in Split0 change?
Is the data normalized? If so, what are the limits for elapsed_train.pkl and data_train.pkl normalization?

Answer 1 · 2023-10-24T12:12:14.000Z

Hi Mils-liu,

Sorry for my late reply.

elapsed_train contains the elapsed times for the samples, whereas data_train contains the input features. An elapsed-time sequence can have shape batch_size x sequence length x 1, while data_train has shape batch_size x sequence length x d, where d is the number of input attributes. I am not sure what you meant by the number of samples. Did you mean that the batch sizes are different? Batch sizes should be the same for the corresponding data, elapsed time, and label tensors.
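To make the shape relationship concrete, here is a minimal sketch of a consistency check for one batch. It uses synthetic NumPy arrays rather than the actual Split0 files, and the helper name `check_batch` is hypothetical (it is not part of the repo). It assumes each batch is an array of shape batch_size x sequence length x d, with elapsed times stored either with or without the trailing dimension of 1.

```python
import numpy as np


def check_batch(data, elapsed, labels):
    """Return (batch_size, seq_len, d) if the three arrays line up.

    Hypothetical helper for illustration; raises ValueError on a mismatch.
    """
    data, elapsed, labels = (np.asarray(a) for a in (data, elapsed, labels))
    if data.ndim != 3:
        raise ValueError(f"data should be 3-D, got shape {data.shape}")
    batch_size, seq_len, d = data.shape

    # Assumption: elapsed times are stored as (batch, seq_len) or (batch, seq_len, 1).
    allowed = {(batch_size, seq_len), (batch_size, seq_len, 1)}
    if elapsed.shape not in allowed:
        raise ValueError(f"elapsed shape {elapsed.shape} does not match data {data.shape}")

    # Labels only need to agree on the batch dimension.
    if labels.shape[0] != batch_size:
        raise ValueError(f"labels batch size {labels.shape[0]} != {batch_size}")
    return batch_size, seq_len, d


# Synthetic stand-ins for one batch (not the real Split0 data).
rng = np.random.default_rng(0)
data = rng.normal(size=(4, 10, 6))               # 4 samples, 10 time steps, d = 6
elapsed = rng.uniform(0, 365, size=(4, 10, 1))   # one elapsed time per step
labels = np.eye(2)[rng.integers(0, 2, size=4)]   # one-hot labels, shape (4, 2)

print(check_batch(data, elapsed, labels))  # (4, 10, 6)
```

Running a check like this over each corresponding triple of batches should show quickly whether a mismatch is in the batch dimension or only in the sequence length and feature dimension.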
You can use any normalization technique that fits your problem. Elapsed times could be days, months, or years. It's been a while since I worked with this repo's data. Please let me know if you have further questions.
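As one possible choice (not necessarily what was used for this repo's data), the sketch below standardizes each input attribute with statistics computed on the training set and rescales elapsed times from days to approximate months. The function names, the synthetic arrays, and the 30-day month are all assumptions made for illustration.

```python
import numpy as np


def fit_standardizer(train):
    """Compute per-attribute mean and std over all samples and time steps."""
    flat = train.reshape(-1, train.shape[-1])
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant attributes
    return mean, std


def standardize(x, mean, std):
    """Apply z-score scaling using statistics fitted on the training set."""
    return (x - mean) / std


# Synthetic stand-ins (not the real Split0 data).
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(4, 10, 6))

mean, std = fit_standardizer(train)
train_norm = standardize(train, mean, std)

# Elapsed times: pick one unit and use it consistently.
elapsed_days = rng.uniform(1, 365, size=(4, 10, 1))
elapsed_months = elapsed_days / 30.0  # rough conversion, assuming 30-day months
```

The same `mean` and `std` would then be reused for validation and test data, so that no statistics leak from those sets into training.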