qAp/gresearch_crypto_forecasting_kaggle

Can Spacetimeformer be used for this Kaggle?

qAp opened this issue · 9 comments

qAp commented

In the general LightningModule, a batch of data is fed into the model via the following methods:

  1. step
  2. compute_loss
  3. forward (pl.LightningModule)
  4. forward_model_pass
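
The call order above can be sketched with a toy class. This is not the spacetimeformer code: `ToyForecaster`, its persistence forecast, and the plain-Python mean squared error are hypothetical stand-ins, and the sketch assumes each listed method simply calls the next one.

```python
class ToyForecaster:
    """Toy stand-in that only mirrors the call order listed above."""

    def __init__(self):
        self.calls = []  # records which methods ran, in order

    def step(self, batch):
        self.calls.append("step")
        return self.compute_loss(batch)

    def compute_loss(self, batch):
        self.calls.append("compute_loss")
        x_c, y_c, x_t, y_t = batch
        outputs = self.forward(x_c, y_c, x_t, y_t)
        # Mean squared error between the prediction and y_t.
        return sum((o - t) ** 2 for o, t in zip(outputs, y_t)) / len(y_t)

    def forward(self, x_c, y_c, x_t, y_t):
        self.calls.append("forward")
        return self.forward_model_pass(x_c, y_c, x_t, y_t)

    def forward_model_pass(self, x_c, y_c, x_t, y_t):
        self.calls.append("forward_model_pass")
        # Hypothetical "model": repeat the last context value for every target step.
        return [y_c[-1]] * len(y_t)


model = ToyForecaster()
loss = model.step(([0, 1], [1.0, 2.0], [2, 3], [2.0, 4.0]))
print(model.calls, loss)
```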
qAp commented

Essential data variables: x_c, y_c, x_t, and y_t.
c denotes "context": the previous timesteps given to the model in order to make predictions.
t denotes "target": the future timesteps to predict.
x are the features, shaped (batch size, number of timesteps, number of features).
y are the targets/dependent variables, shaped (batch size, number of timesteps, number of targets/dependent variables).
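
To make the shapes concrete, here is a minimal NumPy sketch that splits one window into context and target parts along the time axis. The helper `split_window` and the sizes used are assumptions for illustration only; the actual spacetimeformer data loaders may build these tensors differently.

```python
import numpy as np


def split_window(x, y, context_len):
    """Split arrays shaped (batch, time, channels) into context and target parts."""
    x_c, x_t = x[:, :context_len], x[:, context_len:]
    y_c, y_t = y[:, :context_len], y[:, context_len:]
    return x_c, y_c, x_t, y_t


# Hypothetical sizes chosen only for the example.
batch_size, total_len, n_features, n_targets = 4, 10, 3, 2
rng = np.random.default_rng(0)
x = rng.normal(size=(batch_size, total_len, n_features))
y = rng.normal(size=(batch_size, total_len, n_targets))

x_c, y_c, x_t, y_t = split_window(x, y, context_len=7)
print(x_c.shape, y_c.shape, x_t.shape, y_t.shape)
```

With 7 context steps out of 10, the remaining 3 steps become the target that the model's output is compared against.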

In general, the model takes in all 4 of these and its output is the prediction for y_t, so it's compared with y_t in the loss function:

outputs, *_ = self(x_c, y_c, x_t, y_t, **forward_kwargs)
loss, mask = self.forecasting_loss(outputs=outputs, y_t=y_t, time_mask=time_mask)
qAp commented

The example csv files are of the form:

timestamp | y1 | y2 | y3 | ... | yN

where N is the number of target variables.
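
A file in this layout could be produced from long-format data with a pandas pivot, roughly as sketched below. The column names `asset_id` and `value` and the sample numbers are hypothetical, not taken from the competition files.

```python
import io

import pandas as pd

# Hypothetical long-format data: one row per (timestamp, asset).
long_df = pd.DataFrame(
    {
        "timestamp": [60, 60, 120, 120],
        "asset_id": [0, 1, 0, 1],
        "value": [1.0, 10.0, 1.5, 10.5],
    }
)

# One row per timestamp, one column per asset.
wide_df = long_df.pivot(index="timestamp", columns="asset_id", values="value")

# Rename the asset columns to y1..yN to match the layout above.
wide_df.columns = [f"y{i + 1}" for i in range(wide_df.shape[1])]
wide_df = wide_df.reset_index()

# Write to an in-memory buffer; a file path would work the same way.
buffer = io.StringIO()
wide_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```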

qAp commented
  • Get one of the example csv datasets to train in spacetimeformer

This appears to run fine:

%cd /kaggle/spacetimeformer/spacetimeformer/
! python train.py spacetimeformer asos \
--gpus 0 \
--start_token_len 8 \
--run_name 'kiwi' \
--batch_size 32
qAp commented
  • Check that spacetimeformer can be installed in a Kaggle Notebook with no Internet.
qAp commented

spacetimeformer has been migrated to this repo.

qAp commented
  • Use the spacetimeformer model to predict on some asos samples offline.
qAp commented
  • Create a similar dataset for the competition data.
  • Predict on some samples.
qAp commented

How should null values be specified for the competition data?

  1. NULL_VALUE = ?, where ? stands for the NaN values in a pandas DataFrame.
  2. Could replace NaN values in the DataFrame with some special value, like -999, but VWAP ranges from -inf to +inf, so any such value could collide with real data.
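
One way around the collision problem is to record where the values are missing before filling them, so the fill value never has to be distinguishable from real data. The sketch below is only an illustration under that assumption: `masked_mse` is a hypothetical helper, the -999 fill is arbitrary, and none of this is confirmed to match how spacetimeformer handles NULL_VALUE internally.

```python
import numpy as np
import pandas as pd

# Hypothetical targets containing a NaN and an infinity.
df = pd.DataFrame({"y1": [1.0, np.nan, 3.0], "y2": [np.inf, 5.0, 6.0]})

# Treat infinities as missing too, then record which entries are valid.
cleaned = df.replace([np.inf, -np.inf], np.nan)
valid_mask = cleaned.notna().to_numpy()

NULL_VALUE = -999.0  # arbitrary fill; the mask, not this value, marks missing data
filled = cleaned.fillna(NULL_VALUE).to_numpy()


def masked_mse(pred, target, mask):
    """Mean squared error over the entries where mask is True."""
    squared_error = (pred - target) ** 2
    return float(squared_error[mask].mean())


pred = np.zeros_like(filled)  # dummy prediction of all zeros
loss = masked_mse(pred, filled, valid_mask)
print(valid_mask.sum(), loss)
```

Because the loss is taken only where the mask is True, the filled entries do not contribute to it, whatever value is chosen.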