simaomsarmento/PairsTrading

Data leakage in threshold_strategy

mvinoba opened this issue · 1 comment

In threshold_strategy the spread is calculated as follows:

```python
# calculate normalized spread
spread = y - beta * x
norm_spread = (spread - spread.mean()) / np.std(spread)
norm_spread = np.asarray(norm_spread.values)
```

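In this case, the long and short entry and exit positions are calculated using unseen data, since the mean and standard deviation are taken over the entire series, including observations that come after each trade decision. Or am I missing something?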
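To make the concern concrete, here is a minimal sketch on synthetic data. It is not code from the repository: `full_sample_zscore`, `rolling_zscore`, and the 20-point window are hypothetical names and values chosen for illustration. It changes only the tail of a made-up spread and checks whether the earlier normalized values move. A rolling window is just one possible causal alternative; statistics estimated on a separate formation period would be another.

```python
import numpy as np
import pandas as pd


def full_sample_zscore(spread: pd.Series) -> pd.Series:
    """Normalization as in the snippet above: mean/std over the whole series."""
    values = spread.to_numpy()
    return (spread - values.mean()) / values.std()


def rolling_zscore(spread: pd.Series, window: int = 20) -> pd.Series:
    """Hypothetical causal alternative: at time t, use only the last `window`
    observations up to and including t (the first window-1 values are NaN)."""
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std(ddof=0)
    return (spread - mean) / std


# Synthetic random-walk spread, purely for illustration.
rng = np.random.default_rng(0)
spread = pd.Series(rng.normal(size=200).cumsum())

# Alter only the "future" (points 150 onward), then compare the z-scores
# of the first 150 points before and after the change.
altered = spread.copy()
altered.iloc[150:] += 10.0

leaky_changed = not np.allclose(
    full_sample_zscore(spread).iloc[:150],
    full_sample_zscore(altered).iloc[:150],
)
causal_changed = not np.allclose(
    rolling_zscore(spread).iloc[:150],
    rolling_zscore(altered).iloc[:150],
    equal_nan=True,
)

print("full-sample z-score depends on future data:", leaky_changed)
print("rolling z-score depends on future data:", causal_changed)
```

If the full-sample version reports a dependence on future data and the rolling version does not, then signals derived from the full-sample `norm_spread` are looking ahead.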

I'm pretty sure you are right. Unfortunately, this minor oversight invalidates all the results. Finance is very unforgiving :(