cvxgrp/cvxportfolio

Data quality issues in `ftse100_daily` example strategy

Even after the considerable data-cleaning work done for 1.2.0, the London strategy still has data quality problems. This is slowing down development of other non-US strategies, on the assumption (maybe incorrect) that similar issues will arise with their YahooFinance data. Two problematic names are III.L and SMT.L: for both, data availability is erratic; one almost always has zero volumes, and the other has prices that do not update.

It seems that strategy_executor.py is handling the situation correctly: it adjusts starting positions as the open prices are updated and recomputes trades as the volumes are updated. However, it's very likely that the data provider's updates are phony (at least for those two names).

The obvious solution is to exclude the problematic names. But we should do that automatically, as part of data cleaning. If cleaning deletes all or most of a name's history, the market data server's min_history filter then leaves that name out.
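
To make the idea concrete, here is a rough sketch of how aggressive cleaning could combine with a minimum-history check to drop a name automatically. It is not the cvxportfolio implementation: `Bar`, `valid_days`, `passes_min_history`, and the `max_stale_run` threshold are all hypothetical names made up for illustration, and the rules (zero volume, long runs of identical closes) are only assumptions about what "phony" data looks like.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One daily observation (hypothetical container, not a cvxportfolio type)."""
    close: float
    volume: float


def valid_days(bars, max_stale_run=3):
    """Flag each day as valid (True) or suspect (False).

    A day is suspect if its volume is zero, or if the close has repeated
    the previous close for more than `max_stale_run` consecutive days.
    """
    flags = []
    stale_run = 0
    prev_close = None
    for bar in bars:
        if prev_close is not None and bar.close == prev_close:
            stale_run += 1
        else:
            stale_run = 0
        prev_close = bar.close
        flags.append(bar.volume > 0 and stale_run <= max_stale_run)
    return flags


def passes_min_history(flags, min_history):
    """Keep a name only if enough valid days survive the cleaning."""
    return sum(flags) >= min_history


# Three toy names: healthy, always-zero volume, non-updating prices.
healthy = [Bar(100.0 + i, 1000.0) for i in range(10)]
zero_volume = [Bar(100.0 + i, 0.0) for i in range(10)]
stale_price = [Bar(100.0, 1000.0) for _ in range(10)]

universe = {
    name: passes_min_history(valid_days(bars), min_history=5)
    for name, bars in [
        ("healthy", healthy),
        ("zero_volume", zero_volume),
        ("stale_price", stale_price),
    ]
}
```

In this toy setup only the healthy name survives the filter; the other two lose most of their history to cleaning and fall below the minimum.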

So, don't make any more changes to the default parameters of YahooFinance. Instead, derive a subclass from it specifically for the problematic strategy, and change the parameters in the subclass to make cleaning more aggressive for that strategy only.
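
The subclassing pattern would look roughly like the sketch below. `DefaultSource` is a stand-in written for this example, not the real YahooFinance class, and the parameter names `max_stale_run` and `min_volume` are hypothetical; the actual cleaning parameters would need to be looked up in the class itself.

```python
class DefaultSource:
    """Stand-in for the data source class (NOT the real YahooFinance)."""

    # Hypothetical cleaning parameters, with permissive defaults.
    max_stale_run = 5
    min_volume = 0.0

    def is_suspect(self, volume, stale_run):
        """Return True if an observation should be dropped by cleaning."""
        return volume <= self.min_volume or stale_run > self.max_stale_run


class AggressiveLondonSource(DefaultSource):
    """Stricter cleaning, used only by the London strategy."""

    # Override the parameters in the subclass; the base defaults stay as-is.
    max_stale_run = 1
    min_volume = 1000.0


default = DefaultSource()
london = AggressiveLondonSource()

# The same observation: low volume, close unchanged for two days.
kept_by_default = not default.is_suspect(volume=500.0, stale_run=2)
kept_by_london = not london.is_suspect(volume=500.0, stale_run=2)
```

Because the overrides live on the subclass, every other strategy that uses the base class keeps its current behavior.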