The datasets are too large to upload to GitHub; you can find a sample data file at ./dataset/training_data/600000.csv.
split_train_data.py extracts the original dataset archive training_data.zip and splits it by stock code, saving the per-stock files under ./dataset/training_data
dbloader.py loads a stock's data given its stock code and a date interval
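A minimal sketch of what dbloader.py-style loading might look like (the function name `load_stock`, the `date` column name, and the per-code CSV layout are assumptions; the real loader may differ):

```python
import pandas as pd

def load_stock(code, start, end, data_dir="./dataset/training_data"):
    """Load one stock's CSV by code and keep only rows inside [start, end]."""
    df = pd.read_csv(f"{data_dir}/{code}.csv", parse_dates=["date"])
    mask = (df["date"] >= start) & (df["date"] <= end)
    return df.loc[mask].reset_index(drop=True)
```

Usage would look like `load_stock("600000", "2016-01-01", "2016-06-30")`.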
Running daily price prediction: day_model.py
Seasonal Components of the day model
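day_model.py depends on fbprophet, which expects a two-column DataFrame named `ds`/`y`; the seasonal-components figure is what Prophet's `plot_components` produces. A minimal sketch of preparing that frame (the raw `date`/`close` column names are assumptions about the CSV layout):

```python
import pandas as pd

def to_prophet_frame(df, date_col="date", price_col="close"):
    """Rename columns to the ds/y schema that fbprophet's Prophet.fit expects."""
    out = df[[date_col, price_col]].rename(columns={date_col: "ds", price_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out

# Fitting would then be (left as comments to avoid the fbprophet dependency here):
# from fbprophet import Prophet
# m = Prophet()
# m.fit(to_prophet_frame(df))
# forecast = m.predict(m.make_future_dataframe(periods=30))
# m.plot_components(forecast)  # renders the seasonal-components figure
```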
Running minute price prediction: minute_model.py
Model Structure
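minute_model.py builds on keras/tensorflow. The exact model is repo-specific, but a common way to frame minute-level prices for a recurrent network, sketched here as an assumption rather than the repo's actual preprocessing, is a sliding lookback window:

```python
import numpy as np

def make_windows(prices, lookback=30):
    """Turn a 1-D price series into (samples, lookback, 1) inputs
    and next-step targets for a sequence model."""
    prices = np.asarray(prices, dtype=np.float32)
    X = np.stack([prices[i:i + lookback] for i in range(len(prices) - lookback)])
    y = prices[lookback:]
    return X[..., None], y

# A matching Keras model could then be (commented to keep this snippet dependency-free):
# model = keras.Sequential([keras.layers.LSTM(32, input_shape=(lookback, 1)),
#                           keras.layers.Dense(1)])
```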
Searches for highly correlated stock pairs; results are saved under ./top_corr/
Running: find_corr_top.py
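A hedged sketch of how find_corr_top.py's search could work with pandas alone (the one-close-column-per-stock layout is an assumption):

```python
import numpy as np
import pandas as pd

def top_corr_pairs(closes, n=10):
    """closes: DataFrame with one column per stock code.
    Returns the n most correlated distinct pairs as a Series
    indexed by (code_a, code_b)."""
    corr = closes.corr()
    # keep the strict upper triangle so each pair appears exactly once
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    stacked = corr.where(mask).stack()
    return stacked.sort_values(ascending=False).head(n)
```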
Searches for significantly cointegrated stock pairs; results are saved to ./rule/trade_rule.csv
Running: cointegration.py
Spread price of the pair (600015, 600016), which shows high significance
OLS fit on the spread price of the pair (600015, 600016)
A p-value (1 - p-value) test matrix of 600000 against some other stocks
Common dependencies: numpy, pandas, matplotlib
day_model.py: fbprophet==0.3.post2
minute_model.py: keras, tensorflow, sklearn
find_corr_top.py: none beyond the common packages above
cointegration.py: statsmodels, seaborn