DDG-DA Assertion Error
l0ngc opened this issue · 2 comments
l0ngc commented
❓ Questions and Help
Hi, thanks for the help! I ran into a problem when trying to run DDG-DA.
(qlib) [longc@arch DDG-DA]$ pwd
/home/longc/projects/qlib/examples/benchmarks_dynamic/DDG-DA
(qlib) [longc@arch DDG-DA]$ python workflow.py --conf_path=../baseline/workflow_config_lightgbm_Alpha158.yaml run
2024-03-14 23:45:35.137 | WARNING | qlib.tests.data:qlib_data:175 - Data already exists: ~/.qlib/qlib_data/cn_data, the data download will be skipped
If downloading is required: `exists_skip=False` or `change target_dir`
[761674:MainThread](2024-03-14 23:45:35,137) INFO - qlib.Initialization - [config.py:416] - default_conf: client.
[761674:MainThread](2024-03-14 23:45:35,138) INFO - qlib.Initialization - [__init__.py:74] - qlib successfully initialized based on client settings.
[761674:MainThread](2024-03-14 23:45:35,139) INFO - qlib.Initialization - [__init__.py:76] - data_path={'__DEFAULT_FREQ': PosixPath('/home/longc/.qlib/qlib_data/cn_data')}
[761674:MainThread](2024-03-14 23:45:35,144) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:35,144) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:35,145) INFO - qlib.workflow - [exp.py:258] - Experiment 1 starts running ...
[761674:MainThread](2024-03-14 23:45:35,185) INFO - qlib.workflow - [recorder.py:341] - Recorder 26a7facbe4b742e5ae1c4c29c90336f3 starts running under Experiment 1 ...
ModuleNotFoundError. CatBoostModel are skipped. (optional: maybe installing CatBoostModel can fix it.)
ModuleNotFoundError. XGBModel is skipped(optional: maybe installing xgboost can fix it).
Training until validation scores don't improve for 50 rounds
[20] train's l2: 0.959367 valid's l2: 0.992761
[40] train's l2: 0.941031 valid's l2: 0.996238
[60] train's l2: 0.92202 valid's l2: 0.999542
Early stopping, best iteration is:
[12] train's l2: 0.96859 valid's l2: 0.992723
[761674:MainThread](2024-03-14 23:45:40,958) INFO - qlib.timer - [log.py:127] - Time cost: 0.017s | waiting `async_log` Done
[761674:MainThread](2024-03-14 23:45:41,116) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:41,116) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.021s | Loading data Done
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.000s | fit & process data Done
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.021s | Init data Done
[761674:MainThread](2024-03-14 23:45:44,144) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:44,144) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:44,343) WARNING - qlib.data - [data.py:666] - load calendar error: freq=day, future=True; return current calendar!
[761674:MainThread](2024-03-14 23:45:44,343) WARNING - qlib.data - [data.py:669] - You can get future calendar by referring to the following document: https://github.com/microsoft/qlib/blob/main/scripts/data_collector/contrib/README.md
[761674:MainThread](2024-03-14 23:45:44,366) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[AssertionError: An empty experiment is required for setup `InternalData`].
File "workflow.py", line 40, in <module>
fire.Fire(DDGDABench)
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/rolling/ddgda.py", line 335, in run
self._dump_meta_ipt()
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/rolling/ddgda.py", line 213, in _dump_meta_ipt
internal_data.setup(trainer=TrainerR)
File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/meta/data_selection/dataset.py", line 84, in setup
assert 0 == len(recorders), "An empty experiment is required for setup `InternalData`"
AssertionError: An empty experiment is required for setup `InternalData`
The run fails with this "empty experiment" assertion on the recorders. I have gone through the code but still cannot find the cause.
Below are the versions of the relevant packages:
(qlib) [longc@arch DDG-DA]$ python3
Python 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import qlib
>>> import pandas as pd
>>> import numpy as np
>>> import torch
>>> print("Qlib version:", qlib.__version__)
Qlib version: 0.9.3
>>> print("Pandas version:", pd.__version__)
Pandas version: 1.5.3
>>> print("NumPy version:", np.__version__)
NumPy version: 1.23.5
>>> print("PyTorch version:", torch.__version__)
PyTorch version: 1.11.0+cu113
I would really appreciate any help. Thanks!
ZhongHaoAustin commented
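Try removing the `mlruns` directory in examples/benchmarks_dynamic/DDG-DA (run `rm -rf mlruns` from that directory). The assertion requires an empty experiment, so recorders left over from an earlier run, which are stored in that directory, are the likely cause.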
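For anyone who prefers to do the cleanup from Python before re-running workflow.py, here is a minimal sketch of the same idea using only the standard library. The helper name `clear_tracking_dir` is hypothetical, and the directory name `mlruns` is assumed to be the default tracking directory created in the working directory; pass a different name if your setup stores runs elsewhere. Note that this deletes every recorded run in that directory, so back it up first if you need earlier results.

```python
import shutil
from pathlib import Path


def clear_tracking_dir(workdir: str, dirname: str = "mlruns") -> bool:
    """Delete `workdir/dirname` if it exists.

    `dirname` defaults to "mlruns", which is an assumption about where
    the experiment records are stored. Returns True when a directory
    was removed and False when there was nothing to remove.
    """
    target = Path(workdir) / dirname
    if target.is_dir():
        # Removes the directory and everything below it (all past runs).
        shutil.rmtree(target)
        return True
    return False
```

Called as `clear_tracking_dir("examples/benchmarks_dynamic/DDG-DA")` from the repository root, it is safe to run repeatedly: a second call simply returns False.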