microsoft/qlib

DDG-DA Assertion Error

l0ngc opened this issue · 2 comments

❓ Questions and Help

Hi, thanks for the help! I ran into a problem when trying to run DDG-DA.

(qlib) [longc@arch DDG-DA]$ pwd
/home/longc/projects/qlib/examples/benchmarks_dynamic/DDG-DA
(qlib) [longc@arch DDG-DA]$ python workflow.py --conf_path=../baseline/workflow_config_lightgbm_Alpha158.yaml run
2024-03-14 23:45:35.137 | WARNING  | qlib.tests.data:qlib_data:175 - Data already exists: ~/.qlib/qlib_data/cn_data, the data download will be skipped
        If downloading is required: `exists_skip=False` or `change target_dir`
[761674:MainThread](2024-03-14 23:45:35,137) INFO - qlib.Initialization - [config.py:416] - default_conf: client.
[761674:MainThread](2024-03-14 23:45:35,138) INFO - qlib.Initialization - [__init__.py:74] - qlib successfully initialized based on client settings.
[761674:MainThread](2024-03-14 23:45:35,139) INFO - qlib.Initialization - [__init__.py:76] - data_path={'__DEFAULT_FREQ': PosixPath('/home/longc/.qlib/qlib_data/cn_data')}
[761674:MainThread](2024-03-14 23:45:35,144) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:35,144) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:35,145) INFO - qlib.workflow - [exp.py:258] - Experiment 1 starts running ...
[761674:MainThread](2024-03-14 23:45:35,185) INFO - qlib.workflow - [recorder.py:341] - Recorder 26a7facbe4b742e5ae1c4c29c90336f3 starts running under Experiment 1 ...
ModuleNotFoundError. CatBoostModel are skipped. (optional: maybe installing CatBoostModel can fix it.)
ModuleNotFoundError. XGBModel is skipped(optional: maybe installing xgboost can fix it).
Training until validation scores don't improve for 50 rounds
[20]    train's l2: 0.959367    valid's l2: 0.992761
[40]    train's l2: 0.941031    valid's l2: 0.996238
[60]    train's l2: 0.92202     valid's l2: 0.999542
Early stopping, best iteration is:
[12]    train's l2: 0.96859     valid's l2: 0.992723
[761674:MainThread](2024-03-14 23:45:40,958) INFO - qlib.timer - [log.py:127] - Time cost: 0.017s | waiting `async_log` Done
[761674:MainThread](2024-03-14 23:45:41,116) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:41,116) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.021s | Loading data Done
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.000s | fit & process data Done
[761674:MainThread](2024-03-14 23:45:44,056) INFO - qlib.timer - [log.py:127] - Time cost: 0.021s | Init data Done
[761674:MainThread](2024-03-14 23:45:44,144) INFO - qlib.Rolling - [base.py:162] - The prediction horizon is overrided
[761674:MainThread](2024-03-14 23:45:44,144) INFO - qlib.Rolling - [base.py:173] - {'model': {'class': 'LGBModel', 'module_path': 'qlib.contrib.model.gbdt', 'kwargs': {'loss': 'mse', 'colsample_bytree': 0.8879, 'learning_rate': 0.2, 'subsample': 0.8789, 'lambda_l1': 205.6999, 'lambda_l2': 580.9768, 'max_depth': 8, 'num_leaves': 210, 'num_threads': 20}}, 'dataset': {'class': 'DatasetH', 'module_path': 'qlib.data.dataset', 'kwargs': {'handler': {'class': 'Alpha158', 'module_path': 'qlib.contrib.data.handler', 'kwargs': {'start_time': datetime.date(2008, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'fit_start_time': datetime.date(2008, 1, 1), 'fit_end_time': datetime.date(2014, 12, 31), 'instruments': 'csi300', 'label': ['Ref($close, -21) / Ref($close, -1) - 1']}}, 'segments': {'train': [datetime.date(2008, 1, 1), datetime.date(2014, 12, 31)], 'valid': [datetime.date(2015, 1, 1), datetime.date(2016, 12, 31)], 'test': [datetime.date(2017, 1, 1), datetime.date(2020, 8, 1)]}}}, 'record': [{'class': 'SignalRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'model': '<MODEL>', 'dataset': '<DATASET>'}}, {'class': 'SigAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'ana_long_short': False, 'ann_scaler': 252}}, {'class': 'PortAnaRecord', 'module_path': 'qlib.workflow.record_temp', 'kwargs': {'config': {'strategy': {'class': 'TopkDropoutStrategy', 'module_path': 'qlib.contrib.strategy', 'kwargs': {'signal': '<PRED>', 'topk': 50, 'n_drop': 5}}, 'backtest': {'start_time': datetime.date(2017, 1, 1), 'end_time': datetime.date(2020, 8, 1), 'account': 100000000, 'benchmark': 'SH000300', 'exchange_kwargs': {'limit_threshold': 0.095, 'deal_price': 'close', 'open_cost': 0.0005, 'close_cost': 0.0015, 'min_cost': 5}}}}}]}
[761674:MainThread](2024-03-14 23:45:44,343) WARNING - qlib.data - [data.py:666] - load calendar error: freq=day, future=True; return current calendar!
[761674:MainThread](2024-03-14 23:45:44,343) WARNING - qlib.data - [data.py:669] - You can get future calendar by referring to the following document: https://github.com/microsoft/qlib/blob/main/scripts/data_collector/contrib/README.md
[761674:MainThread](2024-03-14 23:45:44,366) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[AssertionError: An empty experiment is required for setup `InternalData`].
  File "workflow.py", line 40, in <module>
    fire.Fire(DDGDABench)
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/rolling/ddgda.py", line 335, in run
    self._dump_meta_ipt()
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/rolling/ddgda.py", line 213, in _dump_meta_ipt
    internal_data.setup(trainer=TrainerR)
  File "/home/longc/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/contrib/meta/data_selection/dataset.py", line 84, in setup
    assert 0 == len(recorders), "An empty experiment is required for setup `InternalData`"
AssertionError: An empty experiment is required for setup `InternalData`

I hit this assertion, which complains that the experiment's recorders are not empty. I have tried reading the code but still have no clue what causes it.

Below are the versions of the relevant packages:

(qlib) [longc@arch DDG-DA]$ python3
Python 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28) 
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import qlib
>>> import pandas as pd
>>> import numpy as np
>>> import torch

>>> print("Qlib version:", qlib.__version__)
Qlib version: 0.9.3
>>> print("Pandas version:", pd.__version__)
Pandas version: 1.5.3
>>> print("NumPy version:", np.__version__)
NumPy version: 1.23.5
>>> print("PyTorch version:", torch.__version__)
PyTorch version: 1.11.0+cu113

I would really appreciate any help. Thanks!

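Try removing the `mlruns` directory: run `rm -rf mlruns` in examples/benchmarks_dynamic/DDG-DA, then re-run the workflow. The assertion in `InternalData.setup` requires an experiment with no existing recorders, so records left over from an earlier run will trigger it.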
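If you would rather do the cleanup from Python than from the shell, here is a minimal sketch using only the standard library. The helper name `clear_tracking_dir` is hypothetical (it is not part of qlib), and it assumes the tracking directory is called `mlruns` and sits directly under the directory you run `workflow.py` from.

```python
import shutil
from pathlib import Path


def clear_tracking_dir(workdir, dirname="mlruns"):
    """Delete `workdir/dirname` if it exists.

    Hypothetical helper, not a qlib API. Returns True when a directory
    was removed and False when there was nothing to remove.
    """
    target = Path(workdir) / dirname
    if not target.is_dir():
        return False
    # Removes every experiment and recorder stored under the directory,
    # so only use this when the old results are no longer needed.
    shutil.rmtree(target)
    return True


# Example (assumption: called from examples/benchmarks_dynamic/DDG-DA):
#   clear_tracking_dir(Path.cwd())
```

Note that this deletes all previously recorded experiments in that folder, the same as `rm -rf mlruns`, so copy anything you want to keep elsewhere first.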