hudson-and-thames/mlfinlab

TypeError in get_trades_based_kyle_lambda

brunofacca opened this issue · 3 comments

The get_trades_based_kyle_lambda function worked for several million rows of tick data, then raised the following exception:

~/.cache/pypoetry/virtualenvs/redistributor-Sj5_szt9-py3.8/lib/python3.8/site-packages/mlfinlab/microstructural_features/second_generation.py in get_trades_based_kyle_lambda(price_diff, volume, aggressor_flags)
    145     y = np.array(price_diff)
    146     coef, var = get_betas(X, y)
--> 147     t_value = coef[0] / np.sqrt(var[0]) if var[0] > 0 else np.array([0])
    148     return [coef[0], t_value[0]]
    149 

TypeError: '>' not supported between instances of 'list' and 'int'

I believe that the problem is in the following part of the get_betas function:

try:
    xx_inv = np.linalg.inv(xx)
except np.linalg.LinAlgError:
    return [np.nan], [[np.nan, np.nan]]

The variance returned when np.linalg.LinAlgError is raised is a nested list ([[np.nan, np.nan]]), so var[0] is itself a list, and the var[0] > 0 check in get_trades_based_kyle_lambda cannot compare it with an int.
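A minimal sketch that reproduces the error without mlfinlab, using only the fallback value quoted above. The variable names here are illustrative; this is not the library's code, just the same comparison applied to the same nested list.

```python
import numpy as np

# Fallback values quoted above from the except branch of get_betas.
coef, var = [np.nan], [[np.nan, np.nan]]

# var[0] is the inner list [nan, nan], so comparing it with an int fails.
try:
    var[0] > 0
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
# '>' not supported between instances of 'list' and 'int'
```

If the fallback variance were a flat list such as [np.nan], var[0] > 0 would simply evaluate to False instead of raising. That is only one possible shape for a fix, not a confirmed patch.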

I just had the same issue in get_trades_based_hasbrouck_lambda and noticed that all 3 lambda functions contain the same problematic piece of code.

Hi, @brunofacca, thank you for your attention.
This is indeed an issue; I have also run into it on my own data. Ideally, the function should return nan. For now I would suggest wrapping the call in a try-except block, though the problem probably lies in the input data passed to the function.
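A rough sketch of the try-except workaround suggested here. Both safe_lambda and failing_stand_in are hypothetical names made up for illustration; the stand-in only mimics the failing comparison so the example runs without mlfinlab installed.

```python
import numpy as np


def safe_lambda(lambda_func, *args):
    """Hypothetical wrapper: return [nan, nan] if the estimator raises TypeError."""
    try:
        return lambda_func(*args)
    except TypeError:
        # Same length as the [coefficient, t-value] pair the function normally returns.
        return [np.nan, np.nan]


def failing_stand_in(price_diff, volume, aggressor_flags):
    """Stand-in (not mlfinlab code) that fails the same way as the traceback."""
    var = [[np.nan, np.nan]]
    return [np.nan, 0.0] if var[0] > 0 else [np.nan, 0.0]


result = safe_lambda(failing_stand_in, [0.0, 0.0], [10, 20], [1, 1])
print(result)
```

With mlfinlab installed, the real get_trades_based_kyle_lambda from mlfinlab.microstructural_features.second_generation could be passed in place of the stand-in. Note that catching TypeError this broadly may also hide unrelated bugs, so it is best treated as a temporary measure.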

Hi @proskurin. I've found that the problem happens in bars where all ticks have the exact same price. Although very uncommon, the data is valid. Thank you.