arundo/adtk

[Question]: What is the output type of `anomalies` when I use OutlierDetector?

hmulagad opened this issue · 2 comments

Hello,

I am trying to use adtk to find outliers in my data. I used the code below.
from adtk.detector import OutlierDetector
from sklearn.neighbors import LocalOutlierFactor

outlier_detector = OutlierDetector(LocalOutlierFactor(contamination=0.05))
anomalies = outlier_detector.fit_detect(df_anomaly)
print(anomalies)
My output is of type bool, and I see something like this:
log_time
2022-02-04 09:48:07 False
2022-02-06 16:16:19 False
2022-02-06 16:21:20 True
2022-02-06 16:26:20 False
2022-02-06 16:31:20 True
...
2022-02-07 05:56:23 False
2022-02-07 06:01:23 False
2022-02-07 06:06:23 False
2022-02-07 06:11:23 False
2022-02-07 06:16:23 False

Is this a DataFrame? How can I get a list of the entries that are True (which I believe are the outliers)? I need to plot those outliers on my plot.

Any help will be really appreciated.

Hi @hmulagad

The anomalies object is a pandas.core.series.Series of boolean values indexed by log_time, not a DataFrame.

You can use the built-in adtk plot function to plot it:

from adtk.visualization import plot
plot(df_anomaly, anomaly=anomalies, ts_linewidth=1, ts_markersize=3, anomaly_color='red', anomaly_alpha=0.3, curve_group='all');

Or you can convert it to a list and use the data in your own plot.

# With datetime and boolean
anomalies_list = anomalies.reset_index().values.tolist()
anomalies_list
# Just boolean results
anomalies_list = anomalies.to_list()
anomalies_list

# List of the entries that are True (booleans only, without their timestamps)
anomalies_true = [value for value in anomalies.to_list() if value]

If you want their indices, you can enumerate the list. Alternatively, you can build a list that holds both the datetime and the value by running the list comprehension on the reset-index version:

anomalies_true_with_datetime = [item for item in anomalies.reset_index().values.tolist() if item[1]]
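
A shorter route, sketched here on the assumption that `anomalies` is a boolean Series with no missing values and shares its index with `df_anomaly`, is to use the Series itself as a mask. The sample data and the `value` column name below are made up for illustration; in practice `anomalies` would come from `fit_detect`.

```python
import pandas as pd

# Hypothetical stand-in data (not from the original post).
index = pd.DatetimeIndex(
    [
        "2022-02-06 16:16:19",
        "2022-02-06 16:21:20",
        "2022-02-06 16:26:20",
        "2022-02-06 16:31:20",
    ],
    name="log_time",
)
df_anomaly = pd.DataFrame({"value": [1.0, 9.5, 1.2, 8.7]}, index=index)
anomalies = pd.Series([False, True, False, True], index=index)

# Boolean indexing keeps only the entries flagged True;
# .index then gives the timestamps of the outliers.
outlier_times = anomalies[anomalies].index

# The same timestamps select the matching rows of the original data.
outlier_rows = df_anomaly.loc[outlier_times]

print(outlier_times)
print(outlier_rows)
```

This avoids the round trip through Python lists and keeps the timestamps attached to the values you want to plot.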

Hello @earthgecko

Thank you very much! That is very helpful. I need to use my own plot. I will convert the list to a DataFrame, since I am using Bokeh and want to keep things consistent!
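
For anyone taking the same route, here is a minimal sketch of getting a DataFrame without going through a list first. It assumes `anomalies` is a boolean Series sharing its index with `df_anomaly`; the sample data and the `value` and `is_outlier` column names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in data (not from the original post).
index = pd.DatetimeIndex(
    [
        "2022-02-06 16:16:19",
        "2022-02-06 16:21:20",
        "2022-02-06 16:26:20",
        "2022-02-06 16:31:20",
    ],
    name="log_time",
)
df_anomaly = pd.DataFrame({"value": [1.0, 9.5, 1.2, 8.7]}, index=index)
anomalies = pd.Series([False, True, False, True], index=index)

# Attach the flags as a named column, then turn the index into a
# regular column so the timestamps are available for plotting.
plot_df = df_anomaly.join(anomalies.rename("is_outlier")).reset_index()

# Rows to draw as outlier markers on top of the main series.
outliers_df = plot_df[plot_df["is_outlier"]]

print(outliers_df)
```

The resulting frame holds the timestamp, the value, and the flag side by side, which is usually the shape a plotting library expects.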

~Kishan