bmvandoren/Nighthawk

Warning: "set on a copy of a slice"

Closed this issue · 1 comment

Thank you for your work on Nighthawk. I wanted to report something I am seeing in my logs when running with Vesper, and I wonder if it would be worth fixing:

2023-06-05 23:30:43,852 INFO                 /opt/conda/envs/nighthawk-0.2.0/lib/python3.10/site-packages/nighthawk/run_reconstructed_model.py:107: SettingWithCopyWarning: 
2023-06-05 23:30:43,852 INFO                 A value is trying to be set on a copy of a slice from a DataFrame.
2023-06-05 23:30:43,852 INFO                 Try using .loc[row_indexer,col_indexer] = value instead
2023-06-05 23:30:43,852 INFO                 
2023-06-05 23:30:43,852 INFO                 See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
2023-06-05 23:30:43,852 INFO                   df_split['tmp'] = range(df_split.shape[0])

The warning points to line 107 of run_reconstructed_model.py, which appears to be the df_split['tmp'] assignment in the middle of:

def split_long_detections(df,max_duration=5):
    df = df.reset_index(drop=True)
    is_too_long = df['end_sec']-df['start_sec'] > max_duration
    df_keep = df.loc[~is_too_long]
    df_split = df.loc[is_too_long]
    df_split['tmp'] = range(df_split.shape[0])
    df_split = df_split.groupby('tmp', group_keys=False).apply(split_long_detections_helper,max_duration=max_duration)
    df_split = df_split.drop('tmp',axis=1)
    df_split = df_split.reset_index(drop=True)
    df_out = pd.concat([df_keep,df_split])
    df_out = df_out.sort_values('start_sec').reset_index(drop=True)
    return df_out
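
For reference, the warning is most likely raised because df.loc[is_too_long] returns a subset that pandas treats as derived from df, so adding a column to it looks like a write to a slice. A minimal sketch of one common way to avoid it is to take an explicit copy before adding the column. The function name split_rows_sketch and the sample data are hypothetical, and the groupby/helper step is left out. This is only an illustration, not necessarily how the project handles it.

```python
import pandas as pd


def split_rows_sketch(df, max_duration=5):
    """Hypothetical, simplified stand-in for the first half of split_long_detections."""
    df = df.reset_index(drop=True)
    is_too_long = df["end_sec"] - df["start_sec"] > max_duration
    df_keep = df.loc[~is_too_long]
    # .copy() makes df_split an independent DataFrame, so adding a column
    # is no longer treated as a write to a slice of df.
    df_split = df.loc[is_too_long].copy()
    df_split["tmp"] = range(df_split.shape[0])
    return df_keep, df_split


# Made-up detections: only the middle row is longer than max_duration.
detections = pd.DataFrame(
    {"start_sec": [0.0, 1.0, 10.0], "end_sec": [2.0, 9.0, 12.0]}
)
df_keep, df_split = split_rows_sketch(detections)
print(df_split)
```

An alternative with the same effect would be df.loc[is_too_long].assign(tmp=...), which returns a new DataFrame instead of modifying one in place.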

Fixed in c330a53.