JaneliaSciComp/SongExplorer

.csv index columns in data

Closed this issue · 2 comments

Hi there. I am attempting to look at the labeled sound events provided in the .csvs in the data directory. A couple of the files seem to have the same start and stop indices if I am reading them right. For example, in 20161207T102314_ch1-annotated-person1.csv, the last several rows have the same number. This issue also appears in most if not all rows of 20190122T093303a-7-annotated-person2.csv, 20190122T093303a-7-annotated-person3.csv, 20190122T132554a-14-annotated-person2.csv, and 20190122T132554a-14-annotated-person3.csv, and some rows of 20190122T132554a-14-annotated-person3.csv. Am I interpreting the data correctly to say that of the five columns, the second and third indicate the start and stop indices, and if so, does SongExplorer handle that these are often the same? Thanks for any help.

your interpretation is correct, and yes, songexplorer handles that case. if the start and stop times are the same, that nominally indicates the event has no temporal duration.
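for anyone wanting to check their own files, here is a minimal sketch (not songexplorer code) that separates point annotations (start == stop) from range annotations. it assumes only what is stated above: five columns, with the second and third holding the start and stop indices. the sample rows, the file name `rec.wav`, and the contents of the other columns are made up for illustration.

```python
import csv
import io

def split_point_and_range(rows):
    """Separate rows into point (start == stop) and range (start != stop) annotations."""
    points, ranges = [], []
    for row in rows:
        # Assumption from the thread: column 2 is start, column 3 is stop.
        start, stop = int(row[1]), int(row[2])
        if start == stop:
            points.append(row)
        else:
            ranges.append(row)
    return points, ranges

# Hypothetical sample data; real files live in the repo's data directory.
sample = io.StringIO(
    "rec.wav,1000,1000,annotated,pulse\n"
    "rec.wav,2000,2450,annotated,sine\n"
    "rec.wav,3000,3000,annotated,pulse\n"
)
points, ranges = split_point_and_range(csv.reader(sample))
print(len(points), "point annotations,", len(ranges), "range annotations")
```

to run it on a real file, replace the `io.StringIO` object with an opened csv file.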

historically we took the time-saving step of annotating Drosophila pulses like this, as a double-click is quicker than a click-and-drag. (pulses are short relative to sine song, but they are not delta functions.) we have found recently, however, that range annotations (where the start and stop times are not identical) permit a degree of data augmentation for pulses, so fewer annotations are needed, which makes a click-and-drag worthwhile. in this case, when training, songexplorer will randomly select a point within that range each time that annotation is selected for a mini-batch. the context window input to the neural network will therefore be slightly different each time, yielding better generalization.
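the random selection described above can be sketched roughly as follows. this is an illustration of the idea, not songexplorer's actual training code: the function names, the uniform distribution, the inclusive stop index, and centering the window on the sampled point are all assumptions.

```python
import random

def sample_center(start, stop, rng=random):
    """Pick a point inside the annotation; a point annotation always returns start."""
    # Assumption: sampling is uniform and the stop index is inclusive.
    return rng.randint(start, stop)

def context_window(start, stop, window, rng=random):
    """Return (first, last_exclusive) sample indices of a window around a sampled point."""
    center = sample_center(start, stop, rng)
    first = center - window // 2
    return first, first + window

rng = random.Random(0)  # seeded only so repeated runs are reproducible
for _ in range(3):
    # the same range annotation yields a slightly different window on each draw
    print(context_window(2000, 2450, window=512, rng=rng))
```

with a point annotation (start == stop) every draw returns the same window, which is why range annotations give the extra augmentation.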

i haven't pushed the code to github yet, but there is a new feature where the double-click gesture is backed by a code plug-in. the default is still a point annotation (start==stop), but there is an alternate function which snaps to the nearest peak in the waveform and lays down a range annotation of user-defined width. if this would be helpful to you i can push that code and re-build the containers. let me know.
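since that code is not public yet, here is only a rough sketch of what a snap-to-peak function could look like. everything in it is hypothetical: the name `snap_to_peak`, its parameters, and the choice to treat "nearest peak" as the largest absolute sample within a search radius of the click. the real plug-in interface may differ.

```python
def snap_to_peak(waveform, click, search_radius, width):
    """Return (start, stop) of a range annotation of roughly `width` samples around a peak.

    waveform      -- sequence of samples
    click         -- sample index of the double-click
    search_radius -- how far on either side of the click to look for a peak
    width         -- desired annotation width in samples (user-defined)
    """
    lo = max(0, click - search_radius)
    hi = min(len(waveform), click + search_radius + 1)
    # Assumption: the peak is the sample with the largest absolute amplitude.
    peak = max(range(lo, hi), key=lambda i: abs(waveform[i]))
    # Lay the range around the peak, clipped to the bounds of the recording.
    start = max(0, peak - width // 2)
    stop = min(len(waveform) - 1, start + width)
    return start, stop

# Toy waveform with a pulse peak at index 4; the click lands one sample early.
wave = [0.0, 0.1, 0.0, 0.0, 0.9, -0.2, 0.0, 0.0, 0.0, 0.0]
print(snap_to_peak(wave, click=3, search_radius=2, width=4))
```

the default point annotation corresponds to simply returning `(click, click)`.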

Got it. Thanks for the help and the quick reply!