UCSD-E4E/PyHa

Histogram for length of annotations

JacobGlennAyers opened this issue · 1 comments

Create a new function within visualizations.py that builds a histogram so that a user can see the distribution of annotation lengths in the data they are working with. This would help in deciding which annotations to throw out when deriving relevant statistics for the cross-correlation pipeline.

Matplotlib has a built-in histogram function that can be encapsulated for this task: https://matplotlib.org/stable/gallery/statistics/hist.html

Play around with some data, such as the manual labels already in the repository, to figure out a reasonable default number of bins, but write the function so that the user can change the number of histogram bins. The y-axis should be labeled "Count" and the x-axis should be labeled "Annotation Length (s)". The function should accept a pandas dataframe of annotations and should work with either human annotations or automated annotations. This shouldn't be tricky since both have the "DURATION" column. Add two extra parameters that give the user the option to save the figure to a file.

Call the function annotation_histogram(annotation_df, n_bins=15, save_fig=False, filename="annotation_histogram.png")
*n_bins = 15 was an arbitrary number I chose. Whoever takes this task on, I can send a couple of examples of annotations to get a feel for a number that works decently for all of them.
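
For whoever picks this up, here is a minimal sketch of what the function might look like, following the signature and axis labels requested above. It is not the actual PyHa implementation: returning the Axes object, dropping missing durations, and the small example dataframe at the bottom are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def annotation_histogram(annotation_df, n_bins=15, save_fig=False,
                         filename="annotation_histogram.png"):
    """Plot a histogram of annotation lengths from the "DURATION" column."""
    # Both human and automated annotation dataframes carry a "DURATION" column.
    # Dropping missing values is an assumption, not something the issue asks for.
    durations = annotation_df["DURATION"].astype(float).dropna()

    fig, ax = plt.subplots()
    ax.hist(durations, bins=n_bins)
    ax.set_xlabel("Annotation Length (s)")
    ax.set_ylabel("Count")

    # Only write to disk when the user asks for it.
    if save_fig:
        fig.savefig(filename)

    # Returning the Axes is an assumption; it lets the caller tweak the plot.
    return ax


# Hypothetical durations in seconds, made up purely to exercise the function.
example_df = pd.DataFrame({"DURATION": [0.4, 0.9, 1.1, 1.3, 2.0, 2.2, 3.5, 7.8]})
ax = annotation_histogram(example_df, n_bins=5)
```

In a script, calling plt.show() after the function displays the figure. Passing save_fig=True writes it to filename instead of leaving that to the caller.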

The documentation could still be updated, but the feature is now available in the PyHa Tutorial.