This repository contains a protocol to:
- download a subset of bird vocalizations from Xeno-Canto using R
- label bird vocalizations using Audacity
To download recordings from Xeno-Canto (XC), run the R script download_recordings_from_xc.R.
A detailed protocol is written in the protol.Rmd file.
After the recordings are downloaded from XC, a separate CSV/XLSX file is created for each species. The columns of these files are:
- Recording_ID unique identifier assigned to each recording by XC
- birdlife_sci_name scientific name following BirdLife International's taxonomy
- English_name common name
- song_description_collins_bird_guide description of the song in the Collins Bird Guide
- jean_roche file name on the Jean-Roche sound CD; useful in case of taxonomy mismatches
- cd location of the CD file (the CDs are available on the Google Drive)
- Taxonomy information from XC: latin scientific name used in XC; Genus genus; Specific_epithet epithet; Subspecies subspecies
- Recordist name of the recordist
- Geographical information: Country; Locality; Latitude; Longitude; Altitude
- Vocalization_type description of the vocalization given by the recordist
- Audio_file HTTPS address of the audio file
- License license of the recording
- Url URL of the recording on XC
- Quality "Recordings are rated by quality. Quality ratings range from A (highest quality) to E (lowest quality)." (see https://xeno-canto.org/help/search)
- Time, Date time and date of the recording
- group in our case, the value is always birds
- sex sex of the individual
- stage Life stage of the individual
- method either field recording, capture in the hand, or unknown
- file.name mp3/wav file name as in XC
- Spectrogram_small; Spectrogram_med; Spectrogram_large; Spectrogram_full; osci.small; osci.med; osci.large spectrogram and oscillogram images available on XC, in several sizes
- Length length of the recording
- Uploaded when the recording was uploaded
- Remarks remarks from recordist
- Bird_seen and animal.seen Yes/No/unknown, whether the individual was also observed
- Playback_used Yes/No/unknown, whether playback was used
- temp temperature
- regnr "The regnr tag can be used to search for animals that were sound recorded before ending up in a (museum) collection. This tag also accepts a 'matches' operator" (see https://xeno-canto.org/help/search)
- auto Yes/No/unknown, whether the recording was made automatically (non-supervised)
- dvc; mic recording device and microphone used
- smp "The smp tag can be used to search for recordings with a specific sampling rate (in Hz). For example, smp:">48000" will return hi-res recordings. Other frequencies include 22050, 44100 and multiples of 48000." (see https://xeno-canto.org/help/search)
- Other_species, Other_species1, ..., Other_species22 species other than the target species, as identified by the recordist
- format_audio format: mp3 or wav
- order_name; family_name; birdlife_common_names BirdLife Taxonomy
- easy_or_difficult_song whether the song is easy or difficult to identify
- short_or_continuous_song whether the song is short or continuous
- expertise_level_labeling either novice or expert labeler
- passing_vocalization_types; seconds; length_less_30_sec; count_other_species; is_france; is_neighbor; is_europe; is_neighbor_and_quality_a_and_is_seen_and_accepted_vocalization_type; is_neighbor_and_quality_a_and_accepted_vocalization_type; is_europe_and_quality_a_and_accepted_vocalization_type summary columns used to sort recordings
- min_amp_between_signals; max_amp_between_signals; min_amp_signal; max_amp_signal; count_number_signals; bck_en_snr_t; snr_time; sharpness; bck_en_freq; snr_freq; max_amp; min_amp information on amplitude and signal-to-noise ratio extracted from the recording with the scikit-maad Python library
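
As a rough illustration of how these columns can be used to shortlist recordings, the sketch below reads a species table with Python's standard csv module, keeps recordings rated B or better with at most one background species, and lists the best first. The sample rows, the thresholds, and the helper names (count_other_species, select_recordings) are made up for this example and are not part of the repository. In particular, how the repository's own count_other_species column is computed is an assumption here.

```python
import csv
import io

# Made-up sample rows for illustration only; real files have many more columns.
SAMPLE_CSV = """Recording_ID,Quality,Other_species,Other_species1,Other_species2
1001,B,Species x,,
1002,A,,,
1003,A,Species x,Species y,
1004,E,,,
"""

# XC quality ratings: A is the highest, E the lowest.
QUALITY_RANK = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}


def load_rows(handle):
    """Read all rows of a species table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def count_other_species(row):
    """Count the non-empty Other_species* cells of one row (hypothetical helper)."""
    return sum(
        1
        for name, value in row.items()
        if name.startswith("Other_species") and value and value.strip()
    )


def select_recordings(rows, min_quality="B", max_other_species=1):
    """Keep rows rated min_quality or better with few background species, best first."""
    limit = QUALITY_RANK[min_quality]
    kept = [
        row
        for row in rows
        if row["Quality"] in QUALITY_RANK
        and QUALITY_RANK[row["Quality"]] <= limit
        and count_other_species(row) <= max_other_species
    ]
    # Sort by quality first, then by number of background species.
    return sorted(
        kept,
        key=lambda row: (QUALITY_RANK[row["Quality"]], count_other_species(row)),
    )


rows = load_rows(io.StringIO(SAMPLE_CSV))
selected = select_recordings(rows)
print([row["Recording_ID"] for row in selected])
```

To try this on a real file, replace io.StringIO(SAMPLE_CSV) with open(path, newline="", encoding="utf-8"), where path points to the CSV of one species.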