Files skipped and speed improvement
Maxime-Bru opened this issue · 1 comments
Maxime-Bru commented
I noticed that some files were occasionally not analyzed by BirdNET. I made two improvements to the analyzeFile() function in analyze.py, and this seems to have solved my problem:
- moved the check that skips existing results earlier in the function, so no unnecessary lines run before a file is skipped (faster)
- added a try/except block around audio.getAudioFileLength() so that empty or corrupted WAV files no longer cause errors
Here are the first few lines of analyzeFile which have been updated:
def analyzeFile(item):
    """Analyzes a file.

    Predicts the scores for the file and saves the results.

    Args:
        item: Tuple containing (file path, config)

    Returns:
        `True` if the file was analyzed successfully.
    """
    # Get file path and restore cfg
    fpath: str = item[0]
    cfg.setConfig(item[1])

    # Skip early if a result file already exists, before doing any other work
    result_file_name = get_result_file_name(fpath)
    if cfg.SKIP_EXISTING_RESULTS and os.path.exists(result_file_name):
        print(f"Skipping {fpath} as it has already been analyzed", flush=True)
        return True

    # Start time
    start_time = datetime.datetime.now()
    offset = 0
    duration = cfg.FILE_SPLITTING_DURATION
    start, end = 0, cfg.SIG_LENGTH

    # Reading the duration can fail on empty or corrupted files
    try:
        fileLengthSeconds = audio.getAudioFileLength(fpath, cfg.SAMPLE_RATE)
    except Exception as ex:
        # Write error log
        print(f"Error: Cannot read duration of audio file {fpath}.\n", flush=True)
        utils.writeErrorLog(ex)
        return False

    results = {}

    # Status
    print(f"Analyzing {fpath}", flush=True)
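For anyone who wants to try the two changes outside of BirdNET, here is a minimal standalone sketch of the same pattern. It is only an illustration: `read_wav_duration` and `analyze_file` are hypothetical names, and Python's standard `wave` module stands in for `audio.getAudioFileLength()`, so it only handles WAV files and is not how analyze.py reads audio.

```python
import os
import wave


def read_wav_duration(path):
    """Return the duration in seconds, or None if the file cannot be read.

    Hypothetical stand-in for audio.getAudioFileLength(), using the stdlib.
    """
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError, OSError, ZeroDivisionError) as ex:
        # Empty or corrupted files end up here instead of crashing the run
        print(f"Error: Cannot read duration of audio file {path}: {ex}", flush=True)
        return None


def analyze_file(path, result_path, skip_existing=True):
    """Return True if the file was handled (or skipped), False if unreadable."""
    # Cheap check first: nothing else runs when a result already exists
    if skip_existing and os.path.exists(result_path):
        print(f"Skipping {path} as it has already been analyzed", flush=True)
        return True

    duration = read_wav_duration(path)
    if duration is None:
        return False

    # Placeholder for the real analysis: just record the duration
    with open(result_path, "w") as out:
        out.write(f"{duration:.3f}\n")
    return True
```

With this ordering, a file whose result already exists is never opened at all, and an unreadable file produces a `False` return value rather than an unhandled exception.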
Josef-Haupt commented
Looks good, I'll make the changes. In the future, if you want to, you can easily propose changes like these in a pull request yourself.