Files skipped and speed improvement

Question

Maxime-Bru opened this issue 6 months ago · 1 comment

I noticed that some files were sometimes not analyzed by BirdNET. I made two improvements to the analyzeFile() function in analyze.py, and they seem to have solved my problem:

Moved the check that skips existing results to the top of the function, so that no unnecessary lines run for files that have already been analyzed.
Added a try/except block around audio.getAudioFileLength(), so that empty or corrupted WAV files no longer raise an unhandled exception.

Here are the first few lines of analyzeFile which have been updated:

def analyzeFile(item):
    """Analyzes a file.

    Predicts the scores for the file and saves the results.

    Args:
        item: Tuple containing (file path, config)

    Returns:
        `True` if the file was analyzed successfully.
    """
    # Get file path and restore cfg
    fpath: str = item[0]
    cfg.setConfig(item[1])
    result_file_name = get_result_file_name(fpath)

    if cfg.SKIP_EXISTING_RESULTS and os.path.exists(result_file_name):
        print(f"Skipping {fpath} as it has already been analyzed", flush=True)
        return True
    
    # Start time
    start_time = datetime.datetime.now()
    offset = 0
    duration = cfg.FILE_SPLITTING_DURATION
    start, end = 0, cfg.SIG_LENGTH
    try:
        fileLengthSeconds = audio.getAudioFileLength(fpath, cfg.SAMPLE_RATE)
    except Exception as ex:
        # Write error log
        print(f"Error: Cannot read duration of audio file {fpath}.\n", flush=True)
        utils.writeErrorLog(ex)
        return False
    
    results = {}

    # Status
    print(f"Analyzing {fpath}", flush=True)
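
For anyone who wants to try the two ideas outside of BirdNET, here is a minimal standalone sketch. It is not the BirdNET code: `wav_duration_seconds` and `should_process` are hypothetical helper names, and the standard-library `wave` module stands in for `audio.getAudioFileLength()`. It only shows the ordering (cheap existence check first, guarded duration read second).

```python
import os
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (stdlib stand-in, hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def should_process(audio_path: str, result_path: str, skip_existing: bool = True) -> bool:
    """Return True only if the file still needs analysis and its duration can be read."""
    # Cheap check first: nothing else runs for files that already have a result.
    if skip_existing and os.path.exists(result_path):
        print(f"Skipping {audio_path} as it has already been analyzed", flush=True)
        return False

    # Guarded read: an empty file raises EOFError, a malformed one raises wave.Error.
    try:
        wav_duration_seconds(audio_path)
    except (wave.Error, EOFError, OSError) as ex:
        print(f"Error: Cannot read duration of audio file {audio_path}: {ex}", flush=True)
        return False

    return True
```

With this ordering, a folder that is mostly analyzed already only costs one `os.path.exists()` call per file, and a single unreadable recording is reported and skipped rather than aborting the run.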

Answer 1 · 2024-07-19T09:16:06.000Z

Looks good, I'll make the changes. In the future, if you want to, you can propose changes like these in a pull request yourself.