speechbrain/speechbrain

Loading a fully trained brain (a brain that has finished training) and then evaluating it with brain.evaluate will cause a crash when using the hpopt context but not reporting in the test stage

avishaiElmakies opened this issue · 0 comments

Describe the bug

I used the finetuning tutorial to add Orion to my code.
I use the hpopt context at the start of the script.
I trained a model and it works well.
The problem is when I run the script again and it uses the trained model, now with hpopt disabled. I don't report the objective during the test stage (because I don't want to).
This means I get a crash when I just try to evaluate the model, since the reporter can't find the key.

It tries to report (to Orion) when I exit the context, even though hpopt is false.
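
For context, my recipe reports the objective only during validation, roughly like this (a sketch: the EmotionBrain class and the self.balanced_error attribute are stand-ins from my recipe, not SpeechBrain API):

import speechbrain as sb
import speechbrain.utils.hpopt as hp

class EmotionBrain(sb.Brain):  # sketch of my Brain subclass
    def on_stage_end(self, stage, stage_loss, epoch=None):
        # "balanced_error_rate" is the objective key passed to
        # hp.hyperparameter_optimization(); self.balanced_error is a
        # placeholder for however the metric gets computed
        stage_stats = {
            "loss": stage_loss,
            "balanced_error_rate": self.balanced_error,
        }
        if stage == sb.Stage.VALID:
            # reported only here, never in Stage.TEST, so an
            # evaluate-only run leaves the context without a result
            hp.report_result(stage_stats)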

Expected behaviour

This should not crash; it should just work like a normal recipe.

To Reproduce

yaml

########
# used to finetune a wavlm-large model on emotion from checkpoint
# created by avishai elmakies 15.1.2024
########



# Seed needs to be set at top of yaml, before objects with parameters are made
seed: 12
__set_seed: !apply:torch.manual_seed [12]
trial_id: null
output_folder: finetune_ed_wavlm_large
# eder_file: finetune_emo/eder.txt
save_folder: &id008 !ref <output_folder>/save
train_log: &id009 !ref <output_folder>/train_log.txt
local-rank: 0
distributed_launch: False
save_checkpoint: True

## some important constants
sample_rate: 16000
download_base_path: "../../models"
dataset_json: "../../data_jsons/seq_classification_emotion_dataset.json"
wav2vec2_hub: "microsoft/wavlm-large"
split_ratio: [0.8, 0.1, 0.1]
seq2seq: False
stratified_split: True
oversample: True
# can change this to use a different label encoder; if it doesn't exist, it will be created
label_encoder_path: !ref <save_folder>/label_encoder.ckpt
hpopt_mode: null
hpopt: null

encoder_dim: 1024
# Outputs
out_n_neurons: 4
# Dataloader options
# With data_parallel batch_size is split into N jobs
# With DDP batch_size is multiplied by N jobs
dataloader_options:
  batch_size: 1
  shuffle: true
  num_workers: 1    # 2 on linux but 0 works on windows
  drop_last: false
  pin_memory: true
  collate_fn: !name:speechbrain.dataio.batch.PaddedBatch

test_dataloader_opts:
  batch_size: 1
  collate_fn: !name:speechbrain.dataio.batch.PaddedBatch


num_epochs: 15

epoch_counter: &id007 !new:speechbrain.utils.epoch_loop.EpochCounter
  limit: !ref <num_epochs>


speed_perturb_aug: &id010 !new:speechbrain.augment.time_domain.SpeedPerturb
  orig_freq: !ref <sample_rate>


snr_low: 12
snr_high: 22

add_noise_aug: &id020 !new:speechbrain.augment.time_domain.AddNoise
  snr_low: !ref <snr_low>
  snr_high: !ref <snr_high>

drop_freq_low: 0
drop_freq_high: 1
drop_freq_count_low: 2
drop_freq_count_high: 5
drop_freq_width: 0.075

drop_freq_aug: &id021 !new:speechbrain.augment.time_domain.DropFreq
  drop_freq_low: !ref <drop_freq_low>   # Min frequency band dropout probability
  drop_freq_high: !ref <drop_freq_high>  # Max frequency band dropout probability
  drop_freq_count_low: !ref <drop_freq_count_low>  # Min number of frequency bands to drop
  drop_freq_count_high: !ref <drop_freq_count_high>  # Max number of frequency bands to drop
  drop_freq_width: !ref <drop_freq_width>  # Width of frequency bands to drop


drop_count_low: 1
drop_count_high: 5
drop_length_low: 0
drop_length_high: 2500

drop_chunk_aug: &id022 !new:speechbrain.augment.time_domain.DropChunk
  drop_count_low: !ref <drop_count_low>  # Min number of audio chunks to drop
  drop_count_high: !ref <drop_count_high>  # Max number of audio chunks to drop
  drop_length_low: !ref <drop_length_low>  # Min length of audio chunks to drop
  drop_length_high: !ref <drop_length_high>  # Max length of audio chunks to drop  

min_augmentations: 1
max_augmentations: 4
repeat_augment: 1
shuffle_augmentations: False
augment_prob: 0.5847

augmentation: !new:speechbrain.augment.augmenter.Augmenter
  parallel_augment: false
  concat_original: false
  min_augmentations: !ref <min_augmentations>
  max_augmentations: !ref <max_augmentations>
  repeat_augment: !ref <repeat_augment>
  shuffle_augmentations: !ref <shuffle_augmentations>
  augment_prob: !ref <augment_prob>
  augmentations: [
    !ref <speed_perturb_aug>,
    !ref <add_noise_aug>,
    !ref <drop_freq_aug>,
    !ref <drop_chunk_aug>
  ]

mean_norm: true
std_norm: true

input_norm: !new:speechbrain.processing.features.InputNormalization
    norm_type: sentence
    mean_norm: !ref <mean_norm>
    std_norm: !ref <std_norm>

wav2vec2: !new:speechbrain.lobes.models.huggingface_transformers.wavlm.WavLM
    source: !ref <wav2vec2_hub>
    output_norm: True
    freeze: False
    freeze_feature_extractor: True
    save_path: !ref <download_base_path>
    # output_all_hiddens: False

window_length: 1
stride: 1
pool_type: "max"

avg_pool: !new:speechbrain.nnet.pooling.Pooling1d
    pool_type: !ref <pool_type>
    kernel_size: !ref <window_length>
    stride: !ref <stride>
    ceil_mode: True

output_mlp: !new:speechbrain.nnet.linear.Linear
    input_size: !ref <encoder_dim>
    n_neurons: !ref <out_n_neurons>
    bias: False

log_softmax: !new:speechbrain.nnet.activations.Softmax
    apply_log: True

compute_cost: !name:emotion.finetune.utils.weighted_nll_loss

# can be used with the compute cost; probably better for DDP to work like this.
# should also know the labels in advance. can work with different labels and label encoders

s_weight: 1.2
h_weight: 3.0
n_weight: 1.0
a_weight: 2.0

weights:
  "s": !ref <s_weight>
  "h": !ref <h_weight>
  "n": !ref <n_weight>
  "a": !ref <a_weight>

modules:
    input_norm: !ref <input_norm>
    wav2vec2: !ref <wav2vec2>
    output_mlp: !ref <output_mlp>

opt_lr: 7.365e-05

opt_class: !name:torch.optim.Adam
  lr: !ref <opt_lr>

wav_lr: 7.105e-06

wav2vec2_opt_class: !name:torch.optim.Adam
  lr: !ref <wav_lr>

lr_annealing: &id005 !new:speechbrain.nnet.schedulers.NewBobScheduler
  initial_value: !ref <opt_lr>
  improvement_threshold: 0.0025
  annealing_factor: 0.8
  patient: 1

lr_annealing_wav2vec2: &id006 !new:speechbrain.nnet.schedulers.NewBobScheduler
  initial_value: !ref <wav_lr>
  improvement_threshold: 0.0025
  annealing_factor: 0.9
  patient: 1


checkpointer: !new:speechbrain.utils.checkpoints.Checkpointer
  checkpoints_dir: !ref <save_folder>
  recoverables:
    input_norm: !ref <input_norm>
    wav2vec2: !ref <wav2vec2>
    output_mlp: !ref <output_mlp>
    scheduler_model: !ref <lr_annealing>
    scheduler_wav2vec: !ref <lr_annealing_wav2vec2>
    counter: !ref <epoch_counter>

train_logger: !new:speechbrain.utils.train_logger.FileTrainLogger
    save_file: !ref <train_log>

error_stats: !name:speechbrain.utils.metric_stats.ClassificationStats
balanced_error: !name:emotion.finetune.utils.BalancedErrorMetric

Some code:

with hp.hyperparameter_optimization(objective_key="balanced_error_rate") as hp_ctx:
    hparams_file, run_opts, overrides = hp_ctx.parse_arguments(sys.argv[1:])
    .....
    if TEST in datasets:
        brain.evaluate(
            test_set=datasets[TEST],
            min_key="balanced_error_rate",
            test_loader_kwargs=hparams["test_dataloader_opts"],
        )
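
A workaround would be to hand the context a result with the expected key right after evaluation, so __exit__ has something to report (a sketch; the 0.0 is a meaningless placeholder), but that clutters a plain evaluation run:

# hypothetical workaround, placed right after brain.evaluate(...):
# report a dummy result so __exit__ finds the objective key.
# The value carries no meaning in a test-only run.
hp.report_result({"balanced_error_rate": 0.0})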

Environment Details

Using the latest version of the develop branch from git.

Relevant Log Output

speechbrain.core - Exception:
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/cs/usr/avishai.elma/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/cs/labs/oabend/avishai.elma/src/emotion_src/emotion_finetune.py", line 53, in <module>
    main()
  File "/cs/labs/oabend/avishai.elma/src/emotion_src/emotion_finetune.py", line 42, in main
    brain.evaluate(
  File "/cs/labs/oabend/avishai.elma/speechbrain/speechbrain/utils/hpopt.py", line 402, in __exit__
    reporter.report_objective(self.result)
  File "/cs/labs/oabend/avishai.elma/speechbrain/speechbrain/utils/hpopt.py", line 154, in report_objective
    dict(result, objective=result[self.objective_key]), self.output
KeyError: 'balanced_error_rate'

Additional Context

balanced_error_rate is the objective I use during training.

class HyperparameterOptimizationContext:
    """
    A convenience context manager that makes it possible to conditionally
    enable hyperparameter optimization for a recipe.

    Arguments
    ---------
    reporter_args: list
        arguments to the reporter class
    reporter_kwargs: dict
        keyword arguments to the reporter class

    Example
    -------
    >>> ctx = HyperparameterOptimizationContext(
    ...     reporter_args=[],
    ...     reporter_kwargs={"objective_key": "error"}
    ... )
    """

    def __init__(self, reporter_args=None, reporter_kwargs=None):
        self.reporter_args = reporter_args or []
        self.reporter_kwargs = reporter_kwargs or {}
        self.reporter = None
        self.enabled = False
        self.result = {"objective": 0.0}

Setting

self.result = {"objective": 0.0}

is probably the problem here, since

def __exit__(self, exc_type, exc_value, traceback):
    if exc_type is None and self.result is not None:
        reporter = self.reporter
        if not reporter:
            reporter = get_reporter(
                DEFAULT_REPORTER,
                *self.reporter_args,
                **self.reporter_kwargs,
            )
        reporter.report_objective(self.result)
    _context["current"] = None

Setting

self.result = None

in __init__ fixes the issue, but I would like to make sure this doesn't affect anything else.
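
For reference, the two fixes I see (sketches against the develop code quoted above; I'm assuming self.enabled is only True when hpopt is actually active for the run):

# Option 1 (the change I tried): default to no result in __init__, so
# nothing is reported unless report_result() was called during the run.
self.result = None

# Option 2: guard __exit__ so it only reports when hpopt is enabled.
def __exit__(self, exc_type, exc_value, traceback):
    # assumes self.enabled is only set when --hpopt was passed
    if exc_type is None and self.enabled and self.result is not None:
        reporter = self.reporter
        if not reporter:
            reporter = get_reporter(
                DEFAULT_REPORTER,
                *self.reporter_args,
                **self.reporter_kwargs,
            )
        reporter.report_objective(self.result)
    _context["current"] = None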