speechbrain/speechbrain

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

shripadk opened this issue · 2 comments

Describe the bug

This seems to be a regression of #2137.

Traceback (most recent call last):
  File "/pkg/modal/_container_io_manager.py", line 488, in handle_input_exception
    yield
  File "/pkg/modal/_container_entrypoint.py", line 238, in run_input
    async for value in res:
  File "/pkg/modal/_asgi.py", line 127, in fn
    app_task.result()  # consume/raise exceptions if there are any!
  File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 706, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 235, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 161, in run_endpoint_function
    return await dependant.call(**values)
  File "/root/service.py", line 129, in v1_transcribe_mandarin_endpoint
    {"transcription": v1_transcribe_mandarin.remote(request.file)}
  File "/pkg/synchronicity/synchronizer.py", line 531, in proxy_method
    return wrapped_method(instance, *args, **kwargs)
  File "/pkg/synchronicity/combined_types.py", line 28, in __call__
    raise uc_exc.exc from None
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/root/service.py", line 119, in v1_transcribe_mandarin
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 81, in transcribe_file
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 145, in transcribe_batch
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 112, in encode_batch
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/nnet/containers.py", line 191, in forward
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/utils/autocast.py", line 77, in wrapper
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 121, in decorate_fwd
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/lobes/features.py", line 146, in forward
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 538, in forward
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 679, in _create_fbank_matrix
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 610, in _triangular_filters
  File "<ta-01HVY76BS3NKZ7PMG3714WJBF9>:/usr/local/lib/python3.10/site-packages/torch/utils/_device.py", line 77, in __torch_function__
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Traceback (most recent call last):
  File "/pkg/modal/_container_io_manager.py", line 488, in handle_input_exception
    yield
  File "/pkg/modal/_container_entrypoint.py", line 128, in run_input
    res = imp_fun.fun(*args, **kwargs)
  File "/root/service.py", line 119, in v1_transcribe_mandarin
    result = asr_model.transcribe_file(path)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 81, in transcribe_file
    predicted_words, predicted_tokens = self.transcribe_batch(
  File "/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 145, in transcribe_batch
    encoder_out = self.encode_batch(wavs, wav_lens)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/inference/ASR.py", line 112, in encode_batch
    encoder_out = self.mods.encoder(wavs, wav_lens)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/nnet/containers.py", line 191, in forward
    x = layer(x)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/utils/autocast.py", line 77, in wrapper
    return wrapped_fwd(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 121, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/lobes/features.py", line 146, in forward
    fbanks = self.compute_fbanks(mag)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 538, in forward
    fbank_matrix = self._create_fbank_matrix(f_central_mat, band_mat).to(
  File "/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 679, in _create_fbank_matrix
    fbank_matrix = self._triangular_filters(
  File "/usr/local/lib/python3.10/site-packages/speechbrain/processing/features.py", line 610, in _triangular_filters
    fbank_matrix = torch.max(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_device.py", line 77, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Expected behaviour

transcribe_file should run on the GPU without throwing a RuntimeError.

To Reproduce

Link to notebook:

https://colab.research.google.com/drive/1ibFYTIU-TvjGfdn960U79zbqw5DYgCvm?usp=sharing
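
For reference, a minimal sketch of the failing call path outside the notebook. The model source and audio file below are assumptions (the Mandarin model is presumably the AISHELL transformer checkpoint), not the exact notebook contents; any SpeechBrain HF model whose scorer layers are not listed under modules should hit the same device mismatch.

# Minimal reproduction sketch (model source and audio path are assumptions).
from speechbrain.inference.ASR import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-transformer-aishell",
    savedir="pretrained_models/asr-transformer-aishell",
    run_opts={"device": "cuda"},  # the modules are moved to cuda:0
)

# Raises "Expected all tensors to be on the same device, but found at least
# two devices, cuda:0 and cpu!" inside encode_batch, because part of the
# graph is still on the CPU (see the discussion below).
print(asr_model.transcribe_file("example-zh.wav"))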

Environment Details

torch==2.1.0
torchaudio==2.1.0
torchvision==0.16.0
sox
speechbrain
sentencepiece

Relevant Log Output

No response

Additional Context

No response

@Adel-Moumen the reason for this error is a combination of the YAML and the scorer not being part of the decoder variable. It may actually affect all the HF models that use a scorer. The CTC linear layer used by the scorer of this model is never added to the modules, and only the modules are sent to the device.

Just adding the following to modules:

ctc_lin: !ref <ctc_lin>

will solve the issue. @Adel-Moumen, this must be checked for all models with a scorer.
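
Until the YAML is patched, a possible user-side workaround, following the diagnosis above, is to move the missing layer to the model's device by hand. This is only a sketch: it assumes the loaded hparams expose a ctc_lin entry, and attribute names may differ per recipe.

from speechbrain.inference.ASR import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-transformer-aishell",  # assumption, see above
    run_opts={"device": "cuda"},
)

# Anything defined in the YAML but not listed under `modules` is not moved
# by run_opts, so push it to the model's device manually if it exists.
if hasattr(asr_model.hparams, "ctc_lin"):
    asr_model.hparams.ctc_lin.to(asr_model.device)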

Hi, I think this was only related to this specific YAML. I checked the other YAMLs, and most of them wrap the ctc_lin inside:

asr_model: !new:torch.nn.ModuleList
    - [!ref <CNN>, !ref <Transformer>, !ref <seq_lin>, !ref <ctc_lin>]

which is referenced in the modules, so it is moved to the device along with everything else.
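
As a quick illustration of why that wrapping is sufficient: Module.to() moves submodules recursively, so a ctc_lin held by a ModuleList that is listed under modules follows the device move. A toy example (not the recipe code; requires a CUDA build of PyTorch):

import torch

ctc_lin = torch.nn.Linear(256, 5000)      # stand-in for the CTC head
wrapper = torch.nn.ModuleList([ctc_lin])  # what the other YAMLs do

wrapper.to("cuda")                        # moving the wrapper moves its children
print(next(ctc_lin.parameters()).device)  # cuda:0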

I fixed the mentioned issue, and you can now use your GPU with this model.