Issues
Unsupported subtype: PCM_24
#3806 opened by nicobrb - 0
`kaldi.fbank` does not work with non-contiguous input when `snip_edges=False`
#3856 opened by gau-nernst - 1
Using MMS model with `star` token for batch size > 1
#3772 opened by huangruizhe - 0
Can anyone provide a real-time pretrain model for Visual Speech Recognition?
#3852 opened by bernie-122 - 2
Adopt aligner from "Huang et al., Less Peaky and More Accurate CTC Forced Alignment by Label Priors"
#3826 opened by dmitry-mli - 1
Not building CUDA 12.6
#3835 opened by johnnynunez - 0
How to train a real-time av-asr pretrain model
#3838 opened by Zhaninh - 0
torchaudio load opus failed
#3753 opened by Mddct - 1
Prebuilt binaries of torch.audio for aarch64 cuda
#3827 opened by chulkilee - 0
Ability to provide initial phase to Griffin-Lim
#3828 opened by aaron-dees - 7
Torchaudio is not detecting FFmpeg
#3789 opened by ruliworst - 7
StreamRead failing when Reading RTSP stream with CPU
#3798 opened by pedromoraesh - 0
torchaudio.transforms.Resample causes Float Point Exception
#3825 opened by zhc7 - 0
The seek functionality of StreamReader on the video stream does not return the correct frame if the start_time_stamp of the video stream is nonzero.
#3824 opened by w238liu - 1
Termux patch for default APT version of audio - for relative file paths, tilde (~) expansion not working in filepath for torchaudio.load()
#3802 opened by Manamama - 0
transforms.MFCC results in NaN values on Jetson Orin Nano
#3822 opened by frmser - 1
torchaudio.load not loading all the frames
#3762 opened by ashinkajay - 0
Division by zero in loudness calculation
#3816 opened by DanTremonti - 0
Division by zero in loudness calculation
#3815 opened by dhanvanth-pk-13760 - 0
Video reading: torchaudio.io.StreamReader seek method returns the first frame, regardless of the input start_timestep (on version 0.13.1)
#3813 opened by StolikTomer - 0
Loading failure errors should indicate what was being loaded when error occurred
#3810 opened by pokepress - 0
StreamReader.add_basic_video_stream drops last frame if `frame_rate` is specified
#3809 opened by tyler-rt - 0
NV12/YUV->RGB colour accuracy and CUDA
#3799 opened by gtebbutt - 4
`torchaudio.functional.lfilter` returns `nan` when processing sub-array but not for the whole input array.
#3807 opened by SuperKogito - 1
frame offset + num frames to utilize http range header
#3783 opened by mogwai - 2
MAC M3 audio backend no longer appearing
#3785 opened by tval2 - 2
RTSP with StreamReader
#3797 opened by pedromoraesh - 0
How to use my finetuned version of wav2vec2 for forced alignment as shown in example/
#3796 opened by omerarshad - 0
Packet passthrough support
#3795 opened by materight - 0
Real time synthesis with oscillator_bank
#3788 opened by peastman - 0
Failed to open output "-" (Invalid argument).
#3784 opened by liuliujiang - 1
Can not load commonvoice dataset on windows
#3781 opened by jacobjennings - 0
Cannot load audio from pathlib.Path
#3775 opened by roedoejet - 0
cherry-picks for 2.3
#3768 opened by ahmadsharif1 - 0
Windows CI is broken
#3767 opened by ahmadsharif1 - 3
Installation with pip git+ is broken
#3752 opened by gdagil - 6
I have some questions about RNNT loss.
#3750 opened by girlsending0