The generated VTT file contains unnecessary blank lines.
Closed this issue · 17 comments
I want to display subtitles with the same name while playing MP3 files in the MPV software. However, the VTT subtitle file generated using the edge-tts command contains extra blank lines, causing the subtitle file to fail to display.
Here are the details.
mpv-player/mpv#15352
Additionally, the temporary .mp3 and .vtt file names generated by edge-playback are different and randomly assigned. This makes it impossible to play subtitles directly. Although parameters can be added to control this, it is not very convenient. I suggest generating files with the same name by default.
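To illustrate the "same name by default" suggestion, here is a minimal sketch of deriving both output paths from one shared base name, so that a player which looks for a subtitle file with the same name as the audio file can find it. The helper `paired_paths` is hypothetical and is not part of edge-tts or edge-playback.

```python
from pathlib import Path
from typing import Tuple


def paired_paths(directory: str, stem: str) -> Tuple[Path, Path]:
    """Return (audio_path, subtitle_path) that differ only in extension.

    Hypothetical helper for illustration; not an edge-tts API.
    """
    base_dir = Path(directory)
    # Build both names from the same stem so the files stay paired.
    return base_dir / f"{stem}.mp3", base_dir / f"{stem}.vtt"


mp3_path, vtt_path = paired_paths("out", "hello")
print(mp3_path.name, vtt_path.name)
```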
Can you try the master branch?
I just started using GitHub and I'm not very familiar with it yet. Could you tell me how to use it? Thank you!
pip install https://github.com/rany2/edge-tts/archive/refs/heads/master.zip
@rany2 It's still the same issue, no changes.
Try:
edge-playback --text "我记得我老爸手画的我家祖宅的四合院,好像是两进。不过他画的一塌糊涂,我愣是看不懂我祖父和他们兄弟四个是怎么住的。后来回祖宅去看了一眼,那早就被改的一塌糊涂了,完全看不明白原来的格局。" --voice zh-CN-XiaoxiaoNeural
@ichat006 Very odd, it's fixed for me. Maybe GitHub was serving a cached master zip? Try again in a bit.
@rany2
Subtitles can now be displayed, but they are shown one character at a time, which is not reasonable. Before this fix, the generated subtitle file contained unnecessary blank lines. After manually removing the blank lines, I got the following correct file, which meets the requirements.
hello.vtt-ok.zip
The following is the currently generated, unreasonable subtitle file.
hello.vtt-Now.zip
I'm aware, I'm trying to figure out how to create proper subtitles but unfortunately it's not an easy task and I need help with it.
Would just persisting the character for a few seconds so it doesn't disappear immediately work?
The biggest issue is that the WordBoundary data Microsoft returns doesn't necessarily match the input text. Microsoft internally transforms the input text to expand acronyms, numbers, etc., so it's not as simple as matching the input text against the WordBoundary events.
The subtitle files generated by previous versions of edge-tts were mostly correct, except for some extra blank lines. After I manually deleted the blank lines, they became completely correct and could be displayed properly in the MPV window. Please see the demonstration below.
you can automatically fix any issues such as extra blank lines by just converting subs with ffmpeg
ffmpeg -i hello.vtt hellofixed.vtt
or from one format to another
ffmpeg -i hello.vtt hellofixed.srt
@mrfragger It's weird, because the extra blank lines issue only comes up on Windows. At any rate, the issue is fixed in master!
#335 lists the potential options for making SubMaker more useful. If someone is interested, they can help develop a solution that meets the second criterion:
Using the input text to generate the subtitles by matching words from WordBoundary with the input text, i.e., recovering the lost punctuation, etc.
It is the solution I want to implement but it is extremely complicated.
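For contrast with the text-matching approach described above, a much simpler option is to group WordBoundary-style events into cues of a fixed number of words, without recovering punctuation. The sketch below is an illustration only: it assumes each event is an `(offset, duration, text)` tuple with times in 100-nanosecond ticks, and the names `format_ts` and `words_to_vtt` are hypothetical rather than SubMaker's actual API.

```python
from typing import List, Tuple

# (offset, duration, text); the 100 ns tick unit is an assumption.
Word = Tuple[int, int, str]


def format_ts(ticks: int) -> str:
    """Convert 100 ns ticks to a WebVTT timestamp (hh:mm:ss.mmm)."""
    ms = ticks // 10_000
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def words_to_vtt(words: List[Word], words_per_cue: int = 10) -> str:
    """Group word events into cues of at most `words_per_cue` words."""
    parts = ["WEBVTT", ""]
    for i in range(0, len(words), words_per_cue):
        chunk = words[i:i + words_per_cue]
        start = chunk[0][0]
        end = chunk[-1][0] + chunk[-1][1]  # end of the last word in the cue
        parts.append(f"{format_ts(start)} --> {format_ts(end)}")
        parts.append(" ".join(text for _, _, text in chunk))
        parts.append("")  # exactly one blank line closes each cue
    return "\n".join(parts)


sample = [(1_000_000, 5_000_000, "我"), (6_000_000, 4_000_000, "记得")]
print(words_to_vtt(sample))
```

Fixed-size grouping keeps each cue on screen long enough to read, but it still drops punctuation and can split a phrase across two cues, which is why matching against the input text remains the harder and more complete solution.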
I installed the latest version 6.1.9, but subtitles are not displaying. The VTT file still contains extra blank lines, as shown below.
WEBVTT
00:00:00.100 --> 00:00:02.237
我 记得 我 老爸 手 画 的 我 家 祖宅
00:00:02.237 --> 00:00:06.375
的 四合院 好像 是 两进 不过 他 画 的 一塌糊涂
00:00:06.638 --> 00:00:09.188
我 愣是 看 不 懂 我 祖父 和 他们 兄弟
00:00:09.188 --> 00:00:12.637
四个 是 怎么 住 的 后来 回 祖宅 去 看
00:00:12.637 --> 00:00:15.738
了 一 眼 那 早就 被 改 的 一塌糊涂 了
00:00:16.075 --> 00:00:18.100
完全 看 不 明白 原来 的 格局
you can automatically fix any issues such as extra blank lines by just converting subs with ffmpeg
ffmpeg -i hello.vtt hellofixed.vtt
or from one format to another
ffmpeg -i hello.vtt hellofixed.srt
These commands can solve the problem, but they add extra steps. It would be better if this could be implemented directly in edge-tts.
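As a rough idea of what doing this without ffmpeg could look like, the following sketch removes stray blank lines from a VTT string and re-inserts a single blank line before each timing line. It assumes the file has no cue identifiers or NOTE blocks, and it treats stray carriage returns (for example `\r\r\n` line endings, which is only a guess at what happens on Windows) as line breaks. The function `normalize_vtt` is hypothetical and is not part of edge-tts.

```python
def normalize_vtt(text: str) -> str:
    """Collapse extra blank lines so each cue is timing line + text.

    Assumes no cue identifiers or NOTE blocks (illustrative sketch only).
    """
    # Treat every carriage return as a line break, then drop empty lines.
    lines = [ln.strip() for ln in text.replace("\r", "\n").split("\n")]
    lines = [ln for ln in lines if ln]

    out = []
    for ln in lines:
        # A timing line starts a new cue: separate it with one blank line.
        if "-->" in ln and out:
            out.append("")
        out.append(ln)
    return "\n".join(out) + "\n"


broken = "WEBVTT\r\r\n\r\r\n00:00:00.100 --> 00:00:02.237\r\r\n\r\r\nhello\r\r\n"
print(normalize_vtt(broken))
```

If the result is written back to disk from Python, opening the file with `newline=""` or `newline="\n"` prevents the platform from translating line endings again.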