TaoRuijie/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Python · MIT
Issues
- 0
How can the face sequences marked with green boxes be extracted for subsequent feature extraction?
#67 opened by hardlucky2023 - 5
request for clarification on evaluation code
#34 opened by Falmi - 0
- 3
Question about video FPS
#64 opened by wanghuii1 - 3
About computing the speaker probability
#63 opened by kyin2905 - 0
Extract face region, timestamp of each unique face appearance, and active-speaker status from a video, in JSON or any format
#65 opened by 808Code - 1
- 1
Question about porting the code to Windows
#60 opened by JOKER-rf - 4
About ColumbiaASD dataset
#52 opened by Ontheway361 - 0
Auto Cropping using TalkNet (like Opus.pro)
#61 opened by mvoodarla - 2
Demo with Visualization
#55 opened by mvoodarla - 1
How to annotate the AVA dataset?
#58 opened by rosebbb - 1
Can I change the ffmpeg commands to OpenCV?
#59 opened by mhmd-mst - 3
Question about repeated calls to the model by using same duration multiple times in parameter durationset
#51 opened by gomingchen - 1
Identifying speaker change positions
#53 opened by ashu5644 - 1
Can I use a lower FPS to get things done?
#57 opened by kaka1909 - 1
- 3
About real-time detection
#42 opened by henrycjh - 2
No video attached to the video_out.avi
#54 opened by TheMakerOfWorlds - 1
Minimum length of the audio and video feature
#50 opened by rosebbb - 2
- 7
- 7
Question Regarding Online Inference
#45 opened by hsato1 - 1
No such file or directory: 'demo\\001\\pycrop\\demo\\001\\pycrop\\00000.wav'
#46 opened by 2018212596 - 3
Overlapped speech performance
#43 opened by jayakrishnanmm - 1
Bug in audio loss computation?
#47 opened by SAGNIKMJR - 2
Evaluation score is 43.0% mAP
#44 opened by DevKiHyun - 6
Getting NaN values for prediction
#41 opened by AdhamKhalifa - 1
- 4
During TalkNet training, why is only the first element of audioFeature, visualFeature, and labels taken from each minibatch? Doesn't that amount to training on a single video at a time?
#39 opened by Ontheway361 - 4
ASD confidence/score
#37 opened by dberghi - 6
Loss is relatively high on the TalkSet dataset
#38 opened by coreeey - 2
Problem testing on TalkSet
#36 opened by coreeey - 16
audio input size
#32 opened by Falmi - 1
Clarification on image normalization.
#35 opened by Falmi - 2
100% mAP on test set
#33 opened by xiang-burlington - 2
Question 1 about the loss definition
#31 opened by xiejiachen - 4
- 1
h264 mmco: unref short failure
#29 opened by xiejiachen - 0
trouble with extract_audio_clips
#28 opened by xiejiachen - 2
Talk Dataset
#27 opened by xiejiachen - 1
- 2
TalkSet data
#26 opened by junwenxiong - 2
How to download sfd_face.pth?
#24 opened by songhaozhen - 2
- 3
- 2
- 1
"Out of memory" issue
#21 opened by xiang-burlington - 2
Missing License file
#19 opened by eek - 2
Generation of TalkSet/lists_in
#18 opened by zjr954