TaoRuijie/TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python · MIT

Issues

How can the face sequences marked with green boxes be extracted for subsequent feature extraction?
#67 opened 2 months ago by hardlucky2023
0
request for clarification on evaluation code
#34 opened 2 years ago by Falmi
5
About the ablation experiments
#66 opened 5 months ago by JOKER-rf
0
Question about video FPS
#64 opened 6 months ago by wanghuii1
3
About the calculation of the speaker probability
#63 opened 7 months ago by kyin2905
3
Extract face region, timestamp of each unique face appearance, and active-speaker status from a video, in JSON or any format
#65 opened 6 months ago by 808Code
0
Update to PySceneDetect 0.6 - change from VideoManager to open_video
#62 opened 6 months ago by pingaaron
1
Question about porting the code to Windows
#60 opened 6 months ago by JOKER-rf
1
About ColumbiaASD dataset
#52 opened a year ago by Ontheway361
4
Auto Cropping using TalkNet (like Opus.pro)
#61 opened 9 months ago by mvoodarla
0
Demo with Visualization
#55 opened a year ago by mvoodarla
2
How to annotate the AVA dataset?
#58 opened 10 months ago by rosebbb
1
Can I change the ffmpeg commands to OpenCV?
#59 opened 10 months ago by mhmd-mst
1
Question about repeated calls to the model by using same duration multiple times in parameter durationset
#51 opened a year ago by gomingchen
3
Identifying speaker change positions
#53 opened a year ago by ashu5644
1
Can I use less FPS to make things done?
#57 opened a year ago by kaka1909
1
Is it possible to run this on CPU only, without cuda?
#56 opened a year ago by KhalilAmor
1
About real-time detection
#42 opened 2 years ago by henrycjh
3
No video attached to the video_out.avi
#54 opened a year ago by TheMakerOfWorlds
2
Minimum length of the audio and video feature
#50 opened a year ago by rosebbb
1
How long does it take to train from scratch on Talkset and AVA datasets?
#49 opened a year ago by eshoyuan
2
Question about window length and hop size for spectrogram
#48 opened 2 years ago by DevKiHyun
7
Question Regarding Online Inference
#45 opened 2 years ago by hsato1
7
No such file or directory: 'demo\\001\\pycrop\\demo\\001\\pycrop\\00000.wav'
#46 opened 2 years ago by 2018212596
1
Overlapped speech performance
#43 opened 2 years ago by jayakrishnanmm
3
Bug in audio loss computation?
#47 opened 2 years ago by SAGNIKMJR
1
Evaluation score is 43.0% mAP
#44 opened 2 years ago by DevKiHyun
2
Getting NaN values for prediction
#41 opened 2 years ago by AdhamKhalifa
6
Is it possible to use different fps and sample rate than 25 and 16000
#40 opened 2 years ago by asrlhhh
1
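When training TalkNet, why does the data read in each minibatch take only the first element of audioFeature, visualFeature, and labels? Doesn't that amount to training on just one video at a time?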
#39 opened 2 years ago by Ontheway361
4
ASD confidence/score
#37 opened 2 years ago by dberghi
4
Loss is relatively large on the TalkSet dataset
#38 opened 2 years ago by coreeey
6
Problem when testing on TalkSet
#36 opened 2 years ago by coreeey
2
audio input size
#32 opened 2 years ago by Falmi
16
Clarification on image normalization.
#35 opened 2 years ago by Falmi
1
100% mAP on test set
#33 opened 3 years ago by xiang-burlington
2
Question 1 about the loss definition
#31 opened 3 years ago by xiejiachen
2
Could you please explain the detailed meaning of the CSV file?
#30 opened 3 years ago by xiejiachen
4
h264 mmco: unref short failure
#29 opened 3 years ago by xiejiachen
1
trouble with extract_audio_clips
#28 opened 3 years ago by xiejiachen
0
Talk Dataset
#27 opened 3 years ago by xiejiachen
2
dataset
#20 opened 3 years ago by simonzfei
1
TalkSet data
#26 opened 3 years ago by junwenxiong
2
How to download sfd_face.pth?
#24 opened 3 years ago by songhaozhen
2
"List index out of range" error during evaluation
#22 opened 3 years ago by xiang-burlington
2
Suggestion for adding face recognition in this project
#25 opened 3 years ago by Cuecute
3
FileNotFoundError After Running python demoTalkNet.py --videoName 001 Command
#23 opened 3 years ago by Cuecute
2
"Out of memory" issue
#21 opened 3 years ago by xiang-burlington
1
Missing License file
#19 opened 3 years ago by eek
2
Generation of TalkSet/lists_in
#18 opened 3 years ago by zjr954
2