andrewowens/multisensory

Question about training

Opened this issue · 10 comments

This is really amazing work. It seems you didn't share the training code, such as for getting CAMs, action recognition, and audio-visual separation. I don't know how to train the models; could you add the training code?

Following this issue!

In case it helps, the training code is all there (see the train() function in sourcesep.py and shift_net.py). You'll just have to rewrite the I/O code. This involves rewriting the read_data function to read a batch of data from your dataset. My own I/O code uses TFRecord files, and I've provided it here as well (albeit without documentation). It'd probably be easier to just rewrite it, though.
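
In case a concrete starting point helps, here is a minimal sketch of what that rewrite could look like. This is not the repo's actual code: `pr.batch_size` and the one-`.npz`-per-clip layout are assumptions for illustration, and the shapes and value ranges still have to match what the model expects (see the parameter object `pr` and the existing parsing code in sep_dset.py).

```python
import numpy as np

def read_data(pr, clip_paths):
    """Sketch of a read_data replacement: load one .npz file per clip,
    each holding a 'frames' array of shape (T, H, W, 3) and a 'samples'
    waveform array of shape (N, 1)."""
    ims, snds = [], []
    for path in clip_paths[:pr.batch_size]:
        clip = np.load(path)
        ims.append(clip['frames'].astype(np.float32))
        snds.append(clip['samples'].astype(np.float32))
    # Stack into batch tensors: (B, T, H, W, 3) and (B, N, 1).
    return np.stack(ims), np.stack(snds)
```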

In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?
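
For background, a TFRecord file is just a sequence of serialized tf.train.Example protos, so a generic writer looks like the sketch below. The feature keys ('frames', 'sound') and the my_clips() iterator are placeholders, not this repo's actual schema — the keys it actually parses are defined by the reading code in sep_dset.py and need to be matched there.

```python
import tensorflow as tf

def _bytes_feature(value):
    # Wrap a raw bytes value (e.g. encoded frames or PCM audio) as a Feature.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# Write one Example per training clip.
with tf.io.TFRecordWriter('train-00000.tf') as writer:
    for frames_bytes, audio_bytes in my_clips():  # hypothetical iterator over clips
        ex = tf.train.Example(features=tf.train.Features(feature={
            'frames': _bytes_feature(frames_bytes),  # placeholder key
            'sound': _bytes_feature(audio_bytes),    # placeholder key
        }))
        writer.write(ex.SerializeToString())
```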

yxixi commented

> In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?

Hi! Do you know how to train the model now? I really need some help, thanks!

yxixi commented

> This is really amazing work. It seems you didn't share the training code, such as for getting CAMs, action recognition, and audio-visual separation. I don't know how to train the models; could you add the training code?

Sorry to bother you. Do you know the right way to train the models successfully now?

Can you please provide a link to the dataset you used for training?
Could you also provide the steps to retrain on a new dataset?

> In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?

> Hi! Do you know how to train the model now? I really need some help, thanks!

```python
import sourcesep, sep_params

# Build the 'full' source-separation parameter set for a 2.135 s clip.
clip_dur = 2.135
pr = sep_params.full(vid_dur=clip_dur)

# Launch training; the positional arguments (GPU index 0 plus three flags)
# follow sourcesep.train's signature -- see that function for their meaning.
sourcesep.train(pr, 0, False, False, False)
```

yxixi commented

Thank you for your help! I noticed TFRecords were used to train this model. Do you know how to create the TFRecord files? Looking forward to your reply!

After reading the comments above, I noticed the author said the I/O code needs to be rewritten. If I rewrite the I/O code, should I read the video and audio data separately and then feed them to the two branch networks?

When I rewrite the I/O code, what details do I need to pay attention to?
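
One detail worth checking, as a rough sketch: each clip's frames and waveform must cover the same duration, so their lengths are tied to vid_dur. The frame rate and sample rate below are assumptions, not values verified against this repo; check the parameter object for the real ones.

```python
# Alignment check for one clip (assumed rates; verify against the repo's params).
fps = 29.97        # assumed video frame rate
samp_sr = 21000.0  # assumed audio sample rate
clip_dur = 2.135   # vid_dur from the training snippet above

n_frames = int(round(fps * clip_dur))       # ~64 frames per clip
n_samples = int(round(samp_sr * clip_dur))  # 44835 audio samples per clip
```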