sharathadavanne/seld-net

Using the first of the 9 subdatasets in the ANSIM dataset, the entry code reports an error.

Qinstudy opened this issue · 12 comments

Hello @sharathadavanne! I would be grateful if you could take a look at my problem. I downloaded the first subdataset (ov1_split1) of your published ANSIM dataset.
Following the steps in the README.md on your GitHub, I modified batch_feature_extraction.py and cls_feature_class.py. The modified code is as follows:

cls_feature_class.py:
`class FeatureClass:
    def __init__(self, dataset='ansim', ov=3, split=1, nfft=1024, db=30, wav_extra_name='', desc_extra_name=''):

        if dataset == 'ansim':
            # self._base_folder = os.path.join('/wrk/adavanne/DONOTREMOVE', 'doa_data/')
            self._base_folder = 'E:/base_folder/'`

batch_feature_extraction.py:
`import cls_feature_class

dataset_name = 'ansim'  # Datasets: ansim, resim, cansim, cresim and real

for ovo in [1]:  # SE overlap. Change to [1] if you are only calculating the features for overlap 1.
    for splito in [1]:
        for nffto in [512]:
            feat_cls = cls_feature_class.FeatureClass(ov=ovo, split=splito, nfft=nffto, dataset=dataset_name)

            # Extract features and normalize them
            feat_cls.extract_all_feature()
            feat_cls.preprocess_features()

            # Extract labels in regression mode
            feat_cls.extract_all_labels('regr', 0)`

After running batch_feature_extraction.py, however, I get the following error: TypeError: 'float' object cannot be interpreted as an integer.
I debugged and traced through the code, but I still could not solve it. Thank you for your generous help, sharathadavanne.

Hi @Qinstudy, thanks for using my work.
Can you paste the actual error here, so that I can locate exactly where you are getting it?
Also, what version of Python are you using? I have only tested this code on Python 2.7.
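
For reference, this message is typical of Python 2 code run under Python 3: in Python 2, `/` between two integers returns an integer, while in Python 3 it returns a float, which `range()` and array-shape arguments then reject. The sketch below reproduces the message under that assumption. The thread does not show the traceback, so the actual failing line in cls_feature_class.py may differ, and `frames_py2_style` and `frames_portable` are hypothetical helpers, not functions from the repository.

```python
def frames_py2_style(num_samples, hop_len):
    # Hypothetical helper: integer division in Python 2, float result in Python 3.
    return num_samples / hop_len


def frames_portable(num_samples, hop_len):
    # Hypothetical helper: floor division returns an int for int inputs
    # under both Python 2 and Python 3.
    return num_samples // hop_len


def error_message(num_frames):
    # Use the value as a loop bound and return the error text, if any.
    try:
        for _ in range(num_frames):
            pass
    except TypeError as err:
        return str(err)
    return None


py2_style_error = error_message(frames_py2_style(44100, 512))
portable_error = error_message(frames_portable(44100, 512))
print(py2_style_error)  # under Python 3, the TypeError text from the issue
print(portable_error)   # None: the loop ran without error
```

If this is the cause, replacing `/` with `//` (or wrapping the result in `int()`) wherever the value is used as a size or an index should keep the code working under both Python versions.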

Thank you very much for your comments. My Python version is 3.6, so I think I need to switch to Python 2.7 and then see how it works. If there are still errors, I will ask you for advice.

Sure @Qinstudy let me know how it goes.

Sure, thank you for your kind help.

Hello @sharathadavanne, after I changed the Python version to 2.7, the code ran successfully. It only reports the error under Python 3, so thanks for your reminder! I would also like to know what the many tags you generate mean.
I also ran into problems installing tensorflow-gpu==1.10.1. Since TensorFlow for Python 2.7 is not supported under Windows, I would need to install tensorflow-gpu 1.10.1 on an Ubuntu system. Is this correct? By the way, how did you install tensorflow-gpu==1.10.1?

Glad it worked for you @Qinstudy

Let me know what specific tags you are interested in. I can explain them individually. Most of the parameters I am using are already described here.

Also, I am not really sure about the installation part, because we use a pre-configured environment at our university that is maintained by the system admins. Try installing a version of TensorFlow on Windows that is close to 1.10.1; I think it should be compatible.

I ran batch_feature_extraction.py, which generated the .npy files. I saved one of the .npy files in text format (label1_test_0_desc_30_100.txt); it has a size of (5166L, 33L). Here are my steps:

First, I modified the dataset path in cls_feature_class.py, then I modified the code in batch_feature_extraction.py. Running batch_feature_extraction.py generates the label data in .npy format. I loaded one of the .npy files (the one marked with the red arrow), and its shape is (5166L, 33L).
What does this shape mean? Thanks for your kind help!

The labels for every frame of the audio are saved in that .npy file. 5166 is the number of frames, and 33 is the label length. The ANSIM dataset has 11 sound classes, and each of the 11 classes is represented by a spatial coordinate in azimuth and elevation angles. So 11 sound event labels + 2 (azimuth + elevation angles) * 11 classes = 33. You can also find this if you check the code in some detail. For example this line.
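
As a rough illustration of that 11 + 2 * 11 = 33 layout, the sketch below slices a label matrix into its three parts. It assumes the columns are ordered as sound event activity first, then azimuth, then elevation; that ordering is an assumption here and should be checked against the label-extraction code. `split_labels` is a hypothetical helper, and the zero-filled array only stands in for a loaded .npy file.

```python
import numpy as np

NUM_CLASSES = 11  # number of sound classes in the ANSIM dataset


def split_labels(labels, num_classes=NUM_CLASSES):
    """Split a (frames, 3 * num_classes) label matrix into three blocks.

    Assumed column order (verify against the repository code):
    [event activity | azimuth | elevation], each block num_classes wide.
    """
    if labels.shape[1] != 3 * num_classes:
        raise ValueError("expected %d columns, got %d"
                         % (3 * num_classes, labels.shape[1]))
    sed = labels[:, :num_classes]
    azi = labels[:, num_classes:2 * num_classes]
    ele = labels[:, 2 * num_classes:]
    return sed, azi, ele


# Stand-in for np.load(<path to a label .npy file>) with the shape from above.
labels = np.zeros((5166, 33))
sed, azi, ele = split_labels(labels)
print(sed.shape, azi.shape, ele.shape)
```

Each of the three blocks has one row per frame and one column per sound class.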

@sharathadavanne Thank you for your excellent tips. I checked carefully: TensorFlow cannot be installed under Python 2.7 on Windows 10. Maybe I need to install a Linux system to install TensorFlow successfully. Thank you again for your help.

Hello @sharathadavanne, by modifying cls_feature_class.py and some other code, I can now run everything successfully under Python 3. I am trying to reproduce the results in your paper. Can you tell me how to visualize the DOA estimates over many time frames? For example, like the image you provided for ANSIM O1.

Hi @Qinstudy I am glad you got it working on Python 3!

I am sorry, there isn't a stand-alone script to visualize the output, but it should be straightforward to implement. You have the results for all the evaluation data in this line, and you know the format of this output, so all you need to do is plot them the way you want :) You can obviously write a standalone script that takes in a single audio recording and plots the results for you.
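
A minimal sketch of such a plotting script follows. It assumes the output has already been split into a per-frame, per-class activity matrix and a matching azimuth matrix (both of shape frames x classes), and that an activity above 0.5 means the class is active. `active_doa_points` and `plot_azimuth` are hypothetical helpers, not part of the repository, and the small arrays are made-up stand-ins for real predictions.

```python
import numpy as np


def active_doa_points(sed, azi, threshold=0.5):
    """Return (frame, class, azimuth) for every active frame/class pair."""
    frames, classes = np.nonzero(sed > threshold)
    return frames, classes, azi[frames, classes]


def plot_azimuth(sed, azi, out_path="doa_azimuth.png"):
    """Scatter azimuth against frame index, coloured by class, and save it."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    frames, classes, angles = active_doa_points(sed, azi)
    fig, ax = plt.subplots(figsize=(10, 3))
    points = ax.scatter(frames, angles, c=classes, cmap="tab20", s=4)
    ax.set_xlabel("Frame index")
    ax.set_ylabel("Azimuth (units as stored in the array)")
    fig.colorbar(points, ax=ax, label="Class index")
    fig.savefig(out_path, dpi=150)
    plt.close(fig)


# Made-up example: 3 frames, 2 classes.
sed = np.array([[1, 0], [1, 1], [0, 0]], dtype=float)
azi = np.array([[30, 0], [35, -90], [0, 0]], dtype=float)
frames, classes, angles = active_doa_points(sed, azi)
print(frames, classes, angles)
```

Calling `plot_azimuth(sed, azi)` writes the figure to disk. Running it once on the reference labels and once on the predictions gives two images that can be compared side by side, and the same approach works for elevation.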

@sharathadavanne Thank you very much for your continued help. Now that the code runs on Python 3, I can install TensorFlow for Python 3 under Windows. Your tips solved a lot of my problems. I will write a standalone script and plot the desired result.