MKLab-ITI/ndvr-dml

Availability of Testing code?

maida-shahid opened this issue · 5 comments

Thank you for sharing such great information on the training and evaluation of your model.
I was wondering: if we want to visualize which videos are near-duplicates of an input test video, how can this be done using your code?

Having a trained model, you can initialize a DNN object by providing the required arguments, and then use the embedding function to extract the final video representations. This is exactly what I am doing to evaluate the model's performance on the CC_WEB_VIDEO dataset, see here. Also, see the following example:

import numpy as np
from model import DNN  # the DNN class defined in this repository

features = np.load(path_to_features)  # pre-extracted CNN features
model = DNN(features.shape[1],        # dimensionality of the input features
            path_to_model,
            load_model=True,          # restore the trained weights
            trainable=False)          # inference only
video_embeddings = model.embeddings(features)

Once you have extracted the video embeddings, you may use any dimensionality reduction algorithm (e.g. PCA, t-SNE) to project the vectors into two or three dimensions. I would recommend using the same color for all near-duplicates of a particular query, in case you already have that information.
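A minimal sketch of such a visualization, assuming scikit-learn and matplotlib are installed; video_embeddings comes from the example above, and path_to_labels is a hypothetical file holding one integer label per video (the same label for a query and all of its near-duplicates):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Project the embeddings to 2 dimensions for plotting;
# sklearn.manifold.TSNE can be used the same way.
points = PCA(n_components=2).fit_transform(video_embeddings)

# Hypothetical labels: one integer per video, shared by a query
# and all of its known near-duplicates.
labels = np.load(path_to_labels)

plt.scatter(points[:, 0], points[:, 1], c=labels, cmap='tab10', s=10)
plt.title('Video embeddings (PCA projection)')
plt.show()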

Sorry for bothering you again. I have some other queries I'd like to ask:

  1. For triplet_generator.py, the only applicable datasets are VCDB and CC_WEB_VIDEO. If we want to test this model on our own video dataset, is the .npy feature file extracted with intermediate-cnn-features enough?
  2. To evaluate the CC_WEB_VIDEO dataset, a .pickle file is used. How can we generate that pickle file for our own test dataset?

For both of your queries, you have to create a dictionary similar to the one found in the cc_web_video.pickle file. It consists of several lists that hold information about the dataset. More precisely, it contains the following key-value pairs (a construction sketch follows the list):

  1. index: a list that contains the id of each video in the dataset. The index of each video in this list corresponds to the row in the output feature matrix of the global video descriptors. Empty entries are ignored.
  2. queries: a list that contains the indexes of the query video for each query set.
  3. ground_truth: a list that contains the indexes and labels of the videos in the query sets. Each entry is a dictionary that corresponds to a query set and contains key-value pairs of the video indexes and their labels with respect to the query video. Visit the CC_WEB_VIDEO website for more information regarding the labels.
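As a minimal sketch of how such a dictionary could be built for your own dataset (the video ids, query indexes, and label values below are illustrative placeholders; check cc_web_video.pickle for the exact label format used):

import pickle

dataset = {
    # one id per video; position i corresponds to row i of the
    # global feature matrix (empty entries are ignored)
    'index': ['video_0001', 'video_0002', 'video_0003', 'video_0004'],
    # the index of the query video for each query set
    'queries': [0],
    # one dict per query set, mapping video index -> label
    # with respect to the query video
    'ground_truth': [{1: 'E', 2: 'L', 3: 'X'}],
}

with open('my_dataset.pickle', 'wb') as f:
    pickle.dump(dataset, f)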

I would recommend going through the .pickle file to take a look at its composition. Also, I have made some changes to the format of the cc_web_video.pickle file, so make sure to pull the latest version of the code.
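For instance, a quick way to inspect it (using only the standard pickle module and the key names listed above):

import pickle

with open('cc_web_video.pickle', 'rb') as f:
    dataset = pickle.load(f)

print(dataset.keys())      # index, queries, ground_truth
print(dataset['queries'])  # indexes of the query videos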

Thanks!