voxel51/eta

ETA needs Featurizer subclasses for each type of input data

brimoor opened this issue · 1 comment

For consistency with eta.core.learning, our Featurizer classes should declare what type of input they support. For example, someone implementing an eta.core.learning.VideoFramesClassifier may want to use a featurizer like C3DFeaturizer to generate features for each image tensor. However, C3DFeaturizer doesn't live in a class hierarchy that declares that it supports featurizing tensors, so we have no way to check at Config-parsing time whether a given featurizer is compatible with, say, a tensor-featurizer-plus-SVM-type model.
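To make the benefit concrete, here is a minimal standalone sketch of the kind of check a declared hierarchy would enable. The class names follow the proposal below; the `validate_featurizer` helper is hypothetical, not ETA's actual API:

```python
class Featurizer(object):
    """Base class for all featurizers (sketch)."""


class VideoFramesFeaturizer(Featurizer):
    """Declares support for featurizing tensors of video frames."""


class C3DFeaturizer(VideoFramesFeaturizer):
    """C3D would subclass the interface it supports."""


def validate_featurizer(featurizer_cls, required_cls):
    """Hypothetical helper: raise at Config-parsing time if the given
    featurizer class does not support the required input type."""
    if not issubclass(featurizer_cls, required_cls):
        raise TypeError(
            "%s does not support %s inputs"
            % (featurizer_cls.__name__, required_cls.__name__)
        )


# A model config that needs tensor features could validate up front:
validate_featurizer(C3DFeaturizer, VideoFramesFeaturizer)  # passes
```

With this in place, an incompatible featurizer fails when the Config is parsed rather than deep inside a pipeline at runtime.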

Following eta.core.learning conventions, we would implement:

eta.core.features.ImageFeaturizer: takes an image and returns a feature vector (VGG16Featurizer does this now)
eta.core.features.VideoFramesFeaturizer: takes an image tensor and returns a single feature vector (C3DFeaturizer can do this now)
eta.core.features.VideoFeaturizer: takes a video path and returns an embedding vector for the entire video (C3DFeaturizer can also do this now)
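The three interfaces above could be sketched as abstract base classes, in the style of eta.core.learning. The method names and signatures here are assumptions for illustration, not the final API:

```python
from abc import ABC, abstractmethod


class Featurizer(ABC):
    """Base featurizer interface (sketch)."""


class ImageFeaturizer(Featurizer):
    """Featurizes a single image (e.g., VGG16Featurizer)."""

    @abstractmethod
    def featurize(self, img):
        """Returns a feature vector for the given image."""


class VideoFramesFeaturizer(Featurizer):
    """Featurizes a tensor of video frames into a single feature
    vector (e.g., C3DFeaturizer)."""

    @abstractmethod
    def featurize(self, imgs):
        """Returns a feature vector for the given frame tensor."""


class VideoFeaturizer(Featurizer):
    """Featurizes an entire video given its path (e.g., C3DFeaturizer)."""

    @abstractmethod
    def featurize(self, video_path):
        """Returns an embedding vector for the given video."""
```

A class like C3DFeaturizer could then declare both of the interfaces it supports by multiple inheritance, and Config-parsing code could verify compatibility via issubclass checks.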

This raises the question of what to do with the existing eta.core.features.VideoFramesFeaturizer. It is a meta-ImageFeaturizer that featurizes each frame of a video and provides a cache-backed interface to access the results. This functionality could be grafted onto ImageFeaturizer so that all image featurizers automatically support video caching out-of-the-box.
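One way the grafting could look, as a standalone sketch: give ImageFeaturizer a frame-level entry point that consults a cache before invoking the subclass's featurize(). The in-memory dict cache and method names here are illustrative; the existing implementation uses an on-disk, cache-backed store:

```python
class ImageFeaturizer(object):
    """Base class for featurizers that process single images, with
    per-frame caching grafted on (sketch)."""

    def __init__(self):
        self._frame_cache = {}

    def featurize(self, img):
        """Subclasses implement single-image featurization."""
        raise NotImplementedError("subclasses must implement featurize()")

    def featurize_frame(self, frame_number, img):
        """Featurizes the given frame, reusing a cached result when the
        frame was already featurized."""
        if frame_number not in self._frame_cache:
            self._frame_cache[frame_number] = self.featurize(img)
        return self._frame_cache[frame_number]


class MeanFeaturizer(ImageFeaturizer):
    """Toy featurizer for demonstration: returns the mean pixel value."""

    def featurize(self, img):
        return sum(img) / float(len(img))
```

With this design, any ImageFeaturizer subclass (VGG16Featurizer included) gets cached per-frame video featurization for free, and the separate meta-class could be retired.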