LAVA: Data Valuation without Pre-Specified Learning Algorithms
This repository is the official implementation of "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023). We propose LAVA, a novel model-agnostic framework for data valuation based on a non-conventional, class-wise Wasserstein discrepancy. We further introduce an efficient way to measure each datapoint's contribution at no additional cost, directly from the solution of the underlying optimization problem.
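As a rough, self-contained illustration of the idea (this is not the repository's API), the sketch below solves a small entropic optimal transport problem between a toy training set and a toy validation set, then turns the training-side dual potentials into per-datapoint scores. Several things here are assumptions made for illustration: the function names `sinkhorn_potentials` and `calibrated_gradients` are hypothetical, the cost is a plain squared Euclidean distance on 1-D points rather than the class-wise feature-label cost, and the calibration step (each potential minus the mean of the others) follows our reading of the paper rather than the code in this repo.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sinkhorn_potentials(cost, eps=0.1, n_iters=200):
    """Log-domain Sinkhorn with uniform marginals; returns dual potentials (f, g).

    Hypothetical helper for illustration only; not part of the lava package.
    """
    n, m = len(cost), len(cost[0])
    log_a, log_b = -math.log(n), -math.log(m)
    f, g = [0.0] * n, [0.0] * m
    for _ in range(n_iters):
        # Alternate soft-min updates of the two dual potentials.
        f = [-eps * logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [-eps * logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
             for j in range(m)]
    return f, g

def calibrated_gradients(f):
    """Score each point as f_i minus the mean of the other potentials.

    Subtracting the mean of the others removes the arbitrary additive
    constant in the dual solution (assumed form of the calibration).
    """
    n, total = len(f), sum(f)
    return [f_i - (total - f_i) / (n - 1) for f_i in f]

# Toy data: the last training point lies far from the validation set.
train = [0.0, 0.1, 0.2, 5.0]
val = [0.0, 0.1, 0.2, 0.15]
cost = [[(x - y) ** 2 for y in val] for x in train]

f, _ = sinkhorn_potentials(cost)
scores = calibrated_gradients(f)
worst = max(range(len(scores)), key=scores.__getitem__)
print("scores:", scores)
print("most detrimental training index:", worst)
```

Under these assumptions, a larger score suggests that up-weighting the point would increase the distance to the validation set, so the point is a candidate for low-quality data. The scores come from the potentials the solver already produced, with no retraining of any model.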
Getting Started
import lava
Coming Soon.
Examples
To illustrate how LAVA is applied to data valuation, we provide examples on CIFAR-10 and STL-10.
Checkpoints
The pretrained embedders are included in the folder 'checkpoint'.
Optimal Transport Solver
This repo relies on the OTDD implementation to compute the class-wise Wasserstein distance.
We are immensely grateful to the authors of that project.