/MDS-ViTNet

We present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency prediction or eye-tracking. Our trained model achieves state-of-the-art results across several benchmarks.

Primary LanguagePython

Stargazers