Variance-aware Contextual Dueling Bandits
Primary Language: Python
Paper: Variance-aware Regret Bounds for Stochastic Contextual Dueling Bandits (OpenReview, arXiv)