OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Primary Language: Python