This work "How to Leverage Diverse Demonstrations in Offline Imitation Learning" has been accepted by ICML'24.
We introduce a simple yet effective data selection method that identifies positive behaviors based on their *resultant states* -- a more informative criterion that enables explicit utilization of dynamics information and effective extraction of both expert and beneficial diverse behaviors. We further devise a lightweight behavior cloning algorithm that correctly leverages the expert and selected data. In the experiments, we evaluate our method on a suite of complex and high-dimensional offline IL benchmarks, including continuous-control and vision-based tasks. The results demonstrate that our method achieves state-of-the-art performance, outperforming existing methods on **20/21** benchmarks, typically by **2-5x**, while maintaining a comparable runtime to behavior cloning (`BC`).
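For intuition, the following is a minimal conceptual sketch of the selection idea stated above: score transitions in the supplementary (diverse) data by how close their resultant (next) states are to states visited in the expert demonstrations, and keep only the close ones before behavior cloning. This is not the ILID implementation; the Euclidean distance criterion and the threshold are illustrative assumptions.

```python
# Conceptual sketch only -- not the ILID implementation.
import numpy as np

def select_by_resultant_state(expert_states, supp_next_states, threshold=0.5):
    """Mask supplementary transitions whose resultant (next) state lies
    within `threshold` (Euclidean distance, illustrative) of any expert state."""
    # pairwise distances: (num_supp, num_expert)
    dists = np.linalg.norm(
        supp_next_states[:, None, :] - expert_states[None, :, :], axis=-1
    )
    return dists.min(axis=1) < threshold

# toy usage with random data
rng = np.random.default_rng(0)
expert_states = rng.normal(size=(100, 4))       # states from expert trajectories
supp_next_states = rng.normal(size=(500, 4))    # next states of supplementary transitions
mask = select_by_resultant_state(expert_states, supp_next_states)
print(f"selected {mask.sum()} of {len(mask)} supplementary transitions")
```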
- Python == 3.7 (Anaconda or Miniconda is recommended)
- PyTorch == 1.8.1
- MuJoCo == 2.3.6
- NVIDIA GPU (RTX A6000) + CUDA 11.1
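For reference, a matching environment can be created along the following lines. This is a suggested setup rather than an official script from this repository; the environment name `ilid` and the exact install commands are illustrative and may need to be adjusted for your machine.

```
conda create -n ilid python=3.7
conda activate ilid
# PyTorch 1.8.1 built against CUDA 11.1
pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
# MuJoCo Python bindings
pip install mujoco==2.3.6
```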
- Clone repo

```
git clone https://github.com/HansenHua/ILID-offline-imitation-learning.git
cd ILID-offline-imitation-learning
```
- Install dependencies

```
pip install -r requirement.txt
```
Get the usage information of the project:

```
cd code
python main.py -h
```
We provide the complete training code for ILID, which you can adapt to your own needs.
```
python main.py
```
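As a rough reference for adapting the code, below is a minimal, hypothetical behavior-cloning loop over a combined expert + selected dataset. It is only a sketch of the idea described in the abstract, not the training code in this repository; the network sizes, dimensions, and random placeholder tensors are illustrative assumptions.

```python
# Hypothetical sketch -- NOT the training code of this repository.
# Behavior cloning on expert data merged with the selected diverse data.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2          # illustrative dimensions
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

# placeholders standing in for (state, action) pairs from expert + selected data
states = torch.randn(1024, state_dim)
actions = torch.randn(1024, action_dim)

for epoch in range(10):
    pred = policy(states)
    loss = ((pred - actions) ** 2).mean()   # simple MSE behavior-cloning loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```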
The log files will be stored in the [ILID-offline-imitation-learning](https://github.com/HansenHua/ILID-offline-imitation-learning) project directory.
Illustration
We also provide illustrations of our model's performance. The videos are stored in ILID-offline-imitation-learning/performance.
If you have any questions, please email xingyuanhua@bit.edu.cn.