chaovven

PhD student @ MPI-SWS

Saarbrücken, Germany

Pinned Repositories

alpaca-lora
Instruct-tune LLaMA on consumer hardware
Language: Jupyter Notebook 0 0 00
chaovven
0 1 00
chaovven.github.io
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Language: JavaScript 0 0 00
codellama
Inference code for CodeLlama models
Language: Python 0 0 00
maab
Code for "A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising" WSDM 2022
Language: Python 19 1 15
PDFEditor
Language: CSS 0 1 00
PyRL
PyRL - Reinforcement Learning Framework in Pytorch (Policy Gradient, DQN, DDPG, TD3, PPO, SAC, etc.)
Language: Python 34 3 01
SMIX
Code for "SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning" AAAI 2020
Language: Python 26 1 25
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 0 0 00
llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
Language: Jupyter Notebook 15.8k 204 3982.3k