Pinned Repositories
alpaca-lora
Instruct-tune LLaMA on consumer hardware
chaovven
chaovven.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
codellama
Inference code for CodeLlama models
maab
Code for "A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising" WSDM 2022
PDFEditor
PyRL
PyRL - Reinforcement Learning Framework in PyTorch (Policy Gradient, DQN, DDPG, TD3, PPO, SAC, etc.)
SMIX
Code for "SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning" AAAI 2020
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama for WhatsApp & Messenger.
chaovven's Repositories
chaovven/PyRL
PyRL - Reinforcement Learning Framework in PyTorch (Policy Gradient, DQN, DDPG, TD3, PPO, SAC, etc.)
chaovven/SMIX
Code for "SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning" AAAI 2020
chaovven/maab
Code for "A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising" WSDM 2022
chaovven/alpaca-lora
Instruct-tune LLaMA on consumer hardware
chaovven/chaovven
chaovven/chaovven.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
chaovven/codellama
Inference code for CodeLlama models
chaovven/PDFEditor
chaovven/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.