Pinned Repositories
agenta
The LLMOps platform to build robust LLM apps. Easily experiment and evaluate different prompts, models, and workflows.
ASE
basalt_2022
block-recurrent-transformer
PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)
cgdt
[AAAI'2024] Critic-Guided Decision Transformer for Offline Reinforcement Learning
Continuous-AdvTrain
decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
DeepSpeedExamples
Example models using DeepSpeed
DI-engine
OpenDILab Decision AI Engine
RepoAgent
An LLM-powered repository agent designed to assist developers and teams in generating documentation and understanding repositories quickly.
sharkwyf's Repositories
sharkwyf/cgdt
[AAAI'2024] Critic-Guided Decision Transformer for Offline Reinforcement Learning
sharkwyf/RepoAgent
An LLM-powered repository agent designed to assist developers and teams in generating documentation and understanding repositories quickly.
sharkwyf/safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
sharkwyf/agenta
The LLMOps platform to build robust LLM apps. Easily experiment and evaluate different prompts, models, and workflows.
sharkwyf/ASE
sharkwyf/basalt_2022
sharkwyf/Continuous-AdvTrain
sharkwyf/DeepSpeedExamples
Example models using DeepSpeed
sharkwyf/DI-engine
OpenDILab Decision AI Engine
sharkwyf/dify
An Open-Source Assistants API and GPTs alternative. Dify.AI is an LLM application development platform. It integrates the concepts of Backend as a Service and LLMOps, covering the core tech stack required for building generative AI-native applications, including a built-in RAG engine.
sharkwyf/dreamerv3
Mastering Diverse Domains through World Models
sharkwyf/FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
sharkwyf/MineDojo
Modified action space to MineRL style
sharkwyf/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
sharkwyf/HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
sharkwyf/IVR
Author's implementation of SQL and EQL in "Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization"
sharkwyf/langflow
⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.
sharkwyf/latent-adversarial-training
sharkwyf/LLaMA-Factory
Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)
sharkwyf/MotionCLIP
Official PyTorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space"
sharkwyf/neuralmmo
Baselines for Neural MMO -- new users should treat this repo as a starter project
sharkwyf/notion-feeder
🕸 A Node app for creating a Feed Reader in Notion.
sharkwyf/online-dt
Online Decision Transformer
sharkwyf/PDT
Implementation of the ICML 2023 paper "Future-conditioned Unsupervised Pretraining for Decision Transformer"
sharkwyf/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
sharkwyf/SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
sharkwyf/Stable-Alignment
An efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
sharkwyf/trl
Train transformer language models with reinforcement learning.
sharkwyf/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
sharkwyf/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs