Pinned Repositories
targon
A library for building subnets with the manifold reward stack
bittensor
Internet-scale Neural Networks
bittensor-js
Bittensor API, but for web applications
alpaca-weight
Train LLaMA with LoRA on one 4090 and merge the LoRA weights to work like Stanford Alpaca.
CodingSubnet
DALLE-2
fun ai work
langchain
⚡ Building applications with LLMs through composability ⚡
reward-modeling
safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
text-generation-inference
Large Language Model Text Generation Inference
robertalanm's Repositories
robertalanm/safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
robertalanm/alpaca-weight
Train LLaMA with LoRA on one 4090 and merge the LoRA weights to work like Stanford Alpaca.
robertalanm/CodingSubnet
robertalanm/langchain
⚡ Building applications with LLMs through composability ⚡
robertalanm/reward-modeling
robertalanm/text-generation-inference
Large Language Model Text Generation Inference
robertalanm/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
robertalanm/airoboros
Customizable implementation of the self-instruct paper.
robertalanm/alpaca-lora
Code for reproducing the Stanford Alpaca InstructLLaMA result on consumer hardware
robertalanm/autocrit
A repository for transformer critique learning and generation
robertalanm/axolotl
Go ahead and axolotl questions
robertalanm/ColossalAI
Making large AI models cheaper, faster and more accessible
robertalanm/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
robertalanm/discord
robertalanm/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
robertalanm/H3
Language Modeling with the H3 State Space Model
robertalanm/langflow
⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.
robertalanm/langfuse
open-source observability for LLM applications
robertalanm/langfuse-python
robertalanm/llama-trl
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
robertalanm/minimal-llama
robertalanm/OpenLLaMA2
A Ray-based High-performance LLaMA2 RLHF framework
robertalanm/opentensorAI-connector-template
robertalanm/orca
Experiments into reproducing orca
robertalanm/pfrl
PFRL: a PyTorch-based deep reinforcement learning library
robertalanm/raodottown
website for rao.town
robertalanm/substrate-indexer
indexer for substrate chain (bt)
robertalanm/t-jepa
robertalanm/validators
Repository for bittensor validators
robertalanm/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs