yun-kwak's Stars
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
AIDC-AI/Marco-o1
An Open Large Reasoning Model for Real-World Solutions
gicheonkang/clip-rt
📎 + 🦾 CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision
lazaratan/dyn-gfn
DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks
GFNOrg/gfn-lm-tuning
recursionpharma/gflownet
GFlowNet library specialized for graph & molecular data
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
xjdr-alt/entropix
Entropy Based Sampling and Parallel CoT Decoding
YuxiXie/MCTS-DPO
Source code for Self-Evaluation Guided MCTS for online DPO.
XinyuanWangCS/PromptAgent
This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that autonomously crafts prompts equivalent in quality to those handcrafted by experts, i.e., expert-level prompts.
black-forest-labs/flux
Official inference repo for FLUX.1 models
meta-llama/llama-models
Utilities intended for use with Llama models.
NVIDIA/warp
A Python framework for high performance GPU simulation and graphics
lichess-org/mobile
Lichess mobile app v2
jopetty/word-problem
Experiments on the impact of depth in transformers and SSMs.
yun-kwak/efficient-mcts
[UAI'24 Oral] Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction
iwhwang/NCD
On Discovery of Local Independence over Continuous Variables via Neural Contextual Decomposition (CLeaR 2023)
iwhwang/Fine-Grained-Causal-RL
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning (ICML 2024)
abdulhaim/LMRL-Gym
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
rtqichen/ffjord
Code for "FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models".
karpathy/llm.c
LLM training in simple, raw C/CUDA
espanso/espanso
Cross-platform Text Expander written in Rust
state-spaces/s4
Structured state space sequence models
state-spaces/mamba
Mamba SSM architecture
openai/transformer-debugger
gicheonkang/prograsp
🦾 PyTorch Implementation for the ICRA'24 Paper, "PROGrasp: Pragmatic Human-Robot Communication for Object Grasping"
google-deepmind/mujoco_menagerie
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.
sotetsuk/pgx
♟️ Vectorized RL game environments in JAX
open-spaced-repetition/srs-benchmark
A benchmark for spaced repetition schedulers/algorithms