kugwzk's Stars
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
locuslab/fast_adversarial
[ICLR 2020] A repository for extremely fast adversarial training using FGSM
hemingkx/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
NUS-HPC-AI-Lab/Neural-Network-Parameter-Diffusion
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
LargeWorldModel/LWM
openai/dalle3-eval-samples
Text-to-image samples collected for the evaluation of DALL-E 3 in the whitepaper.
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
mmathew23/improved_edm
Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models"
deepseek-ai/DeepSeek-Math
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
allenai/OLMo-Eval
Evaluation suite for LLMs
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
SqueezeAILab/KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
InternLM/InternLM-XComposer
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
XuandongZhao/weak-to-strong
Weak-to-Strong Jailbreaking on Large Language Models
declare-lab/instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Aleph-Alpha/NeurIPS-WANT-submission-efficient-parallelization-layouts
ddupont808/GPT-4V-Act
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
kevin-ssy/CLIP_as_RNN
OpenGVLab/MM-Interleaved
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
apple/ml-aim
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
google-deepmind/alphageometry
sgl-project/sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
microsoft/Cream
This is a collection of our NAS and Vision Transformer work.
kyegomez/MultiModalMamba
A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model. Powered by Zeta, the simplest AI framework ever.
mistralai/mistral-inference
Official inference library for Mistral models
huggingface/open-muse
Open reproduction of MUSE for fast text2image generation.
huggingface/amused
FuxiaoLiu/LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning