Pinned Repositories
AlphaCLIP
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
AndroidArena
AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Auto-UI
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (stay tuned; more updates to come)
Awesome-Diffusion-Models
A collection of resources and papers on Diffusion Models
awesome-large-multimodal-agents
Awesome-Mamba-Papers
Awesome Papers related to Mamba.
Awesome-Papers-Autonomous-Agent
A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
Awesome-Video-Diffusion-Models
[Arxiv] A Survey on Video Diffusion Models
ZhikanggFu's Repositories
ZhikanggFu/AlphaCLIP
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
ZhikanggFu/AndroidArena
ZhikanggFu/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
ZhikanggFu/awesome-large-multimodal-agents
ZhikanggFu/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
ZhikanggFu/Awesome-Papers-Autonomous-Agent
A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.
ZhikanggFu/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
ZhikanggFu/cobra
Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
ZhikanggFu/CogVLM
A state-of-the-art open visual language model | Multimodal pre-trained model
ZhikanggFu/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
ZhikanggFu/decision-mamba
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces
ZhikanggFu/diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model.
ZhikanggFu/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
ZhikanggFu/FLAML
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
ZhikanggFu/generative-models
Generative Models by Stability AI
ZhikanggFu/HierarchicalDecisionMamba
ZhikanggFu/iris
Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.
ZhikanggFu/latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
ZhikanggFu/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
ZhikanggFu/LLM-Agents-Papers
A repo that lists papers related to LLM-based agents
ZhikanggFu/mamba
ZhikanggFu/MetaGPT
🌟 The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo
ZhikanggFu/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
ZhikanggFu/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
ZhikanggFu/policy-guided-diffusion
Official implementation of "Policy-Guided Diffusion"
ZhikanggFu/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
ZhikanggFu/SoM
Set-of-Mark Prompting for LMMs
ZhikanggFu/weak-to-strong
ZhikanggFu/YiVal
Your Automatic Prompt Engineering Assistant for GenAI Applications
ZhikanggFu/zigma
A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model"