MengHao666
Working in Video Generation in Alibaba now. Previous Computer Graphics Engineer on 3D digital human in Tenent. Master in BeiHang University.
AlibabaHangzhou, China
MengHao666's Stars
krahets/hello-algo
《Hello 算法》:动画图解、一键运行的数据结构与算法教程。支持 Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, Dart 代码。简体版和繁体版同步更新,English version ongoing
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
marktext/marktext
📝A simple and elegant markdown editor, available for Linux, macOS and Windows.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
d2l-ai/d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
openai/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
karpathy/nn-zero-to-hero
Neural Networks: Zero to Hero
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的开源多模态对话模型
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
meta-llama/llama-stack
Composable building blocks to build Llama Apps
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
wdndev/llm_interview_note
主要记录大语言大模型(LLMs) 算法(应用)工程师相关的知识及面试题
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
hojonathanho/diffusion
Denoising Diffusion Probabilistic Models
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
LLaVA-VL/LLaVA-NeXT
NVlabs/VILA
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
microsoft/Phi-3CookBook
This is a Phi Family of SLMs book for getting started with Phi Models. Phi a family of open sourced AI models developed by Microsoft. Phi models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks
open-compass/VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), support 160+ VLMs, 50+ benchmarks
amazon-science/auto-cot
Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)
mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
facebookresearch/AudioDec
An Open-source Streaming High-fidelity Neural Audio Codec
facebookresearch/MovieGenBench
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
showlab/Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
WisconsinAIVision/YoLLaVA
🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant
athn-nik/motionfix
MotionFix: Text-Driven 3D Human Motion Editing [SIGGRAPH ASIA 2024]
nvlm-project/nvlm-project.github.io