lawrence-cj
Research Intern @ NVIDIA Research. Research Assistant @ HKU. Ph.D. Candidate @ DLUT.
Dalian University of Technology · Beijing, China
lawrence-cj's Stars
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.
Fanghua-Yu/SUPIR
SUPIR aims to develop practical algorithms for photo-realistic image restoration in the wild. Our new online demo is available at suppixel.ai.
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
XPixelGroup/DiffBIR
Official codes of DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Alpha-VLLM/LLaMA2-Accessory
An Open-source Toolkit for LLM Development
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
dvlab-research/ControlNeXt
Controllable video and image generation, supporting SVD, Animate Anyone, ControlNet, ControlNeXt, and LoRA
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
aigc-apps/EasyAnimate
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
Vchitect/LaVie
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
zhijian-liu/torchprofile
A general and accurate MACs / FLOPs profiler for PyTorch models
NVlabs/edm2
Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)
tianweiy/DMD2
cloneofsimo/minRF
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
bojone/papers.cool
Cool Papers - Immersive Paper Discovery
HaozheLiu-ST/T-GATE
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
IceClear/CLIP-IQA
[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images
OpenGVLab/ControlLLM
ControlLLM: Augment Language Models with Tools by Searching on Graphs
daooshee/HD-VG-130M
The HD-VG-130M Dataset
djghosh13/geneval
GenEval: An object-focused framework for evaluating text-to-image alignment
sayakpaul/cmmd-pytorch
PyTorch implementation of CLIP Maximum Mean Discrepancy (CMMD) for evaluating image generation models.
sayakpaul/single-video-curation-svd
Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper.