jingtaoli-sony

jingtaoli-sony's Stars

tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Language: Jupyter Notebook · 5.1k stars · 333 forks
lewandofskee/MambaAD
[NeurIPS 2024] Official implementation of MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection.
Language:Python1002
JiayuanWang-JW/YOLOv8-multi-task
Language:Python24141
TIGER-AI-Lab/VIEScore
Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024 main)
Language:Python24
boschresearch/ALDM
Official implementation of "Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive" (ICLR 2024)
Language:Jupyter Notebook513
mcordts/cityscapesScripts
README and scripts for the Cityscapes Dataset
Language: Python · 2.2k stars · 607 forks
TissueImageAnalytics/cerberus
One Model is All You Need: Multi-Task Learning Enables Simultaneous Histology Image Segmentation and Classification
Language:Python6912
leeyeehoo/CSRNet-pytorch
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
Language:Jupyter Notebook648261
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python · 21.9k stars · 2.1k forks
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language: Python · 3.2k stars · 277 forks
apple/ml-4m
4M: Massively Multimodal Masked Modeling
Language: Python · 1.6k stars · 93 forks
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Language: Python · 6.9k stars · 528 forks
cientgu/InstructDiffusion
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
Language:Python38120
bodaay/HuggingFaceModelDownloader
Simple Go utility to download HuggingFace Models and Datasets
Language:Go48049
kylesargent/ZeroNVS
Language:Python46228
NVlabs/genvs
62710
rom1504/clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
Language: Jupyter Notebook · 2.4k stars · 208 forks
baegwangbin/surface_normal_uncertainty
[ICCV 2021 Oral] Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation
Language:Python22522
cvlab-columbia/zero123
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Language: Python · 2.7k stars · 193 forks
kongzhecn/OMG
[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models
Language:Python62743
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Language: Python · 1.3k stars · 185 forks
open-mmlab/mmyolo
OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.
Language: Python · 3k stars · 535 forks
hako-mikan/sd-webui-regional-prompter
set prompt to divided region
Language: Python · 1.6k stars · 131 forks
Sanster/IOPaint
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
Language: Python · 19.2k stars · 2k forks
advimman/lama
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Language: Jupyter Notebook · 7.9k stars · 842 forks
yhenon/pytorch-retinanet
PyTorch implementation of RetinaNet object detection.
Language: Python · 2.1k stars · 665 forks
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Language: Python · 9.4k stars · 2.2k forks
DonaldRR/SimpleNet
Language:Python42963
eric-ai-lab/PEViT
Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"
Language:Python945
kyegomez/Vit-RGTS
Open source implementation of "Vision Transformers Need Registers"
Language:Python13613