Pinned Repositories
Count-Anything
This method uses Segment Anything and CLIP to ground and count any object that matches a custom text prompt, without requiring any point or box annotation.
f2-nerf
Fast neural radiance field training with free camera trajectories
GL-RG
Code for the IJCAI 2022 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
HAF
PyTorch implementation of the ICASSP 2021 paper "Hierarchical Attention Fusion for Geo-Localization".
smerf-3d.github.io
STC-Seg
Code for the TCSVT paper "Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration".
Unbounded-NeRF
UniAD
[CVPR 2023 Award Candidate] Planning-oriented Autonomous Driving
visprog
Visual Programming: Compositional visual reasoning without training (CVPR 2023)
ylqi.github.io
Homepage
ylqi's Repositories
ylqi/Count-Anything
This method uses Segment Anything and CLIP to ground and count any object that matches a custom text prompt, without requiring any point or box annotation.
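As a rough illustration of the SAM + CLIP idea described above (a minimal sketch, not this repository's actual code), the snippet below generates class-agnostic mask proposals with Segment Anything and counts those whose CLIP similarity to the text prompt clears a threshold; the checkpoint path, model variants, example image, prompt, and 0.25 threshold are all assumptions.

```python
# Illustrative sketch of text-prompted counting with SAM + CLIP
# (not Count-Anything's implementation; paths, models, and the
# 0.25 score threshold are assumptions).
import numpy as np
import torch
import clip
from PIL import Image
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Class-agnostic mask proposals from Segment Anything.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(device)
mask_generator = SamAutomaticMaskGenerator(sam)

# 2) CLIP scores each proposal against the custom text prompt.
clip_model, preprocess = clip.load("ViT-B/32", device=device)
prompt = "a photo of an apple"  # hypothetical prompt
with torch.no_grad():
    text_feat = clip_model.encode_text(clip.tokenize([prompt]).to(device))
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

image = np.array(Image.open("example.jpg").convert("RGB"))  # hypothetical image
count = 0
for m in mask_generator.generate(image):
    x, y, w, h = [int(v) for v in m["bbox"]]            # proposal bounding box (XYWH)
    crop = Image.fromarray(image[y:y + h, x:x + w])      # crop the proposed region
    with torch.no_grad():
        img_feat = clip_model.encode_image(preprocess(crop).unsqueeze(0).to(device))
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        score = (img_feat @ text_feat.T).item()          # cosine similarity
    if score > 0.25:                                     # assumed threshold
        count += 1
print(f"Objects matching '{prompt}': {count}")
```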
ylqi/GL-RG
Code for the IJCAI 2022 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
ylqi/HAF
PyTorch implementation of the ICASSP 2021 paper "Hierarchical Attention Fusion for Geo-Localization".
ylqi/STC-Seg
Code for the TCSVT paper "Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration".
ylqi/ylqi.github.io
Homepage
ylqi/f2-nerf
Fast neural radiance field training with free camera trajectories
ylqi/smerf-3d.github.io
ylqi/Unbounded-NeRF
ylqi/UniAD
[CVPR 2023 Award Candidate] Planning-oriented Autonomous Driving
ylqi/visprog
Visual Programming: Compositional visual reasoning without training (CVPR 2023)
ylqi/alpha_visualizer
Visualize the Switch-NeRF radiance field as point clouds.
ylqi/clbrobot_project
Video language navigation client
ylqi/diffusionmagic
Easy-to-use Stable Diffusion workflows built with diffusers (WIP)
ylqi/Image2Paragraph
Transform an image into a unique paragraph with ChatGPT, BLIP-2, OFA, GRIT, Segment Anything, and ControlNet.
ylqi/Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
ylqi/NeRF-SLAM
NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields (https://arxiv.org/abs/2210.13641) + Sigma-Fusion: Probabilistic Volumetric Fusion for Dense Monocular SLAM (https://arxiv.org/abs/2210.01276)
ylqi/NeRF_RPN
ylqi/ODISE
ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
ylqi/Segment-and-Track-Anything
An open-source project for tracking and segmenting any objects in videos, either automatically or interactively. It combines the Segment Anything Model (SAM) for key-frame segmentation with Associating Objects with Transformers (AOT) for efficient tracking and propagation.
ylqi/text2room
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models.
ylqi/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ylqi/X-Decoder
X-Decoder: generalized decoding for pixel, image, and language
ylqi/yolov8_tracking
Real-time multi-object tracking and segmentation using YOLOv8
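As a rough illustration only (a minimal sketch, not this repository's tracker integrations), the snippet below runs a YOLOv8 segmentation model with the ultralytics package's built-in ByteTrack tracker; the weights file, video path, and tracker config are assumptions.

```python
# Illustrative sketch of multi-object tracking + segmentation with YOLOv8,
# using ultralytics' built-in ByteTrack as a stand-in for this repository's
# own tracker integrations (weights, video path, and tracker config assumed).
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # segmentation-capable YOLOv8 weights

# stream=True yields one Results object per frame instead of buffering them all.
for result in model.track(source="example_video.mp4", stream=True,
                          tracker="bytetrack.yaml"):
    boxes = result.boxes
    if boxes is None or boxes.id is None:   # no detections / no active tracks
        continue
    track_ids = boxes.id.int().tolist()     # persistent IDs across frames
    classes = boxes.cls.int().tolist()      # class index per detection
    masks = result.masks                    # per-instance segmentation masks
    print(f"frame: {len(track_ids)} tracked instances, ids={track_ids}")
```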
ylqi/HC-STVG
The HC-STVG Dataset
ylqi/mcvd-pytorch
Official implementation of MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (https://arxiv.org/abs/2205.09853)
ylqi/Pathfinding-Algorithm
A pathfinding algorithm for self-driving delivery vehicles.
ylqi/prompt-to-prompt
ylqi/pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXt, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
ylqi/robodreamer
ylqi/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch