shuyansy

At HIT, researching machine learning and computer vision

Harbin Institute of Technology, China

shuyansy's Stars

EvolvingLMMs-Lab/LongVA
Long Context Transfer from Language to Vision
Language:Python1339
BAAI-DCAI/SpatialBot
17
JUNJIE99/MLVU
Language:Python67
BAAI-DCAI/Multimodal-Robustness-Benchmark
Language:Python24
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
Language:Python5.8k422
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
10.3k694
rese1f/MovieChat
[CVPR 2024] 🎬💭 chat with over 10K frames of video!
Language:Python44937
BAAI-DCAI/Bunny
A family of lightweight multimodal models.
Language:Python77156
ttengwang/Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
1114
kousw/experimental-consistory
Language:Python612
md-mohaiminul/VideoRecap
Language:Python1437
TencentARC/SmartEdit
Official code of SmartEdit [CVPR-2024 Highlight]
Language:Python1833
shengliu66/ICV
Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Language:Python544
shuyansy/Efficient-Ambiguous-Text-Detector
The official project for the paper "Perceiving Ambiguity and Semantics without Recognition: An Efficient and Effective Ambiguous Scene Text Detector" (ACM MM 2023)
Language:Python273
shuyansy/Survey-of-Visual-Text-Processing
The official project of the paper "Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing"
31
bahjat-kawar/time-diffusion
Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"
Language:Python762
yeungchenwa/Recommendations-Diffusion-Text-Image
A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super-resolution, text editing, handwritten generation, scene text recognition and scene text detection.
1553
zhoubolei/bolei_awesome_posters
CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!
1.3k122
UKPLab/sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
Language:Python14.3k2.4k
dali92002/OCR-TR
Optical Character Recognition (OCR / HTR) using Transformers
Language:Python10
shuyansy/multilingual-machine-translation
This is some code for multilingual machine translation (English, Korean, Japanese, Arabic)
Language:Python1
dali92002/SSL-OCR
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023
Language:Python225
shuyansy/Synthesis-multilingual-handwritten-text-data
This is a simple method for handwritten text dataset generation, which is beneficial for handwritten text detection and segmentation
Language:Python1
advimman/lama
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Language:Jupyter Notebook7.5k810
yeungchenwa/OCR-SAM
Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting
Language:Python48035
RockeyCoss/Prompt-Segment-Anything
This is an implementation of zero-shot instance segmentation using Segment Anything.
Language:Python28915
shuyansy/Detect-and-read-meters
This is the first released system for complex meters' detection and recognition, implemented with computer vision techniques.
Language:Python10514
KidsWithTokens/Medical-SAM-Adapter
Adapting Segment Anything Model for Medical Image Segmentation
Language:Python88572
ziqi-jin/finetune-anything
Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scenarios
Language:Python68951
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language:Python14.8k1.4k
