zhangboshen's Stars
facebookresearch/llama
Inference code for LLaMA models
fxsjy/jieba
Jieba (结巴) Chinese word segmentation
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
MaartenGr/BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
MaartenGr/KeyBERT
Minimal keyword extraction with BERT
sail-sg/EditAnything
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
magicleap/SuperGluePretrainedNetwork
SuperGlue: Learning Feature Matching with Graph Neural Networks (CVPR 2020, Oral)
towhee-io/towhee
Towhee is a framework dedicated to making neural data processing pipelines simple and fast.
scikit-learn-contrib/hdbscan
A high-performance implementation of HDBSCAN clustering.
OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
fudan-zvg/Semantic-Segment-Anything
An automated dense category annotation engine that provides the initial semantic labels for the Segment Anything dataset (SA-1B).
gligen/GLIGEN
Open-Set Grounded Text-to-Image Generation
willard-yuan/awesome-cbir-papers
📝Awesome and classical image retrieval papers
lyuchenyang/Macaw-LLM
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
thu-ml/prolificdreamer
ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)
thu-ml/unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
yzhuoning/Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
showlab/Image2Paragraph
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
studio-ousia/luke
LUKE -- Language Understanding with Knowledge-based Embeddings
BaptisteBlouin/EventExtractionPapers
A list of NLP resources focused on the event extraction task
showlab/VLog
Transform a video into a document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, and LangChain.
luogen1996/LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
JindongGu/Awesome-Prompting-on-Vision-Language-Model
This repo lists the relevant papers summarized in our survey: "A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models."
JialianW/GRiT
GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)