shisantaibao's Stars
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
kaixindelele/ChatPaper
Use ChatGPT to summarize arXiv papers. Accelerate the entire research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and drafting review responses.
eseckel/ai-for-grant-writing
A curated list of resources for using LLMs to develop more competitive grant applications.
liguodongiot/llm-action
This project shares the technical principles behind large language models along with hands-on, practical experience.
andimarafioti/florence2-finetuning
Quick exploration into fine-tuning Florence-2
mbzuai-oryx/GeoChat
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Raguggg/quillbot-premium-for-free
Quillbot Unlock: lets users paraphrase an unlimited number of words, with access to seven writing modes and four synonym options. The summarizer supports up to 6,000 words and can process up to 15 sentences at once. Users can also freeze an unlimited number of words and phrases.
AntonioTepsich/Convolutional-KANs
This project extends the Kolmogorov-Arnold Network (KAN) architecture to convolutional layers, replacing the classic linear transformation of the convolution with learnable non-linear activations at each pixel.
Ablustrund/LoRAMoE
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Vision-CAIR/MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
FreedomIntelligence/ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
v2rayA/v2rayA
A web GUI client of Project V which supports VMess, VLESS, SS, SSR, Trojan, Tuic and Juicity protocols. 🚀
Yuliang-Liu/Monkey
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
Coobiw/MPP-LLaVA
Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-style MLLM on an RTX 3090/4090 with 24 GB.
nwpu-zxr/VadCLIP
Official PyTorch implementation of VadCLIP
rese1f/MovieChat
[CVPR 2024] 🎬💭 chat with over 10K frames of video!
PKU-YuanGroup/Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
lzw-lzw/GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
AppFlowy-IO/AppFlowy
Bring projects, wikis, and teams together with AI. AppFlowy is an AI collaborative workspace where you achieve more without losing control of your data. The best open source alternative to Notion.
PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
vikhyat/moondream
tiny vision language model
LLaVA-VL/LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
fbcotter/pytorch_wavelets
PyTorch implementation of the 2D Discrete Wavelet Transform (DWT), the Dual-Tree Complex Wavelet Transform (DTCWT), and a DTCWT-based ScatterNet
alexandrosstergiou/adaPool
[T-IP 2023] Code for exponential adaptive pooling (adaPool) in PyTorch
dwromero/ckconv
Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/2102.02611
JiuTian-VL/JiuTian-LION
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
BAAI-DCAI/Bunny
A family of lightweight multimodal models.