EinesTages

On desolate saline-alkali land, we gather stars

BJUT, Beijing

EinesTages's Stars

kijai/ComfyUI-Florence2
Inference Microsoft Florence2 VLM
Language:Python78054
vivo-ai-lab/BlueLM
BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab
Language:Python86060
showlab/Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
29813
opendilab/awesome-ui-agents
A curated list of awesome UI agents resources, encompassing Web, App, OS, and beyond (continually updated)
869
e2b-dev/awesome-ai-agents
A list of AI autonomous agents
11.8k stars, 876 forks
niuzaisheng/ScreenAgent
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
Language:Python32830
yfzhang114/SliME
✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
Language:Python1407
kyegomez/ViTAR
Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch
Language:Python271
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Language: Python, 20.9k stars, 3.1k forks
ParadoxZW/LLaVA-UHD-Better
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
Language:Python323
thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
Language:Python32015
Zefan-Cai/KVCache-Factory
Unified KV Cache Compression Methods for Auto-Regressive Models
Language:Jupyter Notebook961125
harvardnlp/im2markup
Neural model for converting Image-to-Markup (by Yuntian Deng yuntiandeng.com)
Language: Lua, 1.2k stars, 214 forks
zjwang21/StrokeNet
The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling]
Language:Python233
datawhalechina/self-llm
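"A Guide to Consuming Open-Source Large Models": quickly deploy open-source large models in a Linux environment, a deployment tutorial better suited to ** beginners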
Language: Jupyter Notebook, 9.8k stars, 1.1k forks
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Language: Python, 10k stars, 1k forks
Yuliang-Liu/Monkey
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Language: Python, 1.8k stars, 132 forks
pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Language:HTML49471
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python, 35.4k stars, 4.4k forks
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
12.9k stars, 824 forks
kwuking/TimeMixer
[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
Language: Python, 1.3k stars, 178 forks
Tencent/Tencent-Hunyuan-Large
Language: Python, 1.2k stars, 56 forks
deepseek-ai/DreamCraft3D
[ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Language: Python, 2k stars, 89 forks
antgroup/agentUniverse
agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.
Language:Python927118
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language: Python, 20.1k stars, 2k forks
Go2Heart/EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
Language:Python412
illuin-tech/colpali
The code used to train and run inference with the ColPali architecture.
Language: Python, 1.2k stars, 106 forks
riedlerm/multimodal_rag_for_industry
Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications
Language:Python263
pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Language: Python, 5.8k stars, 536 forks
OpenBMB/VisRAG
Parsing-free RAG supported by VLMs
Language:Python45132
