maocaixia

Tsinghua University

Tencent, Beijing

maocaixia's Stars

infinigence/Infini-Megrez
Language:Python29918
ppaanngggg/layoutreader
A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.
Language:Python15211
stacklens/django_blog_tutorial
A tutorial on building a blog with Django
Language: Python, 1.4k stars, 419 forks
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Language: Python, 6.5k stars, 571 forks
SocialAI-tianji/Tianji
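Building a large language model that understands social etiquette and worldly wisdom | Covers tutorials on prompt engineering, RAG, agents, and LLM fine-tuning
Language: Python, 1k stars, 74 forks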
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Language: Jupyter Notebook, 37.5k stars, 4.8k forks
WenmuZhou/TableGeneration
Generate table images by rendering them in a browser
Language:Python21142
hacksider/Deep-Live-Cam
real time face swap and one-click video deepfake with only a single image
Language: Python, 42.1k stars, 6.2k forks
Yuliang-Liu/Monkey
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Language: Python, 1.9k stars, 132 forks
danny-avila/LibreChat
Enhanced ChatGPT Clone: Features Agents, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.
Language: TypeScript, 20.3k stars, 3.4k forks
opendatalab/PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Language: Python, 6.3k stars, 413 forks
cv-small-snails/Awesome-Table-Recognition
A curated list of resources dedicated to table recognition
38451
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language: Python, 21.5k stars, 2.1k forks
CosmosShadow/gptpdf
Using GPT to parse PDF
Language: Python, 3.2k stars, 231 forks
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python, 27.5k stars, 2.6k forks
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python, 6.7k stars, 625 forks
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
Language: Jupyter Notebook, 9.8k stars, 1.5k forks
JiaquanYe/TableMASTER-mmocr
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
Language:Python447105
microsoft/table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Language: Python, 2.4k stars, 264 forks
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Language: C++, 1.6k stars, 184 forks
X-PLUG/MobileAgent
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Language: Python, 3.2k stars, 308 forks
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python, 13k stars, 910 forks
VikParuchuri/marker
Convert PDF to markdown + JSON quickly with high accuracy
Language: Python, 19.1k stars, 1.1k forks
3DTopia/LGM
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Language: Python, 1.8k stars, 123 forks
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language: Python, 3.2k stars, 281 forks
PawanOsman/ChatGPT
OpenAI API Free Reverse Proxy
Language: TypeScript, 5.7k stars, 1k forks
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Language: Jupyter Notebook, 6.6k stars, 441 forks
plandex-ai/plandex
AI driven development in your terminal. Designed for large, real-world tasks.
Language: Go, 11k stars, 758 forks
nashsu/FreeAskInternet
FreeAskInternet is a completely free, PRIVATE and LOCALLY running search aggregator & answer generate using MULTI LLMs, without GPU needed. The user can ask a question and the system will make a multi engine search and combine the search result to LLM and generate the answer based on search results. It's all FREE to use.
Language: Python, 8.5k stars, 898 forks
midday-ai/midday
Invoicing, Time tracking, File reconciliation, Storage, Financial Overview & your own Assistant made for Freelancers
Language: TypeScript, 6.4k stars, 607 forks
