kingwenChen's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
marktext/marktext
📝A simple and elegant markdown editor, available for Linux, macOS and Windows.
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
pytorch/vision
Datasets, Transforms and Models specific to Computer Vision
fishaudio/fish-speech
Brand new TTS solution
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
google/latexify_py
A library to generate LaTeX expressions from Python code.
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
opendatalab/PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
fundamentalvision/Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
lm-sys/RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
facebookresearch/Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
lyuwenyu/RT-DETR
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
UlionTse/translators
🌏🌍🌎Translators🌎🌍🌏 is a library that aims to bring free, multiple, enjoyable translations to individuals and students in Python.
IDEA-Research/MaskDINO
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
test-time-training/ttt-lm-pytorch
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
NLPJCL/RAG-Retrieval
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, and Cross Encoder
OleehyO/TexTeller
TexTeller converts images to LaTeX formulas (image2latex, LaTeX OCR) with higher accuracy and exhibits superior generalization ability, enabling it to cover most usage scenarios.
ox-vgg/via
(MIRROR) see https://gitlab.com/vgg/via/
emo-box/EmoBox
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
ppaanngggg/layoutreader
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bounding boxes into reading order.
wangyu-ustc/MemoryLLM
The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models"
aimagelab/FourBi
Binarizing Documents by Leveraging both Space and Frequency. (ICDAR 2024)
deepopinion/ocr_wrapper
A Python wrapper for multiple OCR solutions
yujunhuics/LayoutReader