wuhenbai's Stars
py-pdf/pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
scanny/python-pptx
Create Open XML PowerPoint documents in Python
shobhitsharma/pptx-compose
Parser to convert PPTX to JSON format
euske/pdfminer
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
bhaskatripathi/pdfGPT
PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
mem0ai/mem0
The Memory layer for your AI apps
niedev/RTranslator
Open source real-time translation app for Android that runs locally
AdeDZY/DeepCT
DeepCT and HDCT use BERT to generate novel, context-aware bag-of-words term weights for documents and queries.
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
karpathy/llama2.c
Inference Llama 2 in one file of pure C
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
feifeibear/LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
xverse-ai/XVERSE-13B
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
binary-husky/gpt_academic
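A practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.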
shibing624/zh-normalization
Chinese (zh) sentence NSW (Non-Standard-Word) Normalization
iwangjian/Paper-Reading-ConvAI
📖 Paper reading list in conversational AI (constantly updating 🤗).
OpenLMLab/MOSS-RLHF
MOSS-RLHF
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
yanyiwu/gojieba
Golang version of the "Jieba" (结巴) Chinese word segmentation library
zjunlp/Prompt4ReasoningPapers
[ACL 2023] Reasoning with Language Model Prompting: A Survey
km1994/llms_paper
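This repository mainly collects reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT)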
TommyZihao/CyberDog2_Tutorials
Tutorials for the Xiaomi CyberDog2 bionic quadruped robot, taught by 同济子豪兄
imoneoi/openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data