caoxuwen's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
OpenInterpreter/open-interpreter
A natural language interface for computers
gpt-engineer-org/gpt-engineer
Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
guidance-ai/guidance
A guidance language for controlling large language models.
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—language models
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
GaiZhenbiao/ChuanhuChatGPT
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
triton-lang/triton
Development repository for the Triton language and compiler
apache/doris
Apache Doris is an easy-to-use, high performance and unified analytics database.
datahub-project/datahub
The Metadata Platform for your Data and AI Stack
mshumer/gpt-prompt-engineer
brycedrennan/imaginAIry
Pythonic AI generation of images and videos
opendatalab/PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
mukulpatnaik/researchgpt
An LLM-based research assistant that lets you have a conversation with a research paper
baichuan-inc/Baichuan-13B
A 13B large language model developed by Baichuan Intelligent Technology
comet-ml/kangas
🦘 Explore multimedia datasets at scale
facebookresearch/cc_net
Tools to download and cleanup Common Crawl data
tonbo-io/tonbo
A portable embedded database using Arrow.
opendatadiscovery/awesome-data-catalogs
📙 Awesome Data Catalogs and Observability Platforms.
amosjyng/langchain-visualizer
Visualization and debugging tool for LangChain workflows
yanagishima/yanagishima
Web UI for Trino, Hive and SparkSQL
devchat-ai/devchat
Automate your dev tasks with AI-powered scripts, from your IDE's chat panel.
Menghuan1918/pdfdeal
A Python wrapper for the Doc2X API that also comes with local text processing (to improve PDF recall in RAG).
GAIR-NLP/ProX
Official repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering
LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
apple/ml-np-rasp