wangchao0502's Stars
jgm/pandoc
Universal markup converter
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
apache/datafusion
Apache DataFusion SQL Query Engine
PyO3/pyo3
Rust bindings for the Python interpreter
chinese-poetry/chinese-poetry
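The most comprehensive database of Chinese poetry 🧶 Covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Song period.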
Vonng/Capslock
Make Capslock Great Again!
tokio-rs/mini-redis
Incomplete Redis client and server implementation using Tokio - for learning purposes only
git-lfs/git-lfs
Git extension for versioning large files
digoal/blog
Open source, Database, AI, Business, Minds. git clone --depth 1 https://github.com/digoal/blog
microsoft/api-guidelines
Microsoft REST API Guidelines
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
kuangbin/ACM-ICPC
ACM/ICPC
FlowiseAI/Flowise
Drag & drop UI to build your customized LLM flow
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
bminor/glibc
Unofficial mirror of sourceware glibc repository. Updated daily.
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
zh-google-styleguide/zh-google-styleguide
Google open-source project style guides (Chinese edition)
rosedblabs/database-learning
(Chinese) Recommended learning paths for databases and storage
cmu-db/bustub
The BusTub Relational Database Management System (Educational)
heibaiying/BigData-Notes
A beginner's guide to big data ⭐
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
oceanbase/miniob
MiniOB is a compact database that assists developers in understanding the fundamental workings of a database.
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
youngyangyang04/leetcode-master
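"Code Thoughts" (代码随想录) LeetCode practice guide: a recommended order for 200 classic problems, about 600,000 characters of detailed illustrated explanations, video walkthroughs of the hard parts, more than 50 mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer bewildering! 🔥🔥 Take a look; you will wish you had found it sooner! 🚀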
remzi-arpacidusseau/ostep-code
Code from various chapters in OSTEP (http://www.ostep.org)
KaTeX/KaTeX
Fast math typesetting for the web.
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.