wduo

A wandering machine learning researcher, bouncing between groups. I want to understand things clearly, and explain them well. - Colah

Pretending in Hangzhou Creative Culture Company (PH3C), Beijing (wangduo.cnblogs.com)

wduo's Stars

apache/doris
Apache Doris is an easy-to-use, high performance and unified analytics database.
Language:Java12.8k 284 7.5k3.3k
tmux-plugins/tpm
Tmux Plugin Manager
Language:Shell12.3k 88 206433
apache/thrift
Apache Thrift
Language:C++10.4k 465 04k
ymcui/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Language:Python9.7k 143 2401.4k
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Language:Python9.5k 77 1.5k2.2k
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language:HTML9.3k 61 1.1k769
hluk/CopyQ
Clipboard manager with advanced features
Language:C++8.9k 137 2.3k453
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
Language:Python8.4k 100 1.2k1.4k
jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
Language:Python6.8k 92 558679
tesseract-ocr/tessdata
Trained models with fast variant of the "best" LSTM models + legacy models
6.5k 231 1572.2k
pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Language:Python5.8k 64 2.1k534
Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
Language:Python4.9k 74 149474
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
Language:Python4.7k 47 199481
huggingface/safetensors
Simple, safe way to store and distribute tensors
Language:Python2.9k 43 186201
ArtifexSoftware/pdf2docx
Open source Python library for converting PDF to DOCX.
Language:Python2.6k 26 258381
brightmart/roberta_zh
RoBERTa Chinese pre-trained models: RoBERTa for Chinese
Language:Python2.6k 52 95409
neo4j-labs/llm-graph-builder
Neo4j graph construction from unstructured data using LLMs
Language:Jupyter Notebook2.6k 24 462410
chen3feng/blade-build
Blade is a powerful build system from Tencent, supports many mainstream programming languages, such as C/C++, java, scala, python, protobuf...
Language:Python2.1k 146 447497
huggingface/evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
Language:Python2k 48 299259
stanford-oval/WikiChat
WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
Language:Python1.2k 18 25107
nyu-mll/GLUE-baselines
[DEPRECATED] Repo for exploring multi-task learning approaches to learning sentence representations
Language:Python775 27 27165
quqxui/Awesome-LLM4IE-Papers
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
773 12 042
thu-coai/CrossWOZ
A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Language:Python651 16 32114
xuyige/BERT4doc-Classification
Code and source for paper ``How to Fine-Tune BERT for Text Classification?``
Language:Python619 9 2199
HUSTAI/uie_pytorch
PyTorch implementation of the PaddleNLP UIE model
Language:Python603 4 45101
Thriftpy/thriftpy2
Pure python approach of Apache Thrift.
Language:Python577 12 13292
Unstructured-IO/unstructured-inference
Language:Python162 20 6353
CLUEbenchmark/SuperCLUE-Llama2-Chinese
Chinese versions of the open-source Llama2 models: comprehensive evaluation based on SuperCLUE's OPEN benchmark | Llama2 Chinese evaluation with SuperCLUE
127 2 28
qingyujean/document-level-classification
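Very long text classification (over 1,000 characters); document-level / discourse-level text classification; mainly addresses the long-range dependency problem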
Language:Python119 1 628
volcengine/volc-sdk-python
Language:Python112 2 2823
