huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Python · Apache-2.0
Watchers
- 0xh3x: San Francisco, CA
- adbar: Berlin-Brg. Academy of Sciences (BBAW)
- aliabd
- anwarchk: AR3SYSTEMS
- apolinario
- asetsuna
- beurkinger: @huggingface
- clmnt: @huggingface
- DN6: New York
- drbh: drbh
- dsisnero
- eemailme
- evijit: @huggingface
- graniet: 00 00 80 00
- guipenedo: HuggingFace
- hynky1999
- jagwar
- julien-c: @huggingface
- juliensimon: Hugging Face
- justHungryMan
- krampstudio: @huggingface
- leot13: Hugging Face
- mbofb
- meganriley: Chicago
- mfuntowicz: @huggingface
- MKhalusova: Unstructured.io
- oOraph
- rwightman: Vancouver, BC
- sanxore: Paris
- saran-io: @hubden @tekvo @sprocketeer
- thomwolf: @huggingface
- VictorSanh: @huggingface
- ydshieh: Paris
- yjernite: CIMS, NYU
- ykoyfman: IBM Research
- yotamnahum: @Samplead