huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Python · Apache-2.0
Watchers
- 0xh3x San Francisco, CA
- adbar Berlin-Brg. Academy of Sciences (BBAW)
- aliabd
- anwarchk AR3SYSTEMS
- apolinario
- asetsuna
- beurkinger @huggingface
- clmnt @huggingface
- cyrilzakka @huggingface
- DN6 @huggingface
- drbh drbh
- dsisnero
- evijit @huggingface
- graniet 00 00 80 00
- guipenedo HuggingFace
- jagwar
- jmukiibi @UNGlobalPulse
- julien-c @huggingface
- juliensimon Arcee.ai
- kasper-piskorski @AccessIntelligence
- krampstudio @huggingface
- manu-chauhan
- mbofb
- meganriley Chicago
- mehdikiani Esfahan, Iran
- MKhalusova Unstructured.io
- oOraph
- rwightman @huggingface
- sanxore Paris
- saran-io @hubden @tekvo @sprocketeer
- thomwolf @huggingface
- VictorSanh @huggingface
- wlike XVerse
- ydshieh Paris
- yjernite CIMS, NYU
- ykoyfman IBM Research