mattbernst's Stars
PaddlePaddle/PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment across server, mobile, embedded, and IoT devices)
romansky/dom-to-semantic-markdown
DOM to Semantic-Markdown for use in LLMs
mufeedvh/code2prompt
A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.
CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering
LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
jhc13/taggui
Tag manager and captioner for image datasets
LatencyUtils/LatencyUtils
Utilities for latency measurement and reporting
plasma-umass/scalene
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
maqp/tfc
Tinfoil Chat - Onion-routed, endpoint secure messaging system
scanamo/scanamo
Simpler DynamoDB access for Scala
mahyarnajibi/SNIPER
SNIPER / AutoFocus is an efficient multi-scale object detection training / inference algorithm
intoli/exodus
Painless relocation of Linux binaries, and all of their dependencies, without containers.
lnsmith54/super-convergence
Files to create the figures in the paper "Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates"
EpistasisLab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
LRMPUT/FastABLE
com-lihaoyi/mill
Mill is a fast JVM build tool that supports Java and Scala. 2-3x faster than Gradle and 5-10x faster than Maven for common workflows, Mill aims to make your project’s build process performant, maintainable, and flexible.
pyscf/pyscf
Python module for quantum chemistry
giving-a-fuck-about-climate-change/carbondoomsday
A RESTish web API for climate change related data
drhaney/gencodata
Generates Fortran, C, and Python header files containing CODATA 2014 physical constants
nwchemgit/nwchem
NWChem: Open Source High-Performance Computational Chemistry
WZBSocialScienceCenter/pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.