aswanthkrishna's Stars
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
feast-dev/feast
The Open Source Feature Store for Machine Learning
Morten010/WireUp
Build databases more easily
Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
Renumics/spotlight
Interactively explore unstructured datasets from your dataframe.
cleanlab/cleanvision
Automatically find issues in image datasets and practice data-centric computer vision.
FlowiseAI/Flowise
Drag & drop UI to build your customized LLM flow
charmbracelet/gum
A tool for glamorous shell scripts 🎀
ml-tooling/opyrator
🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.
cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
kelvins/awesome-mlops
😎 A curated list of awesome MLOps tools
aimhubio/aim
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
kevin-hanselman/dud
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
merantix-momentum/squirrel-core
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
huggingface/dataset-viewer
Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
airctic/icedata
IceData: Datasets Hub for the *IceVision* Framework
merantix-momentum/squirrel-datasets-core
Squirrel dataset hub
iterative/dataset-registry
Dataset registry DVC project
openvinotoolkit/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
datahub-project/datahub
The Metadata Platform for your Data and AI Stack
silverbulletmdc/showdata
Large-scale image dataset visualization tool.
clearml/clearml
ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
jomariya23156/full-stack-on-prem-cv-mlops
"1 config, 1 command from Jupyter Notebook to serve Millions of users", Full-stack On-Premises MLOps system for Computer Vision from Data versioning to Model monitoring and drift detection.
koordinates/kart
Distributed version-control for geospatial and tabular data
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
refinedev/refine
A React Framework for building internal tools, admin panels, dashboards & B2B apps with unmatched flexibility.
rentruewang/bocoel
Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few lines of modular code.