dsumpter's Stars
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
hegelai/prompttools
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
frankiethull/kuzco
LLM image classification library using ollama in R
AnswerDotAI/shell_sage
ShellSage saves sysadmins’ sanity by solving shell script snafus super swiftly
apoorvalal/TestingInEventStudies
tests for cohort-level heterogeneity in panel regression
smicallef/spiderfoot
SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
Alfredredbird/tookie-osint
Tookie is an advanced OSINT information gathering tool that finds social media accounts based on inputs.
bradleyess/awesome-object-storage-infra
An opinionated list of awesome projects, frameworks, databases, and resources for building atop object storage systems. Whether it’s about implementing modern data platforms, logs, event-driven architectures, or large-scale analytics, these projects harness object-based backends to power scalable and resilient infrastructures.
tonbo-io/tonbo
A portable embedded database using Arrow.
cfahlgren1/observers
A Lightweight Library for AI Observability
probabl-ai/skore
the scikit-learn sidekick
soda-inria/carte
Repository for CARTE: Context-Aware Representation of Table Entries
PriorLabs/TabPFN
⚡ TabPFN: Foundation Model for Tabular Data ⚡
taylorai/mlx_embedding_models
run embeddings in MLX
bodo-ai/Bodo
High-Performance Python Compute Engine for Data and AI
eleanormurray/CausalSurvivalAnalysisWorkshop
localstack/localstack
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
tursodatabase/libsql
libSQL is a fork of SQLite that is both Open Source and Open Contributions.
microsoft/markitdown
Python tool for converting files and office documents to Markdown.
chiphuyen/aie-book
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
openzipkin/zipkin
Zipkin is a distributed tracing system
softprops/typed-lambda
λ formal type definitions for aws lambda events
google-research/population-dynamics
PDFM Embeddings: location-based vectors for geo-spatial analysis.
katanemo/archgw
AI-native (edge and LLM) proxy for agents. Handles all the pesky heavy lifting in building agentic apps -- fast ⚡️ query routing, seamless integration of prompts with business APIs, and unified access and observability of LLMs. Built by the contributors of Envoy proxy.
huggingface/smol-course
A course on aligning smol models.
jlowin/fastmcp
The fast, Pythonic way to build Model Context Protocol servers 🚀
breezedeus/Pix2Text
An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
karpathy/llm.c
LLM training in simple, raw C/CUDA