shauryashaurya
20+ years of cloud, big data, analytics, machine learning, consulting and tech leadership.
Bombay, India
Pinned Repositories
airbyte
Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes.
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
bRAG
Braggosaurus Rex, the fragrant one, dragon of prague, keeper of the ragtag dragoons...
CooleRE
coolRE (cooler) is a set of regular expression engines written in Python: a toy engine for learning, then one based on backtracking, and finally an NFA-DFA based engine.
inside-a-data-engine
What's inside a data engine? Let's build one from scratch, for fun (and profit).
kandinsky
Kandinsky - analysis of color in photographic images through clustering and other algorithms.
learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
shauryashaurya.github.io
Static pages for shauryashaurya's web presence
The-Meat-and-Potatoes-of-MLOps
The Meat and Potatoes of MLOps - essentials that make the practice unique compared to XXX-Ops (Dev, Data, Others).
tutorial-x.509certificates-mongo
Tutorial for building self signed X.509 certificates on Windows 10 and using them with MongoDB
shauryashaurya's Repositories
shauryashaurya/learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
shauryashaurya/kandinsky
Kandinsky - analysis of color in photographic images through clustering and other algorithms.
shauryashaurya/langchain
⚡ Building applications with LLMs through composability ⚡
shauryashaurya/superset
Apache Superset is a Data Visualization and Data Exploration Platform
shauryashaurya/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
shauryashaurya/DAGWorks-hamilton
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
shauryashaurya/dair-ai-ML-Papers-Explained
Explanations of key concepts in ML
shauryashaurya/dair-ai-Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
shauryashaurya/Dao-AILab-flash-attention
Fast and memory-efficient exact attention
shauryashaurya/Deep-Learning-Labs
Various experiments with deep learning, large language models, and related topics.
shauryashaurya/EleutherAI-lm-evaluation-harness
A framework for few-shot evaluation of language models.
shauryashaurya/explodinggradients-ragas
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
shauryashaurya/great_expectations
Always know what to expect from your data.
shauryashaurya/huggingface-text-generation-inference
Large Language Model Text Generation Inference
shauryashaurya/huggingface-transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
shauryashaurya/inside-a-transformer
Exploring the layers within a transformer model, and other experiments with transformers.
shauryashaurya/karpathy-llm.c
LLM training in simple, raw C/CUDA
shauryashaurya/LLaMA-Factory
Unify Efficient Fine-tuning of 100+ LLMs
shauryashaurya/mlabonne-llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
shauryashaurya/NVIDIA-accelerated-computing-hub
NVIDIA-curated collection of educational resources related to general-purpose GPU programming.
shauryashaurya/ollama
Get up and running with Llama 2, Mistral, and other large language models.
shauryashaurya/oobabooga-text-generation-webui
A Gradio web UI for Large Language Models.
shauryashaurya/PicPick
PicPick - Your Personal AI Movie Recommendation System using LLMs and RAG
shauryashaurya/Pints-AI-1.5-Pints
A compact LLM pretrained in 9 days using high-quality data
shauryashaurya/pytorch-forecasting
Time series forecasting with PyTorch
shauryashaurya/rasbt-LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
shauryashaurya/statsforecast
Lightning ⚡️ fast forecasting with statistical and econometric models.
shauryashaurya/statsmodels
Statsmodels: statistical modeling and econometrics in Python
shauryashaurya/swiftide
Fast, streaming indexing and query library for AI (RAG) applications, written in Rust
shauryashaurya/swiftide-bench-comparison