alok-abhishek
Product Management professional, MBA with computer engineering background. Passionate about machine learning and SaaS. I love all things tech.
San Francisco Bay Area
alok-abhishek's Stars
NirDiamant/RAG_Techniques
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
sdv-dev/SDV
Synthetic data generation for tabular data
sdv-dev/TGAN
Generative adversarial training for generating synthetic tabular data.
ProductHired/open-product-management
A curated list of product management advice for technical people.
apache/phoenix
Apache Phoenix
apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
prestodb/prestorials
Tutorials and examples of how to deploy Presto and connect it to different data sources
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
getomni-ai/zerox
PDF to Markdown with vision models
google-research/google-research
Google Research
soumilshah1995/hudi-trino-integeration-guide
hudi-trino-integeration-guide
soumilshah1995/trino-k8-locally
trino-k8-locally
unionai-oss/pandera
A light-weight, flexible, and expressive statistical data testing library
trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
huggingface/blog
Public repo for HF blog posts
hendrycks/test
Measuring Massive Multitask Language Understanding | ICLR 2021
meta-llama/llama3
The official Meta Llama 3 GitHub site
Netflix/metaflow
Open Source Platform for developing, scaling and deploying serious ML, AI, and data science systems
openai/openai-openapi
OpenAPI specification for the OpenAI API
databricks/terraform-databricks-mlops-azure-project-with-sp-creation
This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Azure Databricks staging and prod workspaces. It also creates the relevant Azure Active Directory (AAD) applications for the service principals.
databricks/mlops-stacks
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best practices out of the box.
coteditor/CotEditor
Lightweight Plain-Text Editor for macOS
dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
superlinked/VectorHub
VectorHub is a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector retrieval to their ML stack.
ItzCrazyKns/Perplexica
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
alok-abhishek/Quantizing-LLMs-and-inferencing-Quantized-model-from-HF
This repo contains a Colab notebook utility to help quantize a large language model and store it on the Hugging Face Hub. The notebook also contains code to run inference using the quantized model.
apache/nifi
Apache NiFi
alok-abhishek/User_Review_Analysis_using_App_store_reviews
This repository houses a Python tool designed to analyze customer reviews from the Apple App Store and Google Play Store in order to build user empathy and surface insights.
sodadata/soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io