isfuku's Stars
google-research-datasets/Synthetic-Persona-Chat
The Synthetic-Persona-Chat dataset is a synthetically generated persona-based dialogue dataset. It extends the original Persona-Chat dataset.
Azure-Samples/azure-search-openai-demo
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
stoyan-stoyanov/llmflows
LLMFlows - Simple, Explicit and Transparent LLM Apps
0hq/tinyvector
A tiny nearest-neighbor embedding database built with SQLite and PyTorch. (In development!)
aakash222/text-segmentation-NLP
sidpalas/devops-directive-terraform-course
Companion repo for complete Terraform course
AnswerDotAI/RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
snowfort-ai/awesome-llm-webapps
A collection of open source, actively maintained web apps for LLM applications
facebookresearch/nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Lightning-AI/lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
damianavila/RISE
RISE: "Live" Reveal.js Jupyter/IPython Slideshow Extension
NetAppEMEA/kubernetes-netapp
Sample projects and examples on using NetApp technologies with Kubernetes
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
shafiab/HashtagCashtag
My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture that aggregates Twitter and US stock market data for user sentiment analysis using open source tools: Apache Kafka for data ingestion, Apache Spark & Spark Streaming for batch & real-time processing, Apache Cassandra for storage, and Flask, Bootstrap and HighCharts for the frontend.
gTile/gTile
A window tiling extension for GNOME.
fastapi/sqlmodel
SQL databases in Python, designed for simplicity, compatibility, and robustness.
dswah/pyGAM
[HELP REQUESTED] Generalized Additive Models in Python
cookiecutter/cookiecutter
A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
laylacomparin/CheatSheets
This repo contains all the cheatsheets you need to keep handy. I will add more soon.
omnata-labs/dbt-ml-preprocessing
A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
vecxoz/vecstack
Python package for stacking (machine learning technique)
mindsdb/mindsdb
Platform for building AI that can learn and answer questions over federated data.
MBrouns/timeseers
Time should be taken seer-iously
ipeaGIT/geobr
Easy access to official spatial data sets of Brazil in R and Python
geopy/geopy
Geocoding library for Python.
kedro-org/kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
vvaezian/metabase_api_python
A Python wrapper for the Metabase API
Cheneth/coup-online
An online port of the card game 'Coup'