gdabas's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
microsoft/markitdown
Python tool for converting files and office documents to Markdown.
unslothai/unsloth
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
mem0ai/mem0
The Memory layer for AI Agents
modelcontextprotocol/servers
Model Context Protocol Servers
VikParuchuri/marker
Convert PDF to markdown + JSON quickly with high accuracy
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—language models
agno-agi/agno
Agno is a lightweight library for building Multimodal Agents. It exposes LLMs as a unified API and gives them superpowers like memory, knowledge, tools and reasoning.
black-forest-labs/flux
Official inference repo for FLUX.1 models
CopilotKit/CopilotKit
React UI + elegant infrastructure for AI Copilots, AI chatbots, and in-app AI agents. The Agentic last-mile 🪁
OpenTalker/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
anthropics/anthropic-quickstarts
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
KoljaB/RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
QwenLM/Qwen2.5-Coder
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
katanaml/sparrow
Data processing with ML, LLM and Vision LLM
Picovoice/porcupine
On-device wake word detection powered by deep learning
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
NirDiamant/Prompt_Engineering
This repository offers a comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies. It serves as an essential resource for mastering the art of effectively communicating with and leveraging large language models in AI applications.
NovaSky-AI/SkyThought
Sky-T1: Train your own O1 preview model within $450
langchain-ai/langgraph-studio
Desktop app for prototyping and debugging LangGraph applications locally.
NVIDIA/nv-ingest
NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
svpino/alloy-voice-assistant
THUDM/LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
hwchase17/langgraph-engineer
docker/mcp-servers
Model Context Protocol Servers
jbarnes850/deepseek-r1-finetune
A step by step guide to fine-tuning the DeepSeek R1 Distilled models on Apple Silicon machines.
ssc-dsai/canchat-v2
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
gdabas/PowerPrompter