mnky4a6p's Stars
genmoai/mochi
The best OSS video generation models
phidatahq/phidata
Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
usefulsensors/moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
microsoft/SynthMoCap
SynthMoCap Datasets
NachiketGadekar1/browserllama
Browser extension that lets you summarize and chat with any webpage using a local LLM of your choice.
robustsam/RobustSAM
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight)
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
adarshb3/Virtual-Try-On-Application-using-Flask-Twilio-and-Gradio
This repository contains the code for a virtual try-on application built using Flask, Twilio's WhatsApp API, and Gradio's virtual try-on model. Users can send images via WhatsApp to try on garments virtually, and the results are sent back to them.
microsoft/BitNet
Official inference framework for 1-bit LLMs
mozilla/TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
mediar-ai/screenpipe
Library and platform to build, distribute, and monetize AI apps that have the full context (like Rewind, Granola, etc.). Open source, 100% local, developer friendly. 24/7 screen, mic, and keyboard recording and control.
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
jabowery/HumesGuillotine
Hume's Guillotine: Beheading the social pseudo-sciences with the Algorithmic Information Criterion for CAUSAL model selection.
souzatharsis/podcastfy
An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
souzatharsis/podcastfy-demo
podcastfy.ai gradio demo app
ArmastusChen/inverse_painting
Inverse Painting: Reconstructing The Painting Process (SIGGRAPH ASIA 2024)
neural-maze/agentic_patterns
Implementing the 4 agentic patterns from scratch
xjdr-alt/entropix
Entropy Based Sampling and Parallel CoT Decoding
SendWithSES/Drag-and-Drop-Email-Designer
Free, open source, HTML email template editor and no code designer.
dheerajrhegde/servicedesk_langgraph_tavily
feder-cr/Jobs_Applier_AI_Agent
Auto_Jobs_Applier_AI_Agent aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
sentient-engineering/sentient
The framework/SDK that lets you build browser-controlling agents in 3 lines of code. Join the chat at https://discord.gg/umgnyQU2K8
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
dynamiccreator/voice-text-reader
Real-time TTS reading of large text files in your favourite voice, plus translation via LLM (Python script)
RexanWONG/text-behind-image
https://textbehindimage.rexanwong.xyz - create text behind image designs easily
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
baaivision/Emu3
Next-Token Prediction is All You Need
google/sbsim
tungbq/devops-basics
🚀 Practical, documented place for the DevOps toolchain
aceinnolab/Inkycal
Create awesome e-paper dashboards within minutes! Modularity? Check! Python3? Check! Works on Raspberry Pi Zero W? Check! Support for own modules? Check!