Mihaiii

Romania

Mihaiii's Stars

serp-ai/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
Language: Jupyter Notebook · 3.2k stars · 429 forks
Constellate-AI/voice-chat
Chat with AI using whisper, LLMs, and TTS
Language: TypeScript · 213
sandrohanea/whisper.net
Whisper.net. Speech to text made simple using Whisper Models
Language: C# · 59192
comet-ml/opik
Open-source end-to-end LLM Development Platform
Language: Java · 2.5k stars · 152 forks
PABannier/bark.cpp
Suno AI's Bark model in C/C++ for fast text-to-speech generation
Language: C++ · 74161
PatrickJS/awesome-cursorrules
📄 A curated list of awesome .cursorrules files
3.2k stars · 173 forks
zml/zml
High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild
Language: Zig · 1.7k stars · 60 forks
luckyrobots/luckyrobots
We are on a mission to make robotics available to regular software engineers by decoupling it from ROS and physical hardware.
Language: Python · 957
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
Language: Python · 4.6k stars · 677 forks
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Language: Python · 2.6k stars · 179 forks
zhangfaen/finetune-Qwen2-VL
Language: Jupyter Notebook · 22921
RayFernando1337/MLX-Auto-Subtitled-Video-Generator
Generate accurate transcripts using Apple's MLX framework
Language: Python · 33230
flipperdevices/flipperzero-firmware
Flipper Zero firmware source code
Language: C · 13k stars · 2.8k forks
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
Language: Swift · 4k stars · 338 forks
PragmaticMachineLearning/docai
Structured information extraction from documents
Language: Python · 28627
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Language: Python · 3.6k stars · 215 forks
feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
Language: Python · 1.6k stars · 123 forks
merveenoyan/smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
Language: Jupyter Notebook · 92088
JUSTSUJAY/nlp-zero-to-hero
NLP Zero to Hero in just 10 Kernels
Language: Jupyter Notebook · 52569
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
Language: Python · 4.5k stars · 395 forks
1mrat/cursor
Repo of cursor prompts
21317
AIHawk-FOSS/Auto_Jobs_Applier_AI_Agent
Auto_Jobs_Applier_AI_Agent aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in an automated and personalized way.
Language: Python · 22.7k stars · 3.4k forks
njucckevin/SeeClick
The model, data and code for the visual GUI Agent SeeClick
Language: HTML · 23512
showlab/GUI-Narrator
Repository of GUI Action Narrator
Language: JavaScript · 5
microsoft/UICaption
We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This dataset was used to pre-train the Lexi model which provides a generic representation of UI screens and their components.
Language: Python · 374
mihaidobrescu1111/guess_the_word
Language: Python · 1
AMAAI-Lab/MidiCaps
A large-scale dataset of caption-annotated MIDI files.
Language: Python · 491
bytedance/GiantMIDI-Piano
Language: Python · 1.7k stars · 177 forks
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.
Language: Python · 20.3k stars · 1.4k forks
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language: Python · 20.2k stars · 2k forks
