gasoved's Stars
sharkdp/bat
A cat(1) clone with wings.
ultralytics/ultralytics
Ultralytics YOLO11 🚀
microsoft/markitdown
Python tool for converting files and office documents to Markdown.
ajeetdsouza/zoxide
A smarter cd command. Supports all major shells.
Genesis-Embodied-AI/Genesis
A generative world for general-purpose robotics & embodied AI learning.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
vi/websocat
Command-line client for WebSockets, like netcat (or curl) for ws:// with advanced socat-like functions
gcanti/io-ts
Runtime type system for IO decoding/encoding
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
django-oscar/django-oscar
Domain-driven e-commerce for Django
pawelsalawa/sqlitestudio
A free, open source, multi-platform SQLite database manager.
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR), and text-to-speech (TTS).
microsoft/LLMLingua
[EMNLP'23, ACL'24] Speeds up LLM inference and improves the LLM's perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.
microsoft/presidio
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
magic-quill/MagicQuill
Official implementation of the paper "MagicQuill: An Intelligent Interactive Image Editing System"
PintaProject/Pinta
Simple GTK# Paint Program
gpt-omni/mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Standard-Intelligence/hertz-dev
First base model for full-duplex conversational audio
huggingface/smollm
Everything about the SmolLM & SmolLM2 family of models
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
edwko/OuteTTS
Interface for OuteTTS models.
lhl/voicechat2
Local SRT/LLM/TTS Voicechat
mozilla/mozilla-django-oidc
A Django OpenID Connect library
huggingface/meshgen
A Blender add-on for generating meshes with AI
digidem/leaflet-side-by-side
A Leaflet control to add a split screen to compare two map overlays
ScalingIntelligence/Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
eole-nlp/eole
Open language modeling toolkit based on PyTorch