ThomasJanssoone's Stars
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
PKU-YuanGroup/LLaVA-CoT
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
SylphAI-Inc/LLM-engineer-handbook
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
openhuman-ai/awesome-gesture_generation
Awesome Gesture Generation
lalanikarim/webrtc-ai-voice-chat
A WebRTC server that lets you interact with an LLM using your speech and responds with generated audio.
jiawen-zhu/TrackGPT
Tracking with Human-Intent Reasoning
andresprados/SPIGA
SPIGA: Shape Preserving Facial Landmarks with Graph Attention Networks.
sotopia-lab/sotopia
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
Majdoddin/nlp
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
shankarpandala/lazypredict
Lazy Predict helps build a lot of basic models without much code and helps you understand which models work better without any parameter tuning.
boreshkinai/delta-interpolator
capjamesg/visionscript
A high-level programming language for using computer vision.
smazzanti/mrmr
mRMR (minimum-Redundancy-Maximum-Relevance) for automatic feature selection at scale.
jasonppy/syllable-discovery
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
plamere/SmarterPlaylists
web app for creating sophisticated playlists
pliang279/MultiViz
[ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models
jrin771/Everything-LLMs-And-Robotics
The world's largest GitHub Repository for LLMs + Robotics
marl/pysox
Python wrapper around sox.
telepathylabsai/OpenDF
Code to reproduce the LREC paper "Simplifying Semantic Annotations of SMCalFlow"
syhw/wer_are_we
An attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
tomas-gajarsky/facetorch
Python library for analysing faces using PyTorch
chiragraman/OpenFace
OpenFace – a state-of-the-art open source tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
mpc001/Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
datitran/face2face-demo
pix2pix demo that learns from facial landmarks and translates them into a face
ggerganov/llama.cpp
LLM inference in C/C++
facebookresearch/TalkingWithHands32M
Talking with Hands
microsoft/PromptCraft-Robotics
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration