ThomasJanssoone

ThomasJanssoone's Stars

NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python, 12.9k stars, 2.6k forks
PKU-YuanGroup/LLaVA-CoT
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Language: Python, 1.7k stars, 64 forks
SylphAI-Inc/LLM-engineer-handbook
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
2.5k stars, 301 forks
openhuman-ai/awesome-gesture_generation
Awesome Gesture Generation
1705
lalanikarim/webrtc-ai-voice-chat
A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.
Language:Python11824
jiawen-zhu/TrackGPT
Tracking with Human-Intent Reasoning
672
andresprados/SPIGA
SPIGA: Shape Preserving Facial Landmarks with Graph Attention Networks.
Language:Jupyter Notebook31134
sotopia-lab/sotopia
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
Language:Python17623
Majdoddin/nlp
Language:Jupyter Notebook46457
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
Language:Jupyter Notebook821104
shankarpandala/lazypredict
Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning
Language: Python, 3.1k stars, 350 forks
boreshkinai/delta-interpolator
Language:Python7511
capjamesg/visionscript
A high-level programming language for using computer vision.
Language:Python34418
smazzanti/mrmr
mRMR (minimum-Redundancy-Maximum-Relevance) for automatic feature selection at scale.
Language:Python54580
jasonppy/syllable-discovery
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Language:Python316
plamere/SmarterPlaylists
web app for creating sophisticated playlists
Language:Python39758
pliang279/MultiViz
[ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models
Language:Python945
jrin771/Everything-LLMs-And-Robotics
The world's largest GitHub Repository for LLMs + Robotics
79758
marl/pysox
Python wrapper around sox.
Language:Python51781
telepathylabsai/OpenDF
Code to reproduce LREC Paper Simplifying Semantic Annotations of SMCalFlow
Language:Python256
syhw/wer_are_we
Attempt at tracking states of the arts and recent results (bibliography) on speech recognition.
1.9k stars, 226 forks
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
Language: Jupyter Notebook, 9.6k stars, 861 forks
facebookresearch/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook, 48.5k stars, 5.7k forks
tomas-gajarsky/facetorch
Python library for analysing faces using PyTorch
Language:Python52546
chiragraman/OpenFace
OpenFace – a state-of-the-art open source tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
Language:C++2
mpc001/Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
Language:Python36957
datitran/face2face-demo
pix2pix demo that learns from facial landmarks and translates this into a face
Language: Python, 1.4k stars, 421 forks
ggerganov/llama.cpp
LLM inference in C/C++
Language: C++, 70.7k stars, 10.2k forks
facebookresearch/TalkingWithHands32M
Talking with Hands
8812
microsoft/PromptCraft-Robotics
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
Language: Python, 1.9k stars, 208 forks