oyly16's Stars
declare-lab/MELD
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
YebowenHu/MeetingBank-utils
yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
meta-llama/llama3
The official Meta Llama 3 GitHub site
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps showcase Meta Llama for WhatsApp & Messenger.
pytorch/torchchat
Run PyTorch LLMs locally on servers, desktops, and mobile devices
IDEA-Research/OpenSeeD
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
aistairc/FineBio
Data and code for the paper "FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation"
psychoinformatics-de/remodnav
Robust Eye Movement Detection for Natural Viewing
ut-vision/S2DHand
ut-vision/ActionVOS
[ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
ErikEkstedt/TurnGPT
TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog
sangmin-git/MMSI
Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)
DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
facebookresearch/VMZ
VMZ: Model Zoo for Video Modeling
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
thuiar/MIntRec
MIntRec: A New Dataset for Multimodal Intent Recognition (ACM MM 2022)
mehdifatan/Egocom-IRI-UPC
ai4r/AIR-Act2Act
SALT-NLP/PersuationGames
[ACL 2023, Findings] Source code for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games"
relh/moves
[CVPR 2023] MOVES: Manipulated Objects in Video Enable Segmentation
postech-ami/SMILE-Dataset
[NAACL'24] Repository for "SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models"
jayleicn/VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
sotopia-lab/sotopia
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
0nutation/SpeechAgents
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems
facebookresearch/EgoCom-Dataset
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
facebookresearch/EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.
CHATS-lab/KokoMind
KokoMind: Can LLMs Understand Social Interactions?