takenpeanut
Second-year Ph.D. student at PKU, currently focusing on RL & large models
Peking University, Beijing, China
takenpeanut's Stars
deepseek-ai/DeepSeek-V3
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
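Label Studio exports annotations as JSON with a task → annotations → result → value nesting. A minimal sketch of reading classification labels from such an export, assuming a simple "choices" labeling config (the exact schema varies with your project setup):

```python
import json

# Hypothetical minimal export; real Label Studio exports carry more fields,
# but classification results follow this task -> annotations -> result shape.
export = json.loads("""
[
  {
    "data": {"text": "a cat on a mat"},
    "annotations": [
      {"result": [{"type": "choices",
                   "value": {"choices": ["animal"]}}]}
    ]
  }
]
""")

# Flatten every chosen label across all tasks and annotators.
labels = [
    choice
    for task in export
    for ann in task["annotations"]
    for item in ann["result"]
    for choice in item["value"]["choices"]
]
# labels == ['animal']
```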
GengzeZhou/NavGPT-2
[ECCV 2024] Official implementation of NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
JeremyLinky/YouTube-VLN
[ICCV'23] Learning Vision-and-Language Navigation from YouTube Videos
peteanderson80/Matterport3DSimulator
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
jacobkrantz/VLN-CE
Vision-and-Language Navigation in Continuous Environments using Habitat
Dantong88/LLARVA
LLaVA-VL/LLaVA-NeXT
mu-cai/TemporalBench
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
RenShuhuai-Andy/TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
joez17/VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
Vision-CAIR/LongVU
RUCAIBox/POPE
The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
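POPE casts hallucination evaluation as binary yes/no questions about object presence. A minimal sketch of the usual scoring (accuracy, precision, recall, F1, yes-ratio), assuming normalized "yes"/"no" answers; this simplifies the paper's protocol and is not the official script:

```python
def pope_scores(preds, labels):
    """preds/labels: lists of 'yes'/'no' strings; 'yes' = object present."""
    pairs = list(zip(preds, labels))
    tp = sum(p == "yes" and l == "yes" for p, l in pairs)
    fp = sum(p == "yes" and l == "no" for p, l in pairs)
    tn = sum(p == "no" and l == "no" for p, l in pairs)
    fn = sum(p == "no" and l == "yes" for p, l in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # High yes-ratio on balanced questions signals a hallucination bias.
        "yes_ratio": (tp + fp) / len(pairs),
    }
```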
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
open-compass/MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
ScanNet/ScanNet
facebookresearch/open-eqa
OpenEQA: Embodied Question Answering in the Era of Foundation Models
feizc/Cleaned-Webvid
Strategies for producing a cleaned version of the WebVid-10M dataset
rohitrango/automatic-watermark-detection
Project for Digital Image Processing
NJU-PCALab/OpenVid-1M
boomb0om/watermark-detection
Model for watermark classification implemented with PyTorch
OpenGVLab/unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
hkchengrex/XMem
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
z-x-yang/Segment-and-Track-Anything
An open-source project for tracking and segmenting arbitrary objects in videos, either automatically or interactively. It combines the Segment Anything Model (SAM) for key-frame segmentation with Associating Objects with Transformers (AOT) for efficient tracking and propagation.
nltk/nltk
NLTK Source
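A quick taste of NLTK's tokenizers. This sketch uses the rule-based `TreebankWordTokenizer`, chosen here because it needs no `nltk.download()` of corpora or models:

```python
from nltk.tokenize import TreebankWordTokenizer

# Rule-based Penn Treebank tokenization: splits off trailing punctuation,
# no data download required.
tokenizer = TreebankWordTokenizer()
tokens = tokenizer.tokenize("NLTK splits text into tokens.")
# tokens == ['NLTK', 'splits', 'text', 'into', 'tokens', '.']
```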
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
xinyu1205/recognize-anything
Strong open-source foundation models for image recognition.
explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
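A minimal spaCy sketch. `spacy.blank("en")` builds a tokenizer-only pipeline, so no trained model (e.g. `en_core_web_sm`) needs to be downloaded for this example:

```python
import spacy

# Blank English pipeline: just the rule-based tokenizer, no statistical
# components and no model download.
nlp = spacy.blank("en")
doc = nlp("spaCy provides fast tokenization.")
tokens = [token.text for token in doc]
# tokens == ['spaCy', 'provides', 'fast', 'tokenization', '.']
```

For tagging, parsing, or NER you would instead load a trained pipeline such as `en_core_web_sm` after downloading it.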