satolab12
I work in research and development at a telecommunications company. My research focuses on computer vision and continual learning. Qiita: https://qiita.com/satolab Note: the programs I publish are personal work and do not represent the organizations or companies I am affiliated with.
Japan
satolab12's Stars
LLaVA-VL/LLaVA-NeXT
okankop/vidaug
Effective Video Augmentation Techniques for Training Convolutional Neural Networks
VisionLearningGroup/SSDA_MME
Semi-supervised Domain Adaptation via Minimax Entropy
ayushtues/ADDA_pytorch
PyTorch implementation of Adversarial Discriminative Domain Adaptation
erictzeng/adda
fungtion/DANN
PyTorch implementation of Domain-Adversarial Training of Neural Networks
adapt-python/adapt
Awesome Domain Adaptation Python Toolbox
tim-learn/DINE
Code for our CVPR 2022 paper "DINE: Domain Adaptation from Single and Multiple Black-box Predictors"
tim-learn/awesome-test-time-adaptation
Collection of awesome test-time (domain/batch/instance) adaptation methods
vishal-siddegowda/CrowdAnomalyDetection_DeepLearning
In recent years, machine learning and deep learning have driven key advances in anomaly detection, especially in crowds. This progress has greatly improved the detection of suspicious or abnormal activity such as robbery, vandalism, stampedes, and road rage. This study employs transfer learning because of its proven efficiency in training on small datasets while still producing promising results. Transfer learning is used to train on the datasets with pre-trained VGGNet-19 and VGGNet-16 networks, which extract human motion features from RGB video data. The designed model is evaluated on two datasets, UMN (University of Minnesota) and the WEB crowd dataset. Analysing the experimental results, VGGNet-19 achieves approximately 98% on the UMN data, while VGGNet-16 achieves approximately 58% on the WEB dataset.
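A minimal sketch (not taken from the repository above) of the transfer-learning feature extraction it describes: RGB video frames are passed through an ImageNet-pretrained VGGNet-19 whose convolutional layers are reused as a fixed feature extractor. The layer cut-off, input size, and function name are assumptions for illustration only.

```python
# Sketch: reuse a pre-trained VGG-19 as a frozen feature extractor for video frames.
import torch
from torchvision import models, transforms

# Load ImageNet-pretrained VGG-19 and keep only the convolutional backbone + pooling.
vgg19 = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
feature_extractor = torch.nn.Sequential(vgg19.features, vgg19.avgpool, torch.nn.Flatten())
feature_extractor.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(frames):
    """frames: list of PIL RGB video frames -> (N, 25088) feature tensor."""
    batch = torch.stack([preprocess(f) for f in frames])
    return feature_extractor(batch)
```

The extracted features could then be fed to a small classifier head trained on the target crowd dataset, which is the usual transfer-learning setup for small datasets.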
v-iashin/video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
yyuanad/Pytorch_C3D_Feature_Extractor
PyTorch C3D feature extractor
RupertLuo/Valley
The official repository of "Video assistant towards large language model makes everything easy"
seqam-lab/DMVAE
xaggi/claws_eccv
Project page for the 'CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection', ECCV 2020 paper.
xiaobai1217/Awesome-Video-Datasets
Video datasets
TeCSAR-UNCC/CHAD
yahoojapan/ja-vg-vqa
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
zhuyiche/llava-phi
bdytx5/finetune_LLaVA
tosiyuki/LLaVA-JP
LLaVA-JP is a Japanese VLM trained with the LLaVA method
fesvhtr/CUVA
[CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly"
zackschen/CoIN
Instruction Tuning in the Continual Learning paradigm
wyzjack/Awesome-XAD
Paper and dataset summary for the paper "Explainable Anomaly Detection in Images and Videos: A Survey"
longtanle/awesome-federated-LLM-learning
This is a collection of research papers on Federated Learning for Large Language Models (FedLLM). The repository is continuously updated to track the frontier of FedLLM.
lucazanella/lavad
Official implementation of "Harnessing Large Language Models for Training-free Video Anomaly Detection", CVPR 2024
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
pipixin321/HolmesVAD
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"