satolab12
I work in research and development at a telecommunications company. My research focuses on computer vision and continual learning. Qiita: https://qiita.com/satolab Note: the programs I publish are personal work and do not represent the organizations or companies I am affiliated with.
Japan
satolab12's Stars
LLaVA-VL/LLaVA-NeXT
okankop/vidaug
Effective Video Augmentation Techniques for Training Convolutional Neural Networks
VisionLearningGroup/SSDA_MME
Semi-supervised Domain Adaptation via Minimax Entropy
ayushtues/ADDA_pytorch
PyTorch implementation of Adversarial Discriminative Domain Adaptation
erictzeng/adda
fungtion/DANN
PyTorch implementation of Domain-Adversarial Training of Neural Networks
adapt-python/adapt
Awesome Domain Adaptation Python Toolbox
tim-learn/DINE
Code for our CVPR 2022 paper "DINE: Domain Adaptation from Single and Multiple Black-box Predictors"
tim-learn/awesome-test-time-adaptation
Collection of awesome test-time (domain/batch/instance) adaptation methods
vishal-siddegowda/CrowdAnomalyDetection_DeepLearning
In recent years, machine learning and deep learning have driven key advances in anomaly detection, especially in crowds. This progress has greatly improved the detection of suspicious or abnormal activity such as robbery, vandalism, stampedes, and road rage. This study employs transfer learning because of its proven efficiency in training on small datasets while still producing promising results. Transfer learning is used to train on the datasets with pre-trained VGGNet-19 and VGGNet-16 networks, which extract human motion features from RGB video data. The designed model is evaluated on two datasets, UMN (University of Minnesota) and the WEB crowd dataset. Analysing the experimental results, VGGNet-19 achieves approximately 98% on the UMN data, while VGGNet-16 achieves approximately 58% on the WEB dataset.
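A minimal sketch (not taken from the repository above) of the transfer-learning feature extraction it describes: RGB video frames are passed through an ImageNet-pretrained VGGNet-19 whose convolutional layers are reused as a fixed feature extractor. The layer cut-off, input size, and function name are assumptions for illustration only.

```python
# Sketch: reuse a pre-trained VGG-19 as a frozen feature extractor for video frames.
import torch
from torchvision import models, transforms

# Load ImageNet-pretrained VGG-19 and keep only the convolutional backbone + pooling.
vgg19 = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
feature_extractor = torch.nn.Sequential(vgg19.features, vgg19.avgpool, torch.nn.Flatten())
feature_extractor.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(frames):
    """frames: list of PIL RGB video frames -> (N, 25088) feature tensor."""
    batch = torch.stack([preprocess(f) for f in frames])
    return feature_extractor(batch)
```

The extracted features could then be fed to a small classifier head trained on the target crowd dataset, which is the usual transfer-learning setup for small datasets.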
v-iashin/video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
yyuanad/Pytorch_C3D_Feature_Extractor
PyTorch C3D feature extractor
RupertLuo/Valley
The official repository of "Video assistant towards large language model makes everything easy"
seqam-lab/DMVAE
xaggi/claws_eccv
Project page for the 'CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection', ECCV 2020 paper.
xiaobai1217/Awesome-Video-Datasets
Video datasets
TeCSAR-UNCC/CHAD
yahoojapan/ja-vg-vqa
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
zhuyiche/llava-phi
bdytx5/finetune_LLaVA
tosiyuki/LLaVA-JP
LLaVA-JP is a Japanese VLM trained with the LLaVA method
fesvhtr/CUVA
[CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly"
zackschen/CoIN
Instruction Tuning in the Continual Learning paradigm
wyzjack/Awesome-XAD
Paper and dataset summary for the paper "Explainable Anomaly Detection in Images and Videos: A Survey"
longtanle/awesome-federated-LLM-learning
This is a collection of research papers on Federated Learning for Large Language Models (FedLLM). The repository is continuously updated to track the frontier of FedLLM.
lucazanella/lavad
Official implementation of "Harnessing Large Language Models for Training-free Video Anomaly Detection", CVPR 2024
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
pipixin321/HolmesVAD
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"