PVC-GAN

Partial Multi-view Clustering via Consistent GAN

This repo contains the source code and dataset for our ICDM paper:

Qianqian Wang, Zhengming Ding, Zhiqiang Tao, Quanxue Gao and Yun Fu, Partial Multi-view Clustering via Consistent GAN, IEEE ICDM, 2018: 1290-1295.

paper link

@inproceedings{wang2018partial,
  title={Partial multi-view clustering via consistent GAN},
  author={Wang, Qianqian and Ding, Zhengming and Tao, Zhiqiang and Gao, Quanxue and Fu, Yun},
  booktitle={IEEE ICDM},
  pages={1290--1295},
  year={2018},
}

This model has been extended for generative partial multi-view clustering in our TIP paper:

Qianqian Wang, Zhengming Ding, Zhiqiang Tao, Quanxue Gao, Yun Fu, Generative Partial Multi-View Clustering with Adaptive Fusion and Cycle Consistency, IEEE Transactions on Image Processing, 2021.

PVC-GAN Model:

Figure 1: Network Model

Introduction

With the development of data collection and feature extraction methods, multi-view data are now easy to obtain. Multi-view clustering, one of the most important methods for analyzing such data, has been widely used in many real-world applications. Most existing multi-view clustering methods perform well under the assumption that every sample appears in all views. In real-world applications, however, each view may suffer from missing data due to noise or equipment malfunction. In this paper, a new consistent generative adversarial network is proposed for partial multi-view clustering. We learn a common low-dimensional representation that can both generate the missing view data and capture a better common structure from partial multi-view data for clustering. Unlike most existing methods, we use the common representation encoded from one view to generate the missing data of the other view with generative adversarial networks, and then feed the result to the encoder and clustering networks. This is intuitive and meaningful because encoding the common representation and generating the missing data mutually promote each other: the common representation learned by the encoder networks helps impute the missing view data, and the generated missing view data in turn help learn a more consistent low-dimensional representation. Experimental results on three different multi-view databases illustrate the superiority of the proposed method.
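To make the pipeline above concrete, here is a minimal PyTorch sketch (written against current PyTorch for readability) of the components the paper describes: a per-view encoder that maps observed data to a common low-dimensional representation, a generator that imputes the missing view from that representation, a discriminator that judges the imputed data, and a clustering network on the common representation. All layer sizes and module names are illustrative assumptions, not the exact configuration in this repo.

import torch
import torch.nn as nn

COMMON_DIM = 64   # assumed size of the common representation
N_CLUSTERS = 5    # e.g., the 5 categories in BDGP

class Encoder(nn.Module):
    """Maps one view to the common low-dimensional representation."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, COMMON_DIM))

    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Generates the missing view from the common representation."""
    def __init__(self, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(COMMON_DIM, 256), nn.ReLU(),
            nn.Linear(256, out_dim))

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Distinguishes real view data from generated (imputed) data."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)

class ClusteringNet(nn.Module):
    """Produces soft cluster assignments from the common representation."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(COMMON_DIM, N_CLUSTERS)

    def forward(self, z):
        return torch.softmax(self.fc(z), dim=1)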



Figure 2: Illustration of partial multi-view data with image and text views


Figure 3: Illustration for multi-view data

Dataset:

BDGP is a two-view database: one view is visual and the other is textual. It contains 2,500 images of Drosophila embryos belonging to 5 categories. Each image is represented by a 1,750-D visual feature vector and a 79-D textual feature vector. In our experiments, we use all data in the BDGP database and evaluate the performance on both the visual and textual features.
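To illustrate the partial-view setting on BDGP, the following sketch simulates missing views by masking a fraction of the samples so that some keep only the visual view and others only the textual view. The .npy file names and the missing ratio are assumptions for illustration; the repo's own data loading may differ.

import numpy as np

# Assumed file names; replace with the actual BDGP data files.
visual = np.load('bdgp_visual.npy')    # expected shape (2500, 1750)
textual = np.load('bdgp_textual.npy')  # expected shape (2500, 79)

missing_ratio = 0.5                    # assumed fraction of partial examples
n = visual.shape[0]
rng = np.random.RandomState(0)
partial_idx = rng.choice(n, size=int(missing_ratio * n), replace=False)

# Boolean masks: True where a view is observed for that sample.
visual_mask = np.ones(n, dtype=bool)
textual_mask = np.ones(n, dtype=bool)

# Half of the partial samples drop the visual view, the rest drop the textual view.
half = len(partial_idx) // 2
visual_mask[partial_idx[:half]] = False
textual_mask[partial_idx[half:]] = False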

Requirements

Python: 3.6.2
PyTorch: 0.1.12
NumPy: 1.13.1
TorchVision: 0.1.8
CUDA: 11.2
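As a quick sanity check that the installed versions match the list above, this convenience snippet (not part of the repo) prints them:

import numpy
import torch
import torchvision

print('PyTorch:', torch.__version__)        # expect 0.1.12
print('NumPy:', numpy.__version__)          # expect 1.13.1
print('TorchVision:', getattr(torchvision, '__version__', 'unknown'))  # expect 0.1.8
print('CUDA available:', torch.cuda.is_available())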

Train the model

python train.py
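For reference, here is a hedged sketch (in current PyTorch style) of one training step combining the ingredients described in the introduction: an adversarial loss on the generated view plus a clustering term on the common representation. It reuses the modules from the architecture sketch above; the loss weight lam and the entropy-style clustering term are assumed stand-ins, not the exact objectives in train.py.

import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(x1, x2, enc1, gen2, disc2, cluster, opt_g, opt_d, lam=0.1):
    # x1: batch of observed view-1 data; x2: batch of real view-2 data.
    z = enc1(x1)            # common representation encoded from view 1
    fake2 = gen2(z)         # imputed (generated) view-2 data

    # Discriminator step: real view-2 vs. generated view-2.
    opt_d.zero_grad()
    real_pred = disc2(x2)
    fake_pred = disc2(fake2.detach())
    d_loss = bce(real_pred, torch.ones_like(real_pred)) + \
             bce(fake_pred, torch.zeros_like(fake_pred))
    d_loss.backward()
    opt_d.step()

    # Encoder/generator step: fool the discriminator, plus a clustering term.
    opt_g.zero_grad()
    fake_pred = disc2(fake2)
    g_loss = bce(fake_pred, torch.ones_like(fake_pred))
    q = cluster(z)                                 # soft cluster assignments
    entropy = -(q * torch.log(q + 1e-8)).sum(dim=1).mean()
    loss = g_loss + lam * entropy                  # lam is an assumed weight
    loss.backward()
    opt_g.step()
    return d_loss.item(), loss.item()

Here opt_d holds the discriminator parameters and opt_g holds the encoder, generator, and clustering-network parameters, following the usual alternating GAN update.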

Acknowledgments

Code is inspired by VIGAN and CycleGAN.