The PHEVA dataset is a pioneering resource designed to advance research in Video Anomaly Detection (VAD) by addressing key challenges related to privacy, ethical concerns, and the complexity of human behavior in video data. PHEVA is the largest continuously recorded VAD dataset, providing comprehensive, de-identified human annotations across diverse indoor and outdoor scenes. This repository provides two distinct data settings: conventional training and continual learning. You can find the paper at the following link: PHEVA: A Privacy-preserving Human-centric Video Anomaly Detection Dataset.
PHEVA’s individual anomalies include throwing, hands up, lying down, and falling. In group situations, anomalies include punching, kicking, pushing, pulling, hitting with an object, and strangling. Several segmented examples are shown below.
- Example 1: Slapping
- Example 2: Kicking
- Example 3: Falling
- Example 4: Pushing
- Privacy-Preserving: PHEVA only includes de-identified human annotations, removing all pixel information to safeguard privacy.
- Large-Scale Data: Over 5 million frames with pose annotations, offering more than 5× the training frames and 4× the testing frames compared to previous datasets.
- Context-Specific Scenarios: Includes a novel context-specific camera dedicated to law enforcement and security personnel training, allowing for the evaluation of models in highly specialized environments.
- Continual Learning: PHEVA supports benchmarks for continual learning, bridging the gap between conventional training and real-world deployment.
Figure 1: The camera views in the PHEVA dataset.
Dataset | Total Frames | Training Frames | Testing Frames | Normal Frames | Anomalous Frames | Scenes | Cameras |
---|---|---|---|---|---|---|---|
PHEVA | 5,196,675 | 4,467,271 | 729,404 | 517,286 | 212,118 | 7 | 7 |
SHT | 295,495 | 257,650 | 37,845 | 21,141 | 16,704 | 13 | 13 |
IITB | 459,341 | 279,880 | 179,461 | 71,316 | 108,145 | 1 | 1 |
CHAD | 922,034 | 802,167 | 119,867 | 60,969 | 58,898 | 1 | 4 |
Table 1: Statistical comparison of PHEVA with major VAD datasets.
To download the annotations, anomaly labels, and splits, please use the following link:
Each video has its own dedicated annotation file in .pkl format.
The file names follow this pattern:
`<camera_number>_<video_number>.pkl`
The camera number ranges from 0 to 6, with camera 6 being the context-specific camera (CSC). The video number is the index of the video from that specific camera.
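For example, here is a small sketch for recovering both numbers from a file name (the helper `parse_annotation_name` and the example path are ours, not part of the dataset tooling):

```python
from pathlib import Path

def parse_annotation_name(path):
    # File names follow <camera_number>_<video_number>.pkl
    camera_number, video_number = Path(path).stem.split('_')
    return int(camera_number), int(video_number)

# Hypothetical file: camera 6 (CSC), video 3
print(parse_annotation_name('PHEVA/annotations/test/6_3.pkl'))
```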
The annotation files contain a dictionary with the following format:
```python
{
    "Frame_number": {
        "Person_ID": [array([Bounding_Box]), array([Keypoints])]
    }
}
```
Bounding boxes are in XYWH format, and keypoints are in XYC format, where X and Y are coordinates, W is width, H is height, and C is confidence.
You can use the following code snippet to read the pickle files:

```python
import pickle

# Open the pickle file for reading
with open('PHEVA/annotations/test/file.pkl', 'rb') as f:
    # Load the contents of the file into a dictionary
    my_dict = pickle.load(f)

# Print the dictionary to verify that it has been loaded correctly
print(my_dict)
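```

Building on the structure above, here is a minimal sketch of traversing the loaded dictionary (it assumes `my_dict` from the snippet above; the exact array shapes are our assumption):

```python
import numpy as np

# Walk the nested dictionary: frames, then people detected in each frame.
for frame_number, people in my_dict.items():
    for person_id, (bounding_box, keypoints) in people.items():
        # Bounding box is XYWH: x, y, width, height.
        x, y, w, h = np.asarray(bounding_box).flatten()[:4]
        # Keypoints are XYC triplets: x, y, confidence per joint.
        kps = np.asarray(keypoints).reshape(-1, 3)
        print(f"frame {frame_number}, person {person_id}: "
              f"box=({x}, {y}, {w}, {h}), {len(kps)} keypoints")
```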
Anomaly labels are in .npy format.
They follow exactly the same naming pattern, with one file per video.
Each file is an array of 0s and 1s whose length equals the number of frames in the video: 0 means the frame is normal, and 1 means it is anomalous.
You can use the following code snippet to load the files:

```python
import numpy as np

# Load the per-frame anomaly labels from the .npy file
data = np.load('file.npy')

# Print to see the data
print(data)
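```

Because the labels are per-frame 0/1 values, summary statistics follow directly; a short sketch (the file name is a placeholder, as above):

```python
import numpy as np

labels = np.load('file.npy')        # placeholder path
num_anomalous = int(labels.sum())   # 1 marks an anomalous frame
anomaly_ratio = labels.mean()       # fraction of anomalous frames
print(f"{num_anomalous}/{len(labels)} frames anomalous ({anomaly_ratio:.2%})")
```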
We benchmarked several State-of-the-Art (SotA) pose-based VAD models on the PHEVA dataset:
Model | AUC-ROC (%) | AUC-PR (%) | EER | 10ER |
---|---|---|---|---|
MPED-RNN | 76.05 | 42.83 | 0.28 | 0.49 |
GEPC | 62.25 | 28.62 | 0.41 | 0.67 |
STG-NF | 57.57 | 83.77 | 0.46 | 0.90 |
TSGAD | 68.00 | 34.61 | 0.36 | 0.64 |
Table 2: Benchmarking of SotA pose-based models on PHEVA.
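For reference, here is a minimal sketch (not the authors' evaluation code) of how AUC-ROC, AUC-PR, and EER can be computed from per-frame anomaly scores with scikit-learn; the label and score files are placeholders:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, roc_curve

# Placeholder inputs: per-frame ground-truth labels (0/1) and anomaly scores.
labels = np.load('labels.npy')
scores = np.load('scores.npy')

auc_roc = roc_auc_score(labels, scores)
auc_pr = average_precision_score(labels, scores)

# EER is the operating point where the false positive rate equals
# the false negative rate (1 - TPR) on the ROC curve.
fpr, tpr, _ = roc_curve(labels, scores)
eer = fpr[np.nanargmin(np.abs(fpr - (1 - tpr)))]

print(f"AUC-ROC={auc_roc:.4f}, AUC-PR={auc_pr:.4f}, EER={eer:.4f}")
```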
Less than 1% of the training data is anomalous, mimicking real-world deployment. The test set is curated to be approximately balanced, with a roughly 1:1 ratio of normal to anomalous frames, making metrics such as AUC-ROC and AUC-PR more informative.
Camera | Train Total | Train Normal | Train Anomalous | Train Anomaly % | Test Total | Test Normal | Test Anomalous | Test Anomaly % |
---|---|---|---|---|---|---|---|---|
C0 | 487,835 | 483,220 | 4,615 | 0.95 | 52,145 | 26,093 | 26,052 | 49.96 |
C1 | 796,860 | 791,186 | 5,674 | 0.71 | 57,120 | 28,597 | 28,523 | 49.93 |
C2 | 787,301 | 780,420 | 6,881 | 0.87 | 50,592 | 25,300 | 25,292 | 49.99 |
C3 | 1,260,314 | 1,251,189 | 9,125 | 0.72 | 31,604 | 15,818 | 15,786 | 49.95 |
C4 | 449,686 | 447,918 | 1,768 | 0.39 | 74,482 | 37,274 | 37,208 | 49.95 |
C5 | 690,730 | 686,435 | 4,295 | 0.62 | 56,621 | 28,353 | 28,268 | 49.92 |
CSC | 558,492 | 555,223 | 3,269 | 0.58 | 56,644 | 28,343 | 28,301 | 49.96 |
Table 3: Statistics of the continual learning train and test sets per camera.
If you use PHEVA in your research, please cite our paper:
```bibtex
@article{noghre2024pheva,
  title={PHEVA: A Privacy-preserving Human-centric Video Anomaly Detection Dataset},
  author={Ghazal Alinezhad Noghre and Shanle Yao and Armin Danesh Pazho and Babak Rahimi Ardabili and Vinit Katariya and Hamed Tabkhi},
  journal={arXiv},
  year={2024},
}
```
For any questions or support, please contact the authors at galinezh@charlotte.edu.