The PHEVA dataset is a pioneering resource designed to advance research in Video Anomaly Detection (VAD) by addressing key challenges related to privacy, ethical concerns, and the complexity of human behavior in video data. PHEVA is the largest continuously recorded VAD dataset, providing comprehensive, de-identified human annotations across diverse indoor and outdoor scenes. This repository provides two distinct data settings: conventional training and continual learning. You can find the paper at the following link: PHEVA: A Privacy-preserving Human-centric Video Anomaly Detection Dataset.
PHEVA’s individual anomalies include throwing, hands up, lying down, and falling. In group situations, anomalies include punching, kicking, pushing, pulling, hitting with an object, and strangling. Several segmented examples are shown below.
- Example 1: Slapping
- Example 2: Kicking
- Example 3: Falling
- Example 4: Pushing
- Privacy-Preserving: PHEVA only includes de-identified human annotations, removing all pixel information to safeguard privacy.
- Large-Scale Data: Over 5 million frames with pose annotations, offering more than 5× the training frames and 4× the testing frames compared to previous datasets.
- Context-Specific Scenarios: Includes a novel context-specific camera dedicated to law enforcement and security personnel training, allowing for the evaluation of models in highly specialized environments.
- Continual Learning: PHEVA supports benchmarks for continual learning, bridging the gap between conventional training and real-world deployment.
Figure 1: The camera views in the PHEVA dataset.
Dataset | Total Frames | Training Frames | Testing Frames | Normal Frames | Anomalous Frames | Scenes | Cameras |
---|---|---|---|---|---|---|---|
PHEVA | 5,196,675 | 4,467,271 | 729,404 | 517,286 | 212,118 | 7 | 7 |
SHT | 295,495 | 257,650 | 37,845 | 21,141 | 16,704 | 13 | 13 |
IITB | 459,341 | 279,880 | 179,461 | 71,316 | 108,145 | 1 | 1 |
CHAD | 922,034 | 802,167 | 119,867 | 60,969 | 58,898 | 1 | 4 |
Table 1: Statistical comparison of PHEVA with major VAD datasets.
To download the annotations, anomaly labels, and splits, please use the following link:
Each video has its own dedicated annotation file in .pkl format.
The file names follow this pattern:
`<camera_number>_<video_number>.pkl`
The camera number ranges from 0 to 6, with camera 6 being the context-specific camera (CSC). The video number is the index of the video from that specific camera.
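For example, here is a small sketch for recovering both numbers from a file name (the helper `parse_annotation_name` and the example path are ours, not part of the dataset tooling):

```python
from pathlib import Path

def parse_annotation_name(path):
    # File names follow <camera_number>_<video_number>.pkl
    camera_number, video_number = Path(path).stem.split('_')
    return int(camera_number), int(video_number)

# Hypothetical file: camera 6 (CSC), video 3
print(parse_annotation_name('PHEVA/annotations/test/6_3.pkl'))
```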
The annotation files contain a dictionary with the following format:
```python
{
    "Frame_number": {
        "Person_ID": [array([Bounding_Box]), array([Keypoints])]
    }
}
```
Bounding boxes are in XYWH format, and keypoints are in XYC format, where X and Y are coordinates, W is width, H is height, and C is confidence.
You can use the following code snippet to read the pickle files:

```python
import pickle

# Open the pickle file for reading
with open('PHEVA/annotations/test/file.pkl', 'rb') as f:
    # Load the contents of the file into a dictionary
    my_dict = pickle.load(f)

# Print the dictionary to verify that it has been loaded correctly
print(my_dict)
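```

Building on the structure above, here is a minimal sketch of traversing the loaded dictionary (it assumes `my_dict` from the snippet above; the exact array shapes are our assumption):

```python
import numpy as np

# Walk the nested dictionary: frames, then people detected in each frame.
for frame_number, people in my_dict.items():
    for person_id, (bounding_box, keypoints) in people.items():
        # Bounding box is XYWH: x, y, width, height.
        x, y, w, h = np.asarray(bounding_box).flatten()[:4]
        # Keypoints are XYC triplets: x, y, confidence per joint.
        kps = np.asarray(keypoints).reshape(-1, 3)
        print(f"frame {frame_number}, person {person_id}: "
              f"box=({x}, {y}, {w}, {h}), {len(kps)} keypoints")
```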
Anomaly labels are in .npy format.
They follow exactly the same naming pattern, with one file per video.
Each file is an array of 0s and 1s whose length equals the number of frames in the video: 0 means the frame is normal, and 1 means it is anomalous.
You can use the following code snippet to load the files:

```python
import numpy as np

# Load the per-frame anomaly labels from the .npy file
data = np.load('file.npy')

# Print to see the data
print(data)
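```

Because the labels are per-frame 0/1 values, summary statistics follow directly; a short sketch (the file name is a placeholder, as above):

```python
import numpy as np

labels = np.load('file.npy')        # placeholder path
num_anomalous = int(labels.sum())   # 1 marks an anomalous frame
anomaly_ratio = labels.mean()       # fraction of anomalous frames
print(f"{num_anomalous}/{len(labels)} frames anomalous ({anomaly_ratio:.2%})")
```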
We benchmarked several State-of-the-Art (SotA) pose-based VAD models on the PHEVA dataset:
Model | AUC-ROC (%) | AUC-PR (%) | EER | 10ER |
---|---|---|---|---|
MPED-RNN | 76.05 | 42.83 | 0.28 | 0.49 |
GEPC | 62.25 | 28.62 | 0.41 | 0.67 |
STG-NF | 57.57 | 83.77 | 0.46 | 0.90 |
TSGAD | 68.00 | 34.61 | 0.36 | 0.64 |
Table 2: Benchmarking of SotA pose-based models on PHEVA.
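For reference, here is a minimal sketch (not the authors' evaluation code) of how AUC-ROC, AUC-PR, and EER can be computed from per-frame anomaly scores with scikit-learn; the label and score files are placeholders:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, roc_curve

# Placeholder inputs: per-frame ground-truth labels (0/1) and anomaly scores.
labels = np.load('labels.npy')
scores = np.load('scores.npy')

auc_roc = roc_auc_score(labels, scores)
auc_pr = average_precision_score(labels, scores)

# EER is the operating point where the false positive rate equals
# the false negative rate (1 - TPR) on the ROC curve.
fpr, tpr, _ = roc_curve(labels, scores)
eer = fpr[np.nanargmin(np.abs(fpr - (1 - tpr)))]

print(f"AUC-ROC={auc_roc:.4f}, AUC-PR={auc_pr:.4f}, EER={eer:.4f}")
```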
Less than 1% of the training data is anomalous, mimicking real-world deployment. The test set is curated to be approximately balanced, with a roughly 1:1 ratio of normal to anomalous frames, making metrics such as AUC-ROC and AUC-PR more informative.
Camera | Train Total | Train Normal | Train Anomalous | Train Anomaly % | Test Total | Test Normal | Test Anomalous | Test Anomaly % |
---|---|---|---|---|---|---|---|---|
C0 | 487,835 | 483,220 | 4,615 | 0.95 | 52,145 | 26,093 | 26,052 | 49.96 |
C1 | 796,860 | 791,186 | 5,674 | 0.71 | 57,120 | 28,597 | 28,523 | 49.93 |
C2 | 787,301 | 780,420 | 6,881 | 0.87 | 50,592 | 25,300 | 25,292 | 49.99 |
C3 | 1,260,314 | 1,251,189 | 9,125 | 0.72 | 31,604 | 15,818 | 15,786 | 49.95 |
C4 | 449,686 | 447,918 | 1,768 | 0.39 | 74,482 | 37,274 | 37,208 | 49.95 |
C5 | 690,730 | 686,435 | 4,295 | 0.62 | 56,621 | 28,353 | 28,268 | 49.92 |
CSC | 558,492 | 555,223 | 3,269 | 0.58 | 56,644 | 28,343 | 28,301 | 49.96 |
Table 3: Statistics of the continual learning train and test sets per camera.
If you use PHEVA in your research, please cite our paper:
```bibtex
@article{noghre2024pheva,
  title={PHEVA: A Privacy-preserving Human-centric Video Anomaly Detection Dataset},
  author={Ghazal Alinezhad Noghre and Shanle Yao and Armin Danesh Pazho and Babak Rahimi Ardabili and Vinit Katariya and Hamed Tabkhi},
  journal={arXiv},
  year={2024},
}
```
For any questions or support, please contact the authors at galinezh@charlotte.edu.