Primary language: Jupyter Notebook

Outlier-Detection-and-Removal-from-Multimedia

Detection and removal of specific types of outliers in different data formats: contextual outliers in textual data using LOF (Local Outlier Factor), outliers in tabular numeric data using LOF, Gaussian noise in image data using NLM (Non-Local Means), and Gaussian-noisy image frames in video data using an autoencoder.

Procedure

Textual Data

Synthetic textual data is generated from a prompt and contaminated with 40% outliers (out-of-context sentences). LOF is then used to detect and remove the outlier sentences from the contaminated data.
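A minimal sketch of this step, under assumptions: LOF needs numeric vectors, and the text above does not say how sentences are embedded, so TF-IDF with cosine distance is used here purely for illustration. The function name `split_outlier_sentences`, the sample sentences, and `n_neighbors=5` are hypothetical; only the 40% contamination rate comes from the description above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import LocalOutlierFactor


def split_outlier_sentences(sentences, contamination=0.4, n_neighbors=5):
    """Return (kept, removed) sentences according to LOF labels."""
    # Assumption: TF-IDF vectors stand in for whatever embedding the notebook uses.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(sentences).toarray()
    lof = LocalOutlierFactor(
        n_neighbors=n_neighbors,
        metric="cosine",              # compare sentences by direction, not length
        contamination=contamination,  # expected share of outliers (40% here)
    )
    labels = lof.fit_predict(vectors)  # 1 = inlier, -1 = outlier
    kept = [s for s, label in zip(sentences, labels) if label == 1]
    removed = [s for s, label in zip(sentences, labels) if label == -1]
    return kept, removed


# Hypothetical sample: six on-topic sentences and four out-of-context ones.
sentences = [
    "Solar panels convert sunlight into electricity.",
    "Solar energy reduces electricity bills for households.",
    "Rooftop solar panels need direct sunlight to work well.",
    "Batteries store solar electricity for use at night.",
    "Solar farms feed electricity into the power grid.",
    "Sunlight is a renewable source of energy.",
    "The striker scored twice in the final match.",
    "Add two eggs and stir the cake batter.",
    "The museum opens at nine on weekdays.",
    "Penguins huddle together during winter storms.",
]

kept, removed = split_outlier_sentences(sentences)
print("kept:", len(kept), "removed:", len(removed))
```

On a corpus this small the split is not guaranteed to match the intended one; the sketch only shows the vectorize, score, and filter flow.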

Numeric Tabular Data

Synthetic data is generated in both scaled and unscaled form, stored in a data frame, and contaminated with 40% outliers. LOF is then used to detect and remove the outlier rows from the contaminated data.
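The tabular step could look roughly like the sketch below. The column names, distributions, sample size, and the helper `remove_outliers` are all hypothetical, and `StandardScaler` is one assumed way to obtain the scaled variant; only LOF and the 40% contamination rate come from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_total, contamination = 200, 0.4
n_out = int(n_total * contamination)
n_in = n_total - n_out

# Hypothetical data: two features on very different scales.
inliers = rng.normal(loc=[50.0, 0.5], scale=[5.0, 0.05], size=(n_in, 2))
outliers = rng.uniform(low=[0.0, 0.0], high=[100.0, 1.0], size=(n_out, 2))

df = pd.DataFrame(np.vstack([inliers, outliers]), columns=["feature_a", "feature_b"])
df["is_outlier"] = np.r_[np.zeros(n_in, dtype=bool), np.ones(n_out, dtype=bool)]


def remove_outliers(frame, features, scale):
    """Drop the rows that LOF labels as outliers; optionally standardize first."""
    X = frame[features].to_numpy()
    if scale:
        X = StandardScaler().fit_transform(X)  # zero mean, unit variance per column
    lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
    labels = lof.fit_predict(X)  # 1 = inlier, -1 = outlier
    return frame[labels == 1]


features = ["feature_a", "feature_b"]
cleaned_raw = remove_outliers(df, features, scale=False)
cleaned_scaled = remove_outliers(df, features, scale=True)

# Count how many injected outliers survive each variant.
for name, cleaned in [("unscaled", cleaned_raw), ("scaled", cleaned_scaled)]:
    print(name, "rows kept:", len(cleaned), "outliers left:", int(cleaned["is_outlier"].sum()))
```

Because LOF relies on distances, the unscaled variant is dominated by the feature with the larger range, which is one reason to compare both forms.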

Image Data

Gaussian noise is added at increasing levels from 0 to 40%, and the noisy images are filtered with a Gaussian filter and an NLM (Non-Local Means) filter. NLM performs better as measured by the SSIM (Structural Similarity Index) metric. An SSIM score of 0.77 is taken as the reference standard.
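The comparison can be sketched with scikit-image as follows. Several things here are assumptions: the noise percentage is read as the noise standard deviation on a [0, 1] intensity scale, the test image is synthetic, the filter parameters are illustrative, and treating 0.77 as a pass mark for the NLM output is only one reading of the reference score. The helpers `make_test_image` and `compare_filters` are hypothetical names.

```python
import numpy as np
from skimage.filters import gaussian
from skimage.metrics import structural_similarity
from skimage.restoration import denoise_nl_means

SSIM_REFERENCE = 0.77  # reference score quoted above; its use as a pass mark is an assumption


def make_test_image(size=128):
    """Synthetic grayscale image in [0, 1]: a gradient with a square and a disc."""
    yy, xx = np.mgrid[0:size, 0:size] / (size - 1)
    image = 0.3 + 0.4 * xx
    image[size // 4: size // 2, size // 4: size // 2] = 0.9
    image[(yy - 0.7) ** 2 + (xx - 0.7) ** 2 < 0.02] = 0.1
    return image


def compare_filters(clean, noise_levels=(0.1, 0.2, 0.3, 0.4), seed=0):
    """Add Gaussian noise at each level, denoise two ways, and score with SSIM."""
    rng = np.random.default_rng(seed)
    rows = []
    for sigma in noise_levels:
        noisy = np.clip(clean + rng.normal(0.0, sigma, clean.shape), 0.0, 1.0)
        smoothed = gaussian(noisy, sigma=1.5)  # plain Gaussian blur
        nlm = denoise_nl_means(
            noisy, h=0.8 * sigma, sigma=sigma,
            patch_size=5, patch_distance=6, fast_mode=True,
        )
        rows.append({
            "noise_sigma": sigma,
            "ssim_gaussian": structural_similarity(clean, smoothed, data_range=1.0),
            "ssim_nlm": structural_similarity(clean, nlm, data_range=1.0),
        })
    return rows


results = compare_filters(make_test_image())
for row in results:
    meets = row["ssim_nlm"] >= SSIM_REFERENCE
    print(row, "NLM meets reference" if meets else "NLM below reference")
```

The 0% level is the clean image itself, so it is left out of the loop. Scores on this synthetic image will not match those reported for the project's own images.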