
Small Python package to visualize missing data


visualimiss

"Visualize missing data": a simple missingno clone. Some features of the original were removed and others were added.

Features

visualimiss differs from missingno in the following ways:

  • matrix() and bar() are modified versions of the original functions
  • Simpler and faster
  • Maybe more readable?

Installation

pip install visualimiss

This also installs all dependencies: NumPy, pandas, and Matplotlib.

Quickstart

This quickstart uses data from the NYPD Motor Vehicle Collisions dataset.

import pandas as pd
import visualimiss

# Load the sample collisions data used throughout this README
df = pd.read_csv("https://raw.githubusercontent.com/ResidentMario/missingno-data/master/nyc_collision_factors.csv")

matrix

visualimiss.matrix(df)

[Image: resulting matrix plot]
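
To give a rough idea of what a nullity matrix shows, here is a minimal sketch built directly on pandas and Matplotlib. It is not visualimiss's implementation; the sample frame and the choice of colormap are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Small made-up frame standing in for the collisions data
sample = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", "y", None, "z"],
    "c": [1, 2, 3, 4],
})

# 1.0 where a value is present, 0.0 where it is missing
present = sample.notnull().to_numpy().astype(float)

fig, ax = plt.subplots(figsize=(4, 3))
# With the reversed gray colormap, present cells are dark and missing cells are light
ax.imshow(present, aspect="auto", cmap="gray_r", interpolation="none")
ax.set_xticks(range(sample.shape[1]))
ax.set_xticklabels(sample.columns)
ax.set_ylabel("row")
```

Calling fig.savefig("nullity_matrix.png") afterwards would write the figure to disk.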

bar

visualimiss.bar(df)

[Image: resulting bar chart]
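
A comparable bar chart can be sketched with plain pandas and Matplotlib. This assumes the chart shows the number of non-null values per column, as missingno's bar chart does; visualimiss may count or draw things differently, and the sample frame is made up.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Small made-up frame standing in for the collisions data
sample = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", "y", None, "z"],
    "c": [1, 2, 3, 4],
})

# Count of non-missing values in each column
non_null = sample.notnull().sum()

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(list(non_null.index), non_null.to_numpy())
ax.set_ylabel("non-null values")
```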

info

visualimiss.info(df)
- DataFrame has 7303 rows, 26 columns
- Memory usage: 1.519152 MB
                               Null count    Dtype
DATE                                    0   object
TIME                                    0   object
BOROUGH                               383   object
ZIP CODE                              384  float64
LATITUDE                                0  float64
LONGITUDE                               0  float64
LOCATION                                0   object
ON STREET NAME                       1065   object
CROSS STREET NAME                    1137   object
OFF STREET NAME                      6542   object
NUMBER OF PERSONS INJURED               0    int64
NUMBER OF PERSONS KILLED                0    int64
NUMBER OF PEDESTRIANS INJURED           0    int64
NUMBER OF PEDESTRIANS KILLED            0    int64
NUMBER OF CYCLISTS INJURED           7303  float64
NUMBER OF CYCLISTS KILLED            7303  float64
CONTRIBUTING FACTOR VEHICLE 1           0   object
CONTRIBUTING FACTOR VEHICLE 2        1085   object
CONTRIBUTING FACTOR VEHICLE 3        7000   object
CONTRIBUTING FACTOR VEHICLE 4        7244   object
CONTRIBUTING FACTOR VEHICLE 5        7289   object
VEHICLE TYPE CODE 1                    58   object
VEHICLE TYPE CODE 2                  1520   object
VEHICLE TYPE CODE 3                  7019   object
VEHICLE TYPE CODE 4                  7249   object
VEHICLE TYPE CODE 5                  7291   object
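
For reference, roughly the same summary can be assembled with plain pandas. This is a minimal sketch, not visualimiss's actual implementation; the helper name null_summary and the small sample frame are hypothetical.

```python
import numpy as np
import pandas as pd

def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column null counts and dtypes (hypothetical helper)."""
    return pd.DataFrame({
        "Null count": df.isnull().sum(),
        "Dtype": df.dtypes.astype(str),
    })

# Tiny made-up frame standing in for the collisions dataset
sample = pd.DataFrame({
    "BOROUGH": ["QUEENS", None, "BRONX"],
    "ZIP CODE": [11101.0, np.nan, np.nan],
    "LATITUDE": [40.7, 40.8, 40.9],
})

summary = null_summary(sample)
print(f"- DataFrame has {sample.shape[0]} rows, {sample.shape[1]} columns")
print(summary)
```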

Configuration

Sorting

visualimiss.matrix(df, sort='asc')

[Image: resulting sorted matrix plot]
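
This README does not spell out what sort='asc' orders. One plausible reading is that columns are arranged by their number of missing values, and under that assumption the ordering could be reproduced in pandas like this. The sample frame is made up, and this is not visualimiss's own code.

```python
import numpy as np
import pandas as pd

# Small made-up frame with a different number of missing values per column
sample = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, np.nan],
    "b": ["x", "y", None, "z"],
    "c": [1, 2, 3, 4],
})

# Number of missing values in each column
null_counts = sample.isnull().sum()

# Reorder columns from fewest to most missing values (assumed meaning of 'asc')
ordered_columns = null_counts.sort_values(ascending=True).index
sorted_sample = sample[ordered_columns]
```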

Rotate or hide labels

# Rotate the labels:
# visualimiss.matrix(df, label_rotation=90)
# Or hide them entirely:
visualimiss.matrix(df, show_label=False)

[Image: resulting matrix plot]

Change color and font size

visualimiss.matrix(df, color=(43, 102, 189), fontsize=16)

[Image: resulting matrix plot]
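
The color in the call above is given as 0-255 integers, whereas Matplotlib itself expects RGB components as floats between 0 and 1. If you want to reuse the same color in your own Matplotlib code, a small conversion helper could look like the sketch below. The name rgb255_to_unit is hypothetical and not part of visualimiss.

```python
def rgb255_to_unit(rgb):
    """Convert an (R, G, B) tuple of 0-255 integers to 0-1 floats."""
    if any(not 0 <= c <= 255 for c in rgb):
        raise ValueError("each component must be between 0 and 255")
    return tuple(c / 255 for c in rgb)

# Same color as in the visualimiss.matrix() call above
unit_color = rgb255_to_unit((43, 102, 189))
```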