"visualize missing data". A simple missingno clone. Some features were removed from the original and some were added.
Differences from the original:
- Provides matrix() and bar(), both modified from the original functions
- Simpler and faster
- Maybe more readable?
pip install visualimiss
This also installs all dependencies: NumPy, pandas, and Matplotlib.
This quickstart uses a sample of the NYPD Motor Vehicle Collisions dataset.
import pandas as pd
import visualimiss
df = pd.read_csv("https://raw.githubusercontent.com/ResidentMario/missingno-data/master/nyc_collision_factors.csv")
visualimiss.matrix(df)
visualimiss.bar(df)
visualimiss.info(df)
- DataFrame has 7303 rows, 26 columns
- Memory usage: 1.519152 MB
                               Null count  Dtype
DATE                                    0  object
TIME                                    0  object
BOROUGH                               383  object
ZIP CODE                              384  float64
LATITUDE                                0  float64
LONGITUDE                               0  float64
LOCATION                                0  object
ON STREET NAME                       1065  object
CROSS STREET NAME                    1137  object
OFF STREET NAME                      6542  object
NUMBER OF PERSONS INJURED               0  int64
NUMBER OF PERSONS KILLED                0  int64
NUMBER OF PEDESTRIANS INJURED           0  int64
NUMBER OF PEDESTRIANS KILLED            0  int64
NUMBER OF CYCLISTS INJURED           7303  float64
NUMBER OF CYCLISTS KILLED            7303  float64
CONTRIBUTING FACTOR VEHICLE 1           0  object
CONTRIBUTING FACTOR VEHICLE 2        1085  object
CONTRIBUTING FACTOR VEHICLE 3        7000  object
CONTRIBUTING FACTOR VEHICLE 4        7244  object
CONTRIBUTING FACTOR VEHICLE 5        7289  object
VEHICLE TYPE CODE 1                    58  object
VEHICLE TYPE CODE 2                  1520  object
VEHICLE TYPE CODE 3                  7019  object
VEHICLE TYPE CODE 4                  7249  object
VEHICLE TYPE CODE 5                  7291  object
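For comparison, a summary like the one above can be approximated with plain pandas. This is a minimal sketch, not how visualimiss.info() is implemented: null_summary is a hypothetical helper, and the small frame below is made-up data so the example runs without a network connection.

```python
import pandas as pd


def null_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return per-column null counts and dtypes (hypothetical helper)."""
    return pd.DataFrame({
        "Null count": frame.isnull().sum(),
        "Dtype": frame.dtypes.astype(str),
    })


# Made-up sample data; column names borrowed from the dataset above.
sample = pd.DataFrame({
    "BOROUGH": ["QUEENS", None, "BROOKLYN"],
    "ZIP CODE": [11101.0, None, None],
    "LATITUDE": [40.7, 40.6, 40.8],
})

summary = null_summary(sample)
print(f"- DataFrame has {sample.shape[0]} rows, {sample.shape[1]} columns")
print(summary)
```

Calling null_summary(df) on the collisions DataFrame should give the same kind of table, though the exact formatting will differ from visualimiss.info().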
visualimiss.matrix(df, sort='asc')
# visualimiss.matrix(df, label_rotation=90)
visualimiss.matrix(df, show_label=False)
visualimiss.matrix(df, color=(43, 102, 189), fontsize=16)
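The color tuple in the last call looks like 0-255 RGB values; this is an assumption based on the example, so check the package source if it matters. Matplotlib itself expects RGB tuples as floats between 0 and 1, so reusing the same color in your own Matplotlib plots needs a conversion. A minimal sketch, where rgb255_to_unit is a hypothetical helper and not part of visualimiss:

```python
def rgb255_to_unit(rgb):
    """Convert an (R, G, B) tuple of 0-255 values to 0-1 floats."""
    if len(rgb) != 3 or any(not 0 <= c <= 255 for c in rgb):
        raise ValueError("expected three values between 0 and 255")
    return tuple(c / 255 for c in rgb)


# Same blue as in the matrix() call above, in Matplotlib's 0-1 convention.
blue = rgb255_to_unit((43, 102, 189))
print(blue)
```

The resulting tuple can be passed to any Matplotlib argument that accepts a color, such as color= in plot() or bar().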