create sankey diagrams with matplotlib

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

pySankey

Uses matplotlib to create simple Sankey diagrams flowing only from left to right.

Requirements

Requires python-tk (for Python 2.7) or python3-tk (for Python 3.x). You can install the other requirements with:

    pip install -r requirements.txt

Example

With fruits.txt:

              true predicted
    0    blueberry    orange
    1         lime    orange
    2    blueberry      lime
    3        apple    orange
    ...        ...       ...
    996       lime    orange
    997  blueberry    orange
    998     orange    banana
    999      apple      lime

1000 rows × 2 columns

You can generate a Sankey diagram with this code:

    import sankey
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    pd.options.display.max_rows = 8
    # Jupyter notebook magic; remove this line when running as a plain script
    %matplotlib inline

    df = pd.read_csv('fruits.txt', sep=' ', names=['true', 'predicted'])
    # One color per label
    colorDict = {'apple': '#f71b1b', 'blueberry': '#1b7ef7', 'banana': '#f3f71b',
                 'lime': '#12e23f', 'orange': '#f78c1b'}
    sankey.sankey(df['true'], df['predicted'], aspect=20, colorDict=colorDict,
                  fontsize=1, figure_name="fruit")

[Figure: Fruity Alchemy]
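
To see what the diagram encodes, it can help to tabulate the flows by hand. The sketch below is plain pandas and does not call pySankey. It uses only the eight rows visible in the fruits.txt preview above, and it assumes, as is conventional for Sankey diagrams, that each strip's width reflects the number of rows sharing a (true, predicted) pair. The names `rows` and `flows` are illustrative.

```python
import pandas as pd

# The eight rows visible in the fruits.txt preview (not the full 1000-row file).
rows = [
    ("blueberry", "orange"),
    ("lime", "orange"),
    ("blueberry", "lime"),
    ("apple", "orange"),
    ("lime", "orange"),
    ("blueberry", "orange"),
    ("orange", "banana"),
    ("apple", "lime"),
]
df = pd.DataFrame(rows, columns=["true", "predicted"])

# Count how many rows share each (true, predicted) pair.
# Assumption: these counts are what the strip widths represent.
flows = df.groupby(["true", "predicted"]).size().reset_index(name="count")
print(flows)
```

Pairs that occur more often, such as blueberry to orange in this sample, would be drawn as wider strips.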

You can also use weights. With customers-goods.csv:

    ,customer,good,revenue
    0,John,fruit,5.5
    1,Mike,meat,11.0
    2,Betty,drinks,7.0
    3,Ben,fruit,4.0
    4,Betty,bread,2.0
    5,John,bread,2.5
    6,John,drinks,8.0
    7,Ben,bread,2.0
    8,Mike,bread,3.5
    9,John,meat,13.0

The file can be loaded like this:

    import sankey
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    pd.options.display.max_rows = 8
    # Jupyter notebook magic; remove this line when running as a plain script
    %matplotlib inline

    # header=0 makes `names` replace the file's own header row
    # instead of reading that row as data
    df = pd.read_csv('customers-goods.csv', sep=',', header=0,
                     names=['id', 'customer', 'good', 'revenue'])
    # The actual call is left as an exercise to the reader, but it could be something like
    # (colorDict is not defined in this snippet and must be created first):
    # sankey.sankey(left=df['customer'], right=df['good'], rightWeight=df['revenue'],
    #               aspect=20, colorDict=colorDict, fontsize=20,
    #               figure_name="customer-good")

[Figure: Customer goods]
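
As a rough illustration of what weighting means here, the sketch below sums revenue per (customer, good) pair instead of counting rows. It is plain pandas, does not call pySankey, and makes no claim about how the library handles `rightWeight` internally. The CSV text is copied from the sample above; the names `csv_text`, `flows`, `per_customer`, and `per_good` are illustrative.

```python
import io

import pandas as pd

# Sample data copied from customers-goods.csv as shown above.
csv_text = """,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0
"""

# index_col=0 treats the unnamed first column as the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Weighted flow per (customer, good) pair: each row contributes its revenue.
flows = df.groupby(["customer", "good"], as_index=False)["revenue"].sum()

# Totals on each side; both sides add up to the same overall revenue.
per_customer = df.groupby("customer")["revenue"].sum()
per_good = df.groupby("good")["revenue"].sum()

print(flows)
print(per_customer)
print(per_good)
```

Under this reading, a customer or good with a larger revenue total would take up more height on its side of the diagram.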