Uses matplotlib to create simple Sankey diagrams that flow only from left to right.
Requires python-tk (for Python 2.7) or python3-tk (for Python 3.x). You can install the other requirements with:
pip install -r requirements.txt
With fruits.txt:
| | true | predicted |
---|---|---|
0 | blueberry | orange |
1 | lime | orange |
2 | blueberry | lime |
3 | apple | orange |
... | ... | ... |
996 | lime | orange |
997 | blueberry | orange |
998 | orange | banana |
999 | apple | lime |
1000 rows × 2 columns
You can generate a Sankey diagram with this code:
import sankey
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
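pd.options.display.max_rows = 8
# Jupyter/IPython magic; omit this line when running as a plain script
%matplotlib inline
# fruits.txt has no header row: two space-separated columns
df = pd.read_csv('fruits.txt', sep=' ', names=['true', 'predicted'])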
colorDict = {'apple':'#f71b1b','blueberry':'#1b7ef7','banana':'#f3f71b',
'lime':'#12e23f','orange':'#f78c1b'}
sankey.sankey(df['true'], df['predicted'], aspect=20, colorDict=colorDict,
              fontsize=12, figure_name="fruit")
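With no weights given, each band between a left label and a right label reflects how many rows share that (true, predicted) pair. As a rough sketch of that tally, here is the same count done with pandas on only the eight rows visible in the fruits.txt preview above. The `sample` and `flows` names are illustrative and not part of sankey, and the counts describe the preview rows, not the full 1000-row file.

```python
import pandas as pd

# The eight rows visible in the fruits.txt preview (not the full file).
sample = pd.DataFrame({
    'true': ['blueberry', 'lime', 'blueberry', 'apple',
             'lime', 'blueberry', 'orange', 'apple'],
    'predicted': ['orange', 'orange', 'lime', 'orange',
                  'orange', 'orange', 'banana', 'lime'],
})

# Count rows per (true, predicted) pair: one count per band in the diagram.
flows = pd.crosstab(sample['true'], sample['predicted'])
print(flows)
```

Checking a table like this before plotting is a cheap way to spot labels that are misspelled or missing from colorDict.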
You can also weight the flows. With customers-goods.csv:
,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0
import sankey
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
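pd.options.display.max_rows = 8
%matplotlib inline
# header=0 replaces the file's own header row with `names`; without it the
# header line would be read as a data row and 'revenue' would not be numeric
df = pd.read_csv('customers-goods.csv', sep=',', header=0,
                 names=['id', 'customer', 'good', 'revenue'])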
# The actual call is left as an exercise to the reader, but it could be something like
# the following (colorDict would need a color for every customer and good label):
# sankey.sankey(left=df['customer'], right=df['good'], rightWeight=df['revenue'],
#               aspect=20, colorDict=colorDict, fontsize=20,
#               figure_name="customer-good")
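Before plotting with weights, it can help to look at the totals the weight column produces. The sketch below sums revenue per good and per customer with pandas, using the ten rows listed above. The CSV is inlined through `io.StringIO` only so the example is self-contained, and the names `csv_text`, `per_good` and `per_customer` are illustrative. This is a sanity check on the data, not a description of how sankey computes band sizes internally.

```python
import io
import pandas as pd

# Same content as customers-goods.csv above, inlined for a self-contained example.
csv_text = """,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0
"""

# The first (unnamed) column is the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Total revenue on each side of the diagram.
per_good = df.groupby('good')['revenue'].sum()
per_customer = df.groupby('customer')['revenue'].sum()

print(per_good)
print(per_customer)
```

Both series add up to the same grand total, which is a quick check that no row was dropped or misparsed.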