UDST/urbanaccess

Gracefully handle requesting more bins than there are unique bin edges

kuanb opened this issue · 5 comments

kuanb commented

Right now, if you request more bins than there are unique edges, an error is raised. I wonder if it would be reasonable to simply use the min() of the number of unique edges and the requested num_bins argument?

Example:

>>> colors = urbanaccess.plot.col_colors(
...                         df=network,
...                         col='mean',
...                         num_bins=10,
...                         cmap='spectral',
...                         start=0.1,
...                         stop=0.9)
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
  File "urbanaccess/plot.py", line 160, in col_colors
    categories = pd.qcut(x=col_values, q=num_bins, labels=bin_labels)
  File "/usr/local/lib/python2.7/site-packages/pandas/tools/tile.py", line 175, in qcut
    precision=precision, include_lowest=True)
  File "/usr/local/lib/python2.7/site-packages/pandas/tools/tile.py", line 194, in _bins_to_cuts
    raise ValueError('Bin edges must be unique: %s' % repr(bins))
ValueError: Bin edges must be unique: array([  0.        ,   8.15789474,  13.        ,  16.71428571,
        25.5       ,  29.66666667,  30.        ,  30.        ,
        30.6       ,  37.5       ,  86.        ])
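For illustration, here is a minimal sketch of the fallback idea. It is not the code from the PR: `safe_num_bins` is a hypothetical helper name, and it assumes a reasonably recent pandas and NumPy. Because the quantile edges are recomputed for every bin count, a single min() may still leave duplicates in some cases, so this sketch steps the bin count down until all edges are unique.

```python
import numpy as np
import pandas as pd


def safe_num_bins(col_values, num_bins):
    """Largest bin count <= num_bins whose quantile edges are all unique.

    Hypothetical helper for illustration; not part of urbanaccess.
    """
    values = pd.Series(col_values).dropna()
    if values.nunique() < 2:
        raise ValueError('Need at least two distinct values to bin.')
    for q in range(num_bins, 0, -1):
        # Same edges pd.qcut would derive for q equal-frequency bins.
        edges = np.asarray(values.quantile(np.linspace(0, 1, q + 1)))
        if len(np.unique(edges)) == q + 1:
            return q
    # Unreachable in practice: q=1 gives edges (min, max), which differ.
    raise ValueError('No usable bin count found.')


# Made-up data with many repeated values, similar in spirit to the
# duplicated 30.0 edges in the traceback above.
col_values = [0] + [30] * 8 + [86]
usable_bins = safe_num_bins(col_values, num_bins=10)
categories = pd.qcut(x=col_values, q=usable_bins)
```

The caller could then print a notice whenever the returned count is lower than the requested one. Newer pandas releases also accept duplicates='drop' in pd.qcut, but that changes the number of bins silently, which does not combine well with a fixed list of labels.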

Thank you Kuan, that is a good idea and sounds reasonable. Would you be willing to add this to your existing plot PR #22?

kuanb commented

Sounds good, will get to it later today!

kuanb commented

Resolved via the latest commit; the kuanb-color-cols-fix branch PR is ready to be reviewed again.

cc @pksohn @sablanchard

kuanb commented

Example operation:

>>> color_range = col_colors(
...                         df=urbanaccess_nw.net_edges,
...                         col='mean',
...                         num_bins=16,
...                         cmap='YlOrRd',
...                         start=0.1,
...                         stop=0.9)
Too many bins requested, using max bins possible. To avoid duplicate edges, 8 bins used.

Fixed with PR #22