nschloe/perfplot

Error while trying to run the appended script

encrypted-soul opened this issue · 5 comments

I am trying to compare the performance of np.ndarray.take with pandas.Series.map using the script below, but I get errors when running it and cannot make out what is going wrong.

Script

import numpy as np
import pandas as pd
import perfplot
import math

arr = np.random.randint(1, 101, 1000)
df = pd.DataFrame()
df['r'] = arr

def create_random_dict(x):
	res = {}
	test_keys = np.random.randint(1, 101, x)
	test_values = np.random.randint(1, 101, x)
	for key in test_keys:
	    for value in test_values:
	        res[key] = value
	        break 

	return res

def lets_map(x):
	return df['r'].map(create_random_dict(x))

def lets_take(x):
	return arr.take(list(create_random_dict(x).keys()))

perfplot.show(
    setup=np.random.randint,
    n_range=[2 ** k for k in range(20)],
    kernels=[create_random_dict, lets_take, lets_map],
    xlabel="len(x)",
)

Error Message

Traceback (most recent call last):
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 224, in __next__
    is_equal = self.equality_check(reference, val)
  File "<__array_function__ internals>", line 5, in allclose
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/numpy/core/numeric.py", line 2256, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 5, in isclose
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/numpy/core/numeric.py", line 2362, in isclose
    xfin = isfinite(x)
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test_perf.py", line 27, in <module>
    perfplot.show(
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 504, in show
    out = bench(*args, **kwargs)
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 465, in bench
    timings_s[i] = next(b)
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 226, in __next__
    raise PerfplotError(
perfplot._exceptions.PerfplotError: Error in equality_check. Try setting equality_check=None.

Perfplot looks like a good tool for benchmarking my projects. I would greatly appreciate any help in working this out. Thanks :)

The error message says:

Error in equality_check. Try setting equality_check=None.

Have you tried that?
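For context, here is a minimal sketch of why the check fails, assuming (as the traceback suggests) that the default equality_check is np.allclose. The first kernel returns a dict while lets_take returns an array, and np.allclose cannot compare the two. The variable names below are only illustrative.

```python
import numpy as np

# Illustrative stand-ins for the kernel outputs in the script above.
reference = {1: 2, 3: 4}        # the kind of object create_random_dict returns
value = np.array([5, 6, 7])     # the kind of object lets_take returns

try:
    # np.allclose needs two numeric array-likes; a dict is not one.
    np.allclose(reference, value)
    comparable = True
except TypeError:
    comparable = False

print("outputs comparable with np.allclose:", comparable)
```

Passing equality_check=None skips this comparison. The alternative is to have every kernel return the same kind of numeric result.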

@nschloe Thank you for the response. The script above works fine after setting equality_check=None. However, the x-variable in that script is the length of the dictionary, whereas I am actually more interested in varying the length of the ndarray and the DataFrame column. So I modified it to this:

import numpy as np
import pandas as pd
import perfplot
import math

# arr = np.random.randint(1, 101, 1000)
# df = pd.DataFrame()
# df['r'] = arr

def create_random_array(x):
	arr = np.random.randint(1, 10000, x)
	return arr

def create_random_column(x):
	arr = create_random_array(x)
	df = pd.DataFrame()
	df['r'] = arr

	return df['r']

dicti = {}

for i in range(1, 100):
	dicti[i] = i

def lets_map(x):
	return create_random_column(x).map(dicti)

def lets_take(x):
	arr = create_random_array(x)
	print(arr)
	return arr.take(list(dicti.keys()))

perfplot.show(
    setup=np.random.randint,
    n_range=[2 ** k for k in range(20)],
    kernels=[lets_take, lets_map],
    xlabel="len(x)",
    equality_check=None,
)

Now the program terminates with this:

[]
Overall ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% -:--:--
Kernels ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% -:--:--
Traceback (most recent call last):
  File "test_perf.py", line 34, in <module>
    perfplot.show(
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 504, in show
    out = bench(*args, **kwargs)
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 465, in bench
    timings_s[i] = next(b)
  File "/home/gaganaryan/anaconda3/envs/radis/lib/python3.8/site-packages/perfplot/_main.py", line 215, in __next__
    val = kernel(data)
  File "test_perf.py", line 32, in lets_take
    return arr.take(list(dicti.keys()))
IndexError: cannot do a non-empty take from an empty axes.

Any ideas of how this can be resolved?

Is setup=np.random.randint really what you want?

I want a comparison across varying lengths of arrays and columns, and hence chose that setup. Do you think some other setup would be more relevant?

I don't know what you're trying to do exactly, but note that with setup=np.random.randint, you get

np.random.randint(n)

in each step, i.e. a single random integer from 0 up to (but not including) n, not an array of length n.
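To make this concrete, here is a rough sketch. For n = 1 the only possible value is 0, so the kernels build an array of length 0, which would explain the printed `[]` and the IndexError. The second half shows one possible setup, assuming the goal is to push n random values through a lookup table. The names make_data, take_kernel, map_kernel, lookup_arr, lookup_dict and run_benchmark are hypothetical and not part of perfplot.

```python
import numpy as np
import pandas as pd

# With setup=np.random.randint, the kernels receive np.random.randint(n).
x = np.random.randint(1)               # n = 1: the only possible result is 0
empty = np.random.randint(1, 10000, x) # so this "array" has length 0
print(x, empty.size)                   # prints: 0 0

# Hypothetical lookup table mapping each key 0..99 to itself (like dicti).
lookup_dict = {i: i for i in range(100)}
lookup_arr = np.arange(100)


def make_data(n):
    """Hypothetical setup: n random keys that all exist in the lookup."""
    return np.random.randint(0, 100, n)


def take_kernel(values):
    # Index into the lookup array with the keys.
    return lookup_arr.take(values)


def map_kernel(values):
    # Same lookup through pandas; note this also times Series construction.
    return pd.Series(values).map(lookup_dict).to_numpy()


def run_benchmark():
    # Imported here so the kernels can be tried without perfplot installed.
    import perfplot

    perfplot.show(
        setup=make_data,
        n_range=[2 ** k for k in range(20)],
        kernels=[take_kernel, map_kernel],
        xlabel="len(x)",
    )
```

Calling run_benchmark() would produce the plot. Because both kernels return the same numeric array here, the default equality check should not need to be disabled.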