ARM-software/mango

Python int too large to convert to C long

AJFeng opened this issue · 2 comments

AJFeng commented

```python
import random

from mango import scheduler

param_dict = {"s": range(50, 600),
              "Ts": range(20, 100),
              "Ts2": range(20, 100),
              "c": range(1, 100),
              "n_hidden1": range(100, 1000),
              "n_hidden2": range(10, 100),
              "n_hidden3": range(5, 30),
              "selected_range": [0.5, 0.6, 0.7, 0.8, 0.9]}

conf_Dict = dict()
conf_Dict['batch_size'] = 1
conf_Dict['num_iteration'] = 100
conf_Dict['domain_size'] = 50000
conf_Dict['initial_random'] = 1

@scheduler.parallel(n_jobs=2)
def objective(s, Ts, Ts2, c, n_hidden1, n_hidden2, n_hidden3, selected_range):
    global X, Y, N, p

    f1s = []
    accs = []
    all_common_numbers = []
    all_loss = []

    return random.randint(1, 100000)
```

Error:

```
mango_results = tuner.minimize()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:160 in minimize
    return self.run()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:147 in run
    self.results = self.runBayesianOptimizer()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:208 in runBayesianOptimizer
    X_list, Y_list, X_tried = self.run_initial()

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\tuner.py:184 in run_initial
    X_tried = self.ds.get_random_sample(self.config.initial_random)

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\domain_space.py:45 in get_random_sample
    return self._get_random_sample(size)

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\domain_space.py:63 in _get_random_sample
    domain_list = list(BatchParameterSampler(self.param_dict, n_iter=size))

File ~\AppData\Local\anaconda3\envs\Ecoli\Lib\site-packages\mango\domain\batch_parameter_sampler.py:49 in __iter__
    for i in sample_without_replacement(grid_size, n_iter,

File sklearn\utils\_random.pyx:218 in sklearn.utils._random.sample_without_replacement

OverflowError: Python int too large to convert to C long
```

Hi, I was able to start training, as you can see in the Colab notebook below:
https://colab.research.google.com/drive/1rxwzXMPIFHEcOU6dz_gev4rmiQhcrHZC?usp=sharing

But my suggestion would be to reduce the complexity of the search space. This one is extremely large:
search space size = 550 × 80 × 80 × 99 × 900 × 90 × 25 × 5 ≈ 3.5 × 10^15 combinations (note that range(1, 100) has 99 values), which is too large both to handle during sampling and to get good results with.
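As a stdlib-only sanity check, the grid size can be computed directly. On a Windows build of CPython (which the `AppData\...` paths in the traceback suggest), a C long is 32-bit, so any grid size above 2**31 - 1 overflows in sklearn's `sample_without_replacement`:

```python
import math

# Number of values in each dimension of the original param_dict.
sizes = [len(range(50, 600)),    # s: 550 values
         len(range(20, 100)),    # Ts: 80
         len(range(20, 100)),    # Ts2: 80
         len(range(1, 100)),     # c: 99
         len(range(100, 1000)),  # n_hidden1: 900
         len(range(10, 100)),    # n_hidden2: 90
         len(range(5, 30)),      # n_hidden3: 25
         5]                      # selected_range: 5 listed values

grid_size = math.prod(sizes)
print(grid_size)              # 3528360000000000, i.e. ~3.5e15
print(grid_size > 2**31 - 1)  # True: does not fit in a 32-bit C long
```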

Something like the following:

```python
param_dict = {"s": range(50, 600, 30),
              "Ts": range(20, 100, 10),
              "Ts2": range(20, 100, 10),
              "c": range(1, 100, 10),
              "n_hidden1": range(100, 1000, 30),
              "n_hidden2": range(10, 100, 10),
              "n_hidden3": range(5, 30, 5),
              "selected_range": [0.5, 0.6, 0.7, 0.8, 0.9]}
```
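The same stdlib check on this coarser grid shows it is small enough to sample safely:

```python
import math

# Number of values in each dimension of the reduced param_dict.
sizes = [len(range(50, 600, 30)),    # s: 19 values
         len(range(20, 100, 10)),    # Ts: 8
         len(range(20, 100, 10)),    # Ts2: 8
         len(range(1, 100, 10)),     # c: 10
         len(range(100, 1000, 30)),  # n_hidden1: 30
         len(range(10, 100, 10)),    # n_hidden2: 9
         len(range(5, 30, 5)),       # n_hidden3: 5
         5]                          # selected_range: 5 listed values

grid_size = math.prod(sizes)
print(grid_size)              # 82080000 -- about 8.2e7 combinations
print(grid_size < 2**31 - 1)  # True: safely within a 32-bit C long
```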

You can start from a small space, and then fine-tune the model near the best parameters you previously found.

AJFeng commented

Thank you so much for the quick response. Yes, this one works and really makes sense to me.