Project-Platypus/Platypus

algorithm.run(100) works, algorithm.run(101) or greater doesn't work

themagpipers opened this issue · 10 comments

Hi, I am facing a strange bug, which looks possibly the same as the one described here: https://stackoverflow.com/questions/62432086/platypus-nsga-ii-shows-unhashable-type-numpy-ndarray-after-200-evaluat

I am solving a problem in a way that used to work fine, but after some code modifications I get an error whenever I choose n > 100 in the expression algorithm.run(n).

My code looks like so:

from platypus import NSGAII, Problem, Real, Integer, CompoundOperator, SBX, HUX, PM, BitFlip
from input_parameters import InputParameters
from indtest import IndTest
import numpy as np
import plotly.graph_objects as go
from scipy.interpolate import griddata

def evaluate(x):
    cr_w = x[1]
    input_params = InputParameters(co_h = x[0],
                                   cr_h = x[1],
                                   cr_w = cr_w,
                                   air_gap_n = 2 * x[2],
                                   cl_n = x[3],
                                   air_gap = x[4])
    ind_test = IndTest(input_params)
    total_mass = ind_test.cr_m + ind_test.cl_wire_m + \
                 ind_test.pot_m + ind_test.cov_m
    total_loss = ind_test.j_losses + ind_test.m_losses
    final_ind = ind_test.final_ind
    constraint_1 = abs(final_ind - 55e-6) / 55e-6
    return [total_mass, total_loss], [constraint_1]

problem = Problem(5, 2, 1)
problem.constraints[:] = "<=0.05"
problem.types[0] = Real(0.06, 0.16) 
problem.types[1] = Real(0.0057, 0.0085)
problem.types[2] = Integer(1, 5)
problem.types[3] = Integer(10, 110)
problem.types[4] = Real(0.01, 0.03)
problem.function = evaluate

variator = CompoundOperator(SBX(probability=1.0, distribution_index=1.0), 
                            HUX(probability=1.0), 
                            PM(probability=1.0, distribution_index=2.0), 
                            BitFlip(probability=1.0))

algorithm = NSGAII(problem, variator = variator, pop_size = 4)
algorithm.run(101)

Here is the traceback:

Traceback (most recent call last):
  File "\git\InductorDesign\platypus\platypus_test.py", line 73, in <module>
    algorithm.run(101)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 410, in run
    self.step()
  File "C:\Python3.10\lib\site-packages\platypus\algorithms.py", line 182, in step
    self.iterate()
  File "C:\Python3.10\lib\site-packages\platypus\algorithms.py", line 208, in iterate
    nondominated_sort(offspring)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 1070, in nondominated_sort
    crowding_distance(archive)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 1090, in crowding_distance
    solutions = unique(solutions)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 1016, in unique
    if not id in unique_ids:
TypeError: unhashable type: 'numpy.ndarray'

I have added a print statement in core.py, line 1015 to debug.
The line I added is print(f'unique_ids: {unique_ids}'). This returns unique_ids: set().

If I choose n <= 100, the code works as expected and the debug line I added isn't even executed (which means the function unique() in core.py is not even called).

This is a very annoying bug which prevents me from running an optimization... I'd appreciate any pointer to fix the issue.

Edit: a little bit more debugging. I have added more print lines and I see that, in the case where the code fails (i.e. n > 100), the function nondominated_sort() is called.

More debugging. Very strange. No matter what n > 100 I choose (whether 101, 103, or 1003), I get the same result. I added a line to print the length of the solutions list like so:

def nondominated_sort(solutions):
    """Fast non-dominated sorting.

    Performs fast non-dominated sorting on a collection of solutions.  The
    solutions will be assigned the following attributes:

    1. :code:`rank` - The index of the non-dominated front containing the
       solution.  Rank 0 stores all non-dominated solutions.

    2. :code:`crowding_distance` - The crowding distance of the given solution.
       Larger values indicate less crowding near the solution.

    Parameters
    ----------
    solutions : iterable
        The collection of solutions
    """
    rank = 0

    while len(solutions) > 0:
        print(len(solutions))
        archive = Archive()
        archive += solutions

and it always prints 200, no matter what. However, if I choose n <= 100, this part of the code is not executed.

Edit: Another debugging hint. I added a line to print id in the function unique(). This is what I get: id: (0.6705456830477191, array([35.70357467])).
So I think this is it... When n is big enough, a code path gets executed that is not happy when an output is a NumPy array rather than something hashable. So I need to convert my arrays into something hashable.
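To confirm the diagnosis, here is a minimal snippet (independent of Platypus, assuming only NumPy) showing that a tuple of plain floats is hashable while a tuple containing a NumPy array is not:

```python
import numpy as np

ids = set()

# A tuple of plain floats is hashable, so it can be added to a set:
ids.add((0.6705456830477191, 35.70357467))

# A tuple containing a NumPy array is not hashable -- adding it raises
# the same TypeError seen in the traceback above:
try:
    ids.add((0.6705456830477191, np.array([35.70357467])))
except TypeError as e:
    print(e)  # unhashable type: 'numpy.ndarray'
```

Note that the set membership test hashes the candidate even when the set is empty, which matches the observation above that unique_ids was still set() when the error fired.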

Edit: Hmm no, it is not that simple... I converted all my outputs to tuples. Now the code returns:

Traceback (most recent call last):
  File "\git\InductorDesign\platypus\platypus_test.py", line 73, in <module>
    algorithm.run(1003)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 410, in run
    self.step()
  File "C:\Python3.10\lib\site-packages\platypus\algorithms.py", line 180, in step
    self.initialize()
  File "C:\Python3.10\lib\site-packages\platypus\algorithms.py", line 190, in initialize
    super().initialize()
  File "C:\Python3.10\lib\site-packages\platypus\algorithms.py", line 71, in initialize
    self.evaluate_all(self.population)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 383, in evaluate_all
    results = self.evaluator.evaluate_all(jobs)
  File "C:\Python3.10\lib\site-packages\platypus\evaluator.py", line 87, in evaluate_all
    return list(self.map_func(run_job, jobs))
  File "C:\Python3.10\lib\site-packages\platypus\evaluator.py", line 54, in run_job
    job.run()
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 350, in run
    self.solution.evaluate()
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 523, in evaluate
    self.problem(self)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 167, in __call__
    self.evaluate(solution)
  File "C:\Python3.10\lib\site-packages\platypus\core.py", line 191, in evaluate
    (objs, constrs) = self.function(solution.variables)
  File "\git\IndTest\platypus\platypus_test.py", line 19, in evaluate
    total_mass = tuple(ind_test.cr_m + ind_test.cl_wire_m +
TypeError: 'float' object is not iterable

Edit: Converting the arrays to floats works. I can now proceed with the optimization without this limit.
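For anyone hitting the same error, here is a minimal sketch of the conversion (the to_scalar helper is mine, not part of Platypus): unwrap any size-1 NumPy array into a plain Python scalar before returning objectives and constraints from the evaluate function.

```python
import numpy as np

def to_scalar(value):
    # Hypothetical helper: Platypus needs hashable objective/constraint
    # values, and size-1 NumPy arrays are not hashable, so unwrap them
    # into plain Python scalars.
    if isinstance(value, np.ndarray):
        return value.item()  # works for any size-1 array
    return value

# e.g. applied to the values returned by evaluate():
total_mass = to_scalar(np.array([35.70357467]))
constraint_1 = to_scalar(0.6705456830477191)
print([total_mass], [constraint_1])  # plain floats, hashable
```

Using .item() rather than float() keeps integer results as ints and avoids the NumPy deprecation warning for converting 1-d arrays with float().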

Hi,

TypeError: unhashable type: 'numpy.ndarray'

Yes, as you already worked out in your updates, the objectives and constraints should just be floats, so you need to extract the scalar value out of the NumPy array.

More debugging. Very strange. No matter what n > 100 I choose (whether 101, 103, or 1003), I get the same result. I added a line to print the length of the solutions list like so:

I think this is caused by the line:

algorithm = NSGAII(problem, variator = variator, pop_size = 4)

pop_size should instead be population_size. As written, the argument is not overridden and the default value of 100 is used. That's why len(solutions) is always 200: non-dominated sorting is applied to the combined set of parents and offspring, 100 of each, so 200 total.

This is also why the behavior changes at n > 100. After initializing the population of size 100, the algorithm switches to the evolution stage, where it applies crossover/mutation and calls non-dominated sorting.
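This silent failure mode can be reproduced without Platypus: a constructor that forwards unknown keyword arguments via **kwargs will quietly swallow a misspelled keyword. The Algo class below is illustrative only, not Platypus code:

```python
class Algo:
    # Illustrative stand-in for an algorithm constructor that forwards
    # unknown keyword arguments instead of rejecting them.
    def __init__(self, population_size=100, **kwargs):
        self.population_size = population_size
        self.extra = kwargs  # a misspelled keyword quietly lands here

a = Algo(pop_size=4)         # typo: the keyword is silently ignored
print(a.population_size)     # 100
print(a.extra)               # {'pop_size': 4}

b = Algo(population_size=4)  # correct keyword
print(b.population_size)     # 4
```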

Thanks a lot for all these insights, I have fixed my code accordingly.
Now, in order to avoid future such errors, would it be better for the code to return either a warning or an error if the objectives and constraints aren't floats? Or is it really just the user's fault (for not having read the doc)?

Now, in order to avoid future such errors, would it be better for the code to return either a warning or an error if the objectives and constraints aren't floats? Or is it really just the user's fault (for not having read the doc)?

Let me think about what makes sense to do here. I don't want to enforce a specific type (float) as other types could be used (int, decimal, fraction, etc.). The root cause is that some operations need the value to be hashable, so I think the preferred approach is to catch the TypeError and provide a more descriptive error message.
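A possible shape for that fix (a sketch only, not the actual Platypus implementation): catch the TypeError raised by the set membership test inside unique() and re-raise it with a message pointing at the real cause.

```python
def unique(solutions, key=lambda s: s):
    """Remove duplicate solutions, raising a clearer error for
    unhashable keys. Sketch of a possible fix, not Platypus code."""
    seen = set()
    result = []
    for solution in solutions:
        solution_id = key(solution)
        try:
            # Membership testing hashes the key, so this is where an
            # unhashable objective (e.g. a NumPy array) blows up.
            is_new = solution_id not in seen
        except TypeError as e:
            raise TypeError(
                f"solution id {solution_id!r} is not hashable; "
                "objectives and constraints should be plain numbers "
                "(did an evaluate function return a NumPy array?)") from e
        if is_new:
            seen.add(solution_id)
            result.append(solution)
    return result
```

With this in place, the confusing "unhashable type: 'numpy.ndarray'" deep in core.py would become an error that tells the user what to change in their evaluate function.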

This issue is stale and will be closed soon. If you feel this issue is still relevant, please comment to keep it active. Please also consider working on a fix and submitting a PR.

The issue is still relevant to me. I hope someone writes a patch :).

This issue is stale and will be closed soon. If you feel this issue is still relevant, please comment to keep it active. Please also consider working on a fix and submitting a PR.

The issue will always be relevant unless addressed. This bot is annoying...

This issue is stale and will be closed soon. If you feel this issue is still relevant, please comment to keep it active. Please also consider working on a fix and submitting a PR.

@dhadka would you support removing the "stale" GitHub bot? I think it's always better to close issues manually than let a bot do it while potentially unresolved.