godatadriven/evol

`SummaryLogger` is not called when `.log` is called from Evolution.

koaning opened this issue · 5 comments

I was preparing a genetic algorithm for a training. I'll copy in the rough code; it is not reproducible, but it does show what is going wrong.

This works fine:

num_cities = len(problem_instance.house_coordinates)
init_chromosomes = [init_random(num_cities) for l in range(100)]
pop = Population(chromosomes=init_chromosomes,
                 eval_function=problem_instance.score_standard, 
                 maximize=False,
                 logger=SummaryLogger(file="/tmp/random-evol.log"))

# apply random search of sorts
for i in range(500):
    pop = (pop
           .mutate(random_switch)
           .survive(n=10)
           .breed(parent_picker=pick_random_parents, 
                  combiner=lambda x, y: x)
           .log())

The next bit of code, however, does not seem to log to the SummaryLogger but to the BaseLogger:

# use evolution objects ! 
evo1 = (Evolution()
       .mutate(random_switch, n=2)
       .survive(n=10)
       .breed(parent_picker=lambda x: random.sample(x, k=2), 
              combiner=crossover_method)
       .log())

evo2 = (Evolution()
       .mutate(random_switch)
       .survive(fraction=0.25)
       .breed(parent_picker=lambda x: random.sample(x, k=2), 
              combiner=crossover_method)
       .log())

evo3 = (Evolution()
       .repeat(evo1, n=1)
       .repeat(evo2, n=5))

# apply it! 
pop = pop.evolve(evo3, n=10)

Investigate!

The error got a little stranger. For reference, I've created a self-contained version of the problem that clearly demonstrates what is going wrong.

"""
This example demonstrates how logging works in evolutions.
"""

import random
from evol import Population, Evolution
from evol.logger import SummaryLogger

random.seed(42)

def random_start():
    """
    This function generates a random (x,y) coordinate in the searchspace
    """
    return (random.random() - 0.5) * 20, (random.random() - 0.5) * 20

def func_to_optimise(xy):
    """
    This is the function we want to optimise (maximize)
    """
    x, y = xy
    return -(1-x)**2 - 2*(2-x**2)**2

def pick_random_parents(pop):
    """
    This is how we are going to select parents from the population
    """
    mom = random.choice(pop)
    dad = random.choice(pop)
    return mom, dad

def make_child(mom, dad):
    """
    This is how two parents are going to make a child. 
    Note that the output is a tuple, just like the output of `random_start`
    """
    child_x = (mom[0] + dad[0])/2
    child_y = (mom[1] + dad[1])/2
    return child_x, child_y

def add_noise(chromosome, sigma):
    """
    This is a function that will add some noise to the chromosome. 
    """
    new_x = chromosome[0] + (random.random()-0.5) * sigma
    new_y = chromosome[1] + (random.random()-0.5) * sigma
    return new_x, new_y

pop = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=SummaryLogger(file=None, stdout=True))

# this will create a log
print("something should log now")
pop.log()


evo1 = (Evolution()
       .survive(fraction=0.1)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=0.2)
       .log())

evo2 = (Evolution()
       .survive(n=10)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=0.1)
       .log())

evo3 = (Evolution()
       .repeat(evo1, n=20)
       .repeat(evo2, n=20))

# this will not
print("lots of logs should appear now")
pop = pop.evolve(evo3, n=3)

# and suddenly this will not log
print("something should log now")
pop.log()

# but this somehow will
pop2 = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=SummaryLogger(file=None, stdout=True))

# this will create a log, but will list it ... twice?
print("something should log now ... once!")
pop2.log()

I fear that maybe something with the population.copy() is still not working as we'd like.

The problem is that the __copy__ does not provide the logger to the __init__ of the new population.

(And the same goes for generation)
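To illustrate the kind of bug described above, here is a minimal sketch. It assumes a hypothetical `TinyPopulation` class rather than evol's actual `Population`, and uses plain strings in place of real logger objects. When `__copy__` does not forward the logger to `__init__`, the copy silently falls back to the default.

```python
import copy


class TinyPopulation:
    """Hypothetical stand-in for a population; not evol's actual class."""

    def __init__(self, chromosomes, logger=None):
        self.chromosomes = list(chromosomes)
        # Fall back to a default when no logger is passed in.
        # Strings stand in for real logger objects to keep the sketch small.
        self.logger = logger if logger is not None else "default-logger"

    def __copy__(self):
        # Buggy variant: the logger is not forwarded to __init__,
        # so the copy ends up with the default logger.
        return TinyPopulation(self.chromosomes)


class FixedPopulation(TinyPopulation):
    """Same hypothetical class, with the logger passed along on copy."""

    def __copy__(self):
        return FixedPopulation(self.chromosomes, logger=self.logger)


buggy = copy.copy(TinyPopulation([1, 2, 3], logger="summary-logger"))
fixed = copy.copy(FixedPopulation([1, 2, 3], logger="summary-logger"))

print(buggy.logger)  # default-logger
print(fixed.logger)  # summary-logger
```

Since every step in an evolution works on a copy of the population, a dropped logger of this kind would explain why logging stops after the first copy.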

I have fixed that over at 0d3500f but there is something more.

Even with the copy fix, the pop2.log() at the bottom of the file logs two lines. It seems to occur when you create the populations like this:

pop = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=SummaryLogger(file=None, stdout=True))
pop2 = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=SummaryLogger(file=None, stdout=True))

# this will create a log, but will list it ... twice?
pop2.log()

But it no longer occurs when you run it like this:

logger = SummaryLogger(file=None, stdout=True)
pop = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=logger)
pop2 = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise,
                 maximize=True,
                 logger=logger)

# this will create a log, but will list it ... once
pop2.log()

Somehow the two SummaryLoggers are not independent. I am investigating this and writing tests for all these edge cases.
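One possible explanation, which this thread does not confirm, is that both loggers wrap the same named logger from Python's standard `logging` module. `logging.getLogger(name)` returns the same object for the same name, so attaching a handler on every construction makes each message appear once per handler. The sketch below shows that behaviour with the standard library only; the `make_logger` helper and the logger name are hypothetical and are not evol's code.

```python
import io
import logging


def make_logger(stream, name="hypothetical.summary"):
    """Hypothetical constructor that attaches a new handler on every call."""
    logger = logging.getLogger(name)  # same name -> same logger object
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output confined to our own handlers
    logger.addHandler(logging.StreamHandler(stream))
    return logger


stream = io.StringIO()
first = make_logger(stream)
second = make_logger(stream)  # adds a second handler to the same logger

second.info("one line expected")
lines = stream.getvalue().splitlines()

print(first is second)  # True: both names refer to one logger
print(len(lines))       # 2: the message was written once per handler
```

If this is indeed the cause, creating the logger once and sharing it, as in the second snippet above, would attach only one handler and so log each line once.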

Fixed.