A problem with code in Genetic Algorithm Basic.py
clamli opened this issue · 7 comments
你好!感谢你的视频分享,很通俗易懂!但是我有一点小疑问,如下:
for parent in pop:
child = crossover(parent, pop_copy)
child = mutate(child)
parent[:] = child # parent is replaced its child
我对这一段代码的最后一句有一点疑惑,就是为什么要重新更改parent的值呢?它在for循环里作为迭代变量,更改它的值是不是没有意义啊?为什么不是把child的值更新到pop里面?
谢谢你!
哈罗,我不是作者本人。但最近在看这些算法,提了一些问题也在询问作者,如果你不建议可以看看我的想法。
你的想法其实很不错,同时保留父母和子女,那么优秀的父母和子女都可以保留下来,不过这里很重要的一点是在select函数中并没有更改pop的数量,或者说整个代码都没有更改pop的数量,当迭代到一定量时,数据会无限大,程序就会出问题。说了这些之后,你可以自己想想,如果有什么想法可以发给我,谢谢!
对, 像 @igo312 说到的那样, 主要的原因是这个 pop 不能无限大, 如果 pop 越来越大, 消耗的计算资源会越来越多, 程序也越来越慢, 这不是理想的程序形式.
@igo312 对的,可能我没有表述清楚,但是其实我的想法并不是改变pop的数量。pop的数量其实还是和原来一样。我的想法是把pop的值更新为父母交配和变异以后的后代的值,因为两个父母进行交叉配对,生成的还是两个儿子,因此pop的值的个数不会改变。
如果是作者的那样,parent[:] = child,然后接下来又是for parent in pop:,是不是其实新产生的child并没有对下一轮pop的循环产生影响?因为它只是更新了parent的值,而parent的值在下一个循环又被重写了。
我顺便附上我改的一个代码:
"""
Visualize Genetic Algorithm to find a maximum point in a function.
Visit my tutorial website for more: https://morvanzhou.github.io/tutorials/
"""
import numpy as np
import random
import matplotlib.pyplot as plt
DNA_SIZE = 10 # DNA length
POP_SIZE = 100 # population size
CROSS_RATE = 0.8 # mating probability (DNA crossover)
MUTATION_RATE = 0.003 # mutation probability
N_GENERATIONS = 200
X_BOUND = [0, 5] # x upper and lower bounds
def F(x):
return np.sin(10*x)*x + np.cos(2*x)*x # to find the maximum of this function
# find non-zero fitness for selection
def get_fitness(pred):
return pred + 1e-3 - np.min(pred)
# convert binary DNA to decimal and normalize it to a range(0, 5)
def translateDNA(pop):
return pop.dot(2 ** np.arange(DNA_SIZE)[::-1])*1.0 / (2**DNA_SIZE-1) * X_BOUND[1]
def select(pop, fitness): # nature selection wrt pop's fitness
idx = np.random.choice(np.arange(POP_SIZE), size=POP_SIZE, replace=True,
p=fitness/fitness.sum())
return pop[idx]
def crossover(child1, child2): # mating process (genes crossover)
if np.random.rand() < CROSS_RATE:
cross_points = np.random.randint(0, 2, size=DNA_SIZE).astype(np.bool) # choose crossover points
tmp = child1.copy()
child1[cross_points] = child2[cross_points] # mating and produce one child
child2[cross_points] = tmp[cross_points]
return child1, child2
def mutate(child):
for point in range(DNA_SIZE):
if np.random.rand() < MUTATION_RATE:
child[point] = 1 if child[point] == 0 else 0
return child
pop = np.random.randint(0, 2, (1, DNA_SIZE)).repeat(POP_SIZE, axis=0) # initialize the pop DNA
# print pop
plt.ion() # something about plotting
x = np.linspace(X_BOUND[0], X_BOUND[1], 200)
plt.plot(x, F(x))
for _ in range(N_GENERATIONS):
F_values = F(translateDNA(pop)) # compute function value by extracting DNA
# something about plotting
if 'sca' in globals(): sca.remove()
sca = plt.scatter(translateDNA(pop), F_values, s=200, lw=0, c='red', alpha=0.5); plt.pause(0.05)
# GA part (evolution)
fitness = get_fitness(F_values)
# print("Most fitted DNA: ", pop[np.argmax(fitness), :])
print("Most fitted DNA: ", pop[np.argmax(fitness), :])
# print "----------"
# print fitness
pop = select(pop, fitness)
pop_copy = pop.copy()
cnt = 0
while len(pop_copy):
[ind1, ind2] = random.sample(range(0, pop_copy.shape[0]), 2)
child1, child2 = crossover(pop_copy[ind1], pop_copy[ind2])
pop_copy = np.delete(pop_copy, [ind1,ind2], 0)
child1 = mutate(child1)
child2 = mutate(child2)
pop[cnt] = child1
pop[cnt+1] = child2
cnt = cnt + 2
# print "----", ind1, ind2
# print len(pop_copy), cnt
plt.ioff(); plt.show()
@clamli parent 会替代掉 pop 中的原数据的. 下面有个简单的实验:
placeholder = np.zeros((3, 1))
for p in placeholder:
p[:] = 1
print(placeholder)
[[ 1.]
[ 1.]
[ 1.]]
这是 numpy 的一个内存调用属性.
@MorvanZhou 啊对的,是我蠢了,因为python是指针指向对象,其实他们都指向同一个对象的。真是抱歉,问题的根源还是python不够熟练。
谢谢!~