[Push] Advice on Fitness
Rud Merriam
k5rud at arrl.net
Fri May 7 11:41:28 EDT 2010
I suspected most of these points so made changes last night after re-reading
the articles on the web site. Especially helpful was the Robinson thesis
which went into a few more details on PushGP.
added mutation
added better control by percentage of reproduction strategy
removed purging just this morning after your comments
I just did a run with population of 1000, 40 generations with 80% crossover,
5% mutation, and 10% direct reproduction. Tournament size is 7. Fitness is
as described previously: -1000 no boolean, 10 for correct result, o bad
result. I excluded all the RAND instructions. Points maximum is 20.
A fitness of 100, which is 50% correct responses, takes over. Almost all the
programs have FALSE or TRUE in them to generate the response. The few that
don't do something like EXEC.= to generate a boolean.
I observed that there are a lot of duplicate programs generated after
reproduction. Is that usual? If not, I might have something wrong in
crossover not providing sufficient variation.
What other simple test might I try instead of ODD?
- 73 -
Rud Merriam K5RUD
http://mysticlakesoftware.com/
> -----Original Message-----
> From: Lee Spector [mailto:lspector at hampshire.edu]
> Sent: Friday, May 07, 2010 9:12 AM
> To: k5rud at arrl.net
> Cc: push at lists.hampshire.edu
> Subject: Re: [Push] Advice on Fitness
>
>
>
> Hi Rud,
>
> The fitness function looks reasonable to me.
>
> I would probably not include BOOLEAN.RAND or the other RAND
> instructions for this problem, since you are presumably
> looking for a deterministic solution.
>
> Your "purging" step seems potentially problematic to me.
> Usually in GP parents are selected in a random but
> fitness-proportionate way, e.g. via tournaments of size 2-7,
> which will usually mean that poorly-performing individuals
> have a low but non-zero chance of being parents. Maybe that
> specific issue (that below-average guys can't be parents)
> isn't a big deal, but I think the combination of this and the
> way you're selecting crossover parents -- randomly from the
> unpurged? -- might have some bad consequences.
>
> There might be an issue if a lot of programs have exactly the
> average fitness. In that case if you're purging programs with
> strictly less than the average then you won't be purging very
> many, and since it this purging is the only way your
> population can progress, if you don't purge very many then it
> wouldn't be likely that you'd ever get better programs. On
> the other hand if you are purging programs with "less than or
> equal" the average fitness then you may be purging too many.
>
> Another issue may be that among the top half there's no bias
> toward the better programs. In other words, if the top half
> includes a bunch -- let's say the top 10% -- that are really
> good then you'd like them to make more offspring than the
> others. But I don't think you are doing this.
>
> The introduction of new completely random individuals in each
> generation is also non-standard... I can imagine that this
> might slow things down a bit, since your evolved population
> should soon be better than random and you should be
> introducing new genetic material via mutation.
>
> I don't see any mention of mutation... That may also be a
> problem. If you're relying on new material coming in via new
> random individuals and then via crossover then that may not
> be happening if all of those new random individuals are
> purged before the crossover step.
>
> As for maintaining parents in the next generation, normally
> "straight reproduction" is one of the ways in which
> "children" (here actually clones) can be made, so some number
> of good individuals (selected in a fitness proportionate way)
> will make it into the next generation. But this is normally
> not connected to whether or not they also served as parents,
> and it is normally a very small percentage of the population
> (less than 10%).
>
> A lot may also depend on your population size, particularly
> since the ODD problem, having only single Boolean outputs for
> each case, has a fairly coarse fitness landscape -- that is,
> there aren't a lot of possible fitness values, so there
> aren't smooth fitness gradients for evolution to follow, and
> one generally has to compensate for this by sampling a large
> space of programs. I solve it pretty fast with my Clojush
> system with the default population size of 1000, but I don't
> know what you're using.
>
> -Lee
>
More information about the Push
mailing list