[Push] Advice on Fitness

Fri May 7 14:57:31 EDT 2010

Thanks much to the tips and your time, Lee. Hope you have a good weekend.

 - 73 - 
Rud Merriam K5RUD 
http://mysticlakesoftware.com/

> -----Original Message-----
> From: Lee Spector [mailto:lspector at hampshire.edu] 
> Sent: Friday, May 07, 2010 1:21 PM
> To: k5rud at arrl.net
> Cc: push at lists.hampshire.edu
> Subject: Re: [Push] Advice on Fitness
> 
> 
> 
> Rud,
> 
> Aside from ODD some of the things I'd try are symbolic 
> regression, maybe both some polynomials and something like 
> factorial, and maybe some boolean problems (e.g. parity). But 
> I think I'd keep working on ODD to work out some of the kinks 
> in your system that it is exposing first.
> 
> The large number of duplicates sounds like a clue. It could 
> indeed mean that something's wrong with your 
> crossover/mutation operations, so I would test those 
> independently, and/or it could mean that you're doing too 
> much direct reproduction. I did say that this is usually less 
> than 10%, but often it's much lower, like 2%. You say you're 
> doing 80 xover, 5 mutation, 10 reproduction.... what's the other 5?
> 
> Another issue here is that answering constant TRUE or FALSE 
> is just a really good local maximum for this problem. You get 
> 50% of the inputs right and you can do it very simply. So 
> it's hard to get over that, especially with strong selection 
> (tournament=7 is pretty strong; I'd see what happens if you 
> change this to 2) and lots of direct reproduction. 
> 
> If you've ensured that your genetic operators are working 
> correctly, and done the other things to try to slow down 
> convergence to a local maximum (lower tournament size, less 
> straight reproduction), and you're still not getting 
> successes then you could try something more heavy-handed like 
> penalizing programs that give constant output (the same 
> output for all inputs). That's not so unreasonable, since you 
> often know that your answer shouldn't be constant. As a 
> practical matter it can often be useful to penalize other 
> known local maxima as well -- although this isn't a very 
> general technique, it can be handy if you're trying to solve 
> a real problem. But for this problem you shouldn't have to go 
> that far -- at least I've never had to do anything like that 
> to find solutions in other Push systems.
> 
>  -Lee
> 
> 
> On May 7, 2010, at 11:41 AM, Rud Merriam wrote:
> 
> > I suspected most of these points so made changes last night after 
> > re-reading the articles on the web site. Especially helpful was the 
> > Robinson thesis which went into a few more details on PushGP.
> > 
> > 	added mutation
> > 	added better control by percentage of reproduction strategy
> > 	removed purging just this morning after your comments
> > 
> > I just did a run with population of 1000, 40 generations with 80% 
> > crossover, 5% mutation, and 10% direct reproduction. 
> Tournament size 
> > is 7. Fitness is as described previously: -1000 no boolean, 10 for 
> > correct result, o bad result. I excluded all the RAND instructions. 
> > Points maximum is 20.
> > 
> > A fitness of 100, which is 50% correct responses, takes 
> over. Almost 
> > all the programs have FALSE or TRUE in them to generate the 
> response. 
> > The few that don't do something like EXEC.= to generate a boolean.
> > 
> > I observed that there are a lot of duplicate programs 
> generated after 
> > reproduction. Is that usual? If not, I might have something 
> wrong in 
> > crossover not providing sufficient variation.
> > 
> > What other simple test might I try instead of ODD?
> > 
> > 
> > 
> > 
> > - 73 -
> > Rud Merriam K5RUD 
> > http://mysticlakesoftware.com/
> > 
> > 
> >> -----Original Message-----
> >> From: Lee Spector [mailto:lspector at hampshire.edu]
> >> Sent: Friday, May 07, 2010 9:12 AM
> >> To: k5rud at arrl.net
> >> Cc: push at lists.hampshire.edu
> >> Subject: Re: [Push] Advice on Fitness
> >> 
> >> 
> >> 
> >> Hi Rud,
> >> 
> >> The fitness function looks reasonable to me.
> >> 
> >> I would probably not include BOOLEAN.RAND or the other RAND
> >> instructions for this problem, since you are presumably 
> >> looking for a deterministic solution. 
> >> 
> >> Your "purging" step seems potentially problematic to me.
> >> Usually in GP parents are selected in a random but 
> >> fitness-proportionate way, e.g. via tournaments of size 2-7, 
> >> which will usually mean that poorly-performing individuals 
> >> have a low but non-zero chance of being parents. Maybe that 
> >> specific issue (that below-average guys can't be parents) 
> >> isn't a big deal, but I think the combination of this and the 
> >> way you're selecting crossover parents -- randomly from the 
> >> unpurged? -- might have some bad consequences.
> >> 
> >> There might be an issue if a lot of programs have exactly the
> >> average fitness. In that case if you're purging programs with 
> >> strictly less than the average then you won't be purging very 
> >> many, and since it this purging is the only way your 
> >> population can progress, if you don't purge very many then it 
> >> wouldn't be likely that you'd ever get better programs. On 
> >> the other hand if you are purging programs with "less than or 
> >> equal" the average fitness then you may be purging too many.
> >> 
> >> Another issue may be that among the top half there's no bias
> >> toward the better programs. In other words, if the top half 
> >> includes a bunch -- let's say the top 10% -- that are really 
> >> good then you'd like them to make more offspring than the 
> >> others. But I don't think you are doing this.
> >> 
> >> The introduction of new completely random individuals in each
> >> generation is also non-standard... I can imagine that this 
> >> might slow things down a bit, since your evolved population 
> >> should soon be better than random and you should be 
> >> introducing new genetic material via mutation.
> >> 
> >> I don't see any mention of mutation... That may also be a
> >> problem. If you're relying on new material coming in via new 
> >> random individuals and then via crossover then that may not 
> >> be happening if all of those new random individuals are 
> >> purged before the crossover step.
> >> 
> >> As for maintaining parents in the next generation, normally
> >> "straight reproduction" is one of the ways in which 
> >> "children" (here actually clones) can be made, so some number 
> >> of good individuals (selected in a fitness proportionate way) 
> >> will make it into the next generation. But this is normally 
> >> not connected to whether or not they also served as parents, 
> >> and it is normally a very small percentage of the population 
> >> (less than 10%).
> >> 
> >> A lot may also depend on your population size, particularly
> >> since the ODD problem, having only single Boolean outputs for 
> >> each case, has a fairly coarse fitness landscape -- that is, 
> >> there aren't a lot of possible fitness values, so there 
> >> aren't smooth fitness gradients for evolution to follow, and 
> >> one generally has to compensate for this by sampling a large 
> >> space of programs. I solve it pretty fast with my Clojush 
> >> system with the default population size of 1000, but I don't 
> >> know what you're using.
> >> 
> >> -Lee
> >> 
> > 
> 
> --
> Lee Spector, Professor of Computer Science
> School of Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspector at hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
> 
> Check out Genetic Programming and Evolvable Machines: 
> http://www.springer.com/10710 - http://gpemjournal.blogspot.com/
> 
> _______________________________________________
> Push mailing list
> Push at lists.hampshire.edu 
> https://lists.hampshire.edu/mailman/listinfo/push
>