[Cs254f11] length of individual thoughts...

Fri Dec 2 15:18:12 EST 2011

Here's something that clearly shows the effects of lexicographic 
parsimony pressure (this is with a population size of 2000):

Generation: 29
Best error: 9
Best program: (pmod (prem (prem percent_in_mixolydian (min (min 
average_velocity percent_in_pentatonic) (max percentage_of_pitchbend 
percent_in_blues))) (min (+ (pmod (prem (prem percent_in_mixolydian (min 
(min average_velocity percent_in_pentatonic) (max 
percentage_of_pitchbend percent_in_blues))) (min (+ 40 (pd 47 
high_note)) (prem percent_in_pentatonic percent_in_lydian))) 
std_dev_mostcom) percentage_of_percussion) average_velocity)) 
std_dev_mostcom)
  Median error: 19
  Average program size: 8.0525
Generation: 30
Best error: 9
Best program: (min (prem percent_in_pentatonic 11) (pd (min (min 
percent_in_pentatonic high_note) percent_in_lydian) std_dev_mostcom))
  Median error: 19
  Average program size: 8.1925

Without lexicographic parsimony pressure and no bloat control method at 
all, my programs, by this generation, were getting up into 40-50 in 
size, which I think (well no, I know) is larger than necessary.

I think I'll try double tournament next, since that article suggests 
that it's a very good method. That seems intuitive to me.

     -Josiah

On 12/2/11 2:56 PM, Wm. Josiah Erikson wrote:
> Neat, I didn't know I was programming lexicographic parsimony 
> pressure. That makes me feel very important indeed :)
>
> Reading the article... not sure I'll get through the entire thing, but 
> thanks for the light reading!
>
>     -Josiah
>
>
> On 12/2/11 2:46 PM, Lee Spector wrote:
>> Ah, you have come to a pretty deep vein of genetic programming 
>> research, on bloat control methods (of which this is one, for which 
>> you could use a fancy name like "lexicographic parsimony pressure").
>>
>> Short answer: Maybe! Depends! On what? Not sure! It'll usually make 
>> things smaller but at what cost in terms of problem solving?
>>
>> Somewhat longer: Certainly a good idea to try sometimes. Or if you're 
>> mainly interested in readability then you might do it only late in a 
>> run (on which there's some literature).
>>
>> The tip of the iceberg: In my journal we've just published a new and 
>> I think very good contribution to the bloat literature, which 
>> includes a good overview of related work. I've put a link to it on 
>> the class moodle page, with the name "Silva et al. GPEM article on 
>> bloat" (it's with the other links under the syllabus info, above the 
>> calendar).
>>
>>   -Lee
>>
>>
>>
>> On Dec 2, 2011, at 2:33 PM, Wm. Josiah Erikson wrote:
>>
>>> Hehe. Yes. Now I pose an interesting question (well, at least *I* 
>>> think it's interesting):
>>>
>>> Does doing this in one's tournament selection (preferring smaller 
>>> individuals with the same error) prevent program bloat? (Yes, I 
>>> think, from preliminary experiments) And does it inhibit evoluation?
>>>
>>> I'm not sure yet on the second question, but I think the answer is 
>>> probably yes as well. It certainly does with small population sizes, 
>>> like 100 - you just hit a local minimum and sit there almost 
>>> immediately. With a population size of 1000, given what I'm doing, 
>>> it seems less clear that this is occurring, and the things I'm 
>>> getting are certainly a lot shorter and more readable! This doesn't 
>>> necessarily mean they are better, but perhaps it's easier to discern 
>>> what they are actually doing and if it's at all interesting 
>>> (probably not).
>>>
>>>     -Josiah
>>>
>>>
>>>
>>> On 12/2/11 2:16 PM, Lee Spector wrote:
>>>> On Dec 2, 2011, at 2:02 PM, Wm. Josiah Erikson wrote:
>>>>
>>>>> You interpreted what I was doing correctly, and actually that 
>>>>> "mistake" you pointed out was intentional, because to do it 
>>>>> "right", the line gets even more insane:
>>>>>
>>>>> (first (first (sort #(<   (codesize (first %1)) (codesize (first 
>>>>> %2))) (filter #(= (second %) (second (first sorted))) sorted))))
>>>>>
>>>>> And it doesn't gain you anything.
>>>>>
>>>> Yeah, well...... in a sense, but in another it gains you clarity of 
>>>> purpose when you try to read/change your code, which is actually 
>>>> pretty valuable.
>>>>
>>>>
>>>>> But I don't quite know how to write the second thing you suggest. 
>>>>> Let's take a stab:
>>>>>
>>>>> (let
>>>>>       [sorted (sort (fn [[prog1 err1] [prog2 err2]] (or (<   err1 
>>>>> err2)(<   (codesize prog1) (codesize prog2)))) population)]
>>>> Close, but not quite.
>>>>
>>>> First, let's reformat this -- I can't stress enough how much this 
>>>> can help, at least for me, and I suspect for everyone:
>>>>
>>>> (let [sorted (sort (fn [[prog1 err1] [prog2 err2]]
>>>>                       (or (<   err1 err2)
>>>>                           (<   (codesize prog1)
>>>>                              (codesize prog2))))
>>>>                     population)]
>>>>
>>>> We're sorting population by a function of two arguments that are 
>>>> destructured into program/error pairs, and the guts of the thing is:
>>>>
>>>> (or (<   err1 err2)
>>>>      (<   (codesize prog1)
>>>>         (codesize prog2)))
>>>>
>>>> So you're telling sort that pair number 1 comes before pair number 
>>>> 2 if its error is lower OR its size is smaller. Which means it'll 
>>>> make it come first if it's smaller, even if it's ... AHA -- you 
>>>> discovered this and emailed me mid sentence!
>>>>
>>>> Now you propose (reformatted!):
>>>>
>>>> (let
>>>>       [sorted (sort (fn [[prog1 err1] [prog2 err2]]
>>>>                       (or (<   err1 err2)
>>>>                           (and (<   (codesize prog1) (codesize prog2))
>>>>                                (= err1 err2))))
>>>>                     population)]
>>>>
>>>> And that looks right to me!
>>>>
>>>>   -Lee
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> ...
>>>>>
>>>>> I think that's what you meant, and is, indeed, a better idea. 
>>>>> Thanks :)
>>>>>
>>>>>     -Josiah
>>>>>
>>>>>
>>>>>
>>>>> On 12/2/11 12:20 PM, Lee Spector wrote:
>>>>>> Close.... or rather, I think it will work but slightly by accident.
>>>>>>
>>>>>> I'm assuming that "sorted" here contains a sequence of [program 
>>>>>> error] pairs -- note that in some of my example code I had those 
>>>>>> pairs the other way around, so make sure you're treating them 
>>>>>> consistently!
>>>>>>
>>>>>> In any event, with that assumption (which is the only thing that 
>>>>>> makes sense given the rest of your code) your call to filter is 
>>>>>> producing a sequence of only those that have the same error as 
>>>>>> the first one (which will be the best). Then you're sorting those 
>>>>>> on the basis of which has the lowest codesize, but you're calling 
>>>>>> codesize on the whole pairs rather than on just the programs. 
>>>>>> Which I guess always comes to the same thing, since the error 
>>>>>> will just add 1 to everything uniformly. So it'll do the right 
>>>>>> thing but not quite in the right way (which you could do by 
>>>>>> calling codesize on just the programs).
>>>>>>
>>>>>> Alternatively, and this is probably what I'd do, you could change 
>>>>>> the predicate used in the call to sort that gives a value to 
>>>>>> "sorted" it say that %1 should come before %2 if either %1 has a 
>>>>>> lower error OR if they have the same error by the program in %1 
>>>>>> has a lower codesize.
>>>>>>
>>>>>>   -Lee
>>>>>>
>>>>>>
>>>>>> On Dec 2, 2011, at 11:24 AM, Wm. Josiah Erikson wrote:
>>>>>>
>>>>>>> So in the way I'm evolving things, I've often got a population 
>>>>>>> full of individuals with similar error. Given Lee's
>>>>>>>
>>>>>>> (let [sorted (sort #(<    (second %1) (second %2)) population)]
>>>>>>>
>>>>>>> would this function get the smallest individual that had the 
>>>>>>> best error? :
>>>>>>>
>>>>>>> (first (first (sort #(<    (codesize %1) (codesize %2)) (filter 
>>>>>>> #(= (second %) (second (first sorted))) sorted))))
>>>>>>>
>>>>>>> (It seems to work)
>>>>>>>
>>>>>>> Is there an easier way to do that? Is this even a good idea? I'm 
>>>>>>> currently just using it as the final return value so that my 
>>>>>>> final result is as easy to read as possible :)
>>>>>>> _______________________________________________
>>>>>>> Cs254f11 mailing list
>>>>>>> Cs254f11 at lists.hampshire.edu
>>>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>>>> -- 
>>>>>> Lee Spector, Professor of Computer Science
>>>>>> Cognitive Science, Hampshire College
>>>>>> 893 West Street, Amherst, MA 01002-3359
>>>>>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>>>>>> Phone: 413-559-5352, Fax: 413-559-5438
>>>>>>
>>>>>> _______________________________________________
>>>>>> Cs254f11 mailing list
>>>>>> Cs254f11 at lists.hampshire.edu
>>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>>> -- 
>>>>> Wm. Josiah Erikson
>>>>> Network Engineer
>>>>> Hampshire College
>>>>> Amherst, MA 01002
>>>>> (413) 559-6091
>>>>>
>>>>> _______________________________________________
>>>>> Cs254f11 mailing list
>>>>> Cs254f11 at lists.hampshire.edu
>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>> -- 
>>>> Lee Spector, Professor of Computer Science
>>>> Cognitive Science, Hampshire College
>>>> 893 West Street, Amherst, MA 01002-3359
>>>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>>>> Phone: 413-559-5352, Fax: 413-559-5438
>>>>
>>> -- 
>>> Wm. Josiah Erikson
>>> Network Engineer
>>> Hampshire College
>>> Amherst, MA 01002
>>> (413) 559-6091
>>>
>> -- 
>> Lee Spector, Professor of Computer Science
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>

-- 
Wm. Josiah Erikson
Network Engineer
Hampshire College
Amherst, MA 01002
(413) 559-6091