[Cs254f11] length of individual thoughts...
Wm. Josiah Erikson
wjerikson at hampshire.edu
Fri Dec 2 15:18:12 EST 2011
Here's something that clearly shows the effects of lexicographic
parsimony pressure (this is with a population size of 2000):
Generation: 29
Best error: 9
Best program: (pmod (prem (prem percent_in_mixolydian (min (min
average_velocity percent_in_pentatonic) (max percentage_of_pitchbend
percent_in_blues))) (min (+ (pmod (prem (prem percent_in_mixolydian (min
(min average_velocity percent_in_pentatonic) (max
percentage_of_pitchbend percent_in_blues))) (min (+ 40 (pd 47
high_note)) (prem percent_in_pentatonic percent_in_lydian)))
std_dev_mostcom) percentage_of_percussion) average_velocity))
std_dev_mostcom)
Median error: 19
Average program size: 8.0525
Generation: 30
Best error: 9
Best program: (min (prem percent_in_pentatonic 11) (pd (min (min
percent_in_pentatonic high_note) percent_in_lydian) std_dev_mostcom))
Median error: 19
Average program size: 8.1925
Without lexicographic parsimony pressure and no bloat control method at
all, my programs, by this generation, were getting up into 40-50 in
size, which I think (well no, I know) is larger than necessary.
I think I'll try double tournament next, since that article suggests
that it's a very good method. That seems intuitive to me.
-Josiah
On 12/2/11 2:56 PM, Wm. Josiah Erikson wrote:
> Neat, I didn't know I was programming lexicographic parsimony
> pressure. That makes me feel very important indeed :)
>
> Reading the article... not sure I'll get through the entire thing, but
> thanks for the light reading!
>
> -Josiah
>
>
> On 12/2/11 2:46 PM, Lee Spector wrote:
>> Ah, you have come to a pretty deep vein of genetic programming
>> research, on bloat control methods (of which this is one, for which
>> you could use a fancy name like "lexicographic parsimony pressure").
>>
>> Short answer: Maybe! Depends! On what? Not sure! It'll usually make
>> things smaller but at what cost in terms of problem solving?
>>
>> Somewhat longer: Certainly a good idea to try sometimes. Or if you're
>> mainly interested in readability then you might do it only late in a
>> run (on which there's some literature).
>>
>> The tip of the iceberg: In my journal we've just published a new and
>> I think very good contribution to the bloat literature, which
>> includes a good overview of related work. I've put a link to it on
>> the class moodle page, with the name "Silva et al. GPEM article on
>> bloat" (it's with the other links under the syllabus info, above the
>> calendar).
>>
>> -Lee
>>
>>
>>
>> On Dec 2, 2011, at 2:33 PM, Wm. Josiah Erikson wrote:
>>
>>> Hehe. Yes. Now I pose an interesting question (well, at least *I*
>>> think it's interesting):
>>>
>>> Does doing this in one's tournament selection (preferring smaller
>>> individuals with the same error) prevent program bloat? (Yes, I
>>> think, from preliminary experiments) And does it inhibit evoluation?
>>>
>>> I'm not sure yet on the second question, but I think the answer is
>>> probably yes as well. It certainly does with small population sizes,
>>> like 100 - you just hit a local minimum and sit there almost
>>> immediately. With a population size of 1000, given what I'm doing,
>>> it seems less clear that this is occurring, and the things I'm
>>> getting are certainly a lot shorter and more readable! This doesn't
>>> necessarily mean they are better, but perhaps it's easier to discern
>>> what they are actually doing and if it's at all interesting
>>> (probably not).
>>>
>>> -Josiah
>>>
>>>
>>>
>>> On 12/2/11 2:16 PM, Lee Spector wrote:
>>>> On Dec 2, 2011, at 2:02 PM, Wm. Josiah Erikson wrote:
>>>>
>>>>> You interpreted what I was doing correctly, and actually that
>>>>> "mistake" you pointed out was intentional, because to do it
>>>>> "right", the line gets even more insane:
>>>>>
>>>>> (first (first (sort #(< (codesize (first %1)) (codesize (first
>>>>> %2))) (filter #(= (second %) (second (first sorted))) sorted))))
>>>>>
>>>>> And it doesn't gain you anything.
>>>>>
>>>> Yeah, well...... in a sense, but in another it gains you clarity of
>>>> purpose when you try to read/change your code, which is actually
>>>> pretty valuable.
>>>>
>>>>
>>>>> But I don't quite know how to write the second thing you suggest.
>>>>> Let's take a stab:
>>>>>
>>>>> (let
>>>>> [sorted (sort (fn [[prog1 err1] [prog2 err2]] (or (< err1
>>>>> err2)(< (codesize prog1) (codesize prog2)))) population)]
>>>> Close, but not quite.
>>>>
>>>> First, let's reformat this -- I can't stress enough how much this
>>>> can help, at least for me, and I suspect for everyone:
>>>>
>>>> (let [sorted (sort (fn [[prog1 err1] [prog2 err2]]
>>>> (or (< err1 err2)
>>>> (< (codesize prog1)
>>>> (codesize prog2))))
>>>> population)]
>>>>
>>>> We're sorting population by a function of two arguments that are
>>>> destructured into program/error pairs, and the guts of the thing is:
>>>>
>>>> (or (< err1 err2)
>>>> (< (codesize prog1)
>>>> (codesize prog2)))
>>>>
>>>> So you're telling sort that pair number 1 comes before pair number
>>>> 2 if its error is lower OR its size is smaller. Which means it'll
>>>> make it come first if it's smaller, even if it's ... AHA -- you
>>>> discovered this and emailed me mid sentence!
>>>>
>>>> Now you propose (reformatted!):
>>>>
>>>> (let
>>>> [sorted (sort (fn [[prog1 err1] [prog2 err2]]
>>>> (or (< err1 err2)
>>>> (and (< (codesize prog1) (codesize prog2))
>>>> (= err1 err2))))
>>>> population)]
>>>>
>>>> And that looks right to me!
>>>>
>>>> -Lee
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> ...
>>>>>
>>>>> I think that's what you meant, and is, indeed, a better idea.
>>>>> Thanks :)
>>>>>
>>>>> -Josiah
>>>>>
>>>>>
>>>>>
>>>>> On 12/2/11 12:20 PM, Lee Spector wrote:
>>>>>> Close.... or rather, I think it will work but slightly by accident.
>>>>>>
>>>>>> I'm assuming that "sorted" here contains a sequence of [program
>>>>>> error] pairs -- note that in some of my example code I had those
>>>>>> pairs the other way around, so make sure you're treating them
>>>>>> consistently!
>>>>>>
>>>>>> In any event, with that assumption (which is the only thing that
>>>>>> makes sense given the rest of your code) your call to filter is
>>>>>> producing a sequence of only those that have the same error as
>>>>>> the first one (which will be the best). Then you're sorting those
>>>>>> on the basis of which has the lowest codesize, but you're calling
>>>>>> codesize on the whole pairs rather than on just the programs.
>>>>>> Which I guess always comes to the same thing, since the error
>>>>>> will just add 1 to everything uniformly. So it'll do the right
>>>>>> thing but not quite in the right way (which you could do by
>>>>>> calling codesize on just the programs).
>>>>>>
>>>>>> Alternatively, and this is probably what I'd do, you could change
>>>>>> the predicate used in the call to sort that gives a value to
>>>>>> "sorted" it say that %1 should come before %2 if either %1 has a
>>>>>> lower error OR if they have the same error by the program in %1
>>>>>> has a lower codesize.
>>>>>>
>>>>>> -Lee
>>>>>>
>>>>>>
>>>>>> On Dec 2, 2011, at 11:24 AM, Wm. Josiah Erikson wrote:
>>>>>>
>>>>>>> So in the way I'm evolving things, I've often got a population
>>>>>>> full of individuals with similar error. Given Lee's
>>>>>>>
>>>>>>> (let [sorted (sort #(< (second %1) (second %2)) population)]
>>>>>>>
>>>>>>> would this function get the smallest individual that had the
>>>>>>> best error? :
>>>>>>>
>>>>>>> (first (first (sort #(< (codesize %1) (codesize %2)) (filter
>>>>>>> #(= (second %) (second (first sorted))) sorted))))
>>>>>>>
>>>>>>> (It seems to work)
>>>>>>>
>>>>>>> Is there an easier way to do that? Is this even a good idea? I'm
>>>>>>> currently just using it as the final return value so that my
>>>>>>> final result is as easy to read as possible :)
>>>>>>> _______________________________________________
>>>>>>> Cs254f11 mailing list
>>>>>>> Cs254f11 at lists.hampshire.edu
>>>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>>>> --
>>>>>> Lee Spector, Professor of Computer Science
>>>>>> Cognitive Science, Hampshire College
>>>>>> 893 West Street, Amherst, MA 01002-3359
>>>>>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>>>>>> Phone: 413-559-5352, Fax: 413-559-5438
>>>>>>
>>>>>> _______________________________________________
>>>>>> Cs254f11 mailing list
>>>>>> Cs254f11 at lists.hampshire.edu
>>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>>> --
>>>>> Wm. Josiah Erikson
>>>>> Network Engineer
>>>>> Hampshire College
>>>>> Amherst, MA 01002
>>>>> (413) 559-6091
>>>>>
>>>>> _______________________________________________
>>>>> Cs254f11 mailing list
>>>>> Cs254f11 at lists.hampshire.edu
>>>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>>> --
>>>> Lee Spector, Professor of Computer Science
>>>> Cognitive Science, Hampshire College
>>>> 893 West Street, Amherst, MA 01002-3359
>>>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>>>> Phone: 413-559-5352, Fax: 413-559-5438
>>>>
>>> --
>>> Wm. Josiah Erikson
>>> Network Engineer
>>> Hampshire College
>>> Amherst, MA 01002
>>> (413) 559-6091
>>>
>> --
>> Lee Spector, Professor of Computer Science
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>
--
Wm. Josiah Erikson
Network Engineer
Hampshire College
Amherst, MA 01002
(413) 559-6091
More information about the Cs254f11
mailing list