[Cs254f11] My plan

Thomas Helmuth thGST at hampshire.edu
Tue Oct 18 11:00:08 EDT 2011


To me, it sounds like you're trying to do the following:

Each genome is a small chunk of random code that looks at the MIDI file and
calculates a number; in the machine learning literature, I believe this is
often called a "feature" of the data. To be redundant for the sake of
clarity, each of these genomes, or "features", is a function from the MIDI
file to a real number. Then, each critic takes a subset of all created
genomes, and adds them together to create a judgement on the MIDI. This may
or may not involve learning weights that the critic uses to make certain
genomes more important than others. You then plan on breeding critics by
having them swap genomes between each other, but not changing the code of
the genomes.

To me, this sounds more like a machine learning articulation of the problem
than a genetic programming articulation. If you were doing machine learning,
you would have a single critic that has all of the genomes, and then just
learn the weights to use when adding the results of applying genomes to
MIDIs. While this could potentially work, it seems unlikely that one would
be able to create the right genomes from random code to the extent necessary
to solve this problem.
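
To make the machine learning version concrete, here is a minimal sketch,
written in Python purely for illustration (not the Clojure we use in class).
Everything in it is an assumption made up for the example: the two feature
functions, the names judge and learn_weights, and the idea that a song is a
list of CSV-style lines. The features stay fixed; only the weights change.

```python
# Hypothetical sketch: fixed feature functions, learned weights.
# A "song" is assumed to be a list of CSV-style lines; the features are made up.

def line_count(song):
    # Feature 1: how many lines the file has.
    return float(len(song))

def mean_fifth_field(song):
    # Feature 2: average of the 5th comma-separated field (assumed numeric).
    values = [float(line.split(",")[4]) for line in song]
    return sum(values) / len(values)

FEATURES = [line_count, mean_fifth_field]

def judge(weights, song):
    # The single critic's judgement is a weighted sum of feature values.
    return sum(w * f(song) for w, f in zip(weights, FEATURES))

def learn_weights(songs, targets, steps=2000, rate=0.01):
    # Plain gradient descent on squared error. Only the weights are updated;
    # the feature code never changes. Large feature values would need
    # rescaling (or a smaller rate) for this to stay stable.
    weights = [0.0] * len(FEATURES)
    for _ in range(steps):
        for song, target in zip(songs, targets):
            error = judge(weights, song) - target
            for i, f in enumerate(FEATURES):
                weights[i] -= rate * error * f(song)
    return weights
```

Nothing here is evolved: if the right features are not already in FEATURES,
no choice of weights will produce a good critic.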

I will try to expound on Lee's suggestions to try to make it more clear how
you could turn this into a genetic programming articulation of the problem.
First, as Lee said, it would be good to have each critic be a single
function, which takes a MIDI file as input and returns a judgment in the
form of a real number. At the beginning of evolution, each critic will be
random code. During evolution, you will create new critics from existing
ones, using operators such as mutation (taking a single individual and
changing some of its code to create one new critic) or crossover (taking two
or more individuals and combining their code in some way to create one new
critic). To evaluate a critic, you will likely want to run the critic on
each MIDI file, comparing how close its judgements are to yours to create a
fitness value. The instructions that you make available to your critics to
use will be the instructions that you previously planned on using in your
genomes. But, critics will be large programs that may arithmetically combine
many different aspects of the MIDI, instead of small bits of code as in your
genomes.
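
As a rough illustration of the genetic programming version, here is a small
sketch, again in Python only for the sake of example. All of it is assumed
rather than taken from anyone's actual code: the two song-reading
instructions, the tuple representation of a critic's code, the selection
scheme, and every function name. Each critic is one expression tree that
maps a song (a list of CSV-style lines) to a real number.

```python
import math
import random

# Hypothetical "instructions" that read a number from the song.
def line_count(song):
    return float(len(song))

def mean_fifth_field(song):
    values = [float(line.split(",")[4]) for line in song]
    return sum(values) / len(values)

TERMINALS = [line_count, mean_fifth_field, 1.0, 2.0]
OPERATORS = {"+": lambda a, b: a + b,
             "-": lambda a, b: a - b,
             "*": lambda a, b: a * b}

def random_tree(rng, depth=3):
    # A critic is either a terminal or a tuple (operator, left, right).
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(OPERATORS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def run(tree, song):
    # Deterministic: the same critic on the same song gives the same score.
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPERATORS[op](run(left, song), run(right, song))
    return tree(song) if callable(tree) else tree

def subtrees(tree, path=()):
    # Yield every (path, subtree) pair in the critic's code.
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    # Return a copy of tree with the subtree at path swapped for new.
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def mutate(rng, tree):
    # One parent: replace a random piece of its code with fresh random code.
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, depth=2))

def crossover(rng, mother, father):
    # Two parents: graft a random piece of father's code into mother.
    path, _ = rng.choice(list(subtrees(mother)))
    _, donor = rng.choice(list(subtrees(father)))
    return replace(mother, path, donor)

def fitness(tree, songs, targets):
    # Total absolute deviation from the human scores; lower is better.
    total = sum(abs(run(tree, s) - t) for s, t in zip(songs, targets))
    return total if math.isfinite(total) else float("inf")

def evolve(songs, targets, rng, pop_size=50, generations=20):
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, songs, targets))
        parents = population[:pop_size // 5]
        children = [population[0]]  # keep the best critic unchanged
        while len(children) < pop_size:
            if rng.random() < 0.5:
                children.append(mutate(rng, rng.choice(parents)))
            else:
                children.append(
                    crossover(rng, rng.choice(parents), rng.choice(parents)))
        population = children
    return min(population, key=lambda t: fitness(t, songs, targets))
```

The point of the sketch is that mutation and crossover change the code of
the critic itself, which is what distinguishes this from swapping fixed
genomes between critics.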

I hope this is helpful in understanding how your proposed direction, though
possibly very interesting, was not really what is normally called "genetic
programming".

-Tom

P.S. Lee: This thread is making me wonder if there is some general confusion
in the class about actually evolving programs. Maybe we should discuss this
more in class?

On Tue, Oct 18, 2011 at 9:35 AM, Wm. Josiah Erikson <wjerikson at hampshire.edu> wrote:

> Right, yes, I think that randomly GENERATING the genomes to make each
> critic will be the key, but they will stay the same after they have been
> generated, otherwise evolution will be impossible. I agree with and
> understand you here.
> Where I don't understand you, I think, is where you suggest that I should
> have each genome BE a critic. I'm not sure how I would then breed the
> critics together, since I was thinking of a genome as being the smallest
> possible unit that did anything useful (and the combination of these genomes
> would be what made a critic unique). However, as I go to code this, I may
> discover what you mean, as perhaps this makes no sense in practical terms. I
> suppose one could breed the genomes together by giving the arguments that
> one genome passed to a function to the function in another genome, but I
> don't think that will create useful evolution. I was thinking of each genome
> as being a random function (I would write a set of them that performed
> operations on the file) and random argument (s) within meaningful parameters
> (like length of the line, or number of lines in the file). This doesn't
> really make for tree-based GP, though, I suppose.... hmm... not sure
> anything other than a depth of one would work properly or usefully in this
> context, though. Maybe I'm boxing myself in.
>
>    -Josiah
>
>
>
>
> On 10/17/11 8:15 PM, Lee Spector wrote:
>
>> [Re-adding the list to the cc: -- hope that's okay! I do think that others
>> will also benefit from this.]
>>
>> Comments below...
>>
>>
>> On Oct 17, 2011, at 4:22 PM, Wm. Josiah Erikson wrote:
>>
>>>    OK, so each critic is made up of a number of "scoring genomes", that
>>> are either generated randomly or pulled randomly from a large predetermined
>>> set of soup of things you could do to get a score. Add all the scoring
>>> genomes together and you get a critic, which will then evaluate each song
>>> and the total deviation of each individual score from my own will be that
>>> critic's fitness. Then I will breed together the x best critics in each
>>> generation of y members (adjustable) and run it again.
>>>
>> It sounds from the below like each "scoring genome" can itself be a pretty
>> complicated beast, do math and comparisons etc. So why couldn't *one* of
>> these be able to do all of the things that you're thinking that a bunch of
>> them could do summed together? You're saying that a critic will be the sum
>> of a bunch of things, each of which can do a bunch of things including
>> making sums of bunches of things... right? If I've got that right then I
>> think it'll be simpler and probably just as good to make each critic *be*
>> just *one* of these scoring genomes.
>>
>>>    Now each scoring genome could both DO random things (like pull two
>>> random characters out from two random lines and compare them to each other,
>>> or pull a random character from a random line and compare it to the most
>>> common character in the whole file, or a million other possibilities) and
>>> could also assign scores to those things, positive, negative, who knows. It
>>> could also be that having a particular genome hard-coded with the actual
>>> positions in the files that it pulls from, once it's been randomly
>>> generated, or whatever, would be helpful and create more consistent results
>>> when breeding the resulting critic with another one.
>>>
>> My advice is to avoid any randomness in the actions or calculations of the
>> scoring genomes. In other words, the same genome, if evaluated twice on the
>> same file, should produce exactly the same score.
>>
>> It's not that I can't imagine it being useful to do random stuff (or, as
>> you suggest below, grab random data from a file) as part of calculating a
>> score, but rather that I predict that randomness here will make evolution
>> impossible.
>>
>> If your scores are nondeterministic (giving different scores on different
>> applications to the same data) then your fitnesses (the differences between
>> the scores and your own personal scores) will also be nondeterministic. This
>> means that the "selection" performed in the evolutionary loop will be basing
>> its selections on luck to a fairly large extent, and that this luck factor
>> may well be as important as actual quality in determining who gets selected.
>> This would mean that the evolutionary loop will not be able to amplify
>> quality over generations.
>>
>> If you really want to have scoring genomes that involve randomness then I
>> think you'd have to take pretty serious countermeasures, like testing each
>> one by running it a large number of times and averaging the results. But I think
>> it'd be simpler and better just to leave out the randomness altogether.
>>
>> Of course, if you're not going to be using a random number you'll have to
>> have a non-random number and that will have to come from somewhere.
>> Presumably there could just be numbers in the genome itself, or some
>> functions could operate on *all* lines in a file or all lines of a
>> particular type, etc.
>>
>>
>>>    It will be interesting to see what tools I can come up with to try to
>>> help this evolve towards something useful, i.e. what to put in the soup. The
>>> ideas I have so far involve
>>>
>>> Fetching:
>>>    -set of characters, same random position on each line of the file
>>> (that position would be in the genome and would get passed on from
>>> generation to generation)
>>>    -getting the most common character in the file
>>>    -random number of random characters from the file on different lines
>>>    -same as above except same line
>>>    -random initial number determines starting number of the position on
>>> the line, increase for each line until out of characters
>>>
>>> Operators:
>>>    -modulus division, addition, subtraction, multiplication, division,
>>> mode, mean, median, and mixtures of all of these with comparison
>>>
>>>    Maybe this is all too heterogeneous and I need to make everything take
>>> two operators or something. I'll see as I start actually implementing this.
>>> I'm going to start with a few simple genomes :)
>>>
>> All of this looks good except for the randomness...
>>
>>>    I figured out the slurp problem - for some reason when I stick it in a
>>> vector of lists, it works fine:
>>>
>>> (def rita (vec (string/split-lines (slurp "/Volumes/cs254/group_storage/josiah/LovelyRita.csv"))))
>>>
>>> (get rita (rand-int (count rita)))
>>> ;"11, 6864, Note_on_c, 9, 42, 0"
>>>
>>>
>> Well... I don't know, but this does have half of the smell of a laziness
>> problem, since calling vec on something lazy forces it to be fully realized.
>> But the doc string on split-lines says it's not lazy, and neither should
>> slurp be. So this doesn't fully make sense. Another thing that doesn't is
>> that you said you had the same problem with the slurp example in clojinc
>> with Jabberwocky.txt, and I haven't experienced any such problem... it
>> returns instantly for me.
>>
>>  -Lee
>>
>>
>>
>>
>>
>> --
>> Lee Spector, Professor of Computer Science
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>>
> --
> Wm. Josiah Erikson
> Network Engineer
> Hampshire College
> Amherst, MA 01002
> (413) 559-6091
>
> _______________________________________________
> Cs254f11 mailing list
> Cs254f11 at lists.hampshire.edu
> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>