[Cs254f11] My plan

Mon Oct 17 15:55:22 EDT 2011

I think my confusion on this is because you're using "fitness test" to describe the things that you're evolving, but there's also the "fitness test" that says how good each of those things is, and I keep getting them all mixed up when I read/hear what you're doing.

Let's call the things that you're evolving "critics." Each of these takes a MIDI file and returns a number saying how good the music in the file is. We'll call that returned number the critic's "judgement" of the MIDI file. You're trying to evolve a critic that produces judgements that match your own. One *might* use the word "fitness" for judgements and/or "fitness test" for critics but let's not! You're evolving critics that take MIDI files and produce judgements.

The "fitness function," on the other hand, is a function that takes a critic and returns a number that says how good the critic is, which we'll call the critic's "fitness."
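
In case it helps to see the distinction concretely, here is a minimal sketch in Python. It is not your actual code: every name and the toy data are made up for illustration. It assumes that a critic is just a function from the CSV text of a converted MIDI file to a number, and that fitness is the sum of the absolute errors between the critic's judgements and your hand-assigned scores (as in my earlier message quoted below), so lower is better.

```python
from typing import Callable, Dict

# Assumption: a critic takes the CSV text of one converted MIDI file and
# returns a number, its "judgement" of that file.
Critic = Callable[[str], float]


def fitness(critic: Critic,
            files: Dict[str, str],
            my_scores: Dict[str, float]) -> float:
    """Return the critic's fitness: the total absolute error between its
    judgements and the hand-assigned scores. Lower is better; 0 is perfect."""
    return sum(abs(critic(files[name]) - my_scores[name])
               for name in my_scores)


def line_count_critic(csv_text: str) -> float:
    """A hypothetical hand-written critic: judge a file by its line count."""
    return float(len(csv_text.splitlines()))


# Made-up stand-ins for converted MIDI files and hand-assigned scores.
files = {"a.csv": "row1\nrow2\nrow3", "b.csv": "row1"}
my_scores = {"a.csv": 5.0, "b.csv": 1.0}

# Judgements are 3.0 and 1.0, so the errors are 2.0 and 0.0.
print(fitness(line_count_critic, files, my_scores))  # prints 2.0
```

Note that nothing in fitness() is random: the same critic always gets the same fitness. Evolution would operate on a population of critics, and fitness() is only the yardstick used to compare them.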

Can you say what you're doing and/or ask about what you might do with this terminological switch? Then I think I might be able to answer without getting pretzel brain :-)

 -Lee

On Oct 17, 2011, at 1:08 PM, wjeNS at hampshire.edu wrote:

> Thanks to you both for the excellent thoughts.
> 
> My original theory was to not teach it about notes at all, but perhaps that
> won't work. I can certainly easily write high-level functions that will pull
> notes out. I think I'd like to try it without teaching it anything about what
> notes are first, just to see what happens.
> 
> And on your new point, Lee, I think that I was unclear. I didn't mean to do
> random things IN the fitness test, but rather to randomly generate the initial
> generation of the fitness test itself, obviously (that's what makes it GP,
> right?). I was wondering whether it makes sense to have part of the genome be
> the value assigned by each piece of the fitness test that makes up each whole
> fitness test (because each individual fitness test will be made up of little
> genomes that are both tests and the values assigned when each individual test
> is passed). What I mean is that the "soup" that I can make random fitness
> tests out of will be little tests that assign values between x and y if they
> "succeed". Or perhaps I'm thinking about this totally wrong.
> 
> 
> 
> Quoting Lee Spector <lspector at hampshire.edu>:
> 
>> 
>> What Tom said :-)
>> 
>> On his first point, try to structure things so that the population size and
>> number of survivors are parameters, so that you can play with them easily if
>> you find that you want to.
>> 
>> On the second, I know that you don't want to pre-digest the data too much,
>> but it still might be a good idea to provide some reasonably-high-level
>> functions for evolution to work with, e.g. to pull out all of the note
>> numbers within a chunk of the file, or do averaging of notes or intervals, or
>> something like that. Hard to say what will be useful, and it depends on what
>> else will be in your function set and what data types your functions will
>> work with, but often it will help to provide evolution with a reasonable
>> toolkit.
>> 
>> On the third: definitely excellent advice.
>> 
>> A new point: I don't see why you'd want to do *anything* random in your
>> fitness test. Don't you just want to evaluate the fitness of a program by
>> running it on each MIDI file, getting a score for each, and adding up the
>> errors of those scores relative to your hand-coded scores?
>> 
>> -Lee
>> 
>> On Oct 17, 2011, at 12:12 PM, Thomas Helmuth wrote:
>> 
>>> A few thoughts, which might be useful to everyone:
>>> 
>>> - Why limit yourself to 2 survivors and 20 individuals (soup-mixes) each
>>> generation? A population and survivor count this small can sometimes work,
>>> but is generally discouraged in the evolutionary computation community.
>>> One reason is that having very few surviving individuals is not very
>>> robust - you may lose promising individuals to ones that are only slightly
>>> better in a totally different way. Similarly, this will cause your
>>> algorithm to concentrate on only a small area of the search space, instead
>>> of using the power of a larger number of surviving individuals to search
>>> the space more in parallel, which also helps prevent it from getting stuck
>>> in local optima. I would suggest using at least 100 to 200 individuals and
>>> 10 to 50 surviving individuals that are allowed to reproduce. (One reason
>>> you may choose to go with a smaller population is if your fitness
>>> evaluations are going to be very computationally expensive. But it doesn't
>>> seem like that will be the case here.)
>>> 
>>> - I think you need to think a bit about how your individuals will read
>>> your MIDI files. Are you going to give evolution instructions to read the
>>> whole file, read a specific line of the file, or read a specific value
>>> from the CSV? If it is a specific value, will it be indexed by line and
>>> column, or just by a single integer? To me, it seems like none of these
>>> methods may be strong enough to give evolution the things it needs, so you
>>> might need to give it more specialized input methods, such as the interval
>>> between a pair of notes, etc. Though, I might be wrong here.
>>> 
>>> - You're a bit vague about your program language and structure. Will these
>>> be Lisp expressions? What instructions will be included? It is always good
>>> when developing a genetic algorithm to write a sample program by hand that
>>> won't solve your problem, but which you think might have a chance at doing
>>> better than random junk. If you find you can't do so with the instructions
>>> you give evolution, chances are that it won't do any better.
>>> 
>>> I hope this helps!
>>> 
>>> -Tom
>>> 
>>> On Mon, Oct 17, 2011 at 11:13 AM, Wm. Josiah Erikson
>>> <wjerikson at hampshire.edu> wrote:
>>> So I'm now messing about with reading in files and putting them into data
>>> structures that I can do useful things with, but before I get too deep into
>>> committing myself to particular methods, I want some people who know more
>>> than me to read over my plan and tell me if they see any big problems with
>>> this other than the questions I have already identified below that I will
>>> have to figure out. Comments from my code:
>>> 
>>> ;I'm going to attempt to evolve a program that has the same opinions of
>>> ;various MIDI files that I do.
>>> ;I'm going to listen to 20-50 publicly available MIDI files and score
>>> ;them, then hardcode those scores into this program. Then I'm going to
>>> ;convert said MIDI files into .csv, and attempt to evolve some scoring
>>> ;method that comes up with the same scores for these files that I do. I'm
>>> ;going to do this without teaching my program very much about how the MIDI
>>> ;file is formatted, so my evolutionary "soup" will be fairly dumb.
>>> ;The idea behind this is that one of the things that GP does well is to
>>> ;find patterns in things that we don't understand very well, and in that
>>> ;light, I ought to be able to evolve a music critic without teaching it
>>> ;the format of MIDI files - it will try sampling the MIDI file in lots of
>>> ;random ways, applying random scores (think this through - maybe actually
>>> ;I have to always apply the same scores? I can't have randomness twice,
>>> ;unless both randomnesses are factored into making that "genome" unique,
>>> ;otherwise I'll get faulty evolution), then applying the same set of "soup
>>> ;components" to all of the MIDI files, comparing the scores to my scores,
>>> ;adding up the differences, and then that will be the fitness value of
>>> ;that particular "soup mix" or genetic code.
>>> ;Each generation will have, say, 20 different "soup mixes". I will pick
>>> ;the two that have the best fitness values, breed them together and
>>> ;randomly mutate them (how, exactly?) to get 20 new ones, then start over.
>>> ;I will stop after either an arbitrary number of generations or when the
>>> ;fitness value goes below x, and then see what I get!
>>> 
>>> Thanks in advance for any feedback!
>>> 
>>> --
>>> Wm. Josiah Erikson
>>> Network Engineer
>>> Hampshire College
>>> Amherst, MA 01002
>>> (413) 559-6091
>>> 
>>> _______________________________________________
>>> Cs254f11 mailing list
>>> Cs254f11 at lists.hampshire.edu
>>> https://lists.hampshire.edu/mailman/listinfo/cs254f11
>>> 
>> 
>> --
>> Lee Spector, Professor of Computer Science
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>> 
>> 
> 
> 

--
Lee Spector, Professor of Computer Science
Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspector at hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438