Talk:Evolutionary algorithm

A Priori Assumptions?

Looking at this project, I wonder how it demonstrates evolution (in the classic goo-to-you-via-the-zoo sense.) "Methinks it is like a weasel" is like getting in a time machine and going back to the beginning with the express purpose of directing an undirected process to produce a weasel. At the beginning there exists no information for weasels. Starting with no information, we have to come up with the information to generate a weasel (if indeed the process ever generates one) through natural selection (a process that reduces populations) and random mutation (a process that can only work if there's something there already to mutate).

Apologies if I'm not making myself clear here, I'm just struggling with the concept as a whole: here we are at the end of the process saying what constitutes fitness and then using that as the means for directing a process which is undirected to come up with the fit thing that was unknown (unknowable?) at the outset. Okay, it doesn't stop us writing software which compares implementations across languages. Maybe I should just be satisfied with that. That and hang on to this weird belief that spontaneous generation is demonstrably impossible but had to have happened at least once. Axtens 06:29, 16 June 2011 (UTC)

If you are getting lost, I would focus on these issues:
  1. This is a search algorithm
  2. The algorithm favors results which maximize (or minimize, depending on how you implement it) its "fitness function"
  3. Computer implementations are radically simpler than anything that goes on in real life
In other words:
On the one hand, I would not over-project the simplicity of a toy program like this onto real life and expect too much out of it.
On the other hand, I would also not over-project the complexities of real life onto a program when trying to read or write working code. --Rdm 12:00, 16 June 2011 (UTC)
Thanks, Rdm, that makes it a whole lot clearer. Axtens 15:07, 16 June 2011 (UTC)
Dawkins explains the difficulty of this model and clarifies its purpose (full quote from Wikipedia below). The purpose of the algorithm is to model cumulative selection. He goes on to state that evolution has no "long-distance target" and that the idea of one is an "absurd notion" - giving the example of humans being the final target. As explained by Rdm, this is a search algorithm. That means it gets to its target with absolute certainty, every time. Only the path to the target varies. So it's not trying to properly model the evolutionary process, but it aims to show how significant change is possible incrementally (cumulatively), not by "blind chance" in a single iteration.
"Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective 'breeding', the mutant 'progeny' phrases were judged according to the criterion of resemblance to a distant ideal target, the phrase METHINKS IT IS LIKE A WEASEL. Life isn't like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success." - http://en.wikipedia.org/wiki/Weasel_program - --Davidj 18:15, 1 October 2011 (UTC)
Actually, whether the target exists, and whether it is reliably reached, depends in part on both the search space and the fitness function. Anyways, I would not overgeneralize too much, based on this particular example. --Rdm 19:31, 1 October 2011 (UTC)
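
To make the thread concrete, here is a minimal sketch of the cumulative selection being discussed, in Python. It is illustrative only: TARGET, CHARSET, mutate and fitness are hypothetical names, not taken from any solution on the task page.

<lang Python>import random

TARGET = 'METHINKS IT IS LIKE A WEASEL'
CHARSET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '
C, RATE = 100, 0.05  # progeny per generation, per-character mutation rate

def fitness(s):
    'Count of characters differing from the target (0 is perfect).'
    return sum(a != b for a, b in zip(s, TARGET))

def mutate(s):
    'Copy s, replacing each character with probability RATE.'
    return ''.join(random.choice(CHARSET) if random.random() < RATE else c
                   for c in s)

parent = ''.join(random.choice(CHARSET) for _ in TARGET)
while fitness(parent):
    # The only non-random step: keep the fittest of C random mutants.
    parent = min([mutate(parent) for _ in range(C)] + [parent], key=fitness)
print(parent)</lang>

Only the selection step (min by fitness) is non-random; the mutator is blind, which is the "random variation combined with non-random cumulative selection" the model is meant to show.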

Adherence to similarly named variables/functions?

When I added the text:

Note: to aid comparison, try and ensure the variables and functions mentioned in the task description appear in solutions

I was not sure if I might be "stepping on implementers' toes" too much, so if your language's normal idioms would be broken by the above, then please discuss, although caSe changes/lngth changes should just be done. (I would expect it to be broken by languages created solely to be obtuse, such as brainf*!ck, and calculator programs might have limits on name sizes). --Paddy3118 03:40, 7 October 2009 (UTC)

Controversy on Wikipedia

WP mentions that this is a controversial algorithm (with all the noise apparently coming from people who don't grok what it is actually showing, naturally). However, there are only really two controversial points:

  1. This is (somewhat) directed evolution, though this mostly just affects the fitness function.
  2. There is an end-point to the evolution (only really needed to make it into an algorithm rather than an ongoing process).

These issues can be technically resolved by making only the fitness function know what the target is and having the algorithm only terminate once perfect fitness is achieved (i.e., when no possible mutation will improve things). At that point, the scientific issues are moot, and the non-scientific ones dispute whether the algorithm should even exist, which is out of scope for this site anyway. On the plus side, it also makes the presentation of the algorithm more aesthetically pleasing, as it removes the assumption that there is only one possible maximum (informal testing using a fitness function that takes the max of individual fitness functions as in this task, except with different target strings, indicates that this works quite well). —Donal Fellows 10:12, 9 October 2009 (UTC)

That controversy does not currently seem to be there, and apparently it has been replaced by a critique of the limitations of these kinds of algorithms. Also, detecting that an absolute maximum has been reached may be prohibitively expensive. --Rdm 19:38, 2 September 2010 (UTC)
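
A sketch of the encapsulation Donal describes, under hypothetical names: only the make_fitness closure sees the target, the search loop consumes scores alone, and the loop terminates only at perfect fitness. (Here zero mismatches is a known lower bound, so detecting perfection is cheap; Rdm's cost concern applies when no such bound exists.)

<lang Python>import random

CHARSET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '

def make_fitness(target):
    'Only this closure knows the target; callers get scores, nothing more.'
    def fitness(s):
        return sum(a != b for a, b in zip(s, target))
    return fitness, len(target)  # the length still has to leak out

fitness, length = make_fitness('METHINKS IT IS LIKE A WEASEL')
parent = ''.join(random.choice(CHARSET) for _ in range(length))
while fitness(parent) > 0:  # run until no mutation could improve things
    progeny = [''.join(random.choice(CHARSET) if random.random() < 0.05 else c
                       for c in parent) for _ in range(100)]
    parent = min(progeny + [parent], key=fitness)
print(parent)</lang>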

Tacit J solution

I was wanting to add a purely tacit (arguments not explicitly referenced) J solution. I came up with the following two options and am seeking opinions on which one is "better".
VERSION ONE
Uses a user-defined adverb and conjunction that help keep the rest of the code tidy.

<lang j>CHARSET=: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '
NPROG=: 100    NB. number of progeny (C)
MRATE=: 0.05   NB. mutation rate

create =: (?@$&$ { ])&CHARSET             NB. creates random list from charset of same shape as y
fitness =: +/@:~:"1
copy =: # ,:
mutate =: &(>: $ ?@$ 0:)(`(,: create))}   NB. adverb
select =: ] {~ (i. <./)@:fitness          NB. select fittest member of population

nextgen =: select ] , [: MRATE mutate NPROG copy ]
while =: conjunction def '(] , (u {:))^:(v {:)^:_ ,:'

evolve=: nextgen while (0 < fitness) create</lang>

VERSION TWO
Only uses verbs (functions), which can be easier to understand/parse, especially to start with.

<lang j>CHARSET=: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '
NPROG=: 100    NB. number of progeny (C)

create =: (?@$&$ { ])&CHARSET      NB. get random list from charset of same shape as y
fitness =: +/@:~:"1
copy =: # ,:
mrate =: %@:#                      NB. mutation rate
mutate =: (0&{:: >: $@(1&{::) ?@$ 0:)`((,: create)@(1&{::))}@;
select =: ] {~ (i. <./)@:fitness   NB. select fittest member of population

nextgen =: ] , [ select {:@] , mrate@[ mutate NPROG copy {:@]
notperfect =: (0 < fitness) {:

evolve=: nextgen ^: notperfect ^:_ ,:@create</lang>

--Tikkanz 00:53, 4 November 2009 (UTC)

Put them both in, while explaining the ways in which each is better than the other (perhaps with a little more explanatory text than above). It's a shame that there isn't highlighting for J yet (especially the comments) as that would aid reading. –Donal Fellows 08:59, 5 November 2009 (UTC)

Problem in previous Python implementation

In the first Python example:

<lang Python>def mutaterate():
    'Less mutation the closer the fit of the parent'
    #This formula does not work as intended because it performs int division
    return 1-((perfectfitness - fitness(parent)) / perfectfitness * (1 - minmutaterate))</lang>
For example, ((20 - 10)/20 * (1 - 0.09)) will evaluate to 0 under Python 2's integer division, so mutaterate returns 1 for any imperfect parent.
Thus the program gets stuck in an infinite loop, waiting for a mutation that never occurs.
Edited the program to avoid this by casting perfectfitness to float.

--lookog 14:22, 26 February 2010 (UTC)

I should have stated that the original was created on Python 3.x. Thanks for making it work on 2.x too. --Paddy3118 23:10, 26 February 2010 (UTC)
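
For reference, a sketch of the fix described above; perfectfitness, fitness, parent and minmutaterate are the names from the original example, and only the float cast is new:

<lang Python>def mutaterate():
    'Less mutation the closer the fit of the parent'
    # float() avoids the truncating integer division under Python 2
    return 1 - ((perfectfitness - fitness(parent)) / float(perfectfitness)
                * (1 - minmutaterate))</lang>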

Are the Python and C++ solutions cheating?

The Python solution makes the mutation rate depend on the distance to the target. This sounds to me like cheating, because the target (and therefore the distance to it) should ideally be unknown. Note that I don't see a problem in principle with modifying the mutation rate; the problem is using information about the distance to the target in determining it. The mutation rate could well itself evolve, but the knowledge about the target should only be used for selection, not for mutation. --Ce 09:00, 1 September 2010 (UTC)
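
Ce's closing point, that the rate could itself evolve, is a standard technique (self-adaptation). A minimal sketch under assumed names, in which each generation inherits a mutation rate that is itself mutated, and the target is consulted only by the selector (the excerpts in question follow below):

<lang Python>import random

TARGET = 'METHINKS IT IS LIKE A WEASEL'
CHARSET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ '

def fitness(s):
    'The only place the target is consulted.'
    return sum(a != b for a, b in zip(s, TARGET))

def breed(s, rate):
    'Randomly perturb the rate itself, then mutate the string at the new rate.'
    rate = min(1.0, max(0.001, rate * random.uniform(0.5, 2.0)))
    child = ''.join(random.choice(CHARSET) if random.random() < rate else c
                    for c in s)
    return child, rate

parent, rate = ''.join(random.choice(CHARSET) for _ in TARGET), 0.05
while fitness(parent):
    progeny = [breed(parent, rate) for _ in range(100)] + [(parent, rate)]
    parent, rate = min(progeny, key=lambda pr: fitness(pr[0]))
print(parent)</lang>

Overly aggressive rates produce unfit progeny and are discarded along with them, so the rate tends to fall as the parent nears the target without the mutator ever seeing the distance.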

Python

   def mutaterate():
       'Less mutation the closer the fit of the parent'
       return 1-((perfectfitness - fitness(parent)) / perfectfitness * (1 - minmutaterate))

C++

   double const mutation_rate = 0.02 + (0.9*fitness)/initial_fitness;
Looks like it's within the letter of the task, but as to the spirit, I don't know. Intuition tells me that knowing the true distance to optimal will help avoid problems of local minima, and some (but certainly not all!) problems that evolutionary algorithms are applied to have true distance (or a close approximation of such) available. This may be a good case for splitting the task and specifying a goal-agnostic algorithm. --Michael Mol 16:19, 1 September 2010 (UTC)
In a real problem, you've got a high-dimensional space that you're searching and the fitness function is only poorly known (the profusion of species is clear demonstration that there are many local minima in the problem space that is biology). However, the only effect of varying the mutation rate with fitness, given that we have a reasonable metric, is that it results in faster convergence with smaller populations at each step. It doesn't change the fact that you're still having to do the evolution towards a solution through random variation and selection, which is the whole point. –Donal Fellows 08:26, 2 September 2010 (UTC)
Varying the mutation rate is not necessarily cheating, but it is deviating from Richard Dawkins' purpose of demonstrating "random variation combined with non-random cumulative selection". The Weasel model uses a mutator and a selector. The mutator is intended to be random while the selector is non-random. If you add a non-random process to the mutator, it defeats the whole purpose of Dawkins' model. I don't understand why it's necessary to vary the mutation rate in the model. Is there biological evidence that nature reduces mutations when we near the ideal target? Dawkins states the notion of the ideal target is "absurd". It's important to stick with the purpose of the model and not change its essence simply to converge more quickly. It's not a competition about who has the most rapidly converging model. --Davidj 18:02, 1 October 2011 (UTC)
Out of curiosity, could the author of the C++ solution explain the constants 0.02 and 0.9 used to calculate the mutation_rate. Thank you. (double const mutation_rate = 0.02 + (0.9*fitness)/initial_fitness;) --Davidj 18:02, 1 October 2011 (UTC)
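
As a purely numerical observation on the constants Davidj asks about (not a statement of the original author's intent): the formula pins the rate between a floor of 0.02 and a ceiling of 0.02 + 0.9 = 0.92, falling linearly as the fitness value (apparently a distance that reaches 0 at the target) shrinks. A quick check, with initial_fitness assumed to be 28 purely for illustration:

<lang Python>initial_fitness = 28.0  # assumed value, for illustration only
for fitness in (28.0, 14.0, 0.0):
    mutation_rate = 0.02 + (0.9 * fitness) / initial_fitness
    print(fitness, mutation_rate)  # prints rates 0.92, 0.47, 0.02</lang>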

"Official" Algorithm

Is there an "official" algorithm or program? Has Dawkins published the algorithm he used? Just looking at the C and the C++ solutions, there are non-trivial differences.
Also just want to flag that I'm making two minor changes to the C program. I'm removing the hardcoded "27" (the 26 characters A..Z plus the space) from a couple of formulas and introducing a constant called POSSIBILITIES that is calculated from the number of characters that are possible. I'm also adding a cast to int in two places to fix a compile-time type error. --Davidj 18:54, 1 October 2011 (UTC)