Definitions of Phenotype and Genotype

Can someone help me understand the definitions of phenotype and genotype in relation to evolutionary algorithms?

Am I right in thinking that the genotype is a representation of the solution, and the phenotype is the solution itself?

Thanks

Vig answered 2/5, 2015 at 13:32 Comment(0)

Summary: For simple systems, yes, you are completely right. As you get into more complex systems, things get messier.

That is probably all most people reading this question need to know. However, for those who care, there are some weird subtleties:

People who study evolutionary computation use the words "genotype" and "phenotype" frustratingly inconsistently. The only rule that holds true across all systems is that the genotype is a lower-level (i.e. less abstracted) encoding than the phenotype. A consequence of this rule is that there can generally be multiple genotypes that map to the same phenotype, but not the other way around. In some systems, there are really only the two levels of abstraction that you mention: the representation of a solution and the solution itself. In these cases, you are entirely correct that the former is the genotype and the latter is the phenotype.

This holds true for:

  • Simple genetic algorithms where the solution is encoded as a bitstring.
  • Simple evolution strategies problems, where a real-valued vector is evolved and the numbers are plugged directly into the function being optimized.
  • A variety of other systems where there is a direct mapping between solution encodings and solutions.
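To make the simple case concrete, here is a minimal sketch of a bitstring genetic algorithm's encoding. The names (`decode`, `fitness`), the range [-5, 5], and the objective function are all assumptions chosen for illustration, not part of any particular library. The genotype is the list of bits that mutation and crossover operate on; the phenotype is the real number that actually gets evaluated.

```python
from typing import List

Genotype = List[int]  # bits, most significant first (assumed encoding)


def decode(genotype: Genotype, lo: float = -5.0, hi: float = 5.0) -> float:
    """Map a bitstring genotype to its phenotype: a real number in [lo, hi]."""
    as_int = int("".join(str(bit) for bit in genotype), 2)
    max_int = 2 ** len(genotype) - 1
    return lo + (hi - lo) * as_int / max_int


def fitness(x: float) -> float:
    """Hypothetical objective: selection sees only the decoded phenotype."""
    return -(x - 1.0) ** 2


genotype = [1, 0, 0, 1, 1, 0, 0, 1]  # what the variation operators manipulate
phenotype = decode(genotype)         # the candidate solution itself
score = fitness(phenotype)
```

In an evolution strategy the decode step would just be the identity function, which is why the two terms are so easy to keep apart in these systems.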

But as we get to more complex algorithms, this starts to break down. Consider a simple genetic program, in which we are evolving a mathematical expression tree. The number that the tree evaluates to depends on the input that it receives. So, while the genotype is clear (it's the series of nodes in the tree), the phenotype can only be defined with respect to specific inputs. That isn't really a big problem - we just select a set of inputs and define phenotype based on the set of corresponding outputs. But it gets worse.
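As a rough sketch of that idea, assume a tiny tree format where a tree is the variable `"x"`, a numeric constant, or a tuple `(op, left, right)`. The format, the function names, and the choice of test inputs are all made up for this example. The phenotype is then the tuple of outputs on the chosen inputs:

```python
from typing import Any, Sequence, Tuple


def evaluate(tree: Any, x: float) -> float:
    """Evaluate an expression tree at a single input value."""
    if isinstance(tree, str):           # the variable "x"
        return x
    if isinstance(tree, (int, float)):  # a numeric constant
        return float(tree)
    op, left, right = tree              # an internal node
    a, b = evaluate(left, x), evaluate(right, x)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op!r}")


def phenotype(tree: Any, inputs: Sequence[float]) -> Tuple[float, ...]:
    """Define the phenotype as the outputs on a fixed set of inputs."""
    return tuple(evaluate(tree, x) for x in inputs)


TEST_INPUTS = (0.0, 1.0, 2.0, 3.0)  # arbitrary choice for illustration

tree_a = ("+", "x", "x")   # x + x
tree_b = ("*", 2.0, "x")   # 2 * x

# Two different genotypes, one phenotype.
same = phenotype(tree_a, TEST_INPUTS) == phenotype(tree_b, TEST_INPUTS)
```

Note that the phenotype changes if you pick different test inputs, which is exactly the subtlety: two trees can look identical on one input set and different on another.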

As we continue to look at more complex algorithms, we reach cases where there are no longer just two levels of abstraction. Evolutionary algorithms are often used to evolve simple "brains" for autonomous agents. For instance, say we are evolving a neural network with NEAT. NEAT very clearly defines what the genotype is: a series of rules for constructing the neural network. And this makes sense - that is the lowest-level encoding of an individual in this system. Stanley, the creator of NEAT, goes on to define the phenotype as the neural network encoded by the genotype. Fair enough - that is indeed a more abstract representation. However, there are others who study evolved brain models who classify the neural network as the genotype and the behavior as the phenotype. That is also completely reasonable - the behavior is perhaps even a better phenotype, because it's the thing selection is actually based on.
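Here is a toy sketch of those three levels. It is not NEAT's actual implementation or API: the `ConnectionGene` class, the single-layer network, the node numbering, and the "forward"/"turn" actions are all simplifications assumed for illustration. The point is only that each level can collapse differences in the one below it.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class ConnectionGene:
    """Hypothetical, stripped-down gene: one weighted link between nodes."""
    src: int
    dst: int
    weight: float
    enabled: bool = True


Network = Dict[Tuple[int, int], float]


def build_network(genes: List[ConnectionGene]) -> Network:
    """Level 1 -> level 2: only enabled genes are expressed."""
    return {(g.src, g.dst): g.weight for g in genes if g.enabled}


def behave(network: Network, readings: Sequence[Tuple[float, float]],
           output_node: int = 2) -> List[str]:
    """Level 2 -> level 3: what the agent does for each sensor reading."""
    actions = []
    for reading in readings:
        total = sum(w * reading[src]
                    for (src, dst), w in network.items()
                    if dst == output_node)
        actions.append("forward" if total > 0 else "turn")
    return actions


READINGS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # arbitrary test situations

genome_a = [ConnectionGene(0, 2, 1.0), ConnectionGene(1, 2, -0.5)]
# Extra disabled gene: different genome, identical network.
genome_b = genome_a + [ConnectionGene(0, 3, 3.0, enabled=False)]
# Different weights: different network, identical behavior on READINGS.
genome_c = [ConnectionGene(0, 2, 2.0), ConnectionGene(1, 2, -1.0)]

net_a, net_b, net_c = (build_network(g) for g in (genome_a, genome_b, genome_c))
```

Whether you call `net_a` the phenotype (as Stanley does) or the genotype (with `behave(net_a, READINGS)` as the phenotype) is a matter of which pair of levels you care about.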

Finally, we arrive at the systems with the least definable genotypes and phenotypes: open-ended artificial life systems. The goal of these systems is basically to create a rich world that will foster interesting evolutionary dynamics. Usually the genotype in these systems is fairly easy to define - it's the lowest level at which members of the population are defined. Perhaps it's a ring of assembly code, as in Avida, or a neural network, or some set of rules, as in Geb. Intuitively, the phenotype should capture something about what a member of the population does over its lifetime. But each member of the population does a lot of different things. So ultimately, in these systems, phenotypes tend to be defined differently based on what is being studied in a given experiment. While this may seem questionable at first, it is essentially how phenotypes are discussed in evolutionary biology as well. At some point, a system is complex enough that you just need to focus on the part you care about.

Causation answered 2/5, 2015 at 18:28 Comment(5)
Thanks for the in-depth response. Exactly what I was looking for! - Vig
Awesome response, was any formal research done about this issue? - Inscription
Hmm, that's a good question. I'm sure all of these points have been stated in isolation (and I could point you to some articles that state some of them), but I'm not aware of any formal articles that nicely lay out the entire set of ways these terms get used. I wrote this answer based mostly on personal experience listening to a lot of different people use these terms in different ways, both in papers and verbally. - Causation
I fail to see how a genotype is less abstract/lower level than a phenotype. I would say the opposite: the genotype contains a description of how to get the phenotype, therefore it's more abstract. I would say a function is more abstract than the result of applying that function, and I would say C is more abstract than opcodes, and you have to translate C to opcodes to make it executable. Besides this, very nice answer! - Nemathelminth
That's an interesting thought. I think the reason that people usually say the genotypes are less abstract is that we think of them as encodings for phenotypes, rather than functions that produce phenotypes. So genotypes would be the opcodes and phenotypes would be the C programs. I don't know if the genotype is a function that produces a phenotype, so much as it is input to a function that produces a phenotype (the function in nature is physics, in a computer it's whatever you want). - Causation
