I'm trying to solve the problem at http://rosalind.info/problems/iprb/
Given: Three positive integers
k
,m
, andn
, representing a population containingk+m+n
organisms:k
individuals are homozygous dominant for a factor,m
are heterozygous, andn
are homozygous recessive.Return: The probability that two randomly selected mating organisms will produce an individual possessing a dominant allele (and thus displaying the dominant phenotype). Assume that any two organisms can mate.
My solution works for the sample, but not for any problems generated. After further research it seems that I should find the probability of choosing any one organism at random, find the probability of choosing the second organism, and then the probability of that pairing producing offspring with a dominant allele.
My question is then: what does my code below find the probability of? Does it find the percentage of offspring with a dominant allele for all possible matings -- so rather than the probability of one random mating, my code is solving for the percentage of offspring with dominant alleles if all pairs were tested?
f = open('rosalind_iprb.txt', 'r')
r = f.read()
s = r.split()
############# k = # homozygotes dominant, m = #heterozygotes, n = # homozygotes recessive
k = float(s[0])
m = float(s[1])
n = float(s[2])
############# Counts for pairing between each group and within groups
k_k = 0
k_m = 0
k_n = 0
m_m = 0
m_n = 0
n_n = 0
##############
if k > 1:
k_k = 1.0 + (k-2) * 2.0
k_m = k * m
k_n = k * n
if m > 1:
m_m = 1.0 + (m-2) * 2.0
m_n = m * n
if n> 1:
n_n = 1.0 + (n-2) * 2.0
#################
dom = k_k + k_m + k_n + 0.75*m_m + 0.5*m_n
total = k_k + k_m + k_n + m_m + m_n + n_n
chance = dom/total
print chance
p(k,m,n)
directly. – Cammack