Haldane's Mapping Function
Citations:
Crow, J.F. and W.F. Dove. 1990. Anecdotal,
historical and critical commentaries on genetics. Genetics
125:669-671.
Liu, B.H. 1997. Statistical Genomics:
Linkage, Mapping, and QTL Analysis. CRC Press. pp.
18-19, 328-329.
Haldane, J.B.S. 1919. J. Genet. 8:299-309.
Synopsis
1) Haldane's mapping function adjusts the observed
proportion of recombinant gametes for unobserved double
crossovers so that map distances are additive.
2) When the observed recombination fraction is below
10% (r < 0.10), we do not need to use Haldane's mapping
function. When loci are located this close together,
double crossovers within the interval are rare enough
to be negligible.
3) When r = 50%, two loci are inherited independently
and the mapping function gives an infinite distance
in cM. In other words, when two loci are inherited
independently, we cannot determine how many cM separate
them.
4) When two loci are separated by 50 cM, the two loci
are not inherited independently. A 50 cM map distance
between two loci is the equivalent of about 32%
recombination, or r = 0.32.
Derivation of Haldane's Mapping Function:
Let r = observed recombination fraction (100r = crossover
units, c.u.);
d = map distance in Morgans (1 M = 100 cM);
λ = mean number of crossover events in the interval = 2d
Distribution of the number of events (Poisson with mean λ):
Number of events    Probability of that number of events
0                   e^(-λ) = e^(-2d)
1                   λ e^(-λ)
2                   (λ^2/2!) e^(-λ)
3                   (λ^3/3!) e^(-λ)
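The probabilities in the table can be evaluated numerically. The sketch below is only an illustration: the function name and the example distance d = 0.25 M (so that the mean number of crossover events is 2d = 0.5) are assumptions, not values taken from these notes.

```python
import math

def crossover_count_probability(k: int, d_morgans: float) -> float:
    """Poisson probability of exactly k crossover events, with mean 2d."""
    mean_events = 2.0 * d_morgans  # mean number of crossover events (2d)
    return mean_events ** k * math.exp(-mean_events) / math.factorial(k)

# Hypothetical interval of 0.25 M, chosen only for illustration.
for k in range(4):
    print(k, round(crossover_count_probability(k, 0.25), 4))
```

Summing the probabilities over all k gives 1, and the k = 0 row is the term that is carried into the derivation that follows.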
If e^(-λ) is the probability of zero crossover events,
then 1 - e^(-λ) is the probability of at least one
crossover event. The probability of a recombinant gamete is
r = (½)(1 - e^(-λ)),
because for each crossover event only one-half of the
gametes will be recombinant types. This is because only
two of the four strands are involved in a crossover.
Therefore, there is a coefficient of one-half in front
of the (1 - e^(-λ)) term.
The exponent λ equals 2d because the expected number
of crossover events in the interval is 2d.
Determining the map distance using Haldane's mapping
function is based on the observed proportion of recombinant
gametes adjusted for the number of unobservable double
crossovers. Solving for d:
2r = 1 - e^(-λ)
= 1 - Prob.(zero crossover events), where the probability
of zero crossover events equals e^(-λ).
We can put the equation 2r = 1 - e^(-λ) in words. The
probability of at least one crossover event equals one
minus the probability of no crossover events. We can
solve for λ.
2r = 1 - e^(-λ)
e^(-λ) = 1 - 2r
ln(e^(-λ)) = ln(1 - 2r)
-λ = ln(1 - 2r)
Now 2d = λ because d measures the expected number of
crossovers carried by a single gamete, while λ measures
the expected number of crossover events in the interval.
d = λ/2 means that each crossover event results in only
one-half recombinant gametes, because only 2 of the 4
strands participate in a crossover event.
-λ = ln(1 - 2r)
-2d = ln(1 - 2r)
d = -(½)ln(1 - 2r)
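The result d = -(½)ln(1 - 2r) can be written as a small function. This is a minimal sketch; the function name and the decision to reject r outside 0 <= r < 0.5 are assumptions made for the illustration.

```python
import math

def haldane_distance(r: float) -> float:
    """Map distance in Morgans for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -0.5 * math.log(1.0 - 2.0 * r)

# r = 0.04 (4 c.u.): print the map distance in cM, rounded to two decimals.
print(round(100.0 * haldane_distance(0.04), 2))
```

Multiplying the result by 100 converts Morgans to centiMorgans. For a small r such as 0.04 the distance in cM stays close to the value of r in c.u.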
The above equation provides a solution for d in M for
an observed recombination fraction r. For short chromosome
segments the map distance (d) approximately equals the
recombination fraction, d ≈ r. For example, 4 c.u. ≈ 4 cM
= 8% crossover events. When r < 10 c.u., r ≈ d. However,
consider three loci, A, B and C, where A and B are
separated by 20 c.u. and B and C are separated by 20 c.u.
Let rAC = observed recombination between the A and C loci;
rAB = observed recombination between the A and B loci;
rBC = observed recombination between the B and C loci.
1 - 2rAC = (1 - 2rAB)(1 - 2rBC)
rAC = rAB + rBC - 2rABrBC
rAC = 0.2 + 0.2 - 2(0.2)(0.2) = 0.32
The reason is as follows:
1 - 2 rAC = the probability of no crossover
event between the A and C loci;
1 - 2 rAB = the probability of no crossover
event between the A and B loci;
1 - 2 rBC = the probability of no crossover
event between the B and C loci.
The probability of no crossover event between the A
and C loci is the probability of no crossover event
between the A and B loci multiplied by the probability
of no crossover event between the B and C loci.
(1 - 2rAC) = (1 - 2rAB)(1 - 2rBC)
1 - 2rAC = 1 - 2rBC - 2rAB + 4rABrBC
2rAC = 2rBC + 2rAB - 4rABrBC
rAC = rBC + rAB - 2rABrBC
In our example, rBC = rAB = 0.2, so
rAC = rBC + rAB - 2rABrBC
= 0.2 + 0.2 - 2(0.2)(0.2)
= 0.32 = 32 c.u. = 32%
The point here is that the observed recombination between
the A and C loci (0.32) is less than the sum of the observed
recombination between the A and B loci and the observed
recombination between the B and C loci (0.40). This is
why we need to use Haldane's mapping function: to make
distances between loci additive.
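The combination rule rAC = rAB + rBC - 2rABrBC is easy to check numerically. The sketch below assumes, as the derivation does, that crossovers in the two intervals occur independently; the function name is chosen only for this illustration.

```python
def combine_recombination(r_ab: float, r_bc: float) -> float:
    """Recombination fraction across two adjacent intervals,
    assuming crossovers in the two intervals are independent."""
    return r_ab + r_bc - 2.0 * r_ab * r_bc

r_ac = combine_recombination(0.2, 0.2)
print(round(r_ac, 2))       # recombination between A and C
print(round(0.2 + 0.2, 2))  # naive sum of the two intervals, for comparison
```

The subtracted term 2rABrBC accounts for gametes that are recombinant in both intervals and therefore look parental for A and C.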
Let us convert rAC, which is the observed proportion
of recombinant gametes, into a map distance.
d = -(½)ln(1 - 2r)
= -(½)ln(1 - 2rAC)
= -(½)ln(1 - 0.64)
= -(½)ln(0.36)
= -(½)(-1.022)
= 0.51 M
When loci are closely linked, it is not necessary
to convert recombination units to cM. Let r = 0.1. The
mapping function converts this to d = 11 cM, which is
approximately equal to 10 c.u. This means that for closely
linked loci the recombination fractions themselves are
approximately additive.
We will now convert our observed recombination values
to map distances.
Previously in our example, rAB = rBC = 0.2 and rAC = 0.32.
We have shown that:
rAC = rBC + rAB - 2rABrBC
and d = -(½)ln(1 - 2r)
dAB = -(½)ln(1 - 2rAB) = -(½)ln(1 - 0.4) = 0.255 M = 25.5 cM
dBC = -(½)ln(1 - 2rBC) = -(½)ln(1 - 0.4) = 0.255 M = 25.5 cM
dAC = -(½)ln(1 - 2rAC) = -(½)ln(1 - 0.64) = 0.511 M = 51.1 cM
Haldane's mapping function converts recombination fractions
to map distances, which are now additive.
rAB + rBC ≠ rAC (0.20 + 0.20 = 0.40, not 0.32), but
dAB + dBC = dAC (25.5 + 25.5 = 51.1 cM, after rounding)
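The additivity of the converted distances can be confirmed with a short calculation. This is a sketch using the example values rAB = rBC = 0.2; the helper name is an assumption made for the illustration.

```python
import math

def haldane_distance_cm(r: float) -> float:
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    return -50.0 * math.log(1.0 - 2.0 * r)

r_ab = r_bc = 0.2
r_ac = r_ab + r_bc - 2.0 * r_ab * r_bc  # observed recombination between A and C

d_ab = haldane_distance_cm(r_ab)
d_bc = haldane_distance_cm(r_bc)
d_ac = haldane_distance_cm(r_ac)

print(round(r_ab + r_bc, 2), round(r_ac, 2))  # recombination fractions: not additive
print(round(d_ab + d_bc, 1), round(d_ac, 1))  # map distances in cM: additive
```

The two map-distance values agree because ln(0.36) = 2 ln(0.6), which is the multiplication rule for the no-crossover probabilities seen on a log scale.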
Why was rAB = 0.2 converted to dAB = 25.5 cM? Because
there are double crossovers between the A and B loci
which we cannot detect. There is no locus between the
A and B loci which would permit us to identify double
crossovers between A and B. Double crossovers between
the A and B loci do not result in observable recombination
between the A and B loci.
We have shown that the average recombination for 2,
3 and 4-strand double crossovers is 50% recombination.
Double crossovers should be counted as two crossover
events, but the recombination fraction appears the same
as for a single crossover event. Thus, the relative
frequency of crossover events is underestimated by the
observable recombination fraction. We can use the function
d = -(½)ln(1 - 2r) to convert observed recombination
(r) to map distance (d) in Morgans; multiplying by 100
gives the distance in centiMorgans.
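One way to see how hidden multiple crossovers produce the mapping function is to count only the gametes that carry an odd number of crossovers, since an even number restores the parental arrangement of the two loci. The sketch below rests on an assumption that is not stated in this form in the notes: the number of crossovers carried by a single gamete in the interval is Poisson distributed with mean d. The function names are likewise chosen only for the illustration.

```python
import math

def prob_odd_crossovers(d_morgans: float, max_k: int = 60) -> float:
    """Probability that a gamete carries an odd number of crossovers,
    assuming a Poisson count with mean d (an assumption of this sketch)."""
    total = 0.0
    for k in range(1, max_k, 2):  # k = 1, 3, 5, ...
        total += math.exp(-d_morgans) * d_morgans ** k / math.factorial(k)
    return total

def haldane_recombination(d_morgans: float) -> float:
    """Recombination fraction from the mapping function, r = (1/2)(1 - e^(-2d))."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

for d in (0.05, 0.255, 1.0):
    print(d, round(prob_odd_crossovers(d), 4), round(haldane_recombination(d), 4))
```

The two columns agree, which is the same statement as r = (½)(1 - e^(-2d)): gametes with two, four, or more crossovers are counted as non-recombinant even though crossing over occurred.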
Independent Loci Example:
Let r = 0.5, which means that the two loci are inherited
independently.
d = -(½)ln(1 - 2r)
= -(½)ln(1 - 1)
= -(½)ln(0)
= infinity
Let r = 0.49
d = -(½)ln(1 - 2r)
= -(½)ln(1 - 0.98)
= 1.956 M = 195.6 cM
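The behavior near r = 0.5 can be tabulated. In this sketch the function name and the choice to return infinity for r >= 0.5 (rather than raising an error from the logarithm of zero) are assumptions made for the illustration.

```python
import math

def haldane_distance_cm(r: float) -> float:
    """Map distance in cM; returns infinity when r >= 0.5 (independent loci)."""
    if r < 0.0:
        raise ValueError("r must be non-negative")
    if r >= 0.5:
        return math.inf
    return -50.0 * math.log(1.0 - 2.0 * r)

for r in (0.40, 0.45, 0.49, 0.499, 0.5):
    print(r, round(haldane_distance_cm(r), 1))
```

Small changes in r near 0.5 correspond to very large changes in d, so map distances estimated from recombination fractions close to 50% are poorly determined.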
50cM Example:
Now we will use the formula to convert map distance
to observed recombination. We have to convert d from
cM to Morgans to use the formula.
Let d = 50 cM = 0.5 M
d = -(½)ln(1 - 2r)
0.5 M = -(½) ln (1 - 2r)
-1 = ln( 1 - 2r)
e^(-1) = 1 - 2r
1 - e^(-1) = 2r
(½)(1 - e^(-1)) = r
(½)(1 - 1/e) = r
(½)(1 - 0.368) = r = 0.316 = 31.6 c.u.
We can see that 50 cM = 31.6 c.u. A 50 cM map distance
does not equal independence between loci. 50 c.u. =
50% recombination, whereas 50 cM ≈ 32 c.u.
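The conversion from map distance back to observed recombination can be sketched as follows. The function name and the choice to take the distance in cM (dividing by 100 inside the function) are assumptions of this illustration.

```python
import math

def haldane_recombination(d_cm: float) -> float:
    """Observed recombination fraction for a map distance given in cM."""
    d_morgans = d_cm / 100.0  # the formula requires Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

for d_cm in (10.0, 50.0, 100.0, 200.0):
    print(d_cm, round(haldane_recombination(d_cm), 3))
```

The recombination fraction rises toward 0.5 as the distance grows but never reaches it for any finite distance.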
Example of Loci Closely Linked:
When there is less than 10% recombination between loci,
it is not necessary to use Haldane's mapping function
to convert to cM. There are so few double crossovers
between closely linked loci that the observed and actual
recombination are approximately equal.
Let rDE = 6 c.u. and rEF = 8 c.u.,
then rDF ≈ rDE + rEF = 6 + 8 = 14 c.u. ≈ 14 cM
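The size of the approximation for closely linked loci can be checked against the exact formulas. This is a sketch using the D, E and F values from the example; the helper name is an assumption made for the illustration.

```python
import math

def haldane_distance_cm(r: float) -> float:
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    return -50.0 * math.log(1.0 - 2.0 * r)

r_de, r_ef = 0.06, 0.08

simple_sum = r_de + r_ef                       # shortcut: add recombination fractions
exact_r_df = r_de + r_ef - 2.0 * r_de * r_ef   # combination rule with double crossovers
d_df_cm = haldane_distance_cm(r_de) + haldane_distance_cm(r_ef)  # additive map distance

print(round(100.0 * simple_sum, 1))  # shortcut, in c.u.
print(round(100.0 * exact_r_df, 1))  # expected observed recombination, in c.u.
print(round(d_df_cm, 1))             # map distance from the mapping function, in cM
```

The three figures differ by about one unit in each direction, which shows how far the shortcut drifts once the combined interval passes 10% recombination.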