Linkage disequilibrium

From Wikipedia, the free encyclopedia


Linkage disequilibrium is a term used in population genetics for the non-random association of alleles at two or more loci, not necessarily on the same chromosome. It is not the same as linkage, which describes the association of two or more loci on a chromosome with limited recombination between them. Linkage disequilibrium describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected if haplotypes were formed at random from alleles according to their frequencies. Non-random associations between polymorphisms at different loci are measured by the degree of linkage disequilibrium (LD). A comparison of different measures is provided by Devlin & Risch [1].

Linkage disequilibrium is generally caused by genetic linkage and the rate of recombination; mutation rate; random drift or non-random mating; and population structure. For example, some organisms, such as bacteria, may show linkage disequilibrium because they reproduce asexually and there is no recombination (r = 0) to break it down. The disequilibrium in the next generation is D' = (1 - r)D, so with r = 0 it does not decay. (Here D' denotes the value of D one generation later, not the normalised measure defined further below.)
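Iterating the one-generation relation above over t generations gives D_t = (1 - r)^t D_0. The following is a minimal sketch of that decay, assuming random mating and no other forces acting on D; the function name `ld_after` and the starting value 0.25 are illustrative choices, not taken from the article.

```python
def ld_after(d0: float, r: float, generations: int) -> float:
    """Expected D after a number of generations, given recombination rate r.

    Applies D_next = (1 - r) * D repeatedly, i.e. D_t = (1 - r)**t * D_0.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate r is expected to lie in [0, 0.5]")
    return d0 * (1.0 - r) ** generations


if __name__ == "__main__":
    # r = 0 (no recombination) keeps D constant; r = 0.5 halves it each generation.
    for r in (0.0, 0.01, 0.5):
        values = [round(ld_after(0.25, r, t), 4) for t in (0, 1, 10, 100)]
        print(f"r = {r}: {values}")
```

With r = 0 the value never changes, which is the asexual case described above.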

It may be instructive to study genetic equilibrium, and its application in the Hardy-Weinberg principle.

The International HapMap Project enables the study of LD in human populations online. The Ensembl project integrates HapMap data, and dbSNP data more generally, with other genetic information.


Formally, to define pairwise LD we consider indicator variables for alleles at two loci, say I1 and I2. We define the LD parameter δ (delta) as:

\delta := \operatorname{cov}(I_1, I_2) = h_{11} - p_1 p_2 = h_{11}h_{22}-h_{12}h_{21}

Here p1 and p2 denote the marginal allele frequencies at the two loci, and h11 denotes the frequency, in the joint distribution, of the haplotype carrying both alleles; h12, h21 and h22 are the frequencies of the other three haplotypes. Various derivatives of this parameter have been developed. In the genetic literature the wording "two alleles are in LD" usually means that \delta \ne 0. Conversely, linkage equilibrium denotes the case δ = 0.
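As an illustration of the covariance definition, the sketch below computes δ from a sample of phased haplotypes in two ways: as the covariance of the two indicator variables, and as the cross product h11*h22 - h12*h21. It assumes each haplotype is encoded as a pair of 0/1 indicators (1 meaning the allele of interest is present); the function names and the sample data are made up for the example.

```python
from collections import Counter


def delta_from_covariance(haplotypes):
    """cov(I1, I2) = E[I1 * I2] - E[I1] * E[I2] over a list of (i1, i2) pairs."""
    haps = list(haplotypes)
    n = len(haps)
    mean1 = sum(i1 for i1, _ in haps) / n
    mean2 = sum(i2 for _, i2 in haps) / n
    mean12 = sum(i1 * i2 for i1, i2 in haps) / n
    return mean12 - mean1 * mean2


def delta_from_cross_product(haplotypes):
    """h11 * h22 - h12 * h21, where index 1 = allele present, 2 = absent."""
    haps = list(haplotypes)
    n = len(haps)
    counts = Counter(haps)
    h11 = counts[(1, 1)] / n
    h12 = counts[(1, 0)] / n
    h21 = counts[(0, 1)] / n
    h22 = counts[(0, 0)] / n
    return h11 * h22 - h12 * h21


if __name__ == "__main__":
    # Hypothetical sample of ten phased haplotypes.
    sample = [(1, 1)] * 4 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 4
    print(delta_from_covariance(sample))
    print(delta_from_cross_product(sample))
```

Both functions agree up to floating-point rounding, since the two expressions are algebraically identical when the four haplotype frequencies sum to 1.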

Consider two loci A and B with two alleles each (a two-locus, two-allele model). The following table gives the frequency of each combination:

Haplotype Frequency
A1B1 x11
A1B2 x12
A2B1 x21
A2B2 x22

Note that these are relative frequencies. One can use the above frequencies to determine the frequency of each of the alleles:

Allele Frequency
A1 p1 = x11 + x12
A2 p2 = x21 + x22
B1 q1 = x11 + x21
B2 q2 = x12 + x22
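The marginal sums in the table above can be sketched as a small helper. The function name `allele_frequencies` and the example haplotype frequencies are assumptions made for illustration only.

```python
def allele_frequencies(x11: float, x12: float, x21: float, x22: float) -> dict:
    """Allele frequencies implied by the four relative haplotype frequencies."""
    total = x11 + x12 + x21 + x22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must be relative (sum to 1)")
    return {
        "p1": x11 + x12,  # frequency of A1
        "p2": x21 + x22,  # frequency of A2
        "q1": x11 + x21,  # frequency of B1
        "q2": x12 + x22,  # frequency of B2
    }


if __name__ == "__main__":
    # Hypothetical haplotype frequencies for A1B1, A1B2, A2B1, A2B2.
    print(allele_frequencies(0.4, 0.1, 0.2, 0.3))
```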

If the two loci and their alleles are independent of each other, then the observation A1B1 can be expressed as "A1 must be found and B1 must be found". The table above lists the frequency of A1 as p1 and that of B1 as q1; hence, by the multiplication rule for independent events, the frequency of A1B1 is x11 = p1 * q1.

A deviation of the observed frequencies from the expected is referred to as the linkage disequilibrium parameter[2], and is commonly denoted by a capital D [3] as defined by:

D = x11 - p1q1
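A minimal sketch of this calculation, taking D as the observed frequency of A1B1 minus the frequency p1 * q1 expected under independence. The function name and the example frequencies are illustrative assumptions.

```python
def linkage_disequilibrium(x11: float, x12: float, x21: float, x22: float) -> float:
    """D = x11 - p1 * q1, the observed minus the expected frequency of A1B1."""
    p1 = x11 + x12  # frequency of allele A1
    q1 = x11 + x21  # frequency of allele B1
    return x11 - p1 * q1


if __name__ == "__main__":
    # Hypothetical frequencies with an excess of A1B1 over expectation.
    print(linkage_disequilibrium(0.4, 0.1, 0.2, 0.3))
    # All four haplotypes equally frequent: the loci are independent.
    print(linkage_disequilibrium(0.25, 0.25, 0.25, 0.25))
```

In the first call p1 = 0.5 and q1 = 0.6, so the expected frequency is 0.3 against an observed 0.4, giving D of about 0.1; in the second call D is 0.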

The following table illustrates the relationship between the haplotype and allele frequencies and D.

        A1                A2                Total
B1      x11 = p1q1 + D    x21 = p2q1 - D    q1
B2      x12 = p1q2 - D    x22 = p2q2 + D    q2
Total   p1                p2                1
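Read in the other direction, the table lets one rebuild the four haplotype frequencies from the allele frequencies and D, with D added to the A1B1 and A2B2 cells and subtracted from the other two. The sketch below does this under that sign convention; the function name and the input values are hypothetical.

```python
def haplotype_frequencies(p1: float, q1: float, d: float) -> dict:
    """Haplotype frequencies from allele frequencies p1, q1 and disequilibrium D."""
    p2 = 1.0 - p1
    q2 = 1.0 - q1
    freqs = {
        "A1B1": p1 * q1 + d,
        "A1B2": p1 * q2 - d,
        "A2B1": p2 * q1 - d,
        "A2B2": p2 * q2 + d,
    }
    # A frequency below zero means D is outside the range the allele
    # frequencies allow.
    if min(freqs.values()) < -1e-12:
        raise ValueError("D is not feasible for these allele frequencies")
    return freqs


if __name__ == "__main__":
    table = haplotype_frequencies(p1=0.5, q1=0.6, d=0.1)
    print(table)
    print(sum(table.values()))  # the four cells always sum to 1
```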

When these formulas are extended to diploid cells rather than applied to gametes/haplotypes directly, the same principle holds, but the recombination rate between the two loci A and B, commonly denoted by the letter c, must be taken into account.

D is convenient to calculate with but has the disadvantage of depending on the frequencies of the alleles inspected. This is evident since frequencies are between 0 and 1: D is necessarily zero if either locus has an allele frequency of 0 or 1, and its possible range is widest when allele frequencies are 0.5. Lewontin (1964) suggested normalising D by dividing it by the theoretical maximum for the observed allele frequencies. Thus D'=\frac{D}{D_\max} when D \ge 0, and D'=\frac{D}{D_\min} when D < 0.

Dmax is given by the smaller of p1q2 and p2q1. Dmin is given by the larger of -p1q1 and -p2q2.
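The normalisation can be sketched as follows. The bounds follow from requiring every haplotype frequency to be non-negative, so this sketch takes Dmin as the larger of -p1q1 and -p2q2 (a negative number), which makes D' non-negative in both branches. The function name and example values are assumptions for illustration.

```python
def d_prime(x11: float, x12: float, x21: float, x22: float) -> float:
    """Lewontin's normalised D' from the four haplotype frequencies."""
    p1, p2 = x11 + x12, x21 + x22
    q1, q2 = x11 + x21, x12 + x22
    d = x11 - p1 * q1
    if d >= 0:
        bound = min(p1 * q2, p2 * q1)    # Dmax
    else:
        bound = max(-p1 * q1, -p2 * q2)  # Dmin, a negative number
    if bound == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return d / bound


if __name__ == "__main__":
    print(d_prime(0.4, 0.1, 0.2, 0.3))  # positive D
    print(d_prime(0.1, 0.4, 0.3, 0.2))  # negative D
    print(d_prime(0.5, 0.0, 0.0, 0.5))  # only two haplotypes present
```

In the last call only A1B1 and A2B2 occur, so D reaches its maximum and D' is 1.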

Another measure is the correlation coefficient between the indicator variables introduced earlier on this page, usually reported as its square, r^2=\frac{D^2}{p_1p_2q_1q_2}. Unlike D', it is not adjusted for the loci having different allele frequencies. If it were, r (the square root of r^2, given the sign of D) would be equivalent to D' [4].
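A short sketch of the r^2 calculation follows, with an example in which the allele frequencies at the two loci differ. The function name and the input frequencies are hypothetical.

```python
def r_squared(x11: float, x12: float, x21: float, x22: float) -> float:
    """r^2 = D^2 / (p1 * p2 * q1 * q2) from the four haplotype frequencies."""
    p1, p2 = x11 + x12, x21 + x22
    q1, q2 = x11 + x21, x12 + x22
    denominator = p1 * p2 * q1 * q2
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = x11 - p1 * q1
    return d * d / denominator


if __name__ == "__main__":
    # Equal allele frequencies, only two haplotypes present.
    print(r_squared(0.5, 0.0, 0.0, 0.5))
    # Haplotype A2B1 is absent, but p1 = 0.75 differs from q1 = 0.5.
    print(r_squared(0.5, 0.25, 0.0, 0.25))
```

In the second call D equals its maximum for those allele frequencies (0.125), so the normalised D' would be 1, yet r^2 is only 1/3. This illustrates the point that r^2 is not adjusted for unequal allele frequencies.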

Another statistic, Tajima's D, is used as a test of selective neutrality: it assesses whether the mean number of differences between pairs of DNA sequences is compatible with the observed number of segregating sites in a sample.
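The article does not give the formula, so the sketch below uses the standard definition from Tajima's 1989 paper: the difference between the mean pairwise difference and the number of segregating sites scaled by a1, divided by an estimate of its standard deviation. The function name and parameter names are illustrative, and the inputs (sample size, segregating sites, mean pairwise difference) are assumed to have been computed elsewhere.

```python
import math


def tajimas_d(n: int, seg_sites: int, pi: float) -> float:
    """Tajima's D from sample size n, segregating sites S and mean pairwise difference pi."""
    if n < 4 or seg_sites < 1:
        # For n < 4 the variance estimate is zero, so D is undefined.
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical sample: 10 sequences, 10 segregating sites.
    print(tajimas_d(10, 10, 1.0))  # fewer pairwise differences than expected
    print(tajimas_d(10, 10, 6.0))  # more pairwise differences than expected
```

The statistic is zero when the mean pairwise difference equals S / a1, negative when it is smaller, and positive when it is larger.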

These are summary statistics (i.e. descriptive statistics summarizing the pattern of genetic diversity) that are computed from diploid samples of DNA sequences and that assume the gametic phase is known.

  1. ^ Devlin B., Risch N. (1995). "A Comparison of Linkage Disequilibrium Measures for Fine-Scale Mapping". Genomics 29: 311-322. 
  2. ^ Robbins, R.B. (1918). "Some applications of mathematics to breeding problems III". Genetics 3: 375-389. 
  3. ^ R.C. Lewontin and K. Kojima (1960). "The evolutionary dynamics of complex polymorphisms.". Evolution 14: 458-472. 
  4. ^ P.W. Hedrick and S. Kumar (2001). "Mutation and linkage disequilibrium in human mtDNA". Eur. J. Hum. Genet. 9: 969-972. 
  5. ^ Hao K., Di X., Cawley S. (2007). "LdCompare: rapid computation of single- and multiple-marker r2 and genetic coverage". Bioinformatics 23: 252-254. 

  • Hedrick, Philip W. (2005). Genetics of Populations, 3rd, Sudbury, Boston, Toronto, London, Singapore: Jones and Bartlett Publishers. ISBN 0763747726. 