Variation statistics
- 1. Frequency-based nucleotide variation
- S. Number of segregating sites per site (Nei 1987).
- S2. Number of segregating sites per site, excluding unknown nucleotides in the outgroup (Nei 1987).
- Pi. Nucleotide diversity: average number of nucleotide differences per site between any two sequences (Jukes and Cantor 1969; Nei and Li 1979; Nei 1987).
- theta. Nucleotide polymorphism: proportion of nucleotide sites that are expected to be polymorphic in any suitable sample (Watterson 1975; Tajima 1993, 1996).
- nuc_diversity_within. Nucleotide diversity within the population (Hudson, Slatkin and Maddison 1992; Wakeley 1996).
- hap_diversity_within. Haplotype diversity within the population (Hudson, Slatkin and Maddison 1992).
- Pneu. Number of 4-fold (putatively neutral) segregating sites.
- Psel. Number of 0-fold (putatively selected) segregating sites.
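As an illustration of the frequency-based statistics in this group, the following minimal Python sketch computes S, Pi and Watterson's theta per site from a toy alignment. The function names and the sample sequences are hypothetical and are not taken from the pipeline that produced these tracks. The sketch assumes aligned sequences of equal length with no gaps or unknown nucleotides.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """Pi: average number of pairwise differences, divided by sequence length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs) / length

def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a1 * L), with a1 = sum of 1/i for i < n."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])

# Hypothetical sample: 4 sequences, 10 sites, 2 polymorphic columns.
sample = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT", "ACGAACGTAC"]

s_per_site = segregating_sites(sample) / len(sample[0])
pi = nucleotide_diversity(sample)
theta = watterson_theta(sample)
print(s_per_site, pi, theta)
```

Real data would also need rules for missing nucleotides and for sites with more than two alleles, which this sketch does not address.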
- 2. Divergence-based metrics
- Divsites. Number of divergent sites.
- D. Proportion of sites with divergent nucleotides.
- K. Nucleotide divergence per base pair, with the Jukes-Cantor correction for multiple substitutions (Jukes and Cantor 1969).
- Dneu. Number of 4-fold (putatively neutral) divergent sites.
- Dsel. Number of 0-fold (putatively selected) divergent sites.
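The relation between D and K in this group can be sketched as follows. This is a minimal example of the standard Jukes-Cantor correction, K = -(3/4) ln(1 - (4/3) D); the function names and the two toy sequences are hypothetical, and the sketch assumes a gap-free pairwise alignment.

```python
import math

def proportion_divergent(seq_a, seq_b):
    """D: proportion of aligned sites where the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def jukes_cantor(d):
    """K = -(3/4) * ln(1 - (4/3) * D); only defined for 0 <= D < 0.75."""
    if not 0.0 <= d < 0.75:
        raise ValueError("Jukes-Cantor correction is undefined for D >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * d / 3.0)

# Hypothetical ingroup and outgroup sequences differing at 2 of 20 sites.
ingroup = "ACGTACGTACGTACGTACGT"
outgroup = "ACGTACGAACGTACGTACGA"

d = proportion_divergent(ingroup, outgroup)
k = jukes_cantor(d)
print(d, k)
```

Because the correction accounts for multiple substitutions at the same site, K is always at least as large as D, and the gap widens as D grows.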
- 3. Linkage disequilibrium
- Wall_B. Wall’s B summary statistic of linkage disequilibrium (Wall 1999), proportion of pairs of adjacent segregating sites that are congruent, with values approaching 1 indicating extensive congruence among adjacent segregating sites.
- Wall_Q. Wall’s Q summary statistic of linkage disequilibrium (Wall 1999), the number of congruent pairs of adjacent segregating sites plus the number of distinct partitions induced by those pairs, divided by the number of segregating sites.
- Rozas_ZA. Rozas’s ZA summary statistic (Rozas et al. 2001), average of r² between adjacent polymorphic sites only.
- Rozas_ZZ. Rozas’s ZZ summary statistic (Rozas et al. 2001), Rozas’s ZA minus Kelly’s ZnS.
- Kelly_ZnS. Kelly’s ZnS summary statistic (Kelly 1997), average r² over all pairs of polymorphic sites.
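- iHS. Integrated haplotype score (Voight et al. 2006), which compares the decay of extended haplotype homozygosity around the ancestral and the derived allele of a SNP; extreme values indicate alleles that sit on unusually long haplotypes in regions of high LD. Only for autosomes.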
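The three r²-based statistics in this group (Kelly_ZnS, Rozas_ZA and Rozas_ZZ) can be illustrated with a short Python sketch. It assumes phased haplotypes restricted to biallelic segregating sites and coded as 0/1; the function names and the toy haplotypes are hypothetical.

```python
from itertools import combinations

def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites coded 0/1."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def ld_summaries(haplotypes):
    """Return (ZnS, ZA, ZZ) for haplotypes given as rows over segregating sites."""
    sites = list(zip(*haplotypes))  # one tuple of alleles per site
    s = len(sites)
    if s < 2:
        raise ValueError("at least two segregating sites are required")
    all_pairs = [r_squared(sites[i], sites[j])
                 for i, j in combinations(range(s), 2)]
    adjacent = [r_squared(sites[i], sites[i + 1]) for i in range(s - 1)]
    zns = sum(all_pairs) / len(all_pairs)   # average over all site pairs
    za = sum(adjacent) / len(adjacent)      # average over adjacent pairs only
    return zns, za, za - zns                # ZZ = ZA - ZnS

# Hypothetical sample: 4 haplotypes, 3 segregating sites.
haplotypes = [
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
]
zns, za, zz = ld_summaries(haplotypes)
print(zns, za, zz)
```

In this toy sample the first two sites are in complete LD and the third is independent of both, so ZA exceeds ZnS and ZZ is positive.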
- 4. Recombination
- recomb_Bherer2017_females. Recombination estimates (cM/Mb) from the refined genetic map by Bherer et al. 2017, which collects recombination events from six recent studies of human pedigrees, pertaining to a total of 104,246 informative meioses. Females map.
- recomb_Bherer2017_males. Recombination estimates (cM/Mb) from the refined genetic map by Bherer et al. 2017, which collects recombination events from six recent studies of human pedigrees, pertaining to a total of 104,246 informative meioses. Males map.
- recomb_Bherer2017_sexavg. Recombination estimates (cM/Mb) from the refined genetic map by Bherer et al. 2017, which collects recombination events from six recent studies of human pedigrees, pertaining to a total of 104,246 informative meioses. Values from the females/males maps are averaged.
- recomb_Genethon_females_1Mb. Genethon genetic map based on 5,264 microsatellites for 8 CEPH families consisting of 134 individuals with 186 meioses. Females map.
- recomb_Genethon_males_1Mb. Genethon genetic map based on 5,264 microsatellites for 8 CEPH families consisting of 134 individuals with 186 meioses. Males map.
- recomb_Genethon_sexavg_1Mb. Genethon genetic map based on 5,264 microsatellites for 8 CEPH families consisting of 134 individuals with 186 meioses. Values from the females/males maps are averaged.
- recomb_Marshfield_females_1Mb. Marshfield genetic map based on 8,325 short tandem repeat polymorphisms (STRPs) for 8 CEPH families consisting of 134 individuals with 186 meioses. Females map.
- recomb_Marshfield_males_1Mb. Marshfield genetic map based on 8,325 short tandem repeat polymorphisms (STRPs) for 8 CEPH families consisting of 134 individuals with 186 meioses. Males map.
- recomb_Marshfield_sexavg_1Mb. Marshfield genetic map based on 8,325 short tandem repeat polymorphisms (STRPs) for 8 CEPH families consisting of 134 individuals with 186 meioses. Values from the females/males maps are averaged.
- recomb_deCODE_females_1Mb. deCODE genetic map based on 5,136 microsatellite markers for 146 families with a total of 1,257 meiotic events. Females map.
- recomb_deCODE_males_1Mb. deCODE genetic map based on 5,136 microsatellite markers for 146 families with a total of 1,257 meiotic events. Males map.
- recomb_deCODE_sexavg_1Mb. deCODE genetic map based on 5,136 microsatellite markers for 146 families with a total of 1,257 meiotic events. Values from the females/males maps are averaged.
- 5. Selection tests based on SFS and/or variability
- Tajima_D. Tajima's D test statistic (Tajima 1989), based on the difference between the number of segregating sites and the average number of nucleotide differences.
- FuLi_F. Fu & Li's F test statistic (Fu and Li 1993), which compares the number of derived nucleotide variants observed only once in a sample with the mean pairwise difference between sequences.
- FuLi_D. Fu & Li's D test statistic (Fu and Li 1993), which compares the number of derived nucleotide variants observed only once in a sample with the total number of derived nucleotide variants.
- FayWu_H. Fay & Wu’s H test statistic (Fay and Wu 2000), which compares the number of derived nucleotide variants at low and high frequencies with the number of variants at intermediate frequencies.
- Zeng_E. Zeng’s E test statistic (Zeng et al. 2006), the difference between θL and θW, sensitive to changes in high-frequency variants.
- Fst. Fst statistic (Hudson et al. 1992), which measures average levels of gene flow based on allele frequencies under the infinite-sites model.
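As a worked example of one statistic from this group, the sketch below computes Tajima's D from the sample size, the number of segregating sites and the mean number of pairwise differences, using the constants defined in Tajima (1989). The function name and the input values are hypothetical; note that S and the pairwise differences are totals for the region here, not per-site values.

```python
import math

def tajima_d(n, s, k):
    """Tajima's D for n sequences, s segregating sites and k mean pairwise differences."""
    if n < 2 or s < 1:
        raise ValueError("need at least 2 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: difference between the two estimators of theta.
    # Denominator: estimated standard deviation of that difference.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical region: 10 sequences, 16 segregating sites,
# 3.888889 pairwise differences on average.
d_stat = tajima_d(10, 16, 3.888889)
print(round(d_stat, 3))
```

A negative value, as in this example, reflects an excess of low-frequency variants relative to the neutral expectation; a positive value reflects an excess of intermediate-frequency variants.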
- 6. Selection tests based on the MKT
- NI. Neutrality index (Rand and Kann 1996), which summarizes the four values in a McDonald and Kreitman test (McDonald and Kreitman 1991) table as a ratio of ratios, computed as NI = (Psel/Pneu) / (Dsel/Dneu).
- alpha. Proportion of substitutions that are adaptive (Charlesworth 1994; Smith and Eyre-Walker 2002), based on the McDonald and Kreitman test (McDonald and Kreitman 1991), which compares the amount of variation within species to the divergence between species at two types of site: synonymous and nonsynonymous. The test assumes that all synonymous mutations are neutral and that nonsynonymous mutations are either strongly deleterious, neutral, or strongly advantageous. For this track, four-fold degenerate sites were used as synonymous (putatively neutral) sites and zero-fold degenerate sites as nonsynonymous (putatively selected) sites, and alpha was computed as alpha = 1 - ((Psel/Pneu) / (Dsel/Dneu)).
- DoS. Direction of Selection (Stoletzki and Eyre-Walker 2011), difference between the proportion of nonsynonymous divergence and nonsynonymous polymorphism, computed as DoS = (Dsel/(Dsel+Dneu)) - (Psel/(Psel+Pneu)).
- Fisher1. Fisher exact test p-value (Fisher 1922) for the McDonald and Kreitman test (McDonald and Kreitman 1991) 2x2 contingency table containing Dsel, Dneu, Psel and Pneu estimates, used to determine the significance of the MK test.
- Pneu_less5. Number of 4-fold (putatively neutral) segregating sites with MAF<5% (Mackay et al. 2012).
- Pneu_more5. Number of 4-fold (putatively neutral) segregating sites with MAF>5% (Mackay et al. 2012).
- Psel_less5. Number of 0-fold (putatively selected) segregating sites with MAF<5% (Mackay et al. 2012).
- Psel_more5. Number of 0-fold (putatively selected) segregating sites with MAF>5% (Mackay et al. 2012).
- Psel_neutral_less5. Fraction of 0-fold segregating sites with MAF<5% that are neutral, computed as Psel_neutral_less5 = Psel x (Pneu_less5/Pneu) (Mackay et al. 2012).
- Psel_neutral. Fraction of new mutations that are neutral, obtained after removing the excess of sites at MAF<5% due to slightly deleterious mutations and computed as Psel_neutral = Psel_neutral_less5 + Psel_more5 (Mackay et al. 2012).
- Psel_weak. Fraction of new mutations that are weakly deleterious and segregate at MAF<5%, computed as Psel_weak = Psel_less5 - Psel_neutral_less5 (Mackay et al. 2012).
- alpha_cor. Fraction of new mutations that are adaptive, obtained after removing slightly deleterious mutations and computed as alpha_cor = 1 - (Psel_neutral/Pneu)*(Dneu/Dsel) (Charlesworth 1994; Mackay et al. 2012).
- Fisher2. Fisher exact test p-value (Fisher 1922) for the corrected McDonald and Kreitman test (McDonald and Kreitman 1991) 2x2 contingency table containing Dsel, Dneu, Psel_neutral and Pneu estimates, used to determine the significance of the corrected MK test.
See also Help -> Integrative MKT for a complete description of the method applied to both sliding windows and individual genes.
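The MKT-derived formulas listed in section 6 can be combined into one short Python sketch. It uses only the expressions given in the list above; the function name, the dictionary keys and the input counts are hypothetical, and the Fisher exact tests are left out. The sketch assumes Pneu, Dneu and Dsel are all greater than zero.

```python
def mkt_summary(pneu, psel, dneu, dsel, pneu_less5, psel_less5, psel_more5):
    """Standard and corrected MKT summaries from polymorphism and divergence counts."""
    if min(pneu, dneu, dsel) <= 0:
        raise ValueError("Pneu, Dneu and Dsel must be greater than zero")
    ni = (psel / pneu) / (dsel / dneu)
    alpha = 1 - ni
    dos = dsel / (dsel + dneu) - psel / (psel + pneu)
    # Correction for slightly deleterious mutations segregating at MAF<5%.
    psel_neutral_less5 = psel * pneu_less5 / pneu
    psel_neutral = psel_neutral_less5 + psel_more5
    psel_weak = psel_less5 - psel_neutral_less5
    alpha_cor = 1 - (psel_neutral / pneu) * (dneu / dsel)
    return {
        "NI": ni,
        "alpha": alpha,
        "DoS": dos,
        "Psel_neutral_less5": psel_neutral_less5,
        "Psel_neutral": psel_neutral,
        "Psel_weak": psel_weak,
        "alpha_cor": alpha_cor,
    }

# Hypothetical counts for one window or gene.
result = mkt_summary(pneu=50, psel=20, dneu=40, dsel=40,
                     pneu_less5=20, psel_less5=14, psel_more5=6)
for name, value in result.items():
    print(name, round(value, 4))
```

With these counts, removing the excess of low-frequency 0-fold variants lowers Psel_neutral relative to Psel, so alpha_cor comes out higher than the uncorrected alpha.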