Legofit infers population history from nucleotide site patterns.
Calculate derived allele frequency, daf.
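Input file should consist of tab-separated columns: chromosome, position, reference allele, alternate allele, ancestral allele, and then one genotype column per sample.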
This file can be generated from a VCF file that includes annotations for ancestral alleles. If the ancestral allele is labelled "AA", the input for daf can be generated with bcftools as follows:
bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\t%INFO/AA[\t%GT]\n' fname.vcf.gz
Output is in 5 columns, separated by whitespace:
The input files should include all sites at which derived alleles are present in any of the populations under study. For example, consider an analysis involving modern humans and Neanderthals. The modern human data must include all sites at which Neanderthals carry derived alleles, even if these sites do not vary among modern humans. To accomplish this, it is best to use whole-genome data for all populations.
The input must not contain duplicate nucleotide sites. Chromosomes must be sorted in lexical order, and within each chromosome, sites must be sorted in numerical order by position. Otherwise, daf will abort with an error.
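The ordering rule can be checked before running daf. The following Python sketch is not part of Legofit. The function name `check_site_order` is hypothetical, and the sketch assumes that "lexical order" means plain string comparison of chromosome names, which may differ in detail from the comparison daf itself performs.

```python
def check_site_order(sites):
    """Raise ValueError if (chrom, pos) pairs are duplicated or out of order.

    Assumption: chromosomes are compared as plain strings (lexical order)
    and positions as integers (numerical order).
    """
    prev = None
    for chrom, pos in sites:
        key = (chrom, int(pos))
        if prev is not None:
            if key == prev:
                raise ValueError(f"duplicate site {chrom}:{pos}")
            if key < prev:
                raise ValueError(f"site {chrom}:{pos} is out of order")
        prev = key


# Lexical order puts "10" before "2", so this input passes the check.
check_site_order([("1", 100), ("1", 250), ("10", 5), ("2", 7)])
```

Note that under lexical ordering chromosome "10" precedes chromosome "2", so a file sorted numerically by chromosome would fail this check.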
Sites are rejected unless they have a single ref or ancestral allele. Missing values are allowed for the alt allele. At the end of the job a summary of rejected sites is written to stderr.
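To make the calculation concrete, here is a minimal Python sketch of how a derived allele frequency could be obtained from one input line. It is an illustration under stated assumptions, not daf's actual implementation: the function name `derived_allele_freq` is hypothetical, genotypes are assumed to be coded as "0/1" or "0|1" (0 for the reference allele, 1 for the alternate), alleles are compared case-insensitively, missing genotype calls are skipped, and the exact rejection rules in daf may differ.

```python
import re

NUCLEOTIDES = {"A", "C", "G", "T"}


def derived_allele_freq(line):
    """Return the derived allele frequency for one line, or None if rejected."""
    chrom, pos, ref, alt, aa, *genotypes = line.rstrip("\n").split("\t")
    ref, alt, aa = ref.upper(), alt.upper(), aa.upper()

    # Reject sites whose ref or ancestral allele is not a single nucleotide.
    if ref not in NUCLEOTIDES or aa not in NUCLEOTIDES:
        return None
    # Assumption: a missing alt (".") is allowed; multi-allelic alts are rejected.
    if alt != "." and alt not in NUCLEOTIDES:
        return None

    if aa == ref:
        derived_code = "1"   # the alternate allele is derived
    elif aa == alt:
        derived_code = "0"   # the reference allele is derived
    else:
        return None          # ancestral allele matches neither ref nor alt

    # Split each genotype on "/" or "|" and keep only called alleles.
    calls = [c for gt in genotypes for c in re.split(r"[/|]", gt)
             if c in ("0", "1")]
    if not calls:
        return None
    return sum(c == derived_code for c in calls) / len(calls)


# Ancestral allele equals ref, so allele 1 is derived: 3 of 4 copies.
print(derived_allele_freq("1\t100\tA\tG\tA\t0/1\t1|1"))  # 0.75
```

When the ancestral allele equals the alternate allele instead, the same line gives 0.25, because the reference allele is then the derived one.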