Small Variant Calling Metrics
The QC metrics are printed to standard output. In addition, CSV files are written to the run output directory:
<output prefix>.vc_metrics.csv
Metrics are reported for each sample in multisample VCF and gVCF files and are written to a CSV file whose name ends in "vc_metrics.csv". Depending on the run case, metrics are reported as either standard VARIANT CALLER or JOINT CALLER. Metrics are reported for both the PREFILTER category (all variant calls, regardless of the FILTER field) and the POSTFILTER category (only PASS calls).
Variants filtered by Panel of Normals (PON) or COSMIC are counted as PASS variants in the POSTFILTER VCF metrics. As a result, variant counts in the POSTFILTER VCF metrics can be higher than expected.
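The PREFILTER and POSTFILTER categories described above can be compared programmatically. The following is a minimal sketch, not DRAGEN tooling. It assumes each CSV row has the form section, sample, metric name, value, optional percentage; that layout is an assumption and is not specified in this section. The function name `parse_vc_metrics` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def parse_vc_metrics(text):
    """Group metrics as {(section, sample): {metric name: value string}}.

    Assumed row layout (not confirmed by this document):
    section, sample, metric name, value[, percentage]
    """
    metrics = defaultdict(dict)
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        section, sample, name, value = (field.strip() for field in row[:4])
        metrics[(section, sample)][name] = value
    return dict(metrics)


# Hypothetical content; the values are made up for illustration only.
SAMPLE = """\
VARIANT CALLER PREFILTER,sampleA,Total,1000,100.00
VARIANT CALLER PREFILTER,sampleA,SNPs,800,80.00
VARIANT CALLER POSTFILTER,sampleA,Total,900,100.00
VARIANT CALLER POSTFILTER,sampleA,SNPs,750,83.33
"""

parsed = parse_vc_metrics(SAMPLE)
pre = int(parsed[("VARIANT CALLER PREFILTER", "sampleA")]["Total"])
post = int(parsed[("VARIANT CALLER POSTFILTER", "sampleA")]["Total"])
filtered_out = pre - post  # calls that did not end up as PASS
```

For a real run, read the text from the `<output prefix>.vc_metrics.csv` file instead of the inline string, and check the actual column layout before relying on the field positions.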
Number of samples---Number of samples in the population/joint VCF.
Reads Processed---The number of reads used for variant calling, excluding any duplicate marked reads and reads falling outside of the target region.
Total---The total number of variants (SNPs + MNPs + indels).
Biallelic---Number of sites in the genome that contain two observed alleles. The reference is counted as one allele, which allows for one variant allele.
Multiallelic---Number of sites in the VCF that contain three or more observed alleles. The reference is counted as one, which allows for two or more variant alleles.
SNPs---A variant is counted as an SNP when the reference, allele 1, and allele 2 are all length 1.
Insertions (Hom)---Number of variants that contain homozygous insertions.
Insertions (Het)---Number of variants where both alleles are insertions, but not homozygous.
Deletions (Hom)---Number of variants that contain homozygous deletions.
Deletions (Het)---Number of variants where both alleles are deletions, but not homozygous.
INDELS (Het)---Number of variants where genotypes are either [insertion+deletion], [insertion+SNP], or [deletion+SNP].
De Novo SNPs---De novo marked SNPs with DQ greater than the threshold. Set the --qc-snp-denovo-quality-threshold option to the required threshold. The default is 0.05 if ML recalibration is off, 0.0017 if ML recalibration is on.
De Novo INDELs---De novo marked indels with DQ greater than the threshold. Set the --qc-indel-denovo-quality-threshold option to the required threshold. The default is 0.4 if ML recalibration is off, 0.04 if ML recalibration is on.
De Novo MNPs---De novo marked MNPs with DQ greater than the threshold. Set the --qc-snp-denovo-quality-threshold option to the required threshold. The default is 0.05 if ML recalibration is off, 0.0017 if ML recalibration is on.
(Chr X SNPs)/(Chr Y SNPs) ratio in the genome (or the target region)---Number of SNPs in chromosome X (or in the intersection of chromosome X with the target region) divided by the number of SNPs in chromosome Y (or in the intersection of chromosome Y with the target region). If there was no alignment to either chromosome X or chromosome Y, this metric shows as NA.
SNP Transitions---Number of transitions, the interchange of two purines (A<->G) or two pyrimidines (C<->T), across the genome.
SNP Transversions---Number of transversions, the interchange of purine and pyrimidine bases, across the genome.
Ti/Tv ratio---Ratio of transitions to transversions.
SNP Transitions (non-centromeric)---Number of transitions across the genome, excluding centromeres.
SNP Transversions (non-centromeric)---Number of transversions across the genome, excluding centromeres.
Ti/Tv ratio (non-centromeric)---Ratio of transitions to transversions in non-centromeric regions.
Heterozygous---Number of heterozygous variants.
Homozygous---Number of homozygous variants.
Het/Hom ratio---Ratio of heterozygous variants to homozygous variants.
In dbSNP---Number of variants detected that are present in the dbSNP reference file. If no dbSNP file is provided via the --dbsnp option, then both the In dbSNP and Novel metrics show as NA.
Novel---Total number of variants minus number of variants in dbSNP.
Percent Callability---Available in germline and somatic modes with gVCF output. The percentage of non-N reference positions having a PASSing genotype call. Multiallelic variants are not counted. Deletions are counted for all the deleted reference positions only for homozygous calls. Only autosomes and chromosomes X, Y, and M are considered. To produce this metric for non-human references, set --qc-callability-autosome-contigs to specify the autosome contig names. Optionally, --qc-callability-xym-contigs allows setting X, Y and M contig names.
Percent Autosome Callability---Only autosomes are considered. To produce this metric for non-human references, set --qc-callability-autosome-contigs to specify the autosome contig names.
Percent QC Region Callability in Region i (i is 1, 2, or 3)---Available if callability for custom regions is requested via the --qc-coverage-region-i option and the callability output is specified with --qc-coverage-reports-i. All contigs are considered. Setting --qc-callability-autosome-contigs enables outputting this metric for non-human references.
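To illustrate the SNP Transitions, SNP Transversions, and Ti/Tv ratio definitions in the list above, the following sketch classifies single-base substitutions and computes the ratio. It is an illustration of the definitions only, not how DRAGEN computes the metric. The function names `classify_snp` and `ti_tv_ratio` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Transition: both bases are purines, or both are pyrimidines.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ti_tv_ratio(snps):
    """Compute transitions / transversions for (ref, alt) pairs.

    Returns None when there are no transversions, so the ratio is undefined.
    """
    ti = tv = 0
    for ref, alt in snps:
        if classify_snp(ref, alt) == "transition":
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else None


# Two transitions (A>G, C>T) and one transversion (A>T).
example_ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "T")])
```

Returning `None` for a zero transversion count is a design choice of this sketch. How the metrics file reports that case is not stated in this section.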
Per Contig Het/Hom Ratio
When the germline small variant caller is executed, DRAGEN calculates a het/hom ratio per contig.
The het/hom ratio values can be used as an indication of whole chromosome uniparental disomy (UPD). UPD of certain chromosomes is associated with genetic syndromes known as imprinting disorders. Chromosomes with whole chromosome UPD have het/hom ratios close to 0.0. Otherwise, ratios vary but are usually between 1.0 and 2.0. The het/hom ratios should be interpreted in the context of the specific assay.
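The interpretation above can be sketched as a simple screen over per-contig counts. This is a minimal illustration, not a diagnostic method. The cutoff of 0.1 for "close to 0.0" is an arbitrary assumption and should be tuned for the specific assay. The function names and the example counts are hypothetical.

```python
def het_hom_ratio(het, hom):
    """Return het/hom, or None when there are no homozygous variants."""
    return het / hom if hom else None


def flag_possible_upd(counts, threshold=0.1):
    """List contigs whose het/hom ratio falls below the threshold.

    counts maps contig name -> (heterozygous count, homozygous count).
    threshold is an assumed cutoff, not a value taken from DRAGEN.
    """
    flagged = []
    for contig, (het, hom) in counts.items():
        ratio = het_hom_ratio(het, hom)
        if ratio is not None and ratio < threshold:
            flagged.append(contig)
    return flagged


# Made-up counts for illustration only.
example_counts = {
    "chr1": (1500, 1000),   # ratio 1.5, within the usual range
    "chr15": (20, 1000),    # ratio 0.02, close to 0.0
    "chrY": (0, 0),         # no variants, ratio undefined
}
flagged_contigs = flag_possible_upd(example_counts)
```

A flagged contig is only a prompt for closer review. Low ratios can also have other causes, so the result should be read in the context of the assay.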
DRAGEN reports the ratios for both PREFILTER (including all variant calls regardless of FILTER field) and POSTFILTER (including only PASS calls) categories. The metrics are output to the .vc_hethom_ratio_metrics.csv file.
The file contains the following values for each primary contig processed.
Contig
Number of heterozygous variants
Number of homozygous variants
Het/Hom ratio
The following example shows a section of the metrics.