Minimal Checklist

This section provides a succinct checklist of critical Quality Control (QC) metrics for evaluating DRAGEN run performance. These metrics are located in the standard CSV output files in the run directory.

1. Basic Run & Alignment QC

Source File: <output_prefix>.mapping_metrics.csv
Default Status: Enabled

Note that mapping metrics are computed on the raw input sample (prior to optional UMI collapsing).

Metric Name
Critical Trend / Success Criteria

Total input reads

Volume check: Confirm that the read count matches sequencer output and assay expectations, e.g. ~600–900M reads for 30× human WGS. Low counts can indicate flow cell loading issues or demultiplexing failures.

Q30 bases

Base quality: Generally > 85–90%. A global drop indicates chemistry or flow cell issues (see FastQC metrics for positional detail).

Mapped reads

Alignment success: Aim for > 95% (human WGS). A low mapping rate (< 90%) suggests the wrong reference genome, severe contamination, or poor library quality.

Supplementary (chimeric) alignments

Structural integrity: Reads split across distant loci. For high-quality human germline WGS, expected values are typically < 2–3%. Values > 3–5% warrant investigation and may indicate library chimera artifacts (e.g. PCR stitching, FFPE-related fragmentation) or, in somatic samples, true structural variation. Sustained levels > 5% in germline samples are generally considered abnormal.

Soft-clipped bases (R1/R2)

Adapter/quality trimming: High percentages indicate adapter read-through, short insert sizes, or poor-quality read ends (e.g. FFPE artifacts). For high-quality libraries, typical values are < 2–3%. Values > 3–5% suggest suboptimal trimming or degraded DNA. Levels > 5% should be treated as a QC concern and reviewed alongside FastQC metrics.

Estimated sample contamination

Purity check: Requires --qc-detect-contamination=true. In human germline samples, contamination above 1–2% can materially reduce variant calling accuracy. The effect is more pronounced when calling low-frequency somatic variants.

Duplicate reads

Library complexity: Elevated duplication rates suggest reduced library complexity or over-amplification. For high-quality germline WGS, typical values are < 20%. For WES and other targeted assays, higher duplication rates (e.g. 20–50%) are common and should be interpreted in the context of on-target coverage and assay design.

Insert length (median)

Fragment size: Should match the library prep target (e.g. ~350 bp). Deviations can affect coverage uniformity.
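The checks in this table can be automated. The following is a minimal Python sketch, not a DRAGEN utility. It assumes each row has the layout section, read group, metric name, value, optional percentage, and it uses the metric names from this checklist, which may differ from the labels in your file. The sample rows and the threshold values are illustrative only.

```python
import csv
import io

# Hypothetical excerpt; real files hold many more rows, and labels or
# column layout may differ between DRAGEN versions.
SAMPLE = """\
MAPPING/ALIGNING SUMMARY,,Total input reads,800000000,100.00
MAPPING/ALIGNING SUMMARY,,Duplicate reads,200000000,25.00
MAPPING/ALIGNING SUMMARY,,Mapped reads,784000000,98.00
MAPPING/ALIGNING SUMMARY,,Q30 bases,108000000000,90.00
MAPPING/ALIGNING SUMMARY,,Insert length (median),352.00
"""


def to_number(field):
    """Convert a CSV field to float, or None if it is empty or non-numeric."""
    try:
        return float(field)
    except ValueError:
        return None


def load_summary(text):
    """Return {metric name: (value, percent)} for sample-level summary rows."""
    metrics = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4 or not row[0].endswith("SUMMARY"):
            continue  # skip per-read-group rows and short lines
        percent = to_number(row[4]) if len(row) > 4 else None
        metrics[row[2]] = (to_number(row[3]), percent)
    return metrics


# (metric name, "min" or "max", limit on the percent column).
# Illustrative limits for germline WGS taken from the table above.
RULES = [
    ("Mapped reads", "min", 95.0),
    ("Q30 bases", "min", 85.0),
    ("Duplicate reads", "max", 20.0),
]


def review(metrics):
    """Return the names of metrics whose percentage falls outside its rule."""
    flagged = []
    for name, kind, limit in RULES:
        percent = metrics.get(name, (None, None))[1]
        if percent is None:
            continue  # metric absent or reported without a percentage
        if (kind == "min" and percent < limit) or (kind == "max" and percent > limit):
            flagged.append(name)
    return flagged


summary = load_summary(SAMPLE)
flagged = review(summary)
print(flagged)
```

A flagged metric is a prompt for investigation rather than an automatic failure. For a real run, read the file contents and pass them to `load_summary` in place of `SAMPLE`.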

2. FastQC (Sequence Composition)

Source File: <output_prefix>.fastqc_metrics.csv
Default Status: Enabled

Note that FastQC metrics are computed on the raw input sample (prior to optional UMI collapsing).

Metric Name
Critical Trend / Success Criteria

Positional base mean quality

Cycle decay: Identify quality drop-off at read ends or specific cycles (e.g. fluidics issues).

Read GC content

Contamination/bias: Deviation from expected distribution (e.g. human ~40–45% GC) suggests contamination or severe PCR bias.

Sequence positions (adapters)

Adapter content: High levels indicate untrimmed adapters or short inserts. This can lead to high levels (e.g. > 5%) of reported soft-clipped bases in the mapping metrics. Confirm trimming options are enabled.
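As a worked example of the read GC content check, the sketch below computes a read-weighted mean GC percentage from a histogram and compares it with the ~40–45% range quoted for human samples. How the histogram is extracted from the FastQC metrics file is not shown, and the dictionary layout (GC percent mapped to read count) and the sample counts are assumptions made for illustration.

```python
def mean_gc_percent(histogram):
    """Read-weighted mean GC% from a {gc_percent: read_count} mapping."""
    total_reads = sum(histogram.values())
    if total_reads == 0:
        raise ValueError("histogram contains no reads")
    return sum(gc * count for gc, count in histogram.items()) / total_reads


def gc_in_expected_range(histogram, low=40.0, high=45.0):
    """True if the mean GC% lies within [low, high] (human defaults)."""
    return low <= mean_gc_percent(histogram) <= high


# Hypothetical, heavily binned histogram for illustration.
hist = {30: 100, 40: 500, 45: 300, 60: 100}
print(mean_gc_percent(hist), gc_in_expected_range(hist))
```

The mean alone can hide a bimodal distribution, so a contaminated sample may still pass this check. Inspecting the shape of the full distribution remains the more reliable test.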

3. UMI QC (Applies Only to UMI Designs)

Source File: <output_prefix>.umi_metrics.csv
Default Status: Enabled as part of the UMI pipeline (requires --umi-enable=true)

Metric Name
Critical Trend / Success Criteria

Consensus reads

Conversion efficiency: A low ratio of consensus reads to total input reads suggests low molecular complexity or insufficient sequencing depth.

Mean family size

Saturation: Family sizes near 1.0 indicate under-sequencing; error correction is ineffective without duplicate families. Very high mean family sizes may indicate excessive PCR amplification.
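The two UMI checks reduce to simple arithmetic, sketched below in Python. The cut-offs `near_one` and `very_high` are hypothetical placeholders, not values from DRAGEN or from this checklist, and should be replaced with limits validated for your assay.

```python
def consensus_ratio(consensus_reads, total_input_reads):
    """Fraction of input reads that survive as consensus reads."""
    if total_input_reads <= 0:
        raise ValueError("total_input_reads must be positive")
    return consensus_reads / total_input_reads


def family_size_note(mean_family_size, near_one=1.2, very_high=20.0):
    """Classify the mean family size using placeholder cut-offs."""
    if mean_family_size < near_one:
        return "under-sequenced: too few duplicate families for error correction"
    if mean_family_size > very_high:
        return "possible over-amplification"
    return "ok"


# Hypothetical values for illustration.
print(consensus_ratio(2_000_000, 10_000_000))
print(family_size_note(1.05))
```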

4. Coverage QC

Source File: <output_prefix>.*_coverage_metrics.csv (e.g. wgs_coverage_metrics.csv or target_bed_coverage_metrics.csv)
Default Status: Enabled

Note that coverage metrics are computed after read deduplication or UMI collapsing. Reads with MAPQ=0 are ignored.

Metric Name
Critical Trend / Success Criteria

Average alignment coverage over target bed / genome

Depth check: Primary driver of sensitivity. Ensure it meets the assay target (e.g. 30× germline, 100×+ somatic).

Uniformity of coverage (PCT > 0.2× mean)

Bias check: Low uniformity indicates coverage bias (e.g. GC bias), leading to variant calling blind spots. Germline WGS typically expects ≥ 80–90% uniformity.

PCT of genome with coverage [x: inf)

Callability: Fraction of the genome covered at or above a chosen depth (e.g. PCT ≥ 20×). Germline WGS typically requires > 95% at 20×.

Aligned bases in target bed / genome

Yield: Total usable data. Useful for normalizing performance across runs or flow cells.
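The depth and yield rows are linked by simple arithmetic: mean coverage is approximately the number of aligned bases divided by the size of the region. The Python sketch below illustrates this, assuming a human genome size of about 3.1 Gb and a 150 bp read length. Both are assumptions for illustration, not values reported by DRAGEN.

```python
HUMAN_GENOME_BP = 3_100_000_000  # approximate size; an assumption


def mean_coverage(aligned_bases, region_bp=HUMAN_GENOME_BP):
    """Approximate mean depth: aligned bases divided by region size."""
    return aligned_bases / region_bp


def reads_for_depth(depth, read_length_bp, region_bp=HUMAN_GENOME_BP):
    """Reads needed to reach `depth`, ignoring duplicates and unmapped reads."""
    return depth * region_bp / read_length_bp


print(mean_coverage(93_000_000_000))  # 93 Gb of aligned bases
print(reads_for_depth(30, 150))       # 150 bp reads for 30x
```

With these assumptions, 30× needs about 620M reads before any losses. Duplicates, unmapped reads, and MAPQ=0 reads do not contribute to coverage, which is consistent with the ~600–900M read range quoted for 30× human WGS in the mapping section.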

5. Variant-Level Sanity Check (Optional)

Source File: <output_prefix>.vc_metrics.csv
Default Status: Conditional (variant caller enabled)

Metric Name
Critical Trend / Success Criteria

Ti/Tv Ratio

Biological plausibility: For human germline WGS, expect approximately 1.9–2.2. Significantly lower values (e.g. < 1.7) often indicate elevated false-positive rates due to sequencing or alignment errors.

Note: Ti/Tv is a biological sanity check, not a standalone QC pass/fail metric. Expected values depend on organism, assay type, and genomic region.
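DRAGEN reports Ti/Tv directly, so it does not need to be recomputed. To show what the ratio measures, the Python sketch below counts transitions (A<->G, C<->T) and transversions (all other single-base changes) in a list of SNVs. The input format, a list of (ref, alt) pairs, and the sample variants are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs):
    """Ti/Tv over (ref, alt) pairs; non-SNV records are ignored."""
    transitions = transversions = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, symbolic alleles, and non-variants
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Hypothetical variants: four transitions, two transversions, one deletion.
variants = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"),
            ("A", "C"), ("G", "T"), ("AT", "A")]
print(titv_ratio(variants))
```

If substitutions were random, transversions would outnumber transitions two to one, giving a ratio near 0.5. This is why an excess of false-positive calls tends to pull the observed ratio below the expected germline range.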

How to Use This Checklist

  • Treat this as a minimum QC review, not an exhaustive list.

  • Values outside these ranges warrant investigation but are not automatic failures.

  • Some metrics are assay-specific, and their thresholds may need tuning for particular use cases.
