DRAGEN
Illumina Connected Software

Analysis Methods

DNA Analysis Methods

The software is a DNA-only analysis pipeline based on the DRAGEN Secondary Analysis Software. Although it includes some of the default settings from the DNA Somatic Tumor-Normal Solid WGS DRAGEN recipe, it uses a distinct recipe with different options. Users can override specific parameters via a custom configuration file.

The software performs germline variant calling on the normal sample, and reports the following variants:

  • SNV (annotated)

  • CNV (annotated)

  • SV (annotated)

  • Targeted callers (cyp2b6, cyp2d6, cyp21a2, gba, hba, lpa, rh, and smn)

  • Expansion hunter

  • VNTR

The software performs somatic variant calling on the tumor sample and reports the following variants:

  • SNV (annotated)

  • MNV

  • CNV (annotated, requires germline SNV and CNV VCF)

  • SV (annotated, with variant deduplication)

  • TMB

  • MSI

  • HRD

  • ASCN

  • LOH

  • DUX4

  • HLA

/opt/edico/bin/dragen \
--ref-dir /staging/dragen-app-manager/resources/Illumina_hg38-alt_masked.cnv.graph.hla.methyl_cg.rna-11_r5.0-1 \
--output-directory DragenCaller/Sample-001 \
--output-file-prefix Sample-001 \
--events-log-file DragenCaller/Sample-001/events.csv \
--enable-map-align=true \
--enable-map-align-output=true \
--enable-variant-caller=true \
--vc-emit-ref-confidence=GVCF \
--vc-enable-vcf-output=true \
--enable-targeted=true \
--targeted-merge-vc=true \
--enable-star-allele=true \
--enable-cnv=true \
--cnv-enable-self-normalization=true \
--repeat-genotype-enable=true \
--enable-sv=true \
--enable-vntr=true \
--sv-vntr-merge=false \
--enable-hla=true \
--hla-enable-class-2=true \
--vc-output-evidence-bam=false \
--qc-detect-contamination=true \
--qc-coverage-ignore-overlaps=false \
--logging-to-output-dir=true \
--max-base-quality=63 \
--enable-duplicate-marking false \
--tumor-normal-has-umi both \
--umi-source qname \
--umi-library-type nonrandom-duplex \
--umi-min-supporting-reads 1 \
--umi-correction-table /staging/dragen-app-manager/resources/Illumina_solid-wgs-tn-resources_4.4.4.2/umi/umi_correction_table.txt.gz \
--bam-input Sample-001.bam \
--force 
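The example command mixes two option styles (`--option=value` and `--option value`), which matters when comparing a logged command against other records of the run. The following is a minimal sketch, not part of DRAGEN, of how such a command line could be parsed into a dictionary. The helper name `parse_dragen_args` is hypothetical, and treating a bare flag such as `--force` as `"true"` is an assumption made for illustration.

```python
import shlex


def parse_dragen_args(command: str) -> dict:
    """Parse a logged dragen command line into {option: value}.

    Accepts both `--opt=value` and `--opt value`. A flag with no value
    is stored as "true" (an assumption made for this sketch).
    """
    # Join shell line continuations (backslash + newline) before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    options = {}
    i = 1  # tokens[0] is the executable path
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("--"):
            raise ValueError(f"unexpected token: {token!r}")
        if "=" in token:
            name, value = token[2:].split("=", 1)
        elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
            name, value = token[2:], tokens[i + 1]
            i += 1  # the next token was consumed as the value
        else:
            name, value = token[2:], "true"
        options[name] = value
        i += 1
    return options


# A shortened excerpt of the example command, for illustration only.
EXAMPLE = (
    "/opt/edico/bin/dragen \\\n"
    "--enable-map-align=true \\\n"
    "--umi-source qname \\\n"
    "--force \n"
)

if __name__ == "__main__":
    for name, value in parse_dragen_args(EXAMPLE).items():
        print(f"{name} = {value}")
```

A dictionary in this form makes it easy to check which options were set explicitly; anything absent falls back to the module default.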

Reference Genomes

The pipeline supports two reference genomes for the DRAGEN Map/Aligner: hg38 and hs37d5_chr.

The hs37d5_chr genome is the hg19 reference genome with the chromosome Y PAR masked. It includes the NC_012920 mitochondrial genome. The contigs have the chr prefix added but do not use the native alternate loci names.

DRAGEN Map/Aligner

DRAGEN uses the final alignments as input for downstream steps such as gene amplification (copy number) calling, small variant calling (SNV, indel, MNV, delins), and DNA library quality control.

Small Variant Calling and Filtering

DRAGEN supports calling SNVs, indels, MNVs, and delins in tumor-only samples by using mapped and aligned DNA reads from a tumor sample as input. Variants are detected via both column-wise pileup analysis and local de novo assembly of haplotypes. The de novo haplotypes allow the detection of much larger insertions and deletions than is possible through column-wise pileup analysis alone. DRAGEN insertion and deletion calls are validated for lengths of 0–25 bp, and lengths greater than 25 bp can also be supported. DRAGEN also uses the de novo assembly to detect SNVs, insertions, and deletions that are co-phased and part of the same haplotype. Any such co-phased variants that lie within a window of 15 bp can then be reassembled into complex variants (MNVs and delins). The tumor-only pipeline produces a VCF file containing both germline and somatic variants that can be further analyzed to identify tumor mutations. The pipeline makes no ploidy assumptions, enabling detection of low-frequency alleles.
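To illustrate the idea of combining co-phased variants within a 15 bp window, here is a minimal sketch. It is not the DRAGEN implementation: the `SmallVariant` type, the `haplotype` identifier, and the rule that a group spans fewer than 15 bp from its first variant are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallVariant:
    pos: int        # 1-based reference position
    ref: str
    alt: str
    haplotype: int  # hypothetical ID of the assembled haplotype carrying the variant


def group_cophased(variants, window=15):
    """Return groups of 2+ variants on the same haplotype within `window` bp.

    Assumption: a variant joins the current group when it starts fewer than
    `window` bp after the first variant of that group.
    """
    groups = []
    current_by_hap = {}
    for v in sorted(variants, key=lambda v: (v.haplotype, v.pos)):
        current = current_by_hap.get(v.haplotype)
        if current is not None and v.pos - current[0].pos < window:
            current.append(v)
        else:
            current = [v]
            current_by_hap[v.haplotype] = current
            groups.append(current)
    # Only groups with more than one member are candidates for MNVs or delins.
    return [g for g in groups if len(g) > 1]


calls = [
    SmallVariant(100, "A", "T", haplotype=1),
    SmallVariant(105, "G", "C", haplotype=1),
    SmallVariant(130, "C", "A", haplotype=1),  # too far from position 100
    SmallVariant(102, "T", "G", haplotype=2),  # different haplotype
]

if __name__ == "__main__":
    for group in group_cophased(calls):
        print([v.pos for v in group])
```

Only the variants at positions 100 and 105 are grouped here, because they share a haplotype and fall inside the same window.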

DRAGEN small variant calling includes the following steps:

  1. Detects regions with sufficient read coverage (callable regions).

  2. Detects regions where the reads deviate from the reference and there is a possibility of a germline or somatic call (active regions).

  3. Assembles de novo graph haplotypes from reads (haplotype assembly).

  4. Extracts possible somatic or germline calls (events) from column-wise pileup analysis.

  5. Calibrates read base qualities to account for background noise.

  6. Computes read likelihoods for each read/haplotype pair.

  7. Performs mutation calling by summing the genotype probabilities across all read/haplotype pairs.

  8. Performs additional filtering to improve variant calling accuracy, including using a systematic noise file. The systematic noise file indicates the statistical probability of noise at specific positions in the genome. This noise file is constructed using clean (normal) samples. Regions where noise is common (e.g., difficult-to-map regions) have higher noise values. The small variant caller penalizes those regions to reduce the probability of making false positive calls.

Somatic mode

Copy Number Variant Calling

The DRAGEN copy number variant caller performs amplification, reference, and deletion calling for CNV targets within the assay. It counts the coverage of each target interval on the panel, normalizes the target counts against a preprocessed panel of normal samples, and corrects for GC coverage bias. It then scores candidate CNV events from the observed coverage and makes copy number calls.
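The normalization step can be pictured with a small sketch. This is an illustrative simplification, not the DRAGEN algorithm: it scales each sample by its median target count, divides by the per-target median of a panel of normals, and takes log2. GC bias correction and event scoring are omitted, the `threshold` value is hypothetical, and every target is assumed to have nonzero coverage.

```python
import math
from statistics import median


def log2_ratios(sample_counts, panel_of_normals):
    """Per-target log2 coverage ratio of a sample against a panel of normals.

    sample_counts: read counts per target interval.
    panel_of_normals: one list of per-target counts for each normal sample.
    Assumes nonzero counts; GC correction is not modeled here.
    """
    def scale(counts):
        center = median(counts)
        return [c / center for c in counts]

    sample = scale(sample_counts)
    normals = [scale(counts) for counts in panel_of_normals]
    ratios = []
    for i, value in enumerate(sample):
        baseline = median(n[i] for n in normals)
        ratios.append(math.log2(value / baseline))
    return ratios


def call_state(ratio, threshold=0.5):
    """Label a target as AMP, DEL, or REF using a hypothetical threshold."""
    if ratio >= threshold:
        return "AMP"
    if ratio <= -threshold:
        return "DEL"
    return "REF"


if __name__ == "__main__":
    sample = [100, 200, 100, 50]
    normals = [[100, 100, 100, 100], [110, 110, 110, 110]]
    for ratio in log2_ratios(sample, normals):
        print(f"{ratio:+.2f} {call_state(ratio)}")
```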

Absolute Copy Numbers (ABCN)

Loss of Heterozygosity

Structural Variant Calling

Variant Deduplication

Contamination Detection

The contamination analysis step detects foreign human DNA contamination using the SNP error file and pileup file that are generated during small variant calling, together with the TMB trace file. The software uses the contamination score to determine whether a sample contains foreign DNA. In contaminated samples, the variant allele frequencies at SNPs shift away from the expected values of 0%, 50%, or 100%. The algorithm collects all positions that overlap with common SNPs and have variant allele frequencies of < 25% or > 75%. Then, the algorithm computes the likelihood that each position is an error or a real mutation. The contamination score is the sum of the log likelihood scores across the predefined SNP positions that have a minor allele frequency < 25% in the sample and are not likely to result from CNV events.

The larger the contamination score, the more likely it is that the sample contains foreign DNA. A sample is considered contaminated if the contamination score is above a predefined quality threshold. The contamination score was found to be high in samples with highly rearranged genomes and in HRD samples. In 1% of HRD samples, the score was above the threshold with no evidence of actual contamination.
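The sketch below illustrates the general shape of such a score. It is a stand-in, not the DRAGEN model: the binomial log likelihood ratio, the `error_rate` value, and the function name are assumptions, and the exclusion of CNV-affected positions and the final threshold are not modeled.

```python
import math


def contamination_score(pileup, error_rate=0.001, max_minor_fraction=0.25):
    """Sum a per-site log likelihood ratio over common-SNP positions.

    pileup: iterable of (alt_count, depth) at predefined common SNP sites.
    Sites whose minor allele fraction is 0 or >= max_minor_fraction look like
    clean homozygous or heterozygous genotypes and are skipped. For the rest,
    the ratio compares "real allele at the observed fraction" against
    "sequencing error at error_rate" (an assumed, simplified model).
    """
    score = 0.0
    for alt_count, depth in pileup:
        if depth == 0:
            continue
        minor = min(alt_count, depth - alt_count)
        fraction = minor / depth
        if minor == 0 or fraction >= max_minor_fraction:
            continue
        score += minor * math.log(fraction / error_rate)
        score += (depth - minor) * math.log((1 - fraction) / (1 - error_rate))
    return score


if __name__ == "__main__":
    clean = [(0, 100), (50, 100), (100, 100)]
    shifted = [(5, 100), (95, 100), (50, 100)]
    print(contamination_score(clean))
    print(contamination_score(shifted))
```

Sites at 0%, 50%, or 100% contribute nothing, so a clean sample scores zero, while sites shifted to 5% or 95% raise the score.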

Annotation

The Illumina Annotation Engine performs annotation of small variants and CNVs. The inputs are gVCF files and the outputs are annotated JSON files.

The Illumina Annotation Engine processes each variant entry and annotates it with available information from databases such as dbSNP, gnomAD genome and exome, 1000 Genomes, ClinVar, COSMIC, RefSeq, and Ensembl. The header includes version information and general details. Each annotated variant is included as a nested dictionary structure on a separate line following the header.

Biomarkers

Tumor Mutational Burden

DRAGEN is used to compute tumor mutational burden (TMB) in coding regions where there is sufficient coverage.

The following variants are excluded from the TMB calculation:

  • Non-PASS variants.

  • Mitochondrial variants.

  • MNVs.

  • Variants that do not meet a minimum depth threshold.

  • Variants that do not meet the minimum variant allele threshold.

  • Variants that fall outside the eligible regions.

  • Tumor driver mutations. Variants with a population allele count ≥ 50 are treated as tumor driver mutations.

Germline variants are not counted towards TMB. Variants are determined as germline based on a database and a proxy filter.

Variants with a population allele count ≥ 10 that are observed in either the 1000 Genomes or gnomAD databases are marked as germline. MNVs, which do not count towards TMB, may be marked as germline when all their component small variants are marked as germline. The proxy filter scans the variants surrounding a specific variant and identifies those variants with similar variant allele frequencies (VAF). If the majority of surrounding variants of similar VAF are germline, then the variant is also marked as germline.
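The database and proxy filters can be sketched as follows. The allele count threshold of 10 comes from the text above; everything else (the `Variant` type, the `flank` size, the `vaf_tolerance` used to decide that two VAFs are similar) is a hypothetical choice for illustration rather than the DRAGEN implementation.

```python
from dataclasses import dataclass

GERMLINE_MIN_ALLELE_COUNT = 10  # threshold stated in the text


@dataclass
class Variant:
    pos: int
    vaf: float
    population_allele_count: int  # count observed in 1000 Genomes or gnomAD


def is_database_germline(variant):
    return variant.population_allele_count >= GERMLINE_MIN_ALLELE_COUNT


def is_proxy_germline(variants, index, flank=5, vaf_tolerance=0.05):
    """Mark a variant germline if most similar-VAF neighbors are germline.

    variants must be sorted by position. `flank` neighbors are inspected on
    each side; both parameters are assumptions made for this sketch.
    """
    target = variants[index]
    neighbors = variants[max(0, index - flank):index] + variants[index + 1:index + 1 + flank]
    similar = [n for n in neighbors if abs(n.vaf - target.vaf) <= vaf_tolerance]
    if not similar:
        return False
    germline = sum(is_database_germline(n) for n in similar)
    return germline > len(similar) / 2


def is_germline(variants, index):
    return is_database_germline(variants[index]) or is_proxy_germline(variants, index)


calls = [
    Variant(1000, 0.50, 250),
    Variant(1400, 0.48, 0),    # not in the database, but VAF matches its neighbors
    Variant(1900, 0.51, 1200),
    Variant(2300, 0.12, 0),    # low VAF with no similar neighbors
]

if __name__ == "__main__":
    for i, v in enumerate(calls):
        print(v.pos, "germline" if is_germline(calls, i) else "somatic candidate")
```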

The formula for TMB calculation is:

$$\mathrm{TMB} = \frac{\text{Filtered Variants}}{\text{Eligible Region Size (Mbp)}}$$

$$\text{Nonsynonymous TMB} = \frac{\text{Filtered Nonsynonymous Variants}}{\text{Eligible Region Size (Mbp)}}$$
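As a worked example of the TMB formulas above, the following sketch computes both values from variant counts and an eligible region size given in base pairs. The function name and the input numbers are illustrative assumptions, not values produced by the pipeline.

```python
def tumor_mutational_burden(filtered_variant_count, eligible_region_bp):
    """Return variants per megabase pair of eligible region."""
    if eligible_region_bp <= 0:
        raise ValueError("eligible region size must be positive")
    eligible_region_mbp = eligible_region_bp / 1_000_000
    return filtered_variant_count / eligible_region_mbp


if __name__ == "__main__":
    # Hypothetical inputs: 350 filtered variants, 245 of them nonsynonymous,
    # over a 35 Mbp eligible region.
    print(tumor_mutational_burden(350, 35_000_000))  # TMB
    print(tumor_mutational_burden(245, 35_000_000))  # nonsynonymous TMB
```

With these inputs, the TMB is 350 / 35 = 10 variants per Mbp and the nonsynonymous TMB is 245 / 35 = 7 variants per Mbp.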

Outputs are captured in a .tmb.trace.tsv file that contains information on variants used in the TMB calculation and a .tmb.metrics.json file that contains the TMB score calculation and configuration details.

Microsatellite Instability Status

DRAGEN can determine the MSI status of a sample. It uses a normal reference file, which was created from a set of normal samples. During sequencing, normal reference files are generated by tabulating read counts for each microsatellite site. The normal file contains the read count distribution for each microsatellite.

MSI calling for a tumor-only sample is performed by first tabulating tumor counts from the read alignments for each microsatellite site. Then, the Jensen-Shannon distance (JSD) is calculated between each pair of tumor and normal baseline samples. DRAGEN determines unstable sites by performing Chi-square testing of the tumor JSD and normal JSD distributions. A site is called unstable if the mean distance difference between the two JSD distributions is ≥ the distance threshold and the Chi-square p-value is ≤ the p-value threshold. Lastly, DRAGEN produces an MSI status from the assessed site count, the unstable site count, the percentage of unstable sites among all assessed sites, and the sum of the Jensen-Shannon distances of all the unstable sites.
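The Jensen-Shannon distance used in this step can be computed from two read count distributions as in the sketch below. This shows the standard definition (square root of the Jensen-Shannon divergence, base 2), assuming both inputs are counts over the same repeat-length bins; the Chi-square test and the DRAGEN thresholds are not reproduced here.

```python
import math


def jensen_shannon_distance(tumor_counts, normal_counts):
    """Jensen-Shannon distance (base 2) between two count distributions.

    Both inputs are read counts over the same repeat-length bins for one
    microsatellite site. The result is 0 for identical distributions and 1
    for distributions with no overlap.
    """
    if len(tumor_counts) != len(normal_counts):
        raise ValueError("distributions must use the same bins")

    def normalize(counts):
        total = sum(counts)
        if total == 0:
            raise ValueError("distribution has no reads")
        return [c / total for c in counts]

    p = normalize(tumor_counts)
    q = normalize(normal_counts)
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl_divergence(x, y):
        # Bins with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
    return math.sqrt(max(divergence, 0.0))


if __name__ == "__main__":
    normal = [2, 90, 8, 0]    # reads per repeat length, hypothetical
    tumor = [30, 40, 20, 10]  # broader distribution at the same site
    print(round(jensen_shannon_distance(tumor, normal), 3))
```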

HRD

Genomic instability score (GIS) is a whole-genome signature for homologous recombination deficiency. The GIS is the sum of three components: loss of heterozygosity, telomeric allele imbalance, and large-scale state transition. These components are estimated using the GIS algorithm contracted from Myriad Genetics, which takes as input the B-allele frequency and coverage across a genome-wide single nucleotide panel. A panel of normal samples is used for both bias reduction and normalization prior to GIS estimation. Final GIS results can be found in the *.gis.json file.

HLA Typing

Targeted Callers

Expansion Hunter

Variable Number Tandem Repeat (VNTR)

The example command in DNA Analysis Methods highlights the input and output used in the DragenCaller step of the software. The command can be found in the DRAGEN run log file. Any parameter not shown on the command line uses the default value for the corresponding DRAGEN module. The detailed parameters and default arguments for the individual modules within the DragenCaller step can be found in the replay.json output. See DRAGEN Command Line Options for detailed explanations of the parameters.

In the DRAGEN Map/Aligner step, DNA alignment involves aligning sequencing reads derived from DNA libraries to a reference genome prior to variant calling.

The software currently supports both tumor and normal samples with UMI. See DRAGEN DNA Pipeline UMI for details on the options.

Additional information on small variant calling and filtering is available at DRAGEN DNA Pipeline Small Variant Calling.

In somatic mode, the DRAGEN Somatic Pipeline supports both matched tumor-normal pairs and tumor-only samples. The germline mode of the small variant caller is used to analyze the normal sample in the matched pair. Additional information is available at DRAGEN DNA Pipeline Small Variant Calling.

Absolute copy numbers (ABCN) are calculated by the CNV ASCN Caller. See ASCN Caller.

For loss of heterozygosity, more information is available at DRAGEN DNA Pipeline - LOH.

The DRAGEN Structural Variant (SV) Caller, the DUX4 rearrangement caller, and Variant Deduplication are each described in the DRAGEN DNA Pipeline documentation.

The database content included with Nirvana is listed in the Nirvana online documentation.

The pipeline currently does not support annotation of gVCF files. Use Illumina Connected Insights to perform tertiary analysis.

See DRAGEN DNA Pipeline - Biomarkers - TMB for details about the TMB biomarker analysis.

See DRAGEN DNA Pipeline - Biomarkers - MSI for details about the MSI biomarker analysis.

See DRAGEN DNA Pipeline - Biomarkers - HRD for details about the HRD biomarker analysis.

See DRAGEN DNA Pipeline - HLA Typing for details about HLA typing.

For the targeted callers, see DRAGEN DNA Pipeline - Targeted Callers.

For Expansion Hunter, see DRAGEN DNA Pipeline - Expansion Hunter.

For variable number tandem repeats, see DRAGEN DNA Pipeline - VNTR.

Figure 1. DRAGEN Variant Calling Workflow