DRAGEN MRD Pipeline


Last updated 2 days ago

The DRAGEN MRD (Minimal Residual Disease) pipeline detects residual cancer cells in solid tumors, enabling the monitoring of treatment efficacy and disease progression. This pipeline utilizes a tumor-informed Whole Genome Sequencing (WGS) approach. To detect trace ctDNA in plasma, analysis targets sites and alleles identified as somatic variants in the patient's initial tumor (the tumor fingerprint). Due to the need for significantly higher sensitivity compared to standard ctDNA variant calling, a dedicated application is required to detect these rare molecules (down to tumor fractions as low as 10^-4). The MRD Detect component provides ultra-sensitive detection of tumor ctDNA and generates multiple quality control (QC) metrics that can be used to assess the validity of the results.

Initial Diagnosis:

At initial diagnosis, a solid tumor biopsy and a matched normal sample are collected. The DRAGEN small variant caller identifies specific genetic mutations (SNVs) unique to the patient's cancer from this matched sample pair. This set of unique markers constitutes the "tumor fingerprint." It is recommended to prepare libraries with greater than 80X average tumor coverage and greater than 30X average normal coverage. Tumor samples can be FFPE (Formalin-Fixed Paraffin-Embedded) or fresh frozen. Buffy Coat (BC) matched normal samples are recommended.

Follow-up Plasma Samples:

After treatment (e.g., surgery, chemotherapy, stem cell transplant), follow-up plasma samples are collected at various time points to detect residual cancer cells. The tumor fingerprint from the initial diagnosis is used to target the variant sites where residual disease is assessed. Follow-up samples are also evaluated against QC thresholds to ensure sufficient quality. An inter-sample contamination detection step is included to identify potential sample contamination. It is recommended to sequence plasma samples at approximately 50X average WGS coverage.
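The coverage recommendation can be put in perspective with some rough arithmetic. The sketch below is illustrative only: it treats 10^-4 as the per-site variant allele fraction, and the figure of 10,000 fingerprint sites is a hypothetical value chosen for the example, not a number from this guide.

```python
def expected_variant_reads(num_sites, mean_depth, variant_fraction):
    """Expected number of variant-supporting reads summed over all sites."""
    return num_sites * mean_depth * variant_fraction


# At ~50X coverage a single site is very unlikely to show any variant read.
per_site = expected_variant_reads(num_sites=1, mean_depth=50, variant_fraction=1e-4)

# HYPOTHETICAL fingerprint size, used only to illustrate aggregation.
across_sites = expected_variant_reads(num_sites=10_000, mean_depth=50, variant_fraction=1e-4)

print(f"per site: {per_site:.3f} reads; across all sites: {across_sites:.0f} reads")
```

Under these assumptions a single site carries far less than one expected variant read, while the signal summed over many fingerprint sites becomes countable. This is why the analysis targets the whole tumor fingerprint rather than individual variants.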

Pipeline

The DRAGEN MRD pipeline does not include a pre-built workflow script, but rather defines the required computational steps. Data management, sample tracking, and workflow scripts are left to the user.

BCL demultiplexing must be completed prior to running the pipeline to ensure that sample-specific FASTQs (or BAMs/CRAMs) are available as input to the pipeline.

The following table lists the main DRAGEN steps.

| Step | Description |
| --- | --- |
| Fingerprint generation | Run the somatic small variant caller on the matched tumor-normal sample pair to generate a one-time fingerprint VCF. |
| Germline variant calling | Run the germline small variant caller on the normal sample to generate a germline VCF file that will be used during subsequent QC steps. |
| Sample matching | Compare the normal sample germline VCF to the somatic BAM from the fingerprint step. Identify mismatched T/N sample pairs. |
| MRD detect | Run the MRD module on the plasma sample to detect residual disease. |
| Contamination detection | Run the MRD module on the plasma sample to detect human-to-human cross-sample contamination. |
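Because workflow scripting is left to the user, it can help to record which step consumes which output. The following is a minimal sketch, not an Illumina-provided workflow: the step names and the dependency map are assumptions read off the step descriptions above, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each step maps to the steps whose outputs it consumes (assumed from the
# step descriptions; the names are illustrative, not DRAGEN identifiers).
STEP_DEPENDENCIES = {
    "fingerprint_generation": set(),
    "germline_variant_calling": set(),
    # Needs the tumor BAM (fingerprint step) and the normal germline VCF.
    "sample_matching": {"fingerprint_generation", "germline_variant_calling"},
    # Needs the fingerprint VCF.
    "mrd_detect": {"fingerprint_generation"},
    # Needs the normal germline VCF to exclude the patient's own variants.
    "contamination_detection": {"germline_variant_calling"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps can be run."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(STEP_DEPENDENCIES)
print(order)
```

Fingerprint generation and germline variant calling run once per patient, whereas MRD detect and contamination detection are repeated for every follow-up plasma sample.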

Fingerprint Generation

The DRAGEN somatic Tumor/Normal small variant caller pipeline is used to generate a fingerprint VCF. The setting --mrd-fingerprint=true enables the small variant caller and also activates additional strict filters, such as more aggressive read position filtering. This helps to reduce false positive SNVs in the fingerprint.

DRAGEN Fingerprint generation command line:

/opt/dragen/$VERSION/bin/dragen      # DRAGEN install path 
--ref-dir $REF_DIR                   # path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH     
--output-file-prefix $PREFIX 
# Inputs (e.g. FQ lists) 
--tumor-fastq-list $PATH              
--tumor-fastq-list-sample-id $STRING 
--fastq-list $PATH                    
--fastq-list-sample-id $STRING 
# Mapper 
--enable-map-align true              # optional for BAM/CRAM input 
--enable-map-align-output true       # optionally save the output BAM 
--enable-duplicate-marking true      # default=true
--Aligner.hard-clips=7               # remove any soft clips; this further helps reduce FP calls
# Small variant caller
--mrd-fingerprint=true 
--vc-target-bed $HIGH_CONFIDENCE_REGIONS 
--vc-systematic-noise $PATH          # e.g. FFPE_WGS_hg38_v2.0.0_systematic_noise.snv.bed.gz
# Annotation 
--enable-variant-annotation=true 
--variant-annotation-data=$PATH
--vc-enable-germline-tagging=true
# QC
--qc-detect-contamination=true
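The annotated listing above shows one option per line and is not directly executable. As a minimal sketch, a user-written wrapper could assemble the same options into an argument list. The function name and all file paths below are hypothetical, only options shown in the listing are used, and the annotation and intermediate-directory options are left out for brevity.

```python
import shlex


def build_fingerprint_command(dragen_bin, ref_dir, output_dir, prefix,
                              tumor_fastq_list, tumor_sample_id,
                              normal_fastq_list, normal_sample_id,
                              target_bed, noise_bed):
    """Return the fingerprint-generation command as an argument list."""
    return [
        dragen_bin,
        "--ref-dir", ref_dir,
        "--output-directory", output_dir,
        "--output-file-prefix", prefix,
        # Tumor and matched normal inputs
        "--tumor-fastq-list", tumor_fastq_list,
        "--tumor-fastq-list-sample-id", tumor_sample_id,
        "--fastq-list", normal_fastq_list,
        "--fastq-list-sample-id", normal_sample_id,
        # Mapper
        "--enable-map-align", "true",
        "--enable-map-align-output", "true",
        "--enable-duplicate-marking", "true",
        "--Aligner.hard-clips=7",
        # Small variant caller
        "--mrd-fingerprint=true",
        "--vc-target-bed", target_bed,
        "--vc-systematic-noise", noise_bed,
        # QC
        "--qc-detect-contamination=true",
    ]


# All paths and sample IDs below are placeholders.
command = build_fingerprint_command(
    dragen_bin="/opt/dragen/VERSION/bin/dragen",
    ref_dir="/refs/hg38_linear",
    output_dir="/out/fingerprint",
    prefix="patient01",
    tumor_fastq_list="tumor_fastq_list.csv",
    tumor_sample_id="patient01_tumor",
    normal_fastq_list="normal_fastq_list.csv",
    normal_sample_id="patient01_normal",
    target_bed="high_confidence_regions.bed",
    noise_bed="FFPE_WGS_hg38_v2.0.0_systematic_noise.snv.bed.gz",
)
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems if it is later passed to `subprocess.run`, and `shlex.join` prints a copy-pasteable form for logging.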

It is recommended to use --vc-target-bed or --vc-excluded-regions-bed $BED to limit fingerprint calls to high-confidence regions. In general, construct a BED file that covers only easily mapped regions and excludes Alu elements and other highly repetitive regions, where recurring noise tends to be more frequent.

| Prebuilt WES/WGS noise files | Description |
| --- | --- |
| WGS_hg38_v2.0.0_systematic_noise.snv.bed.gz | For WGS fresh frozen (FF) |
| FFPE_WGS_hg38_v2.0.0_systematic_noise.snv.bed.gz | For WGS FFPE (only hg38) |

Germline Variant Calling

Run the DRAGEN germline small variant caller pipeline on the normal sample to generate a germline small variant VCF. This VCF will be used in downstream QC steps, including sample matching and the plasma cross-sample contamination detection steps.

DRAGEN germline small variant calling command line:

/opt/dragen/$VERSION/bin/dragen      # DRAGEN install path 
--ref-dir $REF_DIR                   # path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH     
--output-file-prefix $PREFIX 
# Inputs (e.g. FQ list) 
--fastq-list $PATH                    
--fastq-list-sample-id $STRING 
# Mapper 
--enable-map-align true              # optional for BAM/CRAM input 
--enable-map-align-output false      # optionally save the output BAM (not required)
--enable-duplicate-marking true      # default=true
# Small variant caller
--enable-variant-caller true

Sample Matching

The tumor BAM from the fingerprint generation step can be compared to the normal sample germline VCF to confirm that the tumor and normal samples come from the same individual.

Example sample matching command line:

/opt/dragen/$VERSION/bin/dragen     
--ref-dir $REF_DIR                     # path to DRAGEN linear hashtable 
-b $TUMOR_SAMPLE_BAM
--output-directory $PATH 
--output-file-prefix $STRING
--enable-checkfingerprint true
--checkfingerprint-expected-vcf $VCF   # Normal sample germline VCF

MRD Detect

MRD detection is based on observing variants at the fingerprint locations identified in the fingerprint step above. For this step, lower quality reads, such as those with low Phred scores or where data from only one read direction is available, are removed from consideration. The number of variant signals seen at the patient-specific fingerprint sites is counted and compared to a statistical noise model. When the "signal" at the fingerprint sites significantly exceeds the number expected from sequencing "noise", a detection call is made. In this step, sample-specific noise is estimated based on dynamically generated noise sites. When all samples in this process meet QC criteria, the algorithm produces a statistical "score". A sufficiently large score is indicative of the presence of tumor DNA in the plasma sample.

Example MRD detect command line:
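The signal-versus-noise idea can be illustrated with a toy calculation. This is not the DRAGEN noise model, whose details are not described here; it is a simple Poisson-style excess statistic chosen for illustration, and its values are not comparable to the DRAGEN "score" or to --mrd-score-threshold. All read counts are made up.

```python
import math


def toy_excess_z(fingerprint_alt_reads, fingerprint_total_reads,
                 noise_alt_reads, noise_total_reads):
    """Toy z-like statistic: variant reads at fingerprint sites vs. expected noise.

    The noise rate is estimated from separate noise sites, scaled to the number
    of reads covering fingerprint sites, and compared to the observed count
    under a Poisson approximation (variance equals the mean).
    """
    if fingerprint_total_reads <= 0 or noise_total_reads <= 0:
        raise ValueError("read totals must be positive")
    noise_rate = noise_alt_reads / noise_total_reads
    expected_noise = noise_rate * fingerprint_total_reads
    if expected_noise <= 0:
        raise ValueError("need at least one noise read to estimate the noise rate")
    return (fingerprint_alt_reads - expected_noise) / math.sqrt(expected_noise)


# Made-up counts: 60 variant reads observed where about 20 are expected from noise.
z = toy_excess_z(fingerprint_alt_reads=60, fingerprint_total_reads=400_000,
                 noise_alt_reads=50, noise_total_reads=1_000_000)
print(f"toy excess statistic: {z:.2f}")
```

The point of the sketch is only that a detection call depends on how far the fingerprint-site count exceeds what the sample's own noise level predicts, not on the raw count itself.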

/opt/dragen/$VERSION/bin/dragen      # DRAGEN install path 
--ref-dir $REF_DIR                   # path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH     
--output-file-prefix $PREFIX 
# Inputs (e.g. FQ list) 
--fastq-list $PATH                    
--fastq-list-sample-id $STRING
# Mapper 
--enable-map-align true              # optional for BAM/CRAM input 
--enable-map-align-output true       # optionally save the output BAM (not required)
--enable-duplicate-marking true      # default=true
# MRD settings
--enable-mrd=true
--mrd-probes-file $FINGERPRINT_VCF 
--mrd-stats-mode production
--mrd-score-threshold $INT           # expected to be in the range 4 - 7 

The command-line parameters that control MRD detect are:

| Parameter Name | Description |
| --- | --- |
| --enable-mrd | Enables MRD detect. Default = "false". |
| --mrd-probes-file | Path to the patient fingerprint file (VCF). |
| --mrd-score-threshold | Threshold used to determine the presence/absence of tumor DNA. |
| --mrd-stats-mode | Set to "production" to report the "eVAF" (estimated VAF) and "score" values in the .mrd_summary.json output file. |

The MRD detect module writes an output summary file to the standard DRAGEN output directory, named with the output file prefix followed by .mrd_summary.json. The file is a valid JSON file that contains an array of JSON objects. DRAGEN supports running one sample at a time, so the array will be of length one.

When --mrd-stats-mode=production is set, the JSON objects will contain at a minimum the following key-value pairs:

  • Run[i].TumorEstimate.illumina.eVAF

  • Run[i].TumorEstimate.illumina.score

The "eVAF" is the estimated fraction of cancer DNA in the sample.

The "score" can be used to indicate the presence or absence of residual cancer cells. A higher score indicates that the presence of cancer cells is more likely. The exact score threshold used to call presence may depend on sample quality and coverage, and can be optimized for a specific pipeline. It is expected that this threshold will typically lie between 4 and 7.
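Reading the summary file in a downstream script could look like the sketch below. The exact file layout is an assumption: the key paths above are interpreted as a top-level "Run" array whose entries hold a nested "TumorEstimate" object. The example values and the use of "greater than or equal to" for the threshold comparison are also illustrative choices, not documented behavior.

```python
import json


def read_mrd_summary(json_text, score_threshold):
    """Extract eVAF and score for each run and compare the score to a threshold."""
    summary = json.loads(json_text)
    # ASSUMPTION: "Run" is a top-level key holding the array of run objects.
    results = []
    for run in summary["Run"]:
        estimate = run["TumorEstimate"]["illumina"]
        results.append({
            "eVAF": estimate["eVAF"],
            "score": estimate["score"],
            # ASSUMPTION: a score at or above the threshold counts as detected.
            "detected": estimate["score"] >= score_threshold,
        })
    return results


# Made-up content standing in for the .mrd_summary.json file.
example = '{"Run": [{"TumorEstimate": {"illumina": {"eVAF": 0.0002, "score": 9.1}}}]}'

for result in read_mrd_summary(example, score_threshold=5):
    print(result)
```

In practice the text would come from the summary file in the DRAGEN output directory, and the threshold would be the value tuned for the specific pipeline.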

Plasma Contamination Detection

The MRD module provides a highly sensitive method for detecting human-to-human cross-contamination, surpassing the detection capabilities of the default DRAGEN contamination module. This enhanced sensitivity allows for the identification of even very low levels of contamination. However, it should be noted that unlike the standard DRAGEN contamination module, this module does not inherently adjust for VAF distortions that can occur in CNV-rich somatic samples. Therefore, it is recommended to include a safety margin when considering reported contamination levels.

The module requires a Germline VCF and the plasma FQ files as input. The Germline VCF was generated in an earlier step using the matched normal sample (Buffy Coat). The detection module then analyzes pileups at loci with high population allele frequencies (approximately 50%), after excluding any germline variant sites known to be present in the primary sample. By doubling the estimated variant allele frequency (VAF) at these loci, the module approximates the fractional foreign contamination.

Example plasma contamination detection command line:

/opt/dragen/$VERSION/bin/dragen           # DRAGEN install path 
--ref-dir $REF_DIR                        # path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH     
--output-file-prefix $PREFIX 
# Inputs (e.g. FQ list) 
--fastq-list $PATH                    
--fastq-list-sample-id $STRING
# Mapper 
--enable-map-align true                   # optional for BAM/CRAM input 
--enable-map-align-output true            # optionally save the output BAM (not required)
--enable-duplicate-marking true           # default=true
# MRD settings
--enable-mrd=true
--mrd-probes-file=$COMMON_GERMLINE_VCF    # VCF with common germline sites (population allele freq. ~50%)
--mrd-blocklist=$NORMAL_SAMPLE_VCF        # Normal sample germline VCF
--mrd-stats-mode=production 

Similar to the MRD detect module, the output JSON will include two fields:

  • Run[i].TumorEstimate.illumina.eVAF

  • Run[i].TumorEstimate.illumina.score

The "eVAF" (estimated Variant Allele Frequency) can be used as a proxy for DNA contamination from individuals other than the patient of interest. Since the mrd-probes-file evaluates the signal at common germline sites (with population allele frequencies close to 50%), it is expected that only about half of the sites will actually overlap with germline sites from contaminating individuals. For this reason, the "eVAF" can be multiplied by 2 to get a more realistic point estimate of the amount of contaminating DNA.
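The doubling described above amounts to a single multiplication. The helper below is a small sketch; the function name and the example eVAF value are hypothetical.

```python
def estimate_contamination(evaf):
    """Point estimate of foreign DNA fraction from the eVAF at common germline sites.

    Only about half of the probed sites are expected to carry a variant in the
    contaminating individual, so the measured eVAF is doubled.
    """
    if evaf < 0:
        raise ValueError("eVAF must be non-negative")
    return 2.0 * evaf


# Made-up eVAF value of 0.1%.
print(f"estimated contamination: {estimate_contamination(0.001):.2%}")
```

Because this module does not adjust for VAF distortions in CNV-rich samples, the result is best read as a point estimate to which a safety margin is added.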

For DRAGEN MRD (as for all somatic runs), it is recommended to use the linear hashtable. DRAGEN hashtables can be downloaded from Product Files.

It is also recommended to use a systematic noise file to further reduce false positives. Prebuilt systematic noise BED files (WES and WGS) can be downloaded from Product Files.

For more information on systematic noise files, see SNV systematic noise. To download germline annotation files, refer to Nirvana.

For more details on sample matching (also referred to as DRAGEN checkfingerprint, not to be confused with the tumor fingerprint), refer to Sample Matching.