Multi-Region Joint Detection

DRAGEN Multi-region Joint Detection (MRJD) is a de novo germline small variant caller for paralogous regions. In DRAGEN v4.4, MRJD covers regions that include six clinically relevant genes: NEB, TTN, SMN1/2, PMS2, STRC, and IKBKG. MRJD is compatible with the hg38, hg19, and GRCh37 reference genomes. The table below lists the hg38 coordinates of the regions covered by MRJD.

| Chromosome | Start | End | Description |
|---|---|---|---|
| chr2 | 151578759 | 151588523 | NEB exon 98-105 |
| chr2 | 151589318 | 151599076 | NEB exon 90-97 |
| chr2 | 151599871 | 151609628 | NEB exon 82-89 |
| chr2 | 178653238 | 178654995 | TTN exon 172-180 |
| chr2 | 178657498 | 178659255 | TTN exon 181-189 |
| chr2 | 178661759 | 178663516 | TTN exon 190-198 |
| chr5 | 70049522 | 70077596 | SMN2 |
| chr5 | 70924940 | 70953013 | SMN1 |
| chr7 | 5970924 | 5980896 | PMS2 exon 13-15 |
| chr7 | 5980968 | 5987689 | PMS2 exon 11-12 |
| chr7 | 6737007 | 6743712 | PMS2CL exon 2-3 |
| chr7 | 6743880 | 6753867 | PMS2CL exon 4-6 |
| chr15 | 43599563 | 43602630 | STRC exon 24-29 |
| chr15 | 43602982 | 43611000 | STRC exon 14-23 |
| chr15 | 43611040 | 43618800 | STRC exon 1-13 |
| chr15 | 43699379 | 43702452 | STRCP1 exon 23-28 |
| chr15 | 43702488 | 43710472 | STRCP1 exon 13-22 |
| chr15 | 43710502 | 43718262 | STRCP1 exon 1-12 |
| chrX | 154555884 | 154565047 | IKBKG exon 3-10 |
| chrX | 154639390 | 154648553 | IKBKGP1 |
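As a rough illustration of how the hg38 coordinates in the table above might be used downstream, the following Python sketch checks whether a position falls inside one of the MRJD regions. The helper name find_mrjd_region is hypothetical, and the sketch assumes that both Start and End are inclusive, which this page does not state.

```python
# hg38 MRJD regions copied from the table above:
# (chromosome, start, end, description).
MRJD_REGIONS_HG38 = [
    ("chr2", 151578759, 151588523, "NEB exon 98-105"),
    ("chr2", 151589318, 151599076, "NEB exon 90-97"),
    ("chr2", 151599871, 151609628, "NEB exon 82-89"),
    ("chr2", 178653238, 178654995, "TTN exon 172-180"),
    ("chr2", 178657498, 178659255, "TTN exon 181-189"),
    ("chr2", 178661759, 178663516, "TTN exon 190-198"),
    ("chr5", 70049522, 70077596, "SMN2"),
    ("chr5", 70924940, 70953013, "SMN1"),
    ("chr7", 5970924, 5980896, "PMS2 exon 13-15"),
    ("chr7", 5980968, 5987689, "PMS2 exon 11-12"),
    ("chr7", 6737007, 6743712, "PMS2CL exon 2-3"),
    ("chr7", 6743880, 6753867, "PMS2CL exon 4-6"),
    ("chr15", 43599563, 43602630, "STRC exon 24-29"),
    ("chr15", 43602982, 43611000, "STRC exon 14-23"),
    ("chr15", 43611040, 43618800, "STRC exon 1-13"),
    ("chr15", 43699379, 43702452, "STRCP1 exon 23-28"),
    ("chr15", 43702488, 43710472, "STRCP1 exon 13-22"),
    ("chr15", 43710502, 43718262, "STRCP1 exon 1-12"),
    ("chrX", 154555884, 154565047, "IKBKG exon 3-10"),
    ("chrX", 154639390, 154648553, "IKBKGP1"),
]


def find_mrjd_region(chrom, pos):
    """Return the description of the MRJD region containing pos, or None.

    Assumption: start and end are both treated as inclusive.
    """
    for region_chrom, start, end, description in MRJD_REGIONS_HG38:
        if chrom == region_chrom and start <= pos <= end:
            return description
    return None


print(find_mrjd_region("chr5", 70925000))  # SMN1
print(find_mrjd_region("chr7", 5980900))   # None (gap between two PMS2 regions)
```

A linear scan is enough for 20 regions; an interval index would only matter for much larger region sets.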

MRJD method

MRJD is a variant calling method that is designed to detect de novo germline small variants in paralogous regions of the genome. A conventional variant caller relies on the read aligner to determine which reads likely originated from a given location. This method works well when the region of interest does not resemble any other region of the genome over the span of a single read (or a pair of reads for paired-end sequencing). However, a significant fraction of the human genome does not meet this criterion. At least 5% of the human genome consists of segmental duplications. Many regions of the genome have near-identical copies elsewhere, and as a result, the true source location of a read might be subject to considerable uncertainty. If a group of reads is mapped with low confidence, a conventional variant caller might ignore the reads, even though they contain useful information. If a read is mismapped (i.e., the primary alignment is not the true source of the read), it can result in variant detection errors.

MRJD is designed to address the complexities introduced by segmental duplications. Instead of considering each region in isolation, MRJD considers all locations from which a group of reads may have originated and attempts to detect the underlying sequences jointly across all paralogous regions in the sample of interest.

Below is a diagram showing the general workflow of MRJD in a pair of paralogous regions. MRJD takes primary alignments in all paralogous regions, regardless of mapping quality, builds and places all copies in a pair of paralogous regions based on reads and prior knowledge, calls small variants based on the placed copies, and outputs final genotypes.

Figure 1. MRJD Caller workflow.

Two modes of the MRJD Caller

The DRAGEN MRJD Caller has two modes: default mode and high sensitivity mode. The differences between the two modes are described below.

Default mode

With --enable-mrjd=true, the MRJD Caller will report the following two types of variants:

  1. Uniquely placed variants, which means the variant is found and placed in one of the paralogous regions without ambiguity. These variants will be labeled as "UNIQUELY_PLACED" in the VCF INFO field.

  2. Region-ambiguous variants. In this case, the aggregated genotype contains a variant allele with high confidence, but MRJD Caller is unable to place the variant allele in one of the paralogous regions with high confidence. The MRJD Caller will report the variant allele in all paralogous regions. These variants will be labeled as "REGION_AMBIGUOUS" in the VCF INFO field.

High Sensitivity mode

With both --enable-mrjd=true and --mrjd-enable-high-sensitivity-mode=true, the MRJD Caller reports the same variants as in default mode, plus two additional types of variants:

  1. Positions where the reference alleles in all paralogous regions are not the same. It is well established that gene conversion, including reciprocal crossover, is a common event between paralogous regions (such as PMS2 and PMS2CL). When a reciprocal crossover event occurs, the prior model, without nearby phasing information, might place the converted haplotype in the source region instead of the destination region, which results in no variant call. The high sensitivity mode compensates for this by reporting the variant at the corresponding positions in all paralogous regions. These variants will be labeled as "MRJD_HS;REF_DIFF_SITE" in the VCF INFO field.

  2. Variants that have been placed uniquely in one of the paralogous regions, with no variant at the corresponding position in the other region. The high sensitivity mode also reports the variant in the remaining paralogous regions. This compensates for the fact that the prior knowledge used to help place the variant is sometimes insufficient or estimated incorrectly. In those cases, the variant allele still exists but is placed in the wrong paralogous region. Reporting the variant in the other paralogous regions therefore helps maximize sensitivity even when the prior is inadequate. These variants will be labeled as "MRJD_HS;ALT_LOCATION" in the VCF INFO field.
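The four labels described for the two modes can be used to group MRJD calls when reviewing results. The following Python sketch classifies a VCF INFO string by those labels. It assumes the labels appear as tokens in the INFO column separated by ";", "=", or ","; the exact INFO layout is not specified on this page. The function name and the example INFO strings are hypothetical.

```python
import re
from collections import Counter


def mrjd_call_type(info):
    """Map a VCF INFO string to one of the MRJD call types described above.

    Assumption: the labels appear as tokens in INFO, as either keys or values.
    """
    tokens = set(re.split(r"[;=,]", info))
    if "MRJD_HS" in tokens and "REF_DIFF_SITE" in tokens:
        return "high_sensitivity_ref_diff_site"
    if "MRJD_HS" in tokens and "ALT_LOCATION" in tokens:
        return "high_sensitivity_alt_location"
    if "REGION_AMBIGUOUS" in tokens:
        return "region_ambiguous"
    if "UNIQUELY_PLACED" in tokens:
        return "uniquely_placed"
    return "unknown"


# Hypothetical INFO values for illustration only;
# real records may carry additional keys.
example_infos = [
    "UNIQUELY_PLACED",
    "REGION_AMBIGUOUS",
    "MRJD_HS;REF_DIFF_SITE",
    "MRJD_HS;ALT_LOCATION",
    "UNIQUELY_PLACED",
]

counts = Counter(mrjd_call_type(info) for info in example_infos)
for call_type, count in sorted(counts.items()):
    print(call_type, count)
```

Only the first two call types are expected in default mode; the two high sensitivity types appear only when --mrjd-enable-high-sensitivity-mode is set to true.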

Running DRAGEN MRJD

The MRJD Caller is disabled by default and requires WGS data aligned to the hg38, hg19, or GRCh37 human reference genome.

Here is the list of options related to MRJD.

  • --enable-mrjd If set to true, MRJD is enabled for the DRAGEN pipeline.

  • --mrjd-enable-high-sensitivity-mode If set to true, MRJD high sensitivity mode is enabled for the DRAGEN pipeline (default = ‘false’). See the previous section for the variant types reported in default mode and high sensitivity mode.
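For pipelines that assemble DRAGEN command lines programmatically, the two options above can be wrapped in a small helper. This is a minimal Python sketch; the function name mrjd_options is hypothetical, and it emits only the two MRJD flags listed above, leaving the rest of the command line to the caller.

```python
import shlex


def mrjd_options(high_sensitivity=False):
    """Return the MRJD-related arguments as a list of strings.

    Only the two options documented above are used.
    """
    args = ["--enable-mrjd", "true"]
    if high_sensitivity:
        args += ["--mrjd-enable-high-sensitivity-mode", "true"]
    return args


# Default mode: only --enable-mrjd is set.
print(shlex.join(mrjd_options()))

# High sensitivity mode: both flags are set.
print(shlex.join(mrjd_options(high_sensitivity=True)))
```

Returning a list rather than a single string makes it straightforward to append these arguments to an existing argument list before launching the process.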

The following command-line example uses FASTQ input and runs MRJD Caller with high sensitivity mode:

dragen \
  -r <REF> \
  -1 <FQ1> \
  -2 <FQ2> \
  --RGID <RG> --RGSM <SM> \
  --output-dir <OUTPUT> \
  --output-file-prefix <PREFIX> \
  --enable-map-align=true \
  --enable-map-align-output=true \
  --enable-sort=true \
  --enable-duplicate-marking=true \
  --enable-mrjd true \
  --mrjd-enable-high-sensitivity-mode true

The following command-line example uses BAM input that has already been aligned and runs MRJD Caller with high sensitivity mode:

dragen \
  -r <REF> \
  -b <BAM> \
  --output-dir <OUTPUT> \
  --output-file-prefix <PREFIX> \
  --enable-map-align=false \
  --enable-mrjd true \
  --mrjd-enable-high-sensitivity-mode true

Example WGS workflow that includes both DRAGEN Small Variant Caller and MRJD

Starting with DRAGEN v4.4, MRJD can run together with the DRAGEN Small Variant Caller in the same DRAGEN run. The following example command line runs DNA Mapping with FASTQ files as input, followed by Small Variant Calling and MRJD.

# run DNA Mapping, Small Variant Calling, and MRJD
dragen \
  -r <HASH_TABLE> \
  -1 <FQ1> \
  -2 <FQ2> \
  --RGID <RG> --RGSM <SM> \
  --output-dir <OUTPUT_DIRECTORY> \
  --output-file-prefix <PREFIX> \
  --enable-map-align true \
  --enable-map-align-output true \
  --enable-sort true \
  --enable-duplicate-marking true \
  --enable-variant-caller true \
  --enable-mrjd true \
  --mrjd-enable-high-sensitivity-mode true

Output format

The MRJD Caller generates a <sample>.mrjd.hard-filtered.vcf.gz file in the output directory. The output is a compressed VCF v4.2 file that contains the small variants from the identified genotype.

Uniquely placed call

The following is an example of the output format for a uniquely placed variant. The DRAGENHardQual filter is applied to a record if the variant has a QUAL < 3.00.

Figure 2. VCF output format example for uniquely placed call.

Non-uniquely-placed call

For variants that are not uniquely placed, including region-ambiguous variants from default mode and all variants from high sensitivity mode, the MRJD Caller also reports variants in diploid genotype format, which can be interpreted in the same way as a uniquely placed variant (the genotype is region-specific instead of being an aggregate across all regions). In this format, QUAL represents the Phred-scaled quality score for the assertion made in ALT (i.e., −10 log10 prob(GT==0/0)). Note that the QUAL score will be equal to or less than 3 (if the QUAL > 3, then the call should be uniquely placed).
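To make the Phred scaling above concrete, the following Python sketch converts between P(GT==0/0) and QUAL using the formula QUAL = −10 log10 prob(GT==0/0). The function names are hypothetical. The worked values show that a QUAL of about 3 corresponds to roughly even odds that the genotype is homozygous reference.

```python
import math


def qual_from_prob(prob_hom_ref):
    """Phred-scaled QUAL for a given probability that GT == 0/0."""
    if not 0.0 < prob_hom_ref <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10.0 * math.log10(prob_hom_ref)


def prob_from_qual(qual):
    """Inverse conversion: probability that GT == 0/0 for a given QUAL."""
    return 10.0 ** (-qual / 10.0)


print(round(qual_from_prob(0.5), 2))    # 3.01
print(round(prob_from_qual(3.0), 3))    # 0.501
print(round(qual_from_prob(0.001), 2))  # 30.0
```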

QUAL, GT, GQ, and PL are reported in the same way as by the DRAGEN germline variant caller. To avoid losing information about the aggregated genotype across paralogous regions, the MRJD Caller also reports the genotype, Phred-scaled quality score, and Phred-scaled genotype likelihoods of the aggregated genotype as JGT, JQL, and JPL in the FORMAT column.
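The region-specific and aggregated fields can be separated by pairing the FORMAT column with the sample column, as in any VCF. The following Python sketch does this for a single record. The FORMAT string and all sample values are invented for illustration, and the actual field order and encoding of JGT, JQL, and JPL in MRJD output may differ, so the values are kept as raw strings.

```python
def sample_fields(format_column, sample_column):
    """Pair colon-separated FORMAT keys with the sample's values."""
    keys = format_column.split(":")
    values = sample_column.split(":")
    return dict(zip(keys, values))


# Hypothetical FORMAT and sample columns, invented for illustration only.
fmt = "GT:GQ:PL:JGT:JQL:JPL"
sample = "0/1:2:3,0,20:0/0/0/1:45:60,0,55,70,90"

fields = sample_fields(fmt, sample)

# Region-specific diploid call, interpreted like a uniquely placed variant.
region_specific = {key: fields[key] for key in ("GT", "GQ", "PL")}

# Aggregated genotype across all paralogous regions.
aggregated = {key: fields[key] for key in ("JGT", "JQL", "JPL")}

print(region_specific)
print(aggregated)
```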

Figure 3. VCF output format example for region-ambiguous call.

Using the combined Small Variant Caller and MRJD workflow, two VCF files are created: <sample>.hard-filtered.vcf.gz by the DRAGEN Small Variant Caller and <sample>.mrjd.hard-filtered.vcf.gz by DRAGEN MRJD. To help users obtain a single VCF file for downstream analysis, a utility tool is available that replaces the DRAGEN Small Variant Caller output in the homologous regions of the six medically relevant and challenging genes with the MRJD Caller output. The tool also annotates the calls made by MRJD with an "MRJD" tag in the INFO column. Refer to the DRAGEN Software Support Site page to download the utility tool.