CYP21A2 Caller

The CYP21A2 Caller genotypes the CYP21A2 gene from whole-genome sequencing (WGS) data. Because CYP21A2 shares high sequence similarity with its pseudogene paralog CYP21A1P and is affected by a wide variety of common structural variants (SVs), a specialized caller is necessary to resolve variants.

The CYP21A2 calling workflow is broken up into the following major stages:

  1. Loading input configuration

  2. Processing read data

  3. Analyzing read data

Read data analysis is further split into the following steps:

  1. Determine total CYP21A2 and CYP21A1P copy number from read depth.

  2. Call small variants in CYP21A2 copies.

  3. Phase reads to detect common variants and recombination events.

  4. Identify most likely haplotypes.

The CYP21A2 Caller requires WGS data with at least 30x coverage, aligned to a human reference genome.

Total CYP21A2 and CYP21A1P Copy Number

The first step of CYP21A2 calling is to determine the combined copy number of CYP21A2 and CYP21A1P. Reads aligned to regions in either CYP21A2 or CYP21A1P are counted. The counts in each region are corrected for GC bias and then normalized to a diploid baseline. The GC-bias correction and normalization factors are determined from read counts in 3000 preselected 2 kb regions across the genome. These 3000 normalization regions were randomly selected from the portion of the reference genome that has stable coverage across population samples. The combined CYP21A2 and CYP21A1P copy number is then calculated from the average sequencing depth across the CYP21A2 and CYP21A1P regions.
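The normalization step above can be illustrated with a minimal Python sketch. It is not DRAGEN's implementation: it assumes GC-bias correction has already been applied, uses the median depth of the normalization regions as the two-copy baseline, and assumes the pooled CYP21A2 and CYP21A1P depth is expressed per gene-length region. The function name and all parameters are hypothetical.

```python
from statistics import median

def estimate_total_copy_number(target_depth, normalization_depths, baseline_copies=2):
    """Return (raw, rounded) combined CYP21A2 + CYP21A1P copy number.

    target_depth: GC-corrected depth from reads pooled over the CYP21A2 and
        CYP21A1P regions, expressed per gene-length region (assumption).
    normalization_depths: GC-corrected depths of the normalization regions,
        each assumed to be present in two copies.
    """
    baseline = median(normalization_depths)  # depth that corresponds to two copies
    if baseline <= 0:
        raise ValueError("normalization depth must be positive")
    raw = baseline_copies * target_depth / baseline
    return raw, round(raw)

# Illustrative numbers only, not taken from a real sample.
raw, total_cn = estimate_total_copy_number(61.5, [29.0, 30.0, 31.0, 30.0, 32.0])
print(f"raw={raw:.2f}, totalCopyNumber={total_cn}")
```

With these made-up depths, the pooled depth is roughly twice the two-copy baseline, so the sketch rounds to a combined copy number of 4.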

Nonrecombinant-like Variant Calling

Of the known nonrecombinant-like variants, some are in unique (nonhomologous) regions of CYP21A2 with high mapping quality. Only reads mapping to CYP21A2 are used for calling variants in nonhomologous regions. The other variants occur in homologous regions of CYP21A2 and CYP21A1P, where reads mapping to either gene are used for variant calling.

For each variant, reads containing either the variant allele or the nonvariant allele are counted. A binomial model that incorporates the sequencing error rate is then used to determine the most likely variant copy number (0 for nonvariant).
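As a rough illustration of how a binomial model can pick the most likely variant copy number, the Python sketch below scores each candidate copy number by the likelihood of the observed read counts. The function name, the default error rate of 0.01, and the way the error rate enters the expected allele fraction are assumptions made for this example; the source does not specify DRAGEN's exact model.

```python
from math import log

def most_likely_variant_copy_number(variant_reads, total_reads,
                                    total_copy_number, error_rate=0.01):
    """Return the variant copy number (0..total_copy_number) that maximizes
    the binomial likelihood of the observed read counts.

    total_copy_number: copies of the region the reads come from (assumption:
        supplied by the caller of this function).
    error_rate: per-read probability of observing the wrong allele (assumed).
    """
    if total_copy_number < 1 or not 0.0 < error_rate < 1.0:
        raise ValueError("need total_copy_number >= 1 and 0 < error_rate < 1")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must be between 0 and total_reads")

    nonvariant_reads = total_reads - variant_reads
    best_cn, best_loglik = 0, float("-inf")
    for cn in range(total_copy_number + 1):
        fraction = cn / total_copy_number
        # Expected fraction of reads showing the variant allele, allowing for errors.
        p = fraction * (1.0 - error_rate) + (1.0 - fraction) * error_rate
        # The binomial coefficient is the same for every cn, so it is omitted.
        loglik = variant_reads * log(p) + nonvariant_reads * log(1.0 - p)
        if loglik > best_loglik:
            best_cn, best_loglik = cn, loglik
    return best_cn

# 15 of 60 reads carry the variant allele in a region with 4 total copies.
print(most_likely_variant_copy_number(15, 60, 4))
```

Working in log space avoids numerical underflow at high read counts. In the example, about a quarter of the reads carry the variant allele, which is most consistent with one variant copy out of four.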

For a list of the supported nonrecombinant-like variants, refer to the targeted/cyp21a2/target_variants_*.tsv files located in the resources directory of the DRAGEN install location.

Nonallelic Homologous Recombination Variant Calling

To analyze the homologous region further, DRAGEN phases reads covering differentiating sites and known variant sites. Whenever a detected haplotype has a CYP21A2->CYP21A1P or CYP21A1P->CYP21A2 transition that is consistent with one of the known recombinant-like variants, the transition is considered a candidate breakpoint for calling those variants. Reads containing phasing information for the two sites flanking each candidate breakpoint are used for variant calling. When the read data support the hypothesis that the sample contains at least one copy of a candidate breakpoint, the associated haplotype is a recombinant haplotype candidate. Recombinant haplotype candidates are sorted by likelihood and the number of variant sites.

The reported haplotypes depend on what was detected:

  • If no wild-type haplotype was detected, DRAGEN reports any detected homozygous recombinant haplotype, or up to two different recombinant haplotypes (i.e. compound heterozygous) if detected.
  • If any wild-type haplotype was found, DRAGEN reports a maximum of one recombinant haplotype.
  • If no recombinant haplotypes are detected, two wild-type haplotypes are reported.

For a list of recombinant variant sites, refer to the targeted/cyp21a2/recombinant_variants_*.tsv files located in the resources directory of the DRAGEN install location.

Note that NM_000500.9:c.710_719delinsACGAGGAGAA is reported as the following three variants on the same haplotype:

  • NM_000500.9:c.710T>A
  • NM_000500.9:c.713T>A
  • NM_000500.9:c.719T>A

CYP21A2 Output File

| Fields in JSON | Explanation | Type and Possible Values |
| --- | --- | --- |
| totalCopyNumber | Total copy number of CYP21A2 and CYP21A1P genes, including hybrids | nonnegative integer |
| deletionBreakpointInGene | null (i.e. unknown) if totalCopyNumber > 3; true if totalCopyNumber <= 3 and a deletion-like recombinant variant haplotype is detected; false if totalCopyNumber <= 3 and no deletion-like recombinant variant haplotype is detected | true, false, null |
| recombinantHaplotypes | List of detected haplotypes arising from nonallelic homologous recombination variant calling | Array of two strings. Each string consists of all associated allele IDs (if any) within the haplotype. Consecutive IDs in the same haplotype are separated by a '+'. |
| variants | List of single-site, nonrecombinant-like variants (i.e. not arising from nonallelic homologous recombination). An empty list if no variants are detected. | Array of nonrecombinant-like variants |

Note: A deletion-like recombinant variant haplotype (as opposed to a gene conversion-like recombinant variant haplotype) is defined as a haplotype with one or fewer switch sites (transitions from a CYP21A1P allele to a CYP21A2 allele) after excluding some sites with common gene conversions in CYP21A1P.
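The deletion-like definition and the three-valued deletionBreakpointInGene field can be sketched in Python as follows. This is an illustrative reading of the text above, not DRAGEN code: the "A1P"/"A2" site labels, the representation of a haplotype as an ordered list of per-site labels, the excluded-site indices, and all function names are hypothetical.

```python
from typing import Optional, Sequence

def count_switch_sites(site_origins: Sequence[str], excluded=frozenset()) -> int:
    """Count transitions from a CYP21A1P allele ("A1P") to a CYP21A2 allele ("A2")
    along a haplotype, after dropping excluded site indices."""
    kept = [origin for i, origin in enumerate(site_origins) if i not in excluded]
    return sum(1 for prev, cur in zip(kept, kept[1:])
               if prev == "A1P" and cur == "A2")

def is_deletion_like(site_origins: Sequence[str], excluded=frozenset()) -> bool:
    """Apply the 'one or fewer switch sites' rule to a haplotype that has already
    been identified as recombinant (this sketch does not detect recombination)."""
    return count_switch_sites(site_origins, excluded) <= 1

def deletion_breakpoint_in_gene(total_copy_number: int,
                                deletion_like_detected: bool) -> Optional[bool]:
    """Mirror the table: None (JSON null) when totalCopyNumber > 3,
    otherwise whether a deletion-like recombinant haplotype was detected."""
    if total_copy_number > 3:
        return None
    return deletion_like_detected

# One A1P -> A2 switch: deletion-like under the stated rule.
print(is_deletion_like(["A1P", "A1P", "A2", "A2"]))
# Two A1P -> A2 switches: treated here as gene conversion-like.
print(is_deletion_like(["A2", "A1P", "A2", "A1P", "A2"]))
```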

Each nonrecombinant-like variant reported in the variants array will have the fields below.

| Fields in JSON | Explanation | Type and Possible Values |
| --- | --- | --- |
| alleleId | HGVS identifier of the variant allele | string |
| alleleCopyNumber | Copy number of the allele in the called genotype | nonnegative integer |
| genotypeQuality | Phred-scaled quality for the called genotype | nonnegative integer |
| filter | Filter for the called genotype | string. "PASS" when not filtered |

Output File Example

An example of the CYP21A2 caller content in the <output-file-prefix>.targeted.json output file is shown below.

{
    "cyp21a2": {
        "totalCopyNumber": 2,
        "deletionBreakpointInGene": true,
        "recombinantHaplotypes": [
            "NM_000500.9:c.92C>T+NM_000500.9:c.293-13C>G+NM_000500.9:c.332_339del+NM_000500.9:c.518T>A+NM_000500.9:c.710T>A+NM_000500.9:c.713T>A+NM_000500.9:c.719T>A+NM_000500.9:c.923dupT",
            "NM_000500.9:c.92C>T+NM_000500.9:c.293-13C>G+NM_000500.9:c.332_339del+NM_000500.9:c.518T>A+NM_000500.9:c.710T>A+NM_000500.9:c.713T>A+NM_000500.9:c.719T>A+NM_000500.9:c.923dupT+NM_000500.9:c.955C>T"
        ],
        "variants": [
            {
                "alleleId": "NM_000500.9:c.1360C>T",
                "alleleCopyNumber": 2,
                "genotypeQuality": 18,
                "filter": "PASS"
            }
        ]
    }
}
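A downstream script might read the cyp21a2 section along these lines. The sketch uses only Python's standard json module and the field names documented above. The function name, the keys of the returned summary, and the shortened example record are illustrative assumptions. A real script would read the <output-file-prefix>.targeted.json file instead of an inline string.

```python
import json

# Shortened, illustrative record in the documented shape (not real sample output).
EXAMPLE = """
{
    "cyp21a2": {
        "totalCopyNumber": 2,
        "deletionBreakpointInGene": true,
        "recombinantHaplotypes": [
            "NM_000500.9:c.92C>T+NM_000500.9:c.293-13C>G",
            "NM_000500.9:c.92C>T+NM_000500.9:c.955C>T"
        ],
        "variants": [
            {"alleleId": "NM_000500.9:c.1360C>T", "alleleCopyNumber": 2,
             "genotypeQuality": 18, "filter": "PASS"}
        ]
    }
}
"""

def summarize_cyp21a2(targeted_json_text):
    """Extract the CYP21A2 call from the text of a targeted JSON file."""
    call = json.loads(targeted_json_text)["cyp21a2"]
    # Allele IDs within a haplotype are joined by '+'; a haplotype with no
    # associated IDs is assumed here to be an empty string.
    haplotypes = [h.split("+") if h else [] for h in call["recombinantHaplotypes"]]
    passing = [(v["alleleId"], v["alleleCopyNumber"])
               for v in call["variants"] if v["filter"] == "PASS"]
    return {
        "total_copy_number": call["totalCopyNumber"],
        "deletion_breakpoint_in_gene": call["deletionBreakpointInGene"],  # True, False, or None
        "haplotypes": haplotypes,
        "passing_variants": passing,
    }

summary = summarize_cyp21a2(EXAMPLE)
for allele_ids in summary["haplotypes"]:
    print(len(allele_ids), "allele IDs:", ", ".join(allele_ids))
```

Because JSON null maps to Python None, the three possible values of deletionBreakpointInGene stay distinguishable after parsing.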

The CYP21A2 Caller generates its output in the targeted caller output file <output-file-prefix>.targeted.json, which also contains calls from other targets (see Targeted JSON File).

Recombinant-like and nonrecombinant-like variants are reported in VCF format. See Targeted VCF File for details about how these variants are reported in VCF.