DRAGEN Amplicon Pipeline
Amplicon sequencing is a highly targeted approach that enables you to analyze genetic variation in specific genomic regions. The ultradeep sequencing of PCR products (amplicons) allows you to efficiently identify and characterize variants. This method uses oligonucleotide probes designed to target and capture regions of interest, followed by next-generation sequencing (NGS).
The Amplicon Pipeline supports both DNA and RNA data. It turns off duplicate marking because, by assay design, fragments from an amplicon target share only a few unique start and end positions.
The DNA Amplicon Pipeline extends the DRAGEN DNA Pipeline with an additional step after mapping and aligning that soft-clips primers and rewrites alignments. If the target amplicon is found, DRAGEN tags each alignment with the target amplicon and soft-clips the primer sequences. DRAGEN performs tagging by adding an XN:Z:<amplicon name> tag to the output BAM/CRAM record. Soft-clipping makes sure that the primer sequences do not contribute to the variant calls.
In the primer clipping step, poorly aligned reads are also unaligned, with MAPQ set to 0. This applies to the following alignments:
Alignments that don't consume any reference bases after soft-clipping.
Off-target alignments overlapping target regions.
Alignments with a substitution fraction above a threshold. The substitution fraction is the ratio of the mismatch count to the sum of the match and mismatch counts; the probe regions are excluded from the calculation. The threshold is specified by --amplicon-max-substitution-fraction, with a default of 0.04.
Alignments with a read base count below the short-read threshold after soft-clipping and a substitution fraction above a threshold. The short-read threshold is specified by --amplicon-shortread-length-threshold, with a default of 30. The probe regions are included in the calculation and soft-clipped bases are treated as mismatches. The substitution threshold is set by --amplicon-max-shortread-substitution-fraction, with a default of 0.1.
Alignments with a soft-clipping fraction above a threshold. The probe regions are excluded from the calculation and the threshold is set by --amplicon-max-softclip-fraction, with a default of 0.1.
Off-target alignments with a soft-clipping fraction above a threshold. The probe regions are included in the calculation and the threshold is set by --amplicon-max-offtarget-softclip-fraction, with a default of 0.2.
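The two substitution-based checks can be sketched in a few lines of Python. This is a minimal illustration, not DRAGEN's implementation: it assumes the substitution fraction is the mismatch count divided by the sum of match and mismatch counts, and all function and parameter names are hypothetical. Only the default thresholds are taken from the options listed above.

```python
def substitution_fraction(matches: int, mismatches: int) -> float:
    """Mismatches divided by (matches + mismatches); 0.0 if nothing was compared."""
    compared = matches + mismatches
    return mismatches / compared if compared else 0.0


def fails_substitution_filter(matches: int, mismatches: int,
                              max_fraction: float = 0.04) -> bool:
    """Counts are assumed to exclude the probe regions.

    max_fraction mirrors the default of --amplicon-max-substitution-fraction.
    """
    return substitution_fraction(matches, mismatches) > max_fraction


def fails_shortread_filter(read_bases: int, matches: int, mismatches: int,
                           softclipped: int,
                           length_threshold: int = 30,
                           max_fraction: float = 0.1) -> bool:
    """Short-read check; counts are assumed to include the probe regions.

    length_threshold mirrors --amplicon-shortread-length-threshold and
    max_fraction mirrors --amplicon-max-shortread-substitution-fraction.
    """
    if read_bases >= length_threshold:
        return False  # not a short read, so this check does not apply
    # Soft-clipped bases are treated as mismatches for short reads.
    return substitution_fraction(matches, mismatches + softclipped) > max_fraction
```

For example, an alignment with 95 matches and 5 mismatches outside the probe regions has a substitution fraction of 0.05 and would fail the first check at the default threshold of 0.04.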

The RNA Amplicon Pipeline uses the DRAGEN RNA Pipeline. Amplicon-specific parameters are set for fusion calling, including a fusion scoring model trained on RNA amplicon data. Small variant calling is not supported in RNA amplicon mode.

Amplicon BED File
The DRAGEN Amplicon Pipeline requires an amplicon BED file and all input files required by the DRAGEN DNA or RNA pipeline. Each row in an amplicon BED file describes an amplicon target. The fields are as follows.
chrom
The name of the chromosome.
chromStart
The 0-based inclusive start position of the target, excluding the primer.
chromEnd
The 0-based exclusive end position of the target, excluding the primer.
name
The name of the amplicon target.
gene
[Optional] The gene ID.
targetType
[Optional] The target type.
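To make the column layout concrete, here is a small Python sketch that parses one row of an amplicon BED file. It assumes tab-delimited rows; the class and function names and the example row are hypothetical and are not part of DRAGEN.

```python
from typing import NamedTuple, Optional


class AmpliconTarget(NamedTuple):
    chrom: str
    start: int                         # chromStart: 0-based inclusive, primer excluded
    end: int                           # chromEnd: 0-based exclusive, primer excluded
    name: str
    gene: Optional[str] = None         # optional fifth column
    target_type: Optional[str] = None  # optional sixth column


def parse_amplicon_bed_line(line: str) -> AmpliconTarget:
    """Parse one tab-delimited data row (assumed layout, see the field list above)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(fields)}")
    chrom, start, end, name = fields[:4]
    gene = fields[4] if len(fields) > 4 and fields[4] else None
    target_type = fields[5] if len(fields) > 5 and fields[5] else None
    return AmpliconTarget(chrom, int(start), int(end), name, gene, target_type)


# Hypothetical row, for illustration only.
row = parse_amplicon_bed_line("chr1\t100\t250\tAMP_1\tGENE_A\tCTRL")
print(row.end - row.start)  # target length in bases: 150
```

Because the start is inclusive and the end is exclusive, the target length is simply chromEnd minus chromStart.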
In copy number variant calling of DNA amplicon mode, the default segmentation mode is bed and could be modified via --cnv-segmentation-mode. The CNV segmentation BED is gene-level and auto-generated based on the gene ID column in the amplicon BED file. In small gene panels, where regions for establishing a copy-neutral baseline are limited, the targetType (CTRL vs. CNVtarget) is used to identify control regions for CNV calling.
In RNA amplicon mode, targetType is used to identify fusion targets, whose targetType is Fusion. The gene IDs for fusion targets are collected and written to an output file. The default value of --rna-gf-enriched-genes is then set to this file containing fusion gene IDs. A candidate fusion is required to have both partner genes in the gene list.
Base-level and read-level coverage is calculated for each region in the amplicon BED file. It is recommended that the fusion targets are commented to avoid competition with gene expression targets.
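As a rough illustration of how the targetType column is used, the sketch below groups rows by target type, collects the gene IDs of Fusion targets, and applies the rule that both fusion partners must be in the gene list. The row layout, sample rows, and function names are hypothetical; this is not how DRAGEN implements these steps internally.

```python
from collections import defaultdict

# Hypothetical rows: (chrom, chromStart, chromEnd, name, gene, targetType)
rows = [
    ("chr1", 100, 250, "AMP_1", "GENE_A", "CTRL"),
    ("chr2", 500, 640, "AMP_2", "GENE_B", "CNVtarget"),
    ("chr3", 900, 1020, "AMP_3", "GENE_C", "Fusion"),
    ("chr4", 300, 410, "AMP_4", "GENE_D", "Fusion"),
]


def names_by_target_type(rows):
    """Map each targetType (e.g. CTRL, CNVtarget, Fusion) to its target names."""
    groups = defaultdict(list)
    for _chrom, _start, _end, name, _gene, target_type in rows:
        groups[target_type].append(name)
    return dict(groups)


def fusion_gene_ids(rows):
    """Gene IDs of all targets whose targetType is Fusion, sorted and de-duplicated."""
    return sorted({gene for *_, gene, target_type in rows if target_type == "Fusion"})


def both_partners_listed(gene_a, gene_b, gene_list):
    """A candidate fusion needs both partner genes in the gene list."""
    return gene_a in gene_list and gene_b in gene_list


genes = set(fusion_gene_ids(rows))
print(both_partners_listed("GENE_C", "GENE_D", genes))  # True
print(both_partners_listed("GENE_C", "GENE_A", genes))  # False
```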
DRAGEN DNA Amplicon Settings
To use the DNA amplicon pipeline, set --enable-dna-amplicon to true. Use --amplicon-target-bed to specify the path to your amplicon BED file.
To enable small variant calling, set --enable-variant-caller to true. To enable copy number variant calling, set --enable-cnv to true. GC bias correction when generating target counts is enabled by default. Target counts for the normal samples should be generated with the same command-line options as the case sample under analysis.
To enable structural variant calling, set --enable-sv to true. Note that amplicon assays may have limited ability to detect large structural variants (SVs) due to their design characteristics and restricted target region length.
The target BED input for small variant calling is set to the amplicon BED file by default and can be modified via --vc-target-bed. The CNV segmentation BED is auto-generated based on the gene ID column in the amplicon BED file and can be modified via --cnv-segmentation-bed. See CNV Targeted Segmentation (Segment BED) for more information.
The amplicon pipeline can be run in either germline or somatic mode. For somatic mode, specify a tumor-only or tumor-normal input. For more details about somatic mode, see Somatic Mode and Somatic Mode Options. In amplicon tumor-only somatic variant calling, potential germline variants can be annotated in the INFO field with the 'GermlineStatus' tag using population databases. Refer to Germline Tagging in the Tumor-Only Pipeline for details. For more information on the multicaller (germline & somatic) workflows, see Multicaller Workflows. If calling somatic small variants, we also recommend setting --vc-use-somatic-hotspots to false.
By default, the maximum amplicon primer length is set to 50. You can specify a different value using --amplicon-primer-length. The parameter affects whether an alignment is assigned to an amplicon target. If an alignment starts inside the primer region of the amplicon target, the alignment is assigned to the amplicon. For a properly paired alignment, both the alignment and the mate must come from the same amplicon target. However, to detect deletion events that are close to the target boundaries, only one of the reads is required to start in the primer region (--amplicon-allow-partial-target=true by default). For candidate deletions, the CIGAR is rewritten to make them candidates for columnwise detection (--amplicon-enable-deletion-realigner=true by default).
|-- primer --|-- amplicon target --|-- primer --|
---------- read ----------------->
<---------- read -----------------
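The assignment rule pictured above can be approximated with the following Python sketch. It rests on several assumptions that the text does not spell out: the left primer is taken to occupy up to primer_length bases immediately before chromStart, the right primer up to primer_length bases immediately after chromEnd, a forward read is taken to start at its leftmost aligned base, and a reverse read at its rightmost aligned base. All names are hypothetical, and DRAGEN's actual logic may differ.

```python
def starts_in_primer(read_start, read_end, is_reverse,
                     target_start, target_end, primer_length=50):
    """Check whether a read begins inside the assumed primer window.

    Coordinates are 0-based with an exclusive end, as in the amplicon BED file.
    primer_length mirrors the default of --amplicon-primer-length.
    """
    if is_reverse:
        first_base = read_end - 1  # a reverse read begins at its rightmost base
        return target_end <= first_base < target_end + primer_length
    return target_start - primer_length <= read_start < target_start


def pair_assigned(read, mate, target, primer_length=50, allow_partial_target=True):
    """read and mate are (start, end, is_reverse); target is (start, end).

    With allow_partial_target=True, one read starting in a primer is enough;
    otherwise both reads must start in a primer of the same target.
    """
    hits = [starts_in_primer(*r, *target, primer_length) for r in (read, mate)]
    return any(hits) if allow_partial_target else all(hits)


target = (1000, 1150)
forward = (980, 1100, False)      # starts in the left primer window
reverse_ok = (1050, 1170, True)   # begins in the right primer window
reverse_short = (1050, 1140, True)  # ends before the right primer

print(pair_assigned(forward, reverse_ok, target))                                   # True
print(pair_assigned(forward, reverse_short, target))                                # True
print(pair_assigned(forward, reverse_short, target, allow_partial_target=False))    # False
```

The last two calls show the effect described for deletions near target boundaries: a pair in which only one read starts in a primer is still assigned when partial targets are allowed.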
The following is an example command line to run the DRAGEN DNA Amplicon Pipeline with copy number, structural variant and germline small variant calling.
dragen --enable-dna-amplicon true --enable-map-align=true --enable-sort=true --enable-map-align-output=true -r reference_genomes/Hsapiens/hg19_alt_aware/DRAGEN/8 --amplicon-target-bed=CancerHotSpot-v2.dna_manifest.20180509.bed --enable-variant-caller=true --enable-cnv=true --enable-sv=true --fastq-file1=read1.fastq.gz --fastq-file2=read2.fastq.gz --RGSM NA12878 --RGID 1 --output-directory=/staging/out --output-file-prefix=NA12878
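When the same command is launched for many samples, it can help to assemble the argument list programmatically. The Python sketch below only builds and prints the command shown above; it does not run DRAGEN. The function name and its parameters are hypothetical, while the options themselves are the ones used in the example command line.

```python
import shlex


def build_dna_amplicon_command(reference_dir, amplicon_bed, fastq1, fastq2,
                               sample, output_dir,
                               enable_cnv=True, enable_sv=True):
    """Return the dragen argument list for a DNA amplicon run (nothing is executed)."""
    def flag(value):
        return "true" if value else "false"

    return [
        "dragen",
        "--enable-dna-amplicon", "true",
        "--enable-map-align=true",
        "--enable-sort=true",
        "--enable-map-align-output=true",
        "-r", reference_dir,
        f"--amplicon-target-bed={amplicon_bed}",
        "--enable-variant-caller=true",
        f"--enable-cnv={flag(enable_cnv)}",
        f"--enable-sv={flag(enable_sv)}",
        f"--fastq-file1={fastq1}",
        f"--fastq-file2={fastq2}",
        "--RGSM", sample,
        "--RGID", "1",
        f"--output-directory={output_dir}",
        f"--output-file-prefix={sample}",
    ]


cmd = build_dna_amplicon_command(
    "reference_genomes/Hsapiens/hg19_alt_aware/DRAGEN/8",
    "CancerHotSpot-v2.dna_manifest.20180509.bed",
    "read1.fastq.gz", "read2.fastq.gz",
    "NA12878", "/staging/out",
)
print(shlex.join(cmd))  # shell-quoted command string, ready to inspect or log
```

Keeping the arguments as a list avoids quoting problems with paths that contain spaces; the list could be passed to subprocess.run on a system where DRAGEN is installed.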
DRAGEN RNA Amplicon Settings
To use the RNA amplicon pipeline, set --enable-rna-amplicon to true. Use --amplicon-target-bed to specify the path to your amplicon BED file.
We do not recommend enabling RNA quantification to produce the .sf quantification output files, because a panel-specific GTF file is usually not used. Use the .target_bed_read_cov_report.bed read-level coverage output file instead. This file is automatically produced when map/align output is enabled.
To enable RNA gene fusion calling, set --enable-rna-gene-fusion to true. Fusion calling parameters are automatically set in RNA amplicon mode but can be overridden in the command line. If fusion targets are not listed in the amplicon BED file, users can explicitly set --rna-gf-enriched-genes to a file containing fusion gene IDs or symbols.
The following is an example command line to run the DRAGEN RNA Amplicon Pipeline with gene fusion calling.
dragen --enable-rna-amplicon true --enable-map-align=true --enable-sort=true --enable-map-align-output=true -r reference_genomes/Hsapiens/hg19_alt_aware/DRAGEN/8 --amplicon-target-bed=Myeloid.rna_manifest.20201014.bed --enable-rna-gene-fusion=true --ann-sj-file=gencode.v19.annotation.gtf --output-format=BAM --fastq-file1=read1.fastq.gz --fastq-file2=read2.fastq.gz --RGSM Seraseq --RGID 1 --output-directory=/staging/out --output-file-prefix=Seraseq
Imbalance Ratio
The RNA amplicon pipeline includes an option for an additional metric, the imbalance ratio. This is a metric for measuring fusion events using amplicons that target the 3' and 5' ends of a gene and calculating the deviation in their coverages. The imbalance ratio for the gene is the difference between 3' and 5' coverage divided by the total counts from all genes.

To enable imbalance ratio calculation, use --amplicon-enable-imbalance-ratio=true in addition to any other amplicon pipeline arguments. If imbalance ratio is enabled, the gene and target type columns in the amplicon target BED file become mandatory. Target types can be "3prime", "5prime" or "control" (case sensitive). There can be multiple 3' or 5' targets for each gene. Any other target type (e.g., "none") will not be included in the imbalance ratio calculation, but will be included in other amplicon metrics. Imbalance ratio results are reported in a comma-delimited file with the suffix .imbalance_ratio.csv.
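The following Python sketch works through the imbalance ratio as described in this section: the difference between 3' and 5' coverage for a gene, divided by the total counts from all genes. Two details are assumptions made for illustration: counts from multiple 3' or 5' targets of the same gene are summed, and the total is taken over all "3prime", "5prime" and "control" targets. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def imbalance_ratios(targets):
    """targets: iterable of (gene, target_type, read_count) tuples.

    Returns {gene: (3' count - 5' count) / total count} for genes that have
    at least one "3prime" or "5prime" target.
    """
    three_prime = defaultdict(int)
    five_prime = defaultdict(int)
    total = 0
    for gene, target_type, count in targets:
        if target_type == "3prime":
            three_prime[gene] += count
        elif target_type == "5prime":
            five_prime[gene] += count
        elif target_type != "control":
            continue  # any other type (e.g. "none") is left out of the calculation
        total += count
    if total == 0:
        return {}
    genes = sorted(set(three_prime) | set(five_prime))
    return {g: (three_prime[g] - five_prime[g]) / total for g in genes}


# Hypothetical counts, for illustration only.
example = [
    ("GENE_A", "3prime", 300),
    ("GENE_A", "3prime", 100),
    ("GENE_A", "5prime", 100),
    ("GENE_B", "control", 500),
    ("GENE_C", "none", 1000),
]
print(imbalance_ratios(example))  # {'GENE_A': 0.3}
```

Here GENE_A has 400 reads on its 3' targets and 100 on its 5' target, and the included targets total 1000 reads, giving (400 - 100) / 1000 = 0.3.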
DRAGEN Amplicon Panel Specific Settings
To support the varied designs of amplicon panels and the specific requirements of different analysis types (e.g., SNV, CNV, SV, MSI, RNA fusion, RNA splice variants, and RNA 3'/5' imbalance ratio), panel-specific parameter settings have been integrated into the command-line options. Each supported panel has a dedicated option, and the details for these are listed in the table below:
| Panel Name | Short Name | Panel Code | Sample Type | Default variant caller enabled | Command Line Options |
| --- | --- | --- | --- | --- | --- |
| oncoReveal BRCA1 & BRCA2 plus CNV | BRCA CNV | BR283 | DNA | SNV, CNV | --amplicon-enable-dna-brca |
| oncoReveal Heme | Heme | P-HFU-01 | RNA | RNA fusion | --amplicon-enable-rna-heme |
| oncoReveal Lymphoid | Lymphoid | P-LYM-01 | DNA | SNV, SV | --amplicon-enable-dna-lymphoid |
| oncoReveal Core LBx | Core LBx | P-LBX-01 | cfDNA | SNV, CNV, MSI | --amplicon-enable-cfdna-core |
| oncoReveal Essential LBx | Essential LBx | P-LBX-04 | cfDNA | SNV, CNV, MSI | --amplicon-enable-cfdna-essential |
| oncoReveal Essential MPN | MPN | MY7 | DNA | SNV | --amplicon-enable-dna-mpn |
| oncoReveal Fusion LBx | Fusion LBx | P-LBX-03 | cfRNA | RNA fusion, RNA splice-variant | --amplicon-enable-cfrna-lbxfusion |
| oncoReveal Multi-Cancer RNA Fusion v2 | Multi-Cancer with Fusion | SF-V2 | RNA | RNA fusion, RNA splice-variant, RNA 3'/5' imbalance-ratio | --amplicon-enable-rna-multicancer |
| oncoReveal Multi-Cancer v4 with CNV | Multi-Cancer with CNV | HS341 | DNA | SNV, CNV | --amplicon-enable-dna-multicancer |
| oncoReveal Myeloid | Myeloid | MY766 | DNA | SNV, SV | --amplicon-enable-dna-myeloid |
| oncoReveal Nexus 21 Gene | Nexus | P-CMC-01 | DNA | SNV, SV | --amplicon-enable-dna-nexus |
| oncoReveal Solid Tumor v2 | Solid Tumor v2 | P-ST-02 | DNA | SNV | --amplicon-enable-dna-solidtumor |
When a panel-specific switch is enabled, the corresponding default variant callers are automatically activated. These defaults can be overridden via the command line if needed:
| Variant Callers | Command Line Options |
| --- | --- |
| SNV | --enable-variant-caller |
| CNV | --enable-cnv |
| SV | --enable-sv |
| MSI | --amplicon-enable-msi |
| RNA fusion | --enable-rna-gene-fusion |
| RNA splice-variant | --enable-rna-splice-variant |
| RNA 3'/5' imbalance-ratio | --amplicon-enable-imbalance-ratio |
All necessary resource files for each panel—such as the amplicon target BED file, SNV and SV systematic noise files, CNV Panel of Normals (PON), and MSI PON (if applicable)—are pre-packaged within DRAGEN. These resources are automatically detected when the corresponding panel-specific option is enabled. Users who prefer to supply custom resource files can do so through command-line options:
| Resource Files | Command Line Options |
| --- | --- |
| SNV systematic noise file | --vc-systematic-noise |
| SV systematic noise file | --sv-systematic-noise |
| CNV Panel of Normals (PON) | --cnv-combined-counts |
| MSI PON | --msi-ref-normal-input |
By default, CNV analysis does not use the pre-packaged Panel of Normals (PON). We recommend including in-run normal samples, matched in sample type and library preparation, in the same sequencing run to serve as the PON. If generating a custom PON is not feasible, the pre-packaged panel-specific PON can be used as a fallback. To enable this, set --amplicon-cnv-use-default-pon to true. The CNV component also uses the sixth column of the amplicon target BED file to identify regions annotated as "CTRL" (used for establishing baseline coverage) and "CNVtarget" (used for calling copy number variants).