# DRAGEN Amplicon Pipeline

Amplicon sequencing is a highly targeted approach that enables you to analyze genetic variation in specific genomic regions. The ultradeep sequencing of PCR products (amplicons) allows you to efficiently identify and characterize variants. This method uses oligonucleotide probes designed to target and capture regions of interest, followed by next-generation sequencing (NGS).

The Amplicon Pipeline supports both DNA and RNA data. It turns off duplicate marking because the assay design leaves only a few unique start and end positions for fragments from an amplicon target.

The DNA Amplicon Pipeline extends the DRAGEN DNA Pipeline with an additional step after mapping and aligning that soft-clips primers and rewrites alignments. If the target amplicon is found, DRAGEN tags each alignment with the target amplicon and soft-clips the primer sequences. DRAGEN performs tagging by adding an `XN:Z:<amplicon name>` tag to the output BAM/CRAM record. Soft-clipping makes sure that the primer sequences do not contribute to the variant calls.

The primer clipping step also unaligns poorly aligned reads and sets their MAPQ to 0. This applies to the following alignments:

* Alignments that don't consume any reference bases after soft-clipping.
* Off-target alignments overlapping target regions.
* Alignments with a substitution fraction above a threshold. Substitution fraction is the ratio of mismatch count to the sum of match and mismatch counts, and the probe regions are excluded from the calculation. The threshold is specified by `--amplicon-max-substitution-fraction` with a default of 0.04.
* Alignments with read base count less than the short-read threshold after soft-clipping and with a substitution fraction more than a threshold including the probes. The short-read threshold is specified by `--amplicon-shortread-length-threshold` with a default of 30. The probe regions are included in the calculation and soft-clipped bases are treated as mismatches. The substitution threshold is set by `--amplicon-max-shortread-substitution-fraction` with a default of 0.1.
* Alignments with a soft-clipping fraction more than a threshold. The probe regions are excluded from the calculation and the threshold is set by `--amplicon-max-softclip-fraction` with a default of 0.1.
* Off-target alignments with a soft-clipping fraction more than a threshold. The probe regions are included in the calculation and the threshold is set by `--amplicon-max-offtarget-softclip-fraction` with a default of 0.2.
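
The criteria above can be read as a small decision function. The following Python sketch is illustrative only and is not DRAGEN's implementation: the `AlignmentSummary` fields and the function names are hypothetical, the fractions are assumed to be precomputed for each alignment, and the substitution fraction is assumed to be mismatches divided by matches plus mismatches. The thresholds are the defaults quoted in the list.

```python
from typing import List, NamedTuple

# Default thresholds quoted in the list above.
MAX_SUBSTITUTION_FRACTION = 0.04            # --amplicon-max-substitution-fraction
SHORTREAD_LENGTH_THRESHOLD = 30             # --amplicon-shortread-length-threshold
MAX_SHORTREAD_SUBSTITUTION_FRACTION = 0.1   # --amplicon-max-shortread-substitution-fraction
MAX_SOFTCLIP_FRACTION = 0.1                 # --amplicon-max-softclip-fraction
MAX_OFFTARGET_SOFTCLIP_FRACTION = 0.2       # --amplicon-max-offtarget-softclip-fraction


class AlignmentSummary(NamedTuple):
    """Hypothetical per-alignment summary; field names are illustrative."""
    on_target: bool                       # assigned to an amplicon target
    overlaps_target: bool                 # overlaps a target region
    ref_bases_after_clip: int             # reference bases consumed after soft-clipping
    read_bases_after_clip: int            # read bases left after soft-clipping
    sub_fraction: float                   # probes excluded
    sub_fraction_with_probes: float       # probes included, soft clips as mismatches
    softclip_fraction: float              # probes excluded
    softclip_fraction_with_probes: float  # probes included


def substitution_fraction(matches: int, mismatches: int) -> float:
    """Assumed definition: mismatches / (matches + mismatches)."""
    total = matches + mismatches
    return mismatches / total if total else 0.0


def unalign_reasons(a: AlignmentSummary) -> List[str]:
    """Return the criteria from the list that this alignment meets."""
    reasons = []
    if a.ref_bases_after_clip == 0:
        reasons.append("no reference bases after soft-clipping")
    if not a.on_target and a.overlaps_target:
        reasons.append("off-target alignment overlapping a target")
    if a.sub_fraction > MAX_SUBSTITUTION_FRACTION:
        reasons.append("substitution fraction above threshold")
    if (a.read_bases_after_clip < SHORTREAD_LENGTH_THRESHOLD
            and a.sub_fraction_with_probes > MAX_SHORTREAD_SUBSTITUTION_FRACTION):
        reasons.append("short read with high substitution fraction")
    if a.softclip_fraction > MAX_SOFTCLIP_FRACTION:
        reasons.append("soft-clipping fraction above threshold")
    if (not a.on_target
            and a.softclip_fraction_with_probes > MAX_OFFTARGET_SOFTCLIP_FRACTION):
        reasons.append("off-target soft-clipping fraction above threshold")
    return reasons
```

An alignment that meets any one criterion would be unaligned, so a non-empty list is enough to flag it. Returning every matching reason rather than stopping at the first is a choice made here for readability.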

![DRAGEN DNA Amplicon Pipeline](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-8a9380fb52173a28dd70f5637745afc0ace0811b%2FDRAGENTMDNAAmpliconPipeline_appDIAG.png?alt=media)

The RNA Amplicon Pipeline uses the DRAGEN RNA Pipeline. Amplicon-specific parameters are set for fusion calling, including a fusion scoring model trained on RNA amplicon data. Small variant calling is not supported in RNA amplicon mode.

![DRAGEN RNA Amplicon Pipeline](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-26ee93323c2a84139e6d845f061fb91887eea022%2FDRAGENTMRNAAmpliconPipeline_appDIAG.png?alt=media)

## Amplicon BED File

The DRAGEN Amplicon Pipeline requires an amplicon BED file and all input files required by the DRAGEN DNA or RNA pipeline. Each row in an amplicon BED file describes an amplicon target. The fields are as follows.

| Field      | Description                                                               |
| ---------- | ------------------------------------------------------------------------- |
| chrom      | The name of the chromosome.                                               |
| chromStart | The 0-based inclusive start position of the target, excluding the primer. |
| chromEnd   | The 0-based exclusive end position of the target, excluding the primer.   |
| name       | The name of the amplicon target.                                          |
| gene       | **\[Optional]** The gene ID.                                              |
| targetType | **\[Optional]** The target type.                                          |

In DNA amplicon mode, copy number variant calling uses **bed** as the default segmentation mode, which can be changed via `--cnv-segmentation-mode`. The CNV segmentation BED is gene-level and auto-generated from the gene ID column in the amplicon BED file. In small gene panels, where regions for establishing a copy-neutral baseline are limited, the targetType (CTRL vs. CNVtarget) is used to identify control regions for CNV calling.

In RNA amplicon mode, targetType is used to identify fusion targets, whose targetType is **Fusion**. The gene IDs for fusion targets are collected and written to an output file, and the default value of `--rna-gf-enriched-genes` is set to this file. A candidate fusion is required to have both partner genes in the gene list. It is recommended that the fusion targets are commented to avoid competition with gene expression targets.

Base-level and read-level coverage is calculated for each region in the amplicon BED file.
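
To make the column layout concrete, the following Python sketch parses amplicon BED rows and groups targets by targetType. It is a minimal illustration, not part of DRAGEN: it assumes tab-delimited columns in the order shown in the table, the `AmpliconTarget` and `parse_amplicon_bed` names are hypothetical, and the example rows are made up.

```python
from typing import List, NamedTuple, Optional


class AmpliconTarget(NamedTuple):
    chrom: str
    start: int                         # chromStart, 0-based inclusive
    end: int                           # chromEnd, 0-based exclusive
    name: str
    gene: Optional[str] = None         # optional fifth column
    target_type: Optional[str] = None  # optional sixth column


def parse_amplicon_bed(text: str) -> List[AmpliconTarget]:
    """Parse tab-delimited amplicon BED text, skipping blank and '#' lines."""
    targets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError("expected at least 4 columns: " + line)
        targets.append(AmpliconTarget(
            chrom=fields[0],
            start=int(fields[1]),
            end=int(fields[2]),
            name=fields[3],
            gene=fields[4] if len(fields) > 4 else None,
            target_type=fields[5] if len(fields) > 5 else None,
        ))
    return targets


def genes_with_type(targets: List[AmpliconTarget], target_type: str) -> List[str]:
    """Sorted, de-duplicated gene IDs of targets with the given targetType."""
    return sorted({t.gene for t in targets
                   if t.target_type == target_type and t.gene})


# Made-up DNA panel rows: one CNV target and one control region.
dna_bed = "chr1\t100\t250\tampA\tGENE1\tCNVtarget\nchr1\t400\t520\tampB\tGENE2\tCTRL\n"
# Made-up RNA panel rows: two fusion targets.
rna_bed = "chr2\t1000\t1130\tampC\tGENE3\tFusion\nchr3\t50\t170\tampD\tGENE4\tFusion\n"

control_genes = genes_with_type(parse_amplicon_bed(dna_bed), "CTRL")
fusion_genes = genes_with_type(parse_amplicon_bed(rna_bed), "Fusion")
```

Because chromStart is inclusive and chromEnd is exclusive, the target length is simply `end - start`.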

## DRAGEN DNA Amplicon Settings

To use the DNA amplicon pipeline, set `--enable-dna-amplicon` to `true`. Use `--amplicon-target-bed` to specify the path to your amplicon BED file.

To enable small variant calling, set `--enable-variant-caller` to `true`. The target BED for small variant calling defaults to the amplicon BED file and can be changed via `--vc-target-bed`. If calling somatic small variants, we also recommend setting `--vc-use-somatic-hotspots` to `false`.

To enable copy number variant calling, set `--enable-cnv` to `true`. GC bias correction when generating target counts is enabled by default. Target counts for the normal samples should be generated with the same command-line options as the case sample under analysis. The CNV segmentation BED is auto-generated from the gene ID column in the amplicon BED file and can be changed via `--cnv-segmentation-bed`. See CNV [Targeted Segmentation (Segment BED)](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-overview) for more information.

To enable structural variant calling, set `--enable-sv` to `true`. Note that amplicon assays may have limited ability to detect large structural variants (SVs) because of their design characteristics and restricted target region length.

The amplicon pipeline can be run in either germline or somatic mode. For somatic mode, specify a tumor-only or tumor-normal input. For more details about somatic mode, see [Somatic Mode](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/small-variant-calling/somatic-mode#somatic-mode) and [Somatic Mode Options](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/small-variant-calling/somatic-mode#somatic-mode-options). In amplicon tumor-only somatic variant calling, potential germline variants can be annotated in the INFO field with the `GermlineStatus` tag using population databases. Refer to [Germline Tagging in the Tumor-Only Pipeline](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/small-variant-calling/somatic-mode#germline-tagging-in-the-tumor-only-pipeline) for details. For more information on the multicaller (germline and somatic) workflows, see [Multicaller Workflows](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/multi-caller#multicaller-workflows).

By default the maximum amplicon primer length is set to 50. You can specify a different value using `--amplicon-primer-length`. The parameter affects whether an alignment is assigned to an amplicon target. If an alignment starts inside the primer region of the amplicon target, the alignment is assigned to the amplicon. For a properly paired alignment, both the alignment and the mate must come from the same amplicon target. However, in order to detect deletion events that are close to the target boundaries, we now require only one of the reads to start in the primer region (`--amplicon-allow-partial-target=true` by default). For candidate deletions, we rewrite the CIGAR to make them candidates for columnwise detection (`--amplicon-enable-deletion-realigner=true` by default).

```
  |-- primer --|-- amplicon target --|-- primer --|
     ---------- read ----------------->
              <---------- read -----------------
```
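
The assignment rule sketched in the diagram can be written as a short check. The Python below is a simplified illustration under stated assumptions, not DRAGEN's implementation: it assumes that "starts" refers to the 5' end of a read, that forward reads start in the primer window to the left of the target and reverse reads in the window to the right, and that each window spans exactly the maximum primer length. The `Target`, `starts_in_primer`, and `pair_assigned` names are hypothetical.

```python
from typing import NamedTuple, Tuple

DEFAULT_PRIMER_LENGTH = 50  # default of --amplicon-primer-length


class Target(NamedTuple):
    start: int  # chromStart, 0-based inclusive, primer excluded
    end: int    # chromEnd, 0-based exclusive, primer excluded


def starts_in_primer(target: Target, five_prime_pos: int, is_reverse: bool,
                     primer_length: int = DEFAULT_PRIMER_LENGTH) -> bool:
    """True if the read's 5' end (0-based) lies in the flanking primer window."""
    if is_reverse:
        # Reverse reads begin in the window right of the target.
        return target.end <= five_prime_pos < target.end + primer_length
    # Forward reads begin in the window left of the target.
    return target.start - primer_length <= five_prime_pos < target.start


def pair_assigned(target: Target, read: Tuple[int, bool], mate: Tuple[int, bool],
                  allow_partial: bool = True) -> bool:
    """read and mate are (five_prime_pos, is_reverse) tuples.

    With allow_partial (mirroring --amplicon-allow-partial-target=true), one
    read starting in a primer window is enough; otherwise both must.
    """
    hits = [starts_in_primer(target, pos, rev) for pos, rev in (read, mate)]
    return any(hits) if allow_partial else all(hits)


target = Target(start=1000, end=1200)
# Forward read starts in the left primer; the mate's 5' end falls inside the
# target, as might happen near a deletion at the target boundary.
print(pair_assigned(target, (980, False), (1150, True)))                       # True
print(pair_assigned(target, (980, False), (1150, True), allow_partial=False))  # False
```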

The following is an example command line to run the DRAGEN DNA Amplicon Pipeline with copy number, structural variant and germline small variant calling.

```
dragen --enable-dna-amplicon true --enable-map-align=true --enable-sort=true --enable-map-align-output=true -r reference_genomes/Hsapiens/hg19_alt_aware/DRAGEN/8 --amplicon-target-bed=CancerHotSpot-v2.dna_manifest.20180509.bed --enable-variant-caller=true --enable-cnv=true --enable-sv=true --fastq-file1=read1.fastq.gz --fastq-file2=read2.fastq.gz --RGSM NA12878 --RGID 1 --output-directory=/staging/out --output-file-prefix=NA12878
```

## DRAGEN RNA Amplicon Settings

To use the RNA amplicon pipeline, set `--enable-rna-amplicon` to `true`. Use `--amplicon-target-bed` to specify the path to your amplicon BED file.

We do not recommend enabling RNA quantification to produce the `.sf` quantification output files, because a panel-specific GTF file is usually not used. Use the `.target_bed_read_cov_report.bed` read-level coverage output file instead. This file is automatically produced when map/align output is enabled.

To enable RNA gene fusion calling, set `--enable-rna-gene-fusion` to `true`. Fusion calling parameters are automatically set in RNA amplicon mode but can be overridden in the command line. If fusion targets are not listed in the amplicon BED file, users can explicitly set `--rna-gf-enriched-genes` to a file containing fusion gene IDs or symbols.

The following is an example command line to run the DRAGEN RNA Amplicon Pipeline with gene fusion calling.

```
dragen --enable-rna-amplicon true --enable-map-align=true --enable-sort=true --enable-map-align-output=true -r reference_genomes/Hsapiens/hg19_alt_aware/DRAGEN/8 --amplicon-target-bed=Myeloid.rna_manifest.20201014.bed --enable-rna-gene-fusion=true --ann-sj-file=gencode.v19.annotation.gtf --output-format=BAM --fastq-file1=read1.fastq.gz --fastq-file2=read2.fastq.gz --RGSM Seraseq --RGID 1 --output-directory=/staging/out --output-file-prefix=Seraseq
```

### Imbalance Ratio

The RNA amplicon pipeline includes an option for an additional metric, the imbalance ratio. This metric measures the likelihood of a fusion event by using amplicons that target the 3' and 5' ends of a gene and calculating the deviation in their coverages. The imbalance ratio for a gene is the difference between its 3' and 5' coverage divided by the total counts from all genes. A breakdown of the counts and imbalance ratios is available in a comma-delimited file with the suffix `.imbalance_ratio.csv`. The denominator is the sum of the `Total_counts` column.

![Imbalance Ratio](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-11ef11b9c1ff2fdde9606a9d81bc717febbec6ba%2Fimbalance-ratio.png?alt=media)

To enable imbalance ratio calculation, use `--amplicon-enable-imbalance-ratio=true` in addition to any other amplicon pipeline arguments. If imbalance ratio is enabled, the gene and target type columns in the amplicon target BED file become mandatory. Target types can be "3prime", "5prime" or "control" (case sensitive). There can be multiple 3' or 5' targets for each gene. Any other target type (e.g., "none") will not be included in the imbalance ratio calculation, but will be included in other amplicon metrics.
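
As a worked illustration of the definition in this section, the Python sketch below computes an imbalance ratio per gene as the 3' count minus the 5' count, divided by the sum of `Total_counts` over all genes. It is an assumption-laden sketch rather than DRAGEN's code: apart from `Total_counts`, the key names are hypothetical, the counts are made up, and how DRAGEN aggregates multiple targets per gene is not modeled.

```python
from typing import Dict, List


def imbalance_ratios(rows: List[dict]) -> Dict[str, float]:
    """(3' count - 5' count) / sum of Total_counts across all genes."""
    denominator = sum(row["Total_counts"] for row in rows)
    if denominator == 0:
        # Choice made for this sketch: report 0.0 when there is no coverage.
        return {row["gene"]: 0.0 for row in rows}
    return {
        row["gene"]: (row["three_prime"] - row["five_prime"]) / denominator
        for row in rows
    }


# Made-up counts for two hypothetical genes.
rows = [
    {"gene": "GENE_A", "three_prime": 300, "five_prime": 100, "Total_counts": 400},
    {"gene": "GENE_B", "three_prime": 250, "five_prime": 250, "Total_counts": 600},
]
ratios = imbalance_ratios(rows)
```

Here the denominator is 400 + 600 = 1000, so GENE_A gets (300 - 100) / 1000 = 0.2 and GENE_B, with balanced 3' and 5' coverage, gets 0.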

## DRAGEN Amplicon Panel Specific Settings

To support the varied designs of amplicon panels and the specific requirements of different analysis types (e.g., SNV, CNV, SV, MSI, RNA fusion, RNA splice variants, and RNA 3'/5' imbalance ratio), panel-specific parameter settings have been integrated into the command-line options. Each supported panel has a dedicated option, and the details for these are listed in the table below:

|             **Panel Name**            |      **Short Name**      | **Panel Code** | **Sample Type** |             **Default variant caller enabled**            |      **Command Line Options**     |
| :-----------------------------------: | :----------------------: | :------------: | :-------------: | :-------------------------------------------------------: | :-------------------------------: |
|   oncoReveal BRCA1 & BRCA2 plus CNV   |         BRCA CNV         |      BR283     |       DNA       |                          SNV, CNV                         |     --amplicon-enable-dna-brca    |
|            oncoReveal Heme            |           Heme           |    P-HFU-01    |       RNA       |                         RNA fusion                        |     --amplicon-enable-rna-heme    |
|          oncoReveal Lymphoid          |         Lymphoid         |    P-LYM-01    |       DNA       |                          SNV, SV                          |   --amplicon-enable-dna-lymphoid  |
|          oncoReveal Core LBx          |         Core LBx         |    P-LBX-01    |      cfDNA      |                       SNV, CNV, MSI                       |    --amplicon-enable-cfdna-core   |
|        oncoReveal Essential LBx       |       Essential LBx      |    P-LBX-04    |      cfDNA      |                       SNV, CNV, MSI                       | --amplicon-enable-cfdna-essential |
|        oncoReveal Essential MPN       |            MPN           |       MY7      |       DNA       |                            SNV                            |     --amplicon-enable-dna-mpn     |
|         oncoReveal Fusion LBx         |        Fusion LBx        |    P-LBX-03    |      cfRNA      |               RNA fusion, RNA splice-variant              | --amplicon-enable-cfrna-lbxfusion |
| oncoReveal Multi-Cancer RNA Fusion v2 | Multi-Cancer with Fusion |      SF-V2     |       RNA       | RNA fusion, RNA splice-variant, RNA 3'/5' imbalance-ratio | --amplicon-enable-rna-multicancer |
|  oncoReveal Multi-Cancer v4 with CNV  |   Multi-Cancer with CNV  |      HS341     |       DNA       |                          SNV, CNV                         | --amplicon-enable-dna-multicancer |
|           oncoReveal Myeloid          |          Myeloid         |      MY766     |       DNA       |                          SNV, SV                          |   --amplicon-enable-dna-myeloid   |
|        oncoReveal Nexus 21 Gene       |           Nexus          |    P-CMC-01    |       DNA       |                          SNV, SV                          |    --amplicon-enable-dna-nexus    |
|       oncoReveal Solid Tumor v2       |      Solid Tumor v2      |     P-ST-02    |       DNA       |                            SNV                            |  --amplicon-enable-dna-solidtumor |

When a panel-specific switch is enabled, the corresponding default variant callers are automatically activated. These defaults can be overridden via the command line if needed:

|    **Variant Callers**    |      **Command Line Options**     |
| :-----------------------: | :-------------------------------: |
|            SNV            |      --enable-variant-caller      |
|            CNV            |            --enable-cnv           |
|             SV            |            --enable-sv            |
|            MSI            |       --amplicon-enable-msi       |
|         RNA fusion        |      --enable-rna-gene-fusion     |
|     RNA splice-variant    |    --enable-rna-splice-variant    |
| RNA 3'/5' imbalance-ratio | --amplicon-enable-imbalance-ratio |
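
The two tables above can be combined when scripting runs. The following Python sketch assembles only the panel switch and any caller overrides as an argument list. The switch names are copied from the tables; everything else is an assumption, in particular the `panel_args` helper (hypothetical) and the `=true`/`=false` spelling for the panel and caller switches, which follows the style of the other examples on this page. Reference, input, and output options are deliberately left out.

```python
from typing import Dict, List, Optional

# Panel switches copied from the panel table above (subset shown).
PANEL_SWITCHES = {
    "BR283": "--amplicon-enable-dna-brca",
    "P-HFU-01": "--amplicon-enable-rna-heme",
    "MY766": "--amplicon-enable-dna-myeloid",
}

# Caller switches copied from the variant caller table above.
CALLER_SWITCHES = {
    "SNV": "--enable-variant-caller",
    "CNV": "--enable-cnv",
    "SV": "--enable-sv",
    "MSI": "--amplicon-enable-msi",
    "RNA fusion": "--enable-rna-gene-fusion",
    "RNA splice-variant": "--enable-rna-splice-variant",
    "RNA 3'/5' imbalance-ratio": "--amplicon-enable-imbalance-ratio",
}


def panel_args(panel_code: str,
               overrides: Optional[Dict[str, bool]] = None) -> List[str]:
    """Build the panel switch plus caller overrides (assumed =true/=false form)."""
    args = [PANEL_SWITCHES[panel_code] + "=true"]
    for caller, enabled in (overrides or {}).items():
        value = "true" if enabled else "false"
        args.append(CALLER_SWITCHES[caller] + "=" + value)
    return args


# Myeloid panel (SNV and SV enabled by default), with SV turned off.
print(panel_args("MY766", {"SV": False}))
```

Unknown panel codes or caller names raise `KeyError`, which keeps typos from silently producing an incomplete command.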

All necessary resource files for each panel—such as the amplicon target BED file, SNV and SV systematic noise files, CNV Panel of Normals (PON), and MSI PON (if applicable)—are pre-packaged within DRAGEN. These resources are automatically detected when the corresponding panel-specific option is enabled. All genomic coordinates in the oncoReveal panel resource files use the hg19 reference build. Users who prefer to supply custom resource files can do so through command-line options:

|     **Resource Files**     | **Command Line Options** |
| :------------------------: | :----------------------: |
|  SNV systematic noise file |   --vc-systematic-noise  |
|  SV systematic noise file  |   --sv-systematic-noise  |
| CNV Panel of Normals (PON) |   --cnv-combined-counts  |
|           MSI PON          |  --msi-ref-normal-input  |

By default, CNV analysis does not use the pre-packaged Panel of Normals (PON). We recommend including in-run normal samples, matched in sample type and library preparation, in the same sequencing run to serve as the PON. If generating a custom PON is not feasible, the pre-packaged panel-specific PON can be used as a fallback. To enable this, set `--amplicon-cnv-use-default-pon` to `true`. The CNV component also uses the sixth column of the amplicon target BED file to identify regions annotated as "CTRL" (used for establishing baseline coverage) and "CNVtarget" (used for calling copy number variants).

