# DNA Somatic Tumor-Only Solid Amplicon

A DRAGEN recipe, like this one, is a predefined set of analysis parameters and workflow settings tailored to a specific type of genomic analysis. For clarity, some default parameters are explicitly included and annotated with comments.

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
# Inputs 
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM 
--tumor-fastq-list-sample-id $STRING 
# Mapper 
--enable-map-align true                 #optional with BAM/CRAM input 
--enable-map-align-output true          #optionally save the output BAM 
--enable-sort true                      #default=true 
# Amplicon 
--enable-dna-amplicon true 
--amplicon-target-bed $PATH 
--enable-duplicate-marking false        #default=false 
# Small variant caller 
--enable-variant-caller true 
--vc-target-bed $VC_TARGET_BED          #Optional. Auto-generated based on amplicon target bed. 
--vc-systematic-noise $PATH             #optional for SNV systematic noise. 
--vc-target-vaf $NUM                    #Default = 0.03 (>= 3% VAF) 
# SV 
--enable-sv true 
# CNV 
--enable-cnv true 
--cnv-combined-counts $PATH             #CNV PON. Required for amplicon CNV calling on CASE samples. 
--cnv-target-bed $PATH                  #Optional. Auto-generated based on amplicon target bed. 
--cnv-filter-qual $NUM                  #CNV filter quality. Adjust CNV filter quality thresholds according to the user’s validation study. 
# Annotation 
--variant-annotation-data $NIRVANA_PATH 
--vc-enable-germline-tagging true 
# Microsatellite Instability (MSI) 
--enable-msi true 
--msi-microsatellites-file $PATH 
--msi-ref-normal-input $PATH            #required 
--amplicon-enable-msi true 
```
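
The recipe above is annotated for reading: with inline `#` comments and no line continuations, it cannot be pasted into a shell as written. The following is a minimal sketch of one way to wrap it, assuming a POSIX shell. The function name `tumor_only_amplicon` and the variables `FASTQ_LIST`, `SAMPLE_ID` and `AMPLICON_BED` are hypothetical, and only a subset of the recipe options is shown.

```shell
# Sketch only: wraps a subset of the recipe options in a shell function.
# The launcher command is passed as arguments, so the same function can
# either print the command (dry run) or execute it.
tumor_only_amplicon() {
    # Usage: tumor_only_amplicon LAUNCHER...
    # Expects REF_DIR, OUTPUT, PREFIX, FASTQ_LIST, SAMPLE_ID and
    # AMPLICON_BED to be set by the caller (hypothetical variable names).
    "$@" \
        --ref-dir "$REF_DIR" \
        --output-directory "$OUTPUT" \
        --output-file-prefix "$PREFIX" \
        --tumor-fastq-list "$FASTQ_LIST" \
        --tumor-fastq-list-sample-id "$SAMPLE_ID" \
        --enable-map-align true \
        --enable-dna-amplicon true \
        --amplicon-target-bed "$AMPLICON_BED" \
        --enable-variant-caller true
}
```

Calling `tumor_only_amplicon echo dragen` prints the assembled command for review; passing the real binary path (for example `/opt/dragen/$VERSION/bin/dragen`) runs it. The remaining recipe options can be appended to the list in the same way.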

## Notes and additional options

### Pillar Amplicon Specific Settings

To support the varied designs of amplicon panels and the specific requirements of different analysis types (e.g., SNV, CNV, SV, MSI, RNA fusion, RNA splice variants, and RNA 3'/5' imbalance ratio), panel-specific parameter settings have been integrated into the command-line options. Each supported Pillar panel has a dedicated option, and the details for these DNA panels are listed in the table below:

|            **Panel Name**           |     **Short Name**    | **Panel Code** | **Sample Type** | **Default variant caller enabled** |      **Command Line Options**     |
| :---------------------------------: | :-------------------: | :------------: | :-------------: | :--------------------------------: | :-------------------------------: |
|  oncoReveal BRCA1 & BRCA2 plus CNV  |        BRCA CNV       |      BR283     |       DNA       |              SNV, CNV              |     --amplicon-enable-dna-brca    |
|         oncoReveal Lymphoid         |        Lymphoid       |    P-LYM-01    |       DNA       |               SNV, SV              |   --amplicon-enable-dna-lymphoid  |
|       oncoReveal Essential MPN      |          MPN          |       MY7      |       DNA       |                 SNV                |     --amplicon-enable-dna-mpn     |
| oncoReveal Multi-Cancer v4 with CNV | Multi-Cancer with CNV |      HS341     |       DNA       |              SNV, CNV              | --amplicon-enable-dna-multicancer |
|          oncoReveal Myeloid         |        Myeloid        |      MY766     |       DNA       |               SNV, SV              |   --amplicon-enable-dna-myeloid   |
|       oncoReveal Nexus 21 Gene      |         Nexus         |    P-CMC-01    |       DNA       |               SNV, SV              |    --amplicon-enable-dna-nexus    |
|      oncoReveal Solid Tumor v2      |     Solid Tumor v2    |     P-ST-02    |       DNA       |                 SNV                |  --amplicon-enable-dna-solidtumor |

For more detail on the amplicon pipeline, see [DRAGEN Amplicon Pipeline](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-amplicon-pipeline).
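
When samples from several Pillar panels are processed by the same script, the panel table above can be encoded as a lookup. This is a sketch under stated assumptions: the function name `pillar_panel_option` is hypothetical, and the table lists only the option names, so whether a value such as `true` must follow is not shown here.

```shell
# Sketch: map a Pillar panel code (third column of the table) to its
# panel-specific command-line option (last column of the table).
pillar_panel_option() {
    case "$1" in
        BR283)    echo "--amplicon-enable-dna-brca" ;;
        P-LYM-01) echo "--amplicon-enable-dna-lymphoid" ;;
        MY7)      echo "--amplicon-enable-dna-mpn" ;;
        HS341)    echo "--amplicon-enable-dna-multicancer" ;;
        MY766)    echo "--amplicon-enable-dna-myeloid" ;;
        P-CMC-01) echo "--amplicon-enable-dna-nexus" ;;
        P-ST-02)  echo "--amplicon-enable-dna-solidtumor" ;;
        *)        echo "unknown panel code: $1" >&2; return 1 ;;
    esac
}
```

For example, `pillar_panel_option P-ST-02` prints the option for the oncoReveal Solid Tumor v2 panel, and an unrecognized code returns a non-zero status so a calling script can stop early.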

### Hashtable

For DRAGEN somatic runs it is recommended to use the linear hashtable.

See: [Product Files](https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html)

### Input options

DRAGEN input sources include FASTQ list, FASTQ, BAM, or CRAM. For BCL input, first create FASTQs using [BCL conversion](https://help.dragen.illumina.com/product-guides/dragen-v4.5/bcl-conversion).

FQ list Input

```
--tumor-fastq-list $PATH 
--tumor-fastq-list-sample-id $STRING 
```

FQ Input

```
--tumor-fastq1 $PATH 
--tumor-fastq2 $PATH 
--RGSM-tumor $STRING 
--RGID-tumor $STRING 
```

BAM Input

```
--tumor-bam-input $PATH 
```

CRAM Input

```
--tumor-cram-input $PATH 
```
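
A wrapper script can choose among the input options above from the file name. The sketch below assumes that a FASTQ list is a `.csv` file and that BAM and CRAM files carry their usual extensions; the function name `tumor_input_args` is hypothetical. Paired FASTQ input is left out because it needs four options rather than one path.

```shell
# Sketch: print the tumor input options that match a given input file.
# $1 = input path, $2 = sample ID (only used for FASTQ list input)
tumor_input_args() {
    case "$1" in
        *.csv)  printf '%s\n' "--tumor-fastq-list $1 --tumor-fastq-list-sample-id $2" ;;
        *.bam)  printf '%s\n' "--tumor-bam-input $1" ;;
        *.cram) printf '%s\n' "--tumor-cram-input $1" ;;
        *)      echo "unsupported input: $1" >&2; return 1 ;;
    esac
}
```

The printed options are meant for inspection or logging; paths that contain spaces would need to be passed as separate quoted arguments instead.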

### Mapping and Aligning

| Option                           | Description                                     |
| -------------------------------- | ----------------------------------------------- |
| `--enable-map-align true`        | Enables map & align (default=true). Optional with BAM/CRAM input, where it can be set to false. |
| `--enable-map-align-output true` | Optionally save the output BAM (default=false). |

### Amplicon post-alignment processing

| Option                                 | Description                                                                                                                                                   |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--amplicon-primer-length INT`         | If an alignment starts inside the primer region of the amplicon target, the alignment is assigned to the amplicon.                                            |
| `--amplicon-allow-partial-target true` | To detect deletion events that are close to the target boundaries, only one of the reads is required to start in the primer region (default=true).          |

For more detail on amplicon post-alignment processing, see [DRAGEN Amplicon Pipeline](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-amplicon-pipeline).

### Duplicate Marking

| Option                             | Description                                                                                                                                                                                               |
| ---------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--enable-duplicate-marking false` | The Amplicon Pipeline disables duplicate marking. In amplicon assays, fragments originate from a limited number of unique start and end positions, making conventional duplicate detection inappropriate. |

### SNV

| Option                                      | Description                                                                                                                                                                                                   |
| ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--vc-target-bed`                           | Limit variant calling to region of interest.                                                                                                                                                                  |
| `--vc-combine-phased-variants-distance INT` | Maximum distance in base pairs (BP) over which phased variants will be combined. Set to 0 to disable. Valid range is \[0; 15] BP (Default=2)                                                                  |
| `--vc-target-vaf $FLOAT`                    | The default is 0.03 (3%)                                                                                                                                                                                      |
| `--vc-af-call-threshold $FLOAT`             | If the AF filter is enabled using --vc-enable-af-filter=true, the option sets the allele frequency call threshold for nuclear chromosomes to emit a call in the VCF. The default value is 0.01.               |
| `--vc-af-filter-threshold $FLOAT`           | If the AF filter is enabled using --vc-enable-af-filter=true, the option sets the allele frequency filter threshold for nuclear chromosomes to mark emitted VCF calls as filtered. The default value is 0.05. |

For more detail on the small variant caller in somatic mode, see [Somatic Mode](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/small-variant-calling/somatic-mode).
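
The two AF thresholds in the table act in sequence: the call threshold decides whether a variant is emitted at all, and the filter threshold decides whether an emitted call is marked as filtered. The sketch below illustrates that reading with the documented defaults. It is not DRAGEN code; the function name `classify_af` is hypothetical, the treatment of values exactly at a threshold is an assumption, and other filters that DRAGEN applies are not modeled.

```shell
# Sketch: classify an allele frequency against the two AF thresholds.
# $1 = observed allele frequency
# $2 = call threshold   (defaults to 0.01, the documented default)
# $3 = filter threshold (defaults to 0.05, the documented default)
classify_af() {
    awk -v af="$1" -v call="${2:-0.01}" -v filt="${3:-0.05}" 'BEGIN {
        # Adding 0 forces numeric comparison in awk.
        if (af + 0 < call + 0)      print "not emitted"
        else if (af + 0 < filt + 0) print "emitted, filtered"
        else                        print "emitted"
    }'
}
```

With the defaults, `classify_af 0.03` reports an emitted but filtered call, which is why the filter threshold matters when the target VAF is as low as 3%.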

### CNV

Amplicon CNV requires PON input. In PON mode, the DRAGEN CNV Pipeline is broken down into two distinct stages. The target counts stage is performed on each sample (case and normals), to bin the alignments. The normalization and call detection stage is then performed with the case sample against the panel of normals to determine the events.

| Option                                 | Description                                                                                                                                                                                                                                                                                                       |
| -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-segmentation-mode $SEG_MODE`    | Option to override the default segmentation algorithm. By default, `bed` is used for standard panels and `hslm` for Pillar panels with a pre-built PON.                                                                                                                                                           |
| `--amplicon-cnv-use-default-pon false` | We recommend including in-run normal samples—matched in sample type and library preparation—in the same sequencing run to serve as the PON. If generating a custom PON is not feasible, for Pillar panels, the pre-packaged panel-specific PON can be used as a fallback. To enable this, set the option to true. |
| `--cnv-segmentation-bed $PATH`         | You can bypass segmentation by specifying a cnv-segmentation-bed and using cnv-segmentation-mode=bed. If bed segmentation mode is used, the segmentation bed is auto-generated from amplicon target bed by default                                                                                                |
| `--cnv-filter-qual $NUM`               | QUAL value at which to hard filter CNV VCF. You can adjust CNV filter quality thresholds according to the your validation study                                                                                                                                                                                   |

### Annotation

For instructions on how to download the Nirvana annotation database, please refer to [Nirvana](https://help.dragen.illumina.com/product-guides/dragen-v4.5/nirvana)

### MSI

Microsatellite sites and PON files can be downloaded here: [Product Files](https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html).

For more information on MSI calling, see [MSI](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/biomarkers/biomarker-msi)

### SV

| Option                                | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--sv-call-regions-bed`               | Specifies a BED file containing the set of regions to call. Defaults to the amplicon target BED. |
| `--enable-variant-deduplication true` | Relevant when both the SV and SNV callers are enabled in somatic workflows. Can increase sensitivity and prevent duplicated variants within genes such as FLT3 and KMT2A. Filters out of the structural variant VCF all small indels that also appear, and pass, in the small variant VCF. DRAGEN creates a new VCF containing the SV VCF variants that do not match any variant in the SNV VCF. The deduplicated SV VCF has the prefix passed by `--output-file-prefix`, followed by `sv.small_indel_dedup`. DRAGEN normalizes variants by trimming and left shifting by up to 500 bases. |
| `--sv-systematic-noise $BEDPE`        | Optional systematic noise BEDPE file containing the set of noisy paired regions.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |

For more information, see [Structural Variant Calling](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/sv-calling).

## Resource Files

DRAGEN requires resource files for components such as SNV, SV, and CNV. The following notes provide references for downloading these files or generating them for custom workflows or assays.

### SNV Systematic Noise

Systematic noise files are considered essential in tumor-only workflows. They are also recommended for tumor-normal workflows.

DRAGEN has pre-built systematic noise files for WGS, WES and for Pillar Amplicons. For high sensitivity applications, including panels or clinical WES/WGS assays, it is recommended to create your own systematic noise file as described under Custom.

#### Prebuilt

DRAGEN has pre-built systematic noise files for Pillar panels. These files are packaged directly with DRAGEN.

#### Custom

This section describes how to generate systematic noise files from phenotypically normal (non-tumor) samples to optimize the performance of a specific assay. For best accuracy, the normal samples should ideally closely match the sequencer, sample type, library prep, and coverage of the tumor samples of interest. It is typically recommended to use 30-70 normals when building a noise file, but fewer can be used.

**Step 1. Run DRAGEN somatic tumor-only on each of approximately 30-70 normal samples.**

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM 
--tumor-fastq-list-sample-id $STRING 
--vc-detect-systematic-noise true 
--enable-dna-amplicon true 
--amplicon-target-bed $PATH 
--vc-enable-germline-tagging true 
--variant-annotation-data $NIRVANA_PATH 
```

For WES and WGS pipelines, gather the full paths to the small variant hard-filtered VCFs (not gVCFs) from step 1 and create a list file `${VCF_LIST}` with one file path per line.
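
Collecting the step 1 VCF paths can be scripted. The sketch below assumes that the per-sample outputs sit under one directory and that the hard-filtered VCFs are named `<prefix>.hard-filtered.vcf.gz`; both the naming pattern and the function name `build_vcf_list` are assumptions to adapt to your own output layout.

```shell
# Sketch: write one hard-filtered VCF path per line into a list file.
# $1 = directory holding the step 1 outputs (pass an absolute path so
#      that the list contains full paths)
# $2 = list file to write, i.e. the file later used as ${VCF_LIST}
build_vcf_list() {
    # The pattern ends in .vcf.gz, so *.hard-filtered.gvcf.gz is skipped.
    find "$1" -type f -name '*.hard-filtered.vcf.gz' | sort > "$2"
}
```

Sorting keeps the list stable between runs, which makes it easier to confirm that the expected number of normals went into the noise file.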

**Step 2. Generate the final noise file.**

This step generates a BED file containing mean and max noise estimates per position, which can be used directly during variant calling (argument `--vc-systematic-noise`). The distribution of noise per position can also be plotted to identify particularly noisy positions, which can then be investigated (e.g. by modifying assay settings or DRAGEN settings) or blocklisted.

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--build-sys-noise-vcfs-list ${VCF_LIST} 
```

The SNV systematic noise files can also be built in the cloud using the [DRAGEN Baseline Builder App on BaseSpace](https://www.illumina.com/products/by-type/informatics-products/basespace-sequence-hub/apps/dragen-baseline-builder.html) or the DRAGEN Systematic Noise File Builder Pipeline on [ICA](https://www.illumina.com/products/by-type/informatics-products/connected-analytics.html).

### SV Systematic Noise

SV systematic noise files have not been tested with WES, enrichment, or amplicon panels. Their use is considered experimental for these assays.

#### Custom

Custom systematic noise files can be generated for WES, Panels or Amplicon. For best accuracy the normal samples should ideally closely match the sequencer, sample type, library prep and coverage of the tumor samples of interest. It is typically recommended to use 30 - 100 normals when building a noise file, but fewer can be used.

**Step 1. Run DRAGEN somatic tumor-only on normal samples with `--sv-detect-systematic-noise` set to true to generate VCF output per normal sample.**

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM 
--tumor-fastq-list-sample-id $STRING 
--enable-dna-amplicon true 
--amplicon-target-bed $PATH 
--sv-detect-systematic-noise true 
```

**Step 2. Build the BEDPE file using input VCFs from previous step.**

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--sv-build-systematic-noise-vcfs-list $VCF_LIST    #one VCF per line 
```

Systematic noise BEDPE files can also be built in the cloud using the [DRAGEN Baseline Builder App on BaseSpace](https://www.illumina.com/products/by-type/informatics-products/basespace-sequence-hub/apps/dragen-baseline-builder.html) or the DRAGEN Systematic Noise File Builder Pipeline on [ICA](https://www.illumina.com/products/by-type/informatics-products/connected-analytics.html).

### CNV Panel of Normals (PON)

For CNV PON requirements and generation options see [CNV Preprocessing | Panel of Normals](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/cnv-overview/cnv-reference#panel-of-normals).

If a matched normal is available it is recommended to include it in the PON.

**Step 1. Generate CNV target counts of individual normal samples.**

Any samples that should not be included in the final PON file can be excluded from this step. Any options used for CNV target counts generation (BED file, GC Bias Correction, etc.) should be matched when processing the case samples.

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM 
--tumor-fastq-list-sample-id $STRING 
# CNV 
--enable-cnv true 
--enable-dna-amplicon true 
--amplicon-target-bed $PATH 
```

**Step 2. CNV combined counts file generation.**

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--enable-cnv true 
--cnv-generate-combined-counts true 
--cnv-normals-list $CNV_NORMALS_LIST 
```

`$CNV_NORMALS_LIST` is a text file with one line for each path to a CNV target counts file generated in step 1 (either `<output-file-prefix>.target.counts.gz` or `<output-file-prefix>.target.counts.gc-corrected.gz`). Individual target counts files are merged into a single `<output-file-prefix>.combined.counts.txt.gz` PON file in the output directory. The PON file is used for each case sample run of DRAGEN CNV using the `--cnv-combined-counts` option.
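
The normals list can be generated from the step 1 outputs rather than written by hand. This is a minimal sketch that relies on the two file name patterns given above; the function name `build_cnv_normals_list` and its third argument are hypothetical.

```shell
# Sketch: write one target counts path per line into a normals list.
# $1 = directory holding the step 1 outputs (absolute path recommended)
# $2 = list file to write, i.e. the file later used as $CNV_NORMALS_LIST
# $3 = "gc" to list the GC-corrected counts, anything else for raw counts
build_cnv_normals_list() {
    if [ "${3:-raw}" = "gc" ]; then
        pattern='*.target.counts.gc-corrected.gz'
    else
        pattern='*.target.counts.gz'
    fi
    find "$1" -type f -name "$pattern" | sort > "$2"
}
```

Listing either the raw or the GC-corrected files, but not a mix of both, keeps the PON consistent with the recommendation that counts options match between the normals and the case samples.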

### MSI baseline file (PON)

It is recommended to use a baseline file that matches the sample type (FF/FFPE), assay type (WGS/WES/Panel) and genome build (hg19/hg38) of the samples being analyzed. If matched normals are available, it is recommended to include them in the baselines.

For MSI baseline generation, see [Baseline microsatellite repeat distribution](https://help.dragen.illumina.com/product-guides/dragen-dna-pipeline/biomarkers/biomarker-msi#baseline-microsatellite-repeat-distribution). It is required that non-tumor inputs are used for baseline generation.

**Step 1. Generate MSI baselines for individual normal samples.**

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
--fastq-list $PATH                      #see 'Input Options' for FQ, BAM or CRAM 
--fastq-list-sample-id $STRING 
# MSI 
--enable-msi true 
--msi-generate-baseline true 
--msi-microsatellites-file $MICROSATELLITE_LIST    #see 'Product Files' for available files 
```

**Step 2. Combine MSI baselines to a single file.**

```
touch combined_msi_baseline.dist 
# For each normal sample: append its sample ID to every line and skip lines containing '#'.
sed "s/\$/ $SAMPLEID/" $OUTPUT/$PREFIX.microsat_normal.dist | grep -v '#' >> combined_msi_baseline.dist 
```

The combined MSI baseline file can then be used for each case sample run of DRAGEN MSI using `--msi-ref-normal-input combined_msi_baseline.dist`.
