DNA Germline WGS UMI

The DRAGEN recipe includes the recommended pipeline-specific commands. A DRAGEN recipe is a predefined set of analysis parameters and workflow settings tailored for a specific type of genomic analysis. Some default parameters are included for clarity and are marked with comments.

  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN pangenome hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging
--output-file-prefix $PREFIX 
# Inputs 
--fastq-list $PATH                      #see 'Input Options' for FQ, BAM or CRAM 
--fastq-list-sample-id $STRING 
# Mapper 
--enable-map-align true                 #optional with BAM/CRAM input 
--enable-map-align-output true          #optionally save the output BAM 
--enable-sort true                      #default=true 
# UMI 
--umi-enable true 
--umi-source STRING                     #Default='qname' 
--umi-library-type STRING               #e.g. random-duplex 
--umi-min-supporting-reads 1            #Default=2 
# Small variant caller 
--enable-variant-caller true 
# Annotation 
--variant-annotation-data PATH 
--enable-variant-annotation true 
# SV 
--enable-sv true 
# CNV 
--enable-cnv true 
--cnv-population-b-allele-vcf $POP_VCF  #optional to enable germline ASCN 
--cnv-enable-self-normalization true 
# HLA genotyper 
--enable-hla true 
# Targeted caller 
--enable-targeted true                  #Targeted 
# Star allele 
--enable-star-allele true 
# PGX 
--enable-pgx true                       #PGX 
# Short tandem repeats 
--repeat-genotype-enable true 
# Multi-Region Joint Detection (MRJD) 
--enable-mrjd true 
--mrjd-enable-high-sensitivity-mode true
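
The option list above is a single command, so in a shell the options have to be passed together on one invocation. The following is a minimal sketch, not an official wrapper: it assembles a subset of the recipe (mapping, UMI collapsing, and small variant calling) and only prints the resulting command for review. The function name `print_dragen_umi_cmd` and all values passed to it are hypothetical placeholders; `random-duplex` is used only because the recipe lists it as an example library type.

```sh
# Dry-run sketch: print the core UMI germline command instead of executing it.
# Arguments (all placeholders chosen by the caller):
#   $1 DRAGEN version   $2 reference directory   $3 output directory
#   $4 output prefix    $5 FASTQ list            $6 sample ID
print_dragen_umi_cmd() {
  version=$1; ref_dir=$2; out_dir=$3
  prefix=$4; fastq_list=$5; sample_id=$6

  # "set --" collects the options as separate arguments of this function.
  set -- \
    "/opt/dragen/$version/bin/dragen" \
    --ref-dir "$ref_dir" \
    --output-directory "$out_dir" \
    --output-file-prefix "$prefix" \
    --fastq-list "$fastq_list" \
    --fastq-list-sample-id "$sample_id" \
    --enable-map-align true \
    --umi-enable true \
    --umi-library-type random-duplex \
    --umi-min-supporting-reads 1 \
    --enable-variant-caller true

  # Join the arguments with spaces for display only.
  printf '%s\n' "$*"
}

# Example call with hypothetical paths.
print_dragen_umi_cmd 4.x /ref/pangenome /out sample1 /data/fastq_list.csv sample1
```

Printing first makes it easy to check the option set before committing to a long run. The remaining callers from the recipe (SV, CNV, HLA, and so on) can be appended to the same argument list in the same way.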

Notes and additional options

Hashtable

For DRAGEN germline runs, it is recommended to use the pangenome hashtable.

See: Product Files

Input options

DRAGEN accepts the following input sources: FASTQ list, FASTQ, BAM, or CRAM.

FQ list Input

--fastq-list $PATH 
--fastq-list-sample-id $STRING

FQ Input

--fastq-file1 $PATH 
--fastq-file2 $PATH 
--RGSM $STRING 
--RGID $STRING

BAM Input

--bam-input $PATH

CRAM Input

--cram-input $PATH
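
When a script handles more than one input type, the matching option can be chosen from the file name. The sketch below is illustrative only: the helper name `dragen_input_args` is hypothetical, and mapping a `.csv` extension to a FASTQ list is an assumption of this example. Plain FASTQ input is left out because it needs two files plus `--RGSM` and `--RGID`.

```sh
# Print the DRAGEN input options for one input file, chosen by extension.
#   $1 input path   $2 sample ID (used only for FASTQ lists)
# The extension-to-option mapping is an assumption made for this sketch.
dragen_input_args() {
  input=$1; sample=${2:-}
  case "$input" in
    *.csv)
      printf '%s\n' "--fastq-list $input --fastq-list-sample-id $sample" ;;
    *.bam)
      printf '%s\n' "--bam-input $input" ;;
    *.cram)
      printf '%s\n' "--cram-input $input" ;;
    *)
      echo "unsupported input: $input" >&2
      return 1 ;;
  esac
}

# Example calls with hypothetical file names.
dragen_input_args fastq_list.csv sample1
dragen_input_args sample1.cram
```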

Mapping and Aligning

Option

Description

--enable-map-align true

Optionally disable map & align (default=true).

--enable-map-align-output true

Optionally save the output BAM (default=false).

--Aligner.clip-pe-overhang 2

Clean up any unwanted UMI indexes. Only use when reads contain UMIs, but UMI collapsing was not run.

UMI

Option

Description

--umi-source STRING

Specify the input type for the UMI sequence. Options: qname, fastq, bamtag.

--umi-library-type STRING

Set the batch option that selects the type of UMI correction. Options: random-duplex, random-simplex, nonrandom-duplex.

--umi-nonrandom-whitelist $PATH

If UMI is nonrandom, either a whitelist or correction table is required. The whitelist includes a valid UMI sequence per line.

--umi-correction-table $PATH

If UMI is nonrandom, either a whitelist or correction table is required. The correction table defaults to the table used by TruSight Oncology: <INSTALL_PATH>/resources/umi/umi_correction_table.txt.gz.

--umi-min-supporting-reads INT

Specify the number of matching UMI input reads required to generate a consensus read. Any family with insufficient supporting reads is discarded. The default is 2.

--umi-metrics-interval-file $BED

Target region in BED format.

--umi-emit-multiplicity both

Set the consensus sequence type to output. DRAGEN UMI allows collapsing duplex sequences from the two strands of the original molecules. For more information, see Merge Duplex UMIs.

--umi-start-mask-length INT

Number of additional bases to ignore from the start of the read. The default is 0. To reduce false positives, optionally set to 1.

--umi-end-mask-length INT

Number of additional bases to ignore from the end of the read. The default is 0. To reduce false positives, optionally set to 3.

For more information see: UMI Options.
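
The dependency between the library type and the nonrandom UMI files can be checked before a run. The sketch below uses a hypothetical helper, `umi_args`, and takes the "required" wording literally: it rejects a nonrandom library when neither file is given. Because the table also mentions a default correction table, DRAGEN itself may be more permissive than this check.

```sh
# Print the UMI options for a library type, checking the nonrandom requirement.
#   $1 library type   $2 optional whitelist path   $3 optional correction table path
umi_args() {
  lib_type=$1; whitelist=${2:-}; table=${3:-}
  base="--umi-enable true --umi-library-type $lib_type"
  case "$lib_type" in
    random-duplex|random-simplex)
      printf '%s\n' "$base" ;;
    nonrandom-duplex)
      if [ -n "$whitelist" ]; then
        printf '%s\n' "$base --umi-nonrandom-whitelist $whitelist"
      elif [ -n "$table" ]; then
        printf '%s\n' "$base --umi-correction-table $table"
      else
        # Assumption of this sketch: treat a missing file as an error.
        echo "nonrandom UMIs need a whitelist or a correction table" >&2
        return 1
      fi ;;
    *)
      echo "unknown UMI library type: $lib_type" >&2
      return 1 ;;
  esac
}

# Example calls with hypothetical file names.
umi_args random-duplex
umi_args nonrandom-duplex umi_whitelist.txt
```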

SNV

The DRAGEN SNV variant caller (VC) employs machine-learning-based variant recalibration (DRAGEN-ML). It processes read and other contextual evidence to remove false positives, recover false negatives, and reduce zygosity errors. No additional setup is required: DRAGEN-ML is enabled by default when running the germline SNV VC on hg19 or hg38.

Note that we do not recommend changing the default QUAL thresholds of 3 for DRAGEN-ML and 10 for DRAGEN without ML. These values differ from each other because DRAGEN-ML improves the calibration of QUAL scores, leading to a change in the scoring range.

Option

Description

--vc-target-bed

Limit variant calling to region of interest.

--vc-combine-phased-variants-distance INT

Maximum distance in base pairs (BP) over which phased variants are combined. Set to 0 to disable. Valid range is [0; 15] BP (default=2).

--vc-emit-ref-confidence GVCF

To enable gVCF output.

--vc-enable-vcf-output

To enable VCF file output during a gVCF run, set to true. The default value is false.
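
The two output options above combine into three common cases: VCF only, gVCF only, or both. A minimal sketch follows, assuming a hypothetical helper named `vc_output_args`. It also assumes that no extra option is needed for plain VCF output.

```sh
# Print small variant caller options for the requested output type.
#   vcf  : caller enabled, no extra output options (assumed to produce a VCF)
#   gvcf : gVCF output only
#   both : gVCF output plus a VCF file
vc_output_args() {
  base="--enable-variant-caller true"
  case "$1" in
    vcf)  printf '%s\n' "$base" ;;
    gvcf) printf '%s\n' "$base --vc-emit-ref-confidence GVCF" ;;
    both) printf '%s\n' "$base --vc-emit-ref-confidence GVCF --vc-enable-vcf-output true" ;;
    *)
      echo "unknown output type: $1" >&2
      return 1 ;;
  esac
}

# Example call.
vc_output_args both
```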

For more detail on the small variant caller in somatic mode, please refer to Somatic Mode.

Annotation

For instructions on how to download the Nirvana annotation database, please refer to Nirvana.

HLA

Option

Description

--enable-hla

Enable HLA typer (this setting by default will only genotype class 1 genes)

--hla-as-filter-min-threshold

Internal option to set the minimum alignment score threshold. The default is 59 and works for WES and WGS. Set to 29 for panels.

--hla-as-filter-ratio-threshold

Minimum alignment score of a read mate to be considered. The default is 0.67 and works for WES and WGS. Set to 0.85 for panels.

--hla-enable-class-2

Extend genotyping to HLA class 2 genes (default=true).
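
The alignment score thresholds depend on the assay, so a small lookup keeps the two values in step. This is a sketch under stated assumptions: the helper name `hla_args` and the assay labels `wgs`, `wes`, and `panel` are hypothetical, and the values are the ones quoted in the table. For WGS and WES they equal the defaults and are spelled out only for clarity.

```sh
# Print HLA options with alignment score thresholds for the given assay type.
#   wgs | wes : defaults quoted in the table (59 and 0.67)
#   panel     : panel settings quoted in the table (29 and 0.85)
hla_args() {
  case "$1" in
    wgs|wes) min_as=59; ratio=0.67 ;;
    panel)   min_as=29; ratio=0.85 ;;
    *)
      echo "unknown assay type: $1" >&2
      return 1 ;;
  esac
  printf '%s\n' "--enable-hla true --hla-as-filter-min-threshold $min_as --hla-as-filter-ratio-threshold $ratio"
}

# Example call.
hla_args panel
```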

CNV

Option

Description

--cnv-enable-gcbias-correction true

Enable or disable GC bias correction when generating target counts.

--cnv-segmentation-mode $SEG_MODE

Option to override the default segmentation algorithm. Defaults include slm for germline WGS, aslm for somatic WGS, and hslm for targeted analysis.

--cnv-population-b-allele-vcf $POP_VCF

Specify a population SNP VCF. This option is available for both the germline and the somatic workflows. In germline it is only supported for WGS. In somatic, it can be used when a matched normal sample is not available and analysis must be performed in tumor-only mode.

--cnv-enable-cyto-output true

Enable Cytogenetics-compatible output (default true), see Cytogenetics Modality. Only available with the Germline ASCN caller.

--cnv-enable-mosaic-calling true

Enable MOSAIC-calling mode (default true). Only available with the Germline ASCN caller.

For more information, see CNV Calling.
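
In the recipe, the population SNP VCF is the optional part of the germline CNV settings. The sketch below shows one way to add it only when a file is supplied. The helper name `cnv_args` and the example file name are hypothetical.

```sh
# Print germline CNV options; add the population SNP VCF only when one is given.
#   $1 optional path to a population SNP VCF
cnv_args() {
  pop_vcf=${1:-}
  args="--enable-cnv true --cnv-enable-self-normalization true"
  if [ -n "$pop_vcf" ]; then
    args="$args --cnv-population-b-allele-vcf $pop_vcf"
  fi
  printf '%s\n' "$args"
}

# Example calls: without and with a hypothetical population VCF.
cnv_args
cnv_args population_snps.vcf.gz
```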

Multi-Region Joint Detection (MRJD)

Option

Description

--enable-mrjd

If set to true, MRJD is enabled for the DRAGEN pipeline.

--mrjd-enable-high-sensitivity-mode

If set to true, MRJD high sensitivity mode is enabled for the DRAGEN pipeline. See the MRJD section in the user guide for information on variant types reported in MRJD default mode and high-sensitivity mode (default=false).

For further details, refer to MRJD.
