Command Line Options

This section provides information on all the DRAGEN command-line options, including the name used in the configuration file, the command-line equivalent, a description, and the range of values.

NOTE After upgrading to a new version of DRAGEN, it is recommended to first run with the default DRAGEN options, including all filtering options, and then add any specific filters only if needed.

General Software Options

The following options are in the default section of the configuration file. The default section is at the top of the configuration file and does not have a section name (eg, [Aligner]) associated with it. Some mandatory fields must be specified on the command line and are not present in configuration files.
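The layout described above (an unnamed default section at the top, followed by named sections such as [Aligner]) can be read with a standard INI parser if the unnamed block is first wrapped in a placeholder section. The following Python sketch illustrates the idea under stated assumptions: the `name = value` line syntax, the sample file contents, and the placeholder name `general` are all hypothetical and are not taken from the DRAGEN documentation.

```python
import configparser

# Hypothetical file contents; the "name = value" syntax is an assumption.
EXAMPLE_CONFIG = """\
enable-sort = true
output-format = BAM

[Aligner]
global = 0
mapq-max = 60
"""

def parse_config(text):
    """Parse INI-like text whose first options precede any [Section] header."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names such as RGSM case-sensitive
    # configparser rejects options that appear before a section header,
    # so wrap the unnamed block in a placeholder section called "general".
    parser.read_string("[general]\n" + text)
    return {name: dict(parser[name]) for name in parser.sections()}

config = parse_config(EXAMPLE_CONFIG)
print(config["general"]["output-format"])
print(config["Aligner"]["mapq-max"])
```

All values come back as strings, so a caller would still need to convert true/false and numeric values before using them.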

Name

Description

Command Line Equivalent

Range

append-read-index-to-name

By default, DRAGEN names both mate ends of pairs the same. When set to true, DRAGEN appends /1 and /2 to the two ends.

--append-read-index-to-name

true/false

aws-s3-region

Specifies the geographical region of AWS S3 buckets.

--aws-s3-region

bam-input

Specifies aligned BAM file for input to the DRAGEN variant caller.

-b, --bam-input

bam-list

Specifies CSV file that contains a list of BAM files to process.

--bam-list

bcl-conversion-only

Performs Illumina BCL conversion to FASTQ format.

--bcl-conversion-only

true/false

bcl-input-directory

Inputs BCL directory for BCL conversion.

--bcl-input-directory

bcl-only-lane

For BCL input, the option converts only specified lane number. By default, all lanes are converted.

--bcl-only-lane

1–8

sample-sheet

For BCL input, the option sets the path to SampleSheet.csv file. The default location is the BCL root directory.

--sample-sheet

strict-mode

For BCL input, the option cancels analysis if any files are missing. The default value is false.

--strict-mode

true/false

first-tile-only

Converts only the first tile of each lane during BCL conversion. Use for testing or debugging.

--first-tile-only

true/false

run-info

Sets the path to RunInfo.xml file. The default is <flow cell>/RunInfo.xml.

--run-info

bcl-sampleproject-subdirectories

For BCL conversion, the option outputs to subdirectories based on sample sheet Sample_Project column.

--bcl-sampleproject-subdirectories

no-lane-splitting

Disables splitting output FASTQ files by lane. The default value is false.

--no-lane-splitting

true/false

bcl-only-matched-reads

Specifies if unmapped reads are output to files marked as Undetermined. The default value is false.

--bcl-only-matched-reads

true/false

bcl-use-hw

If set to false, the option prevents DRAGEN FPGA acceleration during BCL conversion. The default value is true.

--bcl-use-hw

true/false

bcl-num-parallel-tiles

Specifies the number of tiles to process in parallel. The default value is dynamically determined.

--bcl-num-parallel-tiles

bcl-num-conversion-threads

Specifies the number of conversion threads per tile. The default value is dynamically determined.

--bcl-num-conversion-threads

bcl-num-compression-threads

Specifies the number of CPU threads for output fastq.gz compression. The default value is dynamically determined.

--bcl-num-compression-threads

bcl-num-decompression-threads

Specifies the number of CPU threads for BCL input decompression. The default value is dynamically determined.

--bcl-num-decompression-threads

shared-thread-odirect-output

Uses alternative shared-thread ODIRECT file output. The default value is false.

--shared-thread-odirect-output

true/false

build-hash-table

Generates a reference hash table.

--build-hash-table

true/false

cram-input

Specifies the CRAM file input for the variant caller.

--cram-input

cram-list

Specifies CSV file that contains a list of CRAM files to process.

--cram-list

dbsnp

Sets the path to the variant annotation database VCF (or *.vcf.gz) file.

--dbsnp

enable-auto-multifile

Imports subsequent segments of the *_001.{dbam,fastq} files.

--enable-auto-multifile

true/false

enable-bam-indexing

Enables generation of a BAI index file.

--enable-bam-indexing

true/false

enable-cram-indexing

Enables generation of a CRAI index file.

--enable-cram-indexing

true/false

enable-cnv

Enables copy number variant (CNV) calling.

--enable-cnv

true/false

enable-duplicate-marking

Enables the flagging of duplicate output alignment records.

--enable-duplicate-marking

true/false

enable-map-align-output

Enables saving the output from the map/align stage. If only running map/align, the default value is true. If running the variant caller, the default value is false.

--enable-map-align-output

true/false

enable-methylation-calling

Automatically adds tags related to methylation and outputs a single BAM for methylation protocols.

--enable-methylation-calling

true/false

enable-sampling

Automatically detects paired-end parameters by running a sample through the mapper/aligner.

--enable-sampling

true/false

enable-sort

Enables sorting after mapping/alignment.

--enable-sort

true/false

enable-variant-caller

Enables the variant caller. The default value is false.

--enable-variant-caller

true/false

enable-variant-deduplication

Enables variant deduplication. The default value is false.

--enable-variant-deduplication

true/false

enable-vcf-compression

Enables compression of VCF output files. The default value is true.

--enable-vcf-compression

true/false

enable-vcf-indexing

Outputs a *.tbi index file in addition to the output VCF/gVCF. The default is true.

--enable-vcf-indexing

true/false

fastq-file1

Specifies FASTQ file to input to the DRAGEN pipeline. Gzipped format can be used.

-1, --fastq-file1

fastq-file2

Specifies second FASTQ file with paired-end reads to input.

-2, --fastq-file2

fastq-list

Specifies CSV file that contains a list of FASTQ files to process.

--fastq-list

fastq-list-sample-id

If the RGSM entry matches the given Sample ID parameter for fastq-list.csv input, the option processes the entry.

--fastq-list-sample-id

fastq-list-all-samples

If true, process all samples in the fastq-list file, even when there are multiple RGSM (Sample ID) values.

--fastq-list-all-samples

true/false

fastq-n-quality

Specifies the base call quality to output for N bases. The value is automatically added to fastq-offset for all output N bases.

--fastq-n-quality

0–255

fastq-offset

Sets the FASTQ quality offset value.

--fastq-offset

filter-flags-from-output

Filters output alignments with any bits set in val present in the flags field. Hex and decimal values accepted.

--filter-flags-from-output

force

Forces overwrite of existing output file.

-f

force-load-reference

Forces loading of the reference and hash tables before starting the DRAGEN pipeline.

-l

generate-md-tags

Generates MD tags with alignment output records. The default value is false.

--generate-md-tags

true/false

generate-sa-tags

Generates SA:Z tags for records that have chimeric or supplemental alignments.

--generate-sa-tags

true/false

generate-zs-tags

Generates ZS tags for alignment output records. The default value is false.

--generate-zs-tags

true/false

ht-alt-liftover

Specifies the SAM-format liftover file of alternate contigs in the reference.

--ht-alt-liftover

ht-mask-bed

Specifies the BED file for base masking.

--ht-mask-bed

ht-allow-mask-and-liftover

Allows the hash table builder to run with both ht-alt-liftover and ht-mask-bed. The default value is false.

--ht-allow-mask-and-liftover

true/false

ht-build-rna-hashtable

Enables generation of RNA hash table. The default value is false.

--ht-build-rna-hashtable

true/false

ht-cost-coeff-seed-freq

Sets cost coefficient of extended seed frequency.

--ht-cost-coeff-seed-freq

ht-cost-coeff-seed-len

Sets cost coefficient of extended seed length.

--ht-cost-coeff-seed-len

ht-cost-penalty-incr

Sets cost penalty to incrementally extend a seed another step.

--ht-cost-penalty-incr

ht-cost-penalty

Sets cost penalty to extend a seed by any number of bases.

--ht-cost-penalty

ht-decoys

Specifies the path to a decoys file.

--ht-decoys

ht-max-dec-factor

Sets the maximum decimation factor for seed thinning.

--ht-max-dec-factor

ht-max-ext-incr

Sets the maximum bases to extend a seed by in one step.

--ht-max-ext-incr

ht-max-ext-seed-len

Specifies the maximum extended seed length.

--ht-max-ext-seed-len

ht-max-seed-freq

Sets the maximum allowed frequency for a seed match after extension attempts.

--ht-max-seed-freq

1–256

ht-max-table-chunks

Specifies the maximum ~1 GB thread table chunks in memory at one time.

--ht-max-table-chunks

ht-mem-limit

Specifies the memory limit (hash table + reference) in units (KB, MB, GB).

--ht-mem-limit

ht-methylated

Automatically generates C->T and G->A converted reference hash tables.

--ht-methylated

true/false

ht-num-threads

Sets maximum worker CPU threads for building hash table.

--ht-num-threads

ht-rand-hit-extend

Includes a random hit with each EXTEND record of the frequency record.

--ht-rand-hit-extend

ht-rand-hit-hifreq

Includes a random hit with each HIFREQ record.

--ht-rand-hit-hifreq

ht-ref-seed-interval

Specifies the number of positions per reference seed.

--ht-ref-seed-interval

ht-reference

Specifies the reference file in FASTA format used to build a hash table.

--ht-reference

ht-seed-len

Sets initial seed length to store in hash table.

--ht-seed-len

ht-size

Specifies the size of hash table in units (KB, MB, GB).

--ht-size

ht-soft-seed-freq-cap

Specifies the soft seed frequency cap for thinning.

--ht-soft-seed-freq-cap

ht-suppress-decoys

Suppresses the use of a decoys file when building a hash table.

--ht-suppress-decoys

ht-target-seed-freq

Sets the target seed frequency for seed extension.

--ht-target-seed-freq

input-qname-suffix-delimiter

Controls the delimiter used for append-read-index-to-name and for detecting matching pair names with BAM input.

--input-qname-suffix-delimiter

/ :

interleaved

Specifies that the input is a single FASTQ file containing interleaved paired-end reads.

-i

intermediate-results-dir

Specifies directory to store intermediate results in (eg, sort partitions).

--intermediate-results-dir

lic-no-print

Suppresses the license status message at the end of a run.

--lic-no-print

true/false

lic-server

Specifies the license server for cloud sites: http://<base64_use>:<base64_password>@

--lic-server

lic-instance-id-location

Overrides the default cloud instance ID location.

--lic-instance-id-location

lic-credentials

Specifies the path to the license credentials file, which specifies the license credentials and domain.

--lic-credentials

methylation-generate-cytosine-report

Generates a genome-wide cytosine methylation report.

--methylation-generate-cytosine-report

true/false

methylation-generate-mbias-report

Generates a per system cycle methylation bias report.

--methylation-generate-mbias-report

true/false

methylation-TAPS

If input assays are generated by TAPS, the option is set to true.

--methylation-TAPS

true/false

methylation-match-bismark

If true, the option matches bismark tags exactly, including bugs.

--methylation-match-bismark

true/false

methylation-protocol

Describes library protocol for methylation analysis.

--methylation-protocol

none
directional
nondirectional
directional-complement

num-threads

Specifies the number of processor threads to use.

-n, --num-threads

output-directory

Specifies the output directory.

--output-directory

output-file-prefix

Specifies the file name prefix to use for all files generated by the pipeline.

--output-file-prefix

output-format

Sets the format of the output file from the map/align stage. The following values are valid: BAM (the default), CRAM (lossless), SAM, or DBAM (a proprietary binary format).

--output-format

BAM/CRAM/SAM/DBAM

pair-by-name

Shuffles the order of BAM input records so paired-end mates are processed together.

--pair-by-name

pair-suffix-delimiter

Changes the delimiter character for suffixes.

--pair-suffix-delimiter

/ . :

preserve-bqsr-tags

Determines whether to preserve BI and BD flags from the input BAM file, which can cause problems with hard clipping.

--preserve-bqsr-tags

true/false

preserve-map-align-order

Produces output file that preserves original order of reads in the input file.

--preserve-map-align-order

true/false

qc-coverage-region-1

Generates coverage region report using bed file 1.

--qc-coverage-region-1

qc-coverage-region-2

Generates coverage region report using bed file 2.

--qc-coverage-region-2

qc-coverage-region-3

Generates coverage region report using bed file 3.

--qc-coverage-region-3

qc-coverage-reports-1

Describes the types of reports requested for qc-coverage-region-1.

--qc-coverage-reports-1

full_res/cov_report

qc-coverage-reports-2

Describes the types of reports requested for qc-coverage-region-2.

--qc-coverage-reports-2

full_res/cov_report

qc-coverage-reports-3

Describes the types of reports requested for qc-coverage-region-3.

--qc-coverage-reports-3

full_res/cov_report

qc-coverage-region-1-thresholds

Declares the thresholds to use in cov_report for qc-coverage-region-1.

--qc-coverage-region-1-thresholds

List of up to 11 numbers separated by commas

qc-coverage-region-2-thresholds

Declares the thresholds to use in cov_report for qc-coverage-region-2.

--qc-coverage-region-2-thresholds

List of up to 11 numbers separated by commas

qc-coverage-region-3-thresholds

Declares the thresholds to use in cov_report for qc-coverage-region-3.

--qc-coverage-region-3-thresholds

List of up to 11 numbers separated by commas

ref-dir

Specifies the directory containing the reference hash table. If the reference is not already loaded into the DRAGEN card, the option automatically loads the reference.

-r, --ref-dir

ref-sequence-filter

Outputs only reads mapping to the reference sequence.

--ref-sequence-filter

remove-duplicates

If true, the option removes duplicate alignment records instead of only flagging them.

--remove-duplicates

true/false

RGCN

Specifies the read group sequencing center name.

--RGCN

RGCN-tumor

Specifies the read group sequencing center name for tumor input.

--RGCN-tumor

RGDS

Provides the read group description.

--RGDS

RGDS-tumor

Provides the read group description for tumor input.

--RGDS-tumor

RGDT

Specifies the read group run date.

--RGDT

RGDT-tumor

Specifies the read group run date for tumor input.

--RGDT-tumor

RGID

Specifies read group ID.

--RGID

RGID-tumor

Specifies read group ID for tumor input.

--RGID-tumor

RGLB

Specifies the read group library.

--RGLB

RGLB-tumor

Specifies the read group library for tumor input.

--RGLB-tumor

RGPI

Specifies the read group predicted insert size.

--RGPI

RGPI-tumor

Specifies the read group predicted insert size for tumor input.

--RGPI-tumor

RGPL

Specifies the read group sequencing technology.

--RGPL

RGPL-tumor

Specifies the read group sequencing technology for tumor input.

--RGPL-tumor

RGPU

Specifies the read group platform unit.

--RGPU

RGPU-tumor

Specifies read group platform unit for tumor input.

--RGPU-tumor

RGSM

Specifies read group sample name.

--RGSM

RGSM-tumor

Specifies read group sample name for tumor input.

--RGSM-tumor

sample-size

Specifies number of reads to sample when enable-sampling is true.

--sample-size

sample-sex

Specifies the sex of the sample.

--sample-sex

strip-input-qname-suffixes

Determines whether to strip read-index suffixes (eg, /1 and /2) from input QNAMEs. If set to false, the option preserves entire name.

--strip-input-qname-suffixes

true/false

tumor-bam-input

Specifies aligned BAM file for the DRAGEN variant caller in somatic mode.

--tumor-bam-input

tumor-bam-list

Specifies CSV file that contains a list of BAM files for the mapper, aligner, and somatic variant caller.

--tumor-bam-list

tumor-cram-input

Specifies aligned CRAM file for the DRAGEN variant caller in somatic mode.

--tumor-cram-input

tumor-cram-list

Specifies a CSV file that contains a list of CRAM files for the mapper, aligner, and somatic variant caller.

--tumor-cram-list

tumor-fastq-list

Inputs a CSV file containing a list of FASTQ files for the mapper, aligner, and somatic variant caller.

--tumor-fastq-list

tumor-fastq-list-sample-id

Specifies the sample ID for the list of FASTQ files specified by tumor-fastq-list.

--tumor-fastq-list-sample-id

tumor-fastq1

Inputs FASTQ file for the DRAGEN pipeline using the variant caller in somatic mode. The input file can be gzipped.

--tumor-fastq1

tumor-fastq2

Inputs second FASTQ file. Reads are paired to tumor-fastq1 reads for the DRAGEN pipeline using the variant caller in somatic mode. The input file can be gzipped.

--tumor-fastq2

vd-eh-vcf

Inputs the DRAGEN repeats VCF file for variant deduplication. The input file can be gzipped.

--vd-eh-vcf

vd-output-match-log

Outputs a file that describes the variants that matched during deduplication. The default value is false.

--vd-output-match-log

true/false

vd-small-variant-vcf

Inputs small variant VCF file for variant deduplication. The input file can be gzipped.

--vd-small-variant-vcf

vd-sv-vcf

Inputs structural variant VCF for variant deduplication. The input file can be gzipped.

--vd-sv-vcf

verbose

Enables verbose output from DRAGEN.

-v

version

Prints the DRAGEN version and the hash table version, and then exits.

-V, --version
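The filter-flags-from-output entry in the table above says that an alignment is filtered when any bit set in the supplied value is also set in the record's flags field, and that the value can be given in hex or decimal. The Python sketch below shows that bit test. It is an illustration of the rule as described, not DRAGEN code: the function names are hypothetical, and the example bit values (0x4 for unmapped, 0x400 for duplicate) come from the SAM specification rather than from this document.

```python
def parse_flag_value(text):
    """Accept a decimal ("1028") or hex ("0x404") string."""
    return int(text, 0)  # base 0 infers the base from the prefix

def is_filtered(record_flags, filter_value):
    """True when any bit set in filter_value is also set in record_flags."""
    return (record_flags & filter_value) != 0

mask = parse_flag_value("0x404")  # unmapped (0x4) plus duplicate (0x400)
print(is_filtered(0x401, mask))   # duplicate bit is set in the record
print(is_filtered(0x63, mask))    # neither masked bit is set
```

The test uses "any bit" rather than "all bits", so a mask that combines several flags removes a record that matches even one of them.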

Mapper Options

The following options are in the [Mapper] section of the configuration file. For more detailed information on these options, see DNA Mapping.

Name

Description

Command Line Equivalent

Range

ann-sj-max-indel

Specifies maximum indel length to expect near an annotated splice junction.

--Mapper.ann-sj-max-indel

0–63

edit-chain-limit

For edit-mode 1 or 2, the option sets maximum seed chain length in a read to qualify for seed editing.

--Mapper.edit-chain-limit

edit-chain-limit >= 0

edit-mode

Controls when seed editing is used. The following values represent the different edit modes: 0 is no edits, 1 is chain length test, 2 is paired chain length test, 3 is full seed edits

--Mapper.edit-mode

0–3

edit-read-len

For edit-mode 1 or 2, controls the read length for edit-seed-num seed editing positions.

--Mapper.edit-read-len

edit-read-len > 0

edit-seed-num

For edit-mode 1 or 2, controls the requested number of seeds per read to allow editing on.

--Mapper.edit-seed-num

edit-seed-num >= 0

enable-map-align

Enables the mapper/aligner. The default value is true.

--enable-map-align

true/false

map-orientations

Restricts the orientation of read mapping to only forward in the reference genome or only reverse-complemented. The following values represent the different orientations: 0 is normal (paired-end inputs must use normal), 1 is no reverse complement (forward only), 2 is no forward (reverse-complemented only).

--Mapper.map-orientations

0–2

max-intron-bases

Specifies maximum intron length reported.

--Mapper.max-intron-bases

min-intron-bases

Specifies minimum reference deletion length reported as an intron.

--Mapper.min-intron-bases

seed-density

Controls requested density of seeds from reads queried in the hash table

--Mapper.seed-density

0 < seed-density < 1
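The ranges listed in the Mapper table can be checked before a run is launched. The Python sketch below transcribes those ranges into a lookup table and reports values that fall outside them. It is a hypothetical pre-flight helper, not part of DRAGEN. It assumes that seed-density must lie strictly between 0 and 1, and it treats options with no listed range (for example, max-intron-bases) as unconstrained.

```python
# Ranges transcribed from the Mapper table; None means no upper bound is listed.
MAPPER_RANGES = {
    "ann-sj-max-indel": (0, 63),
    "edit-chain-limit": (0, None),
    "edit-mode": (0, 3),
    "edit-seed-num": (0, None),
    "map-orientations": (0, 2),
}

def check_mapper_option(name, value):
    """Return an error string, or None if the value is acceptable."""
    if name == "seed-density":
        # Assumed to be an open interval: strictly between 0 and 1.
        return None if 0 < value < 1 else "seed-density must be between 0 and 1"
    if name == "edit-read-len":
        return None if value > 0 else "edit-read-len must be greater than 0"
    if name not in MAPPER_RANGES:
        return None  # no range is listed in the table
    low, high = MAPPER_RANGES[name]
    if value < low or (high is not None and value > high):
        return f"{name}={value} is outside the listed range"
    return None

for option, value in [("edit-mode", 4), ("seed-density", 0.5)]:
    print(option, value, check_mapper_option(option, value))
```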

Aligner Options

The following options are in the [Aligner] section of the configuration file. For more information, see DNA Aligning.

Name

Description

Command Line Equivalent

Value

aln-min-score

A signed integer that specifies the minimum acceptable alignment score to report and is the baseline for MAPQ. When using local alignments (global is 0), aln-min-score is computed by the host software as 22 * match-score. When using global alignments (global is 1), aln-min-score is set to -1000000. The host software computation can be overridden by setting aln-min-score in the configuration file.

--Aligner.aln-min-score

−2,147,483,648 to 2,147,483,647

clip-pe-overhang

When nonzero, clips 3' read ends overhanging their mate's 5' ends as aligned. Set 1 to soft-clip overhang, 2 to hard-clip.

--Aligner.clip-pe-overhang

0–2

dedup-min-qual

Specifies a minimum base quality for calculating read quality metric for deduplication.

--Aligner.dedup-min-qual

0–63

en-alt-hap-aln

Allows haplotype alignments to be output as supplementary.

--Aligner.en-alt-hap-aln

0–1

en-chimeric-aln

Allows chimeric alignments to be output as supplementary.

--Aligner.en-chimeric-aln

0–1

gap-ext-pen

Specifies the penalty for extending a gap.

--Aligner.gap-ext-pen

0–15

gap-open-pen

Specifies the penalty for opening a gap (ie, insertion or deletion).

--Aligner.gap-open-pen

0–127

global

Controls whether alignment is end-to-end in the read. The following values represent the different alignments: 0 is local alignment (Smith-Waterman), 1 is global alignment (Needleman-Wunsch).

--Aligner.global

0–1

hard-clips

Specifies alignments for hard clipping. The following values represent the different alignments: Bit 0 is primary, Bit 1 is supplementary, Bit 2 is secondary.

--Aligner.hard-clips

3 bits

map-orientations

Constrains orientations to accept forward-only, reverse-complement only, or any alignments. The following values represent the different orientations: 0 is any, 1 is forward only, 2 is reverse only.

--Aligner.map-orientations

0–2

mapq-max

Specifies ceiling on reported MAPQ. The default value is 60.

--Aligner.mapq-max

0–255

mapq-strict-sj

Specific to RNA. When set to 0, a higher MAPQ value is returned, expressing confidence that the alignment is at least partially correct. When set to 1, a lower MAPQ value is returned, expressing the splice junction ambiguity.

--mapq-strict-sj

0–1

match-n-score

A signed integer that specifies the score increment for matching where a read or reference base is N.

--Aligner.match-n-score

-16–15

match-score

Specifies the score increment for matching reference nucleotide.

--Aligner.match-score

When global = 0, match-score > 0; when global = 1, match-score >= 0

max-rescues

Specifies maximum rescue alignments per read pair. The default value is 10.

--max-rescues

0–1023

min-score-coeff

Sets adjustment to aln-min-score per read base.

--Aligner.min-score-coeff

-64–63.999

mismatch-pen

Defines the score penalty for a mismatch.

--Aligner.mismatch-pen

0–63

no-unclip-score

When set to 1, the option removes any unclipped bonus (unclip-score) contributing to an alignment from the alignment score before further processing.

--Aligner.no-unclip-score

0–1

no-unpaired

Determines if only properly paired alignments should be reported for paired reads.

--Aligner.no-unpaired

0–1

pe-max-penalty

Specifies the maximum pairing score penalty for unpaired or distant ends.

--Aligner.pe-max-penalty

0–255

pe-orientation

Specifies the expected paired-end orientation. The following values represent the different orientations: 0 is FR (default), 1 is RF, 2 is FF.

--Aligner.pe-orientation

0–2

rescue-sigmas

Sets deviations from the mean read length used for rescue scan radius. The default value is 2.5.

--Aligner.rescue-sigmas

sec-aligns

Restricts the maximum number of secondary (suboptimal) alignments to report per read.

--Aligner.sec-aligns

0–4095

sec-aligns-hard

If set to 1, forces the read to be unmapped when not all secondary alignments can be output.

--Aligner.sec-aligns-hard

0–1

sec-phred-delta

Controls which secondary alignments are emitted. Only secondary alignments within this Phred value of the primary are reported.

--Aligner.sec-phred-delta

0–255

sec-score-delta

Determines the pair score threshold below primary that secondary alignments are allowed.

--Aligner.sec-score-delta

supp-aligns

Restricts the maximum number of supplementary (chimeric) alignments to report per read.

--Aligner.supp-aligns

0–4095

supp-as-sec

Determines if supplementary alignments should be reported with secondary flag.

--Aligner.supp-as-sec

0–1

supp-min-score-adj

Specifies amount to increase minimum alignment score for supplementary alignments. The score is computed by host software as 8 * match-score for DNA. The default is 0 for RNA.

--Aligner.supp-min-score-adj

unclip-score

Specifies the score bonus for reaching the edge of the read.

--Aligner.unclip-score

0–127

unpaired-pen

Specifies the penalty for unpaired alignments, using Phred scale.

--Aligner.unpaired-pen

0–255
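Two of the Aligner entries above describe defaults that the host software derives from match-score: aln-min-score (22 * match-score for local alignment, -1000000 for global alignment) and supp-min-score-adj (8 * match-score for DNA, 0 for RNA). The Python sketch below restates those rules as functions so the arithmetic is explicit. The function names and the `match_score=1` example value are hypothetical, and the sketch reflects only what the table states, not the host software's actual implementation.

```python
def default_aln_min_score(global_mode, match_score):
    """Default aln-min-score as described in the Aligner table."""
    if global_mode == 1:        # global (end-to-end) alignment
        return -1000000
    return 22 * match_score     # local alignment

def default_supp_min_score_adj(match_score, is_rna):
    """Default supp-min-score-adj: 8 * match-score for DNA, 0 for RNA."""
    return 0 if is_rna else 8 * match_score

# Example with a hypothetical match-score of 1.
print(default_aln_min_score(global_mode=0, match_score=1))   # 22
print(default_aln_min_score(global_mode=1, match_score=1))   # -1000000
print(default_supp_min_score_adj(match_score=1, is_rna=False))  # 8
```

Setting aln-min-score explicitly in the configuration file replaces the computed value, as noted in the aln-min-score entry.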

If you disable automatic detection of insert-length statistics via the --enable-sampling option, you must override all the following options to specify the statistics. For more information, see Mean Insert Size Detection. These options are part of the [Aligner] section of the configuration file.

Option

Description

Command Line Equivalent

Value

pe-stat-mean-insert

Specifies the average template length.

--pe-stat-mean-insert

0–65535

pe-stat-mean-read-len

Specifies the average read length.

--pe-stat-mean-read-len

0–65535

pe-stat-quartiles-insert

Specifies a comma-delimited trio of numbers for the 25th, 50th, and 75th percentile template lengths.

--pe-stat-quartiles-insert

0–65535

pe-stat-stddev-insert

Specifies the standard deviation of template length distribution.

--pe-stat-stddev-insert

0–65535
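When sampling is disabled, the four statistics in the table above have to be supplied by hand. The Python sketch below shows one way they could be computed from a list of known template lengths and read lengths and turned into option strings. Several details are assumptions rather than documented behavior: the values are rounded to integers, the sample standard deviation is used, quartiles use the default method of Python's `statistics.quantiles`, and the `--name=value` form is borrowed from other examples in this section. The function name is hypothetical.

```python
import statistics

def pe_stat_options(template_lengths, read_lengths):
    """Build pe-stat option strings from observed lengths (illustrative only)."""
    # quantiles(n=4) returns the three cut points: 25th, 50th, 75th percentile.
    q25, q50, q75 = statistics.quantiles(template_lengths, n=4)
    values = {
        "pe-stat-mean-insert": round(statistics.mean(template_lengths)),
        "pe-stat-stddev-insert": round(statistics.stdev(template_lengths)),
        "pe-stat-quartiles-insert": ",".join(
            str(round(q)) for q in (q25, q50, q75)
        ),
        "pe-stat-mean-read-len": round(statistics.mean(read_lengths)),
    }
    options = ["--enable-sampling=false"]
    options += [f"--{name}={value}" for name, value in values.items()]
    return options

# Hypothetical measurements, in bases.
for option in pe_stat_options([300, 310, 320, 330, 340, 350, 360], [150, 150, 150]):
    print(option)
```

Each computed value should still be checked against the 0–65535 range listed in the table before it is passed to DRAGEN.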

Variant Caller Options

The following options are in the Variant Caller section of the configuration file. For more information on these options, see Variant Caller Options.

Name

Description

Command Line Equivalent

Value

dn-cnv-vcf

For de novo calling, filters joint structural variant VCF from the CNV calling step. If omitted, DRAGEN skips any checks with overlapping copy number variants.

--dn-cnv-vcf

dn-input-vcf

For de novo calling, filters joint small variant VCF from the de novo calling step.

--dn-input-vcf

dn-output-vcf

For de novo calling, specifies the file location for writing the filtered VCF file. If not specified, the input VCF is overwritten.

--dn-output-vcf

dn-sv-vcf

For de novo calling, filters the joint structural variant VCF file from the SV calling step. If omitted, DRAGEN skips any checks with overlapping structural variants.

--dn-sv-vcf

enable-joint-genotyping

To enable the joint genotyping caller, set to true.

--enable-joint-genotyping

true/false

enable-multi-sample-gvcf

Enables generation of a multisample gVCF file. If set to true, requires a combined gVCF file as input.

--enable-multi-sample-gvcf

true/false

enable-vlrd

Enables Virtual Long Read Detection.

--enable-vlrd

true/false

pedigree-file

Specifies the path to a pedigree file that describes the familial relationships between samples (specific to joint calling). Only pedigree files that contain trios are supported.

--pedigree-file

qc-snp-DeNovo-quality-threshold

Sets the threshold for counting and reporting de novo SNP variants.

--qc-snp-DeNovo-quality-threshold

qc-indel-DeNovo-quality-threshold

Sets the threshold for counting and reporting de novo INDEL variants.

--qc-indel-DeNovo-quality-threshold

variant

Specifies the path to a single gVCF file. You can use the --variant option multiple times to specify paths to multiple gVCF files. Use one file per line. Up to 500 gVCFs are supported.

--variant

variant-list

Specifies the path to a file containing a list of input gVCF files that need to be combined. Use one file per line.

--variant-list

vc-af-call-threshold

If the AF filter is enabled using --vc-enable-af-filter=true, the option sets the allele frequency call threshold for nuclear chromosomes to emit a call in the VCF. The default value is 0.01.

--vc-af-call-threshold

vc-af-filter-threshold

If the AF filter is enabled using --vc-enable-af-filter=true, the option sets the allele frequency filter threshold for nuclear chromosomes to mark emitted VCF calls as filtered. The default value is 0.05.

--vc-af-filter-threshold

vc-af-call-threshold-mito

If the AF filter is enabled using --vc-enable-af-filter-mito=true, the option sets the allele frequency call threshold to emit a call in the VCF for mitochondrial variant calling. The default value is 0.01.

--vc-af-call-threshold-mito

vc-af-filter-threshold-mito

If the AF filter is enabled using --vc-enable-af-filter-mito=true, the option sets the allele frequency filter threshold to mark emitted VCF calls as filtered for mitochondrial variant calling. The default value is 0.02.

--vc-af-filter-threshold-mito

vc-callability-normal-threshold

Specifies the normal sample coverage threshold for a site to be considered callable in the somatic callable regions report. The default is 5.

--vc-callability-normal-thresh

vc-callability-tumor-threshold

Specifies the tumor sample coverage threshold for a site to be considered callable in the somatic callable regions report. The default is 50.

--vc-callability-tumor-thresh

vc-clustered-event-penalty

SQ score penalty applied to phased clustered somatic events; set to 0 to disable the penalty. The default value is 4.0 for tumor-normal and 7.0 for tumor-only.

--vc-clustered-event-penalty

vc-decoy-contigs

Specifies the path to a comma-separated list of contigs to skip during variant calling.

--vc-decoy-contigs

vc-depth-annotation-threshold

Filters all non-PASS somatic alt variants with a depth below this threshold. The default value is 0 (no filtering).

--vc-depth-annotation-threshold

vc-depth-filter-threshold

Filters all somatic variants (alt or homref) with a depth below this threshold. The default value is 0 (no filtering).

--vc-depth-filter-threshold

vc-emit-ref-confidence

Enables base pair resolution gVCF generation or banded gVCF generation.

--vc-emit-ref-confidence

BP_RESOLUTION/GVCF

vc-enable-af-filter

Enables the allele frequency filter of nuclear chromosomes for somatic mode. The default value is false.

--vc-enable-af-filter

true/false

vc-enable-af-filter-mito

Enables the allele frequency filter for mitochondrial variant calling. The default value is true.

--vc-enable-af-filter-mito

true/false

vc-enable-baf

Enables B-allele frequency output. The default value is true.

--vc-enable-baf

true/false

vc-enable-decoy-contigs

Enables variant calls on decoy contigs. The default value is false.

--vc-enable-decoy-contigs

true/false

vc-enable-liquid-tumor-mode

Enables liquid tumor mode for tumor-normal analysis to account for tumor-in-normal contamination. The default value is false.

--vc-enable-liquid-tumor-mode

true/false

vc-enable-non-homref-normal-filter

Enables the nonhomref normal filter, which filters out somatic variants if the normal sample genotype is not homozygous reference. The default value is true.

--vc-enable-non-homref-normal-filter

true/false

vc-enable-orientation-bias-filter

Enables the orientation bias filter. The default value is false, which means the option is disabled.

--vc-enable-orientation-bias-filter

true/false

vc-enable-phasing

Enables variants to be phased when possible. The default value is true.

--vc-enable-phasing

true/false

vc-combine-phased-variants-distance

When the specified value is greater than 0, combines all phased variants in the phasing set that have a distance less than or equal to the provided value. The max allowed phasing distance is 15. The default value is 0, which disables the option.

--vc-combine-phased-variants-distance

0–15

vc-enable-roh

Enables the ROH caller and output. The default value is true.

--vc-enable-roh

true/false

vc-enable-triallelic-filter

Enables the multiallelic filter for somatic mode. The default value is false.

--vc-enable-triallelic-filter

true/false

vc-enable-non-primary-allelic-filter

Similar to vc-enable-triallelic-filter, but less aggressive. Keeps the allele with the highest alt AD at each position and filters only the rest. The default value is false. Not compatible with vc-enable-triallelic-filter.

--vc-enable-non-primary-allelic-filter

true/false

vc-enable-vcf-output

Enables VCF file output during a gVCF run. The default value is false.

--vc-enable-vcf-output

true/false

vc-enable-unequal-ntd-errors

Enables the Sample-specific SNV Error Estimation feature. The default value is true for somatic pipelines and false for germline pipelines.

--vc-enable-unequal-ntd-errors

true/false/auto

vc-enable-trimer-context

When enabled along with vc-enable-unequal-ntd-errors, DRAGEN uses trimer rather than monomer context to estimate SNV error rates. The default value is false, except when vc-enable-umi-liquid is enabled.

--vc-enable-trimer-context

true/false

vc-ntd-error-params

Params file for per-nucleotide error rate calibration.

--vc-ntd-error-params

*.snperror-sampler.log

vc-estimate-ntd-error

Overrides whether to run per-nucleotide (ntd) error rate estimation.

--vc-estimate-ntd-error

true/false

vc-forcegt-vcf

Forces genotyping for small variant calling. A file (*.vcf or *.vcf.gz) containing a list of small variants is required.

--vc-forcegt-vcf

*.vcf or *.vcf.gz file specifying the small variants to force genotype.

vc-gvcf-bands

Define bands for gVCF output. The default value is 1 10 20 30 40 60 80 for germline calling and 1 3 10 20 50 80 for somatic. If enable-multi-sample-gvcf is enabled, the default value is 5, 20, 60.

--vc-gvcf-bands

vc-gvcf-homref-lod

Sets the limit of detection for somatic homref calls. The default value is 0.05.

--vc-gvcf-homref-lod

vc-hard-filter

Uses a list of Boolean expressions to filter variant calls. The default expression is HardQUAL:all: QUAL < 10.4139;LowDepth:all: DP < 1

--vc-hard-filter

QD MQ FS MQRankSum ReadPosRankSum QUAL DP GQ

vc-homref-depth-filter-threshold

In gvcf mode, filters all somatic homref variants with a depth below this threshold. The default value is 3.

--vc-homref-depth-filter-threshold

vc-max-alternate-alleles

Specifies the maximum number of ALT alleles to output in a VCF or gVCF. The default value is 1000.

--vc-max-alternate-alleles

vc-max-reads-per-active-region

Specifies the maximum number of reads for an active region for downsampling. The default value is 10000.

--vc-max-reads-per-active-region

vc-max-reads-per-active-region-mito

Specifies the maximum number of reads for an active region of mitochondrial small variant calling. The default value is 40000.

--vc-max-reads-per-active-region-mito

vc-max-reads-per-raw-region

Specifies the maximum number of reads per raw region for downsampling. The default value is 30000.

--vc-max-reads-per-raw-region

vc-max-reads-per-raw-region-mito

Specifies the maximum number of reads covering a specified raw region of mitochondrial small variant calling. The default value is 40000.

--vc-max-reads-per-raw-region-mito

vc-min-base-qual

Specifies the minimum base quality to be considered in the active region detection of the small variant caller. The default value is 10.

--vc-min-base-qual

assembler-min-contig-qual

Specifies the minimum base quality to be considered for De Bruijn graph construction. The default value is 10.

--assembler-min-contig-qual

vc-min-tail-qual

Specifies the minimum base quality to trim consecutive bases on either end of a read. The default value is 10.

--vc-min-tail-qual

vc-min-call-qual

Specifies the minimum variant call quality for emitting a call. The default value is 3.

--vc-min-call-qual

vc-min-read-qual

Specifies the minimum read quality (MAPQ) to be considered for small variant calling. The default value is 1 for germline, 3 for somatic tumor-normal, and 20 for somatic tumor-only.

--vc-min-read-qual

vc-min-reads-per-start-pos

Specifies the minimum number of reads per start position for downsampling. The default value is 10.

--vc-min-reads-per-start-pos

vc-min-tumor-read-qual

Specifies the minimum tumor read quality (MAPQ) to be considered for variant calling.

--vc-min-tumor-read-qual

vc-orientation-bias-filter-artifacts

Specifies the artifact type to be filtered. An artifact, or an artifact and the reverse complement of the artifact, cannot be listed twice.

--vc-orientation-bias-filter-artifacts

C/T, G/T C/T, G/T, C/A

vc-output-variant-read-position

Enables outputting the variant read position in the INFO field. The default value is false.

--vc-output-variant-read-position

true/false

vc-override-tumor-pcr-params-with-normal

Ignores the tumor sample parameters and uses the normal sample parameters for analysis of both samples. The default value is true.

--vc-override-tumor-pcr-params-with-normal

true/false

vc-remove-all-soft-clips

If set to true, the variant caller does not use soft clips of reads to determine variants. The default value is false.

--vc-remove-all-soft-clips

true/false

vc-roh-blacklist-bed

If provided, the ROH caller ignores variants that are contained in any region in the block list BED.

--vc-roh-blacklist-bed

vc-sq-call-threshold

Sets the SQ call threshold to emit a call in the VCF. The default value is 3.0 for tumor-normal and 0.1 for tumor-only.

--vc-sq-call-threshold

vc-sq-filter-threshold

Sets the SQ filter threshold to mark calls as filtered in the VCF. The default value is 17.5 for tumor-normal and 3.0 for tumor-only.

--vc-sq-filter-threshold

vc-somatic-hotspots

Provides a file to override the default hotspots file.

--vc-somatic-hotspots

vc-use-somatic-hotspots

If set to false, disables the use of somatic hotspots.

--vc-use-somatic-hotspots

true/false

vc-hotspot-log10-prior-boost

Specifies the magnitude by which the prior probabilities of hotspot variants are boosted (default: 4.0).

--vc-hotspot-log10-prior-boost

vc-target-bed

Restricts processing of the small variant caller, target BED related coverage, and callability metrics to regions specified in a BED file.

--vc-target-bed

*.bed file

vc-target-bed-padding

Specifies the number of bases that the small variant caller uses to pad each target BED region. The default value is 0.

--vc-target-bed-padding

vc-target-coverage

Specifies the target coverage for downsampling. The default value is 500 for germline and 50 for somatic mode.

--vc-target-coverage

vc-target-coverage-mito

Specifies the maximum number of reads with a start position overlapping any given position for mitochondrial small variant calling. The default value is 40000.

--vc-target-coverage-mito

vc-target-vaf

Specifies an allele frequency above which haplotypes are considered by the caller as potentially appearing in the sample. The default value is 0.03.

--vc-target-vaf

[0, 1]

vc-tin-contam-tolerance

Sets the maximum tumor-in-normal contamination expected. Setting this to a nonzero value enables liquid tumor mode. If liquid tumor mode is enabled, the default value is 0.15. If liquid tumor mode is disabled, the default value is 0.

--vc-tin-contam-tolerance

vc-excluded-regions-bed

Somatic mode only: if provided, variants that overlap with the regions in the BED file are hard-filtered and marked as "excluded_regions" in the filter column.

--vc-excluded-regions-bed

*.bed file

vc-systematic-noise

Specifies a BED file with site-specific systematic noise level to calculate AQ score (systematic noise score).

--vc-systematic-noise

vc-systematic-noise-filter-threshold

Sets the AQ threshold for applying the systematic-noise filter. The default value is 10 for tumor-normal and 60 for tumor-only.

--vc-systematic-noise-filter-threshold

0–100

vc-systematic-noise-filter-threshold-in-hotspot

Sets the AQ threshold for applying the systematic-noise filter to hotspot variants. The default value is 10 for tumor-normal and 20 for tumor-only.

--vc-systematic-noise-filter-threshold-in-hotspot

0–100

vc-enable-germline-tagging

Enables germline variant tagging using population databases. The default value is false.

--vc-enable-germline-tagging

true/false

germline-tagging-db-threshold

The minimum alternative allele count in population database for a variant to be defined as germline. The default value is 50.

--germline-tagging-db-threshold

germline-tagging-pop-af-threshold

The minimum population allele frequency for a variant to be defined as germline. Once specified, this will ignore the input from --germline-tagging-db-threshold.

--germline-tagging-pop-af-threshold
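
The default vc-hard-filter expression listed above can be read as a semicolon-separated list of clauses, each made of a filter ID, a scope, and a condition. The following Python sketch parses that shape for illustration only. It assumes one comparison per clause and treats the second field (all in the default expression) as an opaque scope label. The function name parse_hard_filter is hypothetical and is not part of DRAGEN.

```python
import re

# Default expression as given in the vc-hard-filter description.
DEFAULT_HARD_FILTER = "HardQUAL:all: QUAL < 10.4139;LowDepth:all: DP < 1"

# Annotations listed in the Range column for vc-hard-filter.
ALLOWED_FIELDS = {"QD", "MQ", "FS", "MQRankSum", "ReadPosRankSum", "QUAL", "DP", "GQ"}

# Assumption: each clause holds a single "<field> <operator> <number>" comparison.
CONDITION = re.compile(r"^\s*(\w+)\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_hard_filter(expression):
    """Return (filter_id, scope, field, operator, threshold) for each clause."""
    parsed = []
    for clause in expression.split(";"):
        # Split into at most three parts: filter ID, scope, condition.
        filter_id, scope, condition = (part.strip() for part in clause.split(":", 2))
        match = CONDITION.match(condition)
        if match is None:
            raise ValueError(f"unsupported condition: {condition!r}")
        field, operator, threshold = match.groups()
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown annotation: {field}")
        parsed.append((filter_id, scope, field, operator, float(threshold)))
    return parsed
```

Parsing the default expression yields two clauses: HardQUAL (QUAL < 10.4139) and LowDepth (DP < 1).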

Nirvana Annotation Options

Name

Description

Command Line Equivalent

Value

enable-variant-annotation

Enable Nirvana variant annotation on the output vcf/gvcf files. The default is false.

--enable-variant-annotation

true/false

variant-annotation-data

Top-level directory containing the Nirvana data files. Downloadable at https://support.illumina.com/content/dam/illumina-support/help/Illumina_DRAGEN_Bio_IT_Platform_v3_7_1000000141465/Content/SW/Informatics/Dragen/Nirvana_DownloadData_fDG.htm

--variant-annotation-data

variant-annotation-assembly

Genome assembly to use for variant annotation.

--variant-annotation-assembly

GRCh37/GRCh38

Mutation Annotation Format (MAF) Conversion Options

Name

Description

Command Line Equivalent

Value

enable-maf-output

Enables Mutation Annotation Format (MAF) output. The default value is false.

--enable-maf-output

true/false

maf-transcript-source

Specifies desired transcript source for Mutation Annotation Format (MAF) output.

--maf-transcript-source

Refseq/Ensembl

maf-input-vcf

Specifies input VCF file for standalone Mutation Annotation Format (MAF) output.

--maf-input-vcf

*.hard-filtered.vcf.gz file

maf-input-json

Specifies input JSON file for standalone Mutation Annotation Format (MAF) output.

--maf-input-json

*.hard-filtered.vcf.annotated.json.gz file

maf-include-non-pass-variants

Enables output of all variants, including non-PASS variants. The default value is false.

--maf-include-non-pass-variants

true/false
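
As a minimal sketch, the annotation and MAF options above could be assembled and checked against their listed values before a run. The --name=value form follows the usage that appears elsewhere in this section (for example, --enable-map-align=true). The helper name annotation_args and the data directory path are hypothetical, and the sketch assumes MAF output is requested together with annotation.

```python
# Allowed values taken from the Value columns of the tables above.
ASSEMBLIES = {"GRCh37", "GRCh38"}
TRANSCRIPT_SOURCES = {"Refseq", "Ensembl"}


def annotation_args(data_dir, assembly, maf_transcript_source=None):
    """Build annotation (and optional MAF) arguments as --name=value strings."""
    if assembly not in ASSEMBLIES:
        raise ValueError(f"variant-annotation-assembly must be one of {sorted(ASSEMBLIES)}")
    args = [
        "--enable-variant-annotation=true",
        f"--variant-annotation-data={data_dir}",
        f"--variant-annotation-assembly={assembly}",
    ]
    if maf_transcript_source is not None:
        if maf_transcript_source not in TRANSCRIPT_SOURCES:
            raise ValueError(f"maf-transcript-source must be one of {sorted(TRANSCRIPT_SOURCES)}")
        args += [
            "--enable-maf-output=true",
            f"--maf-transcript-source={maf_transcript_source}",
        ]
    return args


# Hypothetical data directory, for illustration only.
example = annotation_args("/path/to/nirvana_data", "GRCh38", "Refseq")
```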

CNV Caller Options

The following options are applicable to the CNV caller.

Name

Description

Command Line Equivalent

Value

cnv-bypass-contig-check

Bypass contig check for self normalization.

--cnv-bypass-contig-check

true/false

cnv-cbs-alpha

Specifies the significance level for the test to accept change points. The default value is 0.01.

--cnv-cbs-alpha

cnv-cbs-eta

Specifies the type I error rate of the sequential boundary for early stopping when using the permutation method. The default value is 0.05.

--cnv-cbs-eta

cnv-cbs-kmax

Specifies the maximum width of smaller segment for permutation. The default value is 25.

--cnv-cbs-kmax

cnv-cbs-min-width

Specifies the minimum number of markers for a changed segment. The default value is 2.

--cnv-cbs-min-width

[2,5]

cnv-cbs-nmin

Specifies the minimum length of data for maximum statistic approximation. The default value is 200.

--cnv-cbs-nmin

cnv-cbs-nperm

Specifies the number of permutations used for p-value computation. The default value is 10000.

--cnv-cbs-nperm

cnv-cbs-trim

Specifies the proportion of data to be trimmed for variance calculations. The default value is 0.025.

--cnv-cbs-trim

cnv-counts-method

Specifies the overlap method for counting an alignment.

--cnv-counts-method

midpoint / start / overlap

cnv-enable-filter-copy-ratio

Enables cnvCopyRatio filtering based on fixed threshold values. The default value is true for germline analysis.

--cnv-enable-filter-copy-ratio

true/false

cnv-enable-gcbias-correction

Enables GC bias correction. The default is true.

--cnv-enable-gcbias-correction

true/false

cnv-enable-gcbias-smoothing

Enables smoothing across GC bins. The default value is true.

--cnv-enable-gcbias-smoothing

true/false

cnv-enable-gender-matched-pon

Enable gender matched PON normalization. The default value is true.

--cnv-enable-gender-matched-pon

true/false

cnv-enable-ref-calls

When set to true, copy neutral (REF) calls are included in the output VCF.

--cnv-enable-ref-calls

true/false

cnv-enable-self-normalization

Enables self-normalization.

--cnv-enable-self-normalization

true/false

cnv-enable-tracks

Enables generation of track files that can be imported into IGV for viewing. The default is true.

--cnv-enable-tracks

true/false

cnv-exclude-bed

Specifies regions to blocklist for CNV processing.

--cnv-exclude-bed

cnv-exclude-bed-min-overlap

Specifies the minimum fraction of overlap between target intervals and the blocklist to exclude target from the list.

--cnv-exclude-bed-min-overlap

[0.0, 1.0]

cnv-extreme-percentile

Specifies the extreme median percentile value used to filter out samples. The default value is 2.5.

--cnv-extreme-percentile

[0.0, 100.0]

cnv-filter-bin-count

Minimum number of bins to pass a call (currently only applied to somatic WGS calls)

--cnv-filter-bin-count

[0.0, inf)

cnv-filter-bin-support-ratio

If the span of supporting bins is less than the specified ratio with respect to the overall event length, the option filters out a candidate event. The default ratio is 0.2 (20% support).

--cnv-filter-bin-support-ratio

[0.0, 1.0]

cnv-filter-bin-support-ratio-min-len

Minimum event length to apply cnv-filter-bin-support-ratio (currently only applied for germline WGS calls). The default value is 80000.

--cnv-filter-bin-support-ratio-min-len

[0.0, inf)

cnv-filter-copy-ratio

Specifies the minimum copy ratio threshold value centered about 1.0 at which a reported event is marked as PASS in the output VCF file. The default value is 0.2.

--cnv-filter-copy-ratio

[0.0, 1.0]

cnv-filter-del-mean

SM value used to hard filter DELs in CNV VCF (Somatic WGS) when the caller returns the default model (purity: NA). Default is automatically computed based on the variance of the sample.

--cnv-filter-del-mean

[0.0, 1.0]

cnv-filter-dup-mean

SM value used to hard filter DUPs in CNV VCF (Somatic WGS) when the caller returns the default model (purity: NA). Default is automatically computed based on the variance of the sample.

--cnv-filter-dup-mean

[1.0, inf)

cnv-filter-de-novo-quality

Sets the Phred-scale threshold for calling an event as de novo in the proband.

--cnv-filter-de-novo-quality

[0, inf)

cnv-filter-duplicate-alignments

Filters duplicate-marked alignments during target counts if the option is set to true. Requires enable-duplicate-marking=true.

--cnv-filter-duplicate-alignments

true/false

cnv-filter-length

Specifies the minimum event length in bases at which a reported event is marked as PASS in the output VCF file. The default value is 10000.

--cnv-filter-length

[0, inf)

cnv-filter-limit-of-detection

Target limit of detection for enrichment somatic CNV alternative hypothesis test. The default value is 0.2.

--cnv-filter-limit-of-detection

[0.0, inf)

cnv-filter-qual

Specifies the QUAL value at which a reported event is marked as PASS in the output VCF file.

--cnv-filter-qual

[0, inf)

cnv-generate-pon-metric-file

Generate PON metric file for WES/targeted panel.

--cnv-generate-pon-metric-file

true/false

cnv-input

Specifies a CNV input file instead of a BAM. Files can be target.counts.gz or tn.tsv.gz for de novo.

--cnv-input

cnv-interval-width

Specifies the width of the sampling interval for CNV WGS processing.

--cnv-interval-width

[100, inf)

cnv-max-percent-zero-samples

Specifies the maximum percentage of zero coverage samples allowed for a target. If a target exceeds the specified threshold, the target is filtered out. The default value is 5%.

--cnv-max-percent-zero-samples

[0.0, 100.0]

cnv-max-percent-zero-targets

Specifies the maximum percentage of zero coverage targets allowed for a sample. If a sample exceeds the specified threshold, the sample is filtered out. The default value is 5%.

--cnv-max-percent-zero-targets

[0.0, 100.0]

cnv-merge-distance

Specifies the maximum segment gap allowed for merging segments. The default value for Somatic WGS is 10k, for Germline WGS is 0. Default is inf when using CNV WES workflows.

--cnv-merge-distance

[0, inf)

cnv-merge-threshold

Specifies the maximum segment mean difference to merge two adjacent segments. The segment mean is represented as a linear copy ratio value.

--cnv-merge-threshold

[0.0, inf)

cnv-min-mapq

Specifies the minimum MAPQ for alignment to be counted.

--cnv-min-mapq

[1, inf)

cnv-normal-b-allele-vcf

Normal sample SNV VCF for determining het sites.

--cnv-normal-b-allele-vcf

cnv-normal-cnv-vcf

Matched-normal CNV calls.

--cnv-normal-cnv-vcf

cnv-normals-file

Specifies a single file to be used in the panel of normals. You can use the option multiple times, once for each file.

--cnv-normals-file

cnv-normals-list

Specifies a text file containing paths to the list of reference target counts files to use as a panel of normals.

--cnv-normals-list

cnv-num-gc-bins

Specifies the number of bins for GC bias correction. Each bin represents the GC content percentage. The default value is 25.

--cnv-num-gc-bins

10 20 25 50 100

cnv-num-singular-values

Number of singular values to retain for tangent normalization. The default is 5 when cnv-segmentation-mode=bed, otherwise dynamically detected.

--cnv-num-singular-values

[1, inf)

cnv-ploidy

Specifies the normal ploidy value. Used for estimating the copy number value emitted in the output VCF file. The default value is 2.

--cnv-ploidy

cnv-population-b-allele-vcf

CNV population SNP input VCF file.

--cnv-population-b-allele-vcf

cnv-qual-length-scale

Specifies the bias weighting factor to adjust QUAL estimates for segments with longer lengths. The default value is 0.9303 (2^-0.1) and should not need to be modified.

--cnv-qual-length-scale

[0.0, 1.0]

cnv-qual-noise-scale

Specifies the bias weighting factor to adjust QUAL estimates based on sample variance. The default value is 1.0 and should not need to be modified.

--cnv-qual-noise-scale

[1.0, 10.0]

cnv-segmentation-mode

Specifies the segmentation algorithm to perform.

--cnv-segmentation-mode

cbs slm hslm aslm

cnv-somatic-enable-lower-ploidy-limit

Enable check on lower ploidy limit based on essential genes. Default true.

--cnv-somatic-enable-lower-ploidy-limit

true/false

cnv-somatic-essential-genes-bed

BED file containing genes (regions) where the model should not predict HOMDELs. A default set of regions will be used if this is not provided.

--cnv-somatic-essential-genes-bed

cnv-skip-contig-list

A comma-separated list of contig identifiers to skip when generating intervals for WGS analysis. If not specified, the following contigs are skipped by default: chrM,MT,m,chrm.

--cnv-wgs-skip-contig-list

cnv-slm-eta

Sets the baseline probability that the mean process changes its value. A higher value increases SLM segmentation sensitivity. The default value is 4e-5.

--cnv-slm-eta

[0, inf)

cnv-slm-fw

Specifies the minimum number of data points for a CNV to be emitted. The default value is 0.

--cnv-slm-fw

cnv-slm-omega

Sets the scaling parameter modulating relative weight between experimental/biological variance. The default value is 0.3.

--cnv-slm-omega

[0, inf)

cnv-slm-stepeta

Specifies the distance normalization parameter. The default value is 10000. Only valid for HSLM.

--cnv-slm-stepeta

[0, inf)

cnv-target-bed

Specifies a properly formatted BED file that indicates the target intervals to use for sample coverage. Use in WES analysis.

--cnv-target-bed

cnv-target-factor-threshold

Specifies the bottom percentile of panel-of-normals medians to filter out useable targets. The default value is 1% for whole genome processing and 5% for targeted sequencing processing.

--cnv-target-factor-threshold

cnv-truncate-threshold

Sets the percent threshold used to truncate extreme outliers. The default value is 0.1%.

--cnv-truncate-threshold

cnv-use-somatic-vc-baf

Use somatic SNV BAFs from VC for B allele counting.

--cnv-use-somatic-vc-baf

cnv-use-somatic-vc-vaf

Use somatic SNV VAFs from VC to help determine purity and ploidy.

--cnv-use-somatic-vc-vaf
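
The descriptions of cnv-filter-copy-ratio and cnv-filter-length above can be illustrated with a small Python sketch. It assumes that "centered about 1.0" means an event passes when its copy ratio differs from 1.0 by at least the threshold, and that length is compared in bases. The function name and the returned labels are hypothetical, and this is not a reproduction of the caller's actual filter logic.

```python
def cnv_filter_reasons(copy_ratio, length, ratio_threshold=0.2, min_length=10000):
    """Return the reasons an event would fail; an empty list means PASS.

    ratio_threshold mirrors the default of cnv-filter-copy-ratio (0.2) and
    min_length mirrors the default of cnv-filter-length (10000 bases).
    """
    reasons = []
    # Assumed reading: events too close to a copy ratio of 1.0 are filtered.
    if abs(copy_ratio - 1.0) < ratio_threshold:
        reasons.append("copy_ratio")
    # Events shorter than the minimum length are filtered.
    if length < min_length:
        reasons.append("length")
    return reasons
```

Under these assumptions, an event with copy ratio 1.5 spanning 50000 bases passes, while one with copy ratio 1.1 over the same span is filtered on copy ratio.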

Structural Variant Caller Options

Name

Description

Command Line Equivalent

Range

enable-sv

Enables the structural variant caller. The default value is false.

--enable-sv

true/false

sv-call-regions-bed

Specifies a BED file containing the set of regions to call. Optionally, you can compress the file in GZIP or BZIP format.

--sv-call-regions-bed

sv-denovo-scoring

Enables de novo quality scoring for structural variant joint diploid calling. Provide the pedigree file as well.

--sv-denovo-scoring

sv-forcegt-vcf

Specifies a VCF of structural variants for forced genotyping. The variants are scored and included in the output VCF, even if not found in the sample data. The variants are merged with any additional variants discovered directly from the sample data.

--sv-forcegt-vcf

sv-discovery

Enables SV discovery. Set to false when using --sv-forcegt-vcf to indicate that SV discovery should be disabled and only the forced genotyping input should be used.

--sv-discovery

true/false

sv-exome

When set to true, configures the variant caller for targeted sequencing inputs, which includes disabling high depth filters. The default value is false unless --enable-map-align=true and there is not more than 50 Gb of sequencing input.

--sv-exome

true/false

sv-output-contigs

Set to true to have assembled contig sequences output in a VCF file. The default value is false.

--sv-output-contigs

true/false

sv-region

Limits the analysis to a specified region of the genome for debugging purposes. You can use the option multiple times to build a list of regions.

--sv-region

Must be in the format chr:startPos-endPos.
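
Because sv-region expects the chr:startPos-endPos format, a value can be checked before it is passed on the command line. The following Python sketch is one way to do that. The function name parse_sv_region is hypothetical, and rejecting a start greater than the end is an assumption rather than documented behavior.

```python
import re

# Greedy contig match so that contig names containing ":" are still handled.
REGION = re.compile(r"^(?P<chrom>.+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_sv_region(region):
    """Parse 'chr:startPos-endPos' into (chrom, start, end)."""
    match = REGION.match(region)
    if match is None:
        raise ValueError(f"expected chr:startPos-endPos, got {region!r}")
    start, end = int(match.group("start")), int(match.group("end"))
    if start > end:  # assumption: an inverted interval is treated as an error
        raise ValueError(f"start {start} is greater than end {end}")
    return match.group("chrom"), start, end


def sv_region_args(regions):
    """Repeat --sv-region once per region, as the option description allows."""
    args = []
    for region in regions:
        parse_sv_region(region)  # validate only; the original string is passed through
        args.append(f"--sv-region={region}")
    return args
```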

VNTR Caller Options

The following options pertain to the VNTR Caller. For more information on these options, see the VNTR Calling page.

Name

Description

Command Line Equivalent

Range

enable-vntr

Enables the VNTR caller (default value is false).

--enable-vntr

true/false

vntr-num-threads

Sets the number of threads used by the VNTR caller (default value is 36).

--vntr-num-threads

integer: [1, max available]

vntr-catalog-bed

Specifies the set of regions considered by the VNTR caller, formatted as a BED file. Optionally the bed file can be compressed in GZIP format.

--vntr-catalog-bed

vntr-normalization-regions-bed

Specifies a BED file of regions free of large variants to be used as a baseline for custom references.

--vntr-normalization-regions-bed

vntr-priors-model

Specifies the priors model to be used by the VNTR genotyper: 1 is no priors, 2 is a set of 4 priors based on genotype, 3 is population allele-based priors. Default is 3 with a default set of population priors. If 3 is used with a custom reference then a vntr-priors-file must also be provided.

--vntr-priors-model

1, 2, or 3

vntr-priors-file

Specifies a set of population allele priors to be used for the VNTR genotyper as a JSON file. Required for a custom reference with the vntr-priors-model set to 3.

--vntr-priors-file

sv-vntr-merge

Integrates the VNTR calls into the SV output VCF when both VNTR and SV calling are enabled (default value is true). Also splits multi-allelic VNTR calls, filters short VNTR calls, and filters overlapping SV calls in the SV VCF.

--sv-vntr-merge

true/false

sv-vntr-filter-total-calls

Applies "TotalCall" filter to VNTR "total calls" with GT=./. reported in merged SV VCF when both VNTR and SV calling are enabled (default value is true).

--sv-vntr-filter-total-calls

true/false
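
The dependency between vntr-priors-model and vntr-priors-file described above can be expressed as a simple pre-flight check. This Python sketch is illustrative only. The function name vntr_args and the custom_reference flag are hypothetical, introduced to model the "custom reference" condition from the option descriptions.

```python
def vntr_args(priors_model=3, priors_file=None, custom_reference=False):
    """Build VNTR caller arguments, enforcing the documented priors rules."""
    if priors_model not in (1, 2, 3):
        raise ValueError("vntr-priors-model must be 1, 2, or 3")
    # Model 3 on a custom reference needs an explicit population priors file.
    if priors_model == 3 and custom_reference and priors_file is None:
        raise ValueError("vntr-priors-file is required for model 3 with a custom reference")
    args = ["--enable-vntr=true", f"--vntr-priors-model={priors_model}"]
    if priors_file is not None:
        args.append(f"--vntr-priors-file={priors_file}")
    return args
```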

Repeat Expansion Detection Options

The following options can be set in the RepeatGenotyping section of the configuration file or on the command line. For more information, see Repeat Expansion Detection.

Name

Description

Command Line Equivalent

Range

enable

Enables repeat expansion detection.

--repeat-genotype-enable

true/false

specs

Specifies the full path to the JSON file that contains the repeat variant catalog (specification) describing the loci to call.

--repeat-genotype-specs

use-catalog

Repeat variant catalog type to use (default - ~60 repeats, default_plus_smn - same as default with SMN repeat, expanded - ~50K repeats).

--repeat-genotype-use-catalog

default/default_plus_smn/expanded
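
In the table above, the configuration-file names (enable, specs, use-catalog) differ from their command-line equivalents, which carry a repeat-genotype prefix. The following Python sketch shows that mapping. The function name and the catalog path are hypothetical, and the --name=value form is assumed from its use elsewhere in this section.

```python
# Name to command-line mapping, copied from the table above.
REPEAT_GENOTYPE_FLAGS = {
    "enable": "--repeat-genotype-enable",
    "specs": "--repeat-genotype-specs",
    "use-catalog": "--repeat-genotype-use-catalog",
}


def repeat_genotype_args(section):
    """Translate RepeatGenotyping section entries into command-line arguments."""
    args = []
    for name, value in section.items():
        if name not in REPEAT_GENOTYPE_FLAGS:
            raise KeyError(f"not a RepeatGenotyping option: {name}")
        args.append(f"{REPEAT_GENOTYPE_FLAGS[name]}={value}")
    return args


# Hypothetical catalog path, for illustration only.
example = repeat_genotype_args({"enable": "true", "specs": "/path/to/catalog.json"})
```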

Repeat Profiling Options

Name

Description

Command Line Equivalent

Range

enable-str-profiler

Enables the STR profiler module

--enable-str-profiler

true/false

str-profiler-sample-name

A name to identify the sample in downstream analyses (default: same as RGSM)

--str-profiler-sample-name

str-profiler-output-directory

Specify the directory in which to save the STR profile (default: same as --output-directory)

--str-profiler-output-directory

str-profiler-analysis

Specify an analysis to be performed on the samples cohort

--str-profiler-analysis

outlier/casecontrol

str-profiler-controls-directory

Specify the directory containing the profiles for the control samples (required by --str-profiler-analysis)

--str-profiler-controls-directory

str-profiler-cases-directory

Specify the directory containing the profiles for the case samples (required by --str-profiler-analysis)

--str-profiler-cases-directory

str-profiler-regions-bed

Specify the path to a file in BED format containing regions to restrict the analysis to

--str-profiler-regions-bed

str-profiler-resampling-rounds

Specify how many times to resample the read counts during outlier analysis (default: 1000)

--str-profiler-resampling-rounds

Integer [3-Inf)

str-profiler-threads

Specify how many threads to use during resampling (default: 48)

--str-profiler-threads

Integer [1-Inf)

str-profiler-min-anchored-mapq

Minimum mapping quality for a read to be considered an anchor (default: 50)

--str-profiler-min-anchored-mapq

Integer [0-Inf)

str-profiler-max-irr-mapq

Maximum mapping quality for a read entropy to be computed (default: 40)

--str-profiler-max-irr-mapq

Integer [0-Inf)

str-profiler-shortest-unit-to-consider

Shortest motif size to evaluate (default: 2)

--str-profiler-shortest-unit-to-consider

Integer [2-Inf)

str-profiler-longest-unit-to-consider

Longest motif size to evaluate (default: 20)

--str-profiler-longest-unit-to-consider

Integer [2-Inf)

RNA-seq Options

Name

Description

Command Line Equivalent

Range

enable-rna

Enables processing of RNA-seq data.

--enable-rna

true/false

annotation-file

Use to supply a gene annotation file. Required for quantification and gene-fusion.

--annotation-file, -a

Path to GTF/GFF file

enable-rna-quantification

Enables RNA quantification.

--enable-rna-quantification

true/false

enable-rna-gene-fusion

Enables RNA gene fusion calling.

--enable-rna-gene-fusion

true/false

UMI Options

Name

Description

Command Line Equivalent

Range

umi-library-type

Sets the batch option for correcting UMIs. Not required.

--umi-library-type

random-duplex random-simplex nonrandom-duplex

umi-enable

Enables UMI-based read processing.

--umi-enable

true/false

vc-enable-umi-solid

Enables solid tumor UMI-aware VC settings. The default value is false.

--vc-enable-umi-solid

true/false

vc-enable-umi-liquid

Enables liquid tumor UMI-aware VC settings. The default value is false.

--vc-enable-umi-liquid

true/false

vc-enable-umi-germline

Allow germline VC from UMI-collapsed reads. The default value is false.

--vc-enable-umi-germline

true/false

umi-correction-scheme

Describes the methodology to use for correcting sequencing errors in UMIs.

--umi-correction-scheme

lookup random none positional

umi-correction-table

Provides the path to the correction table for the lookup correction scheme.

--umi-correction-table

Path to table file

umi-emit-multiplicity

Sets the consensus read output type.

--umi-emit-multiplicity

both duplex simplex

umi-min-supporting-reads

Specifies the number of input reads with matching UMI and position required to generate a consensus read.

--umi-min-supporting-reads

Integer ≥ 1. The default is 2.

umi-metrics-interval-file

Provides the path to target regions file used for UMI on target metrics.

--umi-metrics-interval-file

Path to valid BED file

umi-source

Specifies the location to read UMIs from.

--umi-source

qname bamtag fastq

umi-fastq

Provides the path to a separate FASTQ file with UMI sequences for each read.

--umi-fastq

Path to valid FASTQ file

umi-nonrandom-whitelist

Provides the path to a file containing valid nonrandom UMI sequences. Enter one path per line.

--umi-nonrandom-whitelist

umi-fuzzy-window-size

Collapses reads with matching UMIs and alignment positions up to the distance specified.

--umi-fuzzy-window-size

Integer ≥ 1. The default is 3.

umi-output-uncollapsed-bam

Enable raw reads BAM output

--umi-output-uncollapsed-bam

true/false
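
To make the roles of umi-fuzzy-window-size and umi-min-supporting-reads concrete, the following Python sketch groups toy reads into families. It is a simplified illustration and not DRAGEN's collapsing algorithm: it assumes exact UMI matching, represents each read as a (UMI, position) pair, and chains positions whose neighboring gap is within the window. The function name consensus_families is hypothetical.

```python
from collections import defaultdict


def consensus_families(reads, fuzzy_window=3, min_supporting_reads=2):
    """Group (umi, position) reads into families eligible for a consensus read.

    fuzzy_window mirrors the default of umi-fuzzy-window-size (3) and
    min_supporting_reads mirrors the default of umi-min-supporting-reads (2).
    """
    by_umi = defaultdict(list)
    for umi, position in reads:
        by_umi[umi].append(position)

    families = []
    for umi in sorted(by_umi):
        positions = sorted(by_umi[umi])
        cluster = [positions[0]]
        for position in positions[1:]:
            if position - cluster[-1] <= fuzzy_window:
                # Close enough to the previous read: same family.
                cluster.append(position)
            else:
                # Gap too large: close the current family and start a new one.
                if len(cluster) >= min_supporting_reads:
                    families.append((umi, cluster))
                cluster = [position]
        if len(cluster) >= min_supporting_reads:
            families.append((umi, cluster))
    return families


toy_reads = [("ACGT", 100), ("ACGT", 102), ("ACGT", 200), ("TTAG", 100)]
```

With the defaults, only the two ACGT reads at positions 100 and 102 form a family. The remaining reads lack a second supporting read within the window.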

Systematic Noise BED Creation Options

The following options are applicable to create the systematic noise BED file from normal VCFs.

Step 1. Run DRAGEN somatic tumor-only pipeline on each of approximately 50 normal samples.

Name

Description

Command Line Equivalent

Value

vc-detect-systematic-noise

Runs the tumor-only pipeline in a very sensitive mode that aims to capture noise. Use --tumor-fastq1/2 or --tumor-bam-input to specify input reads. This step requires vc-enable-germline-tagging=true. To explicitly run without germline tagging, use vc-skip-germline-tagging=true. VCFs generated with this setting can be used in step 2 when building a systematic noise file from normal samples. This mode is not intended for analyzing tumor samples. The default value is false.

--vc-detect-systematic-noise

true/false

vc-enable-germline-tagging

Enables germline variant tagging using population databases. The default value is false. When enabled, it also requires the user to specify Nirvana parameters. Details can be found in the somatic small variant calling section. When used with noise generation, it helps prevent germline sites from being treated as noisy. It is strongly recommended to enable this option when generating VCFs for the normal samples.

--vc-enable-germline-tagging

true/false

Step 2. Generate the final noise file with:

Name

Description

Command Line Equivalent

Value

build-sys-noise-vcfs-list

Path to text file containing list of normal VCF/GVCF files (from step 1) to be included in the systematic noise. One file per line.

--build-sys-noise-vcfs-list

build-sys-noise-germline-vaf-threshold

Minimum variant allele frequency threshold to define germline variants. Variants with AF higher than this threshold will not contribute to the noise file. Set to 1 to disable.

--build-sys-noise-germline-vaf-threshold

Default=1. The valid range is [0-1].

build-sys-noise-use-germline-tag

If available in the VCF use germline tags to prevent germline calls from contributing to systematic noise.

--build-sys-noise-use-germline-tag

Default=true

build-sys-noise-threads

Max number of threads used during noise generation. Each thread consumes approx. 70 GB of system memory.

--build-sys-noise-threads

Options are 1 or 2. Default=2.

build-sys-noise-method

Method to compute noise across samples ['mean'/'max']. For higher specificity 'max' is recommended, for higher sensitivity 'mean' is recommended.

--build-sys-noise-method

Default=mean

build-sys-noise-decimal-precision

Number of decimal digits in noise file. Options are [3-6]. For typical WES/WGS with 50-500X coverage 3 decimal places should be sufficient. For deep UMI samples with low noise rates 5 decimal places are recommended. Lower precision may help reduce noise file size especially on WES/WGS. The default is set for accuracy.

--build-sys-noise-decimal-precision

Default=5

build-sys-noise-min-sample-cov

Min coverage at a site for a sample to be used towards noise estimation. At low coverages estimated allele frequencies become less reliable, but low coverage sites also tend to be noisy and useful for inclusion in the noise file.

--build-sys-noise-min-sample-cov

Default=5

build-sys-noise-min-supporting-samples

Min number of samples with noise at a position in order for a position to be considered systematic-noise

--build-sys-noise-min-supporting-samples

Default=2
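
The step 2 options interact as a series of per-site decisions, which the following Python sketch illustrates on toy data. It is not DRAGEN's implementation. It assumes each normal sample contributes one (coverage, allele frequency) observation per site, that the mean is taken over all usable samples, and that a site with too few noisy samples produces no entry. The function name site_noise is hypothetical.

```python
def site_noise(observations, method="mean", min_sample_cov=5,
               min_supporting_samples=2, germline_vaf_threshold=1.0,
               decimal_precision=5):
    """Combine per-sample (coverage, allele_frequency) pairs into one noise value.

    Keyword defaults mirror the defaults listed for the build-sys-noise-* options.
    Returns None when the site does not qualify as systematic noise.
    """
    if method not in ("mean", "max"):
        raise ValueError("build-sys-noise-method must be 'mean' or 'max'")
    # Drop low-coverage samples and observations above the germline VAF threshold.
    usable = [af for cov, af in observations
              if cov >= min_sample_cov and af <= germline_vaf_threshold]
    # Count how many usable samples show any noise at this site.
    noisy = [af for af in usable if af > 0]
    if len(noisy) < min_supporting_samples:
        return None
    value = max(noisy) if method == "max" else sum(usable) / len(usable)
    return round(value, decimal_precision)


# Four toy normal samples at one site: (coverage, allele frequency).
toy_site = [(100, 0.02), (80, 0.04), (3, 0.5), (60, 0.0)]
```

In this toy site, the third sample is ignored because its coverage is below the minimum. The max method then returns the larger of the two remaining nonzero frequencies, which is consistent with the higher specificity described for that setting.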

Explify Options

There are three separate Explify capabilities available: the Explify analysis pipeline ("explify" prefix), a generalized metagenomics kmer classifier ("kmer-classifier" prefix), and a tool to build databases to be used by the kmer classifier ("kmer-class-db-builder" prefix).

Explify Analysis Pipeline Options

Name

Description

Command Line Equivalent

Range

enable-explify

Enables the Explify Pipeline. The default value is false

--enable-explify

true/false

explify-sample-list

Input sample list .tsv file with sample IDs, FASTQs, etc.

--explify-sample-list

See User Guide

explify-test-panel-name

Set test panel name

--explify-test-panel-name

"RPIP", "UPIP", "VSPv2", "Custom"

explify-test-panel-version

Set the test panel version (e.g., "7.3.2")

--explify-test-panel-version

See User Guide

explify-ref-db-dir

Path to root directory for Explify Database files

--explify-ref-db-dir

explify-load-db-ram

Option to load database into RAM if not on ramdisk. The default value is false.

--explify-load-db-ram

true/false

explify-no-read-qc

Option to turn off read QC on FASTQs before analysis. The default value is false.

--explify-no-read-qc

true/false

explify-internal-control

Option to set internal control from an accepted list. The default value is "Enterobacteria phage T7".

--explify-internal-control

See User Guide

explify-internal-control-concentration

Option to set internal control concentration in copies/mL of sample. The default value is 12100000.

--explify-internal-control-concentration

Integer > 0

explify-sensitivity-threshold

Option to set sensitivity threshold. The default value is 5.

--explify-sensitivity-threshold

0 < Integer < 1000. Only valid for VSPv2

explify-custom-ref-fasta

Reference Fasta file

--explify-custom-ref-fasta

Required for custom ref DBs

explify-custom-ref-bed

Reference BED file

--explify-custom-ref-bed

Optional for custom ref DBs

explify-ncpus

Option to set the number of CPUs available for processing

--explify-ncpus

[1,max avail]
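
The ranges listed above lend themselves to a quick pre-flight check before a run is submitted. The following is a minimal sketch under stated assumptions, not part of DRAGEN: the function name is hypothetical, and it encodes only the panel names and numeric ranges shown in this table.

```python
# Hypothetical pre-flight check (not part of DRAGEN). It mirrors only the
# ranges listed in the Explify analysis pipeline table above.

PANEL_NAMES = ("RPIP", "UPIP", "VSPv2", "Custom")


def check_explify_options(panel_name, sensitivity_threshold=None,
                          internal_control_concentration=12100000):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if panel_name not in PANEL_NAMES:
        problems.append(f"explify-test-panel-name must be one of {PANEL_NAMES}")
    if sensitivity_threshold is not None:
        # The threshold is documented as valid only for the VSPv2 panel.
        if panel_name != "VSPv2":
            problems.append("explify-sensitivity-threshold is only valid for VSPv2")
        elif not 0 < sensitivity_threshold < 1000:
            problems.append("explify-sensitivity-threshold must satisfy 0 < value < 1000")
    if internal_control_concentration <= 0:
        problems.append("explify-internal-control-concentration must be an integer > 0")
    return problems
```

Returning a list of problems rather than raising on the first one lets a wrapper script report every misconfigured option in a single pass.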

Metagenomics Kmer Classifier Options

Name

Description

Command Line Equivalent

Range

enable-kmer-classifier

Enables the Kmer Classifier. The default value is false.

--enable-kmer-classifier

true/false

kmer-classifier-input-read-file

Input sequence file (zipped or unzipped) to the Kmer Classifier

--kmer-classifier-input-read-file

kmer-classifier-db-file

Database of sequences to classify against

--kmer-classifier-db-file

kmer-classifier-load-db-ram

Load the database into RAM. Do not use if the database is on ramdisk. The default value is false.

--kmer-classifier-load-db-ram

true/false

kmer-classifier-multiple-inputs

Set to true to run with multiple inputs. The input read file is now a .tsv file that has three columns: Sample ID, Read1 file, (optional) Read 2 file. The default value is false.

--kmer-classifier-multiple-inputs

true/false

kmer-classifier-min-window

The minimum number of consecutive kmers required to assign a classification at a taxid. The default value is 1.

--kmer-classifier-min-window

Integer >=1

kmer-classifier-output-read-seq

Option to enable read sequence column in the output file. The default value is false

--kmer-classifier-output-read-seq

true/false

kmer-classifier-output-taxid-seq

Option to enable a taxid string column in the output file. The default value is false

--kmer-classifier-output-taxid-seq

true/false

kmer-classifier-db-to-taxid-json

Path to JSON file that maps database IDs to external taxids, names, and ranks

--kmer-classifier-db-to-taxid-json

See User Guide

kmer-classifier-no-read-output

Option to not create individual read output. The default value is false

--kmer-classifier-no-read-output

true/false

kmer-classifier-no-taxid-counts

Option to not write taxid count output file. The default value is false

--kmer-classifier-no-taxid-counts

true/false

kmer-classifier-protein-input

Option to indicate protein query sequences and database. The default value is false

--kmer-classifier-protein-input

true/false

kmer-classifier-remove-dups

Set to deduplicate reads in input files

--kmer-classifier-remove-dups

true/false

kmer-classifier-ncpus

Option to set the number of CPUs available for processing

--kmer-classifier-ncpus

[1,max avail]
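
When kmer-classifier-multiple-inputs is set to true, the input read file is a three-column .tsv file (Sample ID, Read 1 file, optional Read 2 file). The sketch below shows one way to generate such a file. The function name, sample IDs, and file paths are hypothetical, and writing the file without a header row is an assumption that this table does not confirm.

```python
import csv
import io


def build_multi_input_tsv(samples):
    """Format (sample_id, read1, read2_or_None) tuples as tab-delimited text.

    Assumption: no header row is written; the table above does not say
    whether one is expected.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for sample_id, read1, read2 in samples:
        row = [sample_id, read1]
        if read2:  # the Read 2 column is optional
            row.append(read2)
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample IDs and paths, for illustration only.
tsv_text = build_multi_input_tsv([
    ("sampleA", "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz"),
    ("sampleB", "sampleB_R1.fastq.gz", None),
])
```

The resulting text can be written to disk and passed as the value of --kmer-classifier-input-read-file.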

Metagenomics Kmer Classifier Database Builder Options

Name

Description

Command Line Equivalent

Range

enable-kmer-class-db-builder

Enables the Kmer Classifier Database Builder. The default value is false

--enable-kmer-class-db-builder

true/false

kmer-class-db-builder-input-file

Headerless, tab-delimited file where each line is (1) path to a reference fasta file and (2) the associated taxid. When using --kmer-class-db-builder-taxids-as-seq-name, the second column is required but ignored.

--kmer-class-db-builder-input-file

kmer-class-db-builder-kmer-length

Kmer length

--kmer-class-db-builder-kmer-length

[4, 64]

kmer-class-db-builder-gmer-length

Gmer length (must be >= kmer length)

--kmer-class-db-builder-gmer-length

[4, 64]

kmer-class-db-builder-tax-tree-file

.tri file with nodes in the taxonomic tree for a classifier database (not required if building binner database). Headerless, tab-delimited file where each line has (1) child node taxid and (2) parent node taxid.

--kmer-class-db-builder-tax-tree-file

kmer-class-db-builder-protein

Set to indicate input sequences are protein sequences. Default is false.

--kmer-class-db-builder-protein

true/false

kmer-class-db-builder-taxids-to-keep

File with taxids to keep. If set, any kmers with taxids not in this file will be excluded from database.

--kmer-class-db-builder-taxids-to-keep

kmer-class-db-builder-num-categories

Set to build a binner database with this number of categories. The maximum is 25 categories, and categories are assumed to run sequentially from 2^0..2^n. The categories take the place of taxids in the input file.

--kmer-class-db-builder-num-categories

Integer [0,25]

kmer-class-db-builder-save-weights

Set to build classification database that saves all kmers / taxids / weights.

--kmer-class-db-builder-save-weights

true/false

kmer-class-db-builder-kmer-cutoff

Cutoff that excludes k-mers that are found in more than cutoff number of taxids when building a database using --kmer-class-db-builder-save-weights. Helps speed up classification. (Default=1000)

--kmer-class-db-builder-kmer-cutoff

Integer

kmer-class-db-builder-mask-bits

Number of bits to mask in kmer before building / searching. (Default=7)

--kmer-class-db-builder-mask-bits

Integer

kmer-class-db-builder-num-cpus

Option to set the number of CPUs available for processing

--kmer-class-db-builder-num-cpus

[1,max avail]

kmer-class-db-builder-num-kmers-per-bucket

Set to output number of kmers in each minimizer bucket. The default value is false.

--kmer-class-db-builder-num-kmers-per-bucket

true/false

kmer-class-db-builder-include-lowercase

Set to include kmers with lowercase bases (usually repeatmasked). The default value is false.

--kmer-class-db-builder-include-lowercase

true/false

kmer-class-db-builder-taxids-as-seq-name

Set to indicate that the reference fastas listed in the input file have taxids as sequence name. In this case, the second column of the input file is ignored. The default value is false.

--kmer-class-db-builder-taxids-as-seq-name

true/false
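
The two headerless, tab-delimited inputs described above (the reference list and the taxonomic tree file) are simple enough to generate with a script. The following is a minimal sketch: the function names, fasta paths, and taxid values are hypothetical placeholders, and the length check encodes only the [4, 64] range and the gmer >= kmer rule listed in this table.

```python
def reference_list_text(refs):
    """Format (fasta_path, taxid) pairs as headerless tab-delimited lines."""
    return "".join(f"{path}\t{taxid}\n" for path, taxid in refs)


def tax_tree_text(edges):
    """Format (child_taxid, parent_taxid) pairs as headerless tab-delimited lines."""
    return "".join(f"{child}\t{parent}\n" for child, parent in edges)


def check_lengths(kmer_length, gmer_length):
    """Validate kmer and gmer lengths against the documented constraints."""
    for name, value in (("kmer", kmer_length), ("gmer", gmer_length)):
        if not 4 <= value <= 64:
            raise ValueError(f"{name} length must be in [4, 64]")
    if gmer_length < kmer_length:
        raise ValueError("gmer length must be >= kmer length")


# Placeholder paths and taxids, for illustration only.
input_file = reference_list_text([("refs/genome_a.fasta", 101),
                                  ("refs/genome_b.fasta", 102)])
tree_file = tax_tree_text([(101, 100), (102, 100)])
check_lengths(kmer_length=31, gmer_length=35)
```

The first string corresponds to the file passed to --kmer-class-db-builder-input-file and the second to the file passed to --kmer-class-db-builder-tax-tree-file.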

Targeted Caller Options

Name

Description

Command Line Equivalent

Range

enable-targeted

Enable targeted calling. When the small variant caller is enabled for human germline WGS¹ analysis, cyp21a2, gba, hba, and rh are enabled by default; otherwise, the default is false.

--enable-targeted

true/false or space-separated list of one or more supported target names

targeted-merge-vc

Enable merging of targeted caller small variant VCF records into the <prefix>.hard-filtered.vcf.gz and <prefix>.hard-filtered.gvcf.gz files when the small variant caller is enabled. Enabled by default for cyp21a2, gba, hba, and rh only when sort is enabled.

--targeted-merge-vc

true/false or space-separated list of one or more supported target names

targeted-enable-legacy-output

[DEPRECATED] This option may not be supported for all targets. Enable generation of target-specific .tsv files from previous DRAGEN versions. Default is false.

--targeted-enable-legacy-output

true/false

¹ For exome or enrichment analysis, the default targeted callers are still enabled with the small variant caller, but will not generate any output.

