Effective Coverage Downsampling

DRAGEN can use downsampling to reserve a random subset of fragments that is kept separate from the normal alignment outputs. You can use downsampling to generate data sets for comparisons between samples or between replicates. DRAGEN samples fragments after performing any hardware-accelerated trimming or filtering functions, which enables DRAGEN to rapidly create analysis-ready test data sets.

To enable downsampling, set the --enable-down-sampler command line option to true.

You can use any valid sequencing data format that is compatible with the DRAGEN Host Software. For more information on compatible input options, see Input Options.

DRAGEN downsampling outputs the reserved subset of data in FASTQ format. If the input is paired-end, DRAGEN outputs two FASTQ files that contain the subsampled data. If the input is unpaired, DRAGEN outputs one FASTQ file.

Command-line Options

In addition to enabling the downsampling command line option, you must set the quantity of fragments to downsample. To set the quantity of fragments, use either --down-sampler-fragments or --down-sampler-coverage.

If you specify a coverage level, you must also specify a genome using --ref-dir or manually specify the genome size using --down-sampler-genome-size. If you specify both a fragment limit and a coverage limit, DRAGEN applies both quantity limits and keeps whichever result is smaller.
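The interaction between the two limits can be sketched in Python. This is an illustration only, not DRAGEN's implementation: the source does not state how coverage is converted to a fragment count, so the formula below (coverage times genome size, divided by bases per fragment) is an assumption, as are the hypothetical function names, the treatment of 0 as "not set", and the read length passed in by the caller.

```python
import math


def fragments_for_coverage(coverage, genome_size, read_length, reads_per_fragment=2):
    """Approximate the fragments needed to reach a target coverage.

    Assumption: coverage = fragments * reads_per_fragment * read_length / genome_size.
    DRAGEN's internal calculation may differ.
    """
    bases_per_fragment = read_length * reads_per_fragment
    return math.ceil(coverage * genome_size / bases_per_fragment)


def effective_fragment_target(fragment_limit, coverage_limit, genome_size, read_length):
    """Return the smaller of the two limits, skipping any limit left at 0.

    Assumption: a value of 0 (the documented default) means the limit is not set.
    """
    candidates = []
    if fragment_limit > 0:
        candidates.append(fragment_limit)
    if coverage_limit > 0:
        candidates.append(
            fragments_for_coverage(coverage_limit, genome_size, read_length)
        )
    if not candidates:
        raise ValueError("set a fragment limit, a coverage limit, or both")
    return min(candidates)
```

Under these assumptions, a 30x target on a 3,000,000,000-base genome with 2 x 150 bp reads corresponds to roughly 300 million fragments, so a fragment limit of 100 million would be the limit that takes effect.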

Option
Description

--enable-down-sampler

Set to true to enable downsampling. The default value is false. If enabled, you must set either --down-sampler-fragments or --down-sampler-coverage.

--down-sampler-num-threads

Specify the number of threads to use for down-sampled reads. The default value is 8.

--down-sampler-random-seed

Set the random seed used to select down-sampled fragments. The default value is 42.

--down-sampler-genome-size

Set the target genome size used to calculate downsampling coverage. The default value is 0. The --down-sampler-genome-size option is not compatible with the --ref-dir option.

--down-sampler-fragments

Specify the target number of fragments for downsampling. The default value is 0.

--down-sampler-coverage

Set the target genomic coverage for downsampling. The default value is 0. If set, you must also set either --ref-dir or --down-sampler-genome-size.
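The option rules in the table can be checked before launching a run. The following Python sketch assembles only the downsampling portion of a command line and rejects combinations that the table describes as invalid. The helper name `build_down_sampler_args` is hypothetical, the space-separated "option value" form is an assumption about how values are passed, and the remaining arguments of a complete DRAGEN command (inputs, outputs, and so on) are deliberately left out.

```python
def build_down_sampler_args(fragments=0, coverage=0, ref_dir=None, genome_size=0,
                            random_seed=42, num_threads=8):
    """Return downsampling options as a list of strings (hypothetical helper).

    Validates the rules stated in the option table; it does not run DRAGEN.
    """
    if fragments <= 0 and coverage <= 0:
        raise ValueError(
            "set --down-sampler-fragments or --down-sampler-coverage")
    if ref_dir and genome_size > 0:
        raise ValueError(
            "--down-sampler-genome-size is not compatible with --ref-dir")
    if coverage > 0 and not ref_dir and genome_size <= 0:
        raise ValueError(
            "--down-sampler-coverage needs --ref-dir or --down-sampler-genome-size")

    args = [
        "--enable-down-sampler", "true",
        "--down-sampler-random-seed", str(random_seed),
        "--down-sampler-num-threads", str(num_threads),
    ]
    if fragments > 0:
        args += ["--down-sampler-fragments", str(fragments)]
    if coverage > 0:
        args += ["--down-sampler-coverage", str(coverage)]
    if ref_dir:
        args += ["--ref-dir", ref_dir]
    if genome_size > 0:
        args += ["--down-sampler-genome-size", str(genome_size)]
    return args
```

For example, `build_down_sampler_args(coverage=30, genome_size=3_000_000_000)` returns a list that can be appended to the rest of a command, while passing both `ref_dir` and `genome_size` raises an error.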
