Effective Coverage Downsampling
Using downsampling, DRAGEN can reserve a random subset of fragments that is separate from the normal alignment outputs. You can use downsampling to generate data sets for performing comparisons between samples or between replicates. DRAGEN samples fragments after performing any hardware-accelerated trimming or filtering functions, which enables DRAGEN to rapidly create analysis-ready test data sets.
To enable downsampling, set the --enable-down-sampler command line option to true.
You can use any valid sequencing data format that is compatible with the DRAGEN Host Software. For more information on compatible input options, see Input Options.
DRAGEN downsampling outputs the reserved subset of data in FASTQ format. If the input is paired-end, DRAGEN outputs two FASTQ files that contain the subsampled data. If the input is unpaired, DRAGEN outputs one FASTQ file.
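As an illustration, the following sketch assembles a command line that enables downsampling for paired-end FASTQ input with a fixed fragment target. Only --enable-down-sampler and --down-sampler-fragments come from this section. The file paths and output prefix are hypothetical, and the remaining options (-1, -2, --output-directory, --output-file-prefix) are assumed from general DRAGEN usage, so your version or workflow might require additional options. The sketch prints the command instead of running it.

```shell
# Hypothetical paths; replace with your own.
REF_DIR=/staging/reference/hg38
FASTQ_R1=/staging/fastq/sample_R1.fastq.gz
FASTQ_R2=/staging/fastq/sample_R2.fastq.gz
OUT_DIR=/staging/output

# General input and output options (assumed from typical DRAGEN usage).
DRAGEN_CMD="dragen --ref-dir $REF_DIR -1 $FASTQ_R1 -2 $FASTQ_R2"
DRAGEN_CMD="$DRAGEN_CMD --output-directory $OUT_DIR --output-file-prefix sample_ds"

# Downsampling options described in this section:
# enable the down-sampler and reserve a target of 1,000,000 fragments.
DRAGEN_CMD="$DRAGEN_CMD --enable-down-sampler true"
DRAGEN_CMD="$DRAGEN_CMD --down-sampler-fragments 1000000"

# Print the assembled command for review before running it.
echo "$DRAGEN_CMD"
```

Because the input in this sketch is paired-end, the reserved subset would be written as two FASTQ files.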
Command-line Options
In addition to enabling the downsampling command line option, you must set the quantity of fragments to downsample. To set the quantity of fragments, use either --down-sampler-fragments or --down-sampler-coverage.
If you specify a coverage level, you must also specify a genome using --ref-dir or manually specify the genome size using --down-sampler-genome-size. If you specify both a fragment limit and a coverage limit, DRAGEN applies both limits and keeps whichever result is smaller.
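The interaction of the two limits can be sketched with simple arithmetic. This section does not state how DRAGEN converts a coverage target into a fragment count, so the sketch assumes the conventional definition of coverage (sequenced bases divided by genome size). The read length, reads per fragment, and all numeric values are hypothetical.

```shell
# Hypothetical inputs.
COVERAGE=30               # value for --down-sampler-coverage (assumed to mean 30x)
GENOME_SIZE=3100000000    # value for --down-sampler-genome-size, in bases
FRAGMENT_LIMIT=500000000  # value for --down-sampler-fragments
READ_LENGTH=150           # assumed read length in bases
READS_PER_FRAGMENT=2      # 2 for paired-end input, 1 for unpaired input

# Assumed conversion: fragments = coverage * genome size / bases per fragment.
# Requires 64-bit shell arithmetic, which common shells provide on 64-bit systems.
COVERAGE_FRAGMENTS=$(( COVERAGE * GENOME_SIZE / (READ_LENGTH * READS_PER_FRAGMENT) ))

# With both limits set, the smaller result is kept.
if [ "$FRAGMENT_LIMIT" -lt "$COVERAGE_FRAGMENTS" ]; then
    EFFECTIVE_FRAGMENTS=$FRAGMENT_LIMIT
else
    EFFECTIVE_FRAGMENTS=$COVERAGE_FRAGMENTS
fi

echo "coverage-derived fragments: $COVERAGE_FRAGMENTS"
echo "effective fragment target:  $EFFECTIVE_FRAGMENTS"
```

With these values the coverage target corresponds to 310,000,000 fragments, which is below the 500,000,000 fragment limit, so the coverage-derived value is the one kept.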
--enable-down-sampler
Set to true to enable downsampling. The default value is false. If enabled, you must set either --down-sampler-fragments or --down-sampler-coverage.
--down-sampler-num-threads
Specify the number of threads to use for down-sampled reads. The default value is 8.
--down-sampler-random-seed
Set the random seed for down-sampled fragments. The default value is 42.
--down-sampler-genome-size
Set the target genome size for downsampling coverage. The default value is 0. The --down-sampler-genome-size option is not compatible with the --ref-dir option.
--down-sampler-fragments
Specify the target number of fragments for downsampling. The default value is 0.
--down-sampler-coverage
Set the target genomic coverage for downsampling. The default value is 0. If enabled, you must set either --ref-dir or --down-sampler-genome-size.
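The option rules above can be checked before launching a run. The following is a minimal sketch of such a pre-flight check, not part of DRAGEN. The function names check_down_sampler_opts and is_set are hypothetical, and the sketch assumes that a value of 0 (the documented default) or an empty string means the option was not set.

```shell
# Hypothetical helper: treat an empty value or the default 0 as "not set".
is_set() {
    [ -n "$1" ] && [ "$1" != "0" ]
}

# Hypothetical pre-flight check for a run with --enable-down-sampler true.
# Usage: check_down_sampler_opts FRAGMENTS COVERAGE REF_DIR GENOME_SIZE
# Returns 0 when the combination follows the rules above, 1 otherwise.
check_down_sampler_opts() {
    fragments=$1
    coverage=$2
    ref_dir=$3
    genome_size=$4

    # A fragment target or a coverage target is required.
    if ! is_set "$fragments" && ! is_set "$coverage"; then
        echo "error: set --down-sampler-fragments or --down-sampler-coverage" >&2
        return 1
    fi

    # The genome size option is not compatible with --ref-dir.
    if is_set "$genome_size" && [ -n "$ref_dir" ]; then
        echo "error: --down-sampler-genome-size is not compatible with --ref-dir" >&2
        return 1
    fi

    # A coverage target needs a genome from one of the two sources.
    if is_set "$coverage" && ! is_set "$genome_size" && [ -z "$ref_dir" ]; then
        echo "error: --down-sampler-coverage needs --ref-dir or --down-sampler-genome-size" >&2
        return 1
    fi

    return 0
}
```

For example, check_down_sampler_opts 0 30 "" 3100000000 succeeds, while check_down_sampler_opts 0 30 /staging/reference/hg38 3100000000 fails because both a reference directory and a genome size are given.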