CNV Input
The DRAGEN CNV pipeline supports multiple input formats. To run the DRAGEN CNV pipeline directly with FASTQ input without generating a BAM or CRAM file, see the instructions later in this section on streaming alignment records directly from the DRAGEN map/align stage.
DRAGEN CNV also supports running from an already mapped and aligned BAM or CRAM file. If you have data that has not yet been mapped and aligned, see the map/align instructions later in this section.
For the DRAGEN CNV pipeline, the hashtable must be generated with the --ht-build-cnv-hashtable option set to true, in addition to any other options required by other pipelines. When --ht-build-cnv-hashtable is true, DRAGEN generates an additional k-mer uniqueness map that the CNV algorithm uses to counteract mappability biases. You only need to generate the k-mer uniqueness map file one time per reference hashtable. The generation takes about 1.5 hours per whole human genome.
The reference hashtable is a pregenerated binary representation of the reference genome. For information on generating a hashtable, see the DRAGEN documentation on reference hashtable generation.
The following example command generates a hashtable.
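A minimal sketch of such a command is shown here. The paths are hypothetical placeholders, and the `DRAGEN` wrapper variable and `build_cnv_hashtable` function are illustrative helpers, not part of DRAGEN. The `--build-hash-table`, `--ht-reference`, and `--output-directory` options are assumed from typical DRAGEN hashtable builds; confirm them against the documentation for your DRAGEN version.

```bash
# Hypothetical placeholder paths; replace with your own.
REF_FASTA="/path/to/reference.fa"
HT_DIR="/path/to/hashtable_dir"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Build the reference hashtable together with the k-mer uniqueness map
# that the CNV algorithm needs.
build_cnv_hashtable() {
  $DRAGEN \
    --build-hash-table true \
    --ht-reference "$REF_FASTA" \
    --output-directory "$HT_DIR" \
    --ht-build-cnv-hashtable true
}

build_cnv_hashtable
```

Because the k-mer uniqueness map is generated once per reference hashtable, this command only needs to be rerun when the reference changes.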
The following command-line examples show how to run the DRAGEN map/align pipeline depending on your input type. The map/align pipeline generates an alignment file, either BAM or CRAM, that can then be used in the DRAGEN CNV pipeline.
You need to generate alignment files for all samples that have not already been mapped and aligned, including any samples to be used as references for normalization. Each sample must have a unique sample identifier. Use the --RGSM option to specify the identifier. For BAM and CRAM input files, the sample identifier is taken from the file, so the --RGSM option is not required.
The following example command maps and aligns a FASTQ file:
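A sketch of a paired-end FASTQ run, under stated assumptions: all paths, the sample name, and the read group ID are hypothetical placeholders, and the `DRAGEN` wrapper variable and `map_align_fastq` function are illustrative helpers. The options other than `--RGSM` are assumed from common DRAGEN map/align usage and should be checked against your DRAGEN version.

```bash
# Hypothetical placeholder values; replace with your own.
HASHTABLE="/path/to/hashtable_dir"
FASTQ1="/path/to/sample_R1.fastq.gz"
FASTQ2="/path/to/sample_R2.fastq.gz"
SAMPLE="sample1"   # must be unique per sample
RGID="rg1"
OUT_DIR="/path/to/output"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Map and align a FASTQ pair; --RGSM supplies the sample identifier.
map_align_fastq() {
  $DRAGEN \
    -r "$HASHTABLE" \
    -1 "$FASTQ1" \
    -2 "$FASTQ2" \
    --RGSM "$SAMPLE" \
    --RGID "$RGID" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$SAMPLE" \
    --enable-map-align true
}

map_align_fastq
```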
The following example command maps and aligns an existing BAM file:
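A sketch for BAM input, with the same caveats: paths and the output prefix are hypothetical, the `DRAGEN` variable and `map_align_bam` function are illustrative helpers, and `--bam-input` is assumed to be the BAM input option in your DRAGEN version. No `--RGSM` is passed because the sample identifier is taken from the file.

```bash
# Hypothetical placeholder values; replace with your own.
HASHTABLE="/path/to/hashtable_dir"
BAM="/path/to/sample.bam"
PREFIX="sample1"
OUT_DIR="/path/to/output"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Remap and realign an existing BAM; the sample ID comes from the BAM itself.
map_align_bam() {
  $DRAGEN \
    -r "$HASHTABLE" \
    --bam-input "$BAM" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$PREFIX" \
    --enable-map-align true
}

map_align_bam
```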
The following example command maps and aligns an existing CRAM file:
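A sketch for CRAM input, again with hypothetical paths and illustrative helper names (`DRAGEN`, `map_align_cram`); `--cram-input` is assumed to be the CRAM input option in your DRAGEN version. As with BAM input, the sample identifier is read from the file.

```bash
# Hypothetical placeholder values; replace with your own.
HASHTABLE="/path/to/hashtable_dir"
CRAM="/path/to/sample.cram"
PREFIX="sample1"
OUT_DIR="/path/to/output"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Remap and realign an existing CRAM; the sample ID comes from the CRAM itself.
map_align_cram() {
  $DRAGEN \
    -r "$HASHTABLE" \
    --cram-input "$CRAM" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$PREFIX" \
    --enable-map-align true
}

map_align_cram
```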
DRAGEN can map and align FASTQ samples, and then directly stream them to downstream callers, such as the CNV Caller and the Haplotype Variant Caller. You can use this process to skip generation of a BAM or CRAM file, which bypasses the need to store additional files.
To stream alignments directly to the DRAGEN CNV pipeline, run the FASTQ sample through a regular DRAGEN map/align workflow, and then provide additional arguments to enable CNV. The following example command line maps and aligns a FASTQ file, and then streams the alignments to the Germline CNV WGS pipeline.
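A sketch of a streaming run, assuming the same map/align options as a standalone FASTQ run plus `--enable-cnv` and `--cnv-enable-self-normalization`. Those two CNV option names are assumptions based on common DRAGEN germline WGS usage; all paths and identifiers are hypothetical, and the `DRAGEN` variable and `stream_fastq_to_cnv` function are illustrative helpers.

```bash
# Hypothetical placeholder values; replace with your own.
HASHTABLE="/path/to/hashtable_dir"
FASTQ1="/path/to/sample_R1.fastq.gz"
FASTQ2="/path/to/sample_R2.fastq.gz"
SAMPLE="sample1"
RGID="rg1"
OUT_DIR="/path/to/output"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Map/align the FASTQ pair and stream the alignments straight into CNV.
stream_fastq_to_cnv() {
  $DRAGEN \
    -r "$HASHTABLE" \
    -1 "$FASTQ1" \
    -2 "$FASTQ2" \
    --RGSM "$SAMPLE" \
    --RGID "$RGID" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$SAMPLE" \
    --enable-map-align true \
    --enable-cnv true \
    --cnv-enable-self-normalization true
}

stream_fastq_to_cnv
```

The hashtable passed with `-r` must have been built with --ht-build-cnv-hashtable set to true.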
When the input is a FASTQ sample, a single sample can also run through both the CNV caller and the small variant caller (VC) in one invocation. In this mode, CNV uses self-normalization by default.
Run the FASTQ sample through a regular DRAGEN map/align workflow, and then provide additional arguments to enable CNV, the VC, or both. The options that apply to CNV in the standalone workflows also apply here.
The following examples show different commands.
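One possible form is sketched here: a single FASTQ sample run through map/align, CNV, and the small variant caller together. The `--enable-cnv`, `--cnv-enable-self-normalization`, and `--enable-variant-caller` option names are assumptions based on common DRAGEN usage; paths and identifiers are hypothetical, and the `DRAGEN` variable and `run_cnv_and_vc` function are illustrative helpers.

```bash
# Hypothetical placeholder values; replace with your own.
HASHTABLE="/path/to/hashtable_dir"
FASTQ1="/path/to/sample_R1.fastq.gz"
FASTQ2="/path/to/sample_R2.fastq.gz"
SAMPLE="sample1"
RGID="rg1"
OUT_DIR="/path/to/output"

# Dry run by default: prints the command. Set DRAGEN=dragen to execute it.
DRAGEN="${DRAGEN:-echo dragen}"

# Map/align once, then run CNV and the small variant caller on the same sample.
run_cnv_and_vc() {
  $DRAGEN \
    -r "$HASHTABLE" \
    -1 "$FASTQ1" \
    -2 "$FASTQ2" \
    --RGSM "$SAMPLE" \
    --RGID "$RGID" \
    --output-directory "$OUT_DIR" \
    --output-file-prefix "$SAMPLE" \
    --enable-map-align true \
    --enable-cnv true \
    --cnv-enable-self-normalization true \
    --enable-variant-caller true
}

run_cnv_and_vc
```

Dropping the `--enable-variant-caller` line gives a CNV-only run, and dropping the two CNV lines gives a VC-only run.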
For information on running CNV concurrently with the Haplotype Variant Caller, see the DRAGEN documentation on concurrent CNV and small variant calling.