Copy Number Variant Calling

The DRAGEN Copy Number Variant (CNV) Pipeline detects copy number aberrations and regions with loss of heterozygosity (LOH) from next-generation sequencing (NGS) data. It supports germline and somatic workflows for both whole-genome sequencing (WGS) and whole-exome sequencing (WES) in a single interface via the DRAGEN Host Software.

Choose Your Workflow

Use the table below to identify the right pipeline for your use case and jump directly to its documentation.

Sample type | Data type | Input samples  | Documentation
Germline    | WGS / WES | Single / Multi |
Somatic     | WGS / WES | Single         |

Example command lines are provided under DRAGEN Recipes.

Visit Reference for more details on the CNV component.

Before You Begin

Before running the CNV pipeline, ensure the following prerequisites are in place:

  1. CNV-enabled reference hashtable — The hashtable must be built with --ht-build-cnv-hashtable true. This generates an additional k-mer uniqueness map used to correct mappability biases.

  2. Aligned BAM or CRAM input — The pipeline accepts pre-aligned reads. If you are starting from FASTQ, first run map/align or use streaming alignments.

  3. Panel of normals (required for WES) — WES normalization requires a panel of normals. WGS can use self-normalization instead.
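The three prerequisites above can be expressed as a small pre-flight check. The following is a minimal sketch only, not part of DRAGEN: the function name check_prerequisites and all of its parameters are hypothetical, and the only DRAGEN option it mentions is the --ht-build-cnv-hashtable flag named above.

```python
def check_prerequisites(data_type, cnv_hashtable_built, input_path,
                        panel_of_normals=None):
    """Return a list of unmet prerequisites (empty list means ready to run).

    Hypothetical helper for illustration; it does not call DRAGEN.
    data_type           -- "WGS" or "WES"
    cnv_hashtable_built -- True if the hashtable was built with
                           --ht-build-cnv-hashtable true
    input_path          -- path to the pre-aligned input file
    panel_of_normals    -- list of normal samples, or None
    """
    problems = []

    # 1. The reference hashtable must include the k-mer uniqueness map.
    if not cnv_hashtable_built:
        problems.append(
            "reference hashtable was not built with "
            "--ht-build-cnv-hashtable true")

    # 2. Pre-aligned input is expected; FASTQ needs map/align first
    #    (or streaming alignments).
    if not input_path.lower().endswith((".bam", ".cram")):
        problems.append("input is not an aligned BAM or CRAM file")

    # 3. WES requires a panel of normals; WGS can self-normalize.
    if data_type.upper() == "WES" and not panel_of_normals:
        problems.append("WES normalization requires a panel of normals")

    return problems
```

For example, check_prerequisites("WES", True, "sample.cram") reports the missing panel of normals, while the same call with "WGS" reports nothing because WGS can fall back to self-normalization.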

Pipeline Overview

The DRAGEN CNV Pipeline processes the input signal through the following stages:

1. Target Counts — Read counts and other signals are extracted from alignments and binned into target intervals.

2. Normalization — The case sample is normalized against a panel of normals or against the estimated normal ploidy. Systematic biases (e.g., GC bias) are corrected to amplify event-level signals.

3. Segmentation — The normalized signal is segmented using one of the available segmentation algorithms.

4. CNV Calling — Events are called from the segments, scored, and emitted in the output VCF.
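To make the four stages concrete, here is a toy sketch in Python. It is not the DRAGEN implementation: the median-based normalization, the jump-based segmentation, and the fixed log2 thresholds are simplified stand-ins chosen for illustration, and every function name and parameter is an assumption.

```python
import math
from statistics import median


def target_counts(read_starts, bin_size, n_bins):
    """Stage 1: count read start positions per fixed-width target bin."""
    counts = [0] * n_bins
    for pos in read_starts:
        idx = pos // bin_size
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def normalize(counts):
    """Stage 2 (stand-in): log2 ratio of each bin against the sample's
    own median non-zero count, loosely mimicking self-normalization."""
    baseline = median(c for c in counts if c > 0)
    return [math.log2(max(c, 0.5) / baseline) for c in counts]


def segment(log2_ratios, jump=0.4):
    """Stage 3 (stand-in): open a new segment when a bin departs from the
    running segment mean by more than `jump`.
    Returns (start_bin, end_bin_exclusive, mean_log2_ratio) tuples."""
    segments = []
    start, total = 0, log2_ratios[0]
    for i in range(1, len(log2_ratios)):
        mean = total / (i - start)
        if abs(log2_ratios[i] - mean) > jump:
            segments.append((start, i, mean))
            start, total = i, log2_ratios[i]
        else:
            total += log2_ratios[i]
    segments.append((start, len(log2_ratios),
                     total / (len(log2_ratios) - start)))
    return segments


def call_events(segments, threshold=0.3):
    """Stage 4 (stand-in): label segments whose mean log2 ratio exceeds
    the threshold in either direction; neutral segments are dropped."""
    calls = []
    for start, end, mean in segments:
        if mean >= threshold:
            calls.append((start, end, "GAIN"))
        elif mean <= -threshold:
            calls.append((start, end, "LOSS"))
    return calls


# Made-up bin counts with one dip and one bump.
counts = [100, 100, 100, 100, 50, 50, 50, 100, 100, 150, 150, 100]
events = call_events(segment(normalize(counts)))
```

With these made-up counts, the dip in bins 4 to 6 is labeled a loss and the bump in bins 9 to 10 a gain. A real caller also scores each event and writes it to a VCF, which this sketch omits.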
