Somatic CNV Calling WES

For somatic whole-exome sequencing (WES) and somatic targeted panels, you can use a panel of normals as the reference baseline to provide insight into copy number variants. The reported events are based solely on the normalized copy ratio values and the deviation from the expected reference baseline levels. This workflow can be useful for applications that require only the detection of gains and losses in targeted genes. The somatic WES CNV model is similar to the germline WES CNV model, but utilizes a different quality scoring and calling model.

Use one of the following input options.

  • --tumor-fastq1 and --tumor-fastq2 --Specify FASTQ files

  • --tumor-bam-input --Specify an existing BAM file

  • --tumor-cram-input --Specify an existing CRAM file

The Somatic WES CNV Caller requires a panel of normals. The panel of normals samples help measure intrinsic biases of the upstream processes so that the case sample can be normalized properly. To generate a panel of normals, see Panel of Normals. The panel of normals samples should be well matched to the case sample under analysis.

If a matched normal sample is available, the sample can be included in the panel of normals. The workflow is the same whether or not a matched normal is available.

Example Command Lines

The following example command line runs somatic analysis on WES data.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-normals-list <NORMALS> \
--cnv-target-bed <BED>

If you are using somatic targeted panels with a set of genes supplied with the capture kit, you can bypass segmentation by specifying --cnv-segmentation-bed and setting --cnv-segmentation-mode to bed. With this option, all events in the segmentation BED are reported in the output VCF. For more information on the segmentation BED file, see Targeted Segmentation (Segment BED).

The following example command line runs somatic analysis on a targeted panel.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-normals-list <NORMALS> \
--cnv-target-bed <BED> \
--cnv-segmentation-bed <SEGMENT_BED> \
--cnv-segmentation-mode bed

Quality Scoring and Calling

The Somatic WES CNV Caller computes quality scores using a two-sample t-test between the normalized copy ratios of the case sample and those of the panel of normals samples. The caller computes one p-value per segment and then converts the p-values to Phred-scaled scores. For copy-neutral events, the caller computes the quality score from 1-p instead of p.
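The Phred conversion described above can be sketched as follows. This is a minimal illustration, not the caller's implementation: it assumes the standard Phred definition Q = -10 x log10(p), assumes that the copy-neutral case applies the same conversion to 1-p, and uses a hypothetical cap (MAX_QUAL) to keep the score finite when p is 0. The function names are illustrative only.

```python
import math

# Hypothetical cap for p == 0; this value is an assumption, not a documented default.
MAX_QUAL = 1000.0


def phred_from_p(p: float) -> float:
    """Convert a probability to a Phred-scaled score, Q = -10 * log10(p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p == 0.0:
        return MAX_QUAL
    # max() avoids returning -0.0 when p == 1.
    return min(MAX_QUAL, max(0.0, -10.0 * math.log10(p)))


def segment_quality(p: float, copy_neutral: bool) -> float:
    """Score a segment from its t-test p-value.

    Assumption: for copy-neutral segments, 1 - p is converted instead of p.
    """
    return phred_from_p(1.0 - p if copy_neutral else p)


if __name__ == "__main__":
    for p in (0.5, 0.01, 0.0001):
        print(p, round(segment_quality(p, copy_neutral=False), 2),
              round(segment_quality(p, copy_neutral=True), 2))
```

Under these assumptions, a small p-value (case sample clearly different from the panel of normals) gives a high score for a gain or loss, and a large p-value gives a high score for a copy-neutral segment.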

DUP/DEL calls are made based on the limit of detection (LoD) threshold, which is set using cnv-filter-limit-of-detection (default 0.2). For each segment, the caller computes p-values against the panel of normals (PON) for hypothetical counts obtained by scaling the case counts by (1 + LoD) and by (1 - LoD). If the p-value for Case Counts x (1 + LoD) is the highest, the segment is called DUP. If the p-value for Case Counts x (1 - LoD) is the highest, the segment is called DEL. Otherwise, the segment is called REF.
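The decision rule above can be sketched in a few lines. This is an illustration under stated assumptions rather than the caller's code: the p-values are taken as inputs (the t-test itself is not reproduced), "highest" is assumed to mean highest among the two scaled hypotheses and the unscaled case counts, and the names hypothetical_counts, call_segment, p_gain, p_loss, and p_unscaled are hypothetical.

```python
from typing import List, Tuple


def hypothetical_counts(case_counts: List[float],
                        lod: float = 0.2) -> Tuple[List[float], List[float]]:
    """Scale the case counts by (1 + LoD) and (1 - LoD).

    The default of 0.2 mirrors the documented default of
    cnv-filter-limit-of-detection.
    """
    scaled_up = [c * (1.0 + lod) for c in case_counts]
    scaled_down = [c * (1.0 - lod) for c in case_counts]
    return scaled_up, scaled_down


def call_segment(p_gain: float, p_loss: float, p_unscaled: float) -> str:
    """Pick DUP, DEL, or REF from three p-values against the PON.

    p_gain:     p-value for Case Counts x (1 + LoD)
    p_loss:     p-value for Case Counts x (1 - LoD)
    p_unscaled: p-value for the unscaled case counts (assumed third
                candidate; the text only says "otherwise REF")
    """
    if p_gain > p_loss and p_gain > p_unscaled:
        return "DUP"
    if p_loss > p_gain and p_loss > p_unscaled:
        return "DEL"
    return "REF"


if __name__ == "__main__":
    print(call_segment(p_gain=0.60, p_loss=0.01, p_unscaled=0.20))
    print(call_segment(p_gain=0.02, p_loss=0.70, p_unscaled=0.30))
    print(call_segment(p_gain=0.10, p_loss=0.10, p_unscaled=0.80))
```

Ties fall through to REF in this sketch, which is a conservative choice made for the illustration only.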

The output VCF contains the quality score in the QUAL field.

Tumor Purity and Fold Change

Tumor purity can be estimated automatically through the ASCN workflow.

The non-ASCN Somatic WES CNV Caller reports only the copy ratio, also known as fold change. Fold change is encoded in the FORMAT/SM field as a linear copy ratio of the segment mean. In this case, if the tumor purity is known, you can infer the copy number of a gene or segment in the sample from the reported fold change using the following calculation.

$$\text{Copy Number} = \frac{(200 \times \text{Fold Change}) - 2 \times (100 - \text{Tumor Purity \%})}{\text{Tumor Purity \%}}$$

For example, if the tumor purity is 30% for MET with a fold change of 2.2x, then there are 10 copies of MET DNA in the sample.
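The calculation can be checked with a short script. This is a minimal sketch of the formula above; the function name copy_number_from_fold_change and its parameter names are illustrative, and tumor purity is assumed to be given as a percentage between 0 (exclusive) and 100.

```python
def copy_number_from_fold_change(fold_change: float,
                                 tumor_purity_pct: float) -> float:
    """Infer copy number from fold change and tumor purity (in percent).

    Implements:
        [(200 x Fold Change) - (2 x [100 - Tumor Purity %])] / Tumor Purity %
    """
    if not 0.0 < tumor_purity_pct <= 100.0:
        raise ValueError("tumor_purity_pct must be in (0, 100]")
    normal_contribution = 2.0 * (100.0 - tumor_purity_pct)
    return (200.0 * fold_change - normal_contribution) / tumor_purity_pct


if __name__ == "__main__":
    # MET example from the text: 30% purity, fold change 2.2
    # (200 x 2.2 - 2 x 70) / 30 = 300 / 30
    print(f"{copy_number_from_fold_change(2.2, 30.0):.1f}")  # prints 10.0
```

At 100% purity the formula reduces to 2 x Fold Change, so a fold change of 1.0 corresponds to the diploid copy number of 2.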
