Allele Specific CNV for Somatic WES CNV

To detect somatic copy number aberrations and regions with loss of heterozygosity, run the DRAGEN CNV Caller on a tumor sample together with a VCF that contains germline SNVs, either from a matched normal sample or from a population SNV catalog. The output is a VCF file. The somatic algorithm reuses components of the germline CNV caller and adds a somatic modeling component, which estimates tumor purity and ploidy.

The germline SNVs are used to compute B-allele ratios in the tumor, which allows for allele-specific copy number calling on the tumor sample. Where possible, use the small-variant VCF from a matched normal sample.
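As a rough illustration of what a B-allele ratio measures, the sketch below assumes it is the fraction of reads supporting the alternate allele at a germline SNV site. The helper name b_allele_ratio and the read counts are hypothetical, and DRAGEN's internal estimator may differ.

```shell
# Hypothetical helper: ratio = alt / (ref + alt) at a single germline SNV site.
# Illustrative definition only, not DRAGEN's internal computation.
b_allele_ratio() {
  # $1 = reads supporting the reference allele, $2 = reads supporting the alternate allele
  awk -v r="$1" -v a="$2" 'BEGIN {
    if (r + a == 0) { print "NA"; exit }   # no coverage at this site
    printf "%.3f\n", a / (r + a)
  }'
}

b_allele_ratio 48 52   # roughly balanced heterozygous site
b_allele_ratio 20 80   # allelic imbalance, as expected under a copy number change
```

Ratios near 0.5 at heterozygous sites are consistent with balanced alleles, while systematic shifts away from 0.5 across a segment suggest allele-specific copy number change or loss of heterozygosity.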

A panel of normals (PON) is used as the reference baseline for copy number calling. The ASCN somatic WES CNV model is similar to the somatic WGS CNV model (with different internal parameters tuned for WES), but it uses the panel of normals to remove coverage bias in each target region.

The pipeline accepts several input types for the B-allele loci: a matched normal sample or a population SNV VCF. If the normal sample was already processed with the germline small variant caller, you can provide its output VCF file.

If the normal sample was not processed, you can provide raw or aligned reads and enable concurrent execution of the small variant caller. In that case, DRAGEN CNV receives the small variant caller's output and uses it to estimate B-allele frequencies from the germline SNVs.

If there is no matched normal sample, you can provide a population SNV VCF. DRAGEN intersects the population SNV VCF with the target regions provided by --cnv-target-bed and uses the resulting SNVs to estimate B-allele frequencies.
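The intersection of a population SNV VCF with target regions can be pictured with the sketch below. It is an illustration only: DRAGEN performs this step internally, and the function name filter_vcf_by_bed and the sample records are hypothetical. The sketch assumes tab-separated, uncompressed files, a non-empty BED file, 0-based half-open BED intervals, and 1-based VCF positions.

```shell
# Illustrative only: keep VCF records whose POS falls inside a BED interval.
filter_vcf_by_bed() {
  # $1 = target BED file, $2 = uncompressed VCF file
  awk -F'\t' '
    NR == FNR {                                # first file: load BED intervals per chromosome
      n[$1]++
      s[$1, n[$1]] = $2 + 0
      e[$1, n[$1]] = $3 + 0
      next
    }
    /^#/ { print; next }                       # pass VCF header lines through
    {
      pos = $2 + 0
      for (i = 1; i <= n[$1]; i++)
        if (pos > s[$1, i] && pos <= e[$1, i]) { print; break }
    }
  ' "$1" "$2"
}

# Tiny demo: one target region and three population SNVs.
tmp=$(mktemp -d)
printf 'chr1\t100\t200\n' > "$tmp/targets.bed"
printf '#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t150\t.\tA\tG\nchr1\t250\t.\tC\tT\n' > "$tmp/pop.vcf"
filter_vcf_by_bed "$tmp/targets.bed" "$tmp/pop.vcf"
```

In the demo, only the record at position 150 is kept: position 100 lies just before the interval (the BED start is 0-based) and position 250 lies outside it.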

ASCN Somatic WES CNV Calling Options

You can use the following somatic WES CNV calling command-line options:

| Input | Option | Argument | Description |
| --- | --- | --- | --- |
| Tumor input | --tumor-fastq1, --tumor-fastq2, --tumor-bam-input, --tumor-cram-input | file | Specify a tumor input file. |
| Normal input Option 1 | --fastq1, --fastq2, --bam-input, --cram-input | file | Specify a normal input file (if the normal VCF is not ready). |
| Normal input Option 1 | --cnv-use-somatic-vc-baf | true/false | If running in tumor-normal mode with the SNV caller enabled, use this option to specify the germline heterozygous sites. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
| Normal input Option 2 | --cnv-normal-b-allele-vcf | VCF file | Specify a matched normal SNV VCF. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
| Normal input Option 3 | --cnv-population-b-allele-vcf | VCF file | Specify a population SNP catalog. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
| PON option 1 | --cnv-normals-file | normal counts file | Specify an individual normal counts file (target.counts.gz or target.counts.gc-corrected.gz) for the PON. You can use this option multiple times, one time for each file. |
| PON option 2 | --cnv-normals-list | text file listing one normal counts file per line | Specify a text file that contains the paths to the reference target counts files to be used as a panel of normals (newline separated). |
| PON option 3 | --cnv-combined-counts | file | Specify a combined PON file (.combined.counts.txt.gz). |
| PON option 4 | NA | NA | If no PON sample is specified, DRAGEN uses the matched normal sample as a single-sample PON. Available only with Normal input Option 1. |
| Target region | --cnv-target-bed | BED file | Specify the target region BED file. |
| Sample sex | --sample-sex | male/female/auto/none | If known, specify the sex of the sample. If the sample sex is not specified, the caller attempts to estimate the sample sex from the tumor alignments. |
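The list file expected by --cnv-normals-list is plain text with one counts file path per line. A minimal way to build one might look like the following sketch, which assumes a hypothetical directory holding one *.target.counts.gz file per normal sample; the helper name make_normals_list and the sample file names are illustrative.

```shell
# Hypothetical helper: collect per-sample target counts files into a list file.
# The directory layout and the <sample>.target.counts.gz naming are assumptions.
make_normals_list() {
  # $1 = directory holding normal counts files, $2 = list file to write
  find "$1" -type f -name '*.target.counts.gz' | sort > "$2"
}

# Demo with empty placeholder files standing in for real counts files.
pon_dir=$(mktemp -d)
touch "$pon_dir/normalA.target.counts.gz" "$pon_dir/normalB.target.counts.gz"
list_file=$(mktemp)
make_normals_list "$pon_dir" "$list_file"
cat "$list_file"   # one path per line
```

The resulting file can then be passed as the argument to --cnv-normals-list.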

Input requirements:

  • 1 tumor input

  • 1 normal input (either option 1, 2, or 3)

  • Panel of normals (either option 1, 2, 3 or 4)

  • Target region

When the normal sample input is not a VCF (for example, FASTQ, BAM, or CRAM), DRAGEN can also use the normal sample as a PON sample. If the normal sample is already included in the PON, it is not added again.
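The input requirements above can be expressed as a small pre-flight check before launching a run. This is only a sketch: the function check_ascn_inputs and the mode names reads, normal_vcf, and population_vcf are hypothetical labels for Normal input Options 1, 2, and 3, not DRAGEN options, and DRAGEN performs its own validation.

```shell
# Hypothetical pre-flight check; argument names and mode labels are illustrative.
check_ascn_inputs() {
  # $1 = tumor input, $2 = normal input mode, $3 = PON (may be empty), $4 = target BED
  [ -n "$1" ] || { echo "missing tumor input"; return 1; }
  case "$2" in
    reads|normal_vcf|population_vcf) ;;
    *) echo "normal input must be reads, normal_vcf, or population_vcf"; return 1 ;;
  esac
  # Without a PON, the matched normal serves as a single-sample PON,
  # which is only possible when the normal is supplied as reads (Option 1).
  if [ -z "$3" ] && [ "$2" != "reads" ]; then
    echo "a panel of normals is required unless the normal input is reads"
    return 1
  fi
  [ -n "$4" ] || { echo "missing target region BED"; return 1; }
  echo "ok"
}

check_ascn_inputs tumor.bam reads "" targets.bed              # valid: PON option 4
check_ascn_inputs tumor.bam population_vcf "" targets.bed     # invalid: no PON available
```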

Example command lines

The following example command line runs ASCN tumor-normal somatic WES CNV calling with a matched normal SNV VCF.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-normal-b-allele-vcf <SNV_NORMAL_VCF> \
--cnv-normals-list <PANEL_OF_NORMALS> \
--cnv-target-bed <TARGET_BED> \
--sample-sex <SEX>

The following example command line runs ASCN tumor-normal somatic WES CNV calling concurrently with the Somatic SNV Caller. Setting the option --cnv-use-somatic-vc-baf true lets the CNV caller use the matched normal germline heterozygous sites directly from the SNV Caller.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--tumor-bam-input <TUMOR_BAM> \
--bam-input <NORMAL_BAM> \
--enable-cnv true \
--enable-variant-caller true \
--cnv-use-somatic-vc-baf true \
--cnv-normals-list <PANEL_OF_NORMALS> \
--cnv-target-bed <TARGET_BED> \
--vc-target-bed <TARGET_BED> \
--sample-sex <SEX>

If a matched normal is not available, DRAGEN CNV requires a population SNV VCF to run in tumor-only mode. The following example command line runs tumor-only somatic WES CNV calling with a population SNV VCF.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-population-b-allele-vcf <SNV_POP_VCF> \
--cnv-normals-list <PANEL_OF_NORMALS> \
--cnv-target-bed <TARGET_BED> \
--sample-sex <SEX>

Method and Outputs

The ASCN somatic WES CNV pipeline uses the same methods and workflow as the DRAGEN somatic WGS CNV pipeline. For more details, see Somatic WGS CNV Calling.
