Allele Specific CNV for Somatic WES CNV
To detect somatic copy number aberrations and regions with loss of heterozygosity, run the DRAGEN CNV Caller on a tumor sample together with a VCF that contains germline SNVs, taken either from a matched normal sample or from a population SNV VCF. The output is a VCF file. The somatic algorithm reuses components of the germline CNV caller and adds a somatic modeling component, which estimates tumor purity and ploidy.
The germline SNVs are used to compute B-allele ratios in the tumor, which allows allele-specific copy number (ASCN) calling on the tumor sample. Where possible, use the small-variant VCF from a matched normal sample.
A panel of normals (PON) serves as the reference baseline for copy number calling. The ASCN somatic WES CNV model is similar to the somatic WGS CNV model (with different internal parameters tuned for WES), but it uses the panel of normals to remove coverage bias in each target region.
The pipeline accepts several input types for the matched normal sample or population SNV VCF that supplies the B-allele loci. If the normal sample was already processed with the germline small variant caller, you can provide its output VCF file.
If the normal sample was not processed, you can provide raw reads or aligned reads and enable concurrent execution of the small variant caller. In that case, DRAGEN CNV receives the small variant caller's output and uses it to estimate B-allele frequencies from the germline SNVs.
If there is no matched normal sample, you can provide a population SNV VCF. DRAGEN intersects the population SNV VCF with the target regions provided by --cnv-target-bed and uses the resulting SNVs to estimate B-allele frequencies.
Somatic WES CNV calling uses the following inputs, each set with the command-line options listed in the table at the end of this section:
- 1 tumor input
- 1 normal input (either option 1, 2, or 3)
- Panel of normals (either option 1, 2, 3, or 4)
- Target region
When the normal sample input is not in VCF format (for example, FASTQ, BAM, or CRAM), the normal sample can also be used as a PON sample. However, if the normal sample is already included in the PON, it is not added again.
The following is an example command line for running ASCN tumor-normal somatic WES CNV calling with a matched normal SNV VCF.
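A minimal sketch of such a command, assembled from the options in the table in this section. All paths and the sample name are hypothetical placeholders. The options -r, --output-directory, --output-file-prefix, --enable-map-align, and --enable-cnv are not described in this section and are assumed here; check them against the user guide for your DRAGEN version. The script only prints the command until the last line is uncommented.

```shell
# Hypothetical placeholder paths; replace them with your own files.
REF_DIR=/path/to/reference
OUT_DIR=/path/to/output
SAMPLE=tumor_sample
TUMOR_BAM=/path/to/tumor.bam
PON_LIST=/path/to/normals.txt
TARGET_BED=/path/to/targets.bed
NORMAL_VCF=/path/to/normal.hard-filtered.vcf.gz

# Assemble the command in the positional parameters so it can be reviewed first.
# Assumed options (not listed in this section): -r, --output-directory,
# --output-file-prefix, --enable-map-align, --enable-cnv.
set -- dragen \
  -r "$REF_DIR" \
  --output-directory "$OUT_DIR" \
  --output-file-prefix "$SAMPLE" \
  --enable-map-align false \
  --enable-cnv true \
  --tumor-bam-input "$TUMOR_BAM" \
  --cnv-normals-list "$PON_LIST" \
  --cnv-target-bed "$TARGET_BED" \
  --cnv-normal-b-allele-vcf "$NORMAL_VCF"

# Dry run: print the assembled command line.
DRAGEN_CMD="$*"
echo "$DRAGEN_CMD"

# Uncomment to launch the run on a DRAGEN server.
# "$@"
```

Here the normal input is Option 2 (matched normal SNV VCF) and the panel of normals is Option 2 (a list file); either can be swapped for another option from the table.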
The following example command line runs ASCN tumor-normal somatic WES CNV calling concurrently with the Somatic SNV Caller, which lets you use the matched normal germline heterozygous sites directly from the SNV Caller by setting --cnv-use-somatic-vc-baf true.
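A minimal sketch of a concurrent run, using Normal input Option 1 from the table in this section (normal reads plus --cnv-use-somatic-vc-baf). All paths and the sample name are hypothetical placeholders. The options -r, --output-directory, --output-file-prefix, --enable-map-align, --enable-cnv, and --enable-variant-caller are not described in this section and are assumed here; a real exome run may need further small variant caller settings that are out of scope for this sketch.

```shell
# Hypothetical placeholder paths; replace them with your own files.
REF_DIR=/path/to/reference
OUT_DIR=/path/to/output
SAMPLE=tumor_sample
TUMOR_BAM=/path/to/tumor.bam
NORMAL_BAM=/path/to/normal.bam
PON_LIST=/path/to/normals.txt
TARGET_BED=/path/to/targets.bed

# Assemble the command in the positional parameters so it can be reviewed first.
# Assumed options (not listed in this section): -r, --output-directory,
# --output-file-prefix, --enable-map-align, --enable-cnv, --enable-variant-caller.
set -- dragen \
  -r "$REF_DIR" \
  --output-directory "$OUT_DIR" \
  --output-file-prefix "$SAMPLE" \
  --enable-map-align false \
  --enable-cnv true \
  --enable-variant-caller true \
  --tumor-bam-input "$TUMOR_BAM" \
  --bam-input "$NORMAL_BAM" \
  --cnv-normals-list "$PON_LIST" \
  --cnv-target-bed "$TARGET_BED" \
  --cnv-use-somatic-vc-baf true

# Dry run: print the assembled command line.
DRAGEN_CMD="$*"
echo "$DRAGEN_CMD"

# Uncomment to launch the run on a DRAGEN server.
# "$@"
```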
If a matched normal is not available, DRAGEN CNV requires a population SNV VCF to run in tumor-only mode. The following example command line runs tumor-only somatic WES CNV calling with a population SNV VCF.
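A minimal sketch of a tumor-only command, using Normal input Option 3 from the table in this section. All paths and the sample name are hypothetical placeholders. The options -r, --output-directory, --output-file-prefix, --enable-map-align, and --enable-cnv are not described in this section and are assumed here.

```shell
# Hypothetical placeholder paths; replace them with your own files.
REF_DIR=/path/to/reference
OUT_DIR=/path/to/output
SAMPLE=tumor_sample
TUMOR_BAM=/path/to/tumor.bam
PON_LIST=/path/to/normals.txt
TARGET_BED=/path/to/targets.bed
POPULATION_VCF=/path/to/population_snps.vcf.gz

# Assemble the command in the positional parameters so it can be reviewed first.
# Assumed options (not listed in this section): -r, --output-directory,
# --output-file-prefix, --enable-map-align, --enable-cnv.
set -- dragen \
  -r "$REF_DIR" \
  --output-directory "$OUT_DIR" \
  --output-file-prefix "$SAMPLE" \
  --enable-map-align false \
  --enable-cnv true \
  --tumor-bam-input "$TUMOR_BAM" \
  --cnv-normals-list "$PON_LIST" \
  --cnv-target-bed "$TARGET_BED" \
  --cnv-population-b-allele-vcf "$POPULATION_VCF"

# Dry run: print the assembled command line.
DRAGEN_CMD="$*"
echo "$DRAGEN_CMD"

# Uncomment to launch the run on a DRAGEN server.
# "$@"
```

Because no matched normal reads are supplied, PON option 4 does not apply, so this sketch passes an explicit panel of normals.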
The ASCN somatic WES CNV pipeline uses the same methods and workflow as the DRAGEN somatic WGS CNV pipeline. See Somatic WGS CNV Calling for more details.
Input | Option | Argument | Description |
---|---|---|---|
Tumor input | --tumor-fastq1, --tumor-fastq2, --tumor-bam-input, --tumor-cram-input | file | Specify a tumor input file. |
Normal input Option 1 | --fastq1, --fastq2, --bam-input, --cram-input | file | Specify a normal input file (if the normal VCF is not ready). |
Normal input Option 1 | --cnv-use-somatic-vc-baf | true/false | If running in tumor-normal mode with the SNV caller enabled, use this option to specify the germline heterozygous sites. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
Normal input Option 2 | --cnv-normal-b-allele-vcf | vcf file | Specify a matched normal SNV VCF. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
Normal input Option 3 | --cnv-population-b-allele-vcf | vcf file | Specify a population SNP catalog. For more information on specifying B-allele loci, see Specification of B-Allele Loci. |
PON option 1 | --cnv-normals-file | normal count file | Specify an individual normal counts file (target.counts.gz or target.counts.gc-corrected.gz) for the PON. You can use this option multiple times, one time for each file. |
PON option 2 | --cnv-normals-list | text file listing one normal count file per line | Specify a text file that contains the paths to the reference target counts files to be used as a panel of normals (newline separated). |
PON option 3 | --cnv-combined-counts | file | Specify a combined PON file (.combined.counts.txt.gz). |
PON option 4 | NA | | If no PON is specified, DRAGEN uses the matched normal sample as a single-sample PON. Available for Normal input Option 1. |
Target region | --cnv-target-bed | bed file | Specify the target region BED file. |
Sample sex | --sample-sex | male/female/auto/none | If known, specify the sex of the sample. If the sample sex is not specified, the caller attempts to estimate the sample sex from the tumor alignments. |
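For PON option 2, the list file is a plain text file with one counts-file path per line. A minimal sketch of one way to build it is shown below. The helper name make_pon_list and the example paths are hypothetical, and the sketch assumes the counts files from earlier normal runs sit under a single directory. It selects the GC-corrected counts files; change the pattern to '*.target.counts.gz' to list the uncorrected ones instead.

```shell
# make_pon_list DIR OUT
# Write the path of every GC-corrected counts file under DIR to OUT,
# one path per line, sorted so the list is reproducible.
make_pon_list() {
  find "$1" -type f -name '*.target.counts.gc-corrected.gz' | sort > "$2"
}

# Example call with hypothetical paths:
# make_pon_list /path/to/pon_counts /path/to/normals.txt
```

The resulting file can then be passed to --cnv-normals-list.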