CNV with SV Support

The DRAGEN CNV caller uses read depth as its primary signal for calling copy number variants. Depth alone makes it difficult to call events shorter than 10kbp. Sensitivity for CNVs shorter than 10kbp can be improved by using junction signals from the DRAGEN structural variant (SV) caller.

When the DRAGEN CNV and SV callers are both executed in a single invocation, an additional integration step runs at the end of the DRAGEN run to improve the CNV calls. This feature is enabled automatically when DRAGEN detects a germline WGS analysis.

The SV/CNV Integration module takes DEL and DUP calls from the output data structures of the germline CNV and SV callers, identifies putative matches, updates annotations, filters, and scores, and writes the refined records to a new output VCF. By combining junction signals from the SV caller with depth signals from the CNV caller, this approach allows sensitive CNV detection down to 1kbp while also improving recall and precision across length scales. It does this in two ways: it rescues previously low-quality calls when evidence is found from both callers, and it adjusts CNV breakends to the more accurate SV breakends. The matching algorithm takes into account the proximity of the events and the transition states at the breakends, among other factors.
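The proximity part of the matching step can be illustrated with a minimal Python sketch. This is not DRAGEN's implementation: the `Call` type, the `find_match` and `refine` functions, and the 1000 bp `max_dist` threshold are all hypothetical, and the sketch ignores transition states and scoring entirely. It only shows the idea of pairing a CNV with a nearby SV of the same type and then adopting the SV breakends.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    """Hypothetical minimal representation of a DEL or DUP call."""
    id: str
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def find_match(cnv: Call, svs: List[Call], max_dist: int = 1000) -> Optional[Call]:
    """Return the closest same-type SV whose two breakends both lie within
    max_dist of the CNV breakends, or None if there is no such SV.

    max_dist is an assumed threshold chosen for illustration only.
    """
    best: Optional[Call] = None
    best_dist: Optional[int] = None
    for sv in svs:
        if sv.chrom != cnv.chrom or sv.kind != cnv.kind:
            continue
        left = abs(sv.start - cnv.start)
        right = abs(sv.end - cnv.end)
        if left > max_dist or right > max_dist:
            continue
        dist = left + right
        if best_dist is None or dist < best_dist:
            best, best_dist = sv, dist
    return best


def refine(cnv: Call, sv: Call) -> Call:
    """Keep the CNV identity but adopt the SV breakends."""
    return Call(cnv.id, cnv.chrom, sv.start, sv.end, cnv.kind)


# Depth-based CNV with coarse boundaries, plus candidate SVs.
cnv = Call("cnv1", "1", 869000, 871000, "DEL")
svs = [
    Call("sv_far", "1", 900000, 901000, "DEL"),
    Call("sv1", "1", 869444, 870284, "DEL"),
]
match = find_match(cnv, svs)
if match is not None:
    print(refine(cnv, match))
```

With these inputs the CNV is paired with `sv1` and its coordinates become 869444 to 870284. A real matcher would also need to weigh call quality and breakend transition states, which this sketch leaves out.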

Example command lines

The following is an example command line for running a germline WGS analysis with both the CNV and SV callers enabled.

dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--bam-input <BAM> \
--enable-map-align false \
--enable-cnv true \
--cnv-enable-self-normalization true \
--enable-sv true

Other optional CNV or SV parameters can also be added.

Combined CNV/SV VCF Output

The original CNV and SV VCF output files, produced before integration, remain available in the DRAGEN output directory, as described elsewhere. In addition, an enhanced CNV VCF is written with the *.cnv_sv.vcf.gz extension. The VCF header lines in the *.cnv_sv.vcf.gz file mostly correspond to a concatenation of the individual header lines from the CNV and SV VCFs, with a few lines deduplicated and some new ones added. For details on the legacy header lines, refer to the individual CNV and SV user guide sections.

Newly added header lines are described in the following table.

| Header Field | Number | Type | Description |
| --- | --- | --- | --- |
| END_LEFT_BND_OF | 1 | String | ID of CNV whose left end is matched to the end of SV |
| END_RIGHT_BND_OF | 1 | String | ID of CNV whose right end is matched to the end of SV |
| LEFT_BND | 1 | String | ID of SV that matches the left end of CNV record |
| LEFT_BND_OF | 1 | String | ID of CNV whose left end is matched to SV |
| MatchSv | 1 | Integer | ID of original SV that was merged with CNV record |
| OrigCnvEnd | 1 | Integer | Coordinate of original CNV end |
| OrigCnvPos | 1 | Integer | Coordinate of original CNV pos |
| RIGHT_BND | 1 | String | ID of SV that matches the right end of CNV record |
| RIGHT_BND_OF | 1 | String | ID of CNV whose right end is matched to SV |
| SVCLAIM | A | String | Claim made by the structural variant call. Valid values are D, J, DJ for abundance, adjacency and both respectively |

Records that can be matched or rescued carry annotations indicating the breakpoint linkage between a CNV record and an SV record. If a complete match is found, the record contains the MatchSv annotation, which gives the ID field of the SV record matched to this CNV record. The SVCLAIM field indicates whether the evidence for a record comes from the depth signal (D), from junction signals (J), or from both (DJ).

Because the file mixes standalone SV records with CNV records, the FORMAT field annotations can differ from record to record. For details on the CNV or SV specific annotations, refer to the individual CNV and SV user guide sections.

Records that can be matched or rescued will have FILTER set to PASS. The original FILTERs are retained for records that were not matched or rescued. For example, the cnvLength FILTER will still be applied to standalone CNV records (those with SVCLAIM=D).
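One way to check how SVCLAIM and FILTER combine in a given output file is to tally them. The Python sketch below uses only the standard library and assumes tab-delimited VCF records. The helper names (`parse_info`, `tally_claims`, `tally_file`) are hypothetical, and the file path passed to `tally_file` is whatever *.cnv_sv.vcf.gz file the run produced.

```python
import gzip
from collections import Counter
from typing import Dict, Iterable


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO string into a dict; flag entries map to ''."""
    out: Dict[str, str] = {}
    for item in info.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        out[key] = value
    return out


def tally_claims(lines: Iterable[str]) -> Counter:
    """Count records per (SVCLAIM, FILTER) pair, skipping header lines."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        info = parse_info(fields[7])  # column 8 is INFO
        counts[(info.get("SVCLAIM", "missing"), fields[6])] += 1  # column 7 is FILTER
    return counts


def tally_file(path: str) -> Counter:
    """Tally a gzip-compressed VCF, for example a *.cnv_sv.vcf.gz file."""
    with gzip.open(path, "rt") as handle:
        return tally_claims(handle)
```

Given the behavior described above, every DJ record would be expected under PASS, while D-only records may still appear under filters such as cnvLength.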

Example records are shown below.

# Merged record, note presence of SVCLAIM=DJ and MatchSv
1   869444  DRAGEN:LOSS:1:869444-870284 N   <DEL>   150  PASS    SVLEN=-840;SVTYPE=CNV;END=870284;REFLEN=840;OrigCnvPos=869000;OrigCnvEnd=871000;SVCLAIM=DJ;MatchSv=DRAGEN:DEL:41710:0:0:0:2:0   GT:SM:CN:BC:GC:CT:AC:PE 1/1:0.649442:1:2:0.6785:0.408:0.3705:10,3
 
# CNV record that did not match, note presence of SVCLAIM=D
1   13472000    DRAGEN:LOSS:1:13472001-13663000 N   <DEL>   69  PASS    SVLEN=-191000;SVTYPE=CNV;END=13663000;REFLEN=191000;SVCLAIM=D   GT:SM:CN:BC:GC:CT:AC:PE 0/1:0.427273:1:141:0.467603:0.501092:0.498667:7,10
  
# SV record that did not match, note presence of SVCLAIM=J
1   14657708    DRAGEN:LOSS:1:14657708-14658485 N   <DEL>   150 PASS    END=14658485;SVTYPE=DEL;SVLEN=-776;CIGAR=1M776D;CIPOS=0,2;HOMLEN=2;HOMSEQ=TC;SVCLAIM=J  GT:FT:GQ:PL:PR:SR   0/1:PASS:671:908,0,668:36,13:27,18
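For a merged record, the OrigCnvPos and OrigCnvEnd annotations make it possible to see how far integration moved each breakend. The following Python sketch computes those shifts from the POS column and the INFO string of the merged example record above. The function name `breakend_shift` is hypothetical, and the sketch assumes that a record with MatchSv also carries END, OrigCnvPos, and OrigCnvEnd, as the example does.

```python
from typing import Optional, Tuple


def breakend_shift(pos: int, info: str) -> Optional[Tuple[int, int]]:
    """Return (left_shift, right_shift) in bp for a merged record,
    or None when the record has no MatchSv annotation."""
    # partition() yields (key, '=', value); [::2] keeps key and value.
    fields = dict(item.partition("=")[::2] for item in info.split(";") if item)
    if "MatchSv" not in fields:
        return None
    left = pos - int(fields["OrigCnvPos"])
    right = int(fields["END"]) - int(fields["OrigCnvEnd"])
    return left, right


merged_info = (
    "SVLEN=-840;SVTYPE=CNV;END=870284;REFLEN=840;"
    "OrigCnvPos=869000;OrigCnvEnd=871000;SVCLAIM=DJ;"
    "MatchSv=DRAGEN:DEL:41710:0:0:0:2:0"
)
print(breakend_shift(869444, merged_info))  # (444, -716)
```

Here the left breakend moved 444 bp to the right and the right breakend moved 716 bp to the left, so the refined call is narrower than the original depth-based CNV.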
