CNV with SV Support
The DRAGEN CNV caller uses depth as its primary signal for calling copy number variants. Depth alone makes it difficult to call events shorter than 10 kbp. Sensitivity for CNVs shorter than 10 kbp can be improved by leveraging junction signals from the DRAGEN structural variant caller.
When both the DRAGEN CNV and SV callers are executed in a single invocation, an additional integration step runs at the end of the DRAGEN run to improve the CNV calls. This feature is enabled automatically when DRAGEN detects a germline WGS analysis.
The SV/CNV Integration module takes in DEL and DUP calls from the output data structures of the germline CNV and SV callers, identifies putative matches, updates annotations, filters, scores, and outputs the refined records in a new output VCF. By leveraging junction signals from the SV caller and depth signals from the CNV caller, this approach allows for sensitive CNV detection down to 1kbp while also improving recall and precision across length scales. This is achieved by rescuing previously low quality calls if evidence is found from both callers, and also by adjusting CNV breakends to the more accurate SV breakends. The matching algorithm takes into account the proximity of the events as well as the transition states at the breakends, among other things.
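The matching idea can be illustrated with a minimal sketch. This is not DRAGEN's implementation: it considers only event type and breakend proximity, ignores transition states and scoring, and the `Call` class, the `find_match` and `refine` helpers, and the `max_dist` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Call:
    """Hypothetical minimal DEL/DUP call; not a DRAGEN data structure."""
    id: str
    chrom: str
    start: int
    end: int
    svtype: str  # "DEL" or "DUP"


def find_match(cnv: Call, svs: List[Call], max_dist: int = 1000) -> Optional[Call]:
    """Return the closest SV of the same type whose two breakends both lie
    within max_dist of the CNV breakends, or None if there is no such SV.
    max_dist is an assumed threshold, not a documented DRAGEN parameter."""
    best: Optional[Call] = None
    best_dist = 0
    for sv in svs:
        if sv.chrom != cnv.chrom or sv.svtype != cnv.svtype:
            continue
        left = abs(sv.start - cnv.start)
        right = abs(sv.end - cnv.end)
        if left > max_dist or right > max_dist:
            continue
        total = left + right
        if best is None or total < best_dist:
            best, best_dist = sv, total
    return best


def refine(cnv: Call, sv: Call) -> Call:
    """Move the CNV breakends onto the matched SV breakends."""
    return replace(cnv, start=sv.start, end=sv.end)


# Made-up coordinates for illustration only.
cnv = Call("cnv1", "chr1", 10_000, 15_200, "DEL")
svs = [
    Call("sv1", "chr1", 10_150, 15_000, "DEL"),
    Call("sv2", "chr1", 10_150, 15_000, "DUP"),  # wrong type, never matched
]
match = find_match(cnv, svs)
if match is not None:
    print(refine(cnv, match))
```

A real matcher would also weigh depth transitions at the breakends and call quality, as the paragraph above notes.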
Example command lines
The following is an example command line for running a germline WGS analysis for both CNV and SV.
Other optional CNV or SV parameters can also be added.
Combined CNV/SV VCF Output
The original CNV and SV VCF output files, prior to integration, are available for users in the DRAGEN output directory, as described elsewhere. Additionally, there is an enhanced CNV VCF available with the *.cnv_sv.vcf.gz extension. The VCF header lines in the *.cnv_sv.vcf.gz file mostly correspond to a concatenation of the individual header lines from the CNV and SV VCFs, with a few lines deduplicated and some new ones added. For details on the legacy header lines, please refer to the individual CNV and SV user guide sections.
Newly added header lines are described in the following table.
| ID | Number | Type | Description |
| --- | --- | --- | --- |
| END_LEFT_BND_OF | 1 | String | ID of CNV whose left end is matched to the end of SV |
| END_RIGHT_BND_OF | 1 | String | ID of CNV whose right end is matched to the end of SV |
| LEFT_BND | 1 | String | ID of SV that matches the left end of CNV record |
| LEFT_BND_OF | 1 | String | ID of CNV whose left end is matched to SV |
| MatchSv | 1 | Integer | ID of original SV that was merged with CNV record |
| OrigCnvEnd | 1 | Integer | Coordinate of original CNV end |
| OrigCnvPos | 1 | Integer | Coordinate of original CNV pos |
| RIGHT_BND | 1 | String | ID of SV that matches the right end of CNV record |
| RIGHT_BND_OF | 1 | String | ID of CNV whose right end is matched to SV |
| SVCLAIM | A | String | Claim made by the structural variant call. Valid values are D, J, DJ for abundance, adjacency and both respectively |
Records that can be matched or rescued will have annotations indicating the breakpoint linkage between a CNV and SV record. If a complete match is found, the MatchSv annotation is present in the CNV record and gives the ID field of the matched SV record. In addition, the SVCLAIM field indicates whether the record has evidence from the depth signal (D), from junction signals (J), or from both (DJ).
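As a rough illustration of how these annotations could be read downstream, the sketch below parses a VCF INFO string and reports the evidence type and matched SV ID. It assumes that SVCLAIM and MatchSv appear as INFO keys; the `parse_info` and `describe` helpers and the sample INFO string are hypothetical and not taken from DRAGEN output.

```python
from typing import Dict, Optional, Tuple, Union

# Meaning of each SVCLAIM value, as described in the header table.
EVIDENCE = {"D": "depth", "J": "junction", "DJ": "depth and junction"}


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Split a VCF INFO string into a dict; flag keys map to True."""
    fields: Dict[str, Union[str, bool]] = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def describe(info: str) -> Tuple[str, Optional[str]]:
    """Return (evidence description, matched SV ID or None)."""
    fields = parse_info(info)
    claim = fields.get("SVCLAIM")
    evidence = EVIDENCE.get(claim, "unknown") if isinstance(claim, str) else "unknown"
    match_sv = fields.get("MatchSv")
    return evidence, match_sv if isinstance(match_sv, str) else None


# Made-up INFO string for illustration only.
print(describe("END=15000;SVTYPE=DEL;SVCLAIM=DJ;MatchSv=sv1"))
```

For production use, a dedicated VCF parser is usually a better choice than manual string splitting.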
Because of the mixing of standalone SV records and CNV records, the FORMAT field may have different annotations. For details on the CNV or SV specific annotations, please refer to the individual CNV and SV user guide sections.
Records that can be matched or rescued will have FILTER set to PASS. The original FILTERs are retained for records that were not matched or rescued. For example, the cnvLength FILTER will still be applied to standalone CNV records (those with SVCLAIM=D).
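To see how the FILTER and SVCLAIM fields interact in practice, one option is to tally passing records by evidence type. The sketch below is an assumption-laden example: it treats SVCLAIM as an INFO key, takes its value as a raw string, and uses the hypothetical helper names `tally_pass_by_claim` and `tally_file`.

```python
import gzip
import re
from collections import Counter
from typing import Iterable

# Matches SVCLAIM=<value> as a whole INFO key (assumed to live in INFO).
SVCLAIM_RE = re.compile(r"(?:^|;)SVCLAIM=([^;]+)")


def tally_pass_by_claim(lines: Iterable[str]) -> Counter:
    """Count records with FILTER == PASS, grouped by their SVCLAIM value."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and column header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[6] != "PASS":
            continue  # column 7 (index 6) is FILTER
        found = SVCLAIM_RE.search(cols[7])  # column 8 (index 7) is INFO
        counts[found.group(1) if found else "missing"] += 1
    return counts


def tally_file(path: str) -> Counter:
    """Apply the tally to a gzip- or bgzip-compressed VCF such as *.cnv_sv.vcf.gz."""
    with gzip.open(path, "rt") as handle:
        return tally_pass_by_claim(handle)
```

Calling `tally_file` on a `*.cnv_sv.vcf.gz` path returns a `Counter` keyed by SVCLAIM value. Records that keep a non-PASS filter such as cnvLength are excluded from the counts.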
Example records are shown below.