Cytogenetics Modality
Conventional cytogenetics methodologies typically focus on larger alterations than those reported by NGS analyses. The Cytogenetics modality for the CNV caller lets the user visualize copy number alterations (CNAs) at different resolutions, providing a more flexible workspace for different use cases.
For the same sample and within the same run, the Cytogenetics modality starts from the high-resolution results provided in the standard output CNV VCF. This callset then undergoes multiple rounds of smoothing, progressing from finer-resolution calls to coarser-resolution calls (larger alterations). Each round of smoothing produces a smoothed callset, which is set aside as an output and also becomes the starting point for the callsets with a higher degree of smoothing.
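The chaining of smoothing rounds can be illustrated with a toy sketch. The smoothing rule used here (merging neighbouring same-type segments separated by a small gap) is an assumption made purely for illustration and is not the DRAGEN algorithm, which this page does not describe; the function names, the segment tuples, and the gap thresholds are likewise hypothetical. The only point the sketch carries over from the text is that each round is kept and also feeds the next, coarser round.

```python
def smooth(segments, max_gap):
    """Toy rule (assumed): merge neighbouring same-type segments on one
    chromosome when the gap between them is at most max_gap bases."""
    merged = []
    for start, end, kind in sorted(segments):
        if merged and merged[-1][2] == kind and start - merged[-1][1] <= max_gap:
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), kind)
        else:
            merged.append((start, end, kind))
    return merged


def progressive_smoothing(segments, rounds):
    """Run the rounds from fine to coarse; keep every intermediate callset."""
    callsets = {}
    current = segments
    for res_id, max_gap in rounds:
        current = smooth(current, max_gap)  # previous round is the input
        callsets[res_id] = current          # set aside as an output
    return callsets


# Made-up segments (start, end, type) and hypothetical thresholds.
segments = [
    (0, 100_000, "DEL"),
    (110_000, 300_000, "DEL"),
    (900_000, 1_500_000, "DEL"),
    (2_000_000, 2_200_000, "DUP"),
]
rounds = [("25k", 25_000), ("1M", 1_000_000)]

for res_id, calls in progressive_smoothing(segments, rounds).items():
    print(res_id, calls)
```

In this toy input the three DEL fragments collapse to two calls in the first round and to a single call in the second, which mirrors the move from finer to coarser calls. The rule only compares each segment with the one kept just before it, so it is not meant to handle the conflicts or evidence weighting a real caller has to resolve.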
At the end of the smoothing procedure, the Cytogenetics modality produces several outputs, e.g.:
Multiple GFF3 files, one for each round of smoothing (extension *cyto.<resolution_ID>.gff3).
A single VCF file, with extension *.cyto.vcf.gz. This file contains all callsets identified through the smoothing iterations, with the iteration identifier stored in the INFO/RES field. Identical alterations across resolutions are deduplicated. In such cases, the INFO/RES field contains a comma-separated list of resolution identifiers.
Some resolutions are based on depth of coverage only (no BAF). Their INFO/RES value reflects the original callset used as a starting point, with the added suffix _depth. For example, depth-only calls derived from resolution 1M have resolution ID 1M_depth. Note: calls made at different resolutions or with different information (depth+BAF versus depth-only) may occasionally conflict. For instance, in a region of AOH that also contains a mosaic DEL, the region may be reported as AOH by the depth+BAF calling but as a (mosaic) DEL in the depth-only track. For each resolution, the event type with the strongest evidence is output.
A single IGV session file, with extension *.cyto.igv_session.xml, which provides a convenient way to load the multiple GFF3 files together with the other tracks typically found in the standard *.cnv.igv_session.xml. Below is an example screenshot of one such IGV session:
The first 5 tracks provide the DRAGEN CNV calls (Blue/DEL, Green/REF, Magenta/AOH, Red/DUP) at decreasing resolution (from high to low, top to bottom).
The remaining tracks are similar to those in the standard *.cnv.igv_session.xml, e.g., poor mappability regions, target counts coverage, improper pairs, B-allele frequency, etc.
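As a minimal sketch of how the INFO/RES annotation described above might be read, the following splits the comma-separated list and flags depth-only resolutions by their _depth suffix. The helper names and the example INFO strings are hypothetical and are not taken from real DRAGEN output; only the RES key, the comma-separated list, and the _depth suffix come from the text.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'END=100;RES=25k,1M' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag-style keys carry no value
    return info


def resolutions(info_field):
    """Return (resolution_id, depth_only) pairs for one record's INFO/RES."""
    res = parse_info(info_field).get("RES")
    if not isinstance(res, str):
        return []  # record has no INFO/RES value
    pairs = []
    for res_id in res.split(","):
        depth_only = res_id.endswith("_depth")
        base = res_id[: -len("_depth")] if depth_only else res_id
        pairs.append((base, depth_only))
    return pairs


# Made-up INFO strings for illustration only.
print(resolutions("END=2500000;RES=25k,1M"))
print(resolutions("END=900000;RES=1M_depth"))
```

A record deduplicated across two resolutions yields two pairs, while a depth-only record yields the originating resolution together with a True flag.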
Since the most informative resolution may vary depending on circumstances (event sizes, distance between calls, presence of smaller calls causing fragmentation, etc.), no single recommendation works for all cases. However, some practical recommendations to consider are the following:
Each resolution ID (INFO/RES) identifies the minimum size for alterations to be considered PASS.
If only minimal call smoothing is necessary, resolution 25k can offer a good balance, providing calls in size ranges compatible with chromosomal microarray.
When comparing against technologies such as karyotyping, resolution 1M may be more appropriate, as it reduces call fragmentation.
Note: if the use case under consideration is not impacted by call fragmentation, it is typically recommended to use the *.cnv.vcf.gz or *.cnv_sv.vcf.gz output results (instead of the ones in *.cyto.vcf.gz), to take full advantage of the superior detail of NGS.
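Once a resolution has been chosen, the matching records can be pulled out of the *.cyto.vcf.gz file. The following is a minimal sketch using only the Python standard library; the function names, the example records, and the non-PASS filter label are hypothetical, and the column layout assumed is the standard VCF one (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).

```python
import gzip


def res_ids(info_field):
    """Return the list of resolution IDs stored in INFO/RES, if any."""
    for item in info_field.split(";"):
        if item.startswith("RES="):
            return item[len("RES="):].split(",")
    return []


def calls_at_resolution(lines, res_id, pass_only=True):
    """Select records whose INFO/RES list contains exactly res_id."""
    selected = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, alt, filt, info = cols[0], int(cols[1]), cols[4], cols[6], cols[7]
        if pass_only and filt != "PASS":
            continue
        if res_id in res_ids(info):  # "1M" does not match "1M_depth"
            selected.append((chrom, pos, alt, info))
    return selected


def read_vcf_lines(path):
    """Yield text lines from a gzip-compressed VCF file."""
    with gzip.open(path, "rt") as handle:
        yield from handle


# Made-up records; "someFilter" is a placeholder, not a real filter name.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t1000001\t.\tN\t<DEL>\t50\tPASS\tEND=1030000;RES=25k\n",
    "chr1\t5000001\t.\tN\t<DUP>\t60\tPASS\tEND=7000000;RES=25k,1M\n",
    "chr2\t2000001\t.\tN\t<DEL>\t10\tsomeFilter\tEND=2010000;RES=25k\n",
]
print(calls_at_resolution(example, "1M"))
```

For a real file, the in-memory list can be swapped for read_vcf_lines("<sample>.cyto.vcf.gz"), where the path is whatever the run produced.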
For some use cases, it is necessary to inspect a sample at arm or whole-chromosome level. Typically this would require an additional caller alongside the standard CNV caller with automated segment detection. Within the same run, the Cytogenetics modality provides this set of calls in the same VCF file (extension *.cyto.vcf.gz).
In the example above, two calls are derived from this callset. The segment ID annotation (INFO/SEGID) provides the name of the segment call under consideration (in this example, the p-arm of chromosome 22 and the entire chromosome X). REF calls are not displayed by default unless explicitly requested by the user (i.e., with --cnv-enable-ref-calls true). Note: this enables REF calls in both the CNV and CYTO VCF files.
An additional callset that does not conform to the ones above (no INFO/RES field) is the one containing whole-arm and whole-chromosome aneuploidies. For this callset, all reported records have the chromosome name or arm name in the INFO/SEGID field. Entries for this callset are not present in any GFF3 file. For more details see the .
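Because whole-arm and whole-chromosome records carry INFO/SEGID but no INFO/RES, the two kinds of records can be told apart from the INFO column alone. The following is a minimal sketch of that split; the function names and the example INFO strings (including the SEGID spellings) are hypothetical placeholders rather than real DRAGEN output.

```python
def info_dict(info_field):
    """Parse a VCF INFO string into a dict; flag-style keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if item and item != ".":
            key, sep, value = item.partition("=")
            info[key] = value if sep else True
    return info


def split_callsets(info_fields):
    """Separate whole-arm/-chromosome records from all other records."""
    arm_or_chromosome, others = [], []
    for info_field in info_fields:
        info = info_dict(info_field)
        if "RES" not in info and "SEGID" in info:
            arm_or_chromosome.append(info["SEGID"])
        else:
            others.append(info_field)
    return arm_or_chromosome, others


# Made-up INFO strings; the SEGID values are placeholders.
records = [
    "END=1500000;RES=1M",
    "END=15000000;SEGID=chr22p",
    "END=156000000;SEGID=chrX",
]
arms, others = split_callsets(records)
print(arms)
print(others)
```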