Cytogenetics Modality

Conventional cytogenetics methods typically focus on larger alterations than those reported by NGS analyses. The Cytogenetics modality of the CNV caller lets the user visualize copy number alterations (CNAs) at different resolutions, providing a more flexible workspace for different use cases.

It is enabled, on supported callers, with --cnv-enable-cyto-output=true. Currently supported on:

From the same sample, and during the same run, the Cytogenetics modality starts from the high resolution results provided in the standard output CNV VCF. The output callset then undergoes multiple rounds of smoothing, going progressively from finer resolution to coarser resolution calls (larger alterations). Each round of smoothing produces a smoothed callset which is set aside and becomes the starting point for callsets with higher degree of smoothing.
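The cascade described above can be illustrated with a minimal sketch. The smoothing step used here (`merge_close_calls`, which merges same-type calls separated by a small gap) is a hypothetical stand-in chosen only to make the example runnable. It is not the smoothing method used by the caller, and the gap sizes are illustrative values that do not define what the resolution IDs mean. The point of the sketch is the control flow: each round starts from the previous round's callset, and every intermediate callset is kept.

```python
def merge_close_calls(calls, max_gap):
    """Toy smoother (an assumption, not the caller's method):
    merge same-type calls whose gap is at most max_gap bases."""
    merged = []
    for start, end, kind in sorted(calls):
        if merged and merged[-1][2] == kind and start - merged[-1][1] <= max_gap:
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), kind)
        else:
            merged.append((start, end, kind))
    return merged


def smooth_cascade(calls, rounds, smooth=merge_close_calls):
    """Run successive smoothing rounds, finest first.
    Each round starts from the callset produced by the previous round."""
    callsets = {}
    current = calls
    for name, size in rounds:
        current = smooth(current, size)
        callsets[name] = current  # set this round's callset aside
    return callsets


# Hypothetical high-resolution calls as (start, end, type) tuples.
high_res = [
    (100_000, 120_000, "DEL"),
    (130_000, 160_000, "DEL"),
    (900_000, 950_000, "DEL"),
]
callsets = smooth_cascade(high_res, [("25k", 25_000), ("1M", 1_000_000)])
for name, calls in callsets.items():
    print(name, calls)
```

In this toy run, the finer round merges only the two nearby deletions, while the coarser round, starting from that result, merges all three into one larger call.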

At the end of the smoothing procedure, the Cytogenetics modality produces several outputs, e.g.:

  • Multiple GFF3 files, one for each round of smoothing (extension *.cyto.<resolution_ID>.gff3).

  • A single VCF file, with extension *.cyto.vcf.gz. This file contains all callsets identified through the smoothing iterations, with the iteration identifier stored in the INFO/RES field. Identical alterations across resolutions are deduplicated. In such cases, the INFO/RES field contains a comma-separated list of resolution identifiers.

    • Some resolutions are based on depth of coverage only (no BAF). Their INFO/RES value reflects the original callset used as a starting point, with the added suffix _depth. E.g., for depth-only calls derived from resolution 1M, the new callset has resolution ID 1M_depth. Note: calls made at different resolutions or with different information (depth+BAF versus depth-only) may occasionally conflict. For instance, an AOH region that also carries a mosaic DEL may be reported as AOH by the depth+BAF calling but as (mosaic) DEL in the depth-only track. The event type with the strongest evidence is output for each resolution.

    • An additional callset which does not conform to the ones above (no INFO/RES field) is the one containing whole-arm/-chromosome aneuploidies. For this callset, all reported records have the chromosome name or arm name in the INFO/SEGID field. Entries for this callset will not be present on any GFF3 file. For more details see the section on whole-chromosome aneuploidies.

  • A single IGV session file, with extension *.cyto.igv_session.xml, which provides a convenient way to load the multiple GFF3 files and the other typical tracks found in the standard *.cnv.igv_session.xml. In an example IGV session:

    • The first 5 tracks show the DRAGEN CNV calls (Blue/DEL, Green/REF, Magenta/AOH, Red/DUP) at decreasing resolution (from high to low, top to bottom).

    • The remaining tracks are similar to those in the standard *.cnv.igv_session.xml session, e.g., poor mappability regions, target counts coverage, improper pairs, B-allele frequency, etc.

Below is an example set of calls from the *.cyto.vcf.gz output file (note the additional INFO/RES annotation compared with the *.cnv.vcf.gz output file):

# Example REF call
chr1    819841  DRAGEN:REF:chr1:819841-6103865  N       .       1000    PASS
  END=6103865;REFLEN=5284025;RES=25k,500k,50k
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/0:2:1:1000:1000:2.00155:1.000775:1.000775:129.1:0.5:4544:10920:66,10:0.00368019

# Example copy-neutral LOH call
chr1    6104347 DRAGEN:CNLOH:chr1:6104348-6727324       N       <LOH>   1000    PASS
  END=6727324;REFLEN=622977;RES=25k,500k,50k;SVLEN=622977;LOHTYPE=AOH;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  1/1:2:0:1000:1000:1.9876:0.001988:0.993798:128.2:0.001:528:916:10,12:0.00766703

# Example GAIN call
chr1    16605768        DRAGEN:GAIN:chr1:16605769-16645359      N       <DUP>   427     PASS
  END=16645359;REFLEN=39591;RES=25k;SVLEN=39591;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE
  ./1:6:.:1:.:6.27065:.:3.135326:404.457:.:23:0:6,11

# Example GAIN LOH call
chr15   20212550        DRAGEN:GAINLOH:chr15:20212551-20421468  N       <LOH>   390     PASS
  END=20421468;REFLEN=208918;RES=25k,50k;SVLEN=208918;LOHTYPE=AOH;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  1/1:6:0:1:1:5.90559:0.000000:2.952793:380.91:0:76:1:9,8:0

# Example LOSS call
chr1    25274774        DRAGEN:LOSS:chr1:25274775-25331683      N       <DEL>   226     PASS
  END=25331683;REFLEN=56909;RES=25k,50k;SVLEN=56909;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:1:0:1000:1000:1.01085:0.000000:0.505426:65.2:0:7:10:5,1:0

# Example MOSAIC GAIN call
chr2    89673674        DRAGEN:GAIN:chr2:89673675-89851643      N       <DUP>   1000    PASS
  END=89851643;REFLEN=177969;RES=25k,50k;MOSAIC;SVLEN=177969;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE
  ./1:4:.:0:.:4.32403:.:2.162016:278.9:.:70:0:13,2

# Example MOSAIC LOSS call
chr21   10522164        DRAGEN:LOSS:chr21:10522165-10650403     N       <DEL>   480     PASS
  END=10650403;REFLEN=128239;RES=25k,50k;MOSAIC;SVLEN=128239;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE
  0/1:1:0:0:480:0.84186:.:0.420930:54.3:.:38:0:1,1
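As a minimal sketch of how the INFO/RES annotation might be consumed downstream, the following groups records by resolution identifier. It assumes standard single-line, tab-separated VCF records (the examples above are wrapped over several lines for readability). The helper names `parse_info` and `group_by_resolution` are hypothetical and not part of any DRAGEN tooling.

```python
from collections import defaultdict


def parse_info(info_field):
    """Parse a VCF INFO string into a dict; flags such as MOSAIC map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def group_by_resolution(vcf_lines):
    """Map each INFO/RES identifier to the IDs of the records reported at it."""
    by_res = defaultdict(list)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        record_id, info = fields[2], parse_info(fields[7])
        # Records without INFO/RES (whole-arm/-chromosome calls) are skipped.
        for res in info.get("RES", "").split(","):
            if res:
                by_res[res].append(record_id)
    return dict(by_res)


# Two of the example records above, rebuilt as tab-separated lines
# (FORMAT and sample columns omitted for brevity).
records = [
    "\t".join(["chr1", "16605768", "DRAGEN:GAIN:chr1:16605769-16645359", "N",
               "<DUP>", "427", "PASS",
               "END=16645359;REFLEN=39591;RES=25k;SVLEN=39591;SVTYPE=CNV"]),
    "\t".join(["chr1", "25274774", "DRAGEN:LOSS:chr1:25274775-25331683", "N",
               "<DEL>", "226", "PASS",
               "END=25331683;REFLEN=56909;RES=25k,50k;SVLEN=56909;SVTYPE=CNV"]),
]
calls = group_by_resolution(records)
print(calls["50k"])
```

Because identical alterations are deduplicated across resolutions, a single record can appear under several resolution keys, as the LOSS call does here.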

Recommendations

Selection of appropriate resolution

Since the most informative resolution varies with circumstances (event sizes, distance between calls, presence of smaller calls causing fragmentation, etc.), no single recommendation fits all cases. However, the following practical recommendations are worth considering:

  • Each resolution INFO/RES ID identifies the minimum size for alterations to be considered PASS.

  • If only minimal call smoothing is necessary, resolution 25k can offer a good balance, providing calls in size ranges compatible with Chromosomal Microarray (CMA).

  • When comparing against technologies such as karyotyping, resolution 1M may be more appropriate to reduce call fragmentation.

Note: if the use case under consideration is not impacted by call fragmentation, it is typically recommended to use the *.cnv.vcf.gz or *.cnv_sv.vcf.gz output results (instead of the ones in *.cyto.vcf.gz), to take full advantage of the superior detail of NGS.

Additional options

Option                                           Description
--cnv-cyto-keep-resolutions=<resolution_list>    Comma-separated list of resolutions to output (currently supported: 25k,50k,500k,1M,1M_depth)
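When a pipeline assembles the command line programmatically, the resolution list can be checked before launch. The helper below is a hypothetical sketch, not part of DRAGEN. It assumes only the option name and the supported values listed above, which may change between releases.

```python
# Supported values as listed in the option description above.
SUPPORTED_RESOLUTIONS = ("25k", "50k", "500k", "1M", "1M_depth")


def keep_resolutions_option(resolutions):
    """Build the --cnv-cyto-keep-resolutions argument, rejecting unknown IDs."""
    unknown = [r for r in resolutions if r not in SUPPORTED_RESOLUTIONS]
    if unknown:
        raise ValueError("unsupported resolution(s): " + ", ".join(unknown))
    return "--cnv-cyto-keep-resolutions=" + ",".join(resolutions)


option = keep_resolutions_option(["25k", "1M"])
print(option)
```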

Whole-chromosome Aneuploidy Detection

For some use cases, a sample must be inspected at the arm or whole-chromosome level. Typically this would require an additional caller alongside the standard CNV caller with automated segment detection. The Cytogenetics modality provides this set of calls in the same run and within the same VCF file (with extension *.cyto.vcf.gz).

chr21  12000000   DRAGEN:GAIN:chr21:12000001-46709983  N   <DUP>  1000  PASS
  END=46709983;REFLEN=34709983;SEGID=chr21q;SVLEN=34709983;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:3:1:1000:1000:3.00155:1.002518:1.500775:193.6:0.334:29570:66224:0,0:0.0016016

chrX   1        DRAGEN:LOSS:chrX:2-156040895     N     <DEL>  1000  PASS
  END=156040895;REFLEN=156040894;SEGID=chrX;SVLEN=156040894
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:1:0:1000:1000:0.996364:0.000996:0.498182:82.2:0.001:122580:144548:0,0:0.00995089

The example above shows two calls from this callset. The segment ID annotation (INFO/SEGID) gives the name of the segment for each call (in this example, the q-arm of chromosome 21 and the entire chromosome X). REF calls are not output unless explicitly requested by the user with --cnv-enable-ref-calls true. Note: this enables REF calls for both the CNV and CYTO VCF files.
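A short sketch of how INFO/SEGID values might be interpreted downstream is shown here. It assumes that arm-level IDs are the chromosome name followed by `p` or `q`, as in `chr21q` above, and that chromosome names themselves do not end in `p` or `q`. Both are assumptions drawn from the two examples rather than a documented rule, and `classify_segid` is a hypothetical helper.

```python
def classify_segid(segid):
    """Split an INFO/SEGID value into (level, chromosome, arm).

    Assumes arm IDs look like '<chromosome>p' or '<chromosome>q'.
    """
    if segid[-1:] in ("p", "q"):
        return ("arm", segid[:-1], segid[-1])
    return ("chromosome", segid, None)


for segid in ("chr21q", "chrX"):
    print(segid, classify_segid(segid))
```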

Note: acrocentric chromosomes (13, 14, 15, 21, and 22) have short arms characterized by repetitive regions. These regions create mappability issues and are typically excluded from analysis. Calling short-arm alterations for these chromosomes is therefore challenging, because any call would rest on a small percentage of the arm's total length. To avoid false positive calls (here, an alteration reported for the full short arm with evidence from only a minimal portion of it), the algorithm applies a hard threshold (default 500 intervals) on the minimum number of intervals required to call a whole-arm alteration. When a chromosome arm call does not satisfy this threshold, the call is filtered with FILTER chromArmBinCount. The default can be changed with the option cnv-filter-chrom-arm-bin-count.
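The thresholding rule in the note can be sketched as follows. This is only an illustration of the described behavior, not the caller's implementation. The function name is hypothetical, the example interval counts are made up, and treating a count exactly equal to the threshold as passing is an assumption.

```python
DEFAULT_MIN_ARM_INTERVALS = 500  # default threshold stated in the note above


def arm_call_filter(n_intervals, min_intervals=DEFAULT_MIN_ARM_INTERVALS):
    """Return the FILTER value for a whole-arm call supported by n_intervals.

    Assumption: a count equal to the threshold passes (inclusive boundary).
    """
    if n_intervals >= min_intervals:
        return "PASS"
    return "chromArmBinCount"


# Made-up interval counts: a short arm with little usable sequence
# versus a long arm with ample coverage.
print(arm_call_filter(120))
print(arm_call_filter(30_000))
```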
