# Somatic

## Overview

DRAGEN provides somatic copy number variant (CNV) calling workflows that detect copy number aberrations and regions with loss of heterozygosity (LOH) in whole genome sequencing (WGS) and whole exome sequencing (WES) data. The CNV workflows leverage both depth of coverage and B-allele frequencies (BAFs) to provide comprehensive detection of:

* Copy number gains (duplications) and losses (deletions)
* Copy-neutral loss of heterozygosity (CNLOH)
* Subclonal alterations (WGS only, enabled by default)
* Minor allele copy number estimation

## Workflow

The DRAGEN somatic CNV workflow follows this processing pipeline:

![Somatic CNV calling workflow block diagram](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-e461f75e0f305110bc7cd0e0844e4b30ac92fd77%2Fsomatic-cnv-calling-WorkflowBlockDiagram.general.png?alt=media)

The pipeline consists of the following modules:

1. **Target Counts** — Binning of read counts and other signals from alignments
2. **B-Allele Counts** — Extraction of allelic read counts
3. **Bias Correction** — Correction of GC bias and other systematic biases
4. **Normalization** — Detection of normal ploidy levels and normalization
5. **Segmentation** — Breakpoint detection via segmentation of normalized depth and BAF signals
6. **Allele Specific Copy Number (ASCN) Calling** — Integration of depth and BAF segments to determine copy number states and allele-specific information

**B-Allele Frequency Inputs**: The pipeline supports multiple input options for estimating B‑allele frequencies (BAF), depending on the availability of a matched normal sample.

* Matched normal already processed
  * If the matched normal sample has been processed with the germline small variant caller, the resulting VCF file can be provided directly.
* Matched normal not yet processed
  * If the matched normal has not been processed, the user may provide raw reads or aligned reads and enable concurrent execution of the germline small variant caller.
  * In this case, DRAGEN CNV consumes the small variant caller output to estimate B‑allele frequencies from germline SNVs.
* No matched normal available
  * A population SNV VCF may be provided.
  * DRAGEN estimates B‑allele frequencies using variants from the population SNV VCF.

For WES, population SNVs are intersected with the regions defined in `--cnv-target-bed`. The target BED file must contain the same target intervals used to generate the panel of normals (PON).
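
The three input cases above can be summarized as a small selection routine. The following is a minimal sketch, not part of DRAGEN: `baf_args` is a hypothetical helper, it handles only a BAM for the unprocessed matched normal, and the precedence it applies (existing VCF first, then concurrent SNV calling, then population VCF) is an assumption made for illustration. It uses only options documented on this page.

```bash
#!/usr/bin/env bash
# Sketch: choose B-allele input flags from whichever inputs are available.
# baf_args is a hypothetical helper, not a DRAGEN command.
# Usage: baf_args <normal_snv_vcf> <normal_bam> <population_vcf>
# Pass an empty string for any input that is not available.
baf_args() {
  local normal_vcf="$1" normal_bam="$2" pop_vcf="$3"
  if [ -n "$normal_vcf" ]; then
    # Matched normal already processed by the small variant caller
    printf '%s\n' "--cnv-normal-b-allele-vcf $normal_vcf"
  elif [ -n "$normal_bam" ]; then
    # Matched normal not yet processed: run the SNV caller concurrently
    printf '%s\n' "--bam-input $normal_bam --enable-variant-caller true --cnv-use-somatic-vc-baf true"
  elif [ -n "$pop_vcf" ]; then
    # No matched normal: fall back to a population SNV VCF
    printf '%s\n' "--cnv-population-b-allele-vcf $pop_vcf"
  else
    echo "error: no B-allele source provided" >&2
    return 1
  fi
}

# Example: tumor-only run with a population VCF (file name is a placeholder)
baf_args "" "" "pop.snps.vcf.gz"
```

The printed flags can be appended to a `dragen` command line alongside `--enable-cnv true` and the tumor input.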

**Depth-Only Workflow (Legacy)**: For applications that require only fold-change detection without purity/ploidy model estimation, a legacy depth-only workflow is also available for WES and targeted panels. See [Depth-Only Workflow](https://github.com/illumina-swi/dragen-docs/blob/release/4.5-prod/product-guides/dragen-v4.5/user-guide/dragen-dna-pipeline/cnv-calling/legacy/cnv-germline-legacy.md) for details.

## Example Command Lines

### WGS — Tumor-Normal (concurrent SNV caller)

If the matched normal has not been pre-processed, you can run the somatic SNV caller concurrently with CNV, which feeds germline heterozygous sites directly to the CNV caller:

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--bam-input <NORMAL_BAM> \
--enable-variant-caller true \
--cnv-use-somatic-vc-baf true
```

To additionally enable [Germline-aware Mode](#germline-aware-mode) and [VAF-aware Mode](#vaf-aware-mode), add the following flags:

```bash
--cnv-normal-cnv-vcf <CNV_NORMAL_VCF>   # germline-aware mode
# VAF-aware mode is enabled by default for tumor/matched-normal runs with --enable-variant-caller true
```

### WGS — Tumor-Only (population SNP VCF)

If no matched normal is available, run in tumor-only mode using a population SNP catalog:

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-population-b-allele-vcf <POP_SNP_VCF>
```

### WES — Tumor-Normal (concurrent SNV caller)

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--tumor-bam-input <TUMOR_BAM> \
--bam-input <NORMAL_BAM> \
--enable-cnv true \
--enable-variant-caller true \
--cnv-use-somatic-vc-baf true \
--cnv-normals-list <PANEL_OF_NORMALS> \
--cnv-target-bed <TARGET_BED> \
--vc-target-bed <TARGET_BED>
```

### WES — Tumor-Only (population SNP VCF)

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--cnv-population-b-allele-vcf <POP_SNP_VCF> \
--cnv-normals-list <PANEL_OF_NORMALS> \
--cnv-target-bed <TARGET_BED>
```
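
The `<PANEL_OF_NORMALS>` value passed to `--cnv-normals-list` is a text file that lists target counts files, one path per line. The following is a minimal sketch for building that file, assuming the GC-corrected counts files from earlier normal-sample runs sit under a single directory. `make_normals_list` and the directory layout are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: collect target counts files into a newline-separated list
# suitable for --cnv-normals-list.
# make_normals_list is a hypothetical helper, not a DRAGEN command.
# Usage: make_normals_list <pon_directory> <output_list_file>
make_normals_list() {
  local pon_dir="$1" out_file="$2"
  # Match GC-corrected counts files and sort for a reproducible order
  find "$pon_dir" -type f -name '*.target.counts.gc-corrected.gz' | sort > "$out_file"
  # Fail if nothing was found, since an empty list defines no panel
  if [ ! -s "$out_file" ]; then
    echo "error: no counts files found in $pon_dir" >&2
    return 1
  fi
}

# Example (paths are placeholders):
#   make_normals_list /data/pon_runs normals.txt
#   dragen ... --cnv-normals-list normals.txt
```

Passing an absolute directory to the helper yields absolute paths in the list, which avoids depending on the working directory of the later DRAGEN run.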

## Required Options

| Option         | Description                           |
| -------------- | ------------------------------------- |
| `--enable-cnv` | Enable CNV processing (set to `true`) |

### Input Options

**DNA inputs**

| Option                             | Description                                            |
| ---------------------------------- | ------------------------------------------------------ |
| `--tumor-fastq1`, `--tumor-fastq2` | FASTQ input files (requires `--enable-map-align true`) |
| `--tumor-bam-input`                | BAM input file                                         |
| `--tumor-cram-input`               | CRAM input file                                        |

**B-Allele inputs**

| Option                          | Description                                                                                                              |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `--cnv-normal-b-allele-vcf`     | Specify a matched normal SNV VCF.                                                                                        |
| `--cnv-population-b-allele-vcf` | Specify a population SNP catalog.                                                                                        |
| `--cnv-use-somatic-vc-baf`      | If running in tumor-normal mode with the SNV caller enabled, use this option to specify the germline heterozygous sites. |

For more information on specifying B-allele loci, see [Specification of B-Allele Loci](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#b-allele-counts).

**PON inputs**

| Option                  | Description                                                                                                                                                        |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `--cnv-target-bed`      | BED file defining exome capture regions (only for WES)                                                                                                             |
| `--cnv-normals-file`    | Specify individual normal counts file (target.counts.gz or target.counts.gc-corrected.gz) for PON. You can use this option multiple times, one time for each file. |
| `--cnv-normals-list`    | Specify text file that contains paths to the list of reference target counts files to be used as a panel of normals (new line separated).                          |
| `--cnv-combined-counts` | Specify combined PON file (.combined.counts.txt.gz).                                                                                                               |

**Other inputs**

| Option               | Description                                                                        |
| -------------------- | ---------------------------------------------------------------------------------- |
| `--ref-dir`          | DRAGEN reference genome hashtable directory                                        |
| `--enable-map-align` | Enable mapper and aligner module                                                   |
| `--sample-sex`       | Sample sex (e.g., `male`, `female`). If not specified, sex is estimated from data. |

#### Pop SNP download

Population VCF files can be downloaded from the links below:

| Reference                            | Size  | Download                                                                                                                                 |
| ------------------------------------ | ----- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| hg38 CNV Population SNP VCF v1.0     | 1.8GB | [Download](https://webdata.illumina.com/downloads/software/dragen/resource-files/misc/hg38_1000G_phase1.snps.high_confidence.vcf.gz)     |
| hg19 CNV Population SNP VCF v1.0     | 1.8GB | [Download](https://webdata.illumina.com/downloads/software/dragen/resource-files/misc/hg19_1000G_phase1.snps.high_confidence.vcf.gz)     |
| hs37d5 CNV Population SNP VCF v1.0   | 1.8GB | [Download](https://webdata.illumina.com/downloads/software/dragen/resource-files/misc/hs37d5_1000G_phase1.snps.high_confidence.vcf.gz)   |
| CHM13-v2 CNV Population SNP VCF v1.0 | 4.0GB | [Download](https://webdata.illumina.com/downloads/software/dragen/resource-files/misc/chm13_v2_1000G_phase1.snps.high_confidence.vcf.gz) |

> The download links in the table open an external Illumina download page: <https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html>

### Output Options

| Option                     | Description                                                                        |
| -------------------------- | ---------------------------------------------------------------------------------- |
| `--output-directory`       | Output directory for all results                                                   |
| `--output-file-prefix`     | Prefix prepended to all output file names                                          |
| `--cnv-enable-cyto-output` | Enable cytogenetics-compatible output VCF (default false) - only available for WGS |

### Target Counting Options

| Option                              | Description                                                                                                                                                                                                                                                                                                                                                                |
| ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-counts-method`               | Specifies the counting method for an alignment to be counted in a target bin. Values are midpoint, start, or overlap. The default value is overlap when using the panel of normals approach, which means if an alignment overlaps any part of the target bin, the alignment is counted for that bin. In the self-normalization mode, the default counting method is start. |
| `--cnv-min-mapq`                    | Specifies the minimum MAPQ for an alignment to be counted during target counts generation. The default value is 3 for self-normalization and 20 otherwise. When generating counts for panel of normals, all MAPQ0 alignments are counted.                                                                                                                                  |
| `--cnv-target-bed`                  | Specifies a properly formatted BED file that indicates the target intervals to sample coverage over. For use in WES analysis.                                                                                                                                                                                                                                              |
| `--cnv-interval-width`              | Specifies the width of the sampling interval for CNV processing. This option controls the effective window size. The default is 1000 for WGS analysis and 500 for WES analysis.                                                                                                                                                                                            |
| `--cnv-skip-contig-list`            | Specifies a comma-separated list of contig identifiers to skip when generating intervals for WGS analysis. The default contigs that are skipped, if not specified, are `chrM,MT,m,chrm`.                                                                                                                                                                                   |
| `--cnv-filter-duplicate-alignments` | Filter duplicate marked alignments during target counts if option is set to `true`. The default setting is `true` unless map/align is enabled and duplicate marking is disabled.                                                                                                                                                                                           |

Note that `--cnv-filter-duplicate-alignments` is only available when the duplicate marking option is set to `true`. For more information, see [Filter Duplicate Alignments](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#filter-duplicate-alignments).

For more information on the target counting methods, see [Target Counts](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#target-counts).

### GC Bias Correction Options

| Option                            | Description                                                                                                                                                       |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-enable-gc-bias-correction` | Enable or disable GC bias correction when generating target counts. The default is true.                                                                          |
| `--cnv-enable-gcbias-smoothing`   | Enable or disable smoothing the GC bias correction across adjacent GC bins with an exponential kernel. The default is true.                                       |
| `--cnv-num-gc-bins`               | Specifies the number of bins for GC bias correction. Each bin represents the GC content percentage. Allowed values are 10, 20, 25, 50, or 100. The default is 25. |

For more information, see [GC Bias Correction](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#gc-bias-correction).
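
As a quick illustration of `--cnv-num-gc-bins`: assuming the bins divide the 0 to 100% GC range evenly (an assumption, although every allowed value divides 100), the default of 25 bins gives bins that are 4 percentage points wide. `gc_bin_width` below is a hypothetical helper that validates the value against the allowed list and prints the width.

```bash
#!/usr/bin/env bash
# Sketch: validate a --cnv-num-gc-bins value and print the implied bin width.
# gc_bin_width is a hypothetical helper, not a DRAGEN command.
gc_bin_width() {
  local bins="$1"
  case "$bins" in
    10|20|25|50|100)
      # Assumes equal-width bins across 0-100% GC content
      awk -v n="$bins" 'BEGIN { printf "%g\n", 100 / n }'
      ;;
    *)
      echo "error: --cnv-num-gc-bins must be 10, 20, 25, 50, or 100" >&2
      return 1
      ;;
  esac
}

# Width in percentage points for the default of 25 bins
gc_bin_width 25
```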

### Normalization Options

A panel of normals (PON) provides the reference baseline for calling copy number variants. A PON is required for WES, while WGS can use either self-normalization (recommended) or PON normalization.
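
These requirements can be checked before launching a run. The sketch below uses a hypothetical helper, `normalization_mode`, that reports which normalization a given set of inputs implies. The `wgs` and `wes` labels are illustrative and are not DRAGEN option values; the rule that a supplied PON takes precedence for WGS follows the behavior described in this section.

```bash
#!/usr/bin/env bash
# Sketch: pre-flight check of normalization inputs.
# normalization_mode is a hypothetical helper, not a DRAGEN command.
# Usage: normalization_mode <wgs|wes> <normals_list_or_empty> <target_bed_or_empty>
normalization_mode() {
  local assay="$1" pon_list="$2" target_bed="$3"
  case "$assay" in
    wes)
      # WES requires a panel of normals and a matching target BED
      [ -n "$pon_list" ] || { echo "error: WES requires a panel of normals" >&2; return 1; }
      [ -n "$target_bed" ] || { echo "error: WES requires --cnv-target-bed" >&2; return 1; }
      echo "pon"
      ;;
    wgs)
      # WGS self-normalizes unless a panel of normals is supplied
      if [ -n "$pon_list" ]; then echo "pon"; else echo "self"; fi
      ;;
    *)
      echo "error: assay must be wgs or wes" >&2
      return 1
      ;;
  esac
}

normalization_mode wgs "" ""
```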

**Self-normalization option**

| Option                            | Description                                                                                                 |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| `--cnv-enable-self-normalization` | Enable/disable self normalization mode, which does not require a panel of normals (only available for WGS). |

**PON normalization options**

| Option                                       | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `--cnv-extreme-percentile`                   | Specifies the extreme median percentile value at which to filter out samples. The default is 2.5.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| `--cnv-max-percent-zero-samples`             | Specifies the maximum percentage of zero-coverage samples allowed for a target. If a target exceeds this threshold, it is filtered out. The default is 5%. This option is sensitive to the number of normal samples in the panel, so adjust the threshold accordingly: with a small panel of normals and an unadjusted threshold, targets may be filtered out unintentionally. |
| `--cnv-max-percent-zero-targets`             | Specifies the maximum percentage of zero-coverage targets allowed for a sample. If a sample exceeds this threshold, it is filtered out. The default is 2.5%. This option is sensitive to the total number of target intervals, so adjust the threshold accordingly: with a capture kit that has few probes and an unadjusted threshold, samples may be filtered out unintentionally. |
| `--cnv-target-factor-threshold`              | Specifies the bottom percentile of panel of normals medians to filter out useable targets. The default is 1% for whole genome processing and 5% for targeted sequencing processing.                                                                                                                                                                                                                                                                                                                                                                                                              |
| `--cnv-truncate-threshold`                   | Specifies a percentage threshold for truncating extreme outliers. The default is 0.1%.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `--cnv-enable-gender-matched-pon`            | Enable/disable gender matched PON normalization. If enabled, DRAGEN uses matched gender PON for sex chromosome normalization. Sex chromosome intervals are filtered if PON has no matched gender sample. The default value is true.                                                                                                                                                                                                                                                                                                                                                              |
| `--cnv-enable-cross-gender-adjustments-chrX` | Enable normalization on chrX by adjusting coverage of PON samples according to the expected number of copies of chrX in male and female samples. If the case sample is male, coverage of female PON samples is scaled down by a factor of 2 on chrX. If the case sample is female, coverage of male PON samples is scaled up by a factor of 2 on chrX. If no male PON samples are available, chrY intervals will be filtered. This feature is only supported for germline enrichment runs. The default value is false; if set to true, then `--cnv-enable-gender-matched-pon` must also be true. |

DRAGEN selects PON normalization if a PON is provided. For more information, see [Normalization](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#normalization).
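
Because `--cnv-max-percent-zero-samples` is a percentage, the number of zero-coverage samples it tolerates depends on the panel size. The following is a rough worked calculation, assuming the check is a simple percentage of the panel and that a target is dropped only when the threshold is exceeded. `max_zero_samples` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: largest number of zero-coverage samples a target can have
# before exceeding a percentage threshold.
# max_zero_samples is a hypothetical helper, not a DRAGEN command.
# Usage: max_zero_samples <panel_size> [max_percent, default 5]
max_zero_samples() {
  local pon_size="$1" max_percent="${2:-5}"
  # Round down: a count above this would exceed the percentage
  awk -v n="$pon_size" -v p="$max_percent" 'BEGIN { printf "%d\n", int(n * p / 100) }'
}

max_zero_samples 10
max_zero_samples 40
```

Under these assumptions, a 10-sample panel at the default 5% tolerates no zero-coverage samples at a target, whereas a 40-sample panel tolerates two. The same arithmetic applies to `--cnv-max-percent-zero-targets` with the number of target intervals in place of the panel size.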

### Segmentation Options

The segmentation method for both WGS and WES somatic workflows, in both tumor-normal and tumor-only configurations, is a variant of shifting level models (SLM) called adaptive shifting level models (ASLM). This default can be overridden with the option `--cnv-segmentation-mode` (see [segmentation](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#segmentation)), but doing so is not recommended.

| Option           | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `--cnv-slm-eta`    | Probability that the segmenter changes from the current state to any other state when moving from the current target to the next target. This can also be expressed as the probability that the true depth for adjacent targets differs for reasons that simple counting noise does not adequately explain. Likewise, the stay-in-state probability is (1.0 - eta). The effective default value is 3e-3; the range is (0.0, 1.0), excluding endpoints. Decreasing this value results in longer segments and reduced fragmentation; increasing it produces shorter segments with more fragmentation. |
| `--cnv-slm-bafeta` | Similar to `--cnv-slm-eta`, but between adjacent B allele sites. The default value is 1e-7 for somatic WES, 1e-12 for somatic WGS tumor-normal, and 1e-20 for somatic WGS tumor-only. The range is (0.0, 1.0), excluding endpoints. Decreasing this value results in longer segments and reduced fragmentation; increasing it produces shorter segments with more fragmentation. However, see below for the limited purpose of segmentation on B allele frequencies. |

The B allele segmentation is performed separately from, and independently of, the depth segmentation. It is a crude segmentation whose purpose is to find segments with a balanced B allele frequency, indicating that both parental haplotypes are present at equal copy number. A subset of these B-allele-balanced segments, subject to some additional criteria, is then used to identify a common variance parameter for the depth domain. The ordinary [SLM method](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#shifting-level-models-segmentation) used for depth-based segmentation is then extended to have a state-dependent emission variance computed from the common variance and scaled by the state mean. **The B allele segmentation is not directly used after this, but it plays a critical role in determining the parameterization of the depth-based segmentation.** Note that ASLM has no parameter analogous to `--cnv-slm-omega` in SLM and HSLM, as described for germline analyses.
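
For intuition about the scale of `--cnv-slm-eta`: if state changes are treated as independent events with probability eta at each step (a simplification that ignores the data), the expected run of targets between changes is 1/eta. The sketch below computes this prior expectation. `expected_run` is a hypothetical helper, and the default width of 1000 bp is the WGS default for `--cnv-interval-width`; actual segment lengths are driven by the observed depth signal, not by this figure alone.

```bash
#!/usr/bin/env bash
# Sketch: prior expected run length implied by a state-change probability.
# expected_run is a hypothetical helper, not a DRAGEN command.
# Usage: expected_run <eta> [interval_width_bp, default 1000]
expected_run() {
  local eta="$1" width="${2:-1000}"
  awk -v eta="$eta" -v w="$width" 'BEGIN {
    # eta must lie strictly between 0 and 1
    if (eta <= 0 || eta >= 1) {
      print "error: eta must be in (0, 1)" > "/dev/stderr"
      exit 1
    }
    # Mean of a geometric distribution with success probability eta
    targets = 1 / eta
    printf "%.0f targets (~%.0f bp)\n", targets, targets * w
  }'
}

# Default somatic value of --cnv-slm-eta with 1000 bp intervals
expected_run 3e-3
```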

The following options are documented here in proximity to segmentation options because of their direct relevance to each other. Once provisional calls for copy number (CN) and minor copy number (MCN) have been made on the resulting segments from the segmentation stage, given the selected [purity/ploidy model](#purityploidy-model-selection-options), adjacent segments with the same CN and MCN are joined to form a single segment. This is continued until no two adjacent segments satisfy the merging criteria. [Segment merging](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#call-smoothing) is a critical step which compensates for over-segmentation or over-fragmentation happening at the segmentation stage. However, segment merging cannot split segments apart, so it cannot compensate in the other direction. **Thus, segmentation can afford to produce a degree of over-segmentation, but there is no compensatory mechanism for under-segmentation.** These options control segment merging in somatic analyses and do not depend on the segmentation option settings.

| Option                | Description                                                                                                                                                                                                                                                                                  |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-merge-distance`  | Maximum gap in base pairs between two adjacent segments that still allows them to be merged. The default is 10000 for somatic WGS, meaning segments must be within 10 kb of each other. For WES, the default is effectively unlimited, since target intervals are inherently non-contiguous. |
| `--cnv-merge-threshold` | Maximum difference in segment mean (linear copy ratio) between two adjacent segments that still allows them to be merged. The default is 0.025 for somatic WGS and 0.4 for somatic WES. |

Setting `--cnv-merge-threshold` to zero disables segment merging entirely. This is not recommended.
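
The merging criteria can be illustrated with a simplified, single-pass sketch. It is not the DRAGEN implementation: the tab-separated input layout (chromosome, start, end, CN, MCN, segment mean), the length-weighted mean assigned to a merged segment, and the `merge_segments` helper are all assumptions made for illustration. The two arguments mirror `--cnv-merge-distance` and `--cnv-merge-threshold`, and a threshold of zero disables merging.

```bash
#!/usr/bin/env bash
# Sketch: greedy left-to-right merging of adjacent segments.
# merge_segments is a hypothetical helper, not a DRAGEN command.
# Input (stdin): sorted, tab-separated chrom, start, end, CN, MCN, mean.
# Usage: merge_segments <max_gap_bp> <max_mean_difference>
merge_segments() {
  local max_gap="$1" max_diff="$2"
  awk -F'\t' -v gap="$max_gap" -v thr="$max_diff" '
    function abs(x) { return x < 0 ? -x : x }
    function emit() {
      if (have) printf "%s\t%d\t%d\t%d\t%d\t%.4f\n", chrom, start, end, cn, mcn, mean
    }
    {
      # Merge only if same contig, same CN and MCN, small gap, similar mean
      if (have && thr > 0 && $1 == chrom && $4 == cn && $5 == mcn &&
          $2 - end <= gap && abs($6 - mean) <= thr) {
        # Length-weighted mean of the merged segment (assumption of this sketch)
        len_a = end - start
        len_b = $3 - $2
        mean = (mean * len_a + $6 * len_b) / (len_a + len_b)
        end = $3
      } else {
        # Close the current segment and start a new one
        emit()
        chrom = $1; start = $2; end = $3; cn = $4; mcn = $5; mean = $6; have = 1
      }
    }
    END { emit() }
  '
}

# Two adjacent segments with the same CN/MCN and means within 0.025
printf 'chr1\t0\t10000\t2\t1\t1.00\nchr1\t10000\t30000\t2\t1\t1.02\n' |
  merge_segments 10000 0.025
```

Because the pass only ever joins segments, it shows why over-segmentation upstream can be repaired at this stage while under-segmentation cannot.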

You can also specify the following circular binary segmentation (CBS) options:

| Option                | Description                                                                                                                           |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-cbs-alpha`     | Specifies the significance level for the test to accept change points. The default is 0.01.                                           |
| `--cnv-cbs-eta`       | Specifies the Type I error rate of the sequential boundary for early stopping when using the permutation method. The default is 0.05. |
| `--cnv-cbs-kmax`      | Specifies maximum width of smaller segment for permutation. The default is 25.                                                        |
| `--cnv-cbs-min-width` | Specifies the minimum number of markers for a changed segment. The default is 2.                                                      |
| `--cnv-cbs-nmin`      | Specifies the minimum length of data for maximum statistic approximation. The default is 200.                                         |
| `--cnv-cbs-nperm`     | Specifies the number of permutations used for p-value computation. The default is 10000.                                              |
| `--cnv-cbs-trim`      | Specifies the proportion of data to be trimmed for variance calculations. The default is 0.025.                                       |

For more information, see [segmentation](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#segmentation).

### Purity/Ploidy Model Selection Options

| Option                                    | Description                                                                                                                                                                 |
| ----------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-use-somatic-vc-vaf`                | Use the variant allele frequencies (VAFs) from the somatic SNVs to help select the tumor model for the sample. For more information, see [VAF-aware Mode](#vaf-aware-mode). |
| `--cnv-somatic-essential-genes-bed`       | BED file containing genes where the model should not predict homozygous deletions (HOMDEL).                                                                                 |
| `--cnv-somatic-enable-het-calling`        | Enable HET-calling mode for heterogeneous segments.                                                                                                                         |
| `--cnv-somatic-enable-lower-ploidy-limit` | Enable a check on the lower ploidy limit based on essential genes.                                                                                                          |
| `--cnv-normal-cnv-vcf`                    | Specify germline CNVs from the matched normal sample. For more information, see [Germline-aware Mode](#germline-aware-mode).                                                |
| `--cnv-somatic-min-purity`                | Specify the minimum purity to consider.                                                                                                                                     |
| `--cnv-somatic-max-purity`                | Specify the maximum purity to consider.                                                                                                                                     |
| `--cnv-ascn-min-ploidy`                   | Specify the minimum ploidy to consider.                                                                                                                                     |
| `--cnv-ascn-max-ploidy`                   | Specify the maximum ploidy to consider.                                                                                                                                     |

For more information, see [ASCN calling](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#allele-specific-copy-number-calling)

### Filtering Options

| Option                            | Description                                                                                    |
| --------------------------------- | ---------------------------------------------------------------------------------------------- |
| `--cnv-enable-ref-calls`          | Emit copy-neutral (REF) calls in output VCF (default `true` for WGS, `false` for WES)          |
| `--cnv-filter-qual`               | QUAL value at which to hard filter CNV VCF (default `40` for WGS/WES, `90` for WES depth-only) |
| `--cnv-filter-length`             | Minimum event length (bp) for PASS calls (default `10000` for WGS, `0` for WES)                |
| `--cnv-filter-del-mean`           | SM value used to hard filter DELs in CNV VCF (Somatic WGS)                                     |
| `--cnv-filter-dup-mean`           | SM value used to hard filter DUPs in CNV VCF (Somatic WGS)                                     |
| `--cnv-filter-cnloh-maf`          | MAF value used to hard filter CNLOHs in CNV VCF (Somatic WGS)                                  |
| `--cnv-somatic-filter-het-length` | Minimum event length to hard filter subclonal CNV VCF                                          |
| `--cnv-post-vcf-target-bed`       | BED file to keep only VCF entries overlapping with target regions                              |

If `--cnv-post-vcf-target-bed` is specified, VCF records that do not overlap the provided BED intervals are filtered out. This is a post‑processing hard filter applied only to the output VCF and does not affect any upstream workflow steps or CNV modeling.

### Other Options

| Option                                         | Description                                                                |
| ---------------------------------------------- | -------------------------------------------------------------------------- |
| `--cnv-enable-tracks`                          | Enables generation of IGV track files                                      |
| `--cnv-generate-pon-metric-file`               | Generate PON metric file for WES/targeted panel                            |
| `--cnv-exclude-bed`                            | BED file specifying intervals to exclude from analysis                     |
| `--cnv-exclude-bed-min-overlap`                | Minimum overlap fraction for exclusion (default `0.5`)                     |
| `--cnv-sex-genotyper-num-interval-requirement` | Number of sex contig intervals required by the sex genotyper (default `300`) |

## CNV Output Files

The somatic CNV workflow generates the following output files:

| File                                           | Description                                         | Format           |
| ---------------------------------------------- | --------------------------------------------------- | ---------------- |
| `<prefix>.tumor.target.counts.gz`              | Raw target counts before bias correction            | gzipped TSV      |
| `<prefix>.tumor.target.counts.gc-corrected.gz` | GC-bias corrected target counts                     | gzipped TSV      |
| `<prefix>.tumor.ballele.counts.gz`             | B-allele counts at population SNP sites             | gzipped TSV      |
| `<prefix>.baf.bedgraph.gz`                     | B-allele frequency in bedgraph format               | gzipped bedGraph |
| `<prefix>.tn.tsv.gz`                           | Tangent-normalized coverage signal                  | gzipped TSV      |
| `<prefix>.cnv.excluded_intervals.bed.gz`       | List of target regions excluded                     | gzipped BED      |
| `<prefix>.cnv.pon_metrics.tsv.gz`              | Coverage statistics of PON per interval             | gzipped TSV      |
| `<prefix>.cnv.pon_correlation.txt.gz`          | Correlation between CASE and PON                    | gzipped text     |
| `<prefix>.seg`                                 | Segmentation results (depth and BAF)                | TSV              |
| `<prefix>.cnv.purity.coverage.models.tsv`      | Model likelihood score for purity/ploidy estimation | TSV              |
| `<prefix>.cnv.vcf.gz`                          | Primary CNV calls (VCF v4.4 by default)             | gzipped VCF      |
| `<prefix>.cyto.vcf.gz`                         | Cytogenetics-compatible calls (if enabled)          | gzipped VCF      |
| `<prefix>.cnv_metrics.csv`                     | Summary metrics including predicted sex             | CSV              |
| `<prefix>.cnv.gff3`                            | Variant calls in GFF format                         | GFF              |
| `<prefix>.tn.bw`                               | Tangent-normalized signal track                     | BigWig           |

### Target Counts Output

`<prefix>.tumor.target.counts.gz`

Compressed tab-delimited file containing the number of read counts per target interval. This is the raw signal as extracted from the alignments of the BAM or CRAM file. The format is identical for both the case sample and any panel of normals samples. There is also a bigWig representation of a `target.counts.diploid` file, which is normalized to the normal ploidy level of 2 instead of raw counts.

Columns:

1. Contig identifier
2. Start position
3. End position
4. Target interval name
5. Count of alignments in this interval
6. Count of improperly paired alignments in this interval

Header lines starting with `#` contain the DRAGEN version, command line, and other meta information.

Example:

```
#TARGET COUNTS FILE
##DRAGENVersion=<VERSION_INFO>
##DRAGENCommandLine=<CommandLineOptions>
#TargetCountOptions=<CNV_COUNTS_OPTIONS>
...
#Input target file: BED_FILENAME
contig  start   stop    name    WES_EA_N_1      improper_pairs
chr1    12080   12251   target-wes-chr1-12080:12251     662     0
chr1    12595   12802   target-wes-chr1-12595:12802     220     1
...
```

For more information, see [Target Counts File](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#target-counts-file)
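
As an illustration of the layout described above, the following minimal Python sketch separates the `#` header lines from the column header and the data rows. The function name `read_target_counts` and the in-memory demo are illustrative assumptions and not part of DRAGEN; the demo rows are taken from the example above and re-joined with tabs.

```python
import io

def read_target_counts(handle):
    """Split a target counts stream into header lines, column names, and rows."""
    meta, columns, rows = [], None, []
    for raw in handle:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            meta.append(line)      # DRAGEN version, command line, other meta information
            continue
        fields = line.split("\t")
        if columns is None:
            columns = fields       # contig, start, stop, name, <sample>, improper_pairs
            continue
        contig, start, stop, name, count, improper = fields
        rows.append((contig, int(start), int(stop), name, float(count), int(improper)))
    return meta, columns, rows

# Two rows from the example above, re-joined with tabs.
demo = "\n".join([
    "#TARGET COUNTS FILE",
    "contig\tstart\tstop\tname\tWES_EA_N_1\timproper_pairs",
    "chr1\t12080\t12251\ttarget-wes-chr1-12080:12251\t662\t0",
    "chr1\t12595\t12802\ttarget-wes-chr1-12595:12802\t220\t1",
])
meta, columns, rows = read_target_counts(io.StringIO(demo))
print(columns[4], sum(row[4] for row in rows))
```

For a file on disk, the same function can be given a handle from `gzip.open(path, "rt")`. The count column is parsed as a float so that the sketch also accepts files with fractional values in the fifth column.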

### GC-Corrected Counts Output

`<prefix>.tumor.target.counts.gc-corrected.gz`

Contains GC-corrected read counts per target interval. The format is equivalent to the `*.target.counts.gz` file:

1. Contig identifier
2. Start position
3. End position
4. Target interval name
5. GC-corrected read counts in this interval
6. Count of improperly paired alignments in this interval

Example:

```
#GC CORRECTED FILE
##DRAGENVersion=<VERSION_INFO>
##DRAGENCommandLine=<CommandLineOptions>
#TargetCountOptions=<CNV_COUNTS_OPTIONS>
#Original input file: sample.target.counts.gz // raw counts filename
contig  start   stop    name    SampleName      improper_pairs
chr1    12080   12251   target-wes-chr1-12080:12251     981.529698      0
chr1    12595   12802   target-wes-chr1-12595:12802     50.05497673     1
chr1    13163   13658   target-wes-chr1-13163:13658     1086.20189      4
...
```

For more information, see [GC bias correction](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#gc-bias-correction)

### B-Allele Counts

In somatic ASCN runs, B-allele counts are calculated at sites in the tumor sample where the normal sample is likely to be heterozygous. When analyzed in conjunction with a matched normal sample, the sites are those that are called as heterozygous SNVs in the normal sample. When analyzed in tumor-only mode, sites are selected from a population collection (similar to germline ASCN runs). Each B-allele site consists of a reference allele and a variant allele, and the number of reads in the sample supporting each of these alleles is counted.

B-allele counts are written to both a gzipped TSV file (`*.ballele.counts.gz`) and a gzipped bedGraph file (`*.baf.bedgraph.gz`).

`<prefix>.ballele.counts.gz`

Columns:

1. Contig identifier
2. Start, BED-style (zero-based inclusive) start position of the reference allele
3. Stop, BED-style (one-based inclusive) stop position of the reference allele
4. Base sequence for the reference allele
5. Base sequence for the first allele being counted
6. Base sequence for the second allele being counted
7. The number of qualified reads containing a sequence matching the first allele
8. The number of qualified reads containing a sequence matching the second allele

Additionally, in the case of B-allele sites from a population VCF, the following two additional columns are added after the columns listed above:

9. Population frequency for the first allele
10. Population frequency for the second allele

Example:

```
contig  start   stop    refAllele       allele1 allele2 allele1Count    allele2Count    allele1AF       allele2AF
chr1    51478   51479   T       T       A       4       2       0.6747  0.3253
chr1    82733   82734   T       T       C       111     36      0.79346 0.20654
chr1    83083   83084   T       T       A       0       0       0.1538  0.8462
chr1    86330   86331   A       A       G       9       9       0.87384 0.12616
chr1    88315   88316   G       G       A       0       0       0.8926  0.1074
```
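
The column layout above, including the two optional population-frequency columns, could be read with a sketch like the following. The `BAlleleSite` type and `parse_ballele_line` function are hypothetical names chosen for illustration; the sketch assumes whitespace-separated fields and that a data line has either 8 or 10 columns.

```python
from typing import NamedTuple, Optional

class BAlleleSite(NamedTuple):
    contig: str
    start: int                    # BED-style start of the reference allele
    stop: int                     # BED-style stop of the reference allele
    ref: str
    allele1: str
    allele2: str
    count1: int                   # qualified reads matching allele1
    count2: int                   # qualified reads matching allele2
    af1: Optional[float] = None   # population frequency, population-VCF sites only
    af2: Optional[float] = None

def parse_ballele_line(line):
    """Parse one data line with 8 columns, or 10 when population frequencies are present."""
    fields = line.split()
    if len(fields) not in (8, 10):
        raise ValueError(f"expected 8 or 10 columns, got {len(fields)}")
    site = BAlleleSite(fields[0], int(fields[1]), int(fields[2]), fields[3],
                       fields[4], fields[5], int(fields[6]), int(fields[7]))
    if len(fields) == 10:
        site = site._replace(af1=float(fields[8]), af2=float(fields[9]))
    return site

site = parse_ballele_line("chr1 82733 82734 T T C 111 36 0.79346 0.20654")
print(site.count1 + site.count2, site.af2)
```

Keeping the frequency fields optional lets one parser handle both matched-normal sites and population sites.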

### B-Allele Counts BED Graph

`<prefix>.baf.bedgraph.gz`

B-allele frequency in bedgraph format. Allele count ratios are calculated by sorting alleles according to base priority {A, T, G, C} (descending), producing frequencies deterministically distributed above and below 0.5. This provides easy visualization in IGV of significant BAF changes between neighboring segments.

Example:

```
chr1    11021   11022   0.333333
chr1    14463   14464   0.755102
chr1    16494   16495   0.317708
chr1    38741   38742   0.5
chr1    39014   39015   0.44186
```

### Normalization Output

`<prefix>.tn.tsv.gz`

Contains the normalized signal of the case sample per target interval, i.e., the log2-transformed copy ratio signal. A strong signal deviation from 0.0 indicates a potential for a CNV event. The format is equivalent to the `*.target.counts.gz` file:

1. Contig identifier
2. Start position
3. End position
4. Target interval name
5. Log2-transformed copy ratio in this interval
6. Count of improperly paired alignments in this interval

Header lines are also included that start with `#`. In some cases, the normalization counts could be patched internally with intervals from other processes, such as the SegDups extension. In such cases, patches are indicated (sorted in order of application) with header lines starting with `#patch`:

```
#patch 1 = <normalized_counts_patch_1_filename>
#patch 2 = <normalized_counts_patch_2_filename>
...
```

and the original (unpatched) `*.tn.tsv.gz` is renamed as `*.tn.unpatched.tsv.gz`. Note: this file is reported in output for inspection, but most use cases will use the (patched) `*.tn.tsv.gz` file downstream of normalization.

An example of a `*.tn.tsv.gz` file is shown below.

```
#title = Tangent normalized coverage profile
#sex = MALE
contig  start   stop    name    SampleName      improper_pairs
chr1    12080   12251   target-wes-chr1-12080:12251     -0.3025426810360819     0
chr1    12595   12802   target-wes-chr1-12595:12802     -0.10691600293612752    0
chr1    13163   13658   target-wes-chr1-13163:13658     -0.55258557719170587    6
...
```

For more information, see [Normalization](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#normalization)
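
Because this file reports log2-transformed copy ratios (neutral at 0.0) while other outputs, such as the segmentation file, report linear copy ratios (neutral at 1.0), it can help to convert between the two scales. The helpers below are the generic mathematical conversion and are not a statement about how DRAGEN derives its segment values; the function names are illustrative.

```python
import math

def log2_to_linear(log2_ratio):
    """Convert a log2 copy ratio to a linear copy ratio."""
    return 2.0 ** log2_ratio

def linear_to_log2(linear_ratio):
    """Convert a linear copy ratio (must be > 0) to a log2 copy ratio."""
    return math.log2(linear_ratio)

# Values from the example rows above.
for value in (-0.3025426810360819, -0.10691600293612752, -0.55258557719170587):
    print(round(log2_to_linear(value), 4))
```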

### Excluded Intervals Output

`<prefix>.cnv.excluded_intervals.bed.gz`

To improve accuracy, the DRAGEN CNV Pipeline excludes genomic intervals when one or more target intervals fail at least one quality requirement. The excluded intervals are reported in the `*.cnv.excluded_intervals.bed.gz` file. This BED-format file identifies the regions of the genome that are not callable for CNV analysis and gives the reason for each exclusion in the fourth column. The example below shows several of the possible reasons.

Example:

```
chr1    258648  258852  PON_TARGET_FACTOR_THRESHOLD
...
chrX    151717091       151717377       EXCLUDE_BED
chrY    348335  348455  PON_UNMATCHED_GENDER
...
```

* 4th column provides reason for excluded intervals

For more information, see [Excluded Intervals File](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#excluded-intervals-file)
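
To see how much of the genome each exclusion reason removes, the fourth column can be tallied by interval length. This is a minimal sketch that assumes standard BED coordinates (length is stop minus start); `excluded_bases_by_reason` is a hypothetical helper name.

```python
from collections import Counter

def excluded_bases_by_reason(lines):
    """Sum interval lengths (stop - start) for each reason in column 4."""
    totals = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        contig, start, stop, reason = line.split()[:4]
        totals[reason] += int(stop) - int(start)
    return totals

# Rows from the example above.
demo = [
    "chr1\t258648\t258852\tPON_TARGET_FACTOR_THRESHOLD",
    "chrX\t151717091\t151717377\tEXCLUDE_BED",
    "chrY\t348335\t348455\tPON_UNMATCHED_GENDER",
]
for reason, bases in excluded_bases_by_reason(demo).most_common():
    print(reason, bases)
```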

### PON Metrics Output

`<prefix>.cnv.pon_metrics.tsv.gz`

The DRAGEN CNV Pipeline generates the PON Metrics File (`*.cnv.pon_metrics.tsv.gz`) if a Panel of Normals is provided and `--cnv-generate-pon-metric-file` is set to `true`. If the PON size is less than 2, an empty file is generated.

The PON Metric File includes basic statistics of the coverage profile for each interval. To remove sample coverage bias, DRAGEN applies sample median normalization, and then computes the metrics.

Example:

```
contig  start   stop    name    mean    std     normalizedStd min     25%     50%     75%     max     intervalSize    gcContents
1       12098   12178   target-wes-1-12098:12178/1      3.6259044560802365      0.46661435469856077      0.1286890927079175     2.7961783439490446      3.2573018790849675      3.7105263157894739      4.0162683823529415      4.3298969072164946      80      0.49382716049382713
1       12178   12258   target-wes-1-12178:12258/2      5.0685579775753595      0.70638315915955963      0.13936570564740217     3.9044585987261144      4.5225944682508761      5.067708333333333       5.5778115844038769      6.3277777777777775      80      0.46913580246913578
1       12553   12637   target-wes-1-12553:12637/1      4.6990858287992054      0.62537786269786677      0.13308500535681309     3.7417218543046356      4.0305632538350444      5.0382165605095546      5.2151580459770113      5.5773195876288657      84      0.6705882352941176
...
```

For more information, see [PON Metrics File](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#pon-metrics-file)

### PON Correlation Output

`<prefix>.cnv.pon_correlation.txt.gz`

The DRAGEN CNV Pipeline generates the PON Correlation File (`*.cnv.pon_correlation.txt.gz`) if a Panel of Normals is provided. The file reports the correlation between the CASE sample and each PON sample.

Example:

```
Correlation of case sample CASE_SAMPLE_NAME
  PON1: 0.9786
  PON2: 0.9868
  PON3: 0.9912
  ...
```

For more information, see [PON Correlation File](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#pon-correlation-file)
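
The example above suggests a simple `name: value` layout under a title line. Assuming that layout holds, the following sketch collects the per-sample correlations into a dictionary so that the least-correlated PON sample can be identified. The parser name and the layout assumption are illustrative and not a documented specification of the file.

```python
def parse_pon_correlation(text):
    """Collect 'name: value' lines into a dict, ignoring the title and other lines."""
    correlations = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue                  # title line, blank line, or elision
        try:
            correlations[name.strip()] = float(value)
        except ValueError:
            continue                  # value is not numeric, skip the line
    return correlations

demo = """Correlation of case sample CASE_SAMPLE_NAME
  PON1: 0.9786
  PON2: 0.9868
  PON3: 0.9912
"""
correlations = parse_pon_correlation(demo)
lowest = min(correlations, key=correlations.get)
print(lowest, correlations[lowest])
```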

### PON Combined Counts Output

`<prefix>.combined.counts.txt.gz`

If PON samples are provided with `--cnv-normals-file` or `--cnv-normals-list`, the CNV caller generates a single combined PON file that can be reused in later runs through the `--cnv-combined-counts` option.

Example:

```
#COMBINED COUNTS FILE
##DRAGENVersion=<VERSION_INFO>
##DRAGENCommandLine=<CommandLineOptions>
#TargetCountOptions=<CNV_COUNTS_OPTIONS>
contig  start   stop    name    PON1  PON2  PON3  PON4  PON5  PON6  PON7  PON8  PON9
chr1    69411   69541   target-wes-chr1-69411:69541/1   0       1.8140869319999999      0       0       3.6301639140000002      0       3.6322517749999998      2.7239599229999998      0
chr1    69541   69670   target-wes-chr1-69541:69670/2   0       1.732555405     0       0       3.4641199290000002      0       3.4667661330000001      3.4653864840000002      0
chr1    785931  786282  target-wes-chr1-785931:786282/1 41.683179699999997      37.341243050000003      52.024789929999997      59.030795980000001      53.800898459999999      50.370179270000001      43.404515570000001      43.349519780000001      46.891320659999998
chr1    817466  817596  target-wes-chr1-817466:817596/1 1.9608427310000001      0       0.98101929590000003     1.9595376019999999      0       0.9799059067    1.957367598     4.9118208430000001      0.97657295369999997
chr1    826645  826950  target-wes-chr1-826645:826950/1 67.020948630000007      66.92125953     76.833943899999994      64.963436049999999      37.414978750000003      90.559157929999998      61.079245899999997      78.87869293     69.023087360000005
...
```

### Segmentation Results

`<prefix>.seg`

Contains the segments produced by the segmentation algorithm. The `Segment_Mean` value of a segment is the ratio of the mean of that segment to the whole-sample median, without log transformation (linear copy-ratio). A strong signal deviation from 1.0 indicates a potential for a CNV event.

The file has the following columns:

1. Sample name
2. Contig identifier
3. Start position
4. End position
5. Number of intervals in the segment
6. Linear copy-ratio of the segment

An example of a `*.seg` file is shown below.

```
Sample  Chromosome      Start   End     Num_Probes      Segment_Mean
<SampleName> chr1    818022  1117426 224     0.82500341336435279
<SampleName> chr1    1117426 4063702 2438    0.91726081432236528
<SampleName> chr1    4063702 4067591 3       0.38861386123247205
<SampleName> chr1    4067591 7705829 3302    0.93021316913709917
<SampleName> chr1    7705829 9357003 1405    0.98147825043799442
<SampleName> chr1    9357003 9377365 19      0.50269670724395654
<SampleName> chr1    9377365 12859821        2905    1.0684818476332989
```
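
Since a `Segment_Mean` far from 1.0 points to a potential CNV event, a quick way to review a `*.seg` file is to list the segments whose linear copy ratio deviates from 1.0 by more than a chosen tolerance. The sketch below assumes a tab-delimited file with the header shown above; the function name and the tolerance of 0.25 are arbitrary illustrative choices, not DRAGEN thresholds.

```python
import csv
import io

def deviating_segments(seg_text, tolerance=0.25):
    """Return (contig, start, end, mean) for segments with |mean - 1.0| > tolerance."""
    reader = csv.DictReader(io.StringIO(seg_text), delimiter="\t")
    hits = []
    for row in reader:
        mean = float(row["Segment_Mean"])
        if abs(mean - 1.0) > tolerance:
            hits.append((row["Chromosome"], int(row["Start"]), int(row["End"]), mean))
    return hits

# Three rows from the example above, re-joined with tabs.
demo = "\n".join([
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean",
    "<SampleName>\tchr1\t1117426\t4063702\t2438\t0.91726081432236528",
    "<SampleName>\tchr1\t4063702\t4067591\t3\t0.38861386123247205",
    "<SampleName>\tchr1\t9357003\t9377365\t19\t0.50269670724395654",
])
for contig, start, end, mean in deviating_segments(demo):
    print(contig, start, end, round(mean, 3))
```

This kind of screen is only a review aid; the calls in the VCF come from the ASCN caller, which also uses BAF information.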

### BAF Segmentation Output

`<prefix>.baf.seg`

In addition to segmentation of target counts, some workflows perform segmentation of B-allele loci. The output file has the suffix `*.baf.seg` and uses the same format as the `*.seg` file, with two modifications. First, the `Segment_Mean` value is the mean over B-allele loci of the smaller observed allele fraction. Second, there is an additional column:

7. `BAF_SLM_STATE`: Integer between 0 and 10, indicating bins of minor-allele fraction (low to high), or `.` when the BAF data are too variable to estimate a minor-allele fraction

An example of BAF segmentation output file is shown below:

```
Sample  Chromosome      Start   End     Num_Probes      Segment_Mean    BAF_SLM_STATE
<SampleName> chr1    820348  1104646 194     0.29301737166888697     6
<SampleName> chr1    1105091 1533754 444     0.26185904799069076     5
<SampleName> chr1    1533810 1534166 9       0.41958837071702065     8
<SampleName> chr1    1534217 9356793 6689    0.26034515815016335     5
<SampleName> chr1    9358304 9376529 27      0.46450553586280602     10
```
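
The `Segment_Mean` definition used for BAF segments (the mean over B-allele loci of the smaller observed allele fraction) can be illustrated with a few allele count pairs. This is a simplified sketch: skipping loci with no reads is an assumption made here, and any site selection or filtering performed by the caller is not reproduced.

```python
def minor_allele_fraction_mean(count_pairs):
    """Mean over loci of min(count1, count2) / (count1 + count2)."""
    fractions = []
    for count1, count2 in count_pairs:
        depth = count1 + count2
        if depth == 0:
            continue              # assumption: loci without reads are skipped
        fractions.append(min(count1, count2) / depth)
    if not fractions:
        return None               # no informative loci in this segment
    return sum(fractions) / len(fractions)

# Allele count pairs taken from the B-allele counts example rows.
print(minor_allele_fraction_mean([(4, 2), (111, 36), (0, 0), (9, 9)]))
```

By construction the result lies between 0 and 0.5, with values near 0.5 indicating balanced alleles.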

### Purity/Coverage Models Output

`<prefix>.cnv.purity.coverage.models.tsv`

Contains the tested purity and diploid-coverage models along with their log-likelihood scores. Each row corresponds to a candidate model evaluated by the ASCN caller during model selection.

Columns:

1. Model purity (Cellularity) — fraction of cells in the sample due to tumor \[0, 1]
2. Model diploid coverage — expected read count for a target bin in a diploid region
3. Model log-likelihood — log-likelihood score for this purity/coverage hypothesis
4. Approximate ploidy - approximate sample ploidy estimation before CNV calling, derived from the sample mean coverage
5. Failed constraints - model search constraints that were not satisfied by the model

The model with the highest log-likelihood is selected as the best estimate of tumor purity and ploidy. The selected purity is reported as `EstimatedTumorPurity` in the VCF header.

Example:

```
#Purity Coverage        logL    ApproxPloidy    FailedConstraints
0.05    400     -19966231.8629  5.99    MIN_MHD,HOM_DEL,USER_MIN_PURITY
0.05    401     -19952483.5915  5.88    MIN_MHD,HOM_DEL,USER_MIN_PURITY
0.10    402     -19938276.3649  5.77    
...
```
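
For manual inspection, the candidate models can be ranked by log-likelihood with a short script. The sketch below assumes a tab-delimited file with the five columns listed above. Skipping rows that list failed constraints is an assumption made for illustration (controlled by the hypothetical `skip_failed` argument) and may not match how the caller treats such models.

```python
def best_model(lines, skip_failed=True):
    """Return the row with the highest log-likelihood as a dict, or None if no row qualifies."""
    best = None
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        purity, coverage, logl, ploidy = (float(x) for x in fields[:4])
        failed = fields[4].strip() if len(fields) > 4 else ""
        if skip_failed and failed:
            continue              # assumption: ignore models with failed constraints
        if best is None or logl > best["logL"]:
            best = {"purity": purity, "coverage": coverage, "logL": logl,
                    "approx_ploidy": ploidy, "failed": failed}
    return best

# Rows from the example above, re-joined with tabs.
demo = [
    "#Purity\tCoverage\tlogL\tApproxPloidy\tFailedConstraints",
    "0.05\t400\t-19966231.8629\t5.99\tMIN_MHD,HOM_DEL,USER_MIN_PURITY",
    "0.05\t401\t-19952483.5915\t5.88\tMIN_MHD,HOM_DEL,USER_MIN_PURITY",
    "0.10\t402\t-19938276.3649\t5.77\t",
]
print(best_model(demo))
```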

### VCF Output

`<prefix>.cnv.vcf.gz`

The CNV VCF file follows the standard VCF format [v4.4](https://samtools.github.io/hts-specs/VCFv4.4.pdf). The VCF header is annotated with `##source=<DRAGEN_SOURCE>`, where `<DRAGEN_SOURCE>` identifies the caller which produced the VCF, e.g.:

* `DRAGEN_ASCN`: CNV caller
* `DRAGEN_ASCN_SV`: CNV caller + SV support
* `DRAGEN_CNV`: [legacy depth-only CNV caller](https://github.com/illumina-swi/dragen-docs/blob/release/4.5-prod/product-guides/dragen-v4.5/user-guide/dragen-dna-pipeline/cnv-calling/legacy/cnv-germline-legacy.md) (note: for legacy reasons this caller uses VCF version [v4.2](https://samtools.github.io/hts-specs/VCFv4.2.pdf))

Due to the nature of how CNV events are represented, not all fields are applicable. In general, if more information is available about an event, then the information is annotated. To include copy neutral (REF) calls, set `--cnv-enable-ref-calls` to true. AOH/LOH events are not available in the legacy depth-only caller.

#### Example Records

```bash
# Example REF call
chr1    819841  DRAGEN:REF:chr1:819841-6103865  N       .       1000    PASS
  END=6103865;REFLEN=5284025
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/0:2:1:1000:1000:2.00155:1.000775:1.000775:129.1:0.5:4544:10920:66,10:0.00368019

# Example copy-neutral LOH call
chr1    6104347 DRAGEN:CNLOH:chr1:6104348-6727324       N       <LOH>   1000    PASS
  END=6727324;REFLEN=622977;SVLEN=622977;LOHTYPE=AOH;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  1/1:2:0:1000:1000:1.9876:0.001988:0.993798:128.2:0.001:528:916:10,12:0.00766703

# Example GAIN call
chr1    16715826        DRAGEN:GAIN:chr1:16715827-16949283      N       <DUP>   744     PASS
  END=16949283;REFLEN=233457;SVLEN=233457;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:3:1:1000:99:3.08217:1.134239:1.541085:198.8:0.368:49:26:20,14:0.0384615

# Example GAIN LOH call
chr15   20212550        DRAGEN:GAINLOH:chr15:20212551-20421468  N       <LOH>   390     PASS
  END=20421468;REFLEN=208918;SVLEN=208918;LOHTYPE=AOH;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  1/1:6:0:1:1:5.90559:0.000000:2.952793:380.91:0:76:1:9,8:0

# Example LOSS call
chr1    25274774        DRAGEN:LOSS:chr1:25274775-25331683      N       <DEL>   226     PASS
  END=25331683;REFLEN=56909;SVLEN=56909;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:1:0:1000:1000:1.01085:0.000000:0.505426:65.2:0:7:10:5,1:0
```

#### Header

The VCF header includes somatic-specific fields in addition to the common CNV header lines:

```
##fileformat=VCFv4.4
##ModelSource=DEPTH+BAF
##EstimatedTumorPurity=0.72
##DiploidCoverage=384.000000
##OverallPloidy=2.103412
##OutlierBafFraction=0.031287
##AlternativeModelDedup=0.72,192
##AlternativeModelDup=0.72,768
...
```

| ID                                        | Description                                                                                                                                                                      |
| ----------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ModelSource                               | basis on which the final tumor model was chosen (e.g., `DEPTH+BAF`, `DEPTH+BAF_DOUBLED`, `VAF`, `SAMPLE_MEDIAN`).                                                                |
| EstimatedTumorPurity                      | fraction of cells in the sample due to tumor. Range: \[0, 1] or `NA` if a confident model could not be determined.                                                               |
| DiploidCoverage                           | expected read count for a target bin in a diploid region.                                                                                                                        |
| OverallPloidy                             | length-weighted average of copy number for PASS events in the tumor fraction.                                                                                                    |
| OutlierBafFraction                        | fraction of B-allele frequencies incompatible with their segment call. High values may indicate a mismatched normal, cross-sample contamination, or bone marrow transplantation. |
| AlternativeModelDedup/AlternativeModelDup | alternative models corresponding to one fewer or one more whole-genome duplication, given as `(purity, diploid_coverage)`. Useful for manual investigation.                      |
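
The somatic header fields in the table above can be collected with a small parser such as the sketch below. The function name is hypothetical, and treating `NA` as a missing value for every numeric field is an assumption that extends the documented `NA` behavior of `EstimatedTumorPurity`.

```python
NUMERIC_KEYS = {"EstimatedTumorPurity", "DiploidCoverage", "OverallPloidy", "OutlierBafFraction"}
MODEL_KEYS = {"AlternativeModelDedup", "AlternativeModelDup"}

def parse_somatic_header(lines):
    """Extract the somatic-specific '##key=value' header fields into a dict."""
    out = {}
    for line in lines:
        if not line.startswith("##"):
            continue
        key, sep, value = line[2:].strip().partition("=")
        if not sep:
            continue
        if key == "ModelSource":
            out[key] = value
        elif key in NUMERIC_KEYS:
            out[key] = None if value == "NA" else float(value)
        elif key in MODEL_KEYS:
            # Stored as (purity, diploid_coverage)
            out[key] = None if value == "NA" else tuple(float(x) for x in value.split(","))
    return out

demo = [
    "##fileformat=VCFv4.4",
    "##ModelSource=DEPTH+BAF",
    "##EstimatedTumorPurity=0.72",
    "##DiploidCoverage=384.000000",
    "##AlternativeModelDedup=0.72,192",
    "##AlternativeModelDup=0.72,768",
]
print(parse_somatic_header(demo))
```

In the example header, the alternative models keep the same purity while the diploid coverage is half (192) or double (768) the selected value of 384.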

#### Records

All coordinates in the VCF are 1-based.

| ID    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| ----- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| CHROM | The chromosome (or contig) on which the copy number variant occurs.                                                                                                                                                                                                                                                                                                                                                                                                                               |
| POS   | Start position of the variant. If any of the ALT alleles is a symbolic allele (e.g., `<DEL>`), POS denotes the coordinate of the base preceding the polymorphism.                                                                                                                                                                                                                                                                                                                                 |
| ID    | Encodes the event type and coordinates of the event (1-based, inclusive). Event types include `GAIN`, `LOSS`, `REF`, `CNLOH`, and `GAINLOH`.                                                                                                                                                                                                                                                                                                                                                      |
| REF   | Contains `N` for all CNV events.                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ALT   | Specifies the type of CNV event: `<DEL>`, `<DUP>`, or `<LOH>`. REF calls have ALT `.`. With `--cnv-enable-legacy-vcf-format` (VCF v4.2), the `ALT` field contains `<DEL>,<DUP>` in place of `<LOH>` for AOH/LOH events.                                                                                                                                                                                                                                                                           |
| QUAL  | Estimated quality score used in hard filtering. Note: different workflows provide different QUAL score distributions - it is recommended to compare QUAL scores only within results from the same workflow (e.g., it is incorrect to compare QUAL scores between the CNV caller and the [legacy (depth-only) CNV caller](https://github.com/illumina-swi/dragen-docs/blob/release/4.5-prod/product-guides/dragen-v4.5/user-guide/dragen-dna-pipeline/cnv-calling/legacy/cnv-germline-legacy.md)). |
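
Because the ID column encodes the event type and the 1-based inclusive coordinates, it can be decoded without touching the INFO column. The following sketch assumes the `DRAGEN:<TYPE>:<contig>:<start>-<end>` layout seen in the example records; `parse_event_id` is a hypothetical helper name.

```python
def parse_event_id(event_id):
    """Split an ID such as DRAGEN:GAIN:chr1:16715827-16949283 into its parts."""
    source, event_type, rest = event_id.split(":", 2)
    contig, span = rest.rsplit(":", 1)       # rsplit tolerates ':' inside contig names
    start, end = (int(x) for x in span.split("-"))
    return {"source": source, "type": event_type, "contig": contig,
            "start": start, "end": end, "length": end - start + 1}

event = parse_event_id("DRAGEN:GAIN:chr1:16715827-16949283")
print(event["type"], event["contig"], event["length"])
```

For the example records, the computed length equals the `REFLEN` value (233457 for the GAIN call). In those same records, POS is one less than the ID start for calls with a symbolic ALT allele, consistent with POS denoting the preceding base, while the REF call has POS equal to the ID start.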

#### FILTER

The FILTER column contains `PASS` if the CNV event passes all filters, otherwise the column contains the name of the failed filter. Default values are defined in the header line for each available FILTER.

| ID        | Description                                         |
| --------- | --------------------------------------------------- |
| binCount  | CNV events with a bin count lower than a threshold. |
| cnvLength | The length of the CNV is lower than a threshold.    |
| cnvQual   | The QUAL of the CNV is lower than a threshold.      |

#### INFO

The INFO column contains information representing the event.

| ID      | Description                                                                                                                                                                                                                    |
| ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| REFLEN  | Length of the event.                                                                                                                                                                                                           |
| SVLEN   | Length of the event. Only present for non-REF records. Note: in VCF v4.2 format (enabled with `--cnv-enable-legacy-vcf-format`), `SVLEN` is a signed representation of `REFLEN` (e.g., a negative value indicates a deletion). |
| SVTYPE  | Always `CNV`. Only present for non-REF records.                                                                                                                                                                                |
| END     | End position of the event (1-based, inclusive).                                                                                                                                                                                |
| LOHTYPE | Type of loss of heterozygosity. Possible values: `CNLOH` (Copy-Neutral LOH), `GAINLOH` (LOH with copy number gain).                                                                                                            |
| HET     | Tag identifying subclonal (heterogeneous) calls, present when `--cnv-somatic-enable-het-calling` is set                                                                                                                        |
| CIPOS   | Confidence interval around the nominal `POS`.                                                                                                                                                                                  |
| CIEND   | Confidence interval around the nominal `END`.                                                                                                                                                                                  |

The meanings of the SVLEN, SVTYPE, END, CIPOS, and CIEND fields match their [VCF v4.2](https://samtools.github.io/hts-specs/VCFv4.2.pdf) definitions.

If a segment BED file is used, the segment identifier is carried over from the input to the `SEGID` field.

When [Germline-aware Mode](#germline-aware-mode) is enabled, DRAGEN annotates somatic VCF entries with:

| ID   | Description                                                          |
| ---- | -------------------------------------------------------------------- |
| NCN  | Germline copy number from the matched normal sample.                 |
| SCND | Somatic copy number difference relative to the germline copy number. |

When matching CNV with SV output, additional INFO annotations are added.

#### FORMAT

The common FORMAT fields are described in the header:

| ID   | Description                                                                                                                                                                                                |
| ---- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GT   | Genotype                                                                                                                                                                                                   |
| SM   | Linear copy ratio of the segment mean                                                                                                                                                                      |
| CN   | Estimated total copy number of tumor fraction                                                                                                                                                              |
| BC   | Number of read count bins                                                                                                                                                                                  |
| PE   | Number of improperly paired end reads at start and stop breakpoints                                                                                                                                        |
| AS   | Number of allelic read count sites                                                                                                                                                                         |
| CNF  | Floating point estimate of copy number                                                                                                                                                                     |
| CNQ  | Exact total copy number Q-score                                                                                                                                                                            |
| MAF  | Estimate for the minor allele frequency                                                                                                                                                                    |
| MCN  | Estimated minor-haplotype copy number                                                                                                                                                                      |
| MCNF | Floating point estimate of minor-haplotype copy number                                                                                                                                                     |
| MCNQ | Minor copy number Q-score                                                                                                                                                                                  |
| MF   | Mosaic fraction estimate (for MOSAIC calls)                                                                                                                                                                |
| OBF  | Per-segment Outlier BAF Fraction. Percentage of BAF counts which are considered "outlier" with respect to the chosen segment call. Higher values might indicate segments where BAF counts are problematic. |
| SD   | Best estimate of segment's bias-corrected read count                                                                                                                                                       |

For more information, see [CNV VCF](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#cnv-vcf-file).
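
As a minimal sketch of how these fields pair up, the FORMAT column and the sample column of a record can be zipped into a dictionary. The helper name `parse_sample` is hypothetical, and the values are copied from the LOSS record in the Cytogenetics example on this page. The major-haplotype copy number is derived here as `CN - MCN`, which assumes both values are present (missing values are written as `.` and would need separate handling).

```python
def parse_sample(format_col: str, sample_col: str) -> dict:
    """Pair colon-separated FORMAT keys with the sample's values."""
    keys = format_col.split(":")
    values = sample_col.split(":")
    if len(keys) != len(values):
        raise ValueError("FORMAT and sample columns differ in length")
    return dict(zip(keys, values))


fields = parse_sample(
    "GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF",
    "0/1:1:0:1000:1000:1.01085:0.000000:0.505426:65.2:0:7:10:5,1:0",
)

total_cn = int(fields["CN"])    # estimated total copy number
minor_cn = int(fields["MCN"])   # estimated minor-haplotype copy number
major_cn = total_cn - minor_cn  # total = major + minor

# PE holds two counts: improper pairs at the start and stop breakpoints.
start_pairs, stop_pairs = (int(x) for x in fields["PE"].split(","))
```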

### Cytogenetics Output

`<prefix>.cyto.vcf.gz`

The Cytogenetics modality output has a similar format to the standard CNV VCF (`*.cnv.vcf.gz`). A list of differences is indicated below:

* Records can have the `INFO/RES` field, which indicates the resolution(s) associated with the record.
* Records can have the `INFO/SEGID` field, which identifies either a custom predefined segment supplied as input by the user (as in the standard CNV VCF) or a Cytogenetics-specific predefined segment, typically a whole-arm or whole-chromosome segment injected automatically during caller execution. In the latter case, the field holds the ID or name of the arm or chromosome.
* The VCF header is annotated with `##source=DRAGEN_CYTO` to indicate the file is generated by the Cytogenetics modality.

**Note:** The Cyto VCF also provides resolution-specific homozygosity indexes (i.e., computed on each specific resolution's callset). The default minimum size considered is the same as the main `HomozygosityIndex`, and for each resolution in output, there will be an additional header line on the Cyto VCF indicating the resulting metric, e.g., `##HomozygosityIndex(25k)=0.001015`.
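
The resolution-specific header lines can be collected with a small regular expression. This is only a sketch: the function name `homozygosity_indexes` is hypothetical, and the pattern assumes every such line follows the `##HomozygosityIndex(<resolution>)=<value>` shape of the example above.

```python
import re

# Matches lines such as "##HomozygosityIndex(25k)=0.001015".
HEADER_RE = re.compile(r"^##HomozygosityIndex\((?P<res>[^)]+)\)=(?P<value>\S+)$")


def homozygosity_indexes(header_lines):
    """Return {resolution: index} for resolution-specific header lines."""
    result = {}
    for line in header_lines:
        match = HEADER_RE.match(line.strip())
        if match:
            result[match.group("res")] = float(match.group("value"))
    return result


header = [
    "##source=DRAGEN_CYTO",
    "##HomozygosityIndex(25k)=0.001015",
]
per_resolution = homozygosity_indexes(header)
```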

### CNV Metrics Output

`<prefix>.cnv_metrics.csv`

The following metrics are reported:

**Sex Genotyper**

| Metric           | Description                                                                               |
| ---------------- | ----------------------------------------------------------------------------------------- |
| Estimated sex    | Estimated sex of the case sample (and panel of normals samples if applicable).            |
| Confidence score | Range: \[0.0, 1.0]. If the sample sex is specified via `--sample-sex`, this value is 0.0. |

DRAGEN Sex Genotyper requires a minimum of 300 target intervals to confidently determine sex genotype; if the panel covers fewer intervals on the sex chromosomes, genotyping will fail and an undetermined genotype is returned. Users may lower this requirement by setting `--cnv-sex-genotyper-num-interval-requirement` to a smaller value, at the risk of increased false genotype calls.

**CNV Summary**

* Bases in reference genome in use
* Average alignment coverage over genome - The average alignment coverage over the genome is calculated by dividing the total number of bases from processed alignment records (excluding those filtered by the Target Counts stage in DRAGEN CNV) by the genome length. Alignment records are filtered taking into consideration duplicate marking status (if available), MAPQ, and mapping status.
* Number of alignment records processed
  * Number of filtered records (total)
  * Number of filtered records (due to duplicates)
  * Number of filtered records (due to MAPQ)
  * Number of filtered records (due to being unmapped)
* PMAD - Pairwise Median Absolute Deviation measures the variation in read coverage between adjacent bins. It captures variability introduced by factors such as DNA degradation, extraction, amplification, or library preparation. Higher values indicate noisier sample data. PMAD is calculated as follows:
  * Define v\[i] as the normalized count of the i-th interval in log scale, and d\[i] as the difference between the normalized counts of consecutive intervals i and i+1, i.e., d\[i] = v\[i] - v\[i+1]
  * PMAD is the median absolute deviation of d, i.e., PMAD = Median(|d\[i]-Median(d)|)
* Coverage MAD - Median absolute deviation of normalized case counts. Higher values indicate noisier sample data.
* Median Bin Count - Median of raw counts normalized by interval size.
* Number of target intervals
* Number of normal samples
* Number of segments
* Number of amplifications - Note: GAINLOH events (ALT=LOH and CN > 2) are also included here
* Number of deletions
* Number of CNLOHs (Copy-Neutral LOHs)
* Number of PASS amplifications - Note: GAINLOH events (ALT=LOH and CN > 2) are also included here
* Number of PASS deletions
* Number of PASS CNLOHs (Copy-Neutral LOHs)
* Post-Normalization Bin Count Sigma - Standard deviation of post-PoN-normalization median-normalized coverage values.

Coverage MAD and Median Bin Count are only printed for WES germline/somatic CNV. Post-Normalization Bin Count Sigma is only printed when PoN normalization has been applied.
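
The PMAD definition above can be sketched in a few lines. The function name `pmad` is hypothetical, and the input is assumed to be the normalized counts already converted to log scale; this illustrates the formula only and is not DRAGEN's implementation.

```python
from statistics import median


def pmad(log_counts):
    """Median absolute deviation of differences between consecutive intervals."""
    if len(log_counts) < 2:
        raise ValueError("at least two intervals are required")
    # d[i] = v[i] - v[i+1]
    d = [a - b for a, b in zip(log_counts, log_counts[1:])]
    center = median(d)
    # PMAD = Median(|d[i] - Median(d)|)
    return median(abs(x - center) for x in d)


# d = [-1, -2, -3], Median(d) = -2, absolute deviations = [1, 0, 1]
value = pmad([0.0, 1.0, 3.0, 6.0])
```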

Example:

```
SEX GENOTYPER,,sample,UNDETERMINED,0.0000
SEX GENOTYPER,,v1r1_normal_60,UNDETERMINED,0.0000
...
CNV SUMMARY,,Bases in reference genome,3217346917
CNV SUMMARY,,OutlierBafFraction,0.049278
CNV SUMMARY,,beta-binomial overdispersion M,184.400000
CNV SUMMARY,,PMAD,0.067799
CNV SUMMARY,,Coverage MAD,0.06750
CNV SUMMARY,,Median Bin Count,1.80
...
```

For more information, see [CNV Metrics](https://help.dragen.illumina.com/product-guides/dragen-v4.5/dragen-dna-pipeline/cnv-reference#cnv-metrics-file).
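
A minimal sketch for reading the `CNV SUMMARY` rows of this file follows. It assumes the column layout of the example above (section, empty field, metric name, value); the function name `read_cnv_summary` is hypothetical. Values are kept as strings so that the caller decides how to convert them.

```python
import csv
import io


def read_cnv_summary(text):
    """Return {metric name: value string} for CNV SUMMARY rows."""
    summary = {}
    for row in csv.reader(io.StringIO(text)):
        # Rows from other sections and elided "..." lines are skipped.
        if len(row) >= 4 and row[0] == "CNV SUMMARY":
            summary[row[2]] = row[3]
    return summary


example = """SEX GENOTYPER,,sample,UNDETERMINED,0.0000
CNV SUMMARY,,Bases in reference genome,3217346917
CNV SUMMARY,,PMAD,0.067799
CNV SUMMARY,,Coverage MAD,0.06750
"""
metrics = read_cnv_summary(example)
```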

### Track Files (IGV)

To generate additional equivalent bigWig and GFF files, set the `--cnv-enable-tracks` option to true. These files can be loaded into IGV along with other available tracks, such as RefSeq genes. Using these tracks alongside publicly available tracks allows for easier interpretation of calls. DRAGEN autogenerates an IGV session XML file when tracks are generated by DRAGEN CNV. The `*.cnv.igv_session.xml` file can be loaded directly into IGV for analysis.

The following IGV tracks are automatically populated in the output IGV session file:

| Track File            | Description                                                                                                                                                                                                            | Recommended View   |
| --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ |
| `*.target.counts.bw`  | BigWig representation of target counts bins. Values are GC-corrected if GC correction was performed.                                                                                                                   | Barchart or points |
| `*.improper_pairs.bw` | BigWig representation of improper pairs counts.                                                                                                                                                                        | Barchart           |
| `*.tn.bw`             | BigWig representation of the tangent normalized signal.                                                                                                                                                                | Points             |
| `*.seg.bw`            | BigWig representation of the segments.                                                                                                                                                                                 | Points             |
| `*.baf.seg.bw`        | BigWig representation of BAF segments (if available).                                                                                                                                                                  | Points             |
| `*.baf.bedgraph.gz`   | BED graph representation of B-allele frequency (if available).                                                                                                                                                         | Points             |
| `*.cnv.gff3`          | GFF3 representation of CNV events: DEL=blue, DUP=red, filtered=light gray, REF=green (if enabled), AOH/LOH=magenta. An example is shown below (different workflows may output different attributes on the 9th column). | —                  |

Example GFF3 output:

```
##gff-version 3
chr1    DRAGEN  LOSS    12779193        12859821        30      .       .       Alt=DEL;LinearCopyRatio=0.576;CopyNumber=1;Genotype=0/1;Qual=30;Filter=PASS;Start=12779192;Stop=12859821;Length=80629;BinCount=24;ImproperPairsCount=16,7;color=#0000FF;
chr1    DRAGEN  REF     13106280        13122338        19      .       .       Alt=REF;LinearCopyRatio=1.05981;CopyNumber=2;Genotype=./.;Qual=19;Filter=PASS;Start=13106279;Stop=13122338;Length=16059;BinCount=8;ImproperPairsCount=3,1;color=#00FF00;
chr1    DRAGEN  GAIN    13225213        13247040        66      .       .       Alt=DUP;LinearCopyRatio=2.016;CopyNumber=4;Genotype=./1;Qual=66;Filter=PASS;Start=13225212;Stop=13247040;Length=21828;BinCount=9;ImproperPairsCount=7,5;color=#FF0000;
```
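
The attributes in the ninth column can be split into key/value pairs. The sketch below assumes tab-separated columns, as in standard GFF3, and uses a hypothetical helper name `parse_gff3_record`. The record is the LOSS line from the example above.

```python
def parse_gff3_record(line):
    """Split one GFF3 line into coordinates and an attribute dictionary."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    # Attributes are ';'-separated key=value pairs; a trailing ';' is tolerated.
    attributes = dict(item.split("=", 1) for item in cols[8].split(";") if item)
    return {
        "seqid": cols[0],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "attributes": attributes,
    }


line = "\t".join([
    "chr1", "DRAGEN", "LOSS", "12779193", "12859821", "30", ".", ".",
    "Alt=DEL;LinearCopyRatio=0.576;CopyNumber=1;Genotype=0/1;Qual=30;"
    "Filter=PASS;Start=12779192;Stop=12859821;Length=80629;BinCount=24;"
    "ImproperPairsCount=16,7;color=#0000FF;",
])
record = parse_gff3_record(line)
```

In this record the 1-based GFF3 span (`end - start + 1`) equals the `Length` attribute.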

#### IGV Session

![](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-5b52de28953daaaf9d7ca6cea3fb035b9a9ad341%2Fcnv-calling.IGVTracks.png?alt=media)

File extension: `*.igv_session.xml`

The IGV session XML file is prepopulated with track files generated by DRAGEN. The session file loads the reference genome that best matches the standard reference genomes in an IGV installation, based on the name of the `--ref-dir` specified on the command line. Standard UCSC human reference genomes are autodetected, but variations from the standard reference genomes might not be. To override the detected genome, change the `genome` attribute of the `Session` element to the reference genome you want to use before loading the file into IGV. The reference identifier used by IGV might differ from the actual name of the genome. The following is an example of an edited session file.

```
<?xml version="1.0" encoding="utf-8"?>
<Session genome="b37" hasGeneTrack="false" hasSequenceTrack="true" version="8">
    <Resources>
        <Resource path="example.cnv.gff3"/>
        <Resource path="example.cnv.excluded_intervals.bed.gz"/>
        <Resource path="example.target.counts.bw"/>
        <Resource path="example.improper.pairs.bw"/>
        <Resource path="example.tn.bw"/>
        <Resource path="example.seg.bw"/>
    </Resources>
    <Panel height="500" width="1200" name="DataPanel">
        ...
    </Panel>
</Session>
```

Note that depending on the IGV version installed, it may come prepackaged with different flavors of GRCh37. The reference naming conventions have changed so a user may have to edit the `genome` field in the XML file directly. For example, IGV has traditionally packaged a `b37` reference genome, but may also include a `1kg_v37` or a `1kg_b37+decoy`, which will appear on the IGV user interface as "1kg, b37" or "1kg, b37+decoy" respectively.

You can determine the correct encoding of a reference genome by going to `File > Save Session...` and inspecting the generated igv\_session.xml file.
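
Editing the `genome` attribute can also be scripted. The following is a sketch using Python's standard `xml.etree.ElementTree`; the function name `set_session_genome` is hypothetical, and the embedded document is a shortened copy of the example session file.

```python
import xml.etree.ElementTree as ET

SESSION = b"""<?xml version="1.0" encoding="utf-8"?>
<Session genome="b37" hasGeneTrack="false" hasSequenceTrack="true" version="8">
    <Resources>
        <Resource path="example.cnv.gff3"/>
    </Resources>
</Session>
"""


def set_session_genome(xml_bytes, genome_id):
    """Return the session XML with the Session element's genome replaced."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "Session":
        raise ValueError("not an IGV session document")
    root.set("genome", genome_id)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# "1kg_v37" is one of the IGV identifiers mentioned above.
updated = set_session_genome(SESSION, "1kg_v37")
```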

![](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-3d9e4f253516593e404d240f2a3b64d10793f64a%2Fcyto.IGVExample.png?alt=media)

## Germline-aware Mode

To specify germline CNVs from a matched normal sample, use `--cnv-normal-cnv-vcf`. When specified, CNV records marked as `PASS` in the normal sample are used during tumor-sample segmentation to make sure that confident germline CNV boundaries are also boundaries in the somatic output. Segments with germline copy number changes that are relative to reference ploidy are excluded from somatic model selection. During somatic copy number calling and scoring, the germline copy number is used to modify the expected depth contribution from the normal contamination fraction of the tumor sample. The process leads to more accurate assignment of somatic copy number in regions of germline CNV. DRAGEN then annotates the somatic WGS CNV VCF entries with germline copy number (`NCN`) and the somatic copy number difference relative to germline (`SCND`) for the segments that have germline CNVs.

**Example:**

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--enable-map-align false \
--enable-cnv true \
--tumor-bam-input <TUMOR_BAM> \
--bam-input <NORMAL_BAM> \
--enable-variant-caller true \
--cnv-use-somatic-vc-baf true \
--cnv-normal-cnv-vcf <CNV_NORMAL_VCF>
```
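
To illustrate why the germline copy number matters for the normal contamination term, the sketch below uses a generic purity mixture model. This is a textbook simplification introduced here for illustration, not DRAGEN's scoring model; the function name and parameters are hypothetical.

```python
def expected_mixture_copies(purity, tumor_cn, normal_cn=2):
    """Average copies per cell in a tumor/normal mixture.

    purity    -- fraction of tumor cells in the sample (0.0 to 1.0)
    tumor_cn  -- copy number in tumor cells
    normal_cn -- copy number in contaminating normal cells
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be within [0, 1]")
    return purity * tumor_cn + (1.0 - purity) * normal_cn


# Region with a germline heterozygous deletion (normal CN 1), no somatic change.
with_germline = expected_mixture_copies(0.6, tumor_cn=1, normal_cn=1)
# Same region if the normal cells were wrongly assumed to be diploid.
assuming_diploid = expected_mixture_copies(0.6, tumor_cn=1, normal_cn=2)
```

Under this simplified model, assuming a diploid normal in a region of germline deletion overestimates the expected signal for a given tumor copy number.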

## VAF-aware Mode

If both the small variant caller and the CNV caller are enabled in a tumor-matched normal run, somatic SNV variant allele frequencies (VAFs) can inform the purity and ploidy model selection. VAF-based modeling is particularly useful when a tumor has limited copy number variation and/or CNVs are mostly subclonal (e.g., many liquid tumors), preventing the depth+BAF signal from reaching a clear model.

VAF information can also help determine the presence or absence of a whole-genome duplication even in clonal tumors with clear CNVs.

For tumor/matched-normal runs with `--enable-variant-caller true`, VAF-based modeling is enabled by default. To disable it, set `--cnv-use-somatic-vc-vaf false`.

## Advanced Topics

### Cytogenetics Modality

Conventional cytogenetics methodologies typically focus on larger alterations than the ones provided by NGS analyses. The Cytogenetics modality for the CNV caller allows the user to visualize CNAs at different resolutions, aiming at providing a more flexible workspace for different use cases.

The modality is enabled with `--cnv-enable-cyto-output` (default true for germline workflows). It is not available for somatic WES workflows.

From the same sample, and during the same run, the Cytogenetics modality starts from the high resolution results (before smoothing) provided in the standard output CNV VCF. The output callset then undergoes multiple rounds of smoothing, going progressively from finer resolution to coarser resolution calls (larger alterations). Each round of smoothing produces a smoothed callset which is set aside and becomes the starting point for callsets with higher degree of smoothing.

![](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-c21aae6744a21423831bbe81a8961166c8549eeb%2Fcyto.Smoothing.BlockDiagram.png?alt=media)

At the end of the smoothing procedure, the Cytogenetics modality produces several outputs, e.g.:

* Multiple GFF3 files, one for each round of smoothing (extension `*cyto.<resolution_ID>.gff3`).
* A single VCF file, with extension `*.cyto.vcf.gz`. This file contains all callsets identified through the smoothing iterations, with the iteration identifier stored in the `INFO/RES` field. Identical alterations across resolutions are deduplicated. In that case, the `INFO/RES` field contains a comma-separated list of resolution identifiers.
  * Some resolutions will be based on depth of coverage only (no BAF). Their `INFO/RES` value will reflect the original callset used as a starting point, with added suffix `_depth`. E.g., for depth-only calls derived from resolution `1M`, the new callset will have resolution ID `1M_depth`. Note: calls made at different resolutions or with different information (depth+BAF versus depth-only) may occasionally conflict. For instance, in a region that is AOH that also has a mosaic DEL, the region may be reported as AOH for the depth+BAF calling but may be reported as (mosaic) DEL for the depth-only track. The event type with the strongest evidence will be output for each resolution.
  * An additional callset which does not conform to the ones above (no `INFO/RES` field) is the one containing whole-arm/-chromosome aneuploidies. For this callset, all reported records have the chromosome name or arm name in the `INFO/SEGID` field. Entries for this callset will not be present on any GFF3 file. For more details see the section on whole-chromosome aneuploidies below.
* A single IGV session file, with extension `*.cyto.igv_session.xml`, which provides a convenient way to load the multiple GFF3 files and the other tracks typically found in the standard `*.cnv.igv_session.xml`. Below is an example screenshot of one such IGV session:
  * The first 5 tracks provide the DRAGEN CNV calls (Blue/DEL, Green/REF, Magenta/AOH, Red/DUP) at decreasing degree of resolution (from high to low, top to bottom).
  * The remaining tracks are similar to the standard `*cnv.igv_session.xml` run, e.g.: poor mappability regions, target counts coverage, improper pairs, B-allele frequency, etc.

![](https://25033470-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG9szlFZupV6Q2DasL98y%2Fuploads%2Fgit-blob-3d9e4f253516593e404d240f2a3b64d10793f64a%2Fcyto.IGVExample.png?alt=media)

Below is an example set of calls from the `*.cyto.vcf.gz` output file (note the additional `INFO/RES` annotation compared with the `*.cnv.vcf.gz` output file):

```
# Example REF call
chr1    819841  DRAGEN:REF:chr1:819841-6103865  N       .       1000    PASS
  END=6103865;REFLEN=5284025;RES=25k,500k,50k
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/0:2:1:1000:1000:2.00155:1.000775:1.000775:129.1:0.5:4544:10920:66,10:0.00368019

# Example GAIN call
chr1    16605768        DRAGEN:GAIN:chr1:16605769-16645359      N       <DUP>   427     PASS
  END=16645359;REFLEN=39591;RES=25k;SVLEN=39591;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE
  ./1:6:.:1:.:6.27065:.:3.135326:404.457:.:23:0:6,11

# Example LOSS call
chr1    25274774        DRAGEN:LOSS:chr1:25274775-25331683      N       <DEL>   226     PASS
  END=25331683;REFLEN=56909;RES=25k,50k;SVLEN=56909;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:1:0:1000:1000:1.01085:0.000000:0.505426:65.2:0:7:10:5,1:0
```
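
Selecting the records that belong to a given resolution amounts to splitting `INFO/RES` on commas. The sketch below uses hypothetical helper names and the INFO strings of the three example records above.

```python
def parse_info(info):
    """Parse a VCF INFO string; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def resolutions(info):
    """Return the list of resolution IDs in INFO/RES (empty if absent)."""
    res = parse_info(info).get("RES")
    return res.split(",") if isinstance(res, str) else []


records = {
    "REF": "END=6103865;REFLEN=5284025;RES=25k,500k,50k",
    "GAIN": "END=16645359;REFLEN=39591;RES=25k;SVLEN=39591;SVTYPE=CNV",
    "LOSS": "END=25331683;REFLEN=56909;RES=25k,50k;SVLEN=56909;SVTYPE=CNV",
}
at_50k = [name for name, info in records.items() if "50k" in resolutions(info)]
```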

**Selection of appropriate resolution**

Since the most informative resolution varies with circumstances (event sizes, distance between calls, presence of smaller calls causing fragmentation, etc.), no single recommendation fits all cases. However, the following practical recommendations are worth considering:

* Each resolution `INFO/RES` ID identifies the *minimum size* for alterations to be considered PASS.
* If only minimal call smoothing is necessary, resolution 25k can provide a good balance and provide calls in size ranges compatible with Chromosomal Microarray (CMA).
* When comparing against technologies such as karyotyping, resolution 1M may be more appropriate to reduce call fragmentation.

Note: if the use case under consideration is not impacted by call fragmentation, it is typically recommended to use the `*.cnv.vcf.gz` or `*.cnv_sv.vcf.gz` output results (instead of the ones in `*.cyto.vcf.gz`), to take full advantage of the superior detail of NGS.

**Additional options**

| Option                                          | Description                                                                                    |
| ----------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| --cnv-cyto-keep-resolutions=\<resolution\_list> | Comma-separated list of resolutions to output (currently supported: 25k,50k,500k,1M,1M\_depth) |

**Whole-chromosome Aneuploidy Detection**

Some use cases require inspecting a sample at the arm or whole-chromosome level. Typically this would require an additional caller alongside the standard CNV caller with automated segment detection. The Cytogenetics modality provides this set of calls in the same run, within the same VCF file (extension `*.cyto.vcf.gz`).

```
chr21  12000000   DRAGEN:GAIN:chr21:12000001-46709983  N   <DUP>  1000  PASS
  END=46709983;REFLEN=34709983;SEGID=chr21q;SVLEN=34709983;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:3:1:1000:1000:3.00155:1.002518:1.500775:193.6:0.334:29570:66224:0,0:0.0016016

chrX   1        DRAGEN:LOSS:chrX:2-156040895     N     <DEL>  1000  PASS
  END=156040895;REFLEN=156040894;SEGID=chrX;SVLEN=156040894;SVTYPE=CNV
  GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF
  0/1:1:0:1000:1000:0.996364:0.000996:0.498182:82.2:0.001:122580:144548:0,0:0.00995089
```

The example above shows two calls derived from this callset. The segment ID annotation (`INFO/SEGID`) provides the name of the segment being called (in this example, the q-arm of chromosome 21 and the entire chromosome X). REF calls are not reported unless explicitly requested with `--cnv-enable-ref-calls true`. Note: this option enables REF calls in both the CNV and CYTO VCF files.

Note: acrocentric chromosomes (13, 14, 15, 21, and 22) have short arms characterized by repetitive regions. These regions create mappability issues and are typically excluded from analysis. Calling short-arm alterations for these chromosomes is therefore challenging, because the call rests on a small percentage of the total arm length. To avoid false positive calls (in this case, an alteration reported for the full short arm with evidence from only a minimal portion of it), the algorithm applies a hard threshold (default 500 intervals) on the minimum number of intervals required to call a whole-arm alteration. When a chromosome arm call does not satisfy this threshold, it is filtered with the `chromArmBinCount` `FILTER`. The default can be changed with the `--cnv-filter-chrom-arm-bin-count` option.
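
The threshold logic can be sketched as follows. The function name is hypothetical, and two details are assumptions rather than documented behavior: that the interval count corresponds to the `BC` FORMAT value, and that a call with exactly the threshold number of intervals passes.

```python
def chrom_arm_filter(bin_count, min_bins=500):
    """Return the FILTER value for a whole-arm call based on its bin count."""
    # Assumption: calls with fewer than min_bins intervals are filtered.
    return "PASS" if bin_count >= min_bins else "chromArmBinCount"


# BC of the chr21q call in the example above.
long_arm = chrom_arm_filter(29570)
# Hypothetical short-arm call supported by only 120 intervals.
short_arm = chrom_arm_filter(120)
```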

### Joint SV/CNV calling

Somatic joint calling matches copy number segments against all SVs, with segment starts and ends matched independently.

Somatic joint calling is not enabled by default and must be enabled with `--enable-cnv-sv-somatic true`.

To ensure copy number neutral SVs have matching copy number segments, whenever `--enable-cnv-sv-somatic` is enabled, `--cnv-enable-ref-calls` is automatically enabled as well.

The following steps are performed:

* SV calling is performed.
* The SV call set is filtered to only PASS SV records.
* For each SV, the breakpoint(s) at which a copy number transition would occur, if it were base-pair consistent with the SV, are obtained.
* CNV segmentation is performed to obtain CNV breakpoints.
* If `--cnv-enable-sv-forced-segmentation` is enabled, SV breakpoints are added to the CNV breakpoints. Segments are generated from the combined CNV and SV breakpoints.
  * If a matching CNV breakpoint is found, the CNV breakpoint is adjusted to the SV breakpoint rather than adding a new breakpoint.
  * If a matching CNV breakpoint is not found, the SV breakpoint is added. CNV segments are therefore split at the internal SV breakpoints.
* CNV calling is performed on the segments.
* Adjacent CNV segments in which the END/CIEND of the left segment overlaps the POS/CIPOS of the right segment are adjusted to remove the gap.
* CNV segment start and end are independently matched to SV breakends based on POS/CIPOS and END/CIEND respectively. When there are multiple matching SVs, the inner-most position is matched.
* If a segmentation gap is created due to SV matching, short CNV segments are created to fill the gaps between SVs. The CN of each short CNV segment is set to the CN of the containing pre-adjusted segment.
* SV `<DEL>`/`<DUP>` records that correspond to a single CNV `<DEL>`/`<DUP>` record are merged into a single VCF record. As with germline joint CNV+SV calling, these VCF records contain both the SV and CNV INFO and FORMAT fields.
* The joint call set is written to the `.cnv_sv.vcf.gz` output file. The `.cnv.vcf.gz` and `.sv.vcf.gz` outputs are unaffected.

When `--cnv-enable-sv-forced-segmentation` is enabled, the somatic joint CNV+SV call set forms a [breakpoint graph](https://en.wikipedia.org/wiki/Sequence_graph).
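
The forced-segmentation step described above (adjust a matching CNV breakpoint to the SV breakpoint, otherwise add the SV breakpoint) can be sketched as follows. This is a simplified illustration only: matching is reduced to a single hypothetical `tolerance` in base pairs, whereas the description above matches on POS/CIPOS and END/CIEND, and the function names are not DRAGEN's.

```python
def force_segmentation(cnv_breakpoints, sv_breakpoints, tolerance):
    """Combine CNV and SV breakpoints into one sorted breakpoint list."""
    merged = sorted(cnv_breakpoints)
    for sv in sorted(sv_breakpoints):
        nearest = min(merged, key=lambda b: abs(b - sv), default=None)
        if nearest is not None and abs(nearest - sv) <= tolerance:
            # Matching breakpoint found: move it to the SV position.
            merged[merged.index(nearest)] = sv
        else:
            # No match: the SV breakpoint splits an existing segment.
            merged.append(sv)
        merged.sort()
    return sorted(set(merged))


def segments(breakpoints):
    """Turn consecutive breakpoints into (start, end) segments."""
    return list(zip(breakpoints, breakpoints[1:]))


combined = force_segmentation([1000, 5000, 9000], [5020, 7000], tolerance=50)
pieces = segments(combined)
```

Here the CNV breakpoint at 5000 is adjusted to the SV breakpoint at 5020, and the unmatched SV breakpoint at 7000 splits the segment that contains it.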

#### Example command lines

```bash
dragen \
-r <HASHTABLE> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE> \
--bam-input <NORMALBAM> \
--tumor-bam-input <TUMORBAM> \
--enable-map-align false \
--enable-cnv true \
--enable-sv true \
--enable-cnv-sv-somatic true \
--cnv-enable-sv-forced-segmentation true
```

#### Joint SV/CNV VCF Output

The original CNV and SV VCF output files, prior to integration, are available for users in the DRAGEN output directory, as described elsewhere. Additionally, there is an enhanced CNV VCF available with the `*.cnv_sv.vcf.gz` extension. The VCF header lines in the `*.cnv_sv.vcf.gz` mostly correspond to a concatenation of the individual header lines from the CNV and SV VCFs, with a few lines deduplicated and some new ones added. For details on the legacy header lines, please refer to the individual CNV and SV user guide sections.

Newly added header lines are described in the following table.

| Header Field        | Number | Type    | Description                                                                                                          |
| ------------------- | ------ | ------- | -------------------------------------------------------------------------------------------------------------------- |
| END\_LEFT\_BND\_OF  | 1      | String  | ID of CNV whose left end is matched to the end of SV                                                                 |
| END\_RIGHT\_BND\_OF | 1      | String  | ID of CNV whose right end is matched to the end of SV                                                                |
| LEFT\_BND           | 1      | String  | ID of SV that matches the left end of CNV record                                                                     |
| LEFT\_BND\_OF       | 1      | String  | ID of CNV whose left end is matched to SV                                                                            |
| MatchSv             | 1      | Integer | ID of original SV that was merged with CNV record                                                                    |
| OrigCnvEnd          | 1      | Integer | Coordinate of original CNV END                                                                                       |
| OrigCnvPos          | 1      | Integer | Coordinate of original CNV POS                                                                                       |
| RIGHT\_BND          | 1      | String  | ID of SV that matches the right end of CNV record                                                                    |
| RIGHT\_BND\_OF      | 1      | String  | ID of CNV whose right end is matched to SV                                                                           |
| SVCLAIM             | A      | String  | Claim made by the structural variant call. Valid values are D, J, DJ for: abundance, adjacency and both respectively |

Records that can be matched or rescued will have annotations indicating the breakpoint linkage between a CNV and SV record. If a complete match is found, then the `MatchSv` annotation will be present in the record, indicating the SV record's `ID` field for this CNV record. In this case, BND notations refer to the merged record ID itself rather than the SV before merging. Furthermore, the use of the `SVCLAIM` field will indicate if the record has evidence arising from depth signal `D`, or junction signals `J`, or both `DJ`.

Because of the mixing of standalone SV records and CNV records, the FORMAT field may have different annotations. For details on the CNV or SV specific annotations, please refer to the individual CNV and SV user guide sections.

Records that can be matched or rescued will have FILTER set to PASS. The original FILTERs are retained for records that were not matched or rescued. For example, the `cnvLength` FILTER will still be applied to standalone CNV records (those with `SVCLAIM=D`).

Example records are shown below.

```
# Merged record, note presence of SVCLAIM=DJ and MatchSv
chr1    9357666 DRAGEN:LOSS:chr1:9357667-9377061        N       <DEL>   1000    PASS    END=9377061;REFLEN=19395;SVLEN=19395;SVTYPE=DEL;LEFT_BND=DRAGEN:LOSS:chr1:9357667-9377061;OrigCnvPos=9357666;CIPOS=0,2;RIGHT_BND=DRAGEN:LOSS:chr1:9357667-9377061;OrigCnvEnd=9377061;CIEND=0,2;SVCLAIM=DJ;MatchSv=DRAGEN:DEL:1268:0:1:0:0:0;HOMLEN=2;HOMSEQ=TC;SOMATIC;SOMATICSCORE=444.26;LCF;RIGHT_BND_OF=DRAGEN:GAINLOH:chr1:4066343-9357666;LEFT_BND_OF=DRAGEN:LOSS:chr1:9357667-9377061;END_RIGHT_BND_OF=DRAGEN:LOSS:chr1:9357667-9377061;END_LEFT_BND_OF=DRAGEN:GAINLOH:chr1:9377062-9495567      GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:PR:SR:VF:VF1:VAF1:VF2:VAF2       1/1:0:0:1000:585:0.007000:.:0.003500:0.7:.:19:0:95,103:0,100:0,49:0,119:0,119:1.000000:0,119:1.000000
 
# CNV record that did not match, note presence of SVCLAIM=D
chr1    143540109       DRAGEN:GAIN:chr1:143540110-143751543    N       <DUP>   1000    PASS    END=143751543;CIPOS=-269657,1792;CIEND=-1808,799863;REFLEN=211434;SVLEN=211434;SVTYPE=CNV;SVCLAIM=D     GT:CN:MCN:CNQ:MCNQ:CNF:MCNF:SM:SD:MAF:BC:AS:PE:OBF       0/1:3:1:1000:1000:3.000000:1.134000:1.500000:300:0.378:139:119:24,19:0.0168067
  
# SV record that did not match, note presence of SVCLAIM=J
chr1    156918006       DRAGEN:DUP:TANDEM:15023:0:1:0:0:0       N       <DUP:TANDEM>    .       PASS    END=156930478;SVTYPE=DUP;SVLEN=12472;CIPOS=0,3;CIEND=0,3;HOMLEN=3;HOMSEQ=GTG;SOMATIC;SOMATICSCORE=174.11;LCF;RIGHT_BND_OF=DRAGEN:GAIN:chr1:156918005-156918006;LEFT_BND_OF=DRAGEN:GAIN:chr1:156918007-156930478;END_RIGHT_BND_OF=DRAGEN:GAIN:chr1:156918007-156930478;END_LEFT_BND_OF=DRAGEN:GAIN:chr1:156930479-157982548;SVCLAIM=J  PR:SR:VF:VF1:VAF1:VF2:VAF2:PSL  114,38:70,27:162,65:93,65:0.411392:69,65:0.485075:DRAGEN_BND_15023_1_1_2_4_0_0
```
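
The three record types above can be told apart from `SVCLAIM` and `MatchSv` alone. The sketch below uses a hypothetical helper name and shortened INFO strings taken from the example records.

```python
CLAIM_MEANING = {
    "D": "depth (abundance) only",
    "J": "junction (adjacency) only",
    "DJ": "depth and junction",
}


def describe_record(info):
    """Summarize the evidence behind a record in the joint CNV+SV VCF."""
    fields = {}
    for item in info.split(";"):
        if item:
            key, _, value = item.partition("=")
            fields[key] = value  # flags such as SOMATIC map to ""
    return {
        "evidence": CLAIM_MEANING.get(fields.get("SVCLAIM"), "unknown"),
        "merged_with_sv": "MatchSv" in fields,
        "matched_sv_id": fields.get("MatchSv"),
    }


merged = describe_record(
    "END=9377061;SVTYPE=DEL;SVCLAIM=DJ;MatchSv=DRAGEN:DEL:1268:0:1:0:0:0;SOMATIC"
)
cnv_only = describe_record("END=143751543;SVTYPE=CNV;SVCLAIM=D")
sv_only = describe_record("END=156930478;SVTYPE=DUP;SOMATIC;SVCLAIM=J")
```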
