Ploidy Estimator

The Ploidy Estimator runs by default. It uses reads from the mapper/aligner to calculate the sequencing depth of coverage for each autosome and allosome (sex chromosome) in the human genome. It then estimates the sex karyotype of the sample from the ratios of the median X and Y chromosome coverages to the median autosomal coverage. The estimated karyotype is the one whose expected X and Y ratio ranges, listed below, contain both ratios. If the ratios fall outside all expected ranges, the Ploidy Estimator does not determine a sex karyotype.

  Sex Karyotype   X Ratio Min   X Ratio Max   Y Ratio Min   Y Ratio Max
  XX              0.75          1.25          0.00          0.25
  XY              0.25          0.75          0.25          0.75
  XXY             0.75          1.25          0.25          0.75
  XYY             0.25          0.75          0.75          1.25
  X0              0.25          0.75          0.00          0.25
  XXXY            1.25          1.75          0.25          0.75
  XXX             1.25          1.75          0.00          0.25
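
The range lookup described above can be sketched in Python as follows. This is only an illustration, not the Ploidy Estimator's actual implementation: the function name `estimate_sex_karyotype` is hypothetical, and because adjacent ranges in the table share their endpoints, the sketch assumes half-open intervals (minimum inclusive, maximum exclusive). The source does not say how boundary values are treated.

```python
from typing import Optional

# (karyotype, x_min, x_max, y_min, y_max), copied from the table above.
KARYOTYPE_RANGES = [
    ("XX",   0.75, 1.25, 0.00, 0.25),
    ("XY",   0.25, 0.75, 0.25, 0.75),
    ("XXY",  0.75, 1.25, 0.25, 0.75),
    ("XYY",  0.25, 0.75, 0.75, 1.25),
    ("X0",   0.25, 0.75, 0.00, 0.25),
    ("XXXY", 1.25, 1.75, 0.25, 0.75),
    ("XXX",  1.25, 1.75, 0.00, 0.25),
]


def estimate_sex_karyotype(x_ratio: float, y_ratio: float) -> Optional[str]:
    """Return the karyotype whose ranges contain both ratios, or None.

    x_ratio: X median coverage / autosomal median coverage
    y_ratio: Y median coverage / autosomal median coverage
    ASSUMPTION: intervals are half-open [min, max) so that ranges sharing
    an endpoint never both match; the real boundary handling is not documented.
    """
    for name, x_min, x_max, y_min, y_max in KARYOTYPE_RANGES:
        if x_min <= x_ratio < x_max and y_min <= y_ratio < y_max:
            return name
    return None  # ratios outside all expected ranges: no karyotype determined
```

For example, an X ratio of 0.95 with a Y ratio of 0.46 falls in the XXY row, while an X ratio of 2.0 matches no row and yields no karyotype.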

Ploidy estimation fails if the input sequencing data cannot be identified as either whole-genome sequencing (WGS) or whole-exome sequencing (WES). When ploidy estimation fails, the estimated median coverage values are reported as zero. The input data type is determined from the coverage skewness, which is calculated as follows.

skewness = std::abs(autosomeMean - autosomeMedian) / autosomeMean

When skewness is <= 0.2, the data is determined to be WGS. WGS detection requires a minimum of 2x coverage. WGS data with coverage lower than 2x might not be detected properly or might be detected as WES. When skewness is >= 0.6, the data is determined to be WES. When skewness is between 0.2 and 0.6, the input data type is undefined and the reported estimated median coverage values are zero.
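
The skewness thresholds can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the tool's code: the function names are hypothetical, the inputs are taken to be the mean and median autosomal coverage already computed elsewhere, and the guard against a non-positive mean is an assumption added to avoid division by zero.

```python
from typing import Optional

WGS_MAX_SKEWNESS = 0.2  # skewness <= 0.2 is treated as WGS
WES_MIN_SKEWNESS = 0.6  # skewness >= 0.6 is treated as WES


def coverage_skewness(autosome_mean: float, autosome_median: float) -> float:
    """skewness = |autosomeMean - autosomeMedian| / autosomeMean"""
    return abs(autosome_mean - autosome_median) / autosome_mean


def classify_data_type(autosome_mean: float,
                       autosome_median: float) -> Optional[str]:
    """Return "WGS", "WES", or None when the data type is undefined."""
    if autosome_mean <= 0:
        # ASSUMPTION: treat zero or negative mean coverage as undefined.
        return None
    skewness = coverage_skewness(autosome_mean, autosome_median)
    if skewness <= WGS_MAX_SKEWNESS:
        return "WGS"
    if skewness >= WES_MIN_SKEWNESS:
        return "WES"
    return None  # between the thresholds: undefined, coverage reported as zero
```

With a mean of 45.0 and a median of 44.0, the skewness is about 0.02 and the data is classified as WGS. With a mean of 10.0 and a median of 2.0, the skewness is 0.8 and the data is classified as WES.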

For WES data, the median exome coverage is estimated using the 99th percentile of coverage bins across each contig. This estimated median exome coverage is then reported by the Ploidy Estimator and used for sex estimation.

If the autosomes do not have sufficient sequencing coverage (at least 2x for either WGS or WES), the Ploidy Estimator does not determine a sex karyotype.

When both tumor and matched normal reads are provided as input, the Ploidy Estimator only estimates sequencing coverage and sex karyotype for the matched normal sample and ignores the tumor reads. If only tumor reads are provided as input, the Ploidy Estimator estimates sequencing coverage and sex karyotype for the tumor sample.

Output Metrics

The Ploidy Estimator results, including each normalized per-contig median coverage, are reported in the <output-file-prefix>.ploidy_estimation_metrics.csv file and in standard output.

The following is an example of the results.

  PLOIDY ESTIMATION   Autosomal median coverage      44.79
  PLOIDY ESTIMATION   X median coverage              42.47
  PLOIDY ESTIMATION   Y median coverage              20.82
  PLOIDY ESTIMATION   1 median / Autosomal median    0.95
  PLOIDY ESTIMATION   2 median / Autosomal median    1.05
  PLOIDY ESTIMATION   3 median / Autosomal median    1.01
  PLOIDY ESTIMATION   4 median / Autosomal median    0.99
  ...                                                 
  PLOIDY ESTIMATION   22 median / Autosomal median   0.99
  PLOIDY ESTIMATION   X median / Autosomal median    0.95
  PLOIDY ESTIMATION   Y median / Autosomal median    0.46
  PLOIDY ESTIMATION   Ploidy estimation              XXY
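
To work with these results programmatically, the lines shown above can be parsed with a small Python sketch. The function name `parse_ploidy_lines` is hypothetical, and the sketch assumes the layout shown in the example, with fields separated by two or more spaces. The layout of the .csv file is not shown here and may differ.

```python
import re
from typing import Dict


def parse_ploidy_lines(text: str) -> Dict[str, str]:
    """Map each metric name to its value, both kept as strings.

    ASSUMPTION: each line has three fields (section, metric name, value)
    separated by runs of two or more spaces, as in the example above.
    Lines that do not fit this shape, such as "...", are skipped.
    """
    metrics: Dict[str, str] = {}
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 3 and fields[0] == "PLOIDY ESTIMATION":
            metrics[fields[1]] = fields[2]
    return metrics
```

The returned dictionary is keyed by the metric name, for example `"X median / Autosomal median"`. Numeric values can be converted with `float()`, and the `"Ploidy estimation"` entry holds the karyotype string.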
