JSON Metrics Reporting
DRAGEN generates an output JSON file, <output-prefix>.metrics.json, that aggregates metadata and module information into a single file that can be easily parsed and indexed.
The JSON file currently contains the following modules (when enabled):
Mapping and Aligning metrics (analogous to <output-prefix>.mapping_metrics.csv)
Variant Calling metrics (analogous to <output-prefix>.vc_metrics.csv)
Coverage region metrics (analogous to <output-prefix>.<coverage-region-prefix>.coverage_metrics.csv)
FASTQC metrics (analogous to <output-prefix>.fastqc_metrics.csv)
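As a minimal sketch of parsing the file, the following Python loads a metrics JSON document and reports which modules are present. It assumes that the module names used later on this page (mapAlign, variantCaller, coverageSummary, fastQc) appear as top-level keys, which this page does not confirm. The function names and the sample document are hypothetical.

```python
import json

# Module names taken from the sections of this page; treating them as
# top-level keys of the JSON file is an assumption.
KNOWN_MODULES = ("mapAlign", "variantCaller", "coverageSummary", "fastQc")


def load_metrics(path: str) -> dict:
    """Parse an <output-prefix>.metrics.json file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def present_modules(metrics: dict) -> list[str]:
    """Return the known module names that appear in a parsed metrics document."""
    return [name for name in KNOWN_MODULES if name in metrics]


# Hypothetical, heavily trimmed document used only for illustration.
sample = json.loads('{"mapAlign": {}, "fastQc": {}}')
print(present_modules(sample))
```

Checking for a module before reading it matters because each module is written only when the corresponding DRAGEN component is enabled.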
The JSON file also currently contains the following sections:
Metadata
Format
The JSON file is composed of nested dictionary entries containing metadata and metric information.
A standard output JSON metrics file is shown below:
The metadata section contains information about a DRAGEN run and license information. It currently contains the following information:
DRAGEN version - the version of DRAGEN used (string)
License info - an array of JSON objects containing license name (string) and license usage (integer)
Pipeline - the pipeline executed by DRAGEN (string)
A typical metric field will have the following format:
Mapping and Aligning Metrics
The mapAlign module contains two nested dictionaries: one for global metrics applicable to the whole sample (globalMetrics) and one for all read group information (perReadGroupMetrics). Read group metrics are dictionaries indexed by the read group name and contain per-read-group information.
This is summarized in the following format:
The following table shows the mapping between the corresponding JSON field name and the standard output/CSV name:
totalInputReads - Total input reads
duplicateMarkedReads - Number of duplicate marked reads
duplicatesRemoved - Number of duplicate marked and mate reads removed
uniqueReads - Number of unique reads
readsMateSequenced - Reads with mate sequenced
readsWithoutMateSequenced - Reads without mate sequenced
qcFailedReads - QC-failed reads
mappedReads - Mapped reads
mappedReadsR1 - Mapped reads R1
mappedReadsR2 - Mapped reads R2
mappedReadsToPopAltInsertions - Mapped reads to pop-alt insertions (PAI)
mappedReadsToNonRefDecoys - Mapped reads to non-ref decoys (NRD)
mappedReadsToRefExternalSeq - Mapped reads to ref-external sequences (PAI or NRD)
mappedReadsToFilterContigs - Mapped reads (RNA) to rRNA and filtered
mappedReadsToExcludedContigs - Mapped reads (RNA) to chrM and excluded from metrics
mappedReadsAdj - Mapped reads including ref-external or filtered or excluded
unmappedReads - Unmapped reads
unmappedReadsAdjForRefExternal - Unmapped reads minus ref-external mappings
unmappedReadsAdjForFiltered - Unmapped reads minus filtered mappings
unmappedReadsAdjForExcluded - Unmapped reads minus excluded mappings
unmappedReadsAdj - Unmapped reads minus ref-external or filtered or excluded
singletonReads - Singleton reads
pairedReads - Paired reads
properlyPairedReads - Properly paired reads
discordantReads - Not properly paired reads (discordant)
pairedReadsDiffChrom - Paired reads mapped to different chromosomes
pairedReadsDiffChromMapQ10 - Paired reads mapped to different chromosomes (MAPQ >= 10)
readsMultipleLoc - Reads mapping to multiple locations
readsMapQ40Inf - Reads with MAPQ [40:inf)
readsMapQ3040 - Reads with MAPQ [30:40)
readsMapQ2030 - Reads with MAPQ [20:30)
readsMapQ1020 - Reads with MAPQ [10:20)
readsMapQ010 - Reads with MAPQ [ 0:10)
readsMapQNa - Reads with MAPQ NA (Unmapped reads)
readsWithIndelR1 - Reads with indel R1
readsWithIndelR2 - Reads with indel R2
readsWithSpliceJunction - Reads with splice junction
totalBases - Total bases
totalBasesR1 - Total bases R1
totalBasesR2 - Total bases R2
mappedBases - Mapped bases
mappedBasesR1 - Mapped bases R1
mappedBasesR2 - Mapped bases R2
softClippedBases - Soft-clipped bases
softClippedBasesR1 - Soft-clipped bases R1
softClippedBasesR2 - Soft-clipped bases R2
hardClippedBases - Hard-clipped bases
hardClippedBasesR1 - Hard-clipped bases R1
hardClippedBasesR2 - Hard-clipped bases R2
mismatchedBasesR1 - Mismatched bases R1
mismatchedBasesR2 - Mismatched bases R2
mismatchedBasesR1ExIndel - Mismatched bases R1 (excl. indels)
mismatchedBasesR2ExIndel - Mismatched bases R2 (excl. indels)
q30Bases - Q30 bases
q30BasesR1 - Q30 bases R1
q30BasesR2 - Q30 bases R2
q30BasesNonDupNonClipped - Q30 bases (excl. dups & clipped bases)
totalAlignments - Total alignments
secondaryAlignments - Secondary alignments
supplementaryAlignments - Supplementary (chimeric) alignments
estimatedReadLength - Estimated read length
insertLengthMean - Insert length: mean
insertLengthMedian - Insert length: median
insertLengthStdDev - Insert length: standard deviation
inputBasesRefGenomeRatio - Input bases divided by reference genome size
inputBasesTargetBedRatio - Input bases divided by target bed size
estimatedSampleContamination - Estimated sample contamination
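The field-name mapping listed above can be applied programmatically when JSON output needs to be compared against the CSV report. The following Python is a minimal sketch: it copies only a few rows of the mapping, leaves unknown fields unchanged, and passes each metric value through untouched because the exact shape of a metric entry is not reproduced on this page. The function names and the example values are hypothetical.

```python
# Partial copy of the mapping above: JSON field name -> standard output/CSV name.
CSV_LABELS = {
    "totalInputReads": "Total input reads",
    "mappedReads": "Mapped reads",
    "unmappedReads": "Unmapped reads",
    "properlyPairedReads": "Properly paired reads",
    "insertLengthMean": "Insert length: mean",
}


def relabel(metrics: dict) -> dict:
    """Swap JSON field names for CSV names where known; keep other keys as-is."""
    return {CSV_LABELS.get(name, name): value for name, value in metrics.items()}


def relabel_map_align(map_align: dict) -> dict:
    """Relabel globalMetrics and every read group under perReadGroupMetrics."""
    per_group = map_align.get("perReadGroupMetrics", {})
    return {
        "globalMetrics": relabel(map_align.get("globalMetrics", {})),
        "perReadGroupMetrics": {
            group: relabel(values) for group, values in per_group.items()
        },
    }


# Hypothetical values; the real shape of each metric entry may differ.
example = {
    "globalMetrics": {"totalInputReads": 1000, "mappedReads": 990},
    "perReadGroupMetrics": {"RG1": {"mappedReads": 990, "someOtherField": 1}},
}
result = relabel_map_align(example)
```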
Variant Calling Metrics
The variantCaller module contains three nested dictionaries: the variant calling summary (summary), prefilter metrics (prefilter), and postfilter metrics (postfilter). The prefilter and postfilter metrics are dictionaries indexed by the read group name.
This is summarized in the following format:
The following table shows the mapping between the corresponding JSON field name and the standard output/CSV name:
numberOfSamples - Number of samples
readsProcessed - Reads Processed
childSample - Child Sample
totalVariants - Total
singleAllelic - Single allelic
biallelic - Biallelic
multiallelic - Multiallelic
snps - SNPs
insertions - Insertions
insertionsHap - Insertions (Hap)
insertionsHom - Insertions (Hom)
insertionsHet - Insertions (Het)
deletions - Deletions
deletionsHap - Deletions (Hap)
deletionsHom - Deletions (Hom)
deletionsHet - Deletions (Het)
indelsHet - Indels (Het)
denovoAutosomeSnp - DeNovo Autosome SNPs
denovoAutosomeIndel - De Novo INDELs
denovoChrXSnp - DeNovo chrX SNPs
denovoChrXIndel - DeNovo chrX INDELs
denovoChrYSnp - DeNovo chrY SNPs
denovoChrYIndel - DeNovo chrY INDELs
chrXSnp - Chr X number of SNPs over <region>
chrYSnp - Chr Y number of SNPs over <region>
chrXYSnpRatio - (Chr X SNPs)/(chr Y SNPs) ratio over <region>
snpTransitions - SNP Transitions
snpTransversions - SNP Transversions
tiTvRatio - Ti/Tv ratio
numHeterozygous - Heterozygous
numHomozygous - Homozygous
snpMosaic - SNP Mosaics
indelMosaic - Indel Mosaics
inDbSnp - In dbSNP
notInDbSnp - Not in dbSNP
percentCallability - Percent Callability
percentAutosomeCallability - Percent Autosome Callability
percentExomeCallability - Percent Autosome Exome Callability
percentQcRegionCallability - Percent QC Region Callability in Region <number>
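Because the prefilter and postfilter dictionaries share the same index and field names, they can be paired up to see how filtering changed each metric. The following Python is a minimal sketch of that pairing. It assumes the nesting described in this section (prefilter and postfilter keyed first by name, then by field) and does not interpret the metric values. The function name and the example numbers are hypothetical.

```python
def pre_post_pairs(variant_caller: dict) -> dict:
    """Pair each prefilter metric with its postfilter counterpart.

    Returns {index_name: {field: (prefilter_value, postfilter_value)}} for
    fields that are present on both sides.
    """
    prefilter = variant_caller.get("prefilter", {})
    postfilter = variant_caller.get("postfilter", {})
    pairs = {}
    for name, pre in prefilter.items():
        post = postfilter.get(name, {})
        pairs[name] = {field: (pre[field], post[field]) for field in pre if field in post}
    return pairs


# Hypothetical values; the real shape of each metric entry may differ.
example = {
    "summary": {"numberOfSamples": 1},
    "prefilter": {"RG1": {"totalVariants": 120, "snps": 100}},
    "postfilter": {"RG1": {"totalVariants": 110, "snps": 95}},
}
pairs = pre_post_pairs(example)
```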
QC Coverage Region Metrics
The coverageSummary module contains nested dictionaries for each of the QC regions provided as input to DRAGEN. Each dictionary is indexed by the name of the region if provided (e.g., --qc-coverage-tag is set), or by the default command line name of the region (e.g., qc-coverage-region-1).
This is summarized in the following format:
The following table shows the mapping between the corresponding JSON field name and the standard output/CSV name:
alignedBases - Aligned bases
alignedBasesInRegion - Aligned bases in <region>
avgAlignmentCovOverRegion - Average alignment coverage over <region>
uniformityCov20PerOverRegion - Uniformity of coverage (PCT > 0.2*mean)
uniformityCov40PerOverRegion - Uniformity of coverage (PCT > 0.4*mean)
pctOfRegionWithCoverageNxtoInf - PCT of <region> with coverage Nx to Inf
medianChrXCovOverRegion - Median chr X coverage (ignore 0x regions) over <region>
medianChrYCovOverRegion - Median chr Y coverage (ignore 0x regions) over <region>
avgMitoCovOverRegion - Average mitochondrial coverage over <region>
avgAutosomalCovOverRegion - Average autosomal coverage over <region>
medianAutosomalCovOverRegion - Median autosomal coverage over <region>
meanMedianCovRatioOverRegion - Mean/Median autosomal coverage ratio over <region>
alignedReads - Aligned reads
alignedReadsInRegion - Aligned reads in <region>
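Since every QC region has its own dictionary, a single metric can be collected across all regions with one pass over coverageSummary. The following Python is a minimal sketch, assuming each region dictionary holds the field names listed above directly. The function name, the region name "exome", and the values are hypothetical.

```python
def metric_by_region(coverage_summary: dict, field: str) -> dict:
    """Collect one coverage metric from every region that reports it."""
    return {
        region: metrics[field]
        for region, metrics in coverage_summary.items()
        if field in metrics
    }


# Hypothetical values; one region uses the default name, the other a tag.
example = {
    "qc-coverage-region-1": {"avgAlignmentCovOverRegion": 31.2, "alignedReads": 500},
    "exome": {"avgAlignmentCovOverRegion": 88.5},
}
coverage = metric_by_region(example, "avgAlignmentCovOverRegion")
reads = metric_by_region(example, "alignedReads")
```

Skipping regions that lack the requested field keeps the lookup from failing when a metric is reported for only some regions.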
FASTQC Metrics
The fastQc module contains nested dictionaries of metrics for read 1 and read 2 (if applicable). Because these represent sets of histogram data, the JSON format differs from that of the other modules.
This is summarized in the following format:
The following table shows the mapping between the corresponding JSON field name and the standard output/CSV name:
readMeanQuality - Read Mean Quality
positionalMeanQuality - Positional Base Mean Quality
positionalBaseContent - Positional Base Content
readLengths - Read Lengths
readGcContent - Read GC Content
readGcContentQuality - Read GC Content Quality
seqPos - Sequence Positions
posQuality - Positional Quality