# Minimal Checklist

This section provides a succinct checklist of critical Quality Control (QC) metrics for evaluating DRAGEN run performance. These metrics are located in the standard CSV output files in the run directory.

## 1. Basic Run & Alignment QC

**Source File:** `<output_prefix>.mapping_metrics.csv`\
**Default Status:** Enabled

Note that mapping metrics are computed on the raw input sample (prior to optional UMI collapsing).

| Metric Name                             | Critical Trend / Success Criteria                                                                                                                                                                                                                                                                                                                                                                                 |
| --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Total input reads**                   | **Volume check:** Confirm that the read count matches sequencer output and assay expectations, e.g. \~**600–900M reads** for 30X human WGS. Low counts indicate flow cell loading issues or demultiplexing failures. |
| **Q30 bases**                           | **Base quality:** Generally **> 85–90%**. A global drop indicates chemistry or flow cell issues (see FastQC metrics for positional detail).                                                                                                                                                                                                                                                                       |
| **Mapped reads**                        | **Alignment success:** Aim for **> 95%** (human WGS). Low mapping (< 90%) suggests wrong reference genome, severe contamination, or poor library quality.                                                                                                                                                                                                                                                         |
| **Supplementary (chimeric) alignments** | **Structural integrity:** Reads split across distant loci. For high-quality human germline WGS, expected values are typically **< 2–3%**. Values **> 3–5%** warrant investigation and may indicate library chimera artifacts (e.g. PCR stitching, FFPE-related fragmentation) or, in somatic samples, true structural variation. Sustained levels **> 5%** in germline samples are generally considered abnormal. |
| **Soft-clipped bases (R1/R2)**          | **Adapter/quality trimming:** High percentages indicate adapter read-through, short insert sizes, or poor-quality read ends (e.g. FFPE artifacts). For high-quality libraries, typical values are **< 2–3%**. Values **> 3–5%** suggest suboptimal trimming or degraded DNA. Levels **> 5%** should be treated as a QC concern and reviewed alongside FastQC metrics.                                             |
| **Estimated sample contamination**      | **Purity check:** (Requires `--qc-detect-contamination=true`). Contamination **> 1–2%** can materially impact variant calling accuracy in human germline samples, and has an even larger effect when calling low-frequency somatic variants. |
| **Duplicate reads**                     | **Library complexity:** Elevated duplication rates suggest reduced library complexity or over-amplification. For high-quality germline WGS, typical values are **< 20%**. For WES and other targeted assays, higher duplication rates (e.g. **20–50%**) are common and should be interpreted in the context of on-target coverage and assay design.                                                               |
| **Insert length (median)**              | **Fragment size:** Should match the library prep target (e.g. \~350 bp). Deviations can affect coverage uniformity.                                                                                                                                                                                                                                                                                               |
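
The percentage-based criteria above can be screened automatically. The following is a minimal sketch, not an official parser: it assumes each row of the mapping metrics CSV has the layout `section, read group, metric name, count, percent`, and that the metric names in the file match the table above. Verify both against your own output files. `LIMITS`, `flag_metrics`, and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io

# Review limits copied from the table above, as percentages.
# "min" flags values below the limit, "max" flags values above it.
LIMITS = {
    "Mapped reads": ("min", 95.0),
    "Q30 bases": ("min", 85.0),
    "Supplementary (chimeric) alignments": ("max", 3.0),
}


def flag_metrics(csv_text, limits=LIMITS):
    """Return {metric name: percent} for metrics outside their review limit."""
    flagged = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed layout: section, read group, metric name, count, percent.
        if len(row) < 5 or row[2] not in limits:
            continue
        kind, limit = limits[row[2]]
        pct = float(row[4])
        if (kind == "min" and pct < limit) or (kind == "max" and pct > limit):
            flagged[row[2]] = pct
    return flagged


# Illustrative rows only; the numbers are invented for this example.
SAMPLE = """\
MAPPING/ALIGNING SUMMARY,,Total input reads,800000000,100.00
MAPPING/ALIGNING SUMMARY,,Mapped reads,712000000,89.00
MAPPING/ALIGNING SUMMARY,,Q30 bases,108000000000,90.00
MAPPING/ALIGNING SUMMARY,,Supplementary (chimeric) alignments,8000000,1.00
"""

print(flag_metrics(SAMPLE))  # {'Mapped reads': 89.0}
```

A flagged metric is a prompt for review rather than a failure. If the file also contains per-read-group rows, filter on the first column before applying the limits.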

## 2. FastQC (Sequence Composition)

**Source File:** `<output_prefix>.fastqc_metrics.csv`\
**Default Status:** Enabled

Note that FastQC metrics are computed on the raw input sample (prior to optional UMI collapsing).

| Metric Name                       | Critical Trend / Success Criteria                                                                                                                                                                                        |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Positional base mean quality**  | **Cycle decay:** Identify quality drop-off at read ends or specific cycles (e.g. fluidics issues).                                                                                                                       |
| **Read GC content**               | **Contamination/bias:** Deviation from expected distribution (e.g. human \~40–45% GC) suggests contamination or severe PCR bias.                                                                                         |
| **Sequence positions (adapters)** | **Adapter content:** High levels indicate untrimmed adapters or short inserts. This can lead to high levels (e.g. **> 5%**) of reported soft-clipped bases in the mapping metrics. Confirm trimming options are enabled. |

## 3. UMI QC (Applies Only to UMI Designs)

**Source File:** `<output_prefix>.umi_metrics.csv`\
**Default Status:** Enabled as part of the UMI pipeline (requires `--umi-enable=true`)

| Metric Name          | Critical Trend / Success Criteria                                                                                                                                                                      |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Consensus reads**  | **Conversion efficiency:** Low ratio relative to total input reads suggests low molecular complexity or insufficient sequencing depth.                                                                 |
| **Mean family size** | **Saturation:** Family sizes near **1.0** indicate under-sequencing; error correction is ineffective without duplicate families. Very high mean family sizes may indicate excessive PCR amplification. |
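
As a worked illustration of the two metrics above, the sketch below computes a mean family size from a family-size histogram and a consensus-to-input ratio. The histogram input and both function names are hypothetical; they are not fields or tools from the DRAGEN output, and the numbers are invented.

```python
def mean_family_size(histogram):
    """Mean reads per UMI family.

    histogram maps family size -> number of families of that size.
    """
    families = sum(histogram.values())
    if families == 0:
        raise ValueError("histogram contains no families")
    reads = sum(size * count for size, count in histogram.items())
    return reads / families


def conversion_ratio(consensus_reads, total_input_reads):
    """Fraction of input reads that survive as consensus reads."""
    if total_input_reads <= 0:
        raise ValueError("total_input_reads must be positive")
    return consensus_reads / total_input_reads


# Invented example: 900 singleton families, 80 of size 2, 20 of size 3.
hist = {1: 900, 2: 80, 3: 20}
print(mean_family_size(hist))          # (900 + 160 + 60) / 1000 = 1.12
print(conversion_ratio(1000, 1120))    # about 0.89
```

A mean this close to 1.0 means most families are singletons, which is the under-sequencing pattern described in the table.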

## 4. Coverage QC

**Source File:** `<output_prefix>.*_coverage_metrics.csv`\
(e.g. `wgs_coverage_metrics.csv` or `target_bed_coverage_metrics.csv`)\
**Default Status:** Enabled

Note that coverage metrics are computed after read deduplication or UMI collapsing. Reads with MAPQ=0 are ignored.

| Metric Name                                             | Critical Trend / Success Criteria                                                                                                                                      |
| ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Average alignment coverage over target bed / genome** | **Depth check:** Primary driver of sensitivity. Ensure it meets the assay target (e.g. **30×** germline, **100×+** somatic).                                           |
| **Uniformity of coverage (PCT > 0.2× mean)**            | **Bias check:** Low uniformity indicates coverage bias (e.g. GC bias), leading to variant calling blind spots. Germline WGS typically expects **≥ 80–90%** uniformity. |
| **PCT of genome with coverage \[x: inf)**               | **Callability:** Percentage of the genome covered at depth x or higher (e.g. `PCT ≥ 20×`). Germline WGS typically requires **> 95% at 20×**. |
| **Aligned bases in target bed / genome**                | **Yield:** Total usable data. Useful for normalizing performance across runs or flow cells.                                                                            |
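
The depth target can be related to the read-count expectation in the mapping metrics with simple arithmetic: mean depth is roughly reads × read length ÷ genome size. The sketch below assumes a human genome of about 3.1 Gbp and 150 bp reads; both defaults and both function names are assumptions for illustration, not values taken from DRAGEN.

```python
def expected_depth(read_count, read_length_bp=150, genome_size_bp=3_100_000_000):
    """Approximate mean depth if every read mapped uniquely across the genome."""
    return read_count * read_length_bp / genome_size_bp


def reads_for_depth(target_depth, read_length_bp=150, genome_size_bp=3_100_000_000):
    """Approximate number of reads needed to reach target_depth."""
    return target_depth * genome_size_bp / read_length_bp


print(reads_for_depth(30))          # 620000000.0
print(expected_depth(620_000_000))  # 30.0
```

This is an upper bound on the reported value. Because coverage metrics are computed after deduplication or UMI collapsing and ignore MAPQ=0 reads, the reported average coverage will be lower than the raw estimate.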

## 5. Variant-Level Sanity Check (Optional)

**Source File:** `<output_prefix>.vc_metrics.csv`\
**Default Status:** Conditional (requires the variant caller to be enabled)

| Metric Name     | Critical Trend / Success Criteria                                                                                                                                                                                      |
| --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Ti/Tv Ratio** | **Biological plausibility:** For human germline WGS, expect approximately **1.9–2.2**. Significantly lower values (e.g. **< 1.7**) often indicate elevated false-positive rates due to sequencing or alignment errors. |

**Note:**\
Ti/Tv is a biological sanity check, not a standalone QC pass/fail metric. Expected values depend on organism, assay type, and genomic region.
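
For reference, Ti/Tv is the count of transitions (A↔G, C↔T) divided by the count of transversions (all other single-base substitutions). The sketch below shows that definition on a list of `(ref, alt)` pairs. It is an illustration of the ratio only, not how DRAGEN computes the value in `vc_metrics.csv`; the function name and example variants are hypothetical.

```python
# Purine <-> purine and pyrimidine <-> pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = set("ACGT")


def ti_tv_ratio(snvs):
    """Compute Ti/Tv from an iterable of (ref, alt) allele pairs.

    Pairs that are not single-base substitutions (indels, MNVs,
    non-ACGT symbols, ref == alt) are skipped.
    """
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions found; ratio is undefined")
    return ti / tv


# Invented example: four transitions, two transversions, one skipped indel.
variants = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"),
            ("A", "C"), ("G", "T"), ("A", "AT")]
print(ti_tv_ratio(variants))  # 2.0
```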

### How to Use This Checklist

* Treat this as a **minimum QC review**, not an exhaustive list.
* Values outside these ranges warrant investigation but are **not automatic failures**.
* Some metrics are assay-specific and may require optimization for particular use cases.

