DRAGEN
Illumina Connected Software

Post Processing

Post Processing is a reusable Nextflow component that runs post-processing tasks on data generated by upstream pipeline steps. It can enhance, transform, or modify outputs, which makes it suited to addressing specific requirements or handling errors.

This component is highly configurable, supporting fine-tuned control of computational resources (CPU, memory), containerization, and output management. Users can integrate custom containers (e.g., Python, C#) and scripts to implement their own logic for post-processing, all configured through parameters. Externalized process scripts allow for seamless execution of containerized processes.

Key Features:

  • Flexibility: Can be applied to a wide range of pipeline steps and data types.

  • Customizability: Easily adaptable to different post-processing requirements.

  • Reusability: Can be used in multiple pipelines, reducing development effort.

  • Error handling: Can be used to address issues or errors in the pipeline.

  • Data transformation: Can be used to transform or modify output data in various ways.

main.nf

Directives

  • label: Specifies the computational resources (e.g., CPUs, memory) required by the process. Configured using configuration.json "cpusMemoryConfig".

  • container: Specifies the Docker container image to be used by the process. Configured using configuration.json "container".

  • tag: Tags the process for easier identification or logging. Configured using configuration.json "tag".

  • when: Conditionally executes the process. Configured using configuration.json "enabled". If this parameter is set to true, the process runs; otherwise, it is skipped.

Input

  • inputs: A channel that passes a parent step output directory and a collection of inputs to the process, so that multiple inputs can be passed in one call.

Output

  • path("${params.self.stepName}/**"): All files under the step directory named by ${params.self.stepName}.

  • Parent step output directory
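
Because the process output collects everything under the step name directory, a script has to write its files there. The following is a minimal sketch of that pattern in plain bash, not the pipeline's own code: run_step is a hypothetical helper name, and the command it runs stands in for whatever the container provides. Inside a Nextflow template, bash variables would additionally need to be escaped as \$name.

```shell
#!/bin/bash
# run_step STEP_NAME COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND inside STEP_NAME/ so that every file it
# writes lands under the directory that the process output path collects.
run_step() {
    local step_name="$1"
    shift
    local status=0
    mkdir -p "$step_name"
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$step_name" && "$@" ) || status=$?
    if [ "$status" -ne 0 ]; then
        echo "ERROR: step '$step_name' exited with status $status" >&2
    fi
    return "$status"
}
```

For example, run_step my_step sh -c 'echo done > summary.tsv' leaves summary.tsv in my_step/. Relative paths in the arguments resolve against the step directory, so absolute paths are the safer choice for inputs.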

Publishing

  • publishDir: The publishDir directive is used to specify where output files should be saved:

    • ${params.parent.logsIntermediates}: Directory for intermediate files and logs.

    • ${params.parent.results}: Directory for final results.

    • mode: 'copy': Specifies that the files should be copied to the target directory.

    • pattern: File pattern to match for publishing (e.g., *.tsv).

    • enabled: Conditional flag (params.self.publishToResults) to control whether the files should be published.
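
The combined effect of mode, pattern, and enabled can be approximated outside Nextflow. The sketch below is plain bash, not Nextflow code, and publish_outputs is a hypothetical name. It copies files whose names match a glob pattern into a results directory, and only when publishing is enabled. It flattens subdirectories, which is a simplification and may not match how Nextflow lays out published files.

```shell
#!/bin/bash
# publish_outputs STEP_DIR RESULTS_DIR PATTERN ENABLED
# Hypothetical helper that imitates "copy matching files if enabled".
publish_outputs() {
    local step_dir="$1" results_dir="$2" pattern="$3" enabled="$4"
    # Anything other than "true" skips publishing; nothing is created or copied.
    [ "$enabled" = "true" ] || return 0
    mkdir -p "$results_dir"
    # Copy (rather than move or link) each file whose name matches the pattern,
    # so the originals stay in the step directory.
    find "$step_dir" -type f -name "$pattern" -exec cp {} "$results_dir/" \;
}
```

Calling publish_outputs my_step Results '*.tsv' true copies only the .tsv files, while passing false as the last argument leaves the Results directory untouched.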

Script Execution

  • GroovyShell: A GroovyShell instance is created to evaluate a Groovy script specified by ${params.self.groovyScript}. The script is passed the inputs as a binding, allowing it to dynamically process the input data.

    • Ref: postscripts/config.groovy

  • template: The shell script template specified by ${params.self.shellScript} is executed with the processed configuration. It lets users add post-processing logic and spawn a Docker container using the image set in configuration.json "container".

    • Ref: postscripts/script.sh

    In this example, BAM output is automatically converted to CRAM format to save disk space, using the reference genome specified in the analysis.

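#!/bin/bash

# Create the step output directory and work inside it.
mkdir -p "${params.postProcessing.stepName}"
cd "${params.postProcessing.stepName}"

resultsdir="${params.analysisDir}/Results"
genomefa="${params.customResourceDir}/genome.fa"

# Exit cleanly when there is nothing to convert.
bamfiles=\$(find "\$resultsdir" -type f -name '*.bam')
if [ -z "\$bamfiles" ]; then
    echo "WARNING: BAM files NOT found!"
    exit 0
fi

# Convert each BAM to CRAM against the analysis reference genome.
# Note: this loop splits on whitespace, so BAM paths must not contain spaces.
for f in \$bamfiles
do
    filename=\$(basename -s .bam "\$f")
    samtools view -C -T "\$genomefa" -o "./\$filename.cram" "\$f"
done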
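
The loop in the template iterates over an unquoted list, so a BAM path containing spaces would be split into several words. The sketch below shows one way to make the iteration whitespace-safe. It is plain bash for local testing rather than a Nextflow template (in a template each bash $ would be escaped as \$), and convert_bams, bam_to_cram, and the GENOME_FA variable are hypothetical names. The converter is passed in as a command so that the loop can be exercised without samtools installed.

```shell
#!/bin/bash
# convert_bams RESULTS_DIR OUT_DIR CONVERTER [ARGS...]
# CONVERTER is called as: CONVERTER [ARGS...] <input.bam> <output.cram>
convert_bams() {
    local resultsdir="$1" outdir="$2"
    shift 2
    local count=0 f filename
    mkdir -p "$outdir"
    # -print0 with read -d '' keeps each path intact even if it has spaces.
    while IFS= read -r -d '' f; do
        filename=$(basename -s .bam "$f")
        "$@" "$f" "$outdir/$filename.cram" || return 1
        count=$((count + 1))
    done < <(find "$resultsdir" -type f -name '*.bam' -print0)
    if [ "$count" -eq 0 ]; then
        echo "WARNING: BAM files NOT found!"
    fi
    return 0
}

# Hypothetical wrapper around the samtools call used in the template.
# GENOME_FA is assumed to hold the path to the reference FASTA.
bam_to_cram() {
    samtools view -C -T "$GENOME_FA" -o "$2" "$1"
}
```

With samtools available, a call such as GENOME_FA=/path/to/genome.fa convert_bams /path/to/Results . bam_to_cram would write one CRAM per BAM into the current directory.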

Usage

To use the PostProcessing process in your Nextflow workflow:

  • Configuration: Ensure that the necessary parameters (e.g., container, CPU/memory settings, and script paths) are defined in your nextflow.config or passed as command-line parameters.

  • Execution:

    • Create a PostProcessing module for your process, for example https://git.illumina.com/ClinicalGenomics/clinical-pipelines/tree/main/modules/dragen_analysis_post

    • Create a Groovy script, shell script and container image to fit the specific needs of your pipeline.

      • Creating config.groovy

      • Binding Input Variables: Ensure that all necessary inputs are correctly bound to variables within the Groovy script. These variables should be accessible and easily referenced within script.sh.

      • Process Return Structure:

        • Step name directory

        • Input tuple: tuple of sampleId and parent step directory path.

          • The process modifies one or more files in the parent work directory and returns the parent step directory.

          • The tuple to return is set in the Groovy script, as shown below:

             def tuples=[[sampleId, parentOutputDir]]
             return [outputs: tuples, args: args]
      • Creating script.sh

        • Invoke the container's main process with the specified arguments.

    • Configuration Parameters:

      • Ensure the params.pipelineConfig.{PARENT}PostProcessing section in your configuration file is properly set up with these parameters.

        • container: The process utilizes a Docker container to ensure consistent execution across environments.

        • shellScript: A shell script which calls the container.

        • groovyScript: A Groovy script that binds the process inputs so that they are accessible in the shell script.

        • publishToResults: Whether to copy output files to Results.

        • resultsPublishPattern: File pattern that selects which files to copy to Results.

    • Including the Process:

      • Use the include statement to import the PostProcessing process into your workflow under a custom name, so that it can be referenced in different parts of your pipeline. For example:

          include { PostProcessing as DragenAnalysisPostProcessing } from '../../modules/postprocessing/main.nf' params(parent: params.parent, self: params.pipelineConfig.dragenAnalysisPostProcessing)
    • Invoke the Process:

      • The process takes two parameters:

        • A tuple of sampleId and parent step directory path.

          • NOTE: sampleId is optional. Set it to ${sampleId}, or to "*" for a parent process that does not run per sample.

        • An array of variable inputs, useful for passing multiple parameters to the script.

      • Call the included process with the necessary input channels and capture the outputs. For example:

        (_dragenAnalysisPostProcessingFilesOut, dragenAnalysisSampleIdDirPost) = DragenAnalysisPostProcessing(dragenAnalysisSampleIdDir, [[".exon_cov_report.bed"]] )
    • This approach ensures that the original inputs can be modified and that additional parameters or arguments are clearly organized and accessible for further processing in the workflow.
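
Putting the configuration parameters together, the entry for a step might look like the fragment below. This is a sketch only: the key names come from this page, but every value is a placeholder, the file name post_processing_fragment.json is hypothetical, and where the entry sits inside your configuration.json is an assumption. The heredoc writes the fragment to a file so that it can be inspected and merged by hand.

```shell
#!/bin/bash
# Write an illustrative PostProcessing entry to a scratch file.
# All values are placeholders; replace them with those of your pipeline.
cat > post_processing_fragment.json <<'EOF'
{
  "dragenAnalysisPostProcessing": {
    "enabled": true,
    "stepName": "dragen_analysis_post",
    "tag": "PLACEHOLDER_TAG",
    "container": "PLACEHOLDER_REGISTRY/PLACEHOLDER_IMAGE:TAG",
    "cpusMemoryConfig": "PLACEHOLDER_RESOURCE_LABEL",
    "groovyScript": "postscripts/config.groovy",
    "shellScript": "postscripts/script.sh",
    "publishToResults": true,
    "resultsPublishPattern": "*.cram"
  }
}
EOF
```

Setting "enabled" to false would skip the step, and setting "publishToResults" to false would keep the outputs out of Results, as described in the Directives and Publishing sections.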
