WDL Requirements Intake Template

Intermediate | WDL Templates | 2026-03-19

Use this template to capture all the information a WDL developer needs to design, build, and validate a workflow. Complete each section as thoroughly as possible: the more detail you provide upfront, the fewer iterations development will need.

1. Project Overview

Field	Details
Project Name	[Enter project or workflow name]
Requestor Name	[Name and role of the person requesting the WDL]
Date Submitted	[YYYY-MM-DD]
Target Completion Date	[YYYY-MM-DD or "No deadline"]
Priority	[Critical / High / Medium / Low]
Business Justification	[Why is this workflow needed? What problem does it solve?]

2. Workflow Purpose and Scope

2.1 High-Level Description

Provide a plain-language summary of what the workflow should accomplish from start to finish.

[Enter description here]

2.2 Scientific or Business Context

Describe the domain context. For bioinformatics workflows, include the biological question being addressed. For data processing workflows, describe the data domain.

[Enter context here]

2.3 Scope Boundaries

In Scope	Out of Scope
[What the workflow WILL do]	[What the workflow will NOT do]
[...]	[...]

3. Input Specifications

3.1 Primary Inputs

For each input file or parameter the workflow requires, complete the following:

Input Name	Type	Format / Extension	Required?	Example Value	Description
[e.g. sample_bam]	[File / String / Int / Float / Boolean / Array]	[e.g. .bam]	[Yes / No]	[e.g. sample001.bam]	[What this input represents]
[...]	[...]	[...]	[...]	[...]	[...]
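
To show how the rows of this table are typically used, here is a minimal sketch of the WDL input block a developer might derive from them. It assumes WDL 1.0. Every name other than sample_bam is hypothetical and chosen only for illustration. A "Required? No" row usually becomes either an optional type (with ?) or a declaration with a default value.

```wdl
version 1.0

# Hypothetical workflow: each declaration corresponds to one row
# of the Primary Inputs table.
workflow intake_example {
  input {
    File sample_bam                    # Type: File, Required: Yes
    String sample_id                   # Type: String, Required: Yes (hypothetical)
    Int min_mapping_quality = 20       # Required: No, because a default is supplied
    Float? downsample_fraction         # Required: No, optional with no default
    Boolean paired_end = true          # Type: Boolean with a default
    Array[File] extra_intervals = []   # Type: Array; the element type must be stated
  }
}
```

Because WDL arrays are typed by element, it helps to record the element type (for example "Array of File") in the Type column rather than "Array" alone.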

3.2 Reference Data and Databases

Reference Name	Source / Version	File Format	Size (approx.)	Update Frequency	Access Location
[e.g. hg38 reference genome]	[UCSC / Ensembl / NCBI]	[.fa / .fasta]	[e.g. 3.2 GB]	[Static / Quarterly / Annual]	[URL or file path]
[...]	[...]	[...]	[...]	[...]	[...]

3.3 Input Validation Rules

Describe any validation criteria that should be enforced on inputs before processing begins.

[e.g. BAM files must be coordinate-sorted and indexed]
[e.g. FASTQ files must be paired-end with matching read counts]
[...]
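
A rule such as "BAM files must be coordinate-sorted and indexed" is often implemented as a small pre-flight task that fails fast. The sketch below is one possible form, assuming WDL 1.0 and a container that provides samtools. The task name, the output file name, and the docker_image input are hypothetical.

```wdl
version 1.0

# Sketch of a pre-flight check for one validation rule.
task validate_bam {
  input {
    File bam
    File bam_index
    String docker_image   # hypothetical: any image that provides samtools
  }

  command <<<
    set -euo pipefail

    # Write the header to a file first so that grep reads a regular file.
    samtools view -H ~{bam} > header.txt

    # The @HD header line records the sort order as SO:coordinate.
    if ! grep -q '^@HD.*SO:coordinate' header.txt; then
      echo "ERROR: ~{bam} is not coordinate-sorted" >&2
      exit 1
    fi

    # The index must exist and be non-empty.
    if [ ! -s ~{bam_index} ]; then
      echo "ERROR: index for ~{bam} is missing or empty" >&2
      exit 1
    fi

    echo "ok" > validation_status.txt
  >>>

  output {
    String status = read_string("validation_status.txt")
  }

  runtime {
    docker: docker_image
  }
}
```

Downstream tasks can take the status output as an input so that they only start once validation has succeeded.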

4. Expected Outputs

4.1 Primary Outputs

Output Name	Format / Extension	Description	Downstream Consumer
[e.g. aligned_bam]	[.bam]	[Aligned and sorted BAM file]	[Variant calling pipeline / Manual review]
[...]	[...]	[...]	[...]

4.2 Intermediate Outputs

List any intermediate files that should be preserved (not just final outputs).

Output Name	Format	Reason to Retain
[e.g. duplicate_metrics]	[.txt]	[QC reporting]
[...]	[...]	[...]

4.3 Quality Control Outputs

QC Metric / Report	Format	Pass/Fail Criteria
[e.g. alignment rate]	[.txt / .html]	[>95% reads mapped]
[e.g. duplication rate]	[.txt]	[<30% duplicates]
[...]	[...]	[...]
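
A pass/fail criterion from this table can be exposed as a Boolean output that later steps act on. The following is a minimal sketch, assuming WDL 1.0 and assuming the metric has already been reduced to a one-line file holding a fraction such as 0.97. The task name, file names, and container image are illustrative assumptions.

```wdl
version 1.0

# Sketch: turn the criterion ">95% reads mapped" into a Boolean output.
task check_alignment_rate {
  input {
    File mapped_fraction_file          # hypothetical: one line, e.g. 0.97
    Float min_mapped_fraction = 0.95   # threshold from the Pass/Fail Criteria column
  }

  command <<<
    set -euo pipefail
    # Compare the first field of the first line with the threshold.
    awk -v min=~{min_mapped_fraction} \
      'NR == 1 { if ($1 > min) print "true"; else print "false" }' \
      ~{mapped_fraction_file} > qc_pass.txt
  >>>

  output {
    Boolean qc_pass = read_boolean("qc_pass.txt")
  }

  runtime {
    docker: "ubuntu:22.04"   # assumption: any image with awk would do
  }
}
```

Stating thresholds as explicit numbers with units in the table lets them be copied directly into input defaults like min_mapped_fraction.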

5. Workflow Steps and Logic

5.1 Step-by-Step Process

Describe each major processing step in order. For each step, include the tool, version, and key parameters.

Step	Task Name	Tool / Software	Version	Key Parameters	Description
1	[e.g. FastQC]	[FastQC]	[v0.11.9]	[--threads 4]	[Raw read quality assessment]
2	[e.g. Trim]	[Trimmomatic]	[v0.39]	[LEADING:3 TRAILING:3]	[Adapter and quality trimming]
3	[...]	[...]	[...]	[...]	[...]

5.2 Conditional Logic

Describe any branching or conditional steps in the workflow.

[e.g. If sample is paired-end, run step X; if single-end, run step Y]
[e.g. If QC fails, halt pipeline and notify]
[...]
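
WDL 1.0 has if blocks but no else, so a two-way branch such as the paired-end versus single-end example is commonly written as two complementary conditions. The sketch below shows that pattern. All task names are hypothetical and the commands are placeholders rather than real aligner invocations.

```wdl
version 1.0

task align_paired {
  input {
    File fastq_r1
    File fastq_r2
  }
  command <<<
    # Placeholder for a real paired-end alignment command.
    echo "paired: ~{fastq_r1} ~{fastq_r2}" > aligned.txt
  >>>
  output {
    File aligned = "aligned.txt"
  }
}

task align_single {
  input {
    File fastq
  }
  command <<<
    # Placeholder for a real single-end alignment command.
    echo "single: ~{fastq}" > aligned.txt
  >>>
  output {
    File aligned = "aligned.txt"
  }
}

workflow conditional_example {
  input {
    File fastq_r1
    File? fastq_r2   # supplied only for paired-end samples
  }

  if (defined(fastq_r2)) {
    call align_paired {
      input:
        fastq_r1 = fastq_r1,
        fastq_r2 = select_first([fastq_r2])
    }
  }

  if (!defined(fastq_r2)) {
    call align_single { input: fastq = fastq_r1 }
  }

  output {
    # Outputs of calls inside if blocks are optional; exactly one is defined here.
    File aligned = select_first([align_paired.aligned, align_single.aligned])
  }
}
```

When filling in this section, it helps to state which input or QC result drives each branch, since that value becomes the if condition.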

5.3 Scatter/Gather Operations

Describe any steps that should be parallelised across samples, chromosomes, or other units.

Scatter Variable	Scatter Over	Gather Method
[e.g. sample_id]	[Array of sample BAMs]	[Merge output VCFs]
[e.g. chromosome]	[Array of chromosome intervals]	[Concatenate results]
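
Each row of this table corresponds to a scatter block followed by a gather step. As a minimal sketch, assuming WDL 1.0, the following workflow scatters over an array of interval names and concatenates the per-interval results. Task names and commands are hypothetical placeholders.

```wdl
version 1.0

task process_interval {
  input {
    String interval
  }
  command <<<
    # Placeholder for per-interval work such as variant calling.
    echo "processed ~{interval}" > result.txt
  >>>
  output {
    File result = "result.txt"
  }
}

task concatenate {
  input {
    Array[File] parts
  }
  command <<<
    # Gather step: join the per-interval files in scatter order.
    cat ~{sep=" " parts} > merged.txt
  >>>
  output {
    File merged = "merged.txt"
  }
}

workflow scatter_gather_example {
  input {
    Array[String] intervals = ["chr1", "chr2", "chr3"]
  }

  # Scatter Variable: interval. Scatter Over: intervals.
  scatter (interval in intervals) {
    call process_interval { input: interval = interval }
  }

  # Outside the scatter block, process_interval.result is an Array[File].
  call concatenate { input: parts = process_interval.result }

  output {
    File merged = concatenate.merged
  }
}
```

The Gather Method column determines what the gather task does. Concatenation is the simplest case, while merging VCFs would need a dedicated tool.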

6. Compute and Runtime Requirements

6.1 Per-Task Resource Estimates

Task Name	CPU Cores	Memory (GB)	Disk (GB)	GPU Required?	Estimated Runtime
[e.g. BWA-MEM alignment]	[8]	[16]	[100]	[No]	[~2 hours per sample]
[e.g. GATK HaplotypeCaller]	[4]	[8]	[50]	[No]	[~1 hour per sample]
[...]	[...]	[...]	[...]	[...]	[...]

6.2 Execution Environment

Field	Details
Target Platform	[Cromwell / miniWDL / Terra / DNAnexus / AWS HealthOmics / Other]
Backend	[Local / HPC (Slurm/PBS) / Cloud (GCP/AWS/Azure)]
Container Registry	[Docker Hub / GCR / ECR / Quay.io / Custom]
Preemptible/Spot Instances	[Yes / No / Where possible]
Maximum Retry Attempts	[e.g. 2]

6.3 Docker Containers

Task Name	Docker Image	Tag / Version	Registry URL
[e.g. BWA alignment]	[broadinstitute/bwa]	[0.7.17]	[docker.io/broadinstitute/bwa:0.7.17]
[...]	[...]	[...]	[...]
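
The values collected in Sections 6.1 to 6.3 usually end up in a task's runtime block. The sketch below shows one way the BWA-MEM example row might be encoded, assuming WDL 1.0 run on Cromwell. The disks, preemptible, and maxRetries attributes follow Cromwell conventions, and other engines may ignore or interpret them differently. The task name, the input names, and the placeholder command are hypothetical.

```wdl
version 1.0

task align_reads {
  input {
    File reads_fastq             # hypothetical input
    Int cpu = 8                  # CPU Cores column
    Int memory_gb = 16           # Memory (GB) column
    Int disk_gb = 100            # Disk (GB) column
    Int preemptible_tries = 1    # Preemptible/Spot Instances: Yes
    Int max_retries = 2          # Maximum Retry Attempts
    String docker_image          # image and tag from the Docker Containers table
  }

  command <<<
    # Placeholder for the real alignment command.
    echo "would align ~{reads_fastq} using ~{cpu} threads" > run.log
  >>>

  output {
    File log = "run.log"
  }

  runtime {
    docker: docker_image
    cpu: cpu
    memory: "~{memory_gb} GB"
    disks: "local-disk ~{disk_gb} HDD"   # Cromwell-style disk request
    preemptible: preemptible_tries       # Cromwell-specific attribute
    maxRetries: max_retries              # Cromwell-specific in WDL 1.0
  }
}
```

Exposing resources as inputs with defaults, rather than hard-coding them, lets the estimates in the table be adjusted per run without editing the task.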

7. Error Handling and Edge Cases

7.1 Known Edge Cases

Describe any known edge cases or special conditions the workflow must handle.

[e.g. Empty input files — workflow should skip processing and log a warning]
[e.g. Very large samples (>100GB) may require increased disk allocation]
[...]

7.2 Failure Modes and Recovery

Failure Scenario	Expected Behaviour	Recovery Action
[e.g. Tool exits with non-zero code]	[Task fails]	[Retry up to N times, then halt]
[e.g. Disk space exhausted]	[Task fails]	[Increase disk multiplier and retry]
[...]	[...]	[...]
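
A recovery action such as "increase disk multiplier and retry" presupposes that disk is sized from the input rather than fixed. One way to do this, sketched below under the assumption of WDL 1.0 on Cromwell and a container that provides samtools, is to derive the disk request from size(). The default multiplier, the extra headroom, and all names are hypothetical.

```wdl
version 1.0

task sort_bam {
  input {
    File input_bam
    Float disk_multiplier = 2.5   # hypothetical default; raise it after a disk failure
    Int extra_disk_gb = 20        # hypothetical fixed headroom
    Int max_retries = 2           # "Retry up to N times, then halt"
    String docker_image           # hypothetical: any image that provides samtools
  }

  # size() returns a Float; ceil() rounds up to a whole number of GB.
  Int disk_gb = ceil(size(input_bam, "GB") * disk_multiplier) + extra_disk_gb

  command <<<
    set -euo pipefail
    samtools sort -o sorted.bam ~{input_bam}
  >>>

  output {
    File sorted_bam = "sorted.bam"
  }

  runtime {
    docker: docker_image
    disks: "local-disk ~{disk_gb} HDD"   # Cromwell-style disk request
    maxRetries: max_retries              # Cromwell-specific in WDL 1.0
  }
}
```

In this sketch an automatic retry reruns the task with the same disk request, so recovering from disk exhaustion means resubmitting with a larger disk_multiplier. It is worth recording in the table whether a recovery action is expected to be automatic or manual.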

8. Testing and Validation

8.1 Test Data

Dataset Name	Size	Location	Description
[e.g. NA12878 downsampled]	[~1 GB]	[gs://bucket/test-data/]	[Standard reference sample for validation]
[...]	[...]	[...]	[...]

8.2 Acceptance Criteria

Define what "done" looks like. How will you confirm that the workflow produces correct results?

[e.g. Output VCF matches expected variants in truth set with >99% sensitivity]
[e.g. Workflow completes successfully on all 5 test samples]
[e.g. Runtime is within 20% of estimated duration]
[...]

9. Metadata and Governance

Field	Details
Data Classification	[Public / Internal / Confidential / Restricted]
Compliance Requirements	[HIPAA / GDPR / GxP / None]
Data Retention Policy	[e.g. Retain outputs for 7 years]
Audit Trail Required?	[Yes / No]
Version Control Repository	[e.g. github.com/org/repo]
WDL Version	[1.0 / 1.1 / development]

10. Stakeholders and Approvals

Role	Name	Email	Sign-off Required?
Requestor	[Name]	[email]	[Yes]
WDL Developer	[Name]	[email]	[Yes]
Scientific Lead	[Name]	[email]	[Yes / No]
DevOps / Platform	[Name]	[email]	[Yes / No]
Data Governance	[Name]	[email]	[Yes / No]

11. Additional Notes

Include any additional context, links to related documentation, diagrams, or references that would help the WDL developer.

[Enter any additional notes here]

Revision History

Version	Date	Author	Changes
1.0	[YYYY-MM-DD]	[Name]	[Initial submission]
[...]	[...]	[...]	[...]