Chapter 5 WDL style guide
The Workflow Description Language (WDL) is a way to specify data processing workflows with a human-readable and writeable syntax. WDL makes it straightforward to define complex analysis tasks, chain them together in workflows, and parallelize their execution
5.1 Name conventions
The names for the various types of objects should be in the following format:
- Workflow:
UpperCamelCase
- Task:
UpperCamelCase
- Struct:
UpperCamelCase
- Variable:
lower_case
separated with underscore_
To avoid name conflicts, end with Task
to tasks, and end with Flow
to
workflows.
5.1.1 Input and output names
Should mimic the (long form) versions of the options they represent as much as possible.
Should be formatted as
lower_case
.End with
_path
for input which represents a file path.End with
_file
for input which represents a file.Avoid referencing Absolute Paths (Except when using Docker), use a input parameter.
task bad { File f command { java -jar /usr/lib/library.jar -Dinput=${f} } } # Instead: task good { File f File jar command { java -jar ${jar} -Dinput=${f} } }
5.2 Indentation
Following should be indented:
- Anything within a set of braces
{}
. - Inputs following
input:
in a call block. - Continuations of expressions which did no fit on a single line, see Line length and line breaks.
Indentation rules:
- Use spaces.
- 4 spaces per indentation.
- Closing braces should use the same level as indentation as their opening braces.
Example:
workflow Example {
call SomeTask as doStuff {
input:
number = 1,
letter = "a"
}
}
5.3 Blank Lines
Blank lines should be used to separate different parts of a workflow:
- Different blocks (code surrounded by
{}
or<<<>>>
) should be separated by a single blank line. - Different groupings of inputs (in call, task and workflow blocks) and items in runtime and parameter_meta sections may also be separated by a single blank line.
- Between the closing braces of a parent and child block, no blank lines should be placed.
5.4 Line length and line breaks
Lines should be at most 80 characters long. If a line exceeds this, it should be broken up into multiple lines.
5.4.1 Line break
Aways add line breaks in:
- Input section after each input.
- Output section after each output.
- After
{
,}
,<<<
or>>>
. - Between inputs in a call block.
When using line breaks to adhere to the line length limit, they should occur on logical places. These include:
- Following a comma.
- Before the
then
orelse
in anif-then-else
expression. - Following an opening bracket (
(
). - Following an operator which would otherwise be followed by a space in (Expression spacing)
Example:
Int value = if defined(aVariableThatHasAWayTooLongName)
then aVariableThatHasAWayTooLongName else 10
# or
Int value = if defined(aVariableThatHasAWayTooLongName)
then aVariableThatHasAWayTooLongName
else 10
# ----
call SomeTask as doStuff {
input:
number = 1,
letter = "a"
}
5.5 Expression spacing
- Spaces should be added between values and operators in expressions.
- If multiple expressions occur within a single overarching expression, they may be grouped without placing spaces, with spaces being placed between the groups instead. It is advisable to base the groups on the order of operations.
- In the case of groupings, opening brackets (
(
) should always be preceded by a space and closing brackets ()
) should always be followed by one. There should not be a space between the brackets and their first or last value. - In the case of function calls, there should not be a space between the function’s name and the opening bracket of it’s parameters, but besides that the same rules apply as with groupings, as far as the brackets are concerned. The commas separating the parameters should be followed, but not preceded by a space.
Example
1 + 1
1 + 1/2
(1 + 1) / 2
!(a == -1)
5.6 Tasks
Recommend to use
as
format and name all tasks.One empty line between each subsection example.
task Echo { input { String message String? outputPath # Optional input(s) separated from mandatory input(s) } command <<< echo ~{message} ~{"> " + outputPath} >>> output { File? outputFile = outputPath } }
5.6.1 Command section
Each option in a bash command should be on a new line. Ending previous lines with a backslash (). Some grouping of options on a single line may be acceptable. For example, various java memory settings may be set on the same line, as long as the line does not exceed the line length limit.
All bash commands should start with
set -e -o pipefail
if more than one bash command gets executed in the task.Use
command <<<...>>>
rather thancommand {...}
, which menas the~{...}
placeholder syntax should be used in all cases rather than the${...}
syntax.Full scripts, e.g. python
task heredoc { input { File in } command<<< python <<CODE with open("${in}") as fp: for line in fp: if not line.startswith('#'): print(line.strip()) CODE >>> }
5.6.2 Runtime section
A docker container should be provided for all tasks, e.g. Biocontainers. Only if a task performs highly generic tasks (eg. a simple ln command) may the docker container be omitted.
5.6.3 The parameter_meta section
It is highly advised that a parameter_meta section is defined, containing descriptions of both the inputs and output of the task.
5.6.4 Example
task DoStuff {
input {
File inputFile
String outputPath
Int maxRAM
String? preCommand
}
command {
set -e -o pipefail
mkdir -p `dirname outputPath`
someScript \
-i ~{inputFile} \
--maxRAM ~{maxRAM} \
-o outputPath
}
output {
File output = outputPath
}
runtime {
docker: "alpine"
}
parameter_meta {
inputFile: "A file"
outputPath: "A location to put the output"
maxRam: "The maximum amount of RAM that can be used (in GB)"
output: "A file containing the output"
}
}
5.7 Modularization
5.7.1 Tasks and workflows
- In general tasks and structs should be kept in separate files from workflows. Only if a task is small and specific to a certain workflow may it be placed in the same file as the workflow.
- Tasks and structs relating to the same tool or toolkit should be in the same file.
- If a file contains multiple tasks and/or structs, the structs should be kept below the tasks and both the tasks and structs should be ordered alphabetically.
- Calls and value assignments in a workflow should be placed in order of execution.
- If there are multiple workflows and task files:
- all tasks should saved in the
tasks/
folder. - All workflows that contain tasks must go in
workflows/
. - each wdl file must only contain tasks corresponding to a certain tool.
- all tasks should saved in the
5.7.2 Analyses
- Analyses are higher level workflows that string together multiple workflows that all do the same kind of computation.
- All analyses should go into
analyses/
.
Example:
task A {
command {
echo A
}
}
task Z {
command {
echo Z
}
}
struct B {
String name
}
5.8 Imports
Imports should be placed in alphabetical order at the top of the file. All imports must be named using the as syntax.