Chapter 6 Cromwell-WDL practical skills

6.1 Modes for Cromwell

There are two modes for Cromwell: - run: It is designed for local prototyping or demos and has limited features compared to server, and it executes a single workflow in Cromwell and then exits.

Options: - --metadata-output: specify a file where Cromwell will write workflow metadata in JSON format, such as start/end timestamps, status, inputs and outputs. By default Cromwell does not write workflow metadata.

  • server: The default mode for most applications of Cromwell, suitable for production use, and it accepts no arguments and runs Cromwell as a web server that accepts REST requests ( REST endpoints).

6.2 Tasks

6.2.1 Runtime arrtributes

There are two ways to specify runtime attributes of tasks:

  1. Within a task by set the runtime Section in the WDL file.
task jes_task {
  command {
    echo "Hello JES!"
  }
  
  runtime {
    docker: "ubuntu:latest"
    memory: "4G"
    cpu: "3"
    zones: "us-central1-c us-central1-b"
    disks: "/mnt/mnt1 3 SSD, /mnt/mnt2 500 HDD"
  }
}

workflow jes_workflow {
  call jes_task
}
  1. Using cromwell workflow options, e.g. options can be provided in JSON format, see Workflow Options for more details. This options will be overridden by the values provided in the workflow definition file itself
"docker": "ubuntu:latest",
"continueOnReturnCode": [4, 8, 15, 16, 23, 42]

docker

This can have a format like ubuntu:latest or broadinstitute/scala-baseimage in which case it should be interpreted as an image on DockerHub (i.e. it is valid to use in a docker pull command).

Use local registry

Create local registry and push the required image to it:

  1. Create local registry using docker-registry.

  2. Customize the storage location. By default, the registry contents are saved in /var/lib/registry, you should change it to a readable directory using argument -v if you are rootless.

  3. Tag the image and push it to the local registry.

    # create local registry
    docker run -d \
      -p 5000:5000 \
      --restart=always \
      --name registry \
      -v <customized storage location>:/var/lib/registry \
      registry
    
     # tag
     docker tag ubuntu:latest 127.0.0.1:5000/ubuntu:latest
    
     # push
     docker push 127.0.0.1:5000/ubuntu:latest
    
     # show images in the local registry
     curl 127.0.0.1:5000/v2/_catalog
    
     # then you can test it: remove and pull
     docker image rm 127.0.0.1:5000/ubuntu:latest
     docker image pull 127.0.0.1:5000/ubuntu:latest

Then the the image can be specified within the WDL file or using worlflow options.

# e.g. in workflow options
default_runtime_attributes: {
  docker: localhost:5000/ubuntu:latest
}

Note To use local registry: you should set docker.hash-lookup.enabled = false see this issue.

volume host path to the container

If there are files in the host need to read in the container or some files are persisted even the container stops, you can use docker volume.

To make the volume flexibility:

  1. you can set the path in host using workflow inputs.
  2. set the inputs in runtime-attributes and customize the submit string ( submit-docker) in the backend in cromwell configure files.
  3. set runtime attributes of tasks: in workflow options or in the runtime
    section of a task (Runtime arrtributes).
# e.g. data_volume, workflow input data_volume
workflow test {
    String data_volume = "/data"
  
    ...
}

# cromwell configure files, take Local backend for example
# new input data_volume for `run-attributes`
runtime-attributes = """
String? docker
String? docker_user
String? data_volume
"""
# submit-docker string
submit-docker = """
docker run \
--rm -i \
${"--user " + docker_user} \
--entrypoint ${job_shell} \
-v ${cwd}:${docker_cwd} \
-v ${data_volume}:${data_volume} \
${docker} ${docker_script}
"""

# runtime attrutes of tasks
# in runtime section
runtime {
    docker: localhost:5000/ubuntu:latest
    data_volume: data_volume
}
# or in workflow options file
default_runtime_attributes: {
    docker: localhost:5000/ubuntu:latest
    data_volume: data_volume
}

Expression support

Runtime attribute values are interpreted as expressions. This means that it has the ability to express the value of a runtime attribute as a function of one of the task’s inputs.

task runtime_test {
  String ubuntu_tag
  Int memory_gb

  command {
    ./my_binary
  }

  runtime {
    docker: "ubuntu:" + ubuntu_tag
    memory: memory_gb + "GB"
  }
}

disks

This attribute specifies

docker

When specified, Cromwell will run your task within the specified Docker image.

  • Local: Cromwell will automatically run the docker container.
  • SFS: When a docker container exists within a task, the submit-docker method is called. See the Getting started with containers guide for more information.
  • GCP: This attribute is mandatory when submitting tasks to Google Cloud.
  • AWS Batch: This attribute is mandatory when submitting tasks to AWS Batch.

continueOnReturnCode

  • When set to false, any non-zero return code will be considered a failure.
  • When set to true, all return codes will be considered successful.