Common Workflow Language User Guide

Key Points

  • CWL describes command line tools and workflows.

  • CWL is not software.

  • Descriptions in CWL aid portability between environments.

First Example
  • CWL documents are written in YAML and/or JSON.

  • The command called is specified with baseCommand.

  • Each expected input is described in the inputs section.

  • Input values are specified in a separate YAML file.

  • The tool description and input files are provided as arguments to a CWL runner.
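
As a minimal sketch of the points above, a tool description that wraps echo might look like the following. The file names 1st-tool.cwl and echo-job.yml and the input name message are illustrative assumptions.

```yaml
# 1st-tool.cwl (hypothetical file name)
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo          # the command that is called
inputs:
  message:                 # one expected input, of type string
    type: string
    inputBinding:
      position: 1
outputs: []
```

```yaml
# echo-job.yml (hypothetical file name): the input values live in their own file
message: Hello world!
```

With a runner such as cwltool installed, both files are passed as arguments: cwltool 1st-tool.cwl echo-job.yml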

Essential Input Parameters
  • Inputs are described in the inputs section of a CWL description.

  • Files should be described with class: File.

  • You can use the inputBinding section to describe where and how an input appears in the command.
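
A short sketch of these points, assuming a tool that wraps echo and takes a boolean flag plus a file. The input names and whale.txt are placeholders chosen for illustration.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f           # emitted only when the value is true
  example_file:
    type: File
    inputBinding:
      position: 2
      prefix: --file=
      separate: false      # no space between prefix and value
outputs: []
```

```yaml
# Matching input object; whale.txt is a placeholder path
example_flag: true
example_file:
  class: File
  path: whale.txt
```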

Returning Output Files
  • Outputs are described in the outputs section of a CWL description.

  • The field outputBinding describes how to set the value of each output parameter.

  • Wildcards are allowed in the glob field.
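
For illustration, a sketch of a tool that extracts a file from a tar archive and returns it. The names tarfile, example_out, and hello.txt are assumptions.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [tar, --extract]
inputs:
  tarfile:
    type: File
    inputBinding:
      prefix: --file
outputs:
  example_out:
    type: File
    outputBinding:
      # Name of the file to return. A wildcard pattern such as "*.txt" is
      # also accepted; if it can match several files, declare the output
      # as an array (File[]) instead.
      glob: hello.txt
```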

Capturing Standard Output
  • Use the stdout field to specify a filename to capture streamed output.

  • The corresponding output parameter must have type: stdout.
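
A minimal sketch, assuming an echo wrapper whose streamed output is captured in a file named output.txt (the file and parameter names are illustrative):

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
stdout: output.txt         # file that receives the streamed output
inputs:
  message:
    type: string
    inputBinding:
      position: 1
outputs:
  example_out:
    type: stdout           # ties this output to the captured stream
```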

Parameter References
  • Some fields permit parameter references enclosed in $(...).

  • References are written using a subset of JavaScript syntax.
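
As a sketch, the glob field below uses a parameter reference so that the returned file is whichever name the caller supplied. The input names tarfile and extractfile are assumptions.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [tar, --extract]
inputs:
  tarfile:
    type: File
    inputBinding:
      prefix: --file
  extractfile:
    type: string
    inputBinding:
      position: 1
outputs:
  extracted_file:
    type: File
    outputBinding:
      glob: $(inputs.extractfile)   # resolves to the value of the extractfile input
```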

Running Tools Inside Docker
  • Containers can help to simplify management of the software requirements of a tool.

  • Specify a Docker image for a tool with DockerRequirement in the hints section.
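
A sketch of a tool that runs inside a container, assuming the publicly available node:slim image and an input named src:

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: node
hints:
  DockerRequirement:
    dockerPull: node:slim  # image the runner pulls before executing the tool
inputs:
  src:
    type: File
    inputBinding:
      position: 1
outputs:
  example_out:
    type: stdout
stdout: output.txt
```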

Additional Arguments and Parameters
  • Use the arguments section to describe command line options that do not correspond exactly to input parameters.

  • Runtime parameters provide information about the environment when the tool is actually executed.

  • Runtime parameters are referenced under the runtime namespace.
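
A sketch of both ideas together, assuming a javac wrapper: the -d option has no matching input parameter, and its value comes from the runtime namespace.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: javac
# -d is not tied to any input; $(runtime.outdir) is the designated
# output directory, known only when the tool is actually executed.
arguments: ["-d", $(runtime.outdir)]
inputs:
  src:
    type: File
    inputBinding:
      position: 1
outputs:
  classfile:
    type: File
    outputBinding:
      glob: "*.class"
```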

Array Inputs
  • Array parameter definitions are nested under the type field with type: array.

  • The appearance of array parameters on the command line differs depending on where the inputBinding field is provided in the description.

  • Use the itemSeparator field to control concatenation of array parameters.
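
The three placements can be compared side by side in a sketch like this one. The input names filesA, filesB, and filesC are illustrative.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  filesA:
    type: string[]
    inputBinding:          # binding on the parameter: prefix appears once
      prefix: -A
      position: 1
  filesB:
    type:
      type: array
      items: string
      inputBinding:        # binding nested under type: prefix repeats per item
        prefix: -B=
        separate: false
    inputBinding:
      position: 2
  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","   # items are joined into a single argument
      separate: false
      position: 4
outputs:
  example_out:
    type: stdout
stdout: output.txt
```

Given filesA: [one, two, three], filesB: [four, five, six], and filesC: [seven, eight, nine], the resulting command line is echo -A one two three -B=four -B=five -B=six -C=seven,eight,nine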

Array Outputs
  • You can capture multiple output files into an array of files using glob.

  • Use wildcards and filenames to specify the output files that will be returned after tool execution.
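
A sketch assuming a touch wrapper that creates several files and returns every one ending in .txt:

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: touch
inputs:
  touchfiles:
    type:
      type: array
      items: string
    inputBinding:
      position: 1
outputs:
  output:
    type:
      type: array
      items: File
    outputBinding:
      glob: "*.txt"        # every match becomes one element of the array
```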

Advanced Inputs
  • Use the record type to group parameters together.

  • Multiple records within the same parameter description are treated as exclusive.
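
As a sketch, the inputs section below groups two parameters that must be supplied together and lists two records of which only one may be supplied. All parameter names are illustrative.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  dependent_parameters:    # itemA and itemB must be provided together
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:    # a list of records: exactly one may match
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
```

An input object for this description would then nest its values the same way, for example itemA and itemB under dependent_parameters and only itemC under exclusive_parameters.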

Environment Variables
  • Tools run in a restricted environment with a minimal set of environment variables.

  • Use EnvVarRequirement to set environment variables inside a tool’s environment.
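
A minimal sketch, assuming a wrapper around env that sets a variable named HELLO from a string input called message:

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: env
requirements:
  EnvVarRequirement:
    envDef:
      HELLO: $(inputs.message)   # variable name: value
inputs:
  message: string
outputs:
  example_out:
    type: stdout
stdout: output.txt
```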

JavaScript Expressions
  • If InlineJavascriptRequirement is specified, you can include JavaScript expressions that will be evaluated by the CWL runner.

  • Expressions are only valid in certain fields.

  • Expressions should only be used when no built-in CWL solution exists.
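
A sketch of both expression forms, $(...) for a single expression and ${...} for a function body, used here in the valueFrom field of arguments:

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
requirements:
  InlineJavascriptRequirement: {}
inputs: []
outputs:
  example_out:
    type: stdout
stdout: output.txt
arguments:
  - prefix: -A
    valueFrom: $(1+1)                                     # evaluates to 2
  - prefix: -B
    valueFrom: $("/foo/bar/baz".split('/').slice(-1)[0])  # evaluates to "baz"
  - prefix: -C
    valueFrom: |
      ${
        // A function body must return its value explicitly.
        var r = [];
        for (var i = 10; i >= 1; i--) {
          r.push(i);
        }
        return r;
      }
```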

Creating Files at Runtime
  • Use InitialWorkDirRequirement to specify input files that need to be created during tool runtime.
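
As a sketch, the description below writes a small shell script into the working directory before the tool starts and then runs it. The script name example.sh and the input name message are assumptions.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: ["sh", "example.sh"]
requirements:
  InitialWorkDirRequirement:
    listing:
      - entryname: example.sh    # file created at runtime
        entry: |-
          PREFIX='Message is:'
          MSG="\${PREFIX} $(inputs.message)"
          echo \${MSG}
inputs:
  message: string
outputs:
  example_out:
    type: stdout
stdout: output.txt
```

The backslash in \${PREFIX} keeps the runner from treating the shell variable as a CWL expression, while $(inputs.message) is substituted before the script is written.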

Staging Input Files
  • Input files are normally kept in a read-only directory.

  • Use InitialWorkDirRequirement to stage input files in the working directory.
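
A sketch assuming a javac wrapper, which writes its .class file next to the source file and therefore needs the source in a writable directory:

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: javac
requirements:
  InitialWorkDirRequirement:
    listing:
      - $(inputs.src)            # stage the input file in the working directory
inputs:
  src:
    type: File
    inputBinding:
      position: 1
      valueFrom: $(self.basename)  # refer to the staged copy by name only
outputs:
  classfile:
    type: File
    outputBinding:
      glob: "*.class"
```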

File Formats
  • You can document the expected format of input and output Files.

  • Once your tool is mature, we recommend specifying formats by referencing existing ontologies, e.g. EDAM.
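
As a sketch, the description below declares a BAM input using an EDAM identifier and a plain-text output using an IANA media type. The parameter names are illustrative.

```yaml
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [wc, -l]
inputs:
  aligned_sequences:
    type: File
    label: Aligned sequences in BAM format
    format: edam:format_2572     # EDAM identifier for BAM
    inputBinding:
      position: 1
outputs:
  report:
    type: stdout
    format: iana:text/plain
    label: A text file that contains a line count
stdout: output.txt
$namespaces:
  iana: https://www.iana.org/assignments/media-types/
  edam: http://edamontology.org/
```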

Metadata and Authorship
  • Metadata can be provided in CWL descriptions.

  • Developers should provide a minimal amount of authorship information to encourage correct citation.

Custom Types
  • You can create your own custom types to load into descriptions.

  • These custom types allow the user to configure the behaviour of a tool without tinkering directly with the tool description.

  • Custom types are described in separate YAML files and imported as needed.

Specifying Software Requirements
  • Software requirements should be specified under hints:SoftwareRequirement.
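
A fragment sketching the shape of such a hint. The package, identifier, and version shown are an example only and would be replaced by those of the wrapped tool.

```yaml
hints:
  SoftwareRequirement:
    packages:
      interproscan:
        specs: ["https://identifiers.org/rrid/RRID:SCR_005829"]
        version: ["5.21-60"]
```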

Writing Workflows
  • Each step in a workflow must have its own CWL description.

  • Top level inputs and outputs of the workflow are described in the inputs and outputs fields respectively.

  • The steps are specified under steps.

  • Execution order is determined by the connections between steps.
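
A sketch of a two-step workflow, assuming tool descriptions named tar-param.cwl (inputs tarfile and extractfile, output extracted_file) and arguments.cwl (input src, output classfile) exist alongside it:

```yaml
cwlVersion: v1.2
class: Workflow
inputs:
  tarball: File
  name_of_file_to_extract: string
outputs:
  compiled_class:
    type: File
    outputSource: compile/classfile
steps:
  untar:
    run: tar-param.cwl           # hypothetical tool description
    in:
      tarfile: tarball
      extractfile: name_of_file_to_extract
    out: [extracted_file]
  compile:
    run: arguments.cwl           # hypothetical tool description
    in:
      src: untar/extracted_file  # this connection makes compile run after untar
    out: [classfile]
```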

Nested Workflows
  • A workflow can be used as a step in another workflow, if the workflow engine supports the SubworkflowFeatureRequirement.

  • The workflows are specified under steps, with the workflow’s description file provided as the value to the run field.

  • Use default to specify a default value for a field, which can be overwritten by a value in the input object.

  • Use > to ignore newlines in long commands split over multiple lines.
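
A sketch of a workflow that runs another workflow as a step and supplies a default value, assuming a sub-workflow file named 1st-workflow.cwl with inputs tarball and name_of_file_to_extract and an output compiled_class:

```yaml
cwlVersion: v1.2
class: Workflow
requirements:
  SubworkflowFeatureRequirement: {}
inputs:
  tarball: File
outputs:
  classout:
    type: File
    outputSource: compile/compiled_class
steps:
  compile:
    run: 1st-workflow.cwl        # hypothetical sub-workflow description
    in:
      tarball: tarball
      name_of_file_to_extract:
        default: Hello.java      # used because no source is connected
    out: [compiled_class]
```

The > marker is a YAML folded scalar. In the following sketch, which assumes a tool using ShellCommandRequirement, the two lines of the command are joined with a space before being handed to the shell:

```yaml
cwlVersion: v1.2
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}
inputs: []
outputs: []
arguments:
  - shellQuote: false
    valueFrom: >
      echo one two
      three four
```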

Scattering Workflows
  • A workflow can scatter over an input array in a step of a workflow, if the workflow engine supports the ScatterFeatureRequirement.

  • The scatter field is specified for each step you want to scatter.

  • The scatter field references the step-level inputs, not the workflow inputs.

  • Scattering runs independently on each step where it is specified.
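
A sketch of a scattered step, assuming a tool description named 1st-tool.cwl that takes a single string input called message:

```yaml
cwlVersion: v1.2
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
inputs:
  message_array: string[]
steps:
  echo:
    run: 1st-tool.cwl            # hypothetical tool description
    scatter: message             # names the step input, not the workflow input
    in:
      message: message_array     # the tool runs once per array element
    out: []
outputs: []
```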

Conditional Workflows