1.2. Prerequisites

The software and configurations listed in this section are prerequisites for following this user guide. The CWL standards are implemented by many different workflow runners and platforms. This list of requirements focuses on the CWL reference runner, cwltool. You can use another CWL-compatible runner or workflow system, but its interface and log output may look different (the workflow outputs themselves should be identical).

CWL Implementations

There are many implementations of the CWL standards. Some are complete CWL runners, while others are plug-ins or extensions to workflow engines. See the Implementations section for a fuller explanation.

1.2.1. Operating System

We recommend using an up-to-date operating system. You can choose any of the following options for your operating system:

Linux
macOS
Windows

Note

If you are using Windows, you will have to install the Windows Subsystem for Linux 2 as documented in the cwltool documentation for Microsoft Windows users. Your operating system also needs internet access and a recent version of Python (3.6+).
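
A quick way to confirm the Python requirement is to compare the installed version against the minimum given in the note above. The following is a minimal sketch in POSIX shell. The function name python_meets_minimum is hypothetical, and it assumes version strings of the form MAJOR.MINOR[.PATCH] and a minimum of the form MAJOR.MINOR.

```shell
# Hypothetical helper: report whether a version string meets a MAJOR.MINOR minimum.
# Assumes numeric components, e.g. "3.9.19" checked against "3.6".
python_meets_minimum() {
  have=$1
  need=$2
  have_major=${have%%.*}
  have_rest=${have#*.}
  have_minor=${have_rest%%.*}
  need_major=${need%%.*}
  need_minor=${need#*.}
  if [ "$have_major" -gt "$need_major" ] ||
     { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }; then
    echo "ok: $have >= $need"
  else
    echo "too old: $have < $need"
  fi
}

# Usage: check the interpreter on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
  # `python3 --version` prints e.g. "Python 3.9.19"; keep the second word.
  installed=$(python3 --version 2>&1 | awk '{print $2}')
  python_meets_minimum "$installed" 3.6
else
  echo "python3 not found on PATH"
fi
```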

1.2.2. CWL Runner

The first thing you will need for running CWL workflows is a CWL runner. cwltool is an open-source Python project maintained by the CWL community. It is also the CWL reference runner, which means it must support everything in the current CWL specification, v1.2.

cwltool can be installed with pip, apt, or conda. We recommend using a virtual environment like venv or conda.
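
As an illustration of the recommended setup, the sketch below creates a virtual environment with venv and installs cwltool into it with pip. The directory name cwl-env, the helper names run and setup_cwltool_env, and the DRY_RUN switch are assumptions made for this example. With DRY_RUN=1 the commands are only printed, not executed.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a virtual environment and install cwltool into it.
# "cwl-env" is an arbitrary directory name chosen for this example.
setup_cwltool_env() {
  env_dir=${1:-cwl-env}
  run python3 -m venv "$env_dir" &&
    run "$env_dir/bin/pip" install cwltool &&
    run "$env_dir/bin/cwltool" --version
}

# Preview the commands without changing anything:
DRY_RUN=1 setup_cwltool_env
```

To perform the installation, call setup_cwltool_env without DRY_RUN, then activate the environment in your shell with `. cwl-env/bin/activate`.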

Note

Visit the cwltool documentation for details on installing cwltool.

Let’s use a simple CWL tool description true.cwl with cwltool.

true.cwl

cwlVersion: v1.2
class: CommandLineTool
inputs: []
outputs: []
# `true` is a Linux command that exits with exit code `0` (success).
baseCommand: "true"

The cwltool command has an option to validate CWL tool and workflow descriptions. It parses the CWL document, looks for syntax errors, and verifies that the description complies with the CWL standards, all without running the document. To validate a CWL workflow (or even a standalone command-line tool description like the one above), pass the --validate option to the cwltool command:

Validating true.cwl with cwltool.

$ cwltool --validate true.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'true.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/true.cwl'
true.cwl is valid CWL.
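
To check several documents in one go, the validation step can be wrapped in a loop. This is a sketch only: validate_all is a hypothetical helper, the runner command is passed in as its first argument, and it assumes that the runner exits with a non-zero status when validation fails.

```shell
# Hypothetical helper: validate each document with the given runner.
# Assumption: the runner returns a non-zero exit status for an invalid document.
validate_all() {
  runner=$1
  shift
  failed=0
  for doc in "$@"; do
    if "$runner" --validate "$doc" >/dev/null 2>&1; then
      echo "valid: $doc"
    else
      echo "INVALID: $doc"
      failed=1
    fi
  done
  # Non-zero if at least one document failed validation.
  return "$failed"
}

# Usage (requires cwltool on PATH):
#   validate_all cwltool true.cwl
```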

You can run the CWL tool description by omitting the --validate option:

Running true.cwl with cwltool.

$ cwltool true.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'true.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/true.cwl'
INFO [job true.cwl] /tmp/c1gnsnnc$ true
INFO [job true.cwl] completed success
{}INFO Final process status is success
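
In the output above, the {} at the start of the last line is the tool's (empty) output object, while the INFO lines are log messages. The two run together because they are written to different streams. The sketch below keeps them apart, assuming the runner writes the output object to standard output and its log to standard error (as cwltool does). The helper name run_and_capture and the file names outputs.json and run.log are hypothetical.

```shell
# Hypothetical helper: run a CWL document, saving the output object and the log separately.
run_and_capture() {
  runner=$1
  doc=$2
  # Standard output (the output object) and standard error (the log) go to separate files.
  if "$runner" "$doc" > outputs.json 2> run.log; then
    echo "success: output object in outputs.json, log in run.log"
  else
    echo "failure: see run.log"
    return 1
  fi
}

# Usage (requires cwltool on PATH):
#   run_and_capture cwltool true.cwl
#   cat outputs.json
```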

1.2.2.1. Generic cwl-runner alias

cwl-runner is an implementation-agnostic alias for any CWL-compliant runner. Users can invoke cwl-runner instead of calling a specific runner, such as cwltool, by name. A system administrator or user installs the cwl-runner alias so that it points to the preferred CWL implementation, which is convenient in environments with multiple CWL runners.

The CWL community publishes a Python package with the name cwlref-runner that installs an alias for cwltool under the name cwl-runner.

Installing cwl-runner alias for cwltool with pip.

$ pip install cwlref-runner
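
Because cwl-runner is only an alias, it can be useful to check which runner commands are actually available on your system. The sketch below looks for cwl-runner first and falls back to cwltool. The helper name pick_runner is hypothetical and is not part of any CWL package.

```shell
# Hypothetical helper: print the first CWL runner command found on PATH.
pick_runner() {
  for candidate in cwl-runner cwltool; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no CWL runner found on PATH" >&2
  return 1
}

# Usage: report which command would be used and where it lives.
if runner=$(pick_runner); then
  echo "using $runner at $(command -v "$runner")"
fi
```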

Now you can validate and run your workflow with the cwl-runner executable, which will invoke cwltool. You should have the same results and output as in the previous section.

Validating true.cwl with cwl-runner.

$ cwl-runner --validate true.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwl-runner 3.1.20240508115724
INFO Resolved 'true.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/true.cwl'
true.cwl is valid CWL.

Running true.cwl with cwl-runner.

$ cwl-runner true.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwl-runner 3.1.20240508115724
INFO Resolved 'true.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/true.cwl'
INFO [job true.cwl] /tmp/6kdhah_6$ true
INFO [job true.cwl] completed success
{}INFO Final process status is success

Another way to execute cwl-runner is by invoking the file directly. To do that, first copy the true.cwl tool description into a new file, true_shebang.cwl, and add a special first line, a shebang:

true_shebang.cwl

#!/usr/bin/env cwl-runner

cwlVersion: v1.2
class: CommandLineTool
inputs: []
outputs: []
# `true` is a Linux command that exits with exit code `0` (success).
baseCommand: "true"

Now you can make the file true_shebang.cwl executable with chmod u+x.

Making true_shebang.cwl executable.

$ chmod u+x true_shebang.cwl

Finally, you can execute it directly on the command line. On execution, the program specified in the shebang (cwl-runner) will be used to execute the rest of the file.

Running true_shebang.cwl with a shebang.

$ ./true_shebang.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwl-runner 3.1.20240508115724
INFO Resolved './true_shebang.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/true_shebang.cwl'
INFO [job true_shebang.cwl] /tmp/1ew4xm4b$ true
INFO [job true_shebang.cwl] completed success
{}INFO Final process status is success

Note

The shebang is the two-character sequence #! at the beginning of a script. When the script is executable, the operating system will execute the script using the executable specified after the shebang. It is considered good practice to use /usr/bin/env [executable] rather than a hard-coded location, since /usr/bin/env [executable] looks for the [executable] program in the system PATH.
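
The two conditions for running a CWL document directly, a shebang on the first line and the executable bit, can be checked before invoking the file. The following is a minimal sketch; check_direct_exec is a hypothetical helper name.

```shell
# Hypothetical helper: check that a file can be executed directly via its shebang.
check_direct_exec() {
  file=$1
  first_line=$(head -n 1 "$file")
  case "$first_line" in
    '#!'*) ;;
    *)
      echo "missing shebang: $file"
      return 1
      ;;
  esac
  if [ ! -x "$file" ]; then
    echo "not executable: $file (try: chmod u+x $file)"
    return 1
  fi
  # Strip the leading two characters (#!) to show the interpreter command.
  echo "ready: $file, interpreter: ${first_line#??}"
}

# Usage:
#   check_direct_exec true_shebang.cwl
```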

1.2.3. Text Editor

You can use any text editor with CWL, but for syntax highlighting we recommend an editor with YAML support. Popular editors are Visual Studio Code, Sublime Text, WebStorm, vim/neovim, and Emacs.

There are extensions for Visual Studio Code and WebStorm that provide integration with CWL, and features such as customized syntax highlighting and better auto-complete:

Visual Studio Code with the Benten (CWL) plugin - rabix/benten
cwl-plugin for IntelliJ - https://plugins.jetbrains.com/plugin/10040-cwl-plugin

The CWL community also maintains a list of editors and viewers: https://www.commonwl.org/tools/#editors

1.2.4. Docker

cwltool uses Docker to run tools, workflows, and workflow steps that specify a software container. Follow the instructions in the Docker documentation to install it for your operating system: https://docs.docker.com/.

You do not need to know how to write and build Docker containers. In the rest of the user guide, we will use existing Docker images for running examples, and to clarify the differences between the execution models with and without containers.

Note

cwltool supports running containers with Docker, Podman, udocker, and Singularity. You can also use alternative container registries for pulling images.
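
Before running examples that use containers, you may want to check which container engine is installed. The sketch below searches PATH for the engines named in the note above. It assumes their usual executable names (docker, podman, udocker, singularity), and detect_container_engine is a hypothetical helper name. Finding the executable does not guarantee that the engine is configured or that a daemon is running.

```shell
# Hypothetical helper: print the first container engine executable found on PATH.
# Assumption: the engines are installed under their usual command names.
detect_container_engine() {
  for engine in docker podman udocker singularity; do
    if command -v "$engine" >/dev/null 2>&1; then
      echo "$engine"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Usage: report what was found.
if engine=$(detect_container_engine); then
  echo "container engine found: $engine"
else
  echo "no container engine found on PATH"
fi
```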

1.2.5. Learn More

The Implementations topic in the next section, Basic Concepts.
The Python venv module: https://docs.python.org/3/library/venv.html
