2.20. Troubleshooting
In this section you will find ways to troubleshoot when you have problems executing CWL.
We focus on cwltool here but some of these techniques may apply to other CWL Runners.
2.20.1. Run cwltool with cachedir
You can use the --cachedir option when running a workflow to tell cwltool to
cache intermediate files (files that are neither inputs nor outputs, but are
created while your workflow is running). By default, these files are created in a
temporary directory, but writing them to a dedicated directory makes them easier
to access.
In the following example, troubleshooting-wf1.cwl, there are two steps, step_a and step_b.
The workflow is equivalent to echo "Hello World" | rev, which would print the message
"Hello World" reversed, i.e. "dlroW olleH". However, the second step, step_b, has a typo:
instead of executing the rev command it tries to execute revv, which
fails.
troubleshooting-wf1.cwl
cwlVersion: v1.2
class: Workflow

inputs:
  text:
    type: string
    default: 'Hello World'

outputs:
  reversed_message:
    type: string
    outputSource: step_b/reversed_message

steps:
  step_a:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        text: string
      outputs:
        step_a_stdout:
          type: File
          outputBinding:
            glob: 'stdout.txt'
      arguments: ['echo', '-n', '$(inputs.text)']
    in:
      text: text
    out: [step_a_stdout]
  step_b:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        step_a_stdout: File
      outputs:
        reversed_message:
          type: string
          outputBinding:
            glob: stdout.txt
            loadContents: true
            outputEval: $(self[0].contents)
      baseCommand: revv
      arguments: [ $(inputs.step_a_stdout) ]
    in:
      step_a_stdout:
        source: step_a/step_a_stdout
    out: [reversed_message]
Let’s execute this workflow with /tmp/cachedir/ as the --cachedir value (cwltool will
create the directory for you if it does not exist already):
$ cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'troubleshooting-wf1.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/troubleshooting/troubleshooting-wf1.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_a
INFO [step step_a] start
INFO [job step_a] Using cached output in /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf
INFO [step step_a] completed success
INFO [workflow ] starting step step_b
INFO [step step_b] start
INFO [job step_b] Output of job will be cached in /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273
INFO [job step_b] /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273$ revv \
/tmp/7etn85oz/stg4e43cefe-976f-4328-9c71-0692e7b8e134/stdout.txt > /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273/stdout.txt
ERROR 'revv' not found: [Errno 2] No such file or directory: 'revv'
WARNING [job step_b] completed permanentFail
ERROR [step step_b] Output is missing expected field file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/troubleshooting/troubleshooting-wf1.cwl#step_b/reversed_message
WARNING [step step_b] completed permanentFail
INFO [workflow ] completed permanentFail
{
    "reversed_message": null
}
WARNING Final process status is permanentFail
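The ERROR line in the log above says that no executable named revv could be found. Outside cwltool, a shell check along the following lines can confirm whether a command is available on your PATH. This is only a sketch: check_tool is a hypothetical helper name, not part of cwltool, and it assumes the step runs on the host rather than in a container.

```shell
# Hypothetical helper (not part of cwltool): report whether a command
# can be found on the current PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}
```

For example, check_tool revv and check_tool rev can be compared to see which of the two names actually exists on your system.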
The workflow ends in the permanentFail status because step_b fails to execute the
non-existent revv command. step_a completed successfully (in the log above its
output was taken from an existing cache entry) and its output is stored in your
cachedir location. You can inspect the intermediate files created:
$ tree /tmp/cachedir
/tmp/cachedir
├── 3dfb3e8c82b46e9e2d650a90a303a16a
│ └── stdout.txt
├── 3dfb3e8c82b46e9e2d650a90a303a16a.status
├── 609ea62e2a895d4dd4f7fd481ae06273
│ └── stdout.txt
├── 609ea62e2a895d4dd4f7fd481ae06273.status
├── 629so9et
├── 86ambv53
├── _953sv1n
├── a3kcdo31
├── edb2bbda4f67d8bf15e1112f6a5a10cf
│ └── stdout.txt
├── edb2bbda4f67d8bf15e1112f6a5a10cf.status
├── g_q3zryu
├── oxy9z5j5
├── tgca11o8
├── ul1hai7c
└── yxzpmizj
12 directories, 6 files
Each workflow step has received a unique ID (the long value that looks like a hash).
The ${HASH}.status files record the status of each step executed by the workflow.
The step_a output file stdout.txt is also visible in the output of the command above.
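To read the status files without opening them one by one, a small shell helper such as the following may help. It is a sketch that assumes each .status file holds a short plain-text status (for example success or permanentFail); show_cache_status is a hypothetical name, not a cwltool command.

```shell
# Hypothetical helper: print "<cache-id>: <status>" for every .status file
# in a cwltool cache directory.
# Assumption: each .status file contains a short plain-text status.
show_cache_status() {
  cachedir="$1"
  for status_file in "$cachedir"/*.status; do
    # If there are no .status files the glob stays literal, so skip it.
    [ -e "$status_file" ] || continue
    printf '%s: %s\n' "$(basename "$status_file" .status)" "$(cat "$status_file")"
  done
}
```

With the cache directory used in this section, the call would be show_cache_status /tmp/cachedir.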
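Now fix the typo so that step_b executes rev (i.e. replace revv with rev in
step_b). In the run below, the corrected workflow is saved as
troubleshooting-wf1-stepb-fixed.cwl and cwltool is executed with the same
--cachedir value as before. Note that the cwltool output now reports a pre-cached
output for step_a and a cache entry for the output of step_b. In the log captured
below both steps show "Using cached output", which indicates that the step_b entry
already existed from an earlier run of the fixed workflow. Also note that the
status of step_b is now success.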
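The fix is a one-line change, and you can of course make it in a text editor. As an illustrative alternative, the sketch below writes a corrected copy with sed. fix_revv_typo is a hypothetical helper name, and it assumes the typo appears on a line of its own reading baseCommand: revv, as in the workflow above.

```shell
# Hypothetical helper: write a copy of a workflow in which the mistyped
# "baseCommand: revv" line is corrected to "baseCommand: rev".
# $1 is the original file, $2 the corrected copy; $1 is left unchanged.
fix_revv_typo() {
  sed 's/^\( *baseCommand: \)revv$/\1rev/' "$1" > "$2"
}
```

Calling fix_revv_typo troubleshooting-wf1.cwl troubleshooting-wf1-stepb-fixed.cwl would produce a file with the name used in the next command.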
$ cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1-stepb-fixed.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'troubleshooting-wf1-stepb-fixed.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/troubleshooting/troubleshooting-wf1-stepb-fixed.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_a
INFO [step step_a] start
INFO [job step_a] Using cached output in /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf
INFO [step step_a] completed success
INFO [workflow ] starting step step_b
INFO [step step_b] start
INFO [job step_b] Using cached output in /tmp/cachedir/3dfb3e8c82b46e9e2d650a90a303a16a
INFO [step step_b] completed success
INFO [workflow ] completed success
{
    "reversed_message": "dlroW olleH"
}
INFO Final process status is success
In this example, the workflow step step_a was not re-evaluated because it had been cached and
nothing had changed in its execution or output. Furthermore, cwltool recognized that it
had to re-evaluate step_b after we fixed the executable name. This technique is
useful for troubleshooting your CWL documents, and also as a way to prevent cwltool from
re-evaluating steps unnecessarily.
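To see at a glance which steps were served from the cache, you can filter a saved log for the two messages shown in the runs above ("Using cached output in" and "Output of job will be cached in"). The helper below is a sketch: summarize_cache_use is a hypothetical name, and it assumes the log was saved to a file, for instance by redirecting cwltool's standard error with 2> run.log (an assumption about where the log is written).

```shell
# Hypothetical helper: summarize, per job, whether cwltool reused a cached
# result or executed the job and stored a new cache entry.
# $1 is a file containing saved cwltool log output.
summarize_cache_use() {
  grep -E 'Using cached output in|Output of job will be cached in' "$1" |
    sed -E \
      -e 's/.*\[job ([^]]+)\] Using cached output in .*/\1: reused from cache/' \
      -e 's/.*\[job ([^]]+)\] Output of job will be cached in .*/\1: executed, result cached/'
}
```

Applied to the log of the failing run in this section, this would report step_a as reused from cache and step_b as executed with its result cached.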