Common Workflow Language

The Common Workflow Language (CWL) is an open standard for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, High Energy Physics, and Machine Learning.
CWL is developed by a multi-vendor working group consisting of organizations and individuals aiming to enable scientists to share data analysis workflows. The CWL project is maintained on GitHub and we follow the Open-Stand.org principles for collaborative open standards development. Legally, CWL is a member project of Software Freedom Conservancy and is formally managed by the elected CWL leadership team; however, everyday project decisions are made by the CWL community, which is open for participation by anyone.
CWL 2021 Mini-Conference
The CWL 2021 Mini-Conference was held on February 8-10, 2021. All videos are accessible online via YouTube and Conf.Tube.
Vision for the CWL Project
What this community hopes to have accomplished in the future
Researchers, scientists, and analysts share their batch data analysis workflows without technical barriers using an open standard. Sharing workflows this way is commonplace and seen as a typical way of working. The workflows are complete and run in a variety of environments, and people reuse shared workflow descriptions and build new workflows from them. No vendor dominates the ecosystem.
Mission of the CWL Project
How we plan to achieve our vision
The CWL project supports open consensus-based standards for command line data analysis workflows and tools.
Specifically, we support the
- pre-standards process by providing a neutral venue to discuss, propose, and test ideas about command-line-tool-based workflow standards and related topics.
- standardization process by stewarding the development and delivery of standards in accordance with the Open Stand principles.
- post-standards life cycle by (1) promoting the released standards, (2) developing and maintaining related training and tools, and by (3) tracking deficits and other post-standardization feedback.
Getting Started
The CWL user guide provides a gentle introduction to writing CWL command line tool and workflow descriptions.
Browse CWL Implementations to find a software package that’s right for you.
CWLの日本語での解説ドキュメント (CWL explanatory documentation in Japanese) is a 15-minute introduction to the CWL project in Japanese.
A series of video lessons about CWL is available in Russian as part of the Управление вычислениями (Computation Management) free online course.
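As a rough illustration of what a command line tool description contains, the sketch below builds a minimal `CommandLineTool` document as a Python dictionary and serializes it to JSON, which is also valid YAML. The wrapped `echo` command, the `message` input name, and the helper names `make_echo_tool` and `to_document` are illustrative assumptions; the user guide and the CWL Standards remain the authoritative references for the fields.

```python
import json


def make_echo_tool(cwl_version: str = "v1.2") -> dict:
    """Return a minimal CommandLineTool description wrapping `echo`.

    The tool and its single `message` input are hypothetical examples.
    """
    return {
        "cwlVersion": cwl_version,
        "class": "CommandLineTool",
        "baseCommand": "echo",
        "inputs": {
            "message": {
                "type": "string",
                # Place the value as the first positional argument.
                "inputBinding": {"position": 1},
            }
        },
        # This sketch declares no outputs.
        "outputs": [],
    }


def to_document(tool: dict) -> str:
    """Serialize the description as JSON text."""
    return json.dumps(tool, indent=2)


if __name__ == "__main__":
    print(to_document(make_echo_tool()))
```

Saved to a file (for example a hypothetical `echo-tool.cwl`), a description like this should be usable with a CWL runner together with an input object that supplies `message`.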
Support, Community and Contributing
The recommended place to ask questions about CWL is the CWL Discourse Group. Previously we used biostars.org, where you can still read older responses.
If you are interested in learning more or contributing ideas or code, come chat with us on Gitter, check out #CommonWL on Twitter, join the common-workflow-language mailing list on Google Groups, or fork the repository and send a pull request!
Besides the web interface for the mailing list, one can also join by sending a blank email to common-workflow-language+subscribe@googlegroups.com and replying to the automated message.
Code of Conduct
The CWL Project is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, age, race, or religion. We do not tolerate harassment of participants in any form. This code of conduct applies to all CWL Project spaces, both online and off, including the Google Group, the Gitter chat room, and the Google Hangouts chats. Anyone who violates this code of conduct may be sanctioned or expelled from these spaces at the discretion of the leadership team.
For more details, see our Code of Conduct.
Specification
For developers and advanced users, the current CWL Standards v1.2.0 provides authoritative documentation of the execution of CWL documents. The previous versions, CWL Standards v1.0.2 and CWL Standards v1.1.0, are also available.
Citation
To reference the Common Workflow Language and the CWL project in scholarly work, please use the following citation:
Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, The CWL Community (2021):
Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language.
Communications of the ACM. https://doi.org/10.1145/3486897 Retrieved from arXiv 2105.07028 [cs.DC] https://arxiv.org/abs/2105.07028
To reference the CWL specification in scholarly work, please use the following citation inclusive of the DOI:
Peter Amstutz, Michael R. Crusoe, Nebojša Tijanić (editors), Brad Chapman, John Chilton, Michael Heuer, Andrey Kartashov, Dan Leehr, Hervé Ménager, Maya Nedeljkovich, Matt Scales, Stian Soiland-Reyes, Luka Stojanovic (2016):
Common Workflow Language, v1.0.
Specification, Common Workflow Language working group. https://w3id.org/cwl/v1.0/ https://doi.org/10.6084/m9.figshare.3115156.v2
A collection of existing references to CWL can be found at https://zotero.org/groups/cwl
Implementations
In Production:
| Software | Description | Platform support |
|---|---|---|
| cwltool | Reference implementation of CWL | Linux, OS X, Windows, local execution only |
| Arvados | Distributed computing platform for data analysis on massive data sets. Using CWL on Arvados | AWS, GCP, Azure, Slurm |
| Toil | Toil is a workflow engine entirely written in Python. | AWS, Azure, GCP, Grid Engine, HTCondor, LSF, Mesos, OpenStack, Slurm, PBS/Torque |
| CWL-Airflow | Package to run CWL workflows in Apache-Airflow (supported by BioWardrobe Team, CCHMC) | Linux, OS X |
| REANA | REusable ANAlyses | Kubernetes, CERN OpenStack OpenStack Magnum |
Partial Implementations:
| Software | Description | Self-Reported Compliance | Platform support |
|---|---|---|---|
| ep3 | Extremely Pluggable Pipeline Processor | CWL v1.0 | local |
| xenonflow | Run CWL workflows using Xenon through a REST API. | CWL v1.0 | any Xenon backend: local, ssh, SLURM, Torque, Grid Engine |
| Galaxy | Web-based platform for data intensive biomedical research. | – | |
| cwl-tes | CWL engine backed by the GA4GH Task Execution API | | Alicloud, AWS, Google, HPC, Local, Spark, TES |
| AWE | Workflow and resource management system for bioinformatics data analysis. | – | |
| yacle | Yet Another CWL Engine | | local |
| Calrissian | CWL Engine built for Kubernetes | | Kubernetes |
| Cromwell | Cromwell workflow engine | | Google, HTCondor, Local, LSF, PBS/Torque, SGE, Slurm, TES |
| CWLEXEC | Apache 2.0 licensed CWL executor for IBM Spectrum LSF, supported by IBM for customers with valid contracts. | | IBM Spectrum LSF 10.1.0.3+ |
| Mariner | “The Gen3 Workflow Execution Service”, Apache 2.0 licensed, written in Go, also implements the GA4GH WES API | | Kubernetes |
| Pegasus | Pegasus Workflow Management System | Partial support for importing CWL workflows is under development | |
| StreamFlow | Workflow Management System for hybrid HPC-Cloud infrastructures | Full support for CWL 1.2 is currently under development | |
See also: an ongoing analysis of CWL Implementations by the BioExcel Center of Excellence.
Repositories of CWL Tools and Workflows
| Repository | Description |
|---|---|
| Common Workflow Library | Git organization for community maintenance of tools and workflows. |
| Dockstore tool registry | An open platform for sharing Docker-based tools described with the Common Workflow Language used by the GA4GH. |
| CWLviewer | A web application to view and share Common Workflow Language workflows |
| cwl-source | Git repository for collections of tools, workflows, metadata, and input parameter files. Administered by xD Bio Inc. Integrates with truwl.com |
| GitHub | Search for CWL documents using `extension:cwl cwlVersion` + <your search terms>, for example `extension:cwl cwlVersion picard`. Alternatively, search using `filetype:cwl cwlVersion` + <your search terms>, for example `filetype:cwl cwlVersion picard`. |
| Workflow Hub | A registry for scientific workflows, sponsored by EOSC Life |
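The search patterns listed in the table can be assembled programmatically. The following is a minimal sketch that only builds the query string; the function name `cwl_search_query` and its parameters are hypothetical, and it does not call any search service.

```python
def cwl_search_query(terms: str, qualifier: str = "extension") -> str:
    """Build a search string for finding CWL documents.

    `qualifier` selects between the two prefixes mentioned in the table:
    "extension" gives `extension:cwl`, "filetype" gives `filetype:cwl`.
    """
    if qualifier not in ("extension", "filetype"):
        raise ValueError("qualifier must be 'extension' or 'filetype'")
    # Collapse repeated whitespace in the user's search terms.
    cleaned = " ".join(terms.split())
    return f"{qualifier}:cwl cwlVersion {cleaned}".strip()


if __name__ == "__main__":
    print(cwl_search_query("picard"))
    print(cwl_search_query("picard", qualifier="filetype"))
```

The resulting string can be pasted into the search box of the relevant site.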
Software for working with CWL
Editors and viewers
| Software | Description |
|---|---|
| Rabix Composer | Graphical CWL editor |
| CWLviewer | A web application to view and share Common Workflow Language workflows |
| atom-cwl | CWL editing mode for Atom |
| vim-cwl | CWL editing mode for Vim |
| cwl-mode | CWL editing mode for Emacs (instructions english, 日本語) |
| vscode-cwl | CWL support in Visual Studio Code |
| IntelliJ CWL plugin | CWL plugin for IntelliJ and other JetBrains editors |
| bioSyntax | Includes CWL syntax highlighting for gedit |
| Rabix Benten | A language server for CWL. Provides CWL code intelligence for VS Code, vim/neovim, Emacs, Acme, IntelliJ/JetBrains, and others |
| vue-cwl | Visualizer of CWL workflows for the Vue JavaScript framework using cwl-svg |
| cwl-for-remote-container-template | A template for writing CWL documents in VSCode with the remote container extension |
| ToolJig | Web forms for building CWL Tool and Workflow descriptions |
Utilities
| Software | Description |
|---|---|
| cwldep | CWL dependency manager, for importing tool wrappers and workflows into your own project. |
| cwltest | CWL testing framework, automated testing of tools and workflows written with CWL |
| cwl2zshcomp | Generates ZSH auto completions from CWL command line tool descriptions |
| Cerise | A REST service for running CWL workflows on remote clusters |
| workflow-service | An implementation of the GA4GH Workflow Execution Service (WES) API to run CWL workflows on remote clusters |
| cwl-inspector | Tool to inspect properties of tools or workflows written in CWL |
| cwlprov-py | Command line tool and Python API to explore CWLProv Research Objects containing provenance of Common Workflow Language executions |
| cwl-utils | Example scripts using the new Python 3.6+ CWL parsing library |
| looper | Job submission engine with support for scattering whole CWL workflows or tools |
Converters and code generators
| Software | Description |
|---|---|
| cwl-upgrader | Upgrade CWL documents from draft-3 to v1.0, v1.0 to v1.1, and v1.1 to v1.2. |
| argparse2tool | Generate CWL CommandLineTool wrappers (and/or Galaxy tool descriptions) from Python programs that use argparse. Also supports the click argument parser. |
| cwl2argparse | Generate Python argparse code from CWL CommandLineTool description. |
| pypi2cwl | Automatically run argparse2cwl on any package in PyPI |
| acd2cwl | ACD (EMBOSS) to CWL generator |
| CTD converter | Common Tool Definition (CTD) to CWL converter |
| scriptcwl | Create CWL workflows by writing a simple Python script |
| cwl-to-parsl | Convert CWL to Parsl |
| Beatrice | Pipeline Assembler For CWL |
| zatsu-cwl-generator | A simple CWL document generator from given execution commands |
| Janis | A Python API that generates portable CWL and WDL workflows |
| cwl-utils | New Python 3.6+ CWL parsing library |
| ipython2cwl | A tool for converting IPython Jupyter Notebooks to CWL CommandLineTools via typing annotations |
| pegasus-cwl-converter | Work in progress tool to convert a CWL workflow into a Pegasus workflow. |
Code libraries
| Software | Description |
|---|---|
| cwltool | cwltool can be imported as a Python module and extended to create custom CWL runners |
| schema salad | Python module and tools for working with the CWL schema. |
| cwljava | Java classes for loading, modifying, and creating CWL v1.2 documents |
| CWL for R | Parse and work with CWL from R |
| buchanae/cwl | CWL document parsing and processing utilities in Go. |
| CWL for Go | - |
| CWL for Scala | CWL object model for Scala |
| cwl-proto | Reading and writing Common Workflow Language to Protocol Buffers |
| CmdParser | Reading and Writing Common Workflow Language spec files from C++ applications. Includes a Command Line Parser |
| Rcwl | Build, read, write and run CWL in R |
| tidycwl | Tidy (R) Common Workflow Language Tools and Workflows |
| cwl-rs | CWL object model for Rust |
Projects the CWL community is participating in
| Name | Details |
|---|---|
| Bio-compute objects | “a step towards evaluation and validation of bio-medical scientific computations”, CWL and researchobject.org participants are cooperating with this effort |
| European Open Science Cloud | The CWL project is signatory to the EOSC Declaration and is a core enabling standard for two of the EOSCPilot Science Demonstrations: LOFAR - Astrophysics Data and eWaterCycle & SWITCH-ON – FAIR Data for Hydrology |
| GA4GH Task Execution API | a minimal common API for submitting a single job to a remote execution endpoint. Many contributions from CWL project participants. |
| GA4GH Workflow Execution API | a minimal common API for submitting workflow requests to workflow execution systems in a standardized way. Many contributions from CWL project participants. |
| ResearchObjects.org | “an emerging approach to the publication, and exchange of scholarly information on the Web.” CWL participants and RO enthusiasts have created CWLProv, a profile for a provenance research object of a CWL workflow run. |
Contributors and Governance
Participating Organizations
- Arvados Project
- Curii
- Seven Bridges Genomics
- Galaxy Project
- Apache Taverna
- Institut Pasteur
- Wellcome Trust Sanger Institute
- University of California Santa Cruz
- Harvard T.H. Chan School of Public Health
- Cincinnati Children’s Hospital Medical Center
- Broad Institute
- University of Melbourne Center for Cancer Research
- Netherlands eScience Center
- Texas Advanced Computing Center Life Science Computing Group / Agave Platform
- CyVerse
- Institute for Systems Biology
- ELIXIR Europe
- BioExcel CoE
- BD2K
- EMBL Australia Bioinformatics Resource
- IBM Spectrum Computing
- DNAnexus
- CERN
Individual Contributors
(Alphabetical)
- Peter Amstutz, Curii / Arvados Project; https://orcid.org/0000-0003-3566-7705
- Robin Andeer; https://orcid.org/0000-0003-1132-5305
- Brad Chapman; https://orcid.org/0000-0002-3026-1856
- John Chilton, Pennsylvania State University / Galaxy Project; https://orcid.org/0000-0002-6794-0756
- Michael R. Crusoe, CWL Project Lead; https://orcid.org/0000-0002-2961-9670
- Roman Valls Guimerà; https://orcid.org/0000-0002-0034-9697
- Guillermo Carrasco Hernandez guille.ch.88@gmail.com
- Kenzo-Hugo Hillion; https://orcid.org/0000-0002-6517-6934
- Manabu Ishii, RIKEN; https://orcid.org/0000-0002-5843-4712
- Sinisa Ivkovic sinisa.ivkovic@sbgenomics.com
- Sehrish Kanwal; https://orcid.org/0000-0002-5044-4692
- Andrey Kartashov; https://orcid.org/0000-0001-9102-5681
- John Kern; https://orcid.org/0000-0001-6977-458X
- Farah Zaib Khan; https://orcid.org/0000-0002-6337-3037
- Dan Leehr; https://orcid.org/0000-0003-3221-9579
- Hervé Ménager, Institut Pasteur; https://orcid.org/0000-0002-7552-1009
- Maxim Mikheev mikhmv@biodatomics.com
- Michael Miller mdmiller53@comcast.net
- Tazro Ohta, DBCLS; http://orcid.org/0000-0003-3777-5945
- Tim Pierce twp@unchi.org
- Josh Randall; https://orcid.org/0000-0003-1540-203X
- Mark Robinson; https://orcid.org/0000-0002-8184-7507
- Janko Simonović janko.simonovic@sbgenomics.com
- Stian Soiland-Reyes, University of Manchester; https://orcid.org/0000-0001-9842-9718
- Luka Stojanovic luka.stojanovic@sbgenomics.com
- Tomoya Tanjo, NIG; https://orcid.org/0000-0002-4421-9659
- Nebojša Tijanić; https://orcid.org/0000-0001-8316-4067
- Hiromu Ochiai; @otiai10 https://orcid.org/0000-0001-6636-856X
CWL Advisors
(Alphabetical)
- Peter Amstutz, Curii / Arvados Project; https://orcid.org/0000-0003-3566-7705
- Artem Barski, Cincinnati Children’s Hospital Medical Center / University of Cincinnati College of Medicine; https://orcid.org/0000-0002-1861-5316
- John Chilton, Pennsylvania State University / Galaxy Project; https://orcid.org/0000-0002-6794-0756
- Kyle Cranmer, New York University; https://orcid.org/0000-0002-5769-7094
- Michael R. Crusoe, CWL Project Lead; https://orcid.org/0000-0002-2961-9670
- Brandi Davis Dusenbery, Seven Bridges Genomics, Inc.; https://orcid.org/0000-0001-7811-8613
- Niels Drost, Netherlands eScience Center; https://orcid.org/0000-0001-9795-7981
- Scott Edmunds, GigaScience; https://orcid.org/0000-0001-6444-1436
- Geet Duggal, DNAnexus; https://orcid.org/0000-0003-3485-359X
- Rob Finn, EMBL-EBI; https://orcid.org/0000-0001-8626-2148
- Marc Fiume, DNAstack; https://orcid.org/0000-0002-9769-375X
- Jeff Gentry, Foundation Medicine; https://orcid.org/0000-0001-5351-8442
- Kaushik Ghose, Seven Bridges Genomics, Inc; https://orcid.org/0000-0003-2933-1260
- Carole Goble, The University of Manchester; https://orcid.org/0000-0003-1219-2137
- Oliver Hofmann, University of Melbourne / bcbio-nextgen; https://orcid.org/0000-0002-7738-1513
- Hervé Ménager, Institut Pasteur; https://orcid.org/0000-0002-7552-1009
- Folker Meyer, Argonne / University of Chicago; https://orcid.org/0000-0003-1112-2284
- Anton Nekrutenko, The Pennsylvania State University / Galaxy Project; https://orcid.org/0000-0002-5987-8032
- Brian O’Connor, University of California Santa Cruz; https://orcid.org/0000-0002-7681-6415
- Tibor Simko, CERN; https://orcid.org/0000-0001-7202-5803
- Nihar Sheth, DNAnexus; https://orcid.org/0000-0003-4128-4364
- Stian Soiland-Reyes, University of Manchester; https://orcid.org/0000-0001-9842-9718
- Nebojša Tijanić, Seven Bridges; https://orcid.org/0000-0001-8316-4067
- Ward Vandewege, Curii / Arvados; https://orcid.org/0000-0002-2527-6949
- Alexander Wait Zaranek, Curii / Arvados; https://orcid.org/0000-0002-0415-9655
CWL Leadership Team
CWL is a member project of the Software Freedom Conservancy. In general, discussions about CWL should happen on open forums, but you can also contact the CWL leadership team and Conservancy directly via commonworkflowlanguage@sfconservancy.org. This address should be CC’ed on all activities of the Common Workflow Language project that relate to things other than software development and documentation, and particularly on any activities that expect to make use of Software Freedom Conservancy’s non-profit status.
To contact just the CWL leadership team, please email cwl-leadership@googlegroups.com.
The CWL leadership team consists of the following people, listed in alphabetical order by their last name:
- Peter Amstutz, Curii / Arvados Project; https://orcid.org/0000-0003-3566-7705
- John Chilton, Pennsylvania State University / Galaxy Project; https://orcid.org/0000-0002-6794-0756
- Michael R. Crusoe, CWL Project Lead; https://orcid.org/0000-0002-2961-9670
- Brandi Davis Dusenbery, Seven Bridges Genomics, Inc.; https://orcid.org/0000-0001-7811-8613
- Jeff Gentry, Foundation Medicine; https://orcid.org/0000-0001-5351-8442
- Hervé Ménager, Institut Pasteur; https://orcid.org/0000-0002-7552-1009
- Stian Soiland-Reyes, University of Manchester; https://orcid.org/0000-0001-9842-9718