WEBVTT

00:00:00.000 --> 00:00:03.629
In many scientific fields such as
bioinformatics, medical imaging, and

00:00:03.629 --> 00:00:06.299
astronomy
large quantities of data need to be

00:00:06.299 --> 00:00:10.769
analyzed. This can involve large-scale
and repetitive processes in long

00:00:10.769 --> 00:00:15.240
pipelines of different tools — referred to
as workflows. It can be very

00:00:15.240 --> 00:00:18.930
time-consuming to run data through all these
different tools by hand and convert

00:00:18.930 --> 00:00:21.949
outputs to various formats to make them
compatible with the next step.

00:00:21.949 --> 00:00:26.130
Workflow management systems are designed
to alleviate this problem by allowing

00:00:26.130 --> 00:00:30.090
these workflows to be expressed formally
and providing infrastructure to set up,

00:00:30.090 --> 00:00:34.290
execute, and monitor them. This formal
expression of workflows allows for

00:00:34.290 --> 00:00:38.399
scientists to easily share and reuse
them. Crucially, they can also be used to

00:00:38.399 --> 00:00:43.110
verify results of computation for
published work. However, there are many

00:00:43.110 --> 00:00:46.579
competing ways of describing
workflows, which is a barrier to this aim.

00:00:46.579 --> 00:00:50.399
Currently there are over a hundred
different data analysis workflow systems

00:00:50.399 --> 00:00:55.020
with no interoperability between them.
The need has arisen to have a single

00:00:55.020 --> 00:00:58.469
common standard and so the "Common
Workflow Language" project was created: an

00:00:58.469 --> 00:01:02.430
open standard designed to express
workflows and their tooling in groups of

00:01:02.430 --> 00:01:05.539
YAML-structured text files.
