Introduction
|
Common Workflow Language (CWL) is a standard for describing data analysis workflows.
We will use a bioinformatics RNA-seq analysis as an example workflow, but the training does not require in-depth knowledge of biology.
After completing this training, you should be able to begin writing workflows for your own analyses and know where to learn more.
|
Create a Workflow by Composing Tools
|
CWL documents are written using a syntax called YAML.
The key components of the workflow are: the header, the inputs, the steps, and the outputs.
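As a minimal sketch of how those four parts fit together, the document below wires a single quality-control step into a workflow. The file name `fastqc.cwl` and the parameter names `fq`, `reads_file`, `html_file`, and `qc_html` are illustrative assumptions, not names taken from the training material.

```yaml
# Header: the CWL version and the kind of document
cwlVersion: v1.2
class: Workflow
label: RNA-seq quality control (illustrative)

# Inputs: values the user supplies at run time
inputs:
  fq: File

# Steps: each step runs a tool and connects its inputs and outputs
steps:
  fastqc:
    run: fastqc.cwl        # hypothetical tool wrapper file
    in:
      reads_file: fq       # assumes the tool declares an input named reads_file
    out: [html_file]       # assumes the tool declares an output named html_file

# Outputs: results the workflow exposes once all steps finish
outputs:
  qc_html:
    type: File
    outputSource: fastqc/html_file
```

Each entry under `in` maps a tool input to a workflow input or to another step's output, and `outputSource` uses the `step/output` form to pick which step result becomes a workflow output.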
|
Running and Debugging a Workflow
|
The input parameter file is a YAML file with values for each input parameter.
A common reason a workflow step fails is insufficient RAM.
Use ResourceRequirement to set the amount of RAM to be allocated to the job.
Output parameter values are printed as JSON to standard output at the end of the run.
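An input parameter file for a workflow with one `File` input might look like the sketch below. The input name `fq` and the path are placeholders chosen for illustration.

```yaml
# Input parameter file: one key per workflow input
fq:
  class: File
  location: reads/sample1.fastq   # placeholder path, replace with a real file
```

If a step runs out of memory, a `ResourceRequirement` can be added to the tool or to the workflow step. The fragment below is a sketch with an arbitrary value; `ramMin` is expressed in mebibytes.

```yaml
# Fragment to place in a CommandLineTool or on a workflow step
requirements:
  ResourceRequirement:
    ramMin: 8192   # request at least 8192 MiB of RAM (illustrative value)
```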
|
Writing a Tool Wrapper
|
The key components of a command line tool wrapper are the header, inputs, baseCommand, arguments, and outputs.
Like workflows, CommandLineTools have inputs and outputs.
Use baseCommand and arguments to provide the program to run and the command line arguments to run it with.
Use glob to capture output files and assign them to output parameters.
Use DockerRequirement to supply the name of the Docker image that contains the software to run.
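The points above can be sketched as a single tool wrapper. This example assumes the wrapped program is FastQC, that it writes a report ending in `_fastqc.html`, and that a container image exists under the hypothetical name shown; adjust all three for a real tool.

```yaml
cwlVersion: v1.2
class: CommandLineTool

inputs:
  reads_file:
    type: File
    inputBinding:
      position: 1          # placed after the fixed arguments on the command line

# Program to run, followed by arguments that do not depend on user inputs
baseCommand: fastqc
arguments:
  - "--outdir"
  - $(runtime.outdir)      # write results into the designated output directory

outputs:
  html_file:
    type: File
    outputBinding:
      glob: "*_fastqc.html"   # assumed report naming pattern

hints:
  DockerRequirement:
    dockerPull: example/fastqc:latest   # hypothetical image name
```

Listing `DockerRequirement` under `hints` lets a runner fall back to a local installation, while listing it under `requirements` makes the container mandatory.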
|
Analyzing Multiple Samples
|
Separate the part of the workflow that you want to run multiple times into a subworkflow.
Use a scatter step to run the subworkflow over a list of inputs.
The result of a scatter is an array, which can be used in a combine step to get a single result.
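A scatter followed by a combine step could be sketched as follows. The files `per-sample.cwl` and `combine.cwl`, and all parameter names, are hypothetical stand-ins for the subworkflow and the combining tool.

```yaml
cwlVersion: v1.2
class: Workflow

requirements:
  ScatterFeatureRequirement: {}        # needed to use scatter
  SubworkflowFeatureRequirement: {}    # needed to run a workflow as a step

inputs:
  fq_files: File[]

steps:
  per_sample:
    run: per-sample.cwl    # hypothetical subworkflow that processes one sample
    scatter: fq            # run once for each element of fq_files
    in:
      fq: fq_files
    out: [counts]          # becomes an array, one entry per sample

  combine:
    run: combine.cwl       # hypothetical tool that accepts an array of files
    in:
      count_files: per_sample/counts
    out: [combined]

outputs:
  combined_counts:
    type: File
    outputSource: combine/combined
```

The subworkflow declares `fq` as a single `File`; scattering is what turns the `File[]` workflow input into one run per element.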
|
Dynamic Workflow Behavior
|
CWL expressions allow you to use custom logic to determine input parameter values.
CWL ExpressionTool can be used to reshape data, such as declaring directories that should contain output files.
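As one illustration of reshaping data, the ExpressionTool sketched below gathers a list of files into a named directory. The input names `files` and `dir_name` and the output name `dir` are assumptions made for this example.

```yaml
cwlVersion: v1.2
class: ExpressionTool

requirements:
  InlineJavascriptRequirement: {}   # needed for the ${ ... } JavaScript block

inputs:
  files: File[]
  dir_name: string

outputs:
  dir: Directory

# Return an object whose keys match the declared output names
expression: |
  ${
    return {
      "dir": {
        "class": "Directory",
        "basename": inputs.dir_name,
        "listing": inputs.files
      }
    };
  }
```

No external program runs here; the runner evaluates the expression itself and uses the returned object as the step's outputs.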
|
Resources for further learning
|
|
Supplement: Creating Docker Images for Workflows
|
Docker images contain the initial state of the filesystem for a container.
Docker images are made up of layers.
Dockerfiles consist of a series of commands to install software into the container.
|