I wrote the same pipeline in Nextflow, Snakemake, and WDL

In March I wrote the same bioinformatics pipeline three times in a row, once each in Nextflow, Snakemake, and WDL, and by the third one my coworkers had started asking if I was okay. There was a reason. I was building a workflow execution service that has to run all three, and I was sick of debugging it against giant production pipelines where a failure could mean anything from my bug to a bad reference genome. I needed small, known-good pipelines I understood down to the last line, in every language the service claims to support. The repos are public (simple-nf, simple-smk, simple-wdl).

The workflow is deliberately boring: FastQC on the raw reads, trimming with Trimmomatic, alignment with BWA-MEM, sorting and indexing with samtools, variant calling with bcftools. Five steps: four in a chain, each eating the previous one's output, plus FastQC branching off on its own. Small enough to hold in your head, big enough to make each language actually resolve a dependency graph. Writing it three times taught me more than the service ever did, because you don't really know a language's execution model until the same five steps behave three different ways under your hands.

Nextflow: channels and dataflow

Nextflow thinks in channels. Each process declares input and output channels and you wire them together like plumbing. TRIM emits a channel of trimmed FASTQ pairs, ALIGN drinks from that channel and emits a channel of BAMs, and on down the line. Once it clicked I liked it a lot. Getting it to click is the part nobody warns you about.

workflow {
    reads_ch = Channel.fromFilePairs(params.reads)
    FASTQC(reads_ch)
    trimmed = TRIM(reads_ch)
    aligned = ALIGN(trimmed, params.reference)
    sorted  = SORT(aligned)
    CALL(sorted, params.reference)
}

Parallelism just falls out. Emit 10 sample pairs into reads_ch and the runtime forks 10 independent paths without you writing a single line of parallel code. Each process gets its own working directory under work/, with inputs staged in as symlinks. That isolation is great for reproducibility, but it has a price: you cannot share intermediate state between steps unless you route it through a channel on purpose. There's no sneaking a file across the fence.

Here's what actually cost me an afternoon. In DSL1, Nextflow's queue channels are consumed on read. To feed reads_ch into both FASTQC and TRIM you have to split it explicitly with .into { fastqc_ch; trim_ch }. DSL2 forks the channel implicitly, so you just reference it twice. In DSL1, referencing a channel a second time silently empties it. No error. No warning. The second process just gets nothing and you sit there wondering why FastQC never ran. It's a bug I never hit in Snakemake or WDL, because it isn't a logic error, it's the dataflow model biting you for using it wrong.
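The consume-on-read behavior is easier to see outside Nextflow. What follows is a minimal Python analogy, not how Nextflow is implemented: a generator stands in for a DSL1 queue channel and itertools.tee stands in for .into. The names reads_channel, fastqc, and trim are made up for illustration.

```python
import itertools


def reads_channel():
    """Stand-in for a queue channel: yields each sample pair exactly once."""
    yield ("sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz")
    yield ("sampleB_R1.fastq.gz", "sampleB_R2.fastq.gz")


def fastqc(channel):
    """Hypothetical consumer: drains the channel and reports what it saw."""
    return [f"fastqc {r1} {r2}" for r1, r2 in channel]


def trim(channel):
    """Hypothetical consumer: drains the channel and reports what it saw."""
    return [f"trim {r1} {r2}" for r1, r2 in channel]


# Naive wiring: both consumers share one channel, so the first one drains it.
shared = reads_channel()
naive_trim = trim(shared)
naive_fastqc = fastqc(shared)  # empty list, and nothing raises

# Explicit fork, analogous to .into { fastqc_ch; trim_ch }.
fastqc_ch, trim_ch = itertools.tee(reads_channel(), 2)
forked_trim = trim(trim_ch)
forked_fastqc = fastqc(fastqc_ch)

print(len(naive_trim), len(naive_fastqc))    # 2 0
print(len(forked_trim), len(forked_fastqc))  # 2 2
```

The second consumer in the naive wiring fails quietly in exactly the way described: it runs, sees an empty stream, and returns.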

Error handling is per-process through errorStrategy. Set 'retry' with maxRetries and Nextflow re-runs the failed process in a fresh work directory. This matters more than it looks, because bioinformatics tools fail for boring transient reasons (a network filesystem hiccup, a memory spike) far more often than for real bugs in your code. A blanket retry buys back a lot of failed 8-hour runs.
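As a rough picture of what retry-in-a-fresh-directory means, here is a small Python sketch. It is an illustration of the idea only, not Nextflow's code: run_with_retry and flaky_align are hypothetical names, and max_retries is assumed to count re-submissions after the first attempt.

```python
import tempfile
from pathlib import Path


def run_with_retry(task, max_retries=2):
    """Run task(work_dir) in a fresh directory per attempt; re-raise after the last one."""
    attempts = []
    for attempt in range(max_retries + 1):
        # Each attempt gets its own directory, so a half-written file
        # from a failed attempt can never leak into the next one.
        work_dir = Path(tempfile.mkdtemp(prefix="work_"))
        attempts.append(work_dir)
        try:
            return task(work_dir), attempts
        except RuntimeError:
            if attempt == max_retries:
                raise


calls = {"n": 0}


def flaky_align(work_dir):
    """Hypothetical task: fails once (a stand-in for a filesystem hiccup), then succeeds."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient I/O error")
    out = work_dir / "aligned.bam"
    out.write_text("fake bam")
    return out


result, attempts = run_with_retry(flaky_align)
print(result.name, len(attempts))  # aligned.bam 2
```

The failed attempt's directory is left on disk, which is also what makes post-mortem debugging of a retried task possible.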

Snakemake: rules and filename DAGs

Snakemake flips the whole thing around. You declare rules with input and output file patterns, and the engine works the DAG backwards from whatever output you asked for. Coming straight from Nextflow's channels, this rewired my brain for a day.

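rule trim:
    input:  "data/{sample}_R1.fastq.gz", "data/{sample}_R2.fastq.gz"
    # Trimmomatic PE writes four files, in this order: R1 paired, R1 unpaired, R2 paired, R2 unpaired.
    output: "trimmed/{sample}_R1.fastq.gz", "trimmed/{sample}_R1.unpaired.fastq.gz",
            "trimmed/{sample}_R2.fastq.gz", "trimmed/{sample}_R2.unpaired.fastq.gz"
    shell:  "trimmomatic PE {input} {output} ILLUMINACLIP:..."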

rule align:
    input:  "trimmed/{sample}_R1.fastq.gz", "trimmed/{sample}_R2.fastq.gz"
    output: "aligned/{sample}.bam"
    shell:  "bwa mem {config[reference]} {input} | samtools view -bS - > {output}"

You ask for aligned/sampleA.bam and Snakemake reasons its way back: to build that BAM I need trimmed FASTQs, to build those I need raw FASTQs. The entire DAG is inferred from filename patterns, which is genuinely elegant when your data is well-structured. It falls apart the moment your naming convention is messy, because the DAG is quite literally built out of filename regex matches. Let two rules produce the same output pattern and Snakemake throws an AmbiguousRuleException, and untangling it means ruleorder directives, wildcard_constraints, or reshaping your output paths. I hit this fast, and my read is that Snakemake rewards tidy filenames and punishes everything else.
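The backward walk can be sketched in a few lines of Python. This is a toy resolver under stated assumptions, not Snakemake's algorithm: RULES is a hand-written table modeled on the trim and align rules (only the paired trim outputs are included), every pattern uses a single {sample} wildcard, and a file no rule can produce is assumed to already exist on disk.

```python
import re

# Hypothetical rule table modeled on the trim and align rules.
RULES = [
    {"name": "trim",
     "outputs": ["trimmed/{sample}_R1.fastq.gz", "trimmed/{sample}_R2.fastq.gz"],
     "inputs": ["data/{sample}_R1.fastq.gz", "data/{sample}_R2.fastq.gz"]},
    {"name": "align",
     "outputs": ["aligned/{sample}.bam"],
     "inputs": ["trimmed/{sample}_R1.fastq.gz", "trimmed/{sample}_R2.fastq.gz"]},
]


def pattern_to_regex(pattern):
    """Turn 'aligned/{sample}.bam' into a regex with a named group per wildcard."""
    parts = re.split(r"\{(\w+)\}", pattern)
    # Even indices are literal text, odd indices are wildcard names.
    regex = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


def producers(target):
    """Return (rule, wildcards) for every rule with an output pattern matching target."""
    found = []
    for rule in RULES:
        for out in rule["outputs"]:
            match = pattern_to_regex(out).match(target)
            if match:
                found.append((rule, match.groupdict()))
                break
    return found


def plan(target, jobs=None):
    """Walk backwards from target, collecting jobs in execution order."""
    jobs = [] if jobs is None else jobs
    matches = producers(target)
    if not matches:
        return jobs  # no rule makes it, so treat it as a raw input on disk
    if len(matches) > 1:
        raise ValueError(f"ambiguous rules for {target}")
    rule, wildcards = matches[0]
    for inp in rule["inputs"]:
        plan(inp.format(**wildcards), jobs)
    job = (rule["name"], wildcards["sample"])
    if job not in jobs:
        jobs.append(job)
    return jobs


print(plan("aligned/sampleA.bam"))
# [('trim', 'sampleA'), ('align', 'sampleA')]
```

Add a second rule whose output pattern also matches aligned/sampleA.bam and plan raises, which is the toy version of the ambiguity error.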

By default Snakemake runs right in the working directory, with no sandboxed per-process dirs like Nextflow's, so intermediate files from different rules all see each other on disk. That cuts both ways. Debugging is easy, just ls the output directory and poke around, but you have to watch for rules quietly reading stale files from a previous run. --forceall re-runs everything, though in practice I gave up and wrote a clean rule that wipes the output directories, because I got burned by stale output exactly once and didn't want it to happen twice.

Parallelism is --cores N: Snakemake figures out which jobs are independent and runs as many at once as fit in N cores. Swap in --cluster "sbatch ..." (from Snakemake 8 on, an executor plugin such as --executor slurm) and local execution becomes job submission, which is the real reason Snakemake owns HPC. If your world is Slurm, it fits the way nothing else does.

WDL: tasks, types, and scatter-gather

WDL was the one that felt like coming home. It reads the most like writing functions in an ordinary programming language. You define a task with typed inputs, a command block that's really a bash template, typed outputs, and a runtime section. After the dataflow puzzles and the filename regexes, the plainness was a relief.

task align {
    input {
        File r1
        File r2
        File reference
        File reference_idx
    }
    command <<<
        bwa mem ~{reference} ~{r1} ~{r2} | samtools view -bS - > aligned.bam
    >>>
    output {
        File bam = "aligned.bam"
    }
    runtime {
        # Assumes the image ships samtools as well as bwa, since the command pipes into samtools.
        docker: "biocontainers/bwa:0.7.17"
        memory: "8 GB"
        cpu: 4
    }
}

Tasks compose into a workflow block, and scatter-gather is explicit: scatter (sample in samples) { call align { input: ... } } gives you back an Array[File] of BAMs. Nothing parallelizes because of how the data happens to be shaped. You say what runs wide, and I came to prefer that clarity over Nextflow's magic after a few rounds of guessing what Nextflow was going to fork.
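For readers who think in Python, explicit scatter-gather looks roughly like this. It is a sketch of the pattern rather than of any WDL engine: align here is a hypothetical stand-in that only returns the path it would write, and scatter is a made-up helper.

```python
from concurrent.futures import ThreadPoolExecutor


def align(sample):
    """Hypothetical stand-in for the align task: returns the BAM path it would write."""
    return f"aligned/{sample}.bam"


def scatter(items, task, max_workers=4):
    """Run task once per item in parallel and gather the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, items))


samples = ["sampleA", "sampleB", "sampleC"]
bams = scatter(samples, align)
print(bams)  # ['aligned/sampleA.bam', 'aligned/sampleB.bam', 'aligned/sampleC.bam']
```

The point of the shape is that the fan-out is a visible call in the workflow body, and the gathered list lines up index for index with the input list.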

The type system is the real reason to care about WDL. File, String, Int, Float, Boolean, Array[X], Map[K, V], Pair[A, B], Object (largely superseded by structs in newer versions of the spec), and optionals like File?. Cromwell's womtool validate and miniwdl check typecheck a workflow before a single task runs, so passing an Array[File] where a File belongs gets caught up front. For the service I was building, which takes workflows from people I don't control, that's not a nicety, it's the feature. I can bounce a malformed WDL at submission time with an error that actually says what's wrong, instead of letting it fail ten minutes deep and handing the user a stack trace.
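To make "bounce it at submission time" concrete, here is a toy check in Python. It is far narrower than what real WDL validators do (they typecheck the workflow source itself); this sketch only compares submitted input values against a hand-written table. ALIGN_INPUTS, conforms, and validate are hypothetical names, and only File, String, Int, Array[X], and optionals are handled.

```python
# Hypothetical declaration table for the align task: input name -> WDL-style type.
ALIGN_INPUTS = {"r1": "File", "r2": "File", "reference": "File", "reference_idx": "File"}


def conforms(value, wdl_type):
    """Check one value against a small subset of WDL types."""
    if wdl_type.endswith("?"):
        return value is None or conforms(value, wdl_type[:-1])
    if wdl_type.startswith("Array[") and wdl_type.endswith("]"):
        inner = wdl_type[len("Array["):-1]
        return isinstance(value, list) and all(conforms(v, inner) for v in value)
    if wdl_type in ("File", "String"):
        # Assumption: files arrive as path strings, as in an inputs JSON.
        return isinstance(value, str)
    if wdl_type == "Int":
        return isinstance(value, int) and not isinstance(value, bool)
    raise ValueError(f"unsupported type: {wdl_type}")


def validate(inputs, declared):
    """Return human-readable errors; an empty list means the submission is accepted."""
    errors = []
    for name, wdl_type in declared.items():
        if name not in inputs:
            if not wdl_type.endswith("?"):
                errors.append(f"missing required input '{name}' ({wdl_type})")
        elif not conforms(inputs[name], wdl_type):
            errors.append(
                f"input '{name}' expects {wdl_type}, got {type(inputs[name]).__name__}"
            )
    for name in sorted(inputs.keys() - declared.keys()):
        errors.append(f"unknown input '{name}'")
    return errors


# An Array where a File belongs, plus a missing index file.
bad = {"r1": ["a_R1.fq.gz", "b_R1.fq.gz"], "r2": "a_R2.fq.gz", "reference": "ref.fa"}
for error in validate(bad, ALIGN_INPUTS):
    print(error)
# input 'r1' expects File, got list
# missing required input 'reference_idx' (File)
```

Both problems are reported before anything runs, and each message names the input and the type it wanted.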

Where WDL loses is ecosystem, and it isn't close. Nextflow has nf-core, hundreds of curated pipelines. Snakemake has its Workflow Catalog. WDL has BioWDL and the Broad's own workflows and not much past that. If you're starting fresh and your tool already ships as an nf-core module, be honest with yourself and use Nextflow. I would.

What I took away

There is no "best" here, and I stopped looking for one. Nextflow has the biggest community and the deepest ecosystem. Snakemake has the gentlest on-ramp and is a joy on a single machine. WDL has the cleanest spec and the only pre-execution validation story I'd trust with strangers' input. Three tools, three genuinely different bets.

Which settled the question I actually started with. The execution service should not have a favorite. Each language bakes in an execution model, and the researcher who picked it picked it for reasons rooted in their domain, not mine. My job is to run whatever lands in the queue, correctly, without an opinion. Writing the same five steps three times is what finally let me write adapters that don't fight the language underneath them. It looked unhinged from the outside. It was the cheapest tuition I've paid in a while.