
Bulk Sequencing

Logo | Title | Description | Data inputs

STAR Generate Genome Index capsule

Generates the index files needed to run STAR RNA alignment.

  • Genome DNA .fasta

  • Genome gene annotation .gtf/.gff
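
As a rough illustration of what an index-generation step involves, the sketch below assembles a STAR `genomeGenerate` command line from the two inputs listed above. The file names, thread count, and read length are placeholders, and nothing here is taken from the capsule itself. The command is only built and printed, not executed.

```python
import shlex

def star_index_command(genome_fasta, annotation_gtf, index_dir,
                       read_length=100, threads=8):
    """Build (but do not run) a STAR genomeGenerate command line."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", index_dir,
        "--genomeFastaFiles", genome_fasta,
        "--sjdbGTFfile", annotation_gtf,
        # The STAR manual suggests read length minus 1 for this option.
        "--sjdbOverhang", str(read_length - 1),
    ]

# Placeholder paths, for illustration only.
cmd = star_index_command("genome.fa", "genes.gtf", "star_index")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where STAR is installed.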

STAR Alignment

RNA-Seq alignment. STAR addresses many of the challenges of RNA-seq read mapping by supporting spliced alignments, so reads that span exon-exon junctions can still be aligned to the genomic DNA reference.

  • Short/long read .fastq

  • STAR Index

Salmon Preparing Transcriptome Indices for Mapping-Based Mode

Generates the index files needed to run Salmon in mapping-based mode from a transcriptome (RNA transcript) FASTA file and a genome (DNA) FASTA file.

  • Genome DNA .fasta

  • Transcripts RNA .fasta
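
Taking both a genome and a transcript FASTA is consistent with the decoy-aware index described in the Salmon documentation, where the genome sequences are appended to the transcriptome and their names are listed in a decoys file. Whether this capsule builds its index that way is an assumption. The sketch below shows only the decoy-name step, using a small in-memory FASTA as stand-in data.

```python
import io

def decoy_names(genome_fasta_handle):
    """Return the sequence names (first word of each header) in a genome FASTA."""
    names = []
    for line in genome_fasta_handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names

# Stand-in genome with two sequences; a real run would open the genome FASTA.
example_genome = io.StringIO(">chr1 primary assembly\nACGT\n>chr2\nGGCC\n")
print("\n".join(decoy_names(example_genome)))
```

The names would be written one per line to a decoys file and passed to `salmon index` with `-d`, alongside the concatenated transcript-plus-genome FASTA given with `-t`.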

Salmon: mapping-based quantification

RNA-Seq quantification. Salmon is designed for speed and is geared towards quantifying transcript abundance rather than producing precise read alignments.

  • Short/long read .fastq

  • Salmon Index

BWA Generate Genome Index

Generates the index files needed to run BWA DNA alignment from a DNA FASTA file.

  • Genome DNA .fasta


BWA is a software package for mapping sequences against a large reference genome, such as the human genome.

  • Short/long read .fastq (designed for short reads)

  • BWA Index
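
For orientation, a minimal sketch of how the two inputs above map onto a `bwa mem` invocation. The paths and thread count are placeholders, and the capsule's actual parameters are not known from this page. `bwa mem` takes the index prefix (normally the FASTA path that was given to `bwa index`) followed by one or two read files, and writes SAM to standard output.

```python
def bwa_mem_command(index_prefix, reads_1, reads_2=None, threads=8):
    """Build (but do not run) a bwa mem command for single or paired reads."""
    cmd = ["bwa", "mem", "-t", str(threads), index_prefix, reads_1]
    if reads_2 is not None:
        # A second FASTQ file switches bwa mem to paired-end mode.
        cmd.append(reads_2)
    return cmd

# Placeholder paths, for illustration only.
print(" ".join(bwa_mem_command("genome.fa", "sample_R1.fastq", "sample_R2.fastq")))
```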

Bowtie2 Generate Genome Index

Generates the index files needed to run Bowtie2 DNA alignment from a DNA FASTA file.

  • Genome DNA .fasta


Bowtie2 is a software package for mapping sequences against a large reference genome, such as the human genome.

  • Short/long read .fastq (designed for short reads)

  • Bowtie2 Index

Single Cell

Logo | Title | Description | Data inputs

STAR-Solo Alignment

STAR-Solo analyzes droplet-based single-cell RNA sequencing data, for example from the 10X Genomics Chromium System. It is intended to be a drop-in replacement for CellRanger from 10X.

  • Single cell RNA-seq .fastq

  • STAR Index
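
A minimal sketch of a STARsolo-style command for droplet data, assuming a 10X-like layout with a 16-base cell barcode followed by the UMI in the barcode read. The paths, whitelist file, and UMI length are placeholders and should be checked against the chemistry actually used. The command is only assembled, not run.

```python
def starsolo_command(index_dir, cdna_fastq, barcode_fastq, whitelist,
                     umi_length=12, threads=8):
    """Build (but do not run) a STAR command with STARsolo options enabled."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", index_dir,
        # STARsolo expects the cDNA read first and the barcode read second.
        "--readFilesIn", cdna_fastq, barcode_fastq,
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,
        "--soloCBstart", "1",
        "--soloCBlen", "16",
        "--soloUMIstart", "17",
        "--soloUMIlen", str(umi_length),
        "--soloFeatures", "Gene",
    ]
    if cdna_fastq.endswith(".gz"):
        # Gzipped input needs a decompression command.
        cmd += ["--readFilesCommand", "zcat"]
    return cmd

# Placeholder paths, for illustration only.
print(" ".join(starsolo_command("star_index", "cdna_R2.fastq.gz",
                                "barcode_R1.fastq.gz", "whitelist.txt")))
```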

RShiny Cell

ShinyCell is an R package that allows users to create interactive Shiny-based web applications to visualize single-cell data.

  • Single cell .rds inputs from Seurat (see README)

1-3. Single Cell Analysis Tutorial (Scanpy & Seurat)

Tutorials on working with single-cell data in Scanpy and Seurat:

1. Preprocessing and clustering 3k PBMCs

2. Core Plotting Functions

3. How to preprocess UMI count data with analytic Pearson residuals

  • Tutorial datasets (see README for details)
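
To give a feel for the third tutorial's topic, here is a dependency-free sketch of analytic Pearson residuals for a small count matrix. Each count is compared with its expected value under a negative binomial null model and scaled by the model's standard deviation. The overdispersion `theta=100` and clipping at the square root of the number of cells are commonly used defaults and are assumptions here; the tutorial itself relies on Scanpy rather than hand-written code like this.

```python
import math

def pearson_residuals(counts, theta=100.0):
    """Analytic Pearson residuals for a cells x genes matrix of UMI counts."""
    n_cells = len(counts)
    row_sums = [sum(row) for row in counts]          # total counts per cell
    col_sums = [sum(col) for col in zip(*counts)]    # total counts per gene
    total = sum(row_sums)
    if total == 0:
        return [[0.0 for _ in row] for row in counts]
    clip = math.sqrt(n_cells)
    residuals = []
    for row, row_sum in zip(counts, row_sums):
        out = []
        for x, col_sum in zip(row, col_sums):
            mu = row_sum * col_sum / total           # expected count
            if mu == 0:
                out.append(0.0)
                continue
            z = (x - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, z)))     # clip extreme values
        residuals.append(out)
    return residuals

# Toy matrix: 2 cells x 2 genes.
print(pearson_residuals([[1, 0], [0, 1]]))
```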

4. Single Cell Tutorial: Seurat to AnnData (Scanpy)

Tutorial demonstrating an example of how a Seurat object can easily be converted to AnnData (Scanpy).

  • Tutorial datasets (see README for details)

5-6. Single Cell Analysis Tutorial (Scanpy)

Tutorials demonstrating how to regress out cell cycle effects and how to simulate data using a literature-curated Boolean gene regulatory network.

  • Tutorial datasets (see README for details)

7-10. Single Cell Analysis Tutorial (Scanpy) Advanced

Tutorials for advanced Single Cell processing.

  • Tutorial datasets (see README for details)


Logo | Title | Description | Data inputs

Download data from BaseSpace

Download demultiplexed (fastq.gz) or raw (bcl) Illumina sequencing data through the Illumina BaseSpace CLI. This capsule requires a BaseSpace account and NGS data owned by or shared with the user.

  • None

Sambamba Filtering (Duplicates, Multimappers, Unaligned)

Remove optical and PCR duplicates from Illumina data using the software tool Sambamba. Sambamba is intended to be a faster drop-in replacement for Picard MarkDuplicates.

  • .bam alignment files.
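
The filtering named in the title can be expressed with Sambamba's filter expressions. The sketch below is one possible formulation and not the capsule's own command. It assumes duplicates have already been marked (for example with `sambamba markdup`) and uses a mapping-quality cutoff as a stand-in for removing multimappers; the threshold and paths are placeholders.

```python
def sambamba_filter_command(in_bam, out_bam, min_mapq=1, threads=4):
    """Build (but do not run) a sambamba view command that drops unaligned,
    duplicate, and low mapping-quality reads."""
    expression = (
        f"not (unmapped or duplicate) and mapping_quality >= {min_mapq}"
    )
    return [
        "sambamba", "view",
        "-t", str(threads),
        "-f", "bam",          # write BAM rather than SAM
        "-F", expression,     # filter expression applied to each read
        "-o", out_bam,
        in_bam,
    ]

# Placeholder paths, for illustration only.
print(sambamba_filter_command("aligned.bam", "filtered.bam"))
```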

Sambamba Sort and Index

Sort and index alignments from Illumina data using the software tool Sambamba. Sambamba is intended to be a faster drop-in replacement for samtools.

  • .bam alignment files.

Trim Galore

Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data.

  • .fastq files


A tool designed to provide fast all-in-one preprocessing for FastQ files (adapter trimming, downsampling, etc.). It is written in C++ and supports multithreading for high performance.

  • .fastq files


Title | Description | Input data

MACS PeakCalling

MACS3 is a peak calling tool generally used on ChIP-seq data to identify transcription factor binding sites.

  • .bam alignment files

  • compare_sheet.csv (see README)


This capsule runs featureCounts from the Subread package to generate an expression matrix.

  • Gene annotation .gtf file

  • .bam alignments
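
As a sketch of how the resulting expression matrix might be consumed downstream, the function below reads featureCounts-style tabular output: a leading comment line, a header whose first six columns describe each gene, then one count column per .bam file. The example rows are made up for illustration, and integer counts are assumed (fractional counting modes would need `float`).

```python
import csv
import io

def read_featurecounts(handle):
    """Parse featureCounts-style output into (samples, {gene: {sample: count}})."""
    rows = csv.reader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
    )
    header = next(rows)
    samples = header[6:]   # columns after Geneid, Chr, Start, End, Strand, Length
    matrix = {}
    for fields in rows:
        if not fields:
            continue
        matrix[fields[0]] = dict(zip(samples, map(int, fields[6:])))
    return samples, matrix

# Made-up example data in the expected layout.
example = io.StringIO(
    "# Program:featureCounts; Command:(omitted)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ta.bam\tb.bam\n"
    "geneA\tchr1\t1\t100\t+\t100\t5\t0\n"
    "geneB\tchr1\t200\t400\t-\t201\t12\t7\n"
)
samples, matrix = read_featurecounts(example)
print(samples, matrix["geneB"])
```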


Homer includes an all-in-one program for performing peak annotation. This capsule uses it to annotate *.bed coordinates with gene features.

  • .bed files containing peaks

  • Genome reference .fasta

  • Gene annotation .gtf file.

Gene Enrichment Analysis (GEA)

This capsule provides a Streamlit application for gene enrichment analysis. The results are sourced from two widely used platforms, g:Profiler and PANTHER.

  • File containing gene names
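
For readers unfamiliar with the statistic behind this kind of analysis, over-representation is commonly scored with a hypergeometric test: how likely is it to see at least this many genes from a term in the query list by chance? The sketch below illustrates that calculation only. That the capsule's backends compute exactly this, and the numbers in the example, are assumptions.

```python
from math import comb

def enrichment_p_value(hits, query_size, term_size, background_size):
    """P(X >= hits) when query_size genes are drawn from a background of
    background_size genes, term_size of which are annotated to the term."""
    max_hits = min(query_size, term_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total

# Illustrative numbers: 2 of 2 query genes fall in a 5-gene term,
# with a background of 10 genes.
print(enrichment_p_value(hits=2, query_size=2, term_size=5, background_size=10))
```

In practice the p-values for many terms are then corrected for multiple testing, which this sketch leaves out.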

GATK RNAseq short variant discovery (SNPs + Indels)

Based on the GATK RNAseq short variant discovery pipeline. Takes in alignments and outputs a .vcf file containing SNPs and indels.

  • .bam RNA alignments

Delly somatic complete analysis

Structural variant (SV) prediction to discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data of somatic cells.

  • Genome reference .fasta

  • .bam DNA alignment files

Delly germline complete analysis

Structural variant (SV) prediction to discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data of germline cells.

  • Genome reference .fasta

  • .bam DNA alignment files


ART is a set of simulation tools to generate synthetic next-generation sequencing reads.

  • .fasta containing the sequence to simulate reads from

PySpark and EMR Serverless

This capsule runs an example PySpark job on EMR Serverless.

  • NOAA Global Surface Summary of Day dataset