Reproducible Bioinformatics with Pixi, Nextflow & Snakemake

Author

Abu Bakar Siddique

Published

April 1, 2026

1 Session 1 — Introduction

1.1 What is reproducibility?

Ability to re-run analyses and get the same results
Requires: environments, versions, workflows, metadata
Reproducibility in bioinformatics is notoriously difficult.
Traditional Conda/Mamba environments help, but they often drift over time, break across platforms, or install different dependency versions depending on when and where they are solved.

1.2 Why Pixi?

Pixi solves these issues through:

✅ Automatic lockfiles ensuring identical environments everywhere

✅ Multi-platform resolution (Linux, macOS Intel/ARM, Windows)

✅ Local project‑scoped environments for clean reproducible analysis

✅ Task runner replacing Makefiles, bash scripts & fragile command chains

✅ Rust‑based solver → significantly faster than Conda

This makes Pixi ideal for:

Bioinformatics pipelines
Teaching environments
HPC systems
Snakemake / Nextflow workflows
Collaborative research groups

1.3 Conda vs Pixi: A Quick Comparison

Why Conda often fails

No strict version pinning
Environment drift
Slow solving
Not fully reproducible

Why Pixi fixes this

Strict lockfiles
No re-solving at install time (pixi.lock is used as written)
Fast, deterministic builds
Perfect for workflows

2 Session 2 — Pixi Basics

2.1 Install Pixi

On Linux:

# Install Pixi via the official script
curl -fsSL https://pixi.sh/install.sh | bash

# Restart your shell or source the profile to update PATH   
source ~/.bashrc  # or ~/.zshrc 

# Verify installation
pixi --version

If the version prints — you’re ready.
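If pixi is not found after installation, the usual cause is a PATH that has not been refreshed. The following is a minimal sketch for checking this, assuming the installer's default location ~/.pixi/bin; in_path is a hypothetical helper, not part of Pixi.

```shell
# in_path DIR PATHSTRING: succeed if DIR is one of the colon-separated entries.
in_path() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# ~/.pixi/bin is an assumed install location; adjust it if yours differs.
if in_path "$HOME/.pixi/bin" "$PATH"; then
  echo "pixi directory is on PATH"
else
  echo "add $HOME/.pixi/bin to PATH, then restart the shell"
fi
```

Wrapping the PATH value in colons lets the match work for the first, last, and middle entries alike.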

2.2 Create a Pixi Project

We want to analyse RNA‑seq data, so let’s create a project for that. In your terminal:

# create a new directory for your project and initialize Pixi
mkdir rnaseq-qc
# Navigate into the project directory
cd rnaseq-qc
# Initialize a new Pixi project with conda-forge and bioconda channels
pixi init --channel conda-forge --channel bioconda

This creates:

File	Purpose
pixi.toml	Human-edited project configuration

2.3 Add your platform

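pixi workspace platform add linux-64

This ensures that the environment is solved for your specific platform (older Pixi releases spell the command pixi project platform add). To share the project with collaborators on other platforms, add those too:

pixi workspace platform add osx-arm64
pixi workspace platform add osx-64

Pixi solves every dependency for every listed platform. Bioconda does not build packages for Windows, so add win-64 only to projects whose dependencies are all available there.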
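If you are unsure which platform name applies to your machine, it can be derived from uname. This is a minimal sketch; pixi_platform is a hypothetical helper (not a Pixi command) and covers only the common cases.

```shell
# pixi_platform [OS] [ARCH]: print the conda-style platform name.
# With no arguments it inspects the current machine via uname.
pixi_platform() {
  pp_os="${1:-$(uname -s)}"
  pp_arch="${2:-$(uname -m)}"
  case "$pp_os/$pp_arch" in
    Linux/x86_64)  echo "linux-64" ;;
    Linux/aarch64) echo "linux-aarch64" ;;
    Darwin/x86_64) echo "osx-64" ;;
    Darwin/arm64)  echo "osx-arm64" ;;
    *) echo "unrecognised platform: $pp_os/$pp_arch" >&2; return 1 ;;
  esac
}

# Print the name for this machine (ignore the error on other systems).
pixi_platform || true
```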

2.4 Add tools or packages

Add bioinformatics tools from conda channels:

pixi add fastqc samtools python=3.11

Add tools from PyPI:

pixi add --pypi multiqc

Pixi automatically updates the lockfile.

Install & test tools in the Pixi environment

Install tools in the environment:

# Install the environment based on pixi.toml and pixi.lock
pixi install

This creates a fully reproducible environment in .pixi/ with the exact versions of all dependencies. You can share the pixi.toml and pixi.lock files with collaborators to ensure they get the same environment.

2.4.1 Test the tools

To confirm they are installed correctly, run:

# Check versions to confirm correct installation
pixi run fastqc --version
pixi run samtools --version

2.4.2 Interactive shell:

pixi shell
# Inside the shell, you can run any command with the environment activated
fastqc --version
samtools --version

To quit the shell when done:

exit

2.5 Use the Pixi environment for data analysis

Now that you have your environment set up, you can run your bioinformatics analyses using pixi run to ensure reproducibility. But first, let’s create some sample data to work with.

Create sample data (for demo)

mkdir -p data genome_files

Create dummy FASTQ files and a reference file for testing. The same approach is useful for RNA‑seq, WGS, metagenomics, QC teaching, and alignment modules.

Create a forward-reads FASTQ file (R1):

cat << 'EOF' | gzip > data/sample_R1.fastq.gz
@READ_0001/1
ACGTTGACCTGATCGTAGGCTAATCGTAGGCTATGCTAGCTAGCA
+
IIIIIIIIHIIIHIIIIIIIIIIIIHIIIGIIIIHIIIIIIIII
@READ_0002/1
TTGACCGTAGCTAGCTAGGATCGTAGCATGATGCTAGCTAGGTCA
+
IIIIIIHIGIIHIIIIIIIIIIIIHIIIIIIGIIIIHIIIIIII
EOF

Create a reverse-reads FASTQ file (R2):

cat << 'EOF' | gzip > data/sample_R2.fastq.gz
@READ_0001/2
TGCTAGCTAGCATAGCCTACGATTAGCCTACGATCAGGTCAACGT
+
IIIIIIIIIIIIHIIIHIIHIIIIIIIHIIIIIGIIIIHIIIII
@READ_0002/2
TGACCTAGCTAGCATGCTACGATCCTAGCTAGCTAGCTACGGCAA
+
IIIIHIIIIIIIIIIIHIIIIGIIIIIIHIIIIHIIIIIIIIII
EOF
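FastQC expects the quality line of each record to be exactly as long as its sequence line, and hand-typed demo reads are easy to get wrong by a character. The sketch below checks this before any tool is run; check_fastq is a hypothetical helper, and it assumes plain four-line FASTQ records.

```shell
# check_fastq FILE.fastq.gz: print every record whose sequence and quality
# lines differ in length; exit non-zero if any such record exists.
check_fastq() {
  gzip -cd "$1" | awk '
    NR % 4 == 1 { name = $0 }              # header line
    NR % 4 == 2 { seqlen = length($0) }    # sequence line
    NR % 4 == 0 && length($0) != seqlen {  # quality line
      printf "%s: sequence %d, quality %d\n", name, seqlen, length($0)
      bad = 1
    }
    END { exit bad }
  '
}

# Check the demo files if they exist in this directory.
for f in data/sample_R1.fastq.gz data/sample_R2.fastq.gz; do
  if [ -f "$f" ]; then
    check_fastq "$f" || echo "fix the records listed above in $f"
  fi
done
```

If a record is reported, pad or trim its quality line so both lines have the same length, then recreate the file.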

Create the reference.fa file:

cat << 'EOF' > data/reference.fa
>chrDemo
ATGCGTACGTTAGCGTACGTAGCTAGCTAGGCTAGCTAGGCGTACGATCGTAGGCTAACGTTAGCGATCGTAGCTAGCTAGGATCGTACGATCGTACGATCGTAGCTAGCGTTA
EOF

2.5.1 Run analysis with Pixi:

Run tools with pixi run to ensure they use the exact same environment every time, regardless of where or when you run them.

Recommended (reproducible):

pixi run fastqc data/sample_R1.fastq.gz
pixi run fastqc data/sample_R2.fastq.gz
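When no output directory is given, FastQC writes its report next to the input file, named after the input with a _fastqc suffix. A small sketch for confirming that the reports were produced; fastqc_report is a hypothetical helper and only handles inputs ending in .fastq.gz.

```shell
# fastqc_report FILE.fastq.gz: print the HTML report path FastQC is
# expected to write for that input (same directory, _fastqc.html suffix).
fastqc_report() {
  fr_dir=$(dirname "$1")
  fr_base=$(basename "$1" .fastq.gz)
  echo "$fr_dir/${fr_base}_fastqc.html"
}

for f in data/sample_R1.fastq.gz data/sample_R2.fastq.gz; do
  report=$(fastqc_report "$f")
  if [ -f "$report" ]; then
    echo "found   $report"
  else
    echo "missing $report"
  fi
done
```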

For interactive debugging, pixi shell opens a shell with the environment activated. This is useful for testing commands or exploring the environment.

2.6 Add Tasks in pixi.toml

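Add tasks to the [tasks] section of pixi.toml to automate your workflow. FastQC does not create its output directory, so the qc task creates results/ first:

[tasks]
qc = "mkdir -p results && fastqc data/*.fastq.gz -o results/"
report = "multiqc results/ -o reports/"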

Or add tasks via command line:

# Add a qc task to pixi.toml (FastQC needs results/ to exist)
pixi task add qc "mkdir -p results && fastqc data/*.fastq.gz -o results/"

# Add a report task that reads the output of the qc task
pixi task add report "multiqc results/ -o reports/"

# Add a clean task to remove results (optional)
pixi task add clean "rm -rf results/*"

2.6.1 Run Tasks individually:

# Run the qc task
pixi run qc

# Run the report task (it reads the output of qc, so run qc first)
pixi run report

2.7 Hands-on Exercise: Running Tasks

Run the qc task to perform quality control on the sample FASTQ files. Check the results/ directory to see the output.
Run the report task to generate a MultiQC report from the QC results. Check the reports/ directory for the report.
Empty results/ (for example with the optional clean task), then try running the report task without running qc first. What happens? (Hint: MultiQC has nothing to summarise, because the report task depends on the output of qc.)
In the next session, we add a pipeline task that depends on both qc and report, so the entire workflow runs with one command.
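For the third exercise, it helps to see the exit status of a command explicitly, since a tool may print a message without making the failure obvious. A minimal sketch; try_cmd is a hypothetical helper, not a Pixi feature.

```shell
# try_cmd CMD [ARGS...]: run a command, print its exit status, and
# return that same status to the caller.
try_cmd() {
  "$@"
  tc_rc=$?
  echo "exit status: $tc_rc"
  return "$tc_rc"
}

# Usage (requires the Pixi project from this tutorial):
# try_cmd pixi run report
```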

3 Session 3 — Pipelines

3.1 Automation with Pipelines

You can define a pipeline task that depends on multiple tasks to run them in the correct order.

3.1.1 Add Tasks as dependencies in a pipeline:

# In pixi.toml, under [tasks], declare dependencies so tasks run in the correct order.
# 'report' will only run after 'qc' has completed successfully.
report = { cmd = "multiqc results/ -o reports/", depends-on = ["qc"] }

# 'pipeline' has no command of its own; it just runs 'qc' and 'report'.
pipeline = { depends-on = ["qc", "report"] }

3.2 Run the pipeline (all tasks in order):

# Now you can run the entire pipeline with one command, and Pixi will handle the task dependencies for you.
pixi run pipeline
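For comparison, without depends-on the ordering would have to be enforced by hand. The sketch below shows roughly what that looks like in plain shell; run_in_order is a hypothetical helper and only approximates what a task runner does (it has no notion of shared dependencies or skipping work).

```shell
# run_in_order "CMD1" "CMD2" ...: run each command string in turn and
# stop at the first one that fails.
run_in_order() {
  for step in "$@"; do
    echo ">> $step"
    sh -c "$step" || { echo "step failed: $step" >&2; return 1; }
  done
}

# Usage (requires the Pixi project from this tutorial):
# run_in_order "pixi run fastqc data/*.fastq.gz -o results/" \
#              "pixi run multiqc results/ -o reports/"
```

Declaring the dependency in pixi.toml keeps this ordering with the project, so every collaborator gets it without a wrapper script.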

To make this change, open pixi.toml in an editor:

nano pixi.toml

Make the [tasks] section look like this (keep any other tasks you added):

[tasks]
qc = "mkdir -p results && fastqc data/*.fastq.gz -o results/"
report = { cmd = "multiqc results/ -o reports/", depends-on = ["qc"] }
pipeline = { depends-on = ["qc", "report"] }

Now run the pipeline to run all tasks in the correct order:

pixi run pipeline

4 Session 4 — Nextflow + Pixi

The Nextflow run is covered on a separate page of this tutorial; use the navigation bar at the top right of the page to open it.

5 Session 5 — Snakemake + Pixi

The Snakemake run is covered on a separate page of this tutorial; use the navigation bar at the top right of the page to open it.

6 Wrap-up — Sharing & Best Practices

6.1 Share with Collaborators

Commit your configuration to git:

git init
echo ".pixi/" >> .gitignore
git add pixi.toml pixi.lock .gitignore
git commit -m "Add Pixi configuration and lockfile for reproducible RNA-seq QC pipeline"

Collaborators can now reproduce your exact environment:

git clone <your-repo>
cd rnaseq-qc
pixi install          # Sets up identical environment
pixi run pipeline     # Runs with exact same tool versions
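A collaborator may want to confirm that installing did not silently rewrite the lockfile. The sketch below compares a checksum before and after a command; lock_unchanged is a hypothetical helper, and the commented usage assumes Pixi's --locked option, which aborts when pixi.lock is out of date with pixi.toml.

```shell
# lock_unchanged LOCKFILE CMD [ARGS...]: run the command and succeed only
# if the lockfile has the same checksum afterwards.
lock_unchanged() {
  lu_lock="$1"; shift
  lu_before=$(cksum < "$lu_lock")
  "$@"
  lu_after=$(cksum < "$lu_lock")
  [ "$lu_before" = "$lu_after" ]
}

# Usage (requires Pixi):
# lock_unchanged pixi.lock pixi install && echo "lockfile untouched"
# pixi install --locked   # stricter: refuse to proceed if the lockfile is stale
```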

6.2 Multi-Environment Features

Example:

# QC tools feature
[feature.qc.dependencies]
fastqc = "*"
python = ">=3.11"

[feature.qc.pypi-dependencies]
multiqc = "*"

# Alignment tools feature
[feature.alignment.dependencies]
bwa = "*"
samtools = "*"
star = "*"

# Python 2 legacy tool (conflicts with QC)
[feature.legacy.dependencies]
python = "2.7.*"
htseq = "*"

# Define environments from features
[environments]
qc = ["qc"]                      # Just QC tools
alignment = ["alignment"]         # Just alignment tools
# Leave out the top-level [dependencies] so python=3.11 does not clash with Python 2
legacy = { features = ["legacy"], no-default-feature = true }
default = ["qc", "alignment"]    # Everything except legacy

Run in a specific environment:

pixi run -e qc multiqc results/
pixi run -e alignment bwa index data/reference.fa
pixi run -e legacy htseq-count  # Uses Python 2

6.3 Best Practices

Commit pixi.toml + pixi.lock
Do not commit .pixi/
Use tasks for all analyses
Use features when tools conflict
Use pixi run, not pixi shell, in workflows
Use distinct environments for teaching modules

6.4 Some shortcuts:

# List all packages:
pixi list

# Search for a package:
pixi search samtools

# By default, pixi search doesn’t list all available versions. To list more package versions, use the -l <int> flag
pixi search -l 40 samtools

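# Update a specific package in the lockfile (within the constraints in pixi.toml):
pixi update samtools

# Update all packages in the lockfile (within the constraints in pixi.toml):
pixi update

# Upgrade packages to their latest versions and rewrite the constraints in pixi.toml:
pixi upgrade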

# Removing Packages
pixi remove samtools

# Remove environment binaries (will be recreated from lock file):
pixi clean

# Remove downloaded package cache:
pixi clean cache

# Remove all environments and the cache (use with caution):
pixi clean && pixi clean cache

# Update Pixi itself to the latest release:
pixi self-update

# Get help on any command:
pixi --help

# Get help on a specific command:
pixi run --help

# List the tasks defined in the project:
pixi task list

# Get help from a tool inside the environment:
pixi run samtools --help

# Get help from a tool in a specific environment:
pixi run -e qc fastqc --help

# List the packages in a specific environment:
pixi list -e qc

# Show the project's environments, platforms and channels:
pixi info


# Run a command in an interactive shell with the environment activated:
pixi shell

# Inside the shell, you can run any command with the environment activated
fastqc --version
samtools --version

# Exit the shell when done
exit

6.5 Git

Safe to delete (regenerated from pixi.lock):
.pixi/ directory

Ignore the .pixi/ directory in git:

echo ".pixi/" >> .gitignore

Must keep (commit to git):
pixi.toml - your configuration
pixi.lock - exact package versions
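Appending with echo adds a duplicate line each time it is run. A minimal sketch of an idempotent alternative; ignore_once is a hypothetical helper, not a git or Pixi command.

```shell
# ignore_once PATTERN [FILE]: append PATTERN to FILE (default .gitignore)
# only if that exact line is not already present.
ignore_once() {
  io_pattern="$1"
  io_file="${2:-.gitignore}"
  touch "$io_file"
  grep -qxF -- "$io_pattern" "$io_file" || echo "$io_pattern" >> "$io_file"
}

# Usage:
# ignore_once ".pixi/"
```

The -x and -F options make grep match the whole line literally, so the dot in .pixi/ is not treated as a wildcard.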

6.6 Resources

Documentation: https://pixi.sh/latest
GitHub: https://github.com/prefix-dev/pixi
Examples: https://github.com/prefix-dev/pixi/tree/main/examples
Tutorial: https://pixi.sh/latest/tutorials/python/