Installing tucca-rna-seq
System Requirements
The tucca-rna-seq workflow requires the following software and resources:
Software Dependencies
| Software | Version | Purpose |
|----------|---------|---------|
| Snakemake | ≥8.27.1 | Workflow management system |
| Conda/Mamba | Latest | Package and environment management |
| Singularity/Apptainer | Singularity ≥3.8.4 or Apptainer v1.4.1 (later versions may work but have not been tested) | Container runtime (recommended) |
Installation Methods
HPC Cluster Installation
Most HPC clusters provide pre-installed software via modules, making this the easiest installation method.
1. Check Available Modules
    # Check available modules
    module avail snakemake
    module avail singularity
    module avail conda
    module avail mamba
    module avail miniforge
2. Load Required Modules
    # Purge all currently loaded modules
    module purge

    # Load required modules based on what versions are available
    module load snakemake
    module load singularity
    module load miniforge
3. Verify Installation
    # Check versions
    snakemake --version
    singularity --version
    conda --version
Local Installation with Conda/Mamba
For local development and testing, you can install the workflow on your personal machine.
1. Install Conda or Mamba
2. Install Snakemake
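Whichever installation route you use, the installed Snakemake needs to meet the ≥8.27.1 minimum listed under Software Dependencies. The following is a minimal sketch of such a check, not part of the workflow itself. The helper name `version_ge` is hypothetical, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes a sort implementation that supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Snakemake version taken from the requirements table
required="8.27.1"

if command -v snakemake >/dev/null 2>&1; then
    installed="$(snakemake --version)"
    if version_ge "$installed" "$required"; then
        echo "Snakemake $installed meets the >=$required requirement"
    else
        echo "Snakemake $installed is older than $required" >&2
    fi
else
    echo "snakemake not found on PATH" >&2
fi
```

The same helper can be reused for other version floors, for example `version_ge "$(some_tool --version)" "1.2.3"`, as long as the tool prints a bare version string.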
3. Install Singularity/Apptainer
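Because the workflow accepts either Singularity or Apptainer as its container runtime, it can help to confirm which one is actually on your `PATH` after installation. This is a small sketch under that assumption. The function name `find_container_runtime` is hypothetical, and trying `apptainer` first is an arbitrary choice rather than a workflow requirement.

```shell
# Hypothetical helper: print the name of the first container runtime found
# on PATH. The search order (apptainer, then singularity) is an assumption.
find_container_runtime() {
    for candidate in apptainer singularity; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if runtime="$(find_container_runtime)"; then
    # Both tools accept --version
    "$runtime" --version
else
    echo "Neither apptainer nor singularity was found on PATH" >&2
fi
```

Compare the reported version against the requirements table by eye, since the two tools format their version output differently.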
Cloud and Container Orchestration
For running the workflow on cloud platforms like Google Cloud Batch, AWS Batch, or Kubernetes, you will need to install and configure their respective command-line interface (CLI) tools.
- Google Cloud CLI (gcloud) Docs
- AWS CLI Docs
- Kubernetes CLI (kubectl) Docs
GitHub Actions (for CI/CD)
This repository includes a pre-configured GitHub Actions workflow for continuous integration (CI) and testing. This setup uses free, public runners and is intended for verifying the workflow’s integrity, not for production analysis.
To use it for testing, simply fork this repository and enable GitHub Actions in
your fork’s settings. You will also need to configure any required secrets
(e.g., an NCBI API key) in your repository settings under Settings > Secrets and variables > Actions.
Workflow Setup
1. Create Project Directory
    # Create and navigate to project directory
    mkdir my_rnaseq_project
    cd my_rnaseq_project
2. Clone the Repository
    # Clone the workflow repository
    git clone https://github.com/tucca-cellag/tucca-rna-seq.git

    # Navigate into the workflow directory
    cd tucca-rna-seq
3. Set Up Environment Variables (Optional but Recommended)
If you plan to download genomes or SRA datasets from NCBI, we recommend setting
up a .env file to store your API key securely. This prevents accidental commits
of sensitive credentials.
    # Copy the template file
    cp .env.template .env

    # Edit .env and add your NCBI API key
    # Replace 'your_ncbi_api_key_here' with your actual API key
Next Steps
Once installation is complete, you are ready to configure the workflow. Please
refer to the Configuration Guide for detailed instructions on
how to set up the config.yaml, samples.tsv, and units.tsv files for your
analysis.
Linked external resources are independent of TUCCA and Tufts University and remain under their own licenses.