Available Software on the HPC Cluster

Summary

A comprehensive reference of software available on the Bowdoin HPC Linux cluster, including commercial packages such as MATLAB, Gaussian, Mathematica, Stata, and COMSOL, as well as over 130 open-source scientific applications. Includes instructions for using the module system to load software and detailed usage guides for each commercial package.

Body

Questions

  • What software is available on the Bowdoin HPC cluster?
  • How do I run MATLAB on the HPC cluster?
  • How do I run Gaussian on the HPC cluster?
  • How do I run Mathematica on the HPC cluster?
  • How do I run Stata on the HPC cluster?
  • How do I run COMSOL on the HPC cluster?
  • How do I load software on the HPC Linux environment?
  • What is the module load command?
  • Is R available on the HPC cluster?
  • Is Python available on the HPC cluster?

Environment

This article applies to Bowdoin faculty, students, and researchers using the HPC Linux cluster. The software listed below is installed across the HPC environment, which runs Rocky Linux. Some packages require the module load command before they can be used. Commercial packages require active Bowdoin licenses.

Resolution

Loading Software with the Module System

Many software packages on the HPC cluster use the module utility to configure the environment for each application.

  1. Run module avail to see all available software modules.
  2. Run module load name to load a package (for example, module load pmerge).
  3. The module remains active for the current login session. If you log out and back in, load it again.

To use a module in a job script:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load pmerge
pmerge

Commercial Packages

Usage instructions for each package follow.

COMSOL

Single license: Only one person at a time can use COMSOL.

GUI mode (up to 24 hours, 16 cores, 64 GB): Use the HPC Web Portal > Interactive Apps > Bowdoin HPC Desktop > Applications > Bowdoin > Comsol.

Batch mode (up to 60 days, 32 cores, 2 TB):

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
comsol batch -inputfile MYFILE.mph -outputfile MYOUTPUT.mph

Submit: sbatch -N 1 -n 16 myscript.sh. Add --mem=200G for more memory.
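
For example, to request 16 cores and 200 GB of memory in a single command:

sbatch -N 1 -n 16 --mem=200G myscript.sh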

Disk space tip: Change temp files folder under Preferences > Files to /mnt/hpc/tmp/username.

CUDA

Installed at /mnt/local/cuda with docs at /mnt/local/cuda/doc. See Use GPU and High-Memory Resources on the HPC Cluster in the Related Articles section for GPU job submission.

Visit NVIDIA CUDA Zone for learning resources.

Gaussian

Access required: Contact the IT Service Desk to be added to the Gaussian group.

GaussView GUI: HPC Web Portal > HPC Desktop > Applications > Bowdoin > GView.

Batch mode:

g16sub myfile.com 20

Runs on 20 cores; omit the number for 1 core; maximum 32 cores.

Important: Do not include %nprocs=, %nprocshared=, or %LindaWorkers= in your input file.

MATLAB

GUI: HPC Web Portal > HPC Desktop > Applications > Bowdoin > MATLAB.

Batch script example:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
matlab -nodesktop -nodisplay -nosplash < MYSCRIPT.m

Submit with sbatch myscript.sh.

GPU: sbatch -p gpu --gres=gpu:rtx2080:1 myscript.sh

Parallel: Add parpool(N) in your MATLAB code, submit with sbatch -N 1 -n N myscript.sh.
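
For example, a minimal sketch assuming your MATLAB code lives in MYSCRIPT.m and opens an 8-worker pool:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
# MYSCRIPT.m should call parpool(8) to match the 8 cores requested at submit time
matlab -nodesktop -nodisplay -nosplash < MYSCRIPT.m

Submit with sbatch -N 1 -n 8 myscript.sh so the requested core count matches parpool(8).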

FAQ: If MATLAB locks up with an Xlib error, run matlab -nodesktop or delete ~/.matlab/R2021a/MATLABDesktop.xml.

Mathematica

GUI: HPC Web Portal > HPC Desktop > Applications > Bowdoin > Mathematica.

Batch:

mathmsub myprog.m myout.txt --mem=16G

Parallel: Add LaunchKernels[N]; as the first line of your script. The mathmsub script detects this and requests matching cores automatically.
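
For example, a minimal sketch assuming a hypothetical script myprog.m that starts four kernels:

# first line of myprog.m (hypothetical): LaunchKernels[4];
mathmsub myprog.m myout.txt
# mathmsub detects the four kernels and requests four matching cores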

Plots in batch: Use Export["myplotdata.m", myplot, "TEXT"] to save, then <<myplotdata.m in GUI mode to view.

Stata

GUI: HPC Web Portal > HPC Desktop > Applications > Bowdoin > Stata.

Interactive: stata (IC), stata-se (SE), or stata-mp (MP).

Batch:

statasub myfile.do

Output goes to myfile.log.

Parallel (MP): Include set processors N in your .do file. statasub reads this and requests matching cores.
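
For example, a minimal sketch assuming a hypothetical myfile.do that requests four processors:

# myfile.do should contain the line: set processors 4
statasub myfile.do
# statasub reads the setting and requests four matching cores; output goes to myfile.log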


Open-Source, Free Packages, and Utilities

Usage instructions for each package follow. Some packages may need reinstallation following the Summer 2024 operating system upgrade; contact the IT Service Desk if something does not work.

Tip: Many packages use the module load system. Run module avail at the command line to see all available modules.

Ancestry_HMM

Once during the login session, then "ancestry_hmm" and related commands.

module load ancestry_hmm
ancestry_hmm

Angsd

Run the following at the Linux prompt to view the command line options:

angsd

Bayescan

Where "n" is the number of CPU cores to use

module load bayescan
bayescan -threads n

BBMap

Run the following at the Linux prompt:

module load bbmap
bbmap.sh

There are several other bioinformatic tools included in this package. Please refer to the web page, and also look in /mnt/local/bbmap for a list of available commands.

Beast

Run the following at the Linux prompt:

module load beast
beast
  • beast-sub is used to submit Beast jobs to the standard queue on the HPC Grid
  • beast-gpusub is used to submit Beast jobs to the GPU systems on the HPC Grid
  • Installed in /mnt/local/beast
  • Beast Web Page

Installing Beast related packages

If you need to install Beast related packages, you can do so through their Beauti GUI interface.

For example, to install SNAPP:

Start a Linux Desktop session from the HPC Web Portal:

See Use the HPC Web Portal (Open OnDemand) in the Related Articles section.

Once in the Linux Desktop session, go to the Applications menu in the upper left, and select System Tools, then MATE Terminal.

In the terminal window, run module load beast, then run beauti. You can safely ignore any error messages in the terminal about "dconf".

If it asks to install new packages, click "Not now".

Inside the Beauti window, click the "File" menu in the upper left, then select "Manage Packages".

Scroll down and click on SNAPP to select it, then click on the "Install/Upgrade" button.

You should see a pop up saying SNAPP is installed. Click OK.

Click the "Close" button in the Package Manager window.

In the Beauti window, click the File menu, then select Exit.

It should drop you back to the terminal window. You can now go to the "System" menu at the top of the screen, and select "Log out" to terminate the Desktop Session.

SNAPP should now be installed.

BEDTools

To run interactively (like on machines dover or foxcroft):

module load bedtools
bedtools (your options go here)

To run on the Slurm cluster, your job script would look like:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load bedtools
bedtools (your options go here)

Blast

Load the module once per login session:

module load blast

then run the blast command you want: "blastn", "blastp", "blastx", etc.

A sample HPC Grid submit script might look like:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m b -m e

module load blast
blastn -db nt -query nt.fsa -out results.out

Blat

Interactive Use

To run Blat interactively:

module load blat

then you can run the various commands, such as "blat", "pslSort", etc.

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

module load blat
blat (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

Blender

Please note that the HPC system can only support running Blender from the command line as a batch job for rendering. It does not support using the front end GUI of Blender.

Running on the HPC Grid

A sample script called myscript.sh might look something like this:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load blender
blender -b <path to blender file here> -P render.py

Remember to replace "<path to blender file here>" with the name and path to your .blend file.

Submit this to the HPC Cluster with "sbatch myscript.sh"

If you need more memory, you can add the "--mem" option. For example, if you need 32GB of memory, use "sbatch --mem=32G myscript.sh".

If you need more memory and want to use multiple CPU cores, you can use "sbatch -N 1 -n 8 --mem=32G myscript.sh" to request 8 CPU cores and 32GB of memory.

BPP

Run the following at the Linux prompt:

module load bpp
bpp

Busco

  • Busco Web Page
  • Installed as miniconda3 virtual environment in /mnt/local/miniconda3/envs/busco

To run interactively:

source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate busco
busco (your options here)

HPC Grid Submit Script:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY_BOWDOIN_ACCOUNT@bowdoin.edu -m be

source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate busco
busco (your options here)

BWA

Run the following at the Linux prompt to view the command line options:

bwa

BWA-Meth

Interactive Use

To run BWA-Meth interactively:

source /mnt/local/python-venv/bwa-meth/bin/activate

then you can run "bwameth.py".

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

source /mnt/local/python-venv/bwa-meth/bin/activate
bwameth.py (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

Cactus

  • Cactus is an open source problem solving environment designed for scientists and engineers. Its modular structure easily enables parallel computation across different architectures and collaborative code development between different groups.
  • Cactus requires an individual installation within your data space. Refer to the Cactus web site for information on downloading and installing the package, or contact IT for assistance.
  • Cactus Web Page
  • Cactus Online Documentation

Cactus use on Campus

  • Physics - Thomas Baumgarte's use of Cactus to study Gravity Waves of Black Holes

CCGCRV / CCGVU

Or "ccgvu" at the Linux prompt.

module load ccg
ccgcrv
ccgvu
  • Installed in /mnt/local/ccgcrv and /mnt/local/ccgvu
  • CCGvu is a program for plotting and curve fitting of the NOAA/CMDL trace gas measurements.

CD-Hit

Load the module; then you can run any of the multiple programs that are part of this software (cd-hit, cd-hit-div, etc.):

module load cdhit

There are several commands included in this package. Please refer to the web page, and also look in /mnt/local/cdhit for a list of available commands.

Cell Ranger

module load cellranger
cellranger

Sample HPC Submit script

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load cellranger

cellranger (your arguments here)

ChIPseeker

Clustal Omega

Run the following at the Linux prompt:

clustalo

To submit jobs to the HPC Grid, use clustal-sub. For example:

clustal-sub -i test.in.fa -o test.out.fa -v

DeepLabCut

EXPERIMENTAL - DeepLabCut version 3.0.0-rc8

Tagging Videos

Login to the HPC Web Portal using your Bowdoin login name (not email address) and password.

Select the Interactive Applications menu and choose the "Bowdoin HPC Desktop". Select at least 16 GB of memory, and the number of hours you want to run the Desktop session. Press the blue Launch button. Wait several seconds as the Cluster sets up the job, then press the blue Launch Bowdoin HPC Desktop button.

Once you are at the Linux desktop, open a Linux shell by going to the Applications menu, System Tools, then MATE Terminal.

In the terminal, type (Note this can take several seconds to run):

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate dlc-3.0.0-rc8
ipython
import deeplabcut

You can safely ignore any messages about "Tensorflow binary optimizations", "Unable to register cuBLAS", and "networkx backend defined more than once".

You can now run the DeepLabCut GUI by typing:

deeplabcut.launch_dlc()

You can safely ignore any messages about "error creating runtime directory".

Running the analysis on the Slurm HPC Cluster

If you are submitting to the Slurm HPC Cluster, create a job script named myscript.sh that looks like this, replacing "my-python-file" with your DeepLabCut python filename:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate dlc-3.0.0-rc8

export LD_LIBRARY_PATH=/mnt/local/miniconda3/envs/dlc-3.0.0-rc8/lib/python3.11/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH

export DLClight="True";
python my-python-file

Login to the HPC headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

cd to your directory containing the DeepLabCut files.

Submit it to the Slurm Cluster with:

sbatch -p gpu --gres=gpu:rtx3080:1 --mem=32G myscript.sh

STABLE - DeepLabCut version 2.2.1

Tagging videos

Login to the HPC Web Portal using your Bowdoin login name (not email address) and password.

Select the Interactive Applications menu and choose the "Bowdoin HPC Desktop". Select at least 16 GB of memory, and the number of hours you want to run the Desktop session. Press the blue Launch button. Wait several seconds as the Cluster sets up the job, then press the blue Launch Bowdoin HPC Desktop button.

Once you are at the Linux desktop, open a Linux shell by going to the Applications menu, System Tools, then MATE Terminal.

In the terminal, type (Note this can take several seconds to run):

source /mnt/local/python-venv/dlc-2.2.1-gui/bin/activate
ipython
import deeplabcut

You can safely ignore any messages about "Tensorflow binary optimizations", "Unable to register cuBLAS", and "networkx backend defined more than once".

You can now run the DeepLabCut GUI by typing:

deeplabcut.launch_dlc()

You can safely ignore any messages about "error creating runtime directory".

Running the analysis on the Slurm HPC Cluster

If you are submitting to the Slurm HPC Cluster, create a job script named myscript.sh that looks like this, replacing "my-python-file" with your DeepLabCut python filename:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
source /mnt/local/python-venv/dlc-2.2.1-gui/bin/activate

export LD_LIBRARY_PATH=/mnt/local/python-venv/dlc-2.2.1-gui/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cuda_runtime/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cublas/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cufft/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cusparse/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cudnn/lib:/mnt/local/python-venv/dlc-2.2.1-gui/lib/python3.9/site-packages/nvidia/cusolver/lib:$LD_LIBRARY_PATH

export DLClight="True";
python my-python-file

Login to the HPC headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

cd to your directory containing the DeepLabCut files.

Submit it to the Slurm Cluster with:

sbatch -p gpu --gres=gpu:rtx3080:1 --mem=32G myscript.sh

Tutorials

Note that the "source" command has changed to run DLC on the new 2024 Slurm Cluster. See above for the correct "source" command to activate the Python virtual environment.

Lucy Sullivan has created an excellent set of instructions for using DeepLabCut in Bowdoin's HPC environment. I highly recommend that you take a look!

https://github.com/losullil/Rat-Behavioral-Analysis-Using-DeepLabCut

Some more generic tutorials on using DLC itself can be found here:

Tutorial Part I: DeepLabCut- How to create a new project, label data, and start training

Tutorial Part II: DeepLabCut - network evaluation, refinement, and re-training

Deepbinner

To run the software:

scl enable rh-python36 /mnt/local/deepbinner/deepbinner-runner.py

DeepTools

Interactive Use

To run DeepTools interactively:

source /mnt/local/python-venv/deeptools/bin/activate

then you can run the various commands, such as "bamCoverage", "bamCompare", etc.

Running on the Slurm HPC Cluster

If you are submitting to the Slurm HPC Cluster, your submit script would look something like this:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

source /mnt/local/python-venv/deeptools/bin/activate
bamCoverage (args)

Eigensoft

  • Installed in /usr/local/eigensoft
  • Eigensoft Web Page
  • Please reference the web site for information on how to run the software

FastQC

Run the following at the Linux command prompt to run the GUI version:

fastqc

Run the following at the Linux prompt to run the batch version:

fastqc input_filename (input_filename2) (input_filename3 etc)

Sample HPC Grid submit script:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m b -m e

fastqc input_filename

Please replace "MYACCOUNT" with your Bowdoin login account.

FastX Toolkit

To load the FastX module (needed only once per login session):

module load fastx

For example, to run the "fastx_trimmer" software, you would first:

module load fastx once in your login session

then type:

fastx_trimmer

FermiTools

Installed as a Conda environment.

To use FermiTools, do:

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate fermi

Then you can run the various Fermi commands.

FFTW

  • FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.
  • FFTW Web Page

Figtree

Run the following at the Linux prompt:

figtree

Filter_reads

Interactive Use

To use the Filter_reads tools, you first need to set up the environment:

source /mnt/local/python-venv/filter_reads/bin/activate
module load bowtie2
module load fastx

then you can run Python and include the "filter_reads.py" and/or "filter_reads.snakefile" in your code.

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

source /mnt/local/python-venv/filter_reads/bin/activate
module load bowtie2
module load fastx
python (my python code file name)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

Gamess

Unfortunately Gamess does not seem to be compatible with the latest OS and HPC environment, so we are unable to run it at present.

The below information is left here for archival purposes, but the software does not work.

Archived Information

To run at the command line, use the following at the Linux prompt:

gms (filename).inp (number-of-cpus)

To submit to the HPC Grid:

gmssub (filename).inp (number-of-cpus) (outputfile) (SGE args)

Archived - Submitting to HPC Grid

The gmssub command allows you to submit Gamess jobs to the HPC Grid without writing your own submit script.

(filename).inp is your input data filename. This argument is required.

(number-of-cpus) is the number of CPUs to use on the HPC Grid

(outputfile) is the name of an output file

(SGE args) are options to be passed to the HPC Grid to request more memory, for example

gmssub myfile.inp 8 myoutfile.txt will run on 8 CPUs and write the output to myoutfile.txt, and will use the default 4 GB of RAM per CPU core (32 GB of RAM in total)

gmssub myfile.inp 8 myoutfile.txt -l virtual_free=10g will run on 8 CPUs, putting the output into a file named "myoutfile.txt" and requesting 10 GB of RAM per CPU (80 GB of RAM in total)

GATK - Genome Analysis Toolkit

Run the following at the Linux prompt to run the program:

gatk

Run the following at the Linux prompt to view the command line options:

gatk --help

GDAL and OGR

  • GDAL is a translator library for raster geospatial data formats.
  • GDAL Web Page

Gemma

Run the following at the Linux prompt to view the command line options:

gemma -h

Genome tools from UCSC

Interactive Use

To run the tools interactively:

module load genome-ucsc

then you can run the various commands, such as "faToTwoBit", "genePredToBed", etc.

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

module load genome-ucsc
faToTwoBit (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

GMAP and GSNAP

Interactive Use

To run GMAP interactively:

module load gmap

then you can run the various commands, such as "gmap", "gsnap", etc.

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

module load gmap
gmap (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

GMT

Or "GMT" at the Linux prompt.

gmt
GMT

Graphmap Toolkit

To load the Graphmap module (needed only once per login session):

module load graphmap

For example, to run the "graphmap" software, you would first:

module load graphmap once in your login session

then type:

graphmap

GRASS

Run the following at the Linux prompt:

module load grass
grass

Gromacs - a versatile package to perform molecular dynamics

Run the following at the Linux prompt to run the program:

gmx

Guava

This software is too old and is not compatible with our modern HPC environment.

The below information is kept for archival purposes.

Run the following at the Linux prompt to run the GUI:

module load guava
guava.sh

Guava can be run either in GUI mode on one of the Interactive machines, or in command line mode on the HPC Grid.

Interactive GUI mode

To run in an interactive GUI, login to one of the HPC Interactive machines like dover, foxcroft, or pauling[.bowdoin.edu]

module load guava
guava.sh

This should open the GUI.

Batch mode on the HPC Grid

To run on the HPC Grid, you will need to create a submit script with your commands in it.

A sample HPC submit script called myscript might look like:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m b -m e

module load guava
guava.sh (command line options)

Don't forget to replace MYACCOUNT with your actual Bowdoin account name, and of course (command line options) would be replaced with your actual commands to run.

Submit this to the HPC Grid using qsub myscript

Guppy

Or "module load guppy-gpu" at the Linux prompt.

module load guppy-cpu
module load guppy-gpu

A sample HPC submit script named myscript.sh for CPU:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load guppy-cpu
guppy_basecaller --input_path (my-read-dir) --save_path (my-output-dir) --flowcell FLO-FLG001 --kit SQK-RAD004

Replace (my-read-dir) with the path to your directory containing the .fast5 files, and (my-output-dir) with the directory to write out the .fastq files.

Submit with "sbatch myscript.sh".

A sample HPC submit script for GPU:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load guppy-gpu
guppy_basecaller --device cuda:0 --input_path (my-read-dir) --save_path (my-output-dir) --flowcell FLO-FLG001 --kit SQK-RAD004

Replace (my-read-dir) with the path to your directory containing the .fast5 files, and (my-output-dir) with the directory to write out the .fastq files.

Submit with "sbatch -p gpu --gres=gpu:rtx3080:1 myscript.sh".

HEASoft

Run the following; then you can run the various HEASoft commands:

source $HEADAS/headas-init.sh

Hisat2

Interactive Use

To run Hisat2 interactively:

module load hisat2

then you can run the various commands, such as "hisat2", "hisat2-align-s", etc.

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

module load hisat2
hisat2 (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

HMMRATAC

HMMRATAC is a Java application, and is run like:

java -jar /mnt/local/hmmratac/HMMRATAC_exe.jar -b ATACseq.sorted.bam -i ATACseq.sorted.bam.bai -g genome.info

More details are given on the HMMRATAC web page.

Hydrobase

To prepare for running Hydrobase version 2

module load hb2

To prepare for running Hydrobase version 3

module load hb3
  • Once loaded, you can then run any of the multiple programs that are part of this software (hd_surf, hd_surf2d, etc.)
  • Installed in /mnt/local/hydrobase
  • Hydrobase2 Web Page
  • Hydrobase3 Web Page

IGV

Run "igvtools" for the command line version, or "igvtools_gui" for the GUI version:

module load igv
igvtools
igvtools_gui

IMa2

For single threaded (one CPU core), use the "ima2" program from Moosehead by calling "hpcsub":

hpcsub -cmd ima2 (options to ima2)

For multi-threaded, parallel (multiple CPU cores), use the "ima2p" program within an HPC submit script:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT@bowdoin.edu -m b -m e

mpiexec -np (number of CPU cores to use) ima2p (options to ima2p)

Replace MY-BOWDOIN-ACCOUNT with your actual Bowdoin login account name.

Submit to the Grid using "qsub -pe openmpi (number of CPU cores to use) (name of the above script)".

Please note that the number of CPUs you specify in the submit script must match the number that you give at the command line.

For example, if the script above were named "myscript", and I wanted to submit to the HPC Grid using 4 CPU cores, I would use:

qsub -pe openmpi 4 myscript

iqtree

We have both iqtree versions 1 and 2 available. Run "iqtree" for version 1, and "iqtree2" for version 2.
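
For example, a hypothetical run on an alignment file (the -s option names the input alignment):

iqtree -s myalignment.phy     # version 1
iqtree2 -s myalignment.phy    # version 2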

ISIS

Once during the login session, then "isis" at the Linux prompt.

module load isis
isis

Jekyll

  • Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites. Think of it like a file-based CMS, without all the complexity.
  • Jekyll Web Site
  • Github Site
  • Installed in /mnt/local/jekyll

Run the following once per login session to set up the environment:

enable-jekyll

then type jekyll followed by any command line arguments.

Julia

Run the following at the Linux prompt:

module load julia
julia

There are two ways to access Julia within the HPC environment: through the HPC Web Portal, and by submitting a script from the command line on the HPC login node.

If you need additional Julia packages installed, please contact the IT Service Desk.

Running Julia Interactively through the HPC Web Portal

Login to the HPC Web Portal, and start an HPC Linux Desktop Session by following the directions as described in Use the HPC Web Portal (Open OnDemand) in the Related Articles section

Once you are at the Linux Desktop, select the Applications menu in the upper left, then System Tools, then MATE Terminal.

In the terminal, run:

module load julia

then

julia

You should be able to safely ignore the message giving an error about opening a log file under /mnt/local/juliapackages. That is a read-only file system that is not writable, but it should not prevent you from running Julia programs.

Submitting a non-interactive Julia program to the HPC Slurm Cluster via the command line

Login to the HPC Web Portal, and follow the directions for Command Line Access as described in Use the HPC Web Portal (Open OnDemand) in the Related Articles section

cd to the directory containing your Julia script file (named testjulia.jl for this example), and type

hpcsub -cmd "module load julia; julia testjulia.jl"

Keras

  • Keras Web Page
  • Setup with Theano backend (http://deeplearning.net/software/theano/)
  • GPU support built in

If you only have a single command to run on the HPC Grid using a GPU, you can use the "hpcsub" script to submit to the Grid.

For example, to run the MNIST sample code (available at https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py), type:

hpcsub -l gpu=1 -cmd python mnist_cnn.py

If you have multiple commands to run, you will need to create a script containing all of your commands.

For example:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT@bowdoin.edu -m b -m e
#$ -l gpu=1

python first_command.py
python second_command.py
python third_command.py

Replace MY-BOWDOIN-ACCOUNT with your actual Bowdoin login account name.

Submit to the Grid using "qsub (name of script)".

LAMMPS

To set up the environment to run LAMMPS, run module load lammps.

Once you have set up the environment, you can run lmp to run the software.

Running on the HPC Slurm Cluster

To run LAMMPS on the Slurm HPC Cluster, create a job script named myscript.sh that looks like this example:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load mpi/mpich-x86_64
module load lammps
mpiexec -np $SLURM_NPROCS lmp -in (your input file here)

To run the job on the HPC Cluster using 8 CPU cores, use "sbatch -N 1 -n 8 myscript.sh" on the HPC headnode.

LNKnet

  • To run graphical, interactive LNKnet, run module load lnknet, then "lnknet" at the Linux prompt.
  • LNKnet Web Page

MACS

Interactive Use

To run MACS interactively:

source /mnt/local/python-venv/macs/bin/activate

then you can run the various commands, such as "macs3".

Running on the Slurm HPC Cluster

If you are submitting to the Slurm HPC Cluster, your submit script would look something like this:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

source /mnt/local/python-venv/macs/bin/activate
macs3 (args)

MAFFT

Once during the login session, then "mafft" and related commands.

module load mafft
mafft

MEME

Run the following once during the login session; then you can run the meme commands:

module load meme

There are a variety of programs associated with this package, which you can find in /mnt/local/meme/bin.

There is a manual as well as Guides and Tutorials located on the MEME Web site (link above).

Migrate-n

Run the following at the Linux prompt to load version 3.7.2:

module load migrate-n
migrate-n

Run the following at the Linux prompt to load version 4.4.4:

module load migrate-n-444
migrate-n
  • "migrate-sub" is used to submit single CPU core 3.7.2 jobs to the Grid
  • Installed in /mnt/local/migrate-n
  • Migrate-N Home Page

To submit multi-CPU core parallel jobs to the Grid, create a submit script similar to this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m be

module load migrate-n
mpiexec -n $NSLOTS migrate-n-mpi MY-PARM-FILE -nomenu

Replace "MYACCOUNT" with your Bowdoin account name, and "MY-PARM-FILE" with the name of your input parmfile.

Submit to the HPC Grid using "qsub -pe openmpi (NUMBER-OF-CPUS) (MYSUBMITSCRIPT)".

For example, if my submit script is called myscript.sh, and I want to run on 8 CPU cores, I would use:

qsub -pe openmpi 8 myscript.sh

MitoZ

Run the following at the Linux prompt:

/mnt/research/singularity/MitoZ.simg

Sample HPC Submit script

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M YOUR-BOWDOIN-ACCOUNT@bowdoin.edu -m be

/mnt/research/singularity/MitoZ.simg (your arguments here)

MPICH2

MrBayes

  • To run MrBayes at the command line, run module load mrbayes, then "mb" at the Linux prompt.
  • To run MrBayes in batch mode on our Grid cluster, login to the head node of the Grid (moosehead), cd to the directory containing your .nex file(s), and run mbsub (filename).nex (outputfile) where (filename) is the name of your .nex file, and (outputfile) is the name of your output file (see the example after this list)
  • MrBayes Home Page
  • MrBayes Online Manual
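
For example, a hypothetical batch submission from moosehead:

mbsub mydata.nex mydata.out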

MultiQC

Interactive Use

To run MultiQC interactively:

source /mnt/local/python-venv/multiqc/bin/activate

then you can run "multiqc".

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

source /mnt/local/python-venv/multiqc/bin/activate
multiqc (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

NAMD

There are two versions of NAMD installed, one using CPU only, and one that can use GPUs.

To set up the environment to run the CPU-only version of NAMD, run module load namd.

To set up the environment to run the GPU version of NAMD, run module load namd-cuda.

Once you have set up the environment, you can run the various commands for NAMD, such as "namd3".

Running on the HPC Slurm Cluster

To run the CPU only version on the Slurm HPC Cluster, create a job script named myscript.sh that looks like this example:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load namd
namd3 +p$SLURM_NPROCS (your parameters here)

To run the job on the HPC Cluster using 16 CPU cores, use "sbatch -n16 myscript.sh" on the HPC headnode.

To run the GPU version on the Slurm HPC Cluster, create a job script named myscript.sh that looks like this example:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load namd-cuda
namd3 +p$SLURM_NPROCS (your parameters here)

Note the difference in the "module load" command compared to the CPU-only script.

To run the job on the HPC Cluster using a GPU and 8 CPU cores, use "sbatch -p gpu --gres=gpu:rtx3080:1 -n8 myscript.sh" on the HPC Cluster headnode.

Nanopolish

To load the Nanopolish module (needed only once per login session):

module load nanopolish

ngsPopGen

  • Installed in /usr/local/ngsPopGen
  • ngsPopGen Web Page
  • Please refer to the web page for information on how to run the various commands (scroll down the page)

Nilmtk

Running on the HPC Slurm Cluster

To run on the Slurm HPC Cluster, create a job script named myscript.sh that looks like this example, and place it in the directory with your nilmtk files:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
source /mnt/local/python-venv/nilmtk/bin/activate
# Replace the following command with your own - this is an example
./disag.py exp-17457/exp.h5 test.out 'washing machine' 'fridge' 'dish washer' 'microwave'

Login to the HPC headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

cd to the directory containing your Nilmtk data files.

Submit it to the Slurm Cluster with:

sbatch -p gpu --gres=gpu:rtx3080:1 --mem=32G myscript.sh

Running in JupyterLab

Login to jupyter.bowdoin.edu using your Bowdoin account name and password (do not include "@bowdoin.edu").

On the Server Option screen, click on the down arrow and select one of the GPU options.

Go to the "File" menu in the upper right, and select "Open from Path".

Enter the path to your Nilmtk data files and click the blue "Open" button. Example: /mnt/research/jsmith

On the Launcher window on the right side of the screen, click on "Nilmtk" in the Notebook section. This will start a new Jupyter notebook using the Nilmtk python virtual environment.

Running on the Command Line

Login to the HPC headnode.

Get a command shell on one of the GPU systems by running: srun -p gpu --gres=gpu:rtx3080:1 --pty /bin/bash

Once at the command line, invoke the Python virtual environment that contains the nilmtk software.

source /mnt/local/python-venv/nilmtk/bin/activate

Then you can run Python and import nilmtk:

python

import nilmtk

Ollama

We are sharing a couple of very basic scripts to run Ollama on the HPC Cluster.

The interactive script can be found at /mnt/local/bin/run-ollama.

The batch submission script can be found at /mnt/local/bin/run-ollama-job.

Run Ollama interactively - run-ollama

Login to the HPC headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

The format to submit the job to the HPC Cluster is:

srun -p gpu --gres=gpu:(GPU-CARD):1 --mem=XG --pty run-ollama MODEL

GPU-CARD should be rtx2080, rtx3080, rtx5090, a100, or one of the GPU cards listed on the Submit and Manage Jobs on the HPC Slurm Cluster in the Related Articles section

X should be the amount of CPU memory to use, like 32G

MODEL is the LLM model to use, like llama3

For example, to request 32 GB of CPU memory, an NVidia RTX 3080 card, and to run the llama3 model:

srun -p gpu --gres=gpu:rtx3080:1 --mem=32G --pty run-ollama llama3

Models available to run can be found at: https://ollama.com/library

Run Ollama as a batch job and query it from your program - run-ollama-job

Login to the HPC headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

The format to submit the job to the HPC Cluster is:

sbatch -p gpu --gres=gpu:(GPU-CARD):1 --mem=XG run-ollama-job MODEL MYPROGRAM

GPU-CARD should be rtx2080, rtx3080, rtx5090, a100, or one of the GPU cards listed on the Submit and Manage Jobs on the HPC Slurm Cluster in the Related Articles section

X should be the amount of CPU memory to use, like 32G

MODEL is the LLM model to use, like llama3

MYPROGRAM is the command to run your program, Python script, etc

For example, to request 32 GB of CPU memory, an NVidia RTX 3080 card, and to run the llama3 model and run your Python program myprogram.py:

sbatch -p gpu --gres=gpu:rtx3080:1 --mem=32G run-ollama-job llama3 python myprogram.py

If you need more than 1 CPU core, you can request up to 3 CPU cores in the "gpu" partition:

sbatch -p gpu --gres=gpu:rtx3080:1 --mem=32G --cpus-per-gpu=3 run-ollama-job llama3 python myprogram.py

If you need more than 3 CPU cores, you can request 32 in the "mixed" partition, however this partition only has RTX2080 GPU cards currently:

sbatch -p mixed --gres=gpu:rtx2080:1 --mem=128G --cpus-per-gpu=32 run-ollama-job llama3 python myprogram.py

Sample Python program

A silly little Python program to test, named myprogram.py.

The model in the Python program needs to be pulled before it can be accessed. In this case we specify llama3 in the sbatch command, and the run-ollama-job script will pull it.

A test run: "sbatch -p gpu --gres=gpu:rtx2080:1 --mem=32G run-ollama-job llama3 python myprogram.py"

import ollama

# Use the generate function for a one-off prompt
result = ollama.generate(model='llama3', prompt='What is the airspeed of an unladen swallow?')
print(result['response'])

OpenMM

  • OpenMM supports running on both CPU and GPU systems.

Running as an HPC Batch Job

Sample submit script named "myscript.sh" for the HPC Cluster:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate openmm

python (your OpenMM program).py

Replace "(your OpenMM program)" with the name of your Python program.

To submit to the HPC Cluster using 8 GB of memory on a CPU system:

sbatch --mem=8G myscript.sh

To use a GPU system and 32GB of CPU memory:

sbatch -p gpu --gres=gpu:rtx2080:1 --mem=32G myscript.sh

Running OpenMM-Setup using an HPC Desktop Session

OpenMM-Setup on the HPC is accessed via the Linux Graphical Desktop session through the HPC Web Portal.

Information on opening a Desktop session can be found at Use the HPC Web Portal (Open OnDemand) in the Related Articles section

Choose "HPC Desktop" or "HPC Desktop with a GPU" as appropriate.

Once you are at the Linux Desktop, click on the "Applications" menu in the top left, then click on "System Tools", then select "MATE Terminal".

In the terminal window, run run-openmm-setup.

A web browser should appear with the OpenMM Setup program shown.

OpenMolcas

Run the following at the Linux prompt:

module load python3.7
module load openmolcas
pymolcas

Sample submit script for HPC Grid:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m be

module load python3.7
module load openmolcas

export MOLCAS_NPROCS=$NSLOTS
export MOLCAS_MEM=2000
export WorkDir=/mnt/hpc/tmp/MYACCOUNT/openmolcas

pymolcas 900.input

Be sure to replace "MYACCOUNT" in both lines above with your actual Bowdoin account name. If your Bowdoin account is "jsmith", then it would look like:

#$ -M jsmith@bowdoin.edu -m be

export WorkDir=/mnt/hpc/tmp/jsmith/openmolcas

To submit to the HPC Grid requesting the use of 8 CPU cores, use:

qsub -pe smp 8 (name of script)

Orca

Run the following at the Linux prompt:

module load orca
orca

To run on the HPC Cluster headnode:

For simple single CPU processing: hpcsub -cmd "module load orca; /mnt/local/orca/orca myinputfile.inp > myoutputfile.txt"

For SMP processing (8 cpu cores shown):

The myinputfile.inp should have "Opt PAL8" in the top line for 8 CPU cores, to match the "-n 8" for 8 cores requested on the Cluster.

Run the command:

hpcsub -N 1 -n 8 -cmd "module load mpi/openmpi-x86_64; module load orca; /mnt/local/orca/orca myinputfile.inp > myoutputfile.txt"

For 16 CPU cores:

"Opt PAL16" at the top of the myinputfile.inp, and run the command:

hpcsub -N 1 -n 16 -cmd "module load mpi/openmpi-x86_64; module load orca; /mnt/local/orca/orca myinputfile.inp > myoutputfile.txt"

PETSc

  • PETSc (Portable, Extensible Toolkit for Scientific Computation), pronounced PET-see (the S is silent), is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.
  • Installed in /usr/local/petsc
  • PETSc Web Page
  • Online Documentation

Note that to run PETSc code, which relies on MPICH2, it must be submitted via the Grid Batch system and not run interactively from the command line, as shown on the Grid web page.

An example script called "gridtest.sh" used to test the Petsc examples is:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M (MY_BOWDOIN_EMAIL_ACCOUNT)@bowdoin.edu -m be
#$ -o gridtest.out

export MPIEXEC_RSH=rsh
export PATH=/usr/local/mpich2/bin/:$PATH
mpiexec -rsh -nopm -n $NSLOTS -machinefile $TMPDIR/machines (FULL_PATH_TO_MY_HOME_DIRECTORY)/petsc/src/snes/examples/tutorials/ex19
mpiexec -rsh -nopm -n $NSLOTS -machinefile $TMPDIR/machines (FULL_PATH_TO_MY_HOME_DIRECTORY)/petsc/src/snes/examples/tutorials/ex5f

Be sure to change the header to reflect your e-mail account, and paths to where you have your code.

Submit it to the Grid by running a command similar to the following on the Grid headnode (moosehead):

qsub -pe mpich2 2 gridtest.sh

Phyluce

  • Installed in /mnt/local/miniconda2/bin
  • Phyluce Home Page
  • Phyluce Software Repository on Github
  • To run Phyluce interactively, you must first type the command "module load phyluce" to setup the environment.
  • To run on the HPC Grid, create a script containing your Phyluce commands.

An example script called "phylucetest.sh" might look like:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M (MY_BOWDOIN_EMAIL_ACCOUNT)@bowdoin.edu -m be

module load phyluce

# Place your commands below this line
for i in *;
do
phyluce_assembly_get_fastq_lengths --input $i/split-adapter-quality-trimmed/ --csv;
done

Be sure to replace "**(MY_BOWDOIN_EMAIL_ACCOUNT)**" with your actual Bowdoin account name.

The "export PATH" line MUST be present for Phyluce to work properly.

Submit it to the Grid by running a command similar to the following on the Grid headnode (moosehead):

qsub phylucetest.sh

Picard

To run with the default 2 GB of memory:

picard

If you need more than 2 GB of memory, copy /mnt/local/picard/picard.sh into your own directory, edit as desired, and run picard from your own copy of the script instead of the default one, i.e. "./picard.sh".
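
A minimal sketch, assuming the script sets the Java heap with an -Xmx option and using hypothetical filenames:

cp /mnt/local/picard/picard.sh ~/my-picard.sh
# edit ~/my-picard.sh and raise the Java heap, e.g. change -Xmx2g to -Xmx8g for 8 GB
~/my-picard.sh (your picard command and options here)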

Pimass

Run the following at the Linux prompt to view the command line options:

pimass -h

Ploticus

  • To run Ploticus at the command line, run module load ploticus, then "pl" at the Linux prompt.
  • To run Ploticus in batch mode on our Grid cluster, login to the head node of the Grid (moosehead), cd to the directory containing your data file(s), and run plsub (filename) (options) where (filename) is the name of your data file, and (options) is whatever options you wish to pass to the Ploticus program.
  • Ploticus Home Page

Polymake

  • Polymake is available via the HPC Web Portal within a Linux Desktop Session, and via an HPC Batch job
  • Polymake Home Page

Running Polymake Interactively

Open an HPC Linux Desktop Session (details at Use the HPC Web Portal (Open OnDemand) in the Related Articles section)

Within the Linux Desktop Session, go to the Applications menu in the upper left, select Bowdoin, then Polymake

Running a Polymake script as an HPC batch job

Create a Slurm submit script called myscript.sh that looks similar to the following, replacing "scriptfile" with the filename of your Polymake script, and ARG1, ARG2, etc with your values:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load polymake
polymake --script scriptfile ARG1 ARG2

From the HPC login node, cd into the directory containing your Polymake script, and submit this to the HPC Cluster with the sbatch command:

sbatch myscript.sh

To request more memory, you can use the --mem option. For example, to request 16G of memory, do:

sbatch --mem=16G myscript.sh

Porechop

To run the software:

/mnt/local/porechop/porechop-runner.py

PyFerret

To run from the Linux command line:

module load pyferret
pyferret
  • FerretDatasets are installed in /mnt/local/FerretDatasets
  • Installed in /mnt/local/pyferret
  • PyFerret Web Page

Please note that pyferret requires Python 3.6.

To run in Jupyterhub, open a Python 3.6 notebook, then put the following at the top of your code:

import sys
sys.path.append('/mnt/local/pyferret/lib/python3.6/site-packages')
import pyferret
pyferret.addenv(FER_DIR='/mnt/local/pyferret', FER_DAT='/mnt/local/FerretDatasets')
pyferret.start(journal=False, quiet=True, unmapped=True)
%load_ext ferretmagic

Then you can run pyferret:

for i in [100,500,1000]:
    %ferret_run -s 400,400 'plot sin(i[i=1:%(i)s]*0.1)' % locals()

To run on the HPC Grid, you would create a submit script called myscript.sh that looks like:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m b -m e

module load pyferret
pyferret (options)

Don't forget to replace MYACCOUNT with your actual Bowdoin account name.

Submit to the HPC Grid on machine moosehead with "qsub myscript.sh"

Python 3.12

Python 3.12 is the default Python environment on our systems as of Aug 2024.

Python 3.11

Python 3.11 is also available. Run module load python3.11 at the command line to prepare the environment; then you can run "python".

Running Python on the HPC Cluster

To run Python 3.11 on the Slurm HPC Cluster, create a job script named myscript.sh that looks like this example, and place it in the directory with your Python files.

For Python 3.12, remove the "module load python3.11" line.

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load python3.11

# Replace the following command with your own - this is an example
python (your-python-file.py)

Login to the HPC Cluster headnode (or get shell access through the HPC Web Portal, Clusters menu, "Slurm HPC Cluster Shell Access")

cd to the directory containing your Python files.

Submit it to the Slurm Cluster with:

sbatch myscript.sh

Qiime2

Run the following at the Linux prompt:

use-qiime2
qiime

"use-qiime2" sets up the environment so Qiime2 can run. You need to type this once at the start of your login session.

Once you have set up the environment, run qiime to run the software.

Qmcpack

To enable, then "qmcpack" at the Linux prompt.

module load qmcpack
qmcpack

A sample HPC submit script called myscript.sh might look like:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load mpi/openmpi-x86_64
export OMP_NUM_THREADS=1
module load qmcpack
mpiexec -n $SLURM_NPROCS qmcpack input.file

To run on 8 CPU cores, submit to the Slurm HPC Cluster with:

sbatch -n 8 myscript.sh

Qualimap

Run the following at the Linux prompt:

module load qualimap
qualimap

QuantumExpresso

Load the module, then run the desired command at the Linux prompt:

module load quantumexpresso

QuantumExpresso is set up for OpenMPI computations, and should be submitted as a job on the HPC Grid with at least 2 CPU cores for best results. Please see the OpenMPI and MPICH section of the Batchcluster page for information about running OpenMPI jobs.

Please note that there are reported issues when trying to run QuantumExpresso with more than 50 CPU cores.

A sample HPC submit script called myscript to run on 4 CPU cores might look like:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYACCOUNT@bowdoin.edu -m b -m e

module load quantumexpresso
mpiexec -np $NSLOTS neb.x -nk 1 -nd 1 -nb 1 -nt 1 -inp H2+H.in > H2+H.out

Don't forget to replace MYACCOUNT with your actual Bowdoin account name.

To run with 4 CPU cores, use:

qsub -pe openmpi 4 myscript

R

We have four ways of running R: via RStudio Workbench, Jupyterhub, the free RStudio Desktop, and batch submissions to the HPC Cluster.

RStudio Workbench runs on a small standalone server and is essentially for class use. It is not suitable for running large, computationally intensive, or long-term jobs.

RStudio Desktop is available through the HPC Web Portal Linux desktop sessions. It is suitable for small to large jobs (that need more computational resources) that are short term in duration (< 24 hours).

Jupyterhub is suitable for small to medium jobs that are short term in duration (< 24 hours).

R via batch jobs on the HPC Cluster is suitable for large, long-term, computationally intensive jobs.

This page describes how to use R from the Linux command line and submit jobs to the HPC Cluster.

To submit single threaded (uses 1 CPU core) R jobs to the HPC Cluster, use the "hpcsub" command on the HPC Cluster headnode.

For example, if the R input file is named mycode.r, type

hpcsub -cmd R CMD BATCH mycode.r

Some R libraries can run in parallel using multiple threads on the same machine (more than 1 CPU core).

For example, to submit "mycode.r" using 8 CPU cores:

hpcsub -N 1 -n 8 -cmd R CMD BATCH mycode.r

Note that within the R program, you must specify the same number of CPU cores that you are requesting in the "hpcsub" command.
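
For example, a minimal sketch assuming a hypothetical mycode.r that uses the parallel package on 8 cores:

# mycode.r should request the same core count, e.g. parallel::mclapply(..., mc.cores = 8)
hpcsub -N 1 -n 8 -cmd R CMD BATCH mycode.r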

If you need more memory to run your job, you can use the "--mem" option. For example, to request 40 GB of memory:

hpcsub --mem=40G -cmd R CMD BATCH mycode.r

You can also create job scripts if you have more complicated job runs.

Here is a sample HPC job script with the filename called myscript.sh:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

R CMD BATCH mycode.r

Job scripts are submitted to the Slurm Cluster with the command "sbatch myscript.sh" from the HPC Cluster headnode. You can also use the same above options for requesting CPUs, memory, etc, like "sbatch --mem=40G myscript.sh".

RStudio

  • Bowdoin offers two options for accessing RStudio: RStudio Workbench, and RStudio on the HPC Cluster
  • RStudio Workbench is recommended for classes running small jobs, and can be found at "https://rstudio.bowdoin.edu".
  • RStudio on the HPC is available via the Linux Graphical Desktop session, and is recommended for larger jobs needing lots of memory and CPU cores
  • RStudio Web Page

RStudio is a web based, graphical development environment for writing and running R code.

RStudio Workbench

Please note if you are off campus, you will need to connect to the Bowdoin VPN service before you can access the RStudio Workbench web page. Please see the VPN setup article in the Bowdoin IT Knowledgebase for information about setting up the VPN on your computer/iPad.

To access RStudio Workbench, use the web browser on your computer or iPad to go to "https://rstudio.bowdoin.edu" and login with your Bowdoin login account. Note, do not type "@bowdoin.edu" after your login account. If your account is "jsmith", then just enter jsmith with nothing after it.

RStudio on the HPC Cluster

RStudio on the HPC is accessed via the Linux Graphical Desktop session through the HPC Web Portal.

Information on opening a Desktop session can be found at Use the HPC Web Portal (Open OnDemand) in the Related Articles section

Once you are at the Linux Desktop, click on the "Applications" menu in the top left, then click on "Bowdoin", then select "RStudio".

It may take 15-30 seconds for the RStudio application to open.

iPads

Using the iPad to Upload Files into RStudio Workbench

You might want to use a file provided by a faculty member or other source, and are curious how you might do that from an iPad. Here's how:

Using the Safari web browser on the iPad, go to the web page that has the file on it.

Click and hold on the link to the file (hold your finger on it) until a pop up menu appears.

Select "Download Linked File". The file is now downloaded to your iPad.

Login to RStudio Workbench.

In the lower right hand corner of the screen, click on the "Files" menu.

Click on the "Upload" button.

Under the "File to Upload" option, click on the "Browse" button.

You should see a list of the files that have been downloaded to your iPad. Click on the one you wish to upload.

Click on the "OK" button. The file should now upload.

Sometimes the iPad adds a ".txt" ending to the file which will need to be removed before we can run it.

Click on the square box to the left of the filename, then click on the "Rename" button above and slightly to the right of the file list. Click in the text field and remove the ".txt" ending, then click the blue "OK" button.

RGT

Interactive Use

To run RGT interactively:

source /mnt/local/python-venv/RGT/bin/activate

then you can run the various RGT commands.

Running on the Slurm HPC Cluster

If you are submitting to the Slurm HPC Cluster, your submit script would look something like this:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

source /mnt/local/python-venv/RGT/bin/activate
(your RGT commands here)

rdseed

To run the software (this points to "/mnt/local/rdseed/rdseed.rh6.linux_64"):

rdseed

RADtools

  • RADtools consists of four commands, "RADmarkers", "RADMIDS", "RADpools", and "RADtags"
  • Installed in /usr/local/radtools
  • RADtools Web Page
  • Please refer to the RADManual.pdf file found on the RADtools Web Page for information on how to run and use this software

RaxML

There are two versions of RaxML available, the "light" version, and the SMP parallel version.

For the "light" version:

Run the following at the Linux prompt:

raxmlLight
  • Example usage "raxmlLight -m GTRCAT -s /mnt/local/raxml/examples/dna.phy -t RaxML_parsimonyTree.startingTree -n TreeInference"
  • Installed in /mnt/local/raxml
  • RaxML Web Page

For the SMP version, which is mostly only useful within the HPC Grid:

Create an HPC Grid submit script called myscript.sh. An example using 8 CPU cores would be:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MYBOWDOINACCOUNT@bowdoin.edu -m b -m e

raxmlHPC-PTHREADS -T 8 (rest of raxml options)

Submit to the Grid from machine moosehead with

qsub -pe smp 8 myscript.sh

Please note the "8" after the -T option to raxml, and the matching "8" in the qsub command SMP option. These two numbers must match or the HPC Grid will likely kick the job back.

rCorrector

To run:

rcorrector

Sac

To load the Sac module (needed only once per login session):

module load sac

For example, to run "sac" or any of its related programs, you would first:

module load sac once in your login session

then type:

sac

Sage

Run the following at the Linux prompt:

module load sage
sage
  • Installed in /mnt/local/sage
  • Sage Web Page and Documentation
  • Also available within About High-Performance Computing (HPC) at Bowdoin in the Related Articles section. Please note that it can take 30 seconds or more for the software to initialize and be ready to run after you open a Sage notebook.

You can safely ignore warnings similar to "Not using mpz_powm_sec. You should rebuild using libgmp >= 5 to avoid timing attack vulnerability." Sage tech support says this is not an issue (http://ask.sagemath.org/question/25183/powminsecurewarning-how-do-i-rebuild-using-libgmp-5/)

Sambamba

Run the following at the Linux prompt:

module load sambamba
sambamba

SAM Tools

Load the module; then you can use the samtools commands ("samtools", "tabix", etc.):

module load samtools
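
For example, a hypothetical conversion of a SAM file to BAM:

samtools view -bS aln.sam > aln.bam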

SeaDAS

  • To run graphical, interactive SeaDAS, simply run module load seadas, then "seadas" at the Linux prompt.
  • SeaDAS Web Page

SeqKit

Run the following at the Linux command prompt:

seqkit

Seqtk

Interactive Use

To run Seqtk interactively:

module load seqtk

then you can run "seqtk".

Running on the HPC Grid

If you are submitting to the HPC Grid, your submit script would look something like this:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M MY-BOWDOIN-ACCOUNT-NAME@bowdoin.edu -m be

module load seqtk
seqtk (args)

Please remember to replace "MY-BOWDOIN-ACCOUNT-NAME" with your actual Bowdoin login account name.

Singularity

Singularity is a container solution designed around HPC environments. The primary use on our systems is to allow people to run pre-built Docker containers on the HPC systems (both interactive systems like dover as well as the Grid). Currently we do not support the creation of containers within the HPC systems; however, if you have Docker or Singularity set up on your workstation, you can create the container there and transfer it to the HPC environment.

For a quick command summary:

singularity --help

Quick example:

Download the "lolcow" Docker image from Docker Hub, and build a Singularity image from it:

singularity build lolcow.sif docker://godlovedc/lolcow

This creates an executable file named "lolcow.sif".

To run it at the command line on the current machine, run:

./lolcow.sif

To submit it as a batch job to the HPC cluster, type:

hpcsub -cmd lolcow.sif
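
Alternatively, a minimal submit script following the batch pattern used elsewhere in this article:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
./lolcow.sif

Submit it with sbatch myscript.sh.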

S-Lang

Load the module once during your login session, then run "slsh" at the Linux prompt:

module load slang
slsh

The SLIRP package is also installed in the S-Lang directory.

Snakepipes

Installing your own copy of snakepipes

The examples below use an account name of ***jsmith***.

Please change this to your own account name and remove the asterisks ***.

Log in to dover or foxcroft[.bowdoin.edu].

Change directory to your home directory:

cd

Remove the old .condarc file if you have one. It is okay if this returns "rm: cannot remove: No such file or directory"

rm .condarc

Remove the old .conda directory if you have one. It is okay if this returns "rm: cannot remove: No such file or directory"

rm -rf .conda

Log out of the Linux machine, then log back in to clear any old Conda settings in your account.

Change directory to your research space, for example:

cd /mnt/research/mpalopol/students/***jsmith***/

Create the package directory:

mkdir pkgs

Clear the PYTHONPATH variable:

unset PYTHONPATH

Install miniconda:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Press Enter to continue, and use the space bar to scroll through the license agreement.

At the "Do you accept the license terms?" prompt.

yes

Type in the directory of your research space followed by miniconda3 for the install location:

/mnt/research/mpalopol/students/***jsmith***/miniconda3

When asked whether to run conda init, type:

no

Enable miniconda:

source /mnt/research/mpalopol/students/***jsmith***/miniconda3/bin/activate

Set some Conda settings:

conda config --set auto_activate_base false
conda config --set safety_checks disabled
conda config --add pkgs_dirs /mnt/research/mpalopol/students/***jsmith***/pkgs

Create the snakepipes conda environment:

conda update --yes conda
conda install --yes python-Levenshtein
conda install --yes mamba -c conda-forge
mamba create --yes -n snakePipes -c mpi-ie -c conda-forge -c bioconda snakePipes

You've now installed snakepipes. To use it in the future, run the following commands to set up the environment:

unset PYTHONPATH
source /mnt/research/mpalopol/students/$USER/miniconda3/bin/activate
conda activate snakePipes

For example, to run on the HPC Cluster, your submit script would look like:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
unset PYTHONPATH
source /mnt/research/mpalopol/students/$USER/miniconda3/bin/activate
conda activate snakePipes
snakePipes createEnvs
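
Submit the script with sbatch; the memory request below is an illustrative value, so size it to your dataset:

sbatch --mem=16G myscript.sh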

SRA Toolkit

To enable the commands, then "fastq-dump", etc.

module load sratoolkit

You may need to run "vdb-config -i" to set the download path before the software will work.

Sample HPC Submit script

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load sratoolkit

fastq-dump --stdout -X 2 SRR390728

Stacks

  • Installed in /mnt/local/stacks-1.37 and /mnt/local/stacks-2.2
  • Stacks Web Page
  • Stacks Manual
  • Please reference the manual for information on how to run the command line version of Stacks

To enable Stacks 1.37, type module load stacks-1.37

To enable Stacks 2.2, type module load stacks-2.2

Once enabled, you can then type the Stacks command you wish to run (e.g., sstacks, ustacks).
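
A minimal submit script, following the batch pattern used elsewhere in this article (the Stacks command and its arguments are placeholders for your own analysis):

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
module load stacks-2.2
ustacks (your arguments here)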

Stampy

At the Linux prompt to view the command line options

stampy.py
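
For reference, a typical workflow builds a genome index, builds a hash table, then maps reads. File names below are placeholders; see "stampy.py --help" and the Stampy documentation for details:

stampy.py -G mygenome genome.fa
stampy.py -g mygenome -H mygenome
stampy.py -g mygenome -h mygenome -M reads.fq > out.sam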

Star

Interactive Use

To run Star interactively:

module load star

then you can run the various commands, such as "STAR", "STARlong", etc.

Please note the program name is in all upper case - STAR.

Running as an HPC Batch Job

If you are submitting a batch job to the HPC cluster, your submit script would look something like this:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load star
STAR (args)

Submit with sbatch myscript.sh.
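
For reference, a typical alignment invocation looks something like this (file and directory names are placeholders; consult the STAR manual for your data):

STAR --runThreadN 8 --genomeDir (your index directory) --readFilesIn reads_1.fq reads_2.fq

If you use --runThreadN 8, request matching cores with sbatch -N 1 -n 8 myscript.sh.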

Subversion

  • We offer a Subversion service to the Bowdoin community. If you are interested in using this service, please contact us so that we can set up your account for access.

At the Linux prompt for quick options.

svn --help

At the Linux prompt for additional documentation.

man svn
  • Subversion on-line book: an excellent resource for learning about Subversion.
  • Please note that to access the SVN service from off campus, you will need to establish a VPN connection to the campus network first.

Once your account has been setup for access, the general URL for using SVN is in the format "https://repo.bowdoin.edu/svn/(login_account_name)/(repository_name)".

If your login account is "jsmith", and you have multiple projects named "project1", "project2", and "project3", you would use:

https://repo.bowdoin.edu/svn/jsmith/project1

https://repo.bowdoin.edu/svn/jsmith/project2

https://repo.bowdoin.edu/svn/jsmith/project3

To import your files into SVN for the first time, you would:

  • cd into the directory containing your files
  • svn import https://repo.bowdoin.edu/svn/(login_acct_name)/(repository_name) -m "(comment)"

where (login_acct_name) is your Bowdoin login account, (repository_name) is the name you want to call the subdirectory in Subversion, and (comment) is a comment you wish to add to the operation.

It will ask you for your Bowdoin login and password, and may also ask you to verify the security certificate if this is the first time you've ever connected to the SVN service.

To check out the files:

svn checkout https://repo.bowdoin.edu/svn/(login_acct_name)/(repository_name)

which will create a local subdirectory named "repository_name", and place the files in that subdirectory.
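
After checking out, a typical edit-and-commit cycle looks like this (the file name and comment are placeholders):

cd (repository_name)
svn add (new_file)
svn commit -m "(comment)"
svn update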

SuperMongo

At the Linux prompt.

module load supermongo
sm

Tesseract

To enable the commands, then "tesseract", "cntraining", etc.

module load tesseract

You will need to download training data into your own account before you can use the Tesseract software.

Detailed instructions can be found at https://tesseract-ocr.github.io/tessdoc/Compiling-%E2%80%93-GitInstallation.html#post-install-instructions

Look at the section that starts with "If you want to put the traineddata files in a different directory".

If you are only going to run this within the HPC Cluster, you don't need to edit the ".bashrc" file, just put the "export TESSDATA_PREFIX" command in the HPC job script as shown below.

Essentially, you need to place the training data into a directory that you own, and then use the "TESSDATA_PREFIX" environment variable to point to this location.

For example, with training data located in directory "/mnt/research/myresearchspace/tessdata", you would use:

export TESSDATA_PREFIX="/mnt/research/myresearchspace/tessdata"

Sample HPC Submit script

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load tesseract

export TESSDATA_PREFIX="/mnt/research/myresearchspace/tessdata"

tesseract (your options here)
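
For example, to OCR a single image to a text file (file names are placeholders; this writes myoutput.txt):

tesseract myimage.png myoutput -l eng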

Tophat

TorchMD

Please note that this software does not currently work.

Running as an HPC Batch Job

Sample submit script named "myscript.sh" for the HPC Cluster:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate torchmd

python (your TorchMD program).py

Replace "(your TorchMD program)" with the name of your Python program.

To submit to the HPC cluster using 8 GB of memory on a CPU system:

sbatch --mem=8G myscript.sh

To use a GPU system and 32 GB of CPU memory:

sbatch -p gpu --gres=gpu:rtx2080:1 --mem=32G myscript.sh

Tracer

At the Linux prompt.

tracer

Tracker

To run the interactive GUI, log in to one of the HPC interactive machines such as dover, foxcroft, or pauling[.bowdoin.edu], then run:

module load tracker
tracker.sh

This should open the GUI.

Transdecoder

To enable the commands:

module load transdecoder

Then you can run TransDecoder.LongOrfs, TransDecoder.Predict, etc.
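
A typical two-step run on a transcript FASTA file looks like this (the file name is a placeholder; see the TransDecoder documentation for options):

TransDecoder.LongOrfs -t transcripts.fasta
TransDecoder.Predict -t transcripts.fasta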

Trimmomatic

Trimmomatic is a Java application and is run as:

java -jar /mnt/local/trimmomatic/trimmomatic.jar (your processing arguments)

Examples are given on the Trimmomatic web page.

Trinity

At the Linux prompt.

module load trinity
Trinity

Please note the program name is capitalized - Trinity.

Sample HPC Submit script

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load trinity

Trinity (your arguments here)
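
For reference, a typical paired-end invocation looks something like this (file names are placeholders; match the --CPU value to the cores you request):

Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 8 --max_memory 20G

Submit with, for example: sbatch -N 1 -n 8 --mem=20G myscript.sh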

VCFlib

To enable the commands:

module load vcflib

Velvet

And "velveth" at the Linux prompt to view the command line options

velvetg
velveth
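
A typical two-step assembly first hashes the reads, then builds the assembly (the directory name, hash length of 21, and file name are placeholders):

velveth assembly_dir 21 -fastq -short reads.fq
velvetg assembly_dir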

WinPCA

Running as an HPC Batch Job

Sample submit script named "myscript.sh" for the HPC Cluster:

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL

module load miniconda3
source /mnt/local/miniconda3/etc/profile.d/conda.sh
conda activate winpca

winpca (your arguments here)

Replace "(your arguments here)" with options appropriate to your analysis.

An example to submit to the HPC cluster using 16 GB of memory:

sbatch --mem=16G myscript.sh

XMM-SAS

XMM-SAS requires HEASoft, so first you must initialize HEASoft by running:

source $HEADAS/headas-init.sh

Then you can initialize XMM-SAS by running:

source /mnt/local/xmm-sas/xmmsas_20230412_1735/setsas.sh

Then you will need to define SAS_CCFPATH to point to the directory where you have stored the Calibration Files.

export SAS_CCFPATH=(path to your Calibration Files)

Once you have chosen the XMM-Newton Observation to analyze, SAS_CCF and SAS_ODF must be defined specifically for it.

export SAS_CCF=(path to CCF files)
export SAS_ODF=(path to ODF files)
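
Putting it all together, an HPC batch script might look like this sketch (the paths are placeholders for your own calibration and observation data):

#!/bin/bash
#SBATCH --mail-type=BEGIN,END,FAIL
source $HEADAS/headas-init.sh
source /mnt/local/xmm-sas/xmmsas_20230412_1735/setsas.sh
export SAS_CCFPATH=(path to your Calibration Files)
export SAS_CCF=(path to CCF files)
export SAS_ODF=(path to ODF files)
(your SAS commands here)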

Image Manipulation and Viewing

Gimp

At the Linux prompt.

gimp

Editing and Text Processing

LaTeX

At the command prompt.

latex

Please note that the default paper size in LaTeX and dvips is A4. To set the default paper size to "US Letter", type the following commands:

texconfig dvips paper letter
texconfig xdvi paper us

You only need to type these two commands once. They will write a config file in your home directory that will keep the default at "US Letter".

To convert a LaTeX file to PostScript:

latex (filename).tex
dvips (filename).dvi -o (filename).ps

The latex command produces the .dvi file, which you then pass to the dvips command; (filename) is the name of your file.

OpenOffice

At the Linux prompt.

ooffice

Additional editing and text processing tools

  • OpenOffice — office productivity suite
  • Emacs — extensible text editor
  • gedit — GUI text editor
  • LaTeX — document preparation system for typesetting
  • evince — PDF and PostScript viewer
  • GhostView (gv) — PostScript viewer

Additional Help

If you need further assistance or want to request new software, you have several options:

  • Bowdoin Bot: Chat with Bowdoin Bot directly from any KB page for instant answers.
  • Phone: Call the Bowdoin College Service Desk at (207) 725-3030.
  • In person: Visit the Tech Hub in Smith Union during business hours.
  • Submit a ticket: Request assistance through the Service Catalog.


AI-assisted content: This article was drafted with the assistance of an AI writing tool and reviewed by Bowdoin IT staff for accuracy.


Related Articles

Instructions for submitting, monitoring, and managing jobs on the Bowdoin HPC Slurm cluster. Covers writing job scripts, using sbatch and the hpcsub wrapper, running parallel processing jobs (SMP and OpenMPI), running interactive jobs, and controlling jobs with squeue and scancel.
The Bowdoin HPC Web Portal (Open OnDemand) provides browser-based access to the HPC environment for command-line sessions, graphical applications, file management, and job monitoring. The portal is accessed at hpcweb.bowdoin.edu using Firefox or Chrome.