HPC Cluster Queues, Policies, and Hardware Overview

Summary

Reference information for the Bowdoin HPC Slurm cluster, including queue (partition) descriptions, job policies and resource limits, and a hardware overview suitable for grant proposals.

Body

Questions

  • What queues are available on the Bowdoin HPC cluster?
  • What are the resource limits for HPC jobs?
  • What is the maximum number of CPU cores I can request?
  • What is the maximum runtime for an HPC job?
  • How many jobs can I run at the same time?
  • What is the difference between the main, gpu, and highmem queues?
  • What hardware does the Bowdoin HPC cluster have?
  • How do I describe the HPC cluster in a grant proposal?

Environment

This article is a reference for Bowdoin faculty, students, and researchers using the HPC Slurm cluster. It covers cluster queues (also called partitions in Slurm), job policies, resource limits, and hardware specifications.

Resolution

Queue (Partition) Descriptions

The Bowdoin HPC Slurm cluster organizes resources into queues (Slurm calls these "partitions"). When submitting a job, you can specify which queue to use with the -p option. If you do not specify a queue, your job is submitted to the main queue by default.

  • main — the default queue. Provides access to standard compute nodes with up to 370 GB of RAM per node.
  • gpu — for jobs that require GPU computing. You must also specify the GPU type with the --gres option. See Use GPU and High-Memory Resources on the HPC Cluster in the Related Articles section.
  • highmem — for jobs that require more than 370 GB of RAM per node, up to 2 TB. Use the --mem option to specify the amount of memory needed.
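As a sketch of how these options fit together (the script name, memory amount, and program below are placeholders, not Bowdoin-specific values), a job can be directed to a queue either with `#SBATCH` directives in the job script or on the `sbatch` command line:

```shell
# Hypothetical example: write a minimal job script that targets the highmem queue.
# The resource amounts and program name are placeholders; adjust them for your job.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH -p highmem        # queue: main (the default), gpu, or highmem
#SBATCH --mem=500G        # highmem queue: request the memory your job needs
./my_analysis             # replace with your actual program
EOF

# The same choices can be made on the command line instead of in the script:
#   sbatch myjob.sh                          # main queue (the default)
#   sbatch -p gpu --gres=gpu:1 myjob.sh      # gpu queue; see the GPU article for type strings
#   sbatch -p highmem --mem=500G myjob.sh    # highmem queue
```

Options given on the `sbatch` command line override the corresponding `#SBATCH` directives in the script, so the script can hold your usual defaults.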

Job Policies and Resource Limits

The following policies and limits apply to jobs submitted to the HPC cluster. These limits help ensure fair access to shared resources across the Bowdoin HPC community.

Note: Specific numeric limits for maximum concurrent jobs, maximum runtime per job, and maximum cores per user may change as the cluster is expanded or reconfigured. Run sacctmgr show qos format=Name,MaxWall,MaxTRESPerUser on the cluster headnode to see the current limits, or contact the IT Service Desk for the latest information.

General policies include:

  • Jobs are scheduled on a first-come, first-served basis.
  • Each job is assigned a priority based on submission time and resource request.
  • Email notifications are available for job start, completion, and failure (configured with #SBATCH --mail-type=BEGIN,END,FAIL in your job script).
  • Jobs that exceed their requested resources (memory, time) may be terminated by the scheduler.
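For example, a job script can state its runtime and memory requests explicitly and enable the e-mail notifications described above. The time limit, memory amount, e-mail address, and program name below are placeholders; substitute your own values:

```shell
# Hypothetical job script illustrating the policies above; all values are placeholders.
cat > notify_job.sh <<'EOF'
#!/bin/bash
#SBATCH --time=24:00:00                # requested runtime; the scheduler may end the job if exceeded
#SBATCH --mem=16G                      # requested memory; the scheduler may end the job if exceeded
#SBATCH --mail-type=BEGIN,END,FAIL     # e-mail on job start, completion, and failure
#SBATCH --mail-user=you@bowdoin.edu    # placeholder address
./my_analysis                          # replace with your actual program
EOF
# Submit with: sbatch notify_job.sh
```

Requesting only the time and memory you actually need also tends to help your job start sooner, since smaller requests are easier for the scheduler to place.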

Hardware Overview

The Bowdoin HPC Cluster consists of the following hardware (suitable for inclusion in grant proposals):

  • CPU cores: approximately 1,400 cores spread across multiple compute nodes
  • Compute nodes: ranging from 16 to 192 CPU cores and 192 GB to 2 TB of RAM per node
  • GPU cards: approximately 20 NVIDIA GPU cards, including RTX 3080, RTX 2080 Ti, RTX 5090, A100, and RTX PRO 6000 Blackwell models
  • Networking: 2x100 Gb/s low-latency Ethernet per node (200 Gb/s aggregate)
  • Storage: dedicated, redundantly configured Gluster high-speed networked filesystem for temporary scratch storage
  • Operating system: Rocky Linux
  • Job scheduler: Slurm Workload Manager
  • Parallel processing: single-threaded, SMP (shared memory), and OpenMPI environments
  • Experimental: NVIDIA Grace Hopper integrated GPU-CPU system

HPC Environment Status

A live status dashboard showing the current state of HPC resources is available at hpc.bowdoin.edu/status.

Additional Help

If you need further assistance, you have several options:

  • Bowdoin Bot: Chat with Bowdoin Bot directly from any KB page for instant answers.
  • Phone: Call the Bowdoin College Service Desk at (207) 725-3030.
  • In person: Visit the Tech Hub in Smith Union during business hours.
  • Submit a ticket: Request assistance through the Service Catalog.

AI-assisted content: This article was drafted with the assistance of an AI writing tool and reviewed by Bowdoin IT staff for accuracy.

Details

Article ID: 173094
Created
Thu 5/14/26 2:05 PM
Modified
Thu 5/14/26 2:06 PM

Related Articles (3)

Bowdoin College provides a Linux-based High-Performance Computing (HPC) cluster for faculty, students, and researchers. The cluster offers approximately 1,400 CPU cores, GPU computing, up to 2 TB of RAM per node, and a variety of scientific software. This article provides an overview of HPC resources and how to get started.
Instructions for submitting, monitoring, and managing jobs on the Bowdoin HPC Slurm cluster. Covers writing job scripts, using sbatch and the hpcsub wrapper, running parallel processing jobs (SMP and OpenMPI), running interactive jobs, and controlling jobs with squeue and scancel.
Instructions for requesting GPU computing, high-memory nodes, and other specialized resources on the Bowdoin HPC Slurm cluster. Covers available NVIDIA GPU cards and request syntax, memory reservation options, mixed GPU and CPU jobs, and the experimental NVIDIA Grace Hopper system.