Using the G2 cluster

Introduction

  • The G2 cluster is an Ubuntu 20.04 replacement for the graphite cluster.
  • To gain access to G2, a researcher or research group must purchase an NFS server and a compute node.
    • Create a ticket via the help-ticket system to find out system requirements and to acquire quotes for the purchases.
  • Participating groups’ nodes are added to both the member and private preemptive partitions (queues). All member nodes are available to the member community via the low- and medium-priority partitions; however, jobs submitted to a group’s high-priority (private) partition immediately preempt other users’ jobs on that group’s nodes, reclaiming them as needed.
  • If you are a member of a research group that has G2 research nodes, you may request an account via the help-ticket system.
    • Please explicitly include your research group information in the ticket.
    • cc your PI on the request.
  • Users log in with their Cornell NetID credentials from on-campus or via a Cornell VPN.
  • Assume that there are no backups of any data.  Some research groups may subscribe to EZ-Backup but most do not.

Directories

  • Home directories are automounted onto /home from the NFS server purchased by your research group.
  • Data directories are automounted in /share/DATASERVER/export, where DATASERVER is a link to the NFS server purchased by your research group.
  • GPU servers contain “Scratch” directories to facilitate quickly moving data into GPU memory.
    • Scratch directories reside in /scratch.
      • This is a local directory on each GPU server and is not shared across the cluster.
      • There is a folder named /scratch/datasets that can hold common datasets and is usable by all groups.
      • The size of the /scratch directory varies from server to server.
      • Cleanup of the /scratch space is the responsibility of the user who writes data to it.
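
The scratch guidance above can be sketched as a small staging helper. This is a minimal sketch, not a site-provided tool: the function name stage_to_scratch, the SRC_DIR dataset path, and the per-user directory naming are assumptions made for illustration. Only /scratch and /share/DATASERVER/export come from the text above.

```shell
#!/bin/bash
# Sketch: copy a dataset into node-local scratch and print the new path.
# SCRATCH_ROOT defaults to /scratch (per the text above); SRC_DIR is a
# hypothetical dataset location on your group's NFS server.
SCRATCH_ROOT="${SCRATCH_ROOT:-/scratch}"
SRC_DIR="${SRC_DIR:-/share/DATASERVER/export/my_dataset}"

stage_to_scratch() {
    local work_dir
    # Create a uniquely named private directory so jobs do not collide.
    work_dir="$(mktemp -d "${SCRATCH_ROOT}/${USER:-user}_XXXXXX")" || return 1
    # Copy the dataset contents (including dotfiles) into scratch.
    cp -r "${SRC_DIR}/." "${work_dir}/" || return 1
    printf '%s\n' "${work_dir}"
}
```

In a job script, one might call WORK_DIR="$(stage_to_scratch)" and then set trap 'rm -rf "$WORK_DIR"' EXIT so the copy is removed when the job exits, in keeping with cleanup being the writer's responsibility.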

Software

  • Slurm v20.11.8 (workload manager/job scheduler)
  • Anaconda3 (Python environment: /share/apps/anaconda3/2021.05)
  • Singularity v3.7.0 (Docker compatibility: /share/apps/singularity/3.7.0)
  • OpenMPI 4.1.0 (default MPI capability)
  • MATLAB R2021a (compile on login node for execution)
  • MOSH
  • CUDA v11.4 || cuDNN v8.2.2
  • CUDA v11.2 || cuDNN v8.1.1 (default CUDA installation)
  • CUDA v10.2 || cuDNN v8.1.1
  • CUDA v10.1 || cuDNN v8.0.5

Where is MATLAB and how do I use it on compute nodes?

MATLAB is installed on the login node (g2-login.coecis.cornell.edu) in “/usr/local/MATLAB/R2021a”; it is not available directly on the compute nodes. Our recommended best practice for using MATLAB on the cluster is the MATLAB Compiler, which lets users build MATLAB executables and later run them in batch jobs without MATLAB being installed locally. This suits a cluster environment, where installing MATLAB on every node is prohibitively expensive. The MATLAB Compiler is included among the standard toolboxes that Cornell purchases.
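
As a rough illustration of the compile step, the following sketch builds a standalone executable on the login node. The source file name my_analysis.m is hypothetical, and the mcc location is assumed to be the bin directory of the install path given above. The -R -singleCompThread option is included on the assumption that the job requests a single core.

```shell
#!/bin/bash
# Sketch: compile a MATLAB script on the login node with the MATLAB Compiler.
# my_analysis.m is a hypothetical file name; the mcc path is assumed to sit
# under the install location given above.
MCC=/usr/local/MATLAB/R2021a/bin/mcc
SRC=my_analysis.m

if [ -x "$MCC" ] && [ -f "$SRC" ]; then
    # -m builds a standalone executable; -R -singleCompThread asks the
    # compiled program to run single-threaded.
    "$MCC" -m -R -singleCompThread "$SRC"
else
    echo "Skipping: need $MCC and $SRC (run this on the G2 login node)" >&2
fi
```

The build typically produces an executable named after the source file, which can then be called by its full path from a submission script.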

Can I get SciPy (or some other tool) installed?

Anaconda3 is available in /share/apps/anaconda3/2021.05/. Use Anaconda to build the Python environment and packages you need in your home directory.
For other packages that would help your research and are available via academic licensing, open a help ticket and we’ll work on getting them installed. We can potentially install commercial software if a license is purchased and honored.
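
A minimal sketch of building such an environment follows. The environment location ENV_PREFIX is a hypothetical path under your home directory, and the conda executable is assumed to sit in the bin directory of the Anaconda3 install named above.

```shell
#!/bin/bash
# Sketch: create a personal conda environment containing SciPy.
# ENV_PREFIX is a hypothetical location under your home directory.
CONDA=/share/apps/anaconda3/2021.05/bin/conda
ENV_PREFIX="${HOME}/conda-envs/scipy-env"

if [ -x "$CONDA" ]; then
    # --prefix places the environment at an explicit path that you own.
    "$CONDA" create --yes --prefix "$ENV_PREFIX" python scipy
else
    echo "Skipping: $CONDA not found (run this on the G2 cluster)" >&2
fi
```

Once created, the environment's interpreter is at "$ENV_PREFIX/bin/python" and can be called by its full path from a job script.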

Using G2

  • Log into the cluster via SSH at g2-login.coecis.cornell.edu using your Cornell NetID and password from on-campus or via a Cornell VPN.
    • You must submit jobs/processes to the cluster via the Slurm scheduler.
    • The login node is strictly for logging into and submitting jobs to the scheduler. Any process that uses more than a minimal amount of resources is not allowed.
    • The login node is a shared resource with limited resources. Heavy use can prevent others from accessing the cluster.
  • All jobs, including interactive jobs, must be submitted to a specific partition (or queue).
    • Partitions are listed below in preemption order. Please submit your job to the lowest-priority partition that meets its needs.
      • Low-Priority: For batch and interactive jobs requiring CPUs and/or GPUs, use the “default_partition” partition.
      • Medium-priority, GPU required: For batch and interactive jobs requiring GPUs, use the “gpu” partition.
        • This partition will preempt any jobs running on a GPU server which were submitted to the Low-Priority partition.
      • High-priority: For batch and interactive jobs requiring CPUs and/or GPUs, use the priority partition that belongs to your group.
        • Only the servers owned by the faculty to whom this priority partition belongs will be available through this partition.
        • This partition will preempt any jobs running on any server (owned by faculty to which it belongs) which were submitted to the Low/Medium-Priority partitions.
  • To monitor your job, you can log into any node where you have a job running.
    • You are limited to the resources that your original job requested.
      • Ex: If your original job asked for 1 GPU, that is all you will be able to see/use if you log into the node.
    • Your login will be canceled when your job ends.
    • NOTE: If you are running multiple jobs on different servers, this feature may not work as expected.
  • All interactive jobs submitted to the cluster are assigned to the interactive partition associated with the partition to which the job was submitted.
    • Ex: Submit an interactive job to the “default_partition” partition and it will be reassigned to the “default_partition-interactive” partition, with its associated “max time limit”.
    • Ex: Submit an interactive job to the priority partition “xxxxxx” and it will be reassigned to the “xxxxxx-interactive” priority partition, with its associated “max time limit”.
      • The “xxxxxx-interactive” partition will not preempt jobs submitted to the “xxxxxx” priority partition, but will preempt any jobs submitted to lower priority partitions.
      • For a priority interactive partition, the owner may request an extended “max time limit” for their own partition.

Create a SLURM Submission Script:

Example: test-gpu.sub
#!/bin/bash
#SBATCH -J test_file                         # Job name
#SBATCH -o test_file_%j.out                  # Output file (%j expands to jobID)
#SBATCH -e test_file_%j.err                  # Error log file (%j expands to jobID)
#SBATCH --mail-type=ALL                      # Request status by email
#SBATCH --mail-user=NETID@cornell.edu        # Email address to send results to
#SBATCH -N 1                                 # Total number of nodes requested
#SBATCH -n 1                                 # Total number of tasks requested (one core per task by default)
#SBATCH --get-user-env                       # Retrieve the user's login environment
#SBATCH --mem=2000                           # Server memory requested (per node, in MB)
#SBATCH -t 2:00:00                           # Time limit (hh:mm:ss)
#SBATCH --partition=default_partition        # Request partition
#SBATCH --gres=gpu:1080ti:1                  # Type/number of GPUs needed
/home/netid/server_name.sh

Optional entries here can include:

#SBATCH --partition=default_partition        # Request partition for resource allocation
--partition specifies which partition (queue) the job should run on. The partition name can be:
	default_partition
	gpu
	<group name> - for example kilian or ramin

#SBATCH --gres=gpu:1                         # Generic consumable resources (per node)
--gres specifies a list of generic consumable resources (per node):
--gres=gpu:1080ti:1 means one GPU of type GeForce GTX 1080 Ti
--gres=gpu:2 means two GPUs of any type

Create a shell script to be run on the cluster (can be named anything you wish):

Example: test-datasets.sh           (Execute permissions should be set on this file using the command “chmod u+x test-datasets.sh”)
#!/bin/bash
/usr/bin/hostname

Submit the job:

sbatch --requeue test-gpu.sub

Scheduler Notes:

  • When submitting a job to either the “default_partition” or “gpu” partition, there is a possibility that a job may be preempted.
    • Use the switch “--requeue” with the sbatch command, and the job will be resubmitted if it is preempted.
  • In the job submission script
    • There must be no blank lines between the top of the file and the last “#SBATCH” line.
      • A blank line in this area causes the #SBATCH directives that follow it to not be processed as intended.
    • Use only full path names for files.
      • Ex: /home/NETID/test.sh, not ./test.sh
    • Define the resources your job will need, such as:
      • Memory
      • CPU
      • GPU
      • Partition desired
      • Maximum time limit for your job
  • It is important to tell the scheduler what resources the job will need.
    • The scheduler does not necessarily use the numbers given to control the job, but it makes sure that jobs will not be scheduled on nodes that CANNOT support them or that don’t have the resources requested available (if each job accurately requests the resources needed).
    • The following default resources are assigned for any job that does not specify specific resource needs:
      • 4 hour time limit
      • 1 cpu
      • 1G memory
      • default_partition
    • For any job, batch or interactive, that will need to run for more than 4 hours, please set a time limit close to what your job should actually take.
    • Singularity commands can be run via the sbatch/srun commands; the job should request all the resources that the container will need.
  • It is also important to tell the application what resources it can use.
    • For example, if you do not limit a MATLAB job, it will use every core on every server that it is running on.
    • Please either request every core for the job, or tell MATLAB to limit its use.
  • The cluster scheduler is currently set up to kill a job that tries to use too much memory (more memory than the job asked for).
    • This behavior can be changed (see job submission script comments above), but please be mindful to properly set parameters before scheduling a job.
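
The blank-line rule for submission scripts noted above can be checked before submitting. The following is a minimal sketch that implements the rule as stated in these notes; the function name check_sbatch_header is an assumption and is not part of Slurm.

```shell
#!/bin/bash
# Sketch: report blank lines that appear before the last #SBATCH line.
# Returns 0 if the header is clean, 1 otherwise.
check_sbatch_header() {
    awk '
        /^[[:space:]]*$/ { blank[++n] = NR }   # remember each blank line
        /^#SBATCH/       { last = NR }         # track the last directive
        END {
            bad = 0
            for (i = 1; i <= n; i++) {
                if (blank[i] < last) {
                    print "blank line at line " blank[i]
                    bad = 1
                }
            }
            exit bad
        }
    ' "$1"
}
```

For example, check_sbatch_header /home/NETID/test-gpu.sub && sbatch --requeue /home/NETID/test-gpu.sub submits the job only when the header passes the check.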

SLURM Commands:

  • srun                  # When using srun, select the “interactive” partition.
  • squeue -l             # Get list of your active or pending jobs.
  • scancel 183           # Cancel an existing job, where 183 is the job number retrieved by the squeue command.
  • sinfo -o %G,%N,%P     # Get info on GPUs available, the nodelist they are on and the partition to use.
  • sinfo                 # Get info on compute resources.

Starting a Jupyter notebook session (Tunneling the notebook):

Be sure the Anaconda environment is defined by adding the following line to your ~/.bashrc file or by typing it on the command line before the rest of this process is executed.

export PATH=/share/apps/anaconda3/2021.05/bin:$PATH

On your local machine, create an SSH tunnel between your local machine and NODE on PORT, where NODE is the name of the server you plan to connect to and PORT is a single unused port number between 8000 and 10000. Replace all instances of PORT with the desired port number.

ssh netid@g2-login.coecis.cornell.edu -LPORT:NODE:PORT

On the G2 login node, start an interactive session to NODE (defined above) using one of “default_partition”, “gpu” or your priority partition. You need to specify the resources that your job needs, such as Memory, CPUs, GPUs and partition.

srun -p default_partition --pty --nodelist=NODE /bin/bash

Once logged into NODE (defined above), start jupyter-notebook for the first time, replacing use_your_netid in /tmp/use_your_netid with your Cornell NetID.

XDG_RUNTIME_DIR=/tmp/use_your_netid jupyter-notebook --ip=0.0.0.0 --port=PORT

Open a browser on the user’s local machine using the string containing “127.0.0.1:” displayed by the jupyter-notebook command in the previous step. It will look similar to the following link.

http://127.0.0.1:PORT/?token=LONG_ALPHANUMERIC_STRING_FROM_JUPYTER-NOTEBOOK_OUTPUT
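
Because PORT and NODE must match across all three commands, it can help to generate them together. The helper below is a minimal sketch that only prints the commands from the steps above with the placeholders filled in. The function name jupyter_tunnel_cmds and the resource flags (--mem=4G, -c 2) are assumptions; adjust the resources to what your session actually needs.

```shell
#!/bin/bash
# Sketch: print the three commands from the steps above for one NODE/PORT.
# The resource flags on the srun line are illustrative assumptions.
jupyter_tunnel_cmds() {
    local netid="$1" node="$2" port="$3"
    # Reject anything that is not a plain number.
    case "$port" in
        ''|*[!0-9]*) echo "PORT must be a number" >&2; return 1 ;;
    esac
    # Keep to the 8000-10000 range suggested above.
    if [ "$port" -lt 8000 ] || [ "$port" -gt 10000 ]; then
        echo "PORT must be between 8000 and 10000" >&2
        return 1
    fi
    echo "local:  ssh -L ${port}:${node}:${port} ${netid}@g2-login.coecis.cornell.edu"
    echo "login:  srun -p default_partition --mem=4G -c 2 --pty --nodelist=${node} /bin/bash"
    echo "node:   XDG_RUNTIME_DIR=/tmp/${netid} jupyter-notebook --ip=0.0.0.0 --port=${port}"
}
```

For example, jupyter_tunnel_cmds abc123 NODE 8899 prints the tunnel, srun, and jupyter-notebook commands for port 8899, where abc123 stands in for your NetID and NODE for the server name.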