Information for Current Faculty and Students

This page contains information to help faculty and students get started with High Performance Computing (HPC). To begin, faculty members should send an email to hpcadmins@memphis.edu.

Getting Started

The HPC system uses your normal University of Memphis credentials to authenticate, so you log in with your UofM username (UUID) and password - the same credentials you use for University email, UMdrive access, etc. However, an account still needs to be created for you on the HPC system before you can log in. To request an account, please make a ticket (Self Service->Research and HPC Software->HPC Account) that includes your University username (UUID), full name, and department and, if you are a student, your advisor's name. To receive an HPC account, you must be an enrolled student, a faculty or staff member, or conducting research affiliated with a department within the University of Memphis. In the last case, we must have verification from the department that will sponsor your time on our resources; these requests are handled on a case-by-case basis.

Logging In

Once you have acquired an HPC account, you may open a secure shell (SSH) connection to the login node by opening a Terminal window (Unix/Linux/Mac) and running the following command:

ssh [username]@hpclogin.memphis.edu

Windows users will most likely need to install an application such as PuTTY, an SSH client that offers a graphical user interface where you can enter the necessary login information, such as your username, password, and the host name of our login node, hpclogin.memphis.edu (or just hpclogin). An SFTP (Secure File Transfer Protocol) client such as WinSCP or Tunnelier will also be necessary for uploading files to, and downloading files from, your share on the cluster. Please read the paragraph directly below for information on how to connect to the cluster from off campus; keep in mind that if you secure an off-campus connection, you will need to configure whichever third-party application(s) you decide to use accordingly.
Please note that access to the cluster is restricted to connections within the University of Memphis LAN. If you are connecting from within the LAN, you can drop the ".memphis.edu" suffix and simply use ssh [username]@hpclogin .

If you wish to access the HPC cluster from outside the University's LAN, you must have an account on a host within the LAN that does allow off-campus connections, and log into that server before logging into the HPC cluster; it is the user's responsibility to secure an account on such a host. One option for students and faculty is the University's Virtual Private Network (VPN), which only requires your UUID and password. Detailed information regarding installation and usage, as well as a comprehensive FAQ, can be found at the University's VPN site. A minimal example of this two-step login is sketched below.
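
For example, assuming you already have an account on an on-campus host (the host name campus-host.memphis.edu below is only a placeholder), the two-step login might look like this:

# First, log into an on-campus host you have an account on (placeholder name)
ssh [username]@campus-host.memphis.edu
# Then, from that host, log into the HPC login node
ssh [username]@hpclogin.memphis.edu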

Submitting Jobs

All jobs must be submitted to the job scheduler, either as batch jobs (through a submission script) or interactively.
Through the links below, a number of batch templates are provided with explanations of how to modify them for use, along with a selection of optional commands for customization. A minimal batch script is also sketched after the links.

Batch Scripts (Templates for batch jobs with explanations on how to modify)

Interactive Submission (Without GUI and with GUI information)

Scheduler Control Commands (A selection of optional commands available for customization)

Python (How to use python on the cluster)

Matlab (How to use matlab on the cluster)
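
As a quick illustration, a minimal batch script might look like the sketch below. The resource values are examples only, and the module name and program name are placeholders; the templates linked above show values appropriate for our cluster. Save it to a file, e.g. myjob.sh, and submit it with sbatch myjob.sh.

#!/bin/bash
# A name that helps you identify the job
#SBATCH --job-name=example_job
# One node, one task, one CPU per task
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Memory per CPU and walltime; request only what you need
#SBATCH --mem-per-cpu=1G
#SBATCH --time=01:00:00

# Load any software your program needs (module name is a placeholder)
module load example/1.0

# Run the program (program name is a placeholder)
srun ./my_program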

Best Practices

Jobs

  • The login nodes are meant for editing scripts and submitting jobs, and if a command takes longer than a few minutes to run, it is probably appropriate to run an interactive job with the SLURM srun or salloc commands.
  • The SLURM sbatch command submits batch job scripts.
  • The SLURM salloc command allocates resources for a job. Run exit or scancel commands to relinquish the allocation.
  • Make sure you define, with --cpus-per-task and --ntasks-per-node, the appropriate number of threads for your workload.
  • Make sure you define, with --ntasks and --nodes, the appropriate number of nodes for your workload.
  • Make sure you define, with --mem-per-cpu, the appropriate amount of memory for your workload.
  • Use job names, with --job-name, that help you identify what jobs you are running.
  • Use the least amount of time, with --time, that you think your job will take. E-mail us with the job ID if you need more time.
  • The SLURM sacct command can provide resource usage and allocation details for completed jobs (example commands appear after this list).
  • The SLURM squeue and sstat commands can provide resource allocation and usage details, respectively, for pending and running jobs.
  • The SLURM scontrol command can be used to update pending and running jobs' resources.
  • The SLURM scancel command can be used to cancel pending and running jobs.
  • Run a few small jobs that get progressively larger to determine appropriate resource sizes, such as memory or time.
  • Over 200 other users run jobs on the cluster, and if your job doesn't start immediately, it will be in pending state until resources are available.
  • Don't modify or delete your jobs' files while the jobs are running!
  • Try not to include the % character in your jobs; it usually acts as a replacement symbol in SLURM (e.g., %a, %j, and %A are all sbatch related).
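
The monitoring and control commands above might be used as follows; the job ID 12345 is only a placeholder for one of your own job IDs:

# Show your pending and running jobs
squeue -u $USER
# Show live resource usage for a running job's steps (placeholder job ID)
sstat --allsteps -j 12345
# Show resource usage and allocation details for a completed job
sacct -j 12345
# Inspect a job's full details
scontrol show job 12345
# Update a pending job's attributes, e.g. rename it (increasing a time limit
# usually means e-mailing us, as noted above)
scontrol update JobId=12345 JobName=new_name
# Cancel a pending or running job
scancel 12345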

Storage

  • We have 348 terabytes of storage shared among 200+ users; please be considerate and use it sparingly if you can.
  • If you have a lot of past job files you would like to keep, consider compressing them with tar -czf [jobFolder].tar.gz [jobFolder] (see the sketch after this list), or ask us about archive storage.
  • Cluster storage is for cluster jobs, not personal backup.
  • Storage in your home directory, /home/$USER/, is backed up regularly, but your scratch, /home/scratch/$USER/, is never backed up.
  • Previous cluster home directories are available on the cluster, /oldhome/$USER/, and in archive.
  • Consider using a self-contained folder for each job that includes the job batch script, input data, and output data files.
  • Files deleted with the rm command are gone. If a deleted file was in your home directory at the time of the previous backup, contact us to retrieve it.
  • The file system doesn't maintain revisions, so the pre-modification versions of files are also gone. Contact us to retrieve the backed-up copy of a file modified since the previous backup.
  • If you have millions of files, backups take a very long time. Consider using scratch, /home/scratch/$USER/, instead.
  • File and directory names containing a space usually need an escape sequence,  Hello\ HPC, or quotes, "Hello HPC".
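
For example, assuming a finished job folder named old_job (a placeholder), you could archive it and remove the original once the archive checks out:

# Create a compressed archive of the finished job folder
tar -czf old_job.tar.gz old_job
# List the archive's contents to verify it
tar -tzf old_job.tar.gz
# Remove the original folder only after verifying the archive
rm -r old_job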

Network

  • We have a very fast connection inbound and outbound from the cluster. Feel free to upload and download as much as you need (see the transfer example after this list).
  • Our very fast connection is only intended for cluster data. We will not host your server or database.
  • Our internal network is extremely fast, especially between nodes, but storage is the slowest component.
  • Our extremely fast internal network can still fill the storage in less than a day.
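
For transfers from a Unix-like machine, scp or rsync (mentioned again under Software below) are the usual tools; the file and directory names here are only placeholders:

# Upload a local file to your home directory on the cluster
scp input_data.tar.gz [username]@hpclogin.memphis.edu:/home/[username]/
# Download a results folder, skipping files that are already up to date
rsync -av [username]@hpclogin.memphis.edu:/home/[username]/results/ ./results/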

Software

  • Our operating system is CentOS 7.4, and we cannot run programs that only run on Windows, Mac, Ubuntu, etc.
  • You can install software in your home directory. Generally, installs use the make toolchain (a fuller sketch appears after this list):
    1. ./configure --prefix=$HOME
    2. make
    3. make install
  • If you need software that requires root/sudo, make a ticket under Self Service->Research and HPC Software->HPC Account.
  • In general, we will not install software that gives users root/sudo access or interferes with the scheduling software.
  • Learn what you need about Linux commands, but generally, you only need to know cd, ls, rm, mkdir, rmdir, scp/rsync, ssh, and vi/nano/emacs commands.
  • Linux programs are generally case sensitive, i.e. a and A are different, unlike Windows.
  • Be careful when using the $ character in scripts and on the command line, because it usually indicates a variable. Use the escape sequence, \$threeFifty, or single quotes, '$threeFifty', instead.
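
As a fuller sketch of a home-directory install, assuming a typical autotools package named example-1.0 (a placeholder), the steps might look like this:

# Unpack the source and enter it (package name is a placeholder)
tar -xzf example-1.0.tar.gz
cd example-1.0
# Configure the build to install under your home directory
./configure --prefix=$HOME
make
make install
# Make the newly installed programs visible to your shell
export PATH=$HOME/bin:$PATH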

Useful Links

Resources

Intel Compiler Directives for Fortran

Intel Preprocessor Options for C++

Links

Job Scheduler Control Commands and Environmental Variables

CUDA Documentation