CAPS Campus Cluster Best Practices

  1. When you log in to the campus cluster, you land on the “login” node. Do not use the login node to run memory-intensive programs (jobs). A small program that uses little memory should run there without any problem.
  2. What happens if you run a program that consumes a lot of memory on the login node?
    1. It will automatically be killed.
  3. Do not run multiple programs simultaneously on the login node either. If you do, they will all be killed automatically as well.
  4. To run a program that needs a large amount of memory for a short time, use an “interactive” job. You can open one like this:
    1. srun --partition=caps --time=03:00:00 --nodes=1 --ntasks-per-node=16 --pty /bin/bash
    2. If you do not want to remember this, you can define an alias in .bashrc:
      1. alias openintbash='srun --partition=caps --time=03:00:00 --nodes=1 --ntasks-per-node=16 --pty /bin/bash'
    3. Once the alias is loaded (open a new shell or run “source ~/.bashrc”), all you have to do is type “openintbash” to open an interactive job.
  5. Programs that run for a long time should be submitted as batch jobs. Templates can be found in the Campus Cluster documentation. You can also ask Srini (srinirag@illinois.edu) for example scripts.
  6. Resource usage:
    1. Please try not to submit jobs that run for more than a day.
    2. You can run up to 40 jobs at a time, but consider how much of the computing resources your jobs will take before you submit them.
    3. Important:
      1. Please note that the 40-job limit counts jobs, not cores; it does not cap the number of cores your jobs use.
      2. Please do not grab a lot of resources for more than a day. It is fine to grab maybe 60% of the resources for ~3 days.
  7. Disk usage: /projects/caps has a lot of storage. Your home directory has a limit of 5 GB. Note that if your script writes output files to your home directory and the directory goes over 5 GB, the script will fail. Type “quota” to see your storage usage.
  8. Some useful aliases to set in .bashrc:
    1. alias sq='squeue --format="%.18i %.9P %.50j %.8u %.8T %.10M %.9l %.6D %R" -u [your_user_name]' # to show your queue
    2. alias capsstatus='sinfo --partition caps -N' # to show the available CAPS nodes
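
For item 5 above, a batch job might look like the following sketch. This is not one of the official templates: the file name, job name, output file name, 12-hour time limit, and placeholder workload are all assumptions made for illustration. Only the partition, node, and task settings are taken from the interactive example in item 4. The script writes a small batch file and submits it only if sbatch is available.

```shell
#!/bin/bash
# Sketch: write a minimal Slurm batch script, then submit it if sbatch exists.
# File name, job name, and workload are hypothetical placeholders.

job_script="example_job.sbatch"

# Quoted 'EOF' keeps $(...) and variables literal inside the batch script
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --partition=caps              # same partition as the interactive example
#SBATCH --time=12:00:00               # assumed limit, kept under one day
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --job-name=example_job        # hypothetical job name
#SBATCH --output=example_job_%j.out   # %j expands to the job ID

# Placeholder workload; replace with your own program
echo "Running on $(hostname)"
EOF

# Submit only where Slurm is installed (that is, on the cluster)
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script"
else
    echo "sbatch not found; wrote $job_script without submitting"
fi
```

Running this from a directory under /projects/caps rather than from your home directory keeps the job's output file away from the 5 GB home limit.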
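
For item 7 above, a script can check how full a directory is before it starts writing output. The sketch below uses du, and check_usage is a hypothetical helper name, not a cluster command. The number du reports may differ from what the quota system counts, so “quota” remains the authoritative check.

```shell
#!/bin/bash
# Sketch: warn when a directory's usage is at or above a limit.
# check_usage is a hypothetical helper, not a replacement for "quota".

# Usage: check_usage DIR LIMIT_KB
# Returns 0 if DIR uses less than LIMIT_KB kilobytes, 1 if it is at or
# above the limit, and 2 if the usage could not be measured.
check_usage() {
    local dir="$1"
    local limit_kb="$2"
    local used_kb

    # du -sk prints the total usage in kilobytes, followed by the path
    used_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')

    if [ -z "$used_kb" ]; then
        echo "ERROR: could not measure usage of $dir"
        return 2
    fi

    if [ "$used_kb" -ge "$limit_kb" ]; then
        echo "WARNING: $dir uses ${used_kb} KB (limit ${limit_kb} KB)"
        return 1
    fi

    echo "OK: $dir uses ${used_kb} KB of ${limit_kb} KB"
    return 0
}

# The 5 GB home directory limit, expressed in kilobytes
HOME_LIMIT_KB=$((5 * 1024 * 1024))

# Example call (commented out because scanning a large home directory is slow):
# check_usage "$HOME" "$HOME_LIMIT_KB" || echo "Write output to /projects/caps instead"
```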