Robert Nicholas (ren10@psu.edu), Earth and Environmental Systems Institute, Penn State University

updated 29 May 2017

NOTE: This guide documents the ICS-ACI 3.1 system, which was brought online in late May 2017. ICS-ACI 1.0 has been retired and is no longer accessible.

Systems and Services

SCRiM’s resources with ICS-ACI comprise a computing allocation of 400 “standard-memory” cores and a 400 TB storage allocation, as codified in a 5-year service level agreement (SLA) that ends on 31 January 2020. Services available to SCRiM researchers under this agreement include:

  • ICS-ACI 3.1 batch system (cluster) aci-b:
    • replacement for now-retired ACI 1.0 and legacy Lion-X systems
    • shell access via SSH: ssh -Y username@aci-b.aci.ics.psu.edu
    • time-averaged usage of up to 400 cores (averaged over a 90-day moving window) and instantaneous (a.k.a. “burst”) usage of up to 1600 cores with a guaranteed response time of one hour or less on the “kzk10_a_g_sc_default” queue (this is a change from ACI 1.0)
    • unlimited use of the “open” queue, up to 20 cores (total, not per job), with a maximum wall clock time of 24 hours and a maximum of 100 queued jobs
  • ICS-ACI 3.1 interactive system aci-i:
    • for interactive sessions only; similar to the old Hammer system
    • only accessible via Exceed onDemand (no direct SSH access)
  • storage:
    • up to 10 GB in your home directory, /storage/home/<username>
    • up to 128 GB in your work directory, /storage/work/<username>
    • up to 400 TB in our group storage pool, /gpfs/group/kzk10/default, shared with all other SCRiM users (this is a change from ACI 1.0)
    • up to 1 million files in your scratch directory, /gpfs/scratch/<username>; this storage resource is intended for temporary files and is not backed up; files residing here for more than 30 days may be automatically deleted
    • all filesystems are accessible from both aci-b and aci-i
  • move files to and from these storage pools via the dedicated file transfer node datamgr.aci.ics.psu.edu, using scp from the command line or a graphical SFTP client (see the example below); note that transfers to/from woju should use the hostname woju-rn.scrim.psu.edu to bypass a currently-unresolved network issue
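
For example, a file could be copied to the group storage pool and retrieved from your work directory with commands along these lines (the filename and the per-user subdirectory under the group pool are placeholders):

  scp results.nc <username>@datamgr.aci.ics.psu.edu:/gpfs/group/kzk10/default/<username>/
  scp <username>@datamgr.aci.ics.psu.edu:/storage/work/<username>/results.nc .

The second command copies the file back to the current directory on your local machine.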

These systems are managed separately from the other SCRiM computing resources (woju, mizuna, napa, RStudio Server, and the web-based download server), which are hosted by the Penn State Meteorology Computing Facility and documented elsewhere.

Getting an Account

The above resources are available to all SCRiM researchers, including non-PSU SCRiM participants. Contact SCRiM Managing Director Robert Nicholas (ren10@psu.edu) for details on how to gain access.

Submitting Jobs on the Batch System

  • qsub -A kzk10_a_g_sc_default <yourrunscript>
  • alternatively, add #PBS -A kzk10_a_g_sc_default to your run script
  • to use the open queue, replace kzk10_a_g_sc_default with open
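
A minimal run script, assuming a small single-node job (the job name, resource requests, and executable below are placeholders, not SCRiM-specific values), might look like this:

  #!/bin/bash
  #PBS -N myjob                     # job name (placeholder)
  #PBS -A kzk10_a_g_sc_default      # charge the SCRiM allocation
  #PBS -l nodes=1:ppn=4             # one node, four cores (adjust to your job)
  #PBS -l walltime=04:00:00         # four hours of wall clock time (adjust to your job)
  #PBS -j oe                        # merge stdout and stderr into a single log file

  cd $PBS_O_WORKDIR                 # start in the directory the job was submitted from
  ./my_model input.nml              # placeholder executable and input file

Save this as, e.g., run.pbs and submit it with qsub run.pbs; the -A option on the qsub command line is unnecessary when the allocation is specified inside the script.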

Using the Modules System

The modules system controls which software packages are available in your current environment.

  • to list all available modules: module avail
  • to search available modules: module avail <keyword>
  • to load a specific software package: module load <modulename>
  • to show modules currently loaded in your environment: module list
  • to show paths for executables and libraries associated with a particular module (whether it is loaded or not): module show <modulename>
  • load newer GNU compilers, including GNU Fortran: module load gcc/5.3.1 (the system default is version 4.4.7)
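
A typical session using only the commands listed above might look like this (the netcdf keyword is just an illustrative search term):

  module avail netcdf       # search for available netCDF builds
  module load gcc/5.3.1     # switch to the newer GNU compilers
  module list               # confirm which modules are now loaded
  module show gcc/5.3.1     # inspect the paths this module adds to your environment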

In addition to the systemwide modules, SCRiM also has its own custom software stack containing additional modules. You can make these modules available with the following command:

module use /gpfs/group/kzk10/default/sw/modules

Specific software packages can be loaded as follows:

  • Anaconda (a full “scientific Python” environment): module load anaconda2 (Python 2.7.x) or module load anaconda3 (Python 3.x)
  • Git Cola (a graphical Git client): module load git-cola/2.10
  • ImageMagick: module load ImageMagick/7.0.5-2
  • Julia: module load julia/0.5.1
  • libGD: module load libgd/2.2.4
  • NCAR Command Language (NCL): module load ncl_ncarg/6.3.0_gcc447
  • ncview: module load ncview/2.1.7
  • Panoply: module load panoply/4.7
  • Perl Data Language (PDL): module load perl/5.24.1
  • PyCharm (a Python IDE): module load pycharm/2016.3.3
  • RStudio Desktop: module load rstudio/0.98.1103
  • uvcdat: module load uvcdat/2.4.1
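
Putting the two steps together, a session that activates the SCRiM software stack and loads one of its packages might look like the following (the version check at the end is simply a quick way to confirm the load succeeded):

  module use /gpfs/group/kzk10/default/sw/modules   # make the SCRiM modules visible
  module load anaconda3                             # load the Python 3.x environment
  python --version                                  # should now report a Python 3.x interpreter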

Note that CDO, GDAL, gnuplot, TeX/LaTeX, NCL, NCO, and Pandoc are also available; they are provided as part of the two Anaconda distributions (see above).

Monitoring the Batch System

At present, there is no way to obtain allocation usage information from the command line as there was on ACI 1.0.

The current state of the batch queue may be obtained using the showq command. To show the queue state for just the SCRiM allocation, use:

showq -w acct=kzk10_a_g_sc_default

To show the status of the open queue, use:

showq -w acct=open
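
To limit the listing to your own jobs, the same -w filter can be applied to the user attribute (this usage is inferred from Moab's filter syntax rather than taken from ICS documentation); alternatively, qstat -u gives Torque's per-user job listing:

  showq -w user=<username>
  qstat -u <username>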

Getting Support

You can obtain professional support for ICS-ACI resources in the following ways:

Support hours are Mon-Fri 9:00am-6:00pm and Sat-Sun 1:00pm-4:00pm.

ICS has not prepared a comprehensive user guide for ACI. In the past, users found the ACI Onboarding Materials to be a helpful resource, but these materials are now out of date and should not be used.