Aspen Systems (Talon) User Guide
Table of Contents
- 1. Introduction
- 1.1. Document Scope and Assumptions
- 1.2. Policies to Review
- 1.3. Obtaining Accounts
- 1.4. Requesting Assistance
- 2. System Configuration
- 2.1. System Summary
- 2.2. Processor
- 2.3. Memory
- 2.4. Operating System
- 2.5. Peak Performance
- 3. Accessing the System
- 3.1. Kerberos
- 3.2. Logging In
- 3.2.1. Kerberized SSH
- 3.3. File Transfers
- 4. User Environment
- 4.1. Shells
- 4.2. Environment Variables
- 4.2.1. Login Environment Variables
- 4.2.2. Batch-Only Environment Variables
- 5. Program Development
- 5.1. Message Passing Interface (MPI)
- 6. Batch Scheduling
- 6.1. Scheduler
- 6.2. Queue Information
- 6.3. Interactive Logins
- 6.4. Advance Reservations
1. Introduction
1.1. Document Scope and Assumptions
This document provides an overview of, and introduction to, the Aspen Systems Intel system (Talon) located at the AFRL DSRC, along with a description of the specific computing environment on Talon. The intent of this guide is to provide information that will enable the average user to perform computational tasks on the system. To receive the most benefit from the information provided here, you should be proficient in the following areas:
- Use of the UNIX operating system
- Use of an editor (e.g., vi or emacs)
- Remote usage of computer systems via network or modem access
- A selected programming language and its related tools and libraries
1.2. Policies to Review
Users are expected to be aware of the following policies for working on Talon.
1.3. Obtaining Accounts
Authorized DoD and contractor personnel may request an account on Talon by submitting a proposal to the AFRL DSRC via email to email@example.com. The proposal should include the following information:
- HPC experience and level of required support
- Project suitability for a Shared Memory system
- Project contribution to the DoD mission and/or HPC technical advancement
- Proposed workload
Direct any questions regarding this non-allocated system to firstname.lastname@example.org.
1.4. Requesting Assistance
The Consolidated Customer Assistance Center (CCAC) is available to help users with unclassified problems, issues, or questions. Analysts are on duty 8:00 a.m. - 11:00 p.m. Eastern, Monday - Friday (excluding Federal holidays).
For more detailed contact information, please see our Contact Page.
2. System Configuration
2.1. System Summary
Talon is an Aspen Systems Intel system. The project and compute nodes are populated with Intel x86 processors. Talon uses DDR InfiniBand as its high-speed network for MPI messages and I/O traffic, and uses the Panasas File System to manage the parallel file system on its storage arrays. Talon has 6 project nodes (i.e., user and/or login nodes) that share memory only on the node. Each project node has two 4-core processors (8 cores) and runs its own CentOS operating system, sharing 48 GBytes of 1333-MHz DDR3 memory, with no user-accessible swap space. (Note that talon1 and talon6 have only 24 GBytes of memory.) Talon has 12 compute nodes that share memory only on the node; memory is not shared across the nodes. Each compute node has two 4-core processors (8 cores) and runs its own CentOS operating system, sharing 24 GBytes of DDR3 memory, with no user-accessible swap space. Talon is rated at 1.07 peak TFLOPS and has 35 TBytes (formatted) of disk storage.
Talon is intended to be used as a project and applications development and experiment system. Access to and use of Talon's project nodes and other resources are assigned to specific project users to run code, scripts, databases, and user interfaces that drive the processing submitted to Talon or other DSRC HPC batch systems. Job executions that require large amounts of system resources should be sent to the compute nodes or other HPC systems by batch job submission. Talon nodes can also be reconfigured to experiment with unique operating systems, software, or hardware that is not readily available on standard HPCMP systems. All assigned projects will essentially share the system resources but may have dedicated resources for specific purposes or events as necessary to the project's needs. Usage will be monitored by the DSRC, and any conflicts or priorities will be arbitrated by DSRC management.
In the table below, we have added "User Accessible Memory/Node" because in many cases, the entire amount of memory installed is not accessible due to overhead or usage policies, and our goal was to provide "usable" data.
| ||Project Nodes||Compute Nodes|
|Operating System||CentOS 5.7||CentOS 5.7|
|Core Type||Intel Nehalem-EP||Intel Nehalem-EP|
|Core Speed||2.8 GHz||2.8 GHz|
|Memory/Node||48 GBytes*||24 GBytes|
|Accessible Memory/Node||48 GBytes*||24 GBytes|
|Interconnect Type||DDR IB||DDR IB|
|File System (shared across both partitions)||Panasas File System||Panasas File System|
*talon1 and talon6 have only 24 GBytes of memory.
2.2. Processor
Talon uses 2.8-GHz Intel Nehalem-EP (X5560, Gainestown) processors on its project and compute nodes. There are 2 processors per node, each with 4 cores, for a total of 8 cores per node. In addition, each processor has 4 x 64 KBytes of L1 cache, 4 x 256 KBytes of L2 cache, and 8 MBytes of shared L3 cache.
2.3. Memory
Talon uses both shared and distributed memory models. Memory is shared among all the cores on a node, but is not shared among the nodes across the cluster.
Each project node contains 48 GBytes of main memory (talon1 and talon6 have 24 GBytes). All memory and cores on a project node are shared among all users who are logged in and the processes they run on that node. Therefore, users should not use more than the node's installed memory (48 GBytes, or 24 GBytes on talon1 and talon6) at any one time.
Each compute node contains 24 GBytes of user accessible shared memory. When running under the batch scheduling system, a process or job will have exclusive access to all 24 GBytes of compute node memory while executing.
2.4. Operating System
The operating system on Talon is Linux (CentOS 5.7).
2.5. Peak Performance
Talon is rated at 1.07 peak TFLOPS.
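The rated figure is consistent with a simple product of node count, cores per node, clock rate, and floating-point operations per cycle. The sketch below assumes that only the 12 compute nodes are counted and that each core completes 4 double-precision floating-point operations per cycle; neither assumption is stated in this guide, and the function name peak_tflops is illustrative.

```shell
#!/bin/sh
# Theoretical peak = nodes x cores/node x clock (GHz) x flops/cycle, in TFLOPS.
# Assumption: 4 double-precision flops per cycle per core (not stated in this guide).
peak_tflops() {
    awk -v n="$1" -v c="$2" -v ghz="$3" -v f="$4" \
        'BEGIN { printf "%.4f\n", n * c * ghz * f / 1000 }'
}

# 12 compute nodes, 8 cores per node, 2.8 GHz, 4 flops per cycle (assumed)
peak_tflops 12 8 2.8 4
```

Under these assumptions the product comes to roughly 1.075 TFLOPS, in line with the rated 1.07 peak TFLOPS.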
3. Accessing the System
3.1. Kerberos
A Kerberos client kit must be installed on your desktop to enable you to get a Kerberos ticket. Kerberos is a network authentication tool that provides secure communication by using secret cryptographic keys. Only users with a valid HPCMP Kerberos authentication can gain access to Talon. More information about installing Kerberos clients on your desktop can be found at HPC Centers: Kerberos & Authentication.
3.2. Logging In
3.2.1. Kerberized SSH
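To log in to a Talon project node using Kerberized SSH, use the following command, replacing # with the project node number (1 to 6):
% ssh user@talon#.afrl.hpc.mil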
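A typical login session first obtains a Kerberos ticket and then connects to one of the six project nodes. The sketch below only builds and prints the commands rather than running them; the helper name talon_host is hypothetical, the use of kinit to request the ticket is an assumption about your client kit, and user is a placeholder for your own username.

```shell
#!/bin/sh
# Build the host name of a Talon project node from its number (1 to 6).
talon_host() {
    case "$1" in
        [1-6]) printf 'talon%s.afrl.hpc.mil\n' "$1" ;;
        *)     echo "node number must be 1 to 6" >&2; return 1 ;;
    esac
}

# Print, rather than run, the commands for logging in to project node 3.
echo "kinit user    # assumed: obtain a Kerberos ticket first"
echo "ssh user@$(talon_host 3)"
```

Rejecting node numbers outside 1 to 6 up front avoids a confusing name-resolution failure later in the session.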
3.3. File Transfers
File transfers to DSRC systems (except transfers to the local archive system) must be performed using Kerberized versions of the following tools: scp, ftp, sftp, and mpscp.
4. User Environment
4.1. Shells
The following shells are available on Talon: csh, bash, ksh, tcsh, and sh.
4.2. Environment Variables
A number of environment variables are provided by default on all HPCMP high performance computing (HPC) systems. We encourage you to use these variables in your scripts where possible. Doing so will help to simplify your scripts and reduce portability issues if you ever need to run those scripts on other systems.
4.2.1. Login Environment Variables
|Variable||Description|
|$ARCHIVE_HOME||Your directory on the archive server.|
|$ARCHIVE_HOST||The host name of the archive server.|
|$BC_HOST||The generic (not node specific) name of the system.|
|$CC||The currently selected C compiler. This variable is automatically updated when a new compiler environment is loaded.|
|$CENTER||Your directory on the Center-Wide File System (CWFS).|
|$CSI_HOME||The directory containing heavily used application packages from what was formerly known as the Consolidated Software Initiative (CSI) list: ABAQUS, Accelrys, ANSYS, CFD++, Cobalt, EnSight, Fluent, GASP, Gaussian, LS-DYNA, MATLAB, and TotalView. Other application software may also be installed here by our staff.|
|$CXX||The currently selected C++ compiler. This variable is automatically updated when a new compiler environment is loaded.|
|$DAAC_HOME||The directory containing the ezVIZ visualization software.|
|$F77||The currently selected Fortran F77 compiler. This variable is automatically updated when a new compiler environment is loaded.|
|$F90||The currently selected Fortran 90 compiler. This variable is automatically updated when a new compiler environment is loaded.|
|$HOME||Your home directory on the system.|
|$JAVA_HOME||The directory containing the default installation of Java.|
|$KRB5_HOME||The directory containing the Kerberos utilities.|
|$PET_HOME||The directory containing the tools installed by the PET CE staff. The supported software includes a variety of open-source math libraries (see BC Policy FY06-01) and open-source performance and profiling tools (see BC Policy FY07-02).|
|$PROJECTS_HOME||A common directory where group-owned and supported applications and codes may be maintained for use by members of a group. Any project may request a group directory under $PROJECTS_HOME.|
|$SAMPLES_HOME||The Sample Code Repository. This is a collection of sample scripts and codes provided and maintained by our staff to help users learn to write their own scripts. There are a number of ready-to-use scripts for a variety of applications.|
|$WORKDIR||Your work directory on the local temporary file system (i.e., local high-speed disk).|
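As a sketch of how these variables keep scripts portable, the fragment below stages a run directory under $WORKDIR instead of a hard-coded path. The directory name my_case and the fallback to /tmp when $WORKDIR is unset (for example, when testing away from Talon) are illustrative assumptions.

```shell
#!/bin/sh
# Use the site-provided work directory; fall back to /tmp if it is not set.
work_root="${WORKDIR:-/tmp}"

# Hypothetical run directory for one case, made unique with the shell's process ID.
run_dir="$work_root/my_case.$$"
mkdir -p "$run_dir"

echo "Running on ${BC_HOST:-unknown host} in $run_dir"
```

Because the path is derived from $WORKDIR, the same fragment should work unchanged on any system that defines the variable.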
4.2.2. Batch-Only Environment Variables
In addition to the variables listed above, the following variables are automatically set only in your batch environment. That is, your batch scripts will be able to see them when they run. These variables are supplied for your convenience and are intended for use inside your batch scripts.
|Variable||Description|
|$BC_CORES_PER_NODE||The number of cores per node for the compute node on which a job is running.|
|$BC_MEM_PER_NODE||The approximate maximum user-accessible memory per node (in integer MBytes) for the compute node on which a job is running.|
|$BC_MPI_TASKS_ALLOC||The number of MPI tasks allocated for a job.|
|$BC_NODE_ALLOC||The number of nodes allocated for a job.|
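Inside a batch script, these variables can replace hard-coded node geometry. The sketch below derives the total core count and an approximate memory budget per core; the function name job_summary is hypothetical, and the fallback values (8 cores, 24576 MBytes, 1 node) are assumptions used only so that the fragment also runs outside a batch job.

```shell
#!/bin/sh
# Summarize the job's geometry from the batch-only variables.
# Fallback values are assumptions for running outside the batch environment.
job_summary() {
    cores_per_node="${BC_CORES_PER_NODE:-8}"
    mem_per_node_mb="${BC_MEM_PER_NODE:-24576}"
    nodes="${BC_NODE_ALLOC:-1}"

    total_cores=$((cores_per_node * nodes))
    mem_per_core_mb=$((mem_per_node_mb / cores_per_node))

    echo "$nodes nodes, $total_cores cores, $mem_per_core_mb MBytes per core"
}

job_summary
```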
5. Program Development
5.1. Message Passing Interface (MPI)
MPI establishes a practical, portable, efficient, and flexible standard for message passing that makes use of the most attractive features of a number of existing message-passing systems, rather than selecting one of them and adopting it as the standard. See "man mpi" for additional information.
A copy of the MPI 2.2 Standard, in PDF format, can be found at the following URL:
6. Batch Scheduling
6.1. Scheduler
The Maui/TORQUE scheduling system is currently running on Talon. It schedules jobs and manages resources and job queues, and can be accessed through the interactive batch environment or by submitting a batch request. Maui/TORQUE is able to manage both single-processor and multiprocessor jobs.
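A minimal TORQUE batch request might look like the sketch below, which writes a job script to a file and prints the submission command. The file name, job name, node count, wall-clock time, and the executable ./my_app are hypothetical, and the mpirun launch line is an assumption, since this guide does not name the MPI launcher.

```shell
#!/bin/sh
# Write a hypothetical two-node job script; all names and values are placeholders.
cat > talon_job.pbs <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -q standard
#PBS -l nodes=2:ppn=8
#PBS -l walltime=01:00:00
#PBS -j oe

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# Assumed launcher; one MPI task per allocated core.
mpirun -np "$BC_MPI_TASKS_ALLOC" ./my_app
EOF

echo "Submit with: qsub talon_job.pbs"
```

Requesting ppn=8 matches the 8 cores on each compute node, so each node in the request is used in full.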
6.2. Queue Information
The following table describes the Maui/TORQUE queues available on Talon:
|Highest||debug||Debug||1 Hour||N/A||User diagnostic jobs|
| ||prepost||N/A||24 Hours||1||Pre/post processing for user jobs|
| ||urgent||Urgent||N/A||N/A||Jobs designated as Urgent by the DoD HPCMP|
| ||staff||N/A||368 Hours||N/A||DSRC staff testing only|
| ||high||High||N/A||N/A||Jobs belonging to DoD HPCMP High Priority Projects|
| ||challenge||Challenge||168 Hours||N/A||Jobs belonging to DoD HPCMP Challenge Projects|
| ||cots||Standard||96 Hours||N/A||ABAQUS, Fluent, and Cobalt jobs|
| ||interactive||Standard||12 Hours||N/A||Interactive jobs|
| ||standard-long||Standard||200 Hours||N/A||DSRC permission required|
| ||standard||Standard||96 Hours||N/A||Non-Challenge user jobs|
|Lowest||background||Background||24 Hours||N/A||User jobs that will not be charged against the project allocation|
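A submission wrapper can check a requested run time against the queue limits above before calling the scheduler. The sketch below hard-codes the hour values from the table, reading them as maximum wall-clock times; the function names are hypothetical, and queues whose time is listed as N/A are treated as having no limit to check.

```shell
#!/bin/sh
# Print the time limit in hours for a queue, taken from the table above.
queue_max_hours() {
    case "$1" in
        debug)              echo 1 ;;
        interactive)        echo 12 ;;
        prepost|background) echo 24 ;;
        cots|standard)      echo 96 ;;
        challenge)          echo 168 ;;
        standard-long)      echo 200 ;;
        staff)              echo 368 ;;
        urgent|high)        echo none ;;  # listed as N/A
        *)                  return 1 ;;   # unknown queue
    esac
}

# Succeed if the requested hours fit within the queue's limit.
walltime_ok() {
    limit=$(queue_max_hours "$1") || return 1
    [ "$limit" = none ] || [ "$2" -le "$limit" ]
}

walltime_ok standard 48 && echo "standard: 48 hours is within the limit"
walltime_ok debug 2     || echo "debug: 2 hours exceeds the limit"
```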
6.3. Interactive Logins
When you log in to Talon, you will be running in an interactive shell on a project node. The project nodes provide login access for Talon and support such activities as compiling, editing, and general interactive use by all users. Users and projects will be assigned a specific project node to support login and development, as well as to host long-running processes, services, or scripts for the project application. The preferred method for running resource-intensive executions is to use an interactive batch session.
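An interactive batch session is requested through the scheduler's interactive option. The sketch below builds and prints such a request for one full compute node in the interactive queue; the helper name interactive_request is hypothetical, and the 12-hour cap is taken from the queue table.

```shell
#!/bin/sh
# Print a qsub command requesting an interactive session on one 8-core node.
interactive_request() {
    hours="${1:-1}"
    if [ "$hours" -lt 1 ] || [ "$hours" -gt 12 ]; then
        echo "hours must be between 1 and 12" >&2
        return 1
    fi
    printf 'qsub -I -q interactive -l nodes=1:ppn=8 -l walltime=%02d:00:00\n' "$hours"
}

interactive_request 2
```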
6.4. Advance Reservations
The Advance Reservation Service (ARS) is not available on Talon and all projects share the assigned or shared batch queues. The Talon compute cluster or a group of compute nodes can be reserved for specific scheduled project events but must be requested in advance from the Talon System Manager and coordinated with other users/projects.