**Getting Started**

This section shows how to log in to the system and submit a basic job on the cluster. If you do not have an account already, please apply for one by following the link [[applying_for_an_account|applying for an account]].

**Logging In**

To connect to the cluster, ssh to ranger.zamren.zm using the username and password you registered during account application. Once you log in, you will be asked to reset your password.
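The login step can be sketched as a single SSH command. It is shown as comments because it can only be run against the cluster itself, and "username" is a placeholder for your own account name:

```shell
# From a terminal on your local machine, connect to the cluster head node:
#
#   ssh username@ranger.zamren.zm
#
# Enter the password you registered during account application; on first
# login the system will prompt you to choose a new password.
```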

**Environment Variables**

To see the variables set in your environment, execute the command:

env
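Beyond dumping everything with env, you can inspect or set individual variables. A quick sketch (HOME exists in any normal login shell; MYVAR is a made-up example):

```shell
# List just the first few environment variables instead of the full dump:
env | head -n 5

# Inspect a single variable:
echo "$HOME"

# Set a variable for the current shell session and read it back:
export MYVAR="hello"
echo "$MYVAR"
```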

**Modules**

The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles. To see the available modules, type the command:

module avail

----------------/path/toModules-------------------------------------------------

cmake/3.5.0     FFTW/3.3.4      gmp/4.3.2       gromacs/5.1.0   mpich/3.1       mvapich/2.1     openmpi/1.10.1
FFTW/2.1.5      gcc/4.4.7       gotoblas2/      gsl/1.9         mpich/3.2       openblas/0.2.15
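A typical workflow is to load the modules you need before compiling or running anything. The commands below are shown as comments because the module command only exists on the cluster; the versions are taken from the sample listing above:

```shell
# List every available modulefile:
#   module avail
#
# Load a compiler and an MPI stack into your environment:
#   module load gcc/4.4.7 openmpi/1.10.1
#
# Show what is currently loaded:
#   module list
#
# Remove a module from your environment:
#   module unload gcc/4.4.7
```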

**Submitting a Job**

The cluster uses [[http://www.schedmd.com|SLURM]] for scheduling and resource management. Slurm is an open-source cluster resource management and job scheduling system that strives to be simple, scalable, portable, fault-tolerant, and interconnect-agnostic. Slurm has currently been tested only under Linux.

As a cluster resource manager, Slurm provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.

Key commands for viewing the status of the cluster and managing jobs are:

**sinfo** reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.

**squeue** reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.

**srun** is used to submit a job for execution or to initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (amount of memory and disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.

**sbatch** is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.

**scancel** is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
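Putting the commands above together, a minimal batch script might look like the following. The job name, resource limits, and output file are illustrative; adjust them to your needs. The sketch writes the script to a file so you can inspect it, and the submission and monitoring commands are shown as comments since they only work on the cluster:

```shell
# Create a minimal SLURM batch script (all #SBATCH values are examples):
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello        # a name for the job
#SBATCH --output=hello_%j.out   # stdout file; %j expands to the job ID
#SBATCH --ntasks=4              # number of tasks (processes) to run
#SBATCH --time=00:05:00         # wall-clock limit (HH:MM:SS)

# Launch the tasks on the allocated nodes:
srun hostname
EOF

# On the cluster you would then run:
#   sbatch hello_job.sh     # submit the script; prints the job ID
#   squeue                  # watch the job's state in the queue
#   scancel <jobid>         # cancel it if needed
```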

More information can be obtained from the [[https://computing.llnl.gov/linux/slurm/quickstart.html|Slurm Quick Start User Guide]].
getting_started_guide.txt · Last modified: 2017/10/19 10:53 (external edit)