This shows you the differences between two versions of the page.
getting_started_guide [2016/02/16 10:57] Editor |
getting_started_guide [2017/10/19 10:53] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | **Getting Started** | + | =====Getting Started |
This section shows how to log in to the system and submit a basic job on the cluster. If you do not have an account already, please apply for one by following the link [[applying_for_an_account|]] | This section shows how to log in to the system and submit a basic job on the cluster. If you do not have an account already, please apply for one by following the link [[applying_for_an_account|]] | ||
Line 24: | Line 24: | ||
- | **Submitting | + | **Submitting |
- | The cluster uses [[http:// | + | [[http:// |
- | **sinfo** reports | + | In order to use the HPC compute |
+ | **Creating a PBS Script** | ||
- | **squeue** reports | + | To set the parameters for your job, you can create |
+ | Here is a sample PBS file, named myjob.pbs, followed by an explanation of each line of the file. | ||
- | **srun** | + | < |
+ | # | ||
+ | #PBS -l nodes=1: | ||
+ | #PBS -l walltime=00: | ||
+ | cd / | ||
+ | | ||
+ | sas my.sas | ||
+ | </ | ||
+ | |||
+ | The first line in the file identifies which shell will be used for the job. In this example, bash is used. | ||
+ | The second line specifies the number of nodes and processors desired for this job. In this example, one node with two processors is being requested. | ||
+ | The third line in the PBS file states how much wall-clock time is being requested. In this example, 59 seconds of wall time have been requested. | ||
+ | The fourth line tells the HPC cluster | ||
+ | The fifth line tells the cluster which program you would like to use to analyze your data. In this example, the cluster sources the environment | ||
+ | The sixth line tells the cluster to run the program. In this example, it runs SAS, specifying my.sas as the argument in the current directory, / | ||
+ | To submit your job without requesting additional resources, issue the command | ||
+ | **qsub myjob.pbs** | ||
+ | |||
+ | If you have the myjob.pbs set up as explained | ||
+ | |||
+ | Below are some examples of these overrides. | ||
+ | |||
+ | **Requesting Additional Wall Time** | ||
+ | |||
+ | If you need more or less wall time than your PBS script requests, you can override the value on the qsub command line without editing the script. | ||
+ | |||
+ | In the example script above, we have requested 59 seconds | ||
+ | |||
+ | **qsub -l walltime=0:05:00 myjob.pbs** | ||
+ | will ask PBS for a limit of five minutes of wall time. If your job does not finish within the specified time, it will be terminated. | ||
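The walltime value is written as hours, minutes and seconds separated by colons. As a minimal sketch, the helper below (the name `to_walltime` is hypothetical, not part of PBS) converts a number of seconds into that form so it can be passed to qsub:

```shell
# Convert a number of seconds to the HH:MM:SS form used by -l walltime=.
# to_walltime is a hypothetical helper name, not a PBS command.
to_walltime() {
    total=$1
    printf '%02d:%02d:%02d\n' \
        $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

to_walltime 59     # prints 00:00:59
to_walltime 300    # prints 00:05:00

# Possible use on a login node (not executed here):
#   qsub -l walltime="$(to_walltime 300)" myjob.pbs
```

Leaving some margin above the expected run time is usually sensible, since a job that exceeds its limit is terminated.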
+ | |||
+ | **Requesting Nodes and Processors** | ||
+ | |||
+ | You may also alter the number of nodes and processors requested for a job by using the qsub command. In the example script, we have requested one node with two processors, or one dual-processor | ||
+ | |||
+ | If you later decide that you need four HPC nodes for your job but will use only one of the two processors on each node, use the following command: | ||
+ | **qsub -l walltime=0: | ||
+ | |||
+ | If you want to use both processors on each HPC node, you should use the following command: | ||
+ | **qsub -l walltime=0: | ||
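The two requests above differ only in how many processors are asked for on each node. A minimal sketch, assuming the usual Torque/PBS `nodes=N:ppn=P` resource syntax (the full resource strings are cut off in the commands above, so `ppn` and the helper name `node_request` are assumptions to check against your site's documentation):

```shell
# Build a node/processor resource string for qsub -l.
# Assumes the common Torque form nodes=N:ppn=P; node_request is a
# hypothetical helper name used only for illustration.
node_request() {
    nodes=$1
    ppn=$2
    echo "nodes=${nodes}:ppn=${ppn}"
}

node_request 4 1    # prints nodes=4:ppn=1  (four nodes, one processor each)
node_request 4 2    # prints nodes=4:ppn=2  (four nodes, both processors)

# Possible use on a login node (not executed here):
#   qsub -l "$(node_request 4 2)" myjob.pbs
```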
+ | |||
+ | **Requesting a Specific Network** | ||
+ | |||
+ | To run your job on the InfiniBand network, add the IB feature to your PBS script. | ||
+ | |||
+ | **#PBS -l nodes=1: | ||
+ | |||
+ | MPI jobs using OpenMPI 1.6.4 or later can run on the InfiniBand network. | ||
+ | |||
+ | NOTE: Only one network should be specified for each job. If no network is specified, the job will be scheduled to run on whichever network is available. | ||
+ | |||
+ | **Checking Job Status** | ||
+ | |||
+ | To check on the status of your job, you will use the qstat command. The command | ||
+ | **qstat -u [your username]** | ||
+ | will show you the current status of all your submitted jobs. | ||
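As a small convenience, the status check can be wrapped so that it fills in your own username. This is only a sketch: `show_my_jobs` is a hypothetical function name, and it assumes `qstat` is on your PATH, which is normally true only on the cluster login nodes.

```shell
# List the current user's jobs, or explain why that is not possible.
# show_my_jobs is a hypothetical wrapper, not part of PBS.
show_my_jobs() {
    if command -v qstat >/dev/null 2>&1; then
        # Fall back to id -un if USER is not set in the environment.
        qstat -u "${USER:-$(id -un)}"
    else
        echo "qstat not found; run this on a cluster login node"
    fi
}

show_my_jobs
```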
+ | |||
+ | More information can be obtained from the [[http:// | ||
- | **scancel** is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step. |