High Performance Computing | May 13, 2008
User Policies | UKy HPC
Effective July 1, 2000, all users of the high performance scientific computing facilities at UK will be assigned resource allocations. This is a significant change from previous UK policies. However, it will ensure that our finite supercomputing resources are allocated fairly and equitably.
Students, faculty, and staff at the University of Kentucky and at
other public universities in the Commonwealth (subject to certain
conditions) may be eligible for accounts on these facilities, in
addition to national users assigned through the National
Computational Science Alliance. Every user will be assigned to a
project. Each project will receive a yearly allocation of three
separate resources: CPU hours, local disk space, and mass storage. A
project may have more than one user assigned, including the principal
investigator (PI) or investigators and the others associated with
the project. Each individual will be assigned a single userid. A
particular userid may also belong to more than one project.
The allocation for each resource is the total amount of that
resource that may be used by all of the project's userids combined,
with the exception that local disk allocations will be by userid
rather than by project. Each userid will get the default local disk
allocation, for up to three userids per project. If more userids are
needed, then the allocations will be reduced proportionately. Each
project will be assigned a group id that each of its userids will be
a member of. This will allow the project members to arrange project
disk sharing securely and easily. More complicated arrangements can
be worked out when necessary. Requests for more local disk space
should be made to the system administrators at
help-hpc@uky.edu.
More information about High Performance Computing at the University
of Kentucky is available at
hpc.uky.edu.
Resources Available:
CPU –
The theoretical maximum number of CPU hours on the UK system is
1,927,200 per year (365*24*220).
Local Disk –
The UK system has approximately 640 GB of home directory disk
space on the login node; this space will be divided among the
various users. Each userid will receive a standard allocation. Any
project needing more home directory disk space should request
additional space through the system administrators at
help-hpc@uky.edu.
Mass Storage –
The UK mass storage system has approximately 1.8 TB available
for research allocation. Each project on the UK system will receive
a standard allocation on the mass storage system. Any project
needing more mass storage space should request additional space
through the system administrators.
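The CPU figure quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that the factor 220 in 365*24*220 is the number of processors on the system, which the policy does not state explicitly; the function names are illustrative only.

```python
# Reproduces the policy's theoretical CPU-hour ceiling (365*24*220).
DAYS_PER_YEAR = 365
HOURS_PER_DAY = 24
PROCESSORS = 220  # assumption: the factor 220 is the processor count


def max_cpu_hours(processors: int = PROCESSORS) -> int:
    """CPU hours available if every processor ran for the whole year."""
    return DAYS_PER_YEAR * HOURS_PER_DAY * processors


def share_of_system(cpu_hours: float) -> float:
    """Fraction of the yearly ceiling that a given allocation represents."""
    return cpu_hours / max_cpu_hours()


if __name__ == "__main__":
    print(max_cpu_hours())                    # 1927200
    print(f"{share_of_system(10_000):.2%}")   # 0.52%
```

Under that assumption, even the largest standard allocation of 10,000 CPU hours is roughly half a percent of the theoretical yearly capacity.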
Policies and Procedures:
All users of University of Kentucky Computing and Communications
facilities are expected to abide by all relevant policies and
procedures. For more information, please see the
UK Information Systems Policies document.
In particular, accounts are granted to one person only and no
other person may use the account. No exceptions will be
made. Failure to abide by these policies may result in the
suspension of computing privileges.
If your account becomes inaccessible, please contact
User Account Services by email (accounts@lsv.uky.edu) or by
phone (859-257-1300). Ask if the account has been suspended
or disabled for any reason. If an account actually has been
suspended, then send email to the support team
(help-hpc@uky.edu) for more information.
Allocation procedures:
Accounts will be established according to the following guidelines:
| Allocation Level | CPU              | Local Disk | Mass Storage |
| Group 1          | up to 1,000 hrs  | 500 MB     | 25 GB        |
| Group 2          | up to 10,000 hrs | 1 GB       | 100 GB       |
Startup (small) accounts will be assigned upon request
to User Account Services. This level of allocation is intended for
new users or users who have very modest computational requirements,
up to 1000 CPU hours per year. Any eligible Principal Investigator
should submit a request, using these forms.
A full proposal is not normally required. Requests for startup awards
may be submitted at any time. A startup account may be renewed each
year, or the researchers may apply for a Group 2 or 3 account.
The University of Kentucky RAC will review and allocate resources for all Group 2 peer-reviewed allocations. This committee meets quarterly and makes allocations for proposals requesting 1001 to 10,000 CPU hours annually. An award is for a period of one year. The application form and the proposal requirements are available here.
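As an illustration of how the two allocation levels partition requests, the sketch below maps a requested number of CPU hours to the level in the table above. It is only a reading of the published thresholds; the class and function names are hypothetical, and requests above 10,000 hours are rejected because the table defines no level for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationLevel:
    name: str
    max_cpu_hours: int   # yearly ceiling for the level
    local_disk: str      # as printed in the policy table
    mass_storage: str    # as printed in the policy table


# Values copied from the allocation table in this policy.
LEVELS = (
    AllocationLevel("Group 1", 1_000, "500 MB", "25 GB"),
    AllocationLevel("Group 2", 10_000, "1 GB", "100 GB"),
)


def level_for(cpu_hours: int) -> AllocationLevel:
    """Return the smallest level whose CPU ceiling covers the request."""
    if cpu_hours < 0:
        raise ValueError("CPU hours cannot be negative")
    for level in LEVELS:
        if cpu_hours <= level.max_cpu_hours:
            return level
    raise ValueError("request exceeds the levels listed in the table")


if __name__ == "__main__":
    print(level_for(800).name)    # Group 1
    print(level_for(1_001).name)  # Group 2
```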
Login procedures:
For security reasons, interactive logins to the cluster will be by secure-shell (SSH) only. See this page for more information. No exceptions will be made.
Submitting Jobs:
The batch system must be used for all jobs. Non-batch jobs on
any node will be killed, unless special permission has been obtained
in advance from a system administrator
(help-hpc@uky.edu).
Please checkpoint long-running jobs! See
this page for more information.
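For readers unfamiliar with the idea, the following is a minimal sketch of application-level checkpointing. It is a generic pattern, not the procedure documented on the page referenced above; the file name `checkpoint.json` and the placeholder workload are assumptions made for illustration.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical file name


def save_checkpoint(state, path=CHECKPOINT):
    """Write the state to a temporary file, then rename it into place,
    so that a job killed mid-write never leaves a truncated checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp, path)


def load_checkpoint(path=CHECKPOINT):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"step": 0, "total": 0}


def run(n_steps, every=100, path=CHECKPOINT):
    """Resume from the last checkpoint and save again every `every` steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], n_steps):
        state["total"] += step          # placeholder for real work
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state
```

If the job is killed and resubmitted, `run` picks up from the last saved step instead of starting over, so at most `every` steps of work are lost.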
Control the number of processors that each job uses by submitting it
to the proper batch queue. See this page for a summary of the job
queues. If the job uses the Message Passing Interface (MPI), then
specify in the mpirun command the number of processors appropriate
for the job and for the queue in which it runs. For more information
on using MPI, see this page.
Jobs using an excessive number of processors for a queue may be
killed.
To avoid slowing jobs down, transfer the necessary files to the temporary disk area on the node where the job will run. Output files may also be written to this area. However, since temporary disk space on the cluster is limited, please use the temporary disk space only for storage of those files that will be needed immediately. Files to be kept may be copied to the local disk space, or to the mass storage system (see Data Storage).
Files in the temporary disk areas are not backed
up and may be deleted without warning, if necessary, to provide
proper service.
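The staging pattern described above can be sketched as follows. This is an illustration only: the function name `run_staged` and the `scratch_root` parameter are hypothetical, and the actual location of the temporary disk area on each node is not given in this policy, so the sketch falls back to the system default temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(inputs, work, keep_dir, scratch_root=None):
    """Copy inputs to a scratch directory, run work(scratch) there,
    copy the outputs it returns to keep_dir, then clean up the scratch."""
    keep_dir = Path(keep_dir)
    keep_dir.mkdir(parents=True, exist_ok=True)
    # scratch_root is an assumption: point it at the node's temporary area.
    scratch = Path(tempfile.mkdtemp(prefix="job-", dir=scratch_root))
    try:
        for src in inputs:
            shutil.copy2(src, scratch / Path(src).name)
        outputs = work(scratch)  # work returns the output paths it created
        kept = []
        for out in outputs:
            dest = keep_dir / Path(out).name
            shutil.copy2(out, dest)
            kept.append(dest)
        return kept
    finally:
        # Temporary space is limited, so release it even if the job fails.
        shutil.rmtree(scratch, ignore_errors=True)
```

Copying results out before the scratch directory is removed matters because files left in the temporary areas are not backed up and may be deleted.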
Job Queues:
Job queues have been established to assure equitable distribution of, and access to, the entire complex for all users. Normally, each userid will be limited to 64 job-slots on the complex at a time for parallel jobs (projects are allowed an additional 50% of the user job-slot quota); i.e., two (or more) users on the same project would be allocated 96 job-slots in total, not 128. Otherwise, a large number of individual users on a given project could consume more than their fair share of resources.
The job-slot limit must be viewed in the context of the queue. A special case is the serial queue. The 50% rule still applies there, i.e., the project gets 50% more than the individual limit for that queue. (The individual serial limit is lower than the 64-slot parallel limit outlined above, which would exceed all possible serial jobs combined on the existing resources.)
When a userid belongs to more than one project, the job-slots belonging to it will count against the limit of each project that the userid could be running under. A serial job uses 1 job-slot; an N-way parallel job uses N job-slots. For example, a userid might run two 32-way jobs or eight 8-way jobs at a time. There is no penalty for running two 8-way parallel jobs in a parallel queue, unless the user reserves more slots for a job at job submission than it actually uses. If a user specifies a set number of job-slots when submitting a job, the higher of the specification or the actual usage will count towards the limit. When the resources on the complex are underutilized, these restrictions may be relaxed in various ways to optimize throughput.
When the computing resources are near exhaustion, restrictions
may be tightened; the ultimate goal of the policy is to give all
researchers access to a fair share of the computing resources.
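The job-slot arithmetic for parallel jobs can be sketched as below. This only illustrates the rules as stated in this policy (64 slots per userid, 50% more per project, and the higher of reserved or used slots being charged); the batch system enforces the real limits, and the function names here are hypothetical.

```python
USER_SLOT_LIMIT = 64
# Projects get the user quota plus an additional 50%: 64 + 32 = 96.
PROJECT_SLOT_LIMIT = USER_SLOT_LIMIT + USER_SLOT_LIMIT // 2


def slots_charged(reserved, used):
    """Slots counted against the limit: the higher of the number reserved
    at submission (None if unspecified) and the number actually used."""
    if reserved is None:
        return used
    return max(reserved, used)


def may_start(new_slots, user_running, project_running):
    """True if a job charging new_slots fits both the userid limit
    and the project limit, given the slots already in use."""
    return (user_running + new_slots <= USER_SLOT_LIMIT
            and project_running + new_slots <= PROJECT_SLOT_LIMIT)


if __name__ == "__main__":
    print(PROJECT_SLOT_LIMIT)        # 96
    print(slots_charged(16, 8))      # 16
    print(may_start(32, 32, 32))     # True
    print(may_start(32, 0, 80))      # False
```

The last call shows the project limit at work: a user with no running jobs still cannot start a 32-way job when project colleagues already hold 80 of the 96 project slots.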
Initially, the largest standard job queue will allow 32-way
jobs. This limit will be reevaluated as necessary.
Highly Parallel Jobs and Timing Runs:
Principal Investigators should note in their proposal when there will
be a requirement for highly parallel runs (larger than 64-way), or
when a specific group of processors or nodes will need to be
reserved for a timing run. Please send mail to the system
administrators (help-hpc@uky.edu) to schedule these runs. Depending
upon machine load and scheduling, it may take considerable time to
get these jobs scheduled. Special reservations will normally be
limited to eight hours at a time.
Account Termination:
After an account is terminated for any reason, the data associated with it will be retained for one year.
It is the researcher's responsibility to copy or transfer any data that must be kept past this period.
Please contact technical support (help-hpc@uky.edu) for help, if necessary.
Help and FAQs:
For most problems, please consult the web pages at hpc.uky.edu first. In particular, the Frequently Asked Questions page may be useful.
Next, try emailing the technical support group (help-hpc@uky.edu).
Please avoid calling or emailing individuals
directly. Email to the help address will generally be answered more
quickly.