Introduction to the Penrose Cluster
Accessing Penrose Cluster and Graphical Desktop Environment
qsub - Submit scripts to queue
Links to Useful Software and Documentation
The Penrose cluster is a Linux cluster that appears to the user as a single
large multiprocessing compute server capable of running many
computationally intensive jobs. The system currently consists of a front-end node
server and eight (9) compute nodes.
Researchers and educators in the department can take advantage of this
computing power by running a range of jobs from shell scripts to larger
computational jobs and batch processes.
The Penrose cluster uses Rocks 4.3 cluster software, which is based on
CentOS. A number of chemistry, biochemistry and materials science
computational packages are installed and configured.
Researchers should log in to the Penrose head node, penrose.nsm.iup.edu, using an SSH client program (e.g.
putty.exe) and their username and password.
The username and password can be obtained by contacting Dr. LeBlond. You must use the SSH2 protocol. Files can be transferred to and from your home
directory with SFTP (e.g. Filezilla). To run graphical programs remotely and display them on your desktop PC
you will also need an X server.
Both Xming and Cygwin/X have been used successfully.
Your home directory on the Penrose
cluster is /home/$USER. The initial quota limit for your home directory is set
at XXXGB.
The recommended node-local temporary workspace, or scratch space, is
/state/partition1/$USER, the contents of which are subject to
automatic deletion after 30 days. This directory
will be configured by the system administrator during account setup. It is your responsibility to clean your
scratch space if a job crashes.
An out-of-control job could fill the disk and crash the node.
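One way to look for leftovers after a crashed job is sketched below. It assumes GNU find (standard on CentOS); the function name list_stale_scratch and the seven-day threshold are illustrative choices and are not part of the cluster setup.

```shell
#!/bin/bash
# list_stale_scratch DIR [DAYS]
# Print regular files under DIR that were last modified more than DAYS
# full days ago (DAYS defaults to 7). Hypothetical helper, not installed
# on the cluster.
list_stale_scratch() {
    local dir="$1" days="${2:-7}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -type f -mtime +"$days" -print
}

# Example (run on the compute node that hosted the job):
#   list_stale_scratch "/state/partition1/$USER" 7
```

Review the list before deleting anything; files that are no longer needed can then be removed with rm.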
The following computational chemistry,
bioinformatics and visualization software packages are available on the Penrose
cluster.
Note: Using the visualization software described
below from a remote Windows PC requires Xming or an equivalent X server.
Some of these applications
require OpenGL and will not work with Xming on some video cards. If a graphical application does not work
remotely, you will need to log in at the front-end node in person.
The following compilers are available
on the Penrose cluster. In
addition, a variety of other programming languages are available,
including Python and Perl.
· C/C++ Compilers
  · GNU: gcc / g++
  · Intel: icc 10.1
· Fortran Compilers
  · GNU: g77 (for FORTRAN 77)
  · Intel: ifort 10.1 (FORTRAN 77, 90, 95)
SSH (Secure Shell) can be used to connect to the Penrose cluster. Once connected to the front-end node
(penrose), you can begin your computations.
If you would like to use graphical applications, you will need to install
NXClient for Windows or Linux on your computer.
Access to terminal via SSH (PuTTY)
1) Download and install PuTTY:
http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
2) Configure PuTTY.
Run the PuTTY Windows SSH client and enter 'penrose.nsm.iup.edu' as the Host Name.
Enter a name for this session, such as 'Penrose-VNC', and click the [Save] button.

Next, go back to the [Session] window, click on [Save] and then click on the [Open] button. Log in to penrose using your password and username.
Your tunnel
is now active. You do not need to run anything else in this PuTTY terminal
window. You are connected to the
Penrose cluster and can begin computing at this point. If you would like to run X Windows and
graphical applications, install NXClient as outlined below.
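Linux and Mac OS X users can use the OpenSSH command-line client instead of PuTTY. The sketch below writes a client configuration entry to a separate file so that nothing in ~/.ssh/config is overwritten. The file name penrose_ssh_config and the placeholder your_username are assumptions, and X11 forwarding only helps if an X server is running on your own machine.

```shell
#!/bin/bash
# Write a stand-alone OpenSSH client configuration for the Penrose head node.
# "your_username" is a placeholder; substitute the login assigned to you.
cfg="./penrose_ssh_config"

cat > "$cfg" <<'EOF'
Host penrose
    HostName penrose.nsm.iup.edu
    User your_username
    ForwardX11 yes
EOF

# Print the commands that use this file; nothing is connected here.
echo "Connect with:        ssh -F $cfg penrose"
echo "Transfer files with: sftp -F $cfg penrose"
```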
Accessing Graphical Applications on the Penrose Cluster
1) Download the NXClient appropriate for your computer from http://www.nomachine.com/download.php.
2) Email Dr. Carl LeBlond for the key required to access the Penrose cluster.
3) Install NXClient. Use the following settings:
a) Host: penrose.nsm.iup.edu
b) Desktop: unix/gnome
c) Choose the appropriate network setting, e.g. LAN if you are on campus, or ADSL if you are setting it up at home over a DSL connection.
4) Install the key (click Key/Import and point to the key).
5) Log in to the cluster using your assigned login and password.
Note: The performance of the NX client depends on your graphics card and network connection.
This section describes the Sun Grid
Engine 6.0 queue system on the Penrose Linux cluster and provides an
introduction to help users get started using SGE. For more details, consult the appropriate man
pages. The sge_intro man page (type "man sge_intro") gives a brief description of
all of the SGE commands. There is only one queue and the user interacts with
this queue by the commands summarized below.
Typically a job command file is prepared and submitted to the queue with
the qsub command.
e.g. To submit the serial gromacs job command file, type:
qsub gromacs-ser
e.g. To delete the job with job ID 201, type:
qdel 201
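When scripting submissions it helps to capture the job ID so that it can later be passed to qdel. The sketch below assumes qsub confirms a submission in the usual SGE form, Your job 201 ("gromacs-ser") has been submitted; the helper name sge_job_id is hypothetical.

```shell
#!/bin/bash
# sge_job_id: read qsub's confirmation message on stdin and print the job ID.
# Assumes the message looks like:
#   Your job 201 ("gromacs-ser") has been submitted
sge_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Demonstration with a canned message (no queue needed):
echo 'Your job 201 ("gromacs-ser") has been submitted' | sge_job_id

# On the cluster the same helper could be used as:
#   jobid=$(qsub gromacs-ser | sge_job_id)
#   qstat            # list pending and running jobs
#   qdel "$jobid"    # remove the job from the queue
```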
An SGE job has to be defined by a job command file.
There are many options available for job command files. An example job command file that runs
the GROMACS programs grompp and mdrun is provided below. This file runs the job on 2 nodes in
parallel using MPI. The job's
error and output files will be written to the current working directory and
will be called gromacs-par.errout and gromacs-par.out.
Example job command file: gromacs-par
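A minimal sketch of what such a job command file might contain is given here. It is not the file installed on the cluster: the parallel environment name mpi follows the Espresso example in this document, while launching mdrun through mpirun with $NSLOTS is an assumption about how MPI is configured. The sketch is written to gromacs-par-sketch so that an existing gromacs-par is not overwritten.

```shell
#!/bin/bash
# Write an illustrative SGE job command file to gromacs-par-sketch.
# Lines beginning with "#$" are read by SGE as submission options.
cat > gromacs-par-sketch <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -N gromacs-par
#$ -cwd
#$ -pe mpi 2
#$ -o gromacs-par.out
#$ -e gromacs-par.errout

# Preprocess the topology, then run the simulation in parallel.
# NSLOTS is set by SGE to the number of slots granted by "-pe".
grompp -v -np 2
mpirun -np $NSLOTS mdrun -v -np 2
EOF

echo "Wrote gromacs-par-sketch; submit with: qsub gromacs-par-sketch"
```

Here -cwd runs the job in the directory it was submitted from, and -o and -e name the output and error files.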

When a job from the SGE
queue has completed, the job's error and output files will be available
for you to review. These are standard
ASCII text files which may be viewed with the cat or more commands. They can also be opened in an editor such
as vi or emacs.
To learn more about these commands, see their man pages (e.g. type "man
vi"). There are various applications for reviewing
output from computational chemistry packages.
See the Visualization software overview and Application
Specific Notes for details of their operation.
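A quick first check after a job finishes is whether anything was written to the error file. The sketch below assumes the files follow the NAME.out and NAME.errout pattern used for gromacs-par; the helper name job_status is hypothetical. Some programs write progress messages to the error file, so content there does not by itself mean the job failed.

```shell
#!/bin/bash
# job_status NAME
# Report whether NAME.errout has content, then show the last lines of NAME.out.
job_status() {
    local name="$1"
    if [ -s "$name.errout" ]; then
        echo "$name: error file has content"
        cat "$name.errout"
    else
        echo "$name: error file is empty"
    fi
    # Show the end of the output file if it exists.
    if [ -f "$name.out" ]; then
        tail -n 5 "$name.out"
    fi
    return 0
}

# Example: job_status gromacs-par
```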
1) Construct input file - The input file contains system and control parameters, the type of calculation,
and the molecular geometry and charge. An input file can easily be prepared with the
Linux program gabedit or the Windows application Chemcraft. You could also prepare the file with a text
editor (e.g. vi or emacs on Unix systems or Notepad in
Windows); however, gabedit and Chemcraft allow you to construct molecules and
obtain their coordinates.
2) Submit job - A script file named gamess should be invoked. This script
starts the GAMESS command shell and should be started from your current
directory (see below). From this shell
you can submit GAMESS jobs to the SGE queue.

For serial execution, type:
gamess>gamess inputfilename
Note: You should not include the .inp extension.
For parallel execution on 4 nodes, type:
gamess>gamess -n 4 inputfilename
3) Analysis and visualization - The output from the GAMESS job will be located in the text file inputfilename.out in your current directory. The output can be visualized with either molden
or VMD, locally or using a remote X
server. Expect very poor
performance when visualizing with a remote X server.
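As an illustration of the input file described in step 1, the sketch below writes a small GAMESS input for a single-point RHF/STO-3G energy of water. The file name water.inp, the geometry and the choice of method are illustrative assumptions; consult the GAMESS input documentation for the full set of keywords.

```shell
#!/bin/bash
# Write a minimal GAMESS input file. Group names such as $CONTRL must begin
# in column 2, hence the leading space on those lines.
cat > water.inp <<'EOF'
 $CONTRL SCFTYP=RHF RUNTYP=ENERGY ICHARG=0 MULT=1 $END
 $BASIS GBASIS=STO NGAUSS=3 $END
 $DATA
Water single-point energy
C1
O   8.0   0.000   0.000   0.000
H   1.0   0.757   0.586   0.000
H   1.0  -0.757   0.586   0.000
 $END
EOF

echo "Wrote water.inp; from the GAMESS command shell run: gamess water"
```

$CONTRL holds the calculation type, charge and multiplicity, $BASIS the basis set, and $DATA the title, point group and Cartesian coordinates (element, nuclear charge, x, y, z).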
GROMACS is a collection of programs
for molecular dynamics simulations. In
the simplest case you would prepare a topology file (.top) and a molecular dynamics parameter
file (.mdp) before executing the program. The
topology file (.top)
is first converted to a binary topology file (.tpr) with the command grompp.
Once you have created the binary topology file, the molecular dynamics
simulation can be started with the mdrun command. Read the getting started section of the
manual and work your way through the examples given at http://www.gromacs.org/documentation/reference/online/getting_started.html.
You should run these from the command
prompt in the directory where your topology and parameter files are
located. An example of this procedure is
given below. You could also submit these
to the SGE queue for parallel or serial execution with the gromacs-par and gromacs-ser scripts.
Example: Copying GROMACS examples and tutorials to home directory and executing from the Unix command prompt.
First copy the tutor directory from the GROMACS installation to your home directory:
[cleblond@penrose ~]$ cp -r /opt/Bio/gromacs/share/gromacs/tutor ~
Change to the first demo directory, called water:
[cleblond@penrose ~]$ cd tutor/water
Convert the topology file to a binary topology file:
[cleblond@penrose ~]$ grompp -v -np 2
Initiate the molecular dynamics simulation:
[cleblond@penrose ~]$ mdrun -v -np 2
Notes: The -v option means verbose (i.e. print all output), while -np 2 indicates the simulation will run on two compute nodes. When run in this way the first node is the front-end server node and the second is compute-0-0. Be warned that this is not the typical way to run programs on the cluster, but it can be used for demos and practice. If your job does not complete before you log out, the simulation will halt.
Example: Submitting parallel and serial GROMACS jobs to SGE queue.
For parallel execution:
[cleblond@penrose ~]$ qsub gromacs-par
For serial execution:
[cleblond@penrose ~]$ qsub gromacs-ser
Note: When jobs are submitted to the SGE queue, the computations are processed on the compute nodes only. This has two advantages: 1) it does not take resources away from the front-end server node, and 2) you can log out and your simulations will continue processing.
Running a NAMD
job requires at least four files: a
CHARMM force field file, an X-PLOR format PSF file describing the molecular
geometry, an initial coordinate file in PDB format and a NAMD configuration
file. Please refer to the NAMD user
guide for more detailed information on constructing these files.
Running an Espresso job requires an input file (e.g. filename.in). This input file can be constructed either by hand (e.g. with vi or emacs) or with the program pwgui. The SGE command file, pwsub-par, should be used for submission of jobs. Edit the file to point to your input file and to reflect the number of processors you want to run on. Output files can be viewed with the program XCrySDen (type xcrysden).
Copying Espresso demos and running the examples interactively.
Make a directory to hold the input files for the demos and one for temporary files:
[cleblond@penrose ~]$ mkdir ~/PWscf
[cleblond@penrose ~]$ mkdir ~/tmp
Change to the PWscf directory:
[cleblond@penrose ~]$ cd PWscf
Copy the demo examples to the PWscf directory:
[cleblond@penrose ~]$ cp -r /opt/espresso-3.2/examples ~/PWscf
Run the example files on the front-end node (e.g. to run example01):
[cleblond@penrose ~]$ cd examples/example01
[cleblond@penrose ~]$ ./run_example
Note: There are over 30 examples to choose from. Read the README file in the examples directory for more details. More advanced users should continue to the next section for parallel execution.
The above example scripts will run
interactively on the head node. If you
wish to submit jobs to the SGE queue, use the pwscf4
scripts (type pwscf4). Read the Espresso manual for definitions of X
and Y.
Example: Submitting Espresso Batch Jobs using SGE Scripts
A parallel or serial Espresso run is initiated using the SGE command file, pwsub-par. You must edit this file (e.g. with vi) and change the name of the input file. You should also change the line near the top to reflect the number of processors you plan to run on (e.g. #$ -pe mpi 2, which requests two processors). To submit to the queue, type:
[cleblond@penrose ~]$ qsub pwsub-par
For a serial job, edit the script so that the line reads #$ -pe mpi 1.
Alternatively you could also write
your own script to submit jobs to the SGE queue. The pwsub-par script is an example. If you would like other example scripts for
optimizing K-points, Celldm or energy cutoffs, contact Dr. LeBlond.
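Changing the processor count in a job command file can also be scripted. The sketch below rewrites the #$ -pe mpi N line of a file; the helper name set_pe_slots is hypothetical, and it assumes the line appears in the form shown in the Espresso example.

```shell
#!/bin/bash
# set_pe_slots FILE N
# Replace the slot count on the "#$ -pe mpi <count>" line of FILE with N.
set_pe_slots() {
    local file="$1" n="$2" tmp
    case "$n" in
        ''|*[!0-9]*) echo "slot count must be a whole number" >&2; return 1 ;;
    esac
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # [$] matches a literal dollar sign; the rest of the line is left as is.
    sed "s/^#[\$] -pe mpi [0-9][0-9]*/#\$ -pe mpi $n/" "$file" > "$tmp" || return 1
    # Copy back through the original file so its permissions are preserved.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example: request four processors, then submit.
#   set_pe_slots pwsub-par 4
#   qsub pwsub-par
```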
AutoDock4/MGLTools
AutoDock4 (and AutoGrid4) must be run
in the directory where the rigid macromolecule, ligand and parameter files are
located. Also, if you plan to use ADT
for interactive jobs, you must start ADT from the directory where these files
are located. AutoDock and AutoGrid are
not parallel applications (i.e. they run serially on a single node). Begin by working through the tutorial at the
link provided below.
Copying AutoDock tutorials and running interactively with MGLTools (ADT).
Copy the tutorials to your home directory:
[cleblond@penrose ~]$ cp -r /share/apps/autodocksuite-4.0.1/tutorial4 ~
Change to the tutorial4/Results4 directory:
[cleblond@penrose ~]$ cd tutorial4/Results4
Start ADT:
[cleblond@penrose ~]$ adt&
You can now begin the step-by-step tutorial located at
http://autodock.scripps.edu/faqs-help/tutorial/using-autodock-4-with-autodocktools/UsingAutoDock4WithADT_1.4.5b.pdf
Example: Submitting AutoDock Batch Jobs using SGE
A batch AutoDock4 run is initiated using the SGE command file, autodock-bat. You must edit this file (e.g. with vi) and change the name of the input file.
[cleblond@penrose ~]$ qsub autodock-bat
Provided here
are links to the software packages' websites.
Cluster Software
ROCKS Cluster Software - http://www.rocksclusters.org
SSH, FTP and X servers
PuTTY SSH client software - http://www.chiark.greenend.org.uk/~sgtatham/putty/
PortaPutty - A portable version (i.e. can run from USB drive) - http://socialistsushi.com/portaputty
Filezilla FTP client software - http://filezilla-project.org/
Filezilla Portable - A portable version of Filezilla - http://portableapps.com/apps/internet/filezilla_portable
Xming X server - http://www.straightrunning.com/XmingNotes/
Computational Chemistry Software
ROCKS Bio Roll Software - http://www.rocksclusters.org/roll-documentation/bio/4.3/
GAMESS - http://www.msg.ameslab.gov/GAMESS/GAMESS.html
GROMACS - http://www.gromacs.org. See the getting started with GROMACS tutorials at http://www.gromacs.org/documentation/reference/online/getting_started.html.
NAMD - http://www.ks.uiuc.edu/Research/namd/
Espresso-3.2 - http://www.pwscf.org/
AutoDock4/MGLTools - http://mgltools.scripps.edu/
CRL is grateful to John Draganosky, Joe Shyrok, Tom Kirkpatrick and Paul Grieggs for
equipment donations and configuration help.
Special thanks to NEETC for equipment donations.