This is Info file parallel.doc, produced by Makeinfo-1.61 from the input file parallel.texi.
Parallel Implementation of CHARMM

   CHARMM has been modified to allow computationally intensive
simulations to be run on multiprocessor machines using a replicated
data model.  This version, though employing a full communication
scheme, uses an efficient divide-and-conquer algorithm for global sums
and broadcasts.  Currently the following hardware platforms are
supported:

     1. Cray T3D                      7. Intel Paragon machine
     2. Cray C90, J90                 8. Thinking Machines CM-5
     3. SGI Power Challenge           9. IBM SP1/SP2 machines
     4. Convex SPP-1000 Exemplar     10. Parallel Virtual Machine (PVM)
     5. Intel iPSC/860 gamma         11. Workstation clusters (SOCKET)
     6. Intel Delta machine

* Menu:

* Running::    Running CHARMM on parallel systems
* Status::     Parallel Code Status (as of October 1993)
* Using PVM::  Parallel Code implemented with PVM
Running CHARMM on parallel systems

1. Cray T3D (Cray-PVM)

        ~charmm/exec/t3d/charmm24 -npes 256 < Input_file > output_file &

   The same command may be used in a batch script, but without `&'.
   Example using batch:

        #QSUB -lM 16Mw
        #QSUB -lT 600:00
        #QSUB -mb -me
        #QSUB -l mpp_p=32
        #QSUB -l mpp_t=600:00
        #QSUB -q mpp
        setenv MPP_NPES 32
        ~charmm/exec/t3d/charmm24 < Input_file > output_file

   Preflx directives required: T3D UNIX PARALLEL PARAFULL PVM

2. Cray C90, J90 (Cray-PVM)

   No info yet.

3. SGI Power Challenge (PVM)

        pvm
        quit
        setenv NTPVM 16        (or NTPVM=16 ; export NTPVM)
        ~charmm/exe/sgi/charmm24 <input_file >output_file &

   Preflx directives required: SGI UNIX PARALLEL PARAFULL PVM SGIMP

4. Convex SPP-1000 Exemplar

   With PVM (see below for information on setting up a PVM hostfile):

        mpa -sc <name_of_subcomplex> /bin/csh
        setenv PVM_ROOT /usr/convex/pvm
        /usr/lib/pvm/pvm
        quit
        ~/pvm3/bin/CSPP/charmm24 -n 16 <input_file >output_file &
        ~charmm/exe/cspp/charmm24 <input_file >output_file &

   To check which subcomplexes are available, use the scm utility.
   (For information on how to set up a PVM hostfile see *note 1: Using
   PVM.)

   Preflx directives required: CSPP UNIX PARALLEL PARAFULL PVM HPUX
   SYNCHRON (GENCOMM)

   Note: The first time that you build CHARMM with PVM, specify the P
   option with install.com.  You will be asked for the location of the
   PVM include files and libraries.  If these do not change and you do
   not reconstruct the Makefiles, you do not have to specify this
   option each time you run install.com.

   With MPI:

        mpa -DATA -STACK -sc <name_of_subcomplex> \
            ~charmm/exe/cspp/charmm24 -np <n> <input_file >output_file &

   where <n> is the number of processors to use.  There are two
   environment variables that can be set:

        setenv MPI_GLOBMEMSIZE <m>

   where <m> is the size of the shared memory region on each hypernode
   in bytes.  The default is 16MB.  And:

        setenv MPI_TOPOLOGY <i>,<j>,<k>,<l>,...

   where <i>, <j>, <k>, <l>, ... are the number of tasks on each
   hypernode.  The sum must equal the number of processors specified
   with -np on the command line.  This is optional; the default
   behavior is generally what you want.  If you are using a
   sub-complex with more than one hypernode, you may want to include
   '-node 0' after mpa to keep the 0th process on the 0th hypernode of
   the sub-complex.

   Preflx directives required: CSPP UNIX PARALLEL PARAFULL HPUX MPI
   CSPPMPI

   The CSPPMPI directive specifies the use of extensions in the Convex
   MPI implementation.  This directive is optional.  Use of the MPI
   directive alone will result in a fully MPI Standard compliant
   program, albeit with a loss of performance.

   Note: The first time that you build CHARMM with MPI, specify the M
   option with install.com.  You will be asked for the location of the
   MPI include files and libraries.  If these do not change and you do
   not reconstruct the Makefiles, you do not have to specify this
   option each time you run install.com.

5. Intel gamma

   Because the Fortran compiler on the Intel gamma does not know how
   to rewind the redirected input file, the program reads its input
   from the file charmm.inp in the current working directory.  The
   script for running CHARMM should look like the following example:

        cp input_file.inp charmm.inp
        getcube -t128 > output_file
        load ~charmm/exec/intel/charmm24
        waitcube

   Preflx directives required: INTEL UNIX PARALLEL PARAFULL

6. Intel Delta

        mexec "-t(32,16)" ~charmm/exec/intel/charmm23 <input_file >output_file &

   Preflx directives required: INTEL UNIX DELTA PARALLEL PARAFULL
7. Intel Paragon

        ~charmm/exec/intel/charmm23 <input_file >output_file &

   Preflx directives required: INTEL UNIX PARAGON PARALLEL PARAFULL

8. CM-5

        ~charmm/exec/cm5/charmm23 <input_file >output_file &

   Preflx directives required: CM5 UNIX PARALLEL PARAFULL

9. IBM SP2 or SP1

        setenv MP_RESD yes
        setenv MP_PULSE 0
        setenv MP_RMPOOL 1
        setenv MP_EUILIB us
        setenv MP_INFOLEVEL 0
        poe ~charmm/exec/ibmsp/charmm24 -hfile nodes -procs 64 <input >output

   See `man poe' for details.

   Preflx directives required: IBMSP UNIX PARALLEL PARAFULL

10. PVM

        pvm
        add host host1
        add host host2
        quit
        setenv NTPVM 3
        ~/pvm3/bin/SGI5/charmm24 <input_file >output_file &

   Preflx directives required: machine_type UNIX PARALLEL PVM PARAFULL
   SYNCHRON

11. PARALLEL VERSION OF CHARMM23 ON WORKSTATION CLUSTERS

   Preflx directives required: machine_type UNIX PARALLEL SOCKET
   PARAFULL SYNCHRON

   Currently the code runs on HP, DEC Alpha, and IBM RS/6000 machines;
   this has been tested.  The rest of the UNIX world should run too
   without any changes, as long as the following is true.

   Assumptions for the cluster environment:

   Before you run CHARMM you have to define some environment
   variables.  If you define nothing, CHARMM will run in scalar mode,
   i.e. the default is a one-node run.  (We could adopt the PARALLEL
   keyword in pref.dat as the default.)  A short example script that
   puts these settings together is given at the end of this node.

   PWD

   The program supports two shells: ksh (Korn Shell) and tcsh, which
   is available from anonymous ftp.  The only assumption CHARMM makes
   that differs from csh is the definition of the variable PWD.  This
   variable is correctly defined by default in ksh and tcsh, while in
   csh it has to be defined by the user.  The variable PWD points to
   the current working directory.  If some other directory is
   requested, the PWD environment variable may be changed
   appropriately.  The program can figure out the current working
   directory by itself, but there are problems in some NFS
   environments, because home directory names can vary on different
   machines.  (PWD is always defined correctly by a shell which
   supports it.)  So csh may sometimes cause problems.  When using
   csh, the cd command may be redefined so that it also defines PWD.
   This is done with something like:

        alias cd 'chdir \!*; setenv PWD $cwd '

   in the ~/.cshrc file.  If you get an error which looks something
   like "nonexistent directory", define the PWD variable directly.

   [NIH specific: If you want to use tcsh as your login shell you may
   run the following command:

        runall chsh username /usr/local/bin/tcsh

   runall is a script which runs the command on the whole cluster of
   machines; it is in /usr/local/bin at NIH.]

   NODEx

   In order to run CHARMM on more than one node, the environment
   variables NODE0, NODE1, ..., NODEn have to be defined.  Example for
   a 4 node run:

        setenv NODE0 par0
        setenv NODE1 par1
        setenv NODE2 par2
        setenv NODE3 par4
        charmm23 < input_file > output_file 1:parameter1 2:parameter2 ...

   "par0, par1, par2, ..." are the names of the machines in the local
   network.  There is no requirement that all machines be of the same
   type.  There is nothing in the program to adjust for unequal load
   balance, so all nodes will follow the slowest one.  In the near
   future we may implement a dynamic load balancing method based on
   the actual time required.  The assumption here is that the node
   from which the CHARMM program is started is always NODE0!

   Setup for your login environment

   In order to run CHARMM in parallel you have to be able to rlogin to
   any of the nodes defined in the NODEx environment variables.
   Before you run CHARMM, check this:

        rlogin $NODE1

   If it doesn't ask you for a password, then you are OK.
   If it asks for a password, then put a line like this:

        machine_name user_name

   in your ~/.rhosts file.

   [NIH specific: How to submit a job to HP.  Currently we have
   assigned machines par0, par1, par2, and par4 to work in parallel.
   You may use the script /usr/local/bin/charmm23.parallel and submit
   it to par0.  Example:

        submit par0
        charmm23.parallel <input_file >output_file
        ^D

   To construct your own parallel scripts, look at
   /usr/local/bin/charmm23.parallel.]

   In the input scripts

   Everything should work, but avoid usage of IOLEV and PRNLEV in your
   parallel scripts.
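   Putting the cluster setup together, a complete run from csh might
look like the sketch below.  This is only an illustration; the host
names (par0 through par3), the executable name, and the input and
output file names are placeholders and should be replaced with your
own.

        #!/bin/csh
        # csh does not maintain PWD by itself, so define it explicitly
        setenv PWD `pwd`
        # define the nodes; the machine this script runs on must be NODE0
        setenv NODE0 par0
        setenv NODE1 par1
        setenv NODE2 par2
        setenv NODE3 par3
        # each node must accept rlogin without a password (see ~/.rhosts)
        # start the parallel run
        charmm23 < input_file > output_file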
Parallel Code Status (as of October 1993)

   The symbol ++ indicates that parallel code development is underway.

-----------------------------------------------------
Fully parallel and functional features:

   Energy evaluation (ENERgy, GETE)
   MINImization
   DYNAmics (leap frog integrator)
   BLOCK
   CRYSTAL
   IMAGES
   CONStraints (SHAKE, HARM, IC, DIHEdral, FIX, NOE)
   ANAL (energy partition)
   NBONds (generic)

-----------------------------------------------------
Functional, but nonparallel code in the parallel version (no speedup):
   (** indicates that these can be very computationally intensive and
   are not recommended on parallel systems)

   VIBRAN **
   CORREL ** (except for the energy time series evaluation, which is
              parallel)
   READ, WRITE, and PRINT (I/O in general)
   CORMAN commands
   HBONds
   HBUIld **
   IC (internal coordinate commands)
   SCALar commands
   CONStraints (setup, DROPlet, SBOUnd)
   Miscellaneous commands
   GENErate, PATCh, DELEte, JOIN, RENAme, IMPAtch (all PSF
      modification commands)
   MERGE
   NBONDS (BYCUbe option)
   QUANtum ** ++
   QUICk
   REWInd (not fully supported on the Intel)
   SOLANA
   EWALD **

-----------------------------------------------------
Nonfunctional code in the parallel version:

   PERT (just doesn't work yet) ++
   ANAL (table generation)
   DYNAmics (old integrator, NOSE integrator)
   GRAPhics
   INTEraction energy
   TSM
   MMFP
   PATH
   RISM
   TRAVEL
   RXNCOR

-----------------------------------------------------
Untested features (we don't know if they work or not):

   ANALysis
   MOLVIB (no testcase for this code?)
   MONItor
   NMR
   PRESsure (the command)
   RMSD
Parallel Code implemented with PVM

   Note: Currently one should specify the absolute path to the PVM
include files and the PVM library files.  This is done because PVM
installation is not currently standard.  During installation, through
use of install.com, you are asked to specify these paths.

Convex PVM

   This version runs using PVM (Parallel Virtual Machine) versions
3.2.6 and higher.  To run:

1. Create a hostfile, as in the example below:

        #host file
        puma0 dx=/usr/lib/pvm/pvmd3 ep=/chem/sfleisch/c24a2/exec/cspp

   The first field (puma0) is the hostname of the machine.  The dx=
   field is the absolute path to the PVM daemon, pvmd3.  This includes
   the filename, pvmd3.  The last field, ep=, is the search path for
   finding the executable when the tasks are spawned.  This can be a
   colon (:) separated string for searching multiple directories.

   The PVM system can be monitored using the console program, pvm.  It
   has some useful commands:

        conf    list machines in the virtual machine
        ps -a   list the tasks that are running
        help    list the commands
        quit    exit the console program without killing the daemon
        halt    kill everything that is running, kill the daemon, and
                exit the console program

2. Run the PVM daemon, pvmd3:

        pvmd3 hostfile &

3. Run the program, e.g.:

        /chem/sfleisch/c24a2/exec/cspp/charmm -n <ncpu> <input_file >output_file &

   where -n <ncpu> indicates how many PVM controlled processes to run.

4. Halt the daemon.  See above.

   The Convex Exemplar PVM implementation uses shared memory via the
System V IPC routines, shmget and shmat.

Generic PARALLEL PVM version for workstation clusters

   Preflx directives required: <MACHTYPE> UNIX SCALAR PVM PARALLEL
PARAFULL SYNCHRON

   where <MACHTYPE> is the workstation you are compiling on, e.g.,
HPUX, ALPHA, etc.

   Note: Currently one must specify the absolute path to the PVM
include files and the PVM library files.  This is done because PVM
installation is not currently standard.  During installation, through
use of install.com, you are asked to specify these paths.

   This version runs using PVM (Parallel Virtual Machine) versions
3.2.6 and higher.  To run:

1. Create a hostfile, as in the example below:

        #host file
        boa0 dx=/usr/lib/pvm/pvmd3 ep=/cb/manet1/c24a2/exec/hpux
        boa1 dx=/usr/lib/pvm/pvmd3 ep=/cb/manet1/c24a2/exec/hpux
        boa2 dx=/usr/lib/pvm/pvmd3 ep=/cb/manet1/c24a2/exec/hpux
        boa3 dx=/usr/lib/pvm/pvmd3 ep=/cb/manet1/c24a2/exec/hpux

   The first field (boa0, etc.) is the hostname of the machine.  The
   dx= field is the absolute path to the PVM daemon, pvmd3.  This
   includes the filename, pvmd3.  The last field, ep=, is the search
   path for finding the executable when the tasks are spawned.  This
   can be a colon (:) separated string for searching multiple
   directories.

   The PVM system can be monitored using the console program, pvm.  It
   has some useful commands:

        conf    list machines in the virtual machine
        ps -a   list the tasks that are running
        help    list the commands
        quit    exit the console program without killing the daemon
        halt    kill everything that is running, kill the daemon, and
                exit the console program

2. Run the PVM daemon, pvmd3:

        pvmd3 hostfile &

3. Run the program, e.g.:

        /cb/manet1/c24a2/exec/hpux/charmm -n <ncpu> <input_file >output_file &

   where -n <ncpu> indicates how many PVM controlled processes to run.

4. Halt the daemon.  See above.
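   As a summary, a complete PVM session for the workstation cluster
version might look like the sketch below.  This is only an
illustration; the hostfile, the executable path, and the process count
are taken from the example above and should be replaced with your own.

        # start the PVM daemon using the hostfile created in step 1
        pvmd3 hostfile &
        # run CHARMM on 4 PVM controlled processes
        /cb/manet1/c24a2/exec/hpux/charmm -n 4 <input_file >output_file &
        # after the run has finished, halt the daemon from the PVM console
        pvm
        halt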
NIH/DCRT/Laboratory for Structural Biology
FDA/CBER/OVRR Biophysics Laboratory