About GAMESS

GAMESS is a program for ab initio molecular quantum chemistry. Briefly, GAMESS can compute SCF wavefunctions of RHF, ROHF, UHF, GVB, and MCSCF type. Correlation corrections to these SCF wavefunctions include Configuration Interaction, second-order perturbation theory, and Coupled-Cluster approaches, as well as the Density Functional Theory approximation. Nuclear gradients are available for automatic geometry optimization, transition state searches, or reaction path following. Computation of the energy hessian permits prediction of vibrational frequencies, with IR or Raman intensities. Solvent effects may be modeled by discrete Effective Fragment Potentials or by continuum models such as the Polarizable Continuum Model. Numerous relativistic computations are available, including third-order Douglas-Kroll scalar corrections and various spin-orbit coupling options. The Fragment Molecular Orbital method permits many of these sophisticated treatments to be applied to very large systems by dividing the computation into small fragments. Nuclear wavefunctions can also be computed, in VSCF, or with explicit treatment of nuclear orbitals by the NEO code.

Official website: http://www.msg.ameslab.gov/gamess/

This document explains how to build GAMESS 2010R2 on Intel Westmere nodes with an InfiniBand network, using the following software:

  • Intel Compiler Suite 11.1.072 (also includes MKL)
  • OpenMPI-1.4.2*

* OpenMPI was built with ICS-11.1.072, BLCR-0.8.2, OFED-1.5.1 and with SGE support flags.

Note that this build is highly optimised for our environment; if you have a different network or architecture, you will have to investigate which compilers, libraries and parallel environments offer you the best performance.

Not every GAMESS calculation can run in parallel, so it is useful to maintain both a serial and a parallel version of this software.

Environment Set Up

For long builds it is recommended to use a screen session: you can detach from it with Ctrl-A d and reattach later with screen -r, so the build survives a lost connection.

# screen -S GAMESS-ics-mkl-ompi
# tar -zxvf gamess.tar.gz

First of all, we load the modules needed to build this software. We integrate the dependencies inside the module files, so loading the OpenMPI environment also loads the Intel Compiler Suite, BLCR and OFED modules.

#  . /opt/modules/init/bash
# module load OpenMPI/1.4.2_ics-11.1.072_ofed-1.5.1_blcr-8.2
# module load intel_mkl/11.1.072
# . /opt/intel/Compiler/11.1/072/bin/ifortvars.sh intel64
# . /opt/intel/Compiler/11.1/072/bin/iccvars.sh intel64

Configure

Run the config script and answer all the questions.

#  ./config

If you continue the standard way, the build will fail. There are some relevant changes needed to get a successful build and to improve performance.

compddi

Enter the ddi directory and modify compddi:

#  cd ddi/
# cp compddi compddi.0
# vi compddi

In our case the nodes use Intel Xeon X5650 processors and each node has 12 cores, so MAXCPUS will be 12. Our InfiniBand switch has no non-blocking features, and the maximum size of a parallel job is 36 nodes (432 cores), so MAXNODES will be 36.
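These two limits multiply to the total core count one job may use; a trivial sh sanity check with our values:

```shell
# Cores available to a single parallel job with the limits chosen above.
MAXCPUS=12     # cores per node (Xeon X5650 nodes)
MAXNODES=36    # largest parallel job we allow on our IB fabric
TOTAL=$((MAXCPUS * MAXNODES))
echo "$TOTAL cores"    # 432 cores
```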

# diff compddi compddi.0
83,84c83,84
< set MAXCPUS = 12
< set MAXNODES = 36
---
> set MAXCPUS = 8
> set MAXNODES = 128
133c133
< set MPI_INCLUDE_PATH = '/aplic/MPI/OpenMPI/1.4.2_ics-11.1.072_ofed-1.5.1_blcr-8.2/include'
---
> set MPI_INCLUDE_PATH = ''
696c696
< set CC = 'icc'
---
> set CC = 'gcc'
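If you rebuild often, the same compddi edits can be scripted with sed instead of applied by hand. The following is only a sketch: it assumes the stock values shown in the diff above, and it is demonstrated on a small stand-in file rather than the real compddi.

```shell
# Create a stand-in with the stock compddi settings (demonstration only;
# in the real tree, point the sed command at compddi itself).
cat > compddi.demo <<'EOF'
set MAXCPUS = 8
set MAXNODES = 128
set CC = 'gcc'
EOF

# Apply our site's values in place.
sed -i -e 's/^set MAXCPUS = 8/set MAXCPUS = 12/' \
       -e 's/^set MAXNODES = 128/set MAXNODES = 36/' \
       -e "s/^set CC = 'gcc'/set CC = 'icc'/" compddi.demo

cat compddi.demo
```

The MPI_INCLUDE_PATH line can be updated the same way, substituting your own OpenMPI include directory.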

Run the compddi script:

# ./compddi | tee -a compddi_nehalem.log

At the end, it will return something like:

DDI compilation ended successfully.
Mon Aug 30 16:17:37 CEST 2010
6.904u 2.672s 2:21.41 6.7% 0+0k 2584976+23408io 13999pf+0w

compall

Change the C compiler to icc in compall:

#  cp -p compall compall.0
# vi compall
# diff compall compall.0
54c54
< if ($TARGET == linux64) set CCOMP='icc'
---
> if ($TARGET == linux64) set CCOMP='gcc'

and run the script:

#  ./compall | tee -a compall_nehalem_dp.log

lked

Modify lked in order to link MKL correctly. The -Wl,--start-group ... -Wl,--end-group wrapper makes the linker rescan the three MKL libraries until all of their mutual references are resolved:

#  cp -p lked lked.0
# vi lked

# diff lked lked.0
532,535c532,534
< #set MATHLIBS=" $mpath/libmkl_intel_lp64.a"
< #set MATHLIBS="$MATHLIBS $mpath/libmkl_sequential.a"
< #set MATHLIBS="$MATHLIBS $mpath/libmkl_core.a"
< set MATHLIBS="$mpath/libmkl_solver_lp64_sequential.a -Wl,--start-group $mpath/libmkl_intel_lp64.a $mpath/libmkl_sequential.a $mpath/libmkl_core.a -Wl,--end-group -lpthread"
---
> set MATHLIBS=" $mpath/libmkl_intel_lp64.a"
> set MATHLIBS="$MATHLIBS $mpath/libmkl_sequential.a"
> set MATHLIBS="$MATHLIBS $mpath/libmkl_core.a"

and run the lked script:

#  ./lked | tee -a lked_nehalem-dp.log

At the end, you will find this message:

The linking of GAMESS to binary gamess.00.x was successful.
4.212u 0.420s 0:40.68 11.3% 0+0k 567832+78616io 3454pf+0w

Now you have to modify the rungms script to adjust it to your needs.

#  cp -p rungms rungms.orig
# vi rungms

The modified script allows MPI jobs to be submitted with SGE and OpenMPI. If you try to submit a serial job, the script will warn you and exit.
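As an illustration, a minimal SGE job script driving rungms could look like the sketch below. The job name, the parallel environment name (orte) and the core count are assumptions for our site; adjust them to yours.

```shell
#!/bin/bash
# Hypothetical SGE submission script for a 12-core GAMESS job.
#$ -N gamess_exam01
#$ -cwd
#$ -pe orte 12          # parallel environment name is site-specific
. /opt/modules/init/bash
module load gamess/2010r2_intel11.1_mkl11.1_ompi-1.4.2
# rungms arguments: input name, binary version number, number of cores
./rungms exam01 00 $NSLOTS >& exam01.log
```

Submit it with qsub; SGE sets $NSLOTS to the number of granted slots.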

Tests

In order to verify the results and the performance in the multicore scenario, you will need to modify the runall script.

# cp -p runall runall-12cores
# diff runall-12cores runall.orig
13c13
< #chdir /u1/mike/gamess
---
> chdir /u1/mike/gamess
30c30
< ./rungms exam$NUM $VERNO 12 >& exam$NUM.log
---
> ./rungms exam$NUM $VERNO 1 >& exam$NUM.log

Then run the script:

#  ./runall-12cores

To verify the results, you may run ./tools/checktst/checktst. Do not be alarmed when some errors appear on the screen.

GAMESS has 7 test files that do not run in parallel mode (exam05, exam23, exam25, exam27, exam32, exam39 and exam42).

#  ./tools/checktst/checktst | tee -a checktst_nehalem-dp.log
Please type the full directory containing exam??.log [.]:
Checking the results of your sample GAMESS calculations,
the output files (exam??.log) will be taken from .
Only 37 out of 44 examples terminated normally.
Please check carefully each of the following runs:
./exam05.log
./exam23.log
./exam25.log
./exam27.log
./exam32.log
./exam39.log
./exam42.log
which did not completely finish.
exam01: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam02: Eerr=0.0e+00 Gerr=0.0e+00 Serr=0.0e+00 Lerr=1.8e-03+6.6e-05. Passed.
exam03: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam04: Eerr=0.0e+00 Gerr=0.0e+00 Oerr=0.0e+00 Derr=0.0e+00. Passed.
exam05: sed: -e expression #1, char 3: invalid usage of line address 0
Eerr=3.8e+01 Gerr=3.2e-02 Derr=6.9e-01. !!FAILED.
exam06: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam07: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam08: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam09: Eerr=8.0e-10. Passed.
exam10: Eerr=0.0e+00 Werr=0.0e+00 Ierr=0.0e+00 Perr=0.0e+00. Passed.
exam11: Eerr=0.0e+00 Rerr=0.0e+00. Passed.
exam12: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam13: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam14: Eerr=0.0e+00+0.0e+00 Derr=0.0e+00+0.0e+00. Passed.
exam15: Eerr=0.0e+00. Passed.
exam16: Eerr=0.0e+00. Passed.
exam17: Eerr=0.0e+00 Werr=0.0e+00 Ierr=0.0e+00 Perr=0.0e+00. Passed.
exam18: Eerr=0.0e+00 Werr=0.0e+00. Passed.
exam19: Eerr=0.0e+00+0.0e+00. Passed.
exam20: Eerr=0.0e+00 Xerr=0.0e+00. Passed.
exam21: Eerr=0.0e+00 Werr=0.0e+00 Ierr=0.0e+00 Perr=0.0e+00. Passed.
exam22: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam23: Herr=2.8e+00 Gerr=1.9e-05. !!FAILED.
exam24: Eerr=0.0e+00 Gerr=0.0e+00 Derr=0.0e+00. Passed.
exam25: Eerr=4.9e+01 Gerr=2.1e-05. !!FAILED.
exam26: Eerr=0.0e+00 Lerr=1.1e-07. Passed.
exam27: sed: -e expression #1, char 3: invalid usage of line address 0
Eerr=9.1e+00 Verr=2.9e-02. !!FAILED.
exam28: Eerr=0.0e+00. Passed.
exam29: Eerr=0.0e+00. Passed.
exam30: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam31: Eerr=0.0e+00 Gerr=0.0e+00 Ierr=0.0e+00. Passed.
exam32: Eerr=1.3e+02 Terr=1.4e-02. !!FAILED.
exam33: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam34: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam35: Eerr=6.9e-09. Passed.
exam36: Eerr=0.0e+00 Werr=0.0e+00 Ierr=0.0e+00. Passed.
exam37: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam38: Eerr=0.0e+00 Gerr=0.0e+00. Passed.
exam39: RIerr=1.2e+02 HRIerr=9.8e+02. !!FAILED.
exam40: E1err=0.0e+00 E2err=0.0e+00 RMSerr=0.0e+00 Passed.
exam41: EXCerr=0.0e+00 eV, Gerr=2.0e-09, OSCerr=0.0e+00. Passed.
exam42: Eerr=9.2e+01 Gerr=2.7e-02. !!FAILED.
exam43: HEATerr=0.0e+00 kcal/mol. Passed.
exam44: SCFerr=0.0e+00 MP2err=0.0e+00. Passed.
Only 37 out of 44 examples terminated normally.
7 job(s) got incorrect numerical results. Please examine why.

If you open these log files, you will find error messages like the following:

ERROR: ONLY CCTYP=CCSD OR CCTYP=CCSD(T) CAN RUN IN PARALLEL.
ALL OTHER CCTYP CALCULATIONS MUST RUN ON A SINGLE CPU.
*** ERROR *** INCORRECT USE OF RUNTYP=TDHFX
THE EXTENDED TDHF PACKAGE IS NOT ENABLED FOR
A. USE OF ANY SCFTYP OTHER THAN -RHF-,
B. TREATMENT OF CORRELATION (NO DFT, MP2, CC, OR CI),
C. OR PARALLEL EXECUTION
ERROR: THE MOPAC PARAMETERIZATION REQUESTED IS AM1
SEMI-EMPIRICAL COMPUTATIONS MAY NOT BE PERFORMED WITH
ANY TYPE OF CI, MP, CC, DFT, OR WITH SCFTYP=MCSCF.
SEMI-EMPIRICAL JOBS CANNOT LOCALIZE ORBITALS.
SEMI-EMPIRICAL RUNS USE MINIMAL STO BASIS SETS, SO
YOU CANNOT REQUEST EXOTIC GAUSSIAN BASIS PROPERTIES.
SEMI-EMPIRICAL JOBS MAY NOT BE RUN IN PARALLEL. <-----------
ONLY THE GRADIENTS FOR CITYP=CIS RUN IN PARALLEL.
* * * ERROR * * *
YOUR RUNTYP=GRADIENT REQUIRES NUCLEAR GRADIENTS,
WHICH ARE NOT AVAILABLE ANALYTICALLY (SEE ABOVE).
IF THE NUMBER OF SYMMETRY UNIQUE ATOMS IS RATHER SMALL,
YOU MIGHT CONSIDER NUMERICAL DERIVATIVES:
$CONTRL NUMGRD=.T. $END
YOU MUST EXPLICITLY ASK FOR THIS OPTION BECAUSE IT IS SO TIME CONSUMING.
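Those seven cases can still be validated by re-running them on a single core. A sketch follows; it only prints the commands (drop the echo to execute them), and 00 is the version number of the gamess.00.x binary produced by lked above.

```shell
# Serial re-run of the tests that refuse to execute in parallel.
for NUM in 05 23 25 27 32 39 42; do
    echo "./rungms exam$NUM 00 1 >& exam$NUM.log"
done
```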

SGE integration

In order to integrate our rungms script with the SGE batch queue system, we have made the following changes:

# diff rungms rungms.local
58c58,61
< set GMSPATH=/aplic/gamess/2010R2/ics-11.1.072/ompi-1.4.2
---
> set SCR=/scratch/$USER
> set USERSCR=/scratch/$USER
> set GMSPATH=/scratch/jblasco/GAMESS/gamess_ompi
> #set SCR=$TMPDIR
67,72d69
<
< set SCR=/scratch/$USER
< # Enable SGE support
< if (null$TMPDIR != null) set SCR=$TMPDIR
< set USERSCR=$SCR
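The intent of the scratch-directory lines above, rewritten in plain sh for clarity (rungms itself is csh): use the SGE-provided $TMPDIR when it is set, otherwise fall back to a per-user scratch directory.

```shell
# Pick the job's scratch directory: prefer SGE's per-job $TMPDIR.
SCR=/scratch/$USER
if [ -n "$TMPDIR" ]; then
    SCR=$TMPDIR        # SGE creates and removes this per-job directory
fi
USERSCR=$SCR
echo "scratch dir: $SCR"
```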

Setting up the modulefile

#  vi /opt/modules/modulefiles/Applications/gamess/2010r2_intel11.1_mkl11.1_ompi-1.4.2
# cat /opt/modules/modulefiles/Applications/gamess/2010r2_intel11.1_mkl11.1_ompi-1.4.2
#%Module1.0#####################################################################
##
##
proc ModulesHelp { } {
puts stderr "\tLoads the environment variables for using gamess 2010r2 built with INTEL+MKL 11.1 and OMPI-1.4.2"}
module-whatis "Loads the environment variables for using gamess 2010r2 built with INTEL+MKL 11.1 and OMPI-1.4.2"

prepend-path PATH /aplic/gamess/2010R2/ics-11.1.072/ompi-1.4.2
module load intel_mkl/11.1.072
module load OpenMPI/1.4.2_ics-11.1.072_ofed-1.5.1_blcr-8.2
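Once the modulefile is installed, users pick up the whole stack with a single load; the paths below are our site's.

```shell
# Load GAMESS together with its MKL and OpenMPI dependencies.
. /opt/modules/init/bash
module load gamess/2010r2_intel11.1_mkl11.1_ompi-1.4.2
which rungms    # should resolve inside /aplic/gamess/2010R2/...
```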