LAMMPS ACCELERATOR LIBRARY
--------------------------------

W. Michael Brown (ORNL)
Trung Dac Nguyen (ORNL)
Peng Wang (NVIDIA)
Axel Kohlmeyer (Temple)
Steve Plimpton (SNL)
Inderaj Bains (NVIDIA)
This directory has source files to build a library that LAMMPS links against when using the GPU package.
This library must be built with a C++ compiler, before LAMMPS is built, so LAMMPS can link against it.
To see help on building this library via make commands, type "make lib-gpu" from the src directory. Typing "python Install.py" from within this directory does the same thing. You can also build the library manually by following the instructions below.
Build the library using one of the provided Makefile.* files or create your own, specific to your compiler and system. For example:
make -f Makefile.linux
When you are done building this library, two files should exist in this directory:
libgpu.a          the library LAMMPS will link against
Makefile.lammps   settings the LAMMPS Makefile will import
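As a quick sanity check after the build, something like the following could confirm that both files are present. This is only a sketch: the function name check_gpu_lib_build is hypothetical and is not part of the LAMMPS build system.

```shell
# Hypothetical helper: verify that the two expected build products exist
# in the given directory (defaults to the current directory).
check_gpu_lib_build() {
    dir=${1:-.}
    for f in libgpu.a Makefile.lammps; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    echo "ok"
}

# Possible use from within lib/gpu after running make:
# check_gpu_lib_build .
```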
Makefile.lammps is created by the make command, by copying one of the Makefile.lammps.* files. See the EXTRAMAKE setting at the top of the Makefile.* files.
Makefile.lammps has settings for 3 variables:
gpu_SYSINC  = leave blank for this package
gpu_SYSLIB  = CUDA libraries needed by this package
gpu_SYSPATH = path(s) to where those libraries are
Because you have the CUDA compilers on your system, you should have the needed libraries. If the CUDA development tools were installed in the standard manner, the settings in the Makefile.lammps.standard file should work.
GENERAL NOTES
--------------------------------
This library, libgpu.a, provides routines for GPU acceleration of certain LAMMPS styles and neighbor list builds. Compilation of this library requires installing the CUDA GPU driver and CUDA toolkit for your operating system. Installation of the CUDA SDK is not necessary. In addition to the LAMMPS library, the binary nvc_get_devices will also be built. This can be used to query the names and properties of GPU devices on your system. A Makefile for OpenCL compilation is provided, but support for OpenCL use is not currently provided by the developers. Details of the implementation are provided in:
Brown, W.M., Wang, P., Plimpton, S.J., Tharrington, A.N. Implementing Molecular Dynamics on Hybrid High Performance Computers - Short Range Forces. Computer Physics Communications. 2011. 182: p. 898-911.
and
Brown, W.M., Kohlmeyer, A., Plimpton, S.J., Tharrington, A.N. Implementing Molecular Dynamics on Hybrid High Performance Computers - Particle-Particle Particle-Mesh. Computer Physics Communications. 2012. 183: p. 449-459.
and
Brown, W.M., Masako, Y. Implementing Molecular Dynamics on Hybrid High Performance Computers - Three-Body Potentials. Computer Physics Communications. 2013. 184: p. 2785-2793.
Current styles supporting GPU acceleration:
    beck
    born/coul/long
    born/coul/wolf
    born
    buck/coul/cut
    buck/coul/long
    buck
    colloid
    coul/dsf
    coul/long
    eam/alloy
    eam/fs
    eam
    gauss
    gayberne
    lj96/cut
    lj/charmm/coul/long
    lj/class2/coul/long
    lj/class2
    lj/cut/coul/cut
    lj/cut/coul/debye
    lj/cut/coul/dsf
    lj/cut/coul/long
    lj/cut/coul/msm
    lj/cut/dipole/cut
    lj/cut
    lj/expand
    lj/gromacs
    lj/sdk/coul/long
    lj/sdk
    lj/sf/dipole/sf
    mie/cut
    morse
    resquared
    soft
    sw
    table
    yukawa/colloid
    yukawa
    pppm

MULTIPLE LAMMPS PROCESSES
--------------------------------
Multiple LAMMPS MPI processes can share GPUs on the system, but multiple GPUs cannot be utilized by a single MPI process. In many cases, the best performance will be obtained by running as many MPI processes as CPU cores available with the condition that the number of MPI processes is an integer multiple of the number of GPUs being used. See the LAMMPS user manual for details on running with GPU acceleration.
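The rule of thumb above (as many MPI processes as CPU cores, kept to an integer multiple of the GPU count) can be sketched as a small calculation. The function name mpi_procs_for is hypothetical, and the result is only a starting point for benchmarking, not a guaranteed optimum.

```shell
# Hypothetical helper: pick the largest MPI process count that does not
# exceed the number of CPU cores and is an integer multiple of the number
# of GPUs. Usage: mpi_procs_for CORES GPUS
mpi_procs_for() {
    cores=$1
    gpus=$2
    if [ "$gpus" -lt 1 ] || [ "$cores" -lt "$gpus" ]; then
        echo "need at least one GPU and at least as many cores as GPUs" >&2
        return 1
    fi
    # Integer division rounds down, so the result is a multiple of gpus.
    echo $(( (cores / gpus) * gpus ))
}

# mpi_procs_for 16 2   prints 16
# mpi_procs_for 12 8   prints 8
```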
BUILDING AND PRECISION MODES
--------------------------------
To build, edit the CUDA_ARCH, CUDA_PRECISION, and CUDA_HOME variables in one of the Makefiles. CUDA_ARCH should be set based on the compute capability of your GPU. You can verify the compute capability by running the nvc_get_devices executable after the build is complete. Additionally, the GPU package must be installed and compiled for LAMMPS. This may require editing the gpu_SYSPATH variable in the LAMMPS makefile.
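As an illustration of setting CUDA_ARCH from a compute capability, the sketch below converts a value such as 3.5 into an nvcc-style architecture flag. It assumes CUDA_ARCH takes the form -arch=sm_XY, which is how nvcc names architectures; check the Makefile you are editing for the form it actually expects. The function name cuda_arch_flag is hypothetical.

```shell
# Hypothetical helper: turn a compute capability such as "3.5" into an
# nvcc architecture flag such as "-arch=sm_35".
cuda_arch_flag() {
    cc=$1
    case "$cc" in
        # Accept a one- or two-digit major version and a one-digit minor.
        [0-9].[0-9]|[0-9][0-9].[0-9]) ;;
        *) echo "expected MAJOR.MINOR, got: $cc" >&2; return 1 ;;
    esac
    major=${cc%%.*}   # text before the dot
    minor=${cc#*.}    # text after the dot
    echo "-arch=sm_${major}${minor}"
}

# cuda_arch_flag 3.5   prints -arch=sm_35
```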
Please note that the GPU library accesses the CUDA driver library directly, so it needs to be linked not only to the CUDA runtime library (libcudart.so) that ships with the CUDA toolkit, but also to the CUDA driver library (libcuda.so) that ships with the NVIDIA driver. If you are compiling LAMMPS on the head node of a GPU cluster, the driver library may not be installed there, so you may need to copy it over from one of the compute nodes (preferably into this directory).
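Before linking, it can help to confirm that both libraries are actually visible on the build machine. The following is a minimal sketch; the function name find_lib is hypothetical and the directories in the usage comment are examples only, since library locations vary between systems.

```shell
# Hypothetical helper: print the first location of a library file found in
# the given directories. Usage: find_lib LIBNAME DIR...
find_lib() {
    lib=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$lib" ]; then
            echo "$dir/$lib"
            return 0
        fi
    done
    echo "$lib not found" >&2
    return 1
}

# Example directories only; adjust to where your driver and toolkit live.
# find_lib libcuda.so /usr/lib64 /usr/lib/x86_64-linux-gnu .
# find_lib libcudart.so "$CUDA_HOME/lib64"
```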
The gpu library supports 3 precision modes as determined by the CUDA_PRECISION variable:
CUDA_PRECISION = -D_SINGLE_SINGLE # Single precision for all calculations
CUDA_PRECISION = -D_DOUBLE_DOUBLE # Double precision for all calculations
CUDA_PRECISION = -D_SINGLE_DOUBLE # Accumulation of forces, etc. in double
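If you switch between precision modes often, a small wrapper can keep the flag names straight. In this sketch the mode names single, double, and mixed and the function name precision_flag are illustrative assumptions; only the -D flags themselves come from the list above.

```shell
# Hypothetical helper: map a short mode name to a CUDA_PRECISION flag.
# The mode names "single", "double", and "mixed" are illustrative only.
precision_flag() {
    case "$1" in
        single) echo "-D_SINGLE_SINGLE" ;;
        double) echo "-D_DOUBLE_DOUBLE" ;;
        mixed)  echo "-D_SINGLE_DOUBLE" ;;
        *) echo "unknown precision mode: $1" >&2; return 1 ;;
    esac
}

# Possible use, assuming a variable given on the make command line takes
# precedence over the assignment inside the Makefile:
# make -f Makefile.linux CUDA_PRECISION="$(precision_flag mixed)"
```

Editing the CUDA_PRECISION line in the Makefile directly, as described above, achieves the same result.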
EXAMPLE BUILD PROCESS
--------------------------------

cd ~/lammps/lib/gpu
emacs Makefile.linux
make -f Makefile.linux
./nvc_get_devices
cd ../../src
emacs ./MAKE/Makefile.linux
make yes-asphere
make yes-kspace
make yes-gpu
make linux