These are the build, input, and run scripts used to run the LJ benchmark in the top-level bench directory with each of the accelerator packages currently available in LAMMPS. The results of running these benchmarks on a GPU cluster with Kepler GPUs are shown in the "GPU (Kepler)" section of the Benchmark page of the LAMMPS WWW site: lammps.sandia.gov/bench.
The specifics of the benchmark machine are as follows:
It is a small GPU cluster at Sandia National Labs called "shannon". It has 32 nodes, each with two 8-core Sandy Bridge Xeon CPUs (E5-2670, 2.6 GHz, HT deactivated), for a total of 512 cores. Twenty-four of the nodes have two NVIDIA Kepler GPUs each (K20x, 2688 cores at 732 MHz). LAMMPS was compiled with the Intel icc compiler, using module openmpi/1.8.1/intel/13.1.SP1.106/cuda/6.0.37.
You can, of course, build LAMMPS yourself with any of the accelerator packages installed for your platform.
The build.py script builds LAMMPS for the various accelerator packages using the Makefile.* files in this dir, which you can edit if necessary for your platform. You must set the "lmpdir" variable at the top of build.py to the home directory of LAMMPS as installed on your system. Note that the build.py script hardcodes the arch setting for the USER-CUDA package, which should match the GPUs on your system, e.g. sm_35 for Kepler GPUs. For the GPU package, this setting is in the Makefile.gpu.* files, as is the CUDA_HOME variable, which should point to where the NVIDIA CUDA software is installed on your system.
Once the Makefiles are in place, typing, for example,
python build.py cpu gpu
will build executables for the CPU (no accelerators) and three variants (double, mixed, single precision) of the GPU package. See the list of possible targets at the top of the build.py script.
Note that the build.py script will un-install all packages in your LAMMPS directory, then install only the ones needed for the benchmark. The Makefile.* files in this dir are copied into lammps/src/MAKE as a dummy Makefile.foo, so they will not conflict with makefiles that may already be there. The build.py script also builds the auxiliary GPU and USER-CUDA libraries as needed.
LAMMPS executables that are generated by build.py are copied into this directory when the script finishes each build.
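The build sequence described above can be sketched as a list of shell commands. This is only an illustration, not the contents of build.py: the function name plan_build, the example paths, the Makefile name Makefile.gpu.double, and the executable name lmp_gpu_double are all assumptions chosen for the example. The sketch also omits building the auxiliary GPU and USER-CUDA libraries. The commands are returned rather than executed, and are meant to be run from the LAMMPS src directory.

```python
import posixpath


def plan_build(lmpdir, benchdir, makefile, packages, exe_name):
    """Return the commands for one build (hypothetical helper, dry run only).

    lmpdir    -- LAMMPS home directory (the "lmpdir" setting in build.py)
    benchdir  -- directory holding the Makefile.* files
    makefile  -- which Makefile.* file to use (name is an assumption)
    packages  -- lowercase package names to install, e.g. ["gpu"]
    exe_name  -- name for the copied executable (name is an assumption)
    """
    src = posixpath.join(lmpdir, "src")
    cmds = ["make no-all"]                          # un-install all packages
    cmds += ["make yes-" + pkg for pkg in packages]  # install only what is needed
    cmds.append("cp {} {}".format(
        posixpath.join(benchdir, makefile),
        posixpath.join(src, "MAKE", "Makefile.foo")))  # dummy makefile name
    cmds.append("make foo")                         # produces src/lmp_foo
    cmds.append("cp {} {}".format(
        posixpath.join(src, "lmp_foo"),
        posixpath.join(benchdir, exe_name)))        # copy executable back
    return cmds


if __name__ == "__main__":
    for cmd in plan_build("/home/me/lammps", "/home/me/lammps/bench/KEPLER",
                          "Makefile.gpu.double", ["gpu"], "lmp_gpu_double"):
        print(cmd)
```

Printing the plan before running anything is a cheap way to confirm which packages would be un-installed and re-installed in your LAMMPS directory.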
The in.* files can be run with any of the accelerator packages, if you specify the appropriate command-line switches. These include switches to set the problem size and number of timesteps to run.
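As a rough sketch of how such a command line might be assembled, the helper below builds an argument list from the standard LAMMPS switches -in, -sf, -pk, and -var. The helper name lammps_args, the input file name in.lj, and the variable names x, y, z, and t are assumptions made for illustration; check the in.* files for the variable names they actually define, and the run*.sh scripts for the package arguments appropriate to your LAMMPS version.

```python
def lammps_args(infile, suffix=None, package_args=None, variables=None):
    """Build a LAMMPS argument list (hypothetical helper).

    suffix       -- accelerator suffix passed to -sf, e.g. "gpu"
    package_args -- arguments passed to -pk, e.g. ["gpu", "2"]
    variables    -- dict of input-script variables passed via -var
    """
    args = ["-in", infile]
    if suffix:
        args += ["-sf", suffix]
    if package_args:
        args += ["-pk"] + list(package_args)
    for name, value in (variables or {}).items():
        args += ["-var", name, str(value)]
    return args


if __name__ == "__main__":
    # Assumed variable names: x, y, z scale the problem size, t sets the
    # number of timesteps.
    args = lammps_args("in.lj", suffix="gpu", package_args=["gpu", "2"],
                       variables={"x": 2, "y": 2, "z": 2, "t": 100})
    print(" ".join(args))
```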
The run*.sh scripts have sample mpirun commands for running the input scripts on a single node or on multiple nodes for the strong and weak scaling results shown on the benchmark web page. These scripts are provided for illustration purposes, to show what command-line arguments are used with each accelerator package.
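For readers unfamiliar with the two scaling modes, the difference can be sketched as follows: strong scaling holds the total problem size fixed as nodes are added, while weak scaling holds the size per node fixed, so the total grows with the node count. The function name atoms_for_run and the atom counts used here are illustrative assumptions, not values taken from the run scripts.

```python
def atoms_for_run(base_atoms, nodes, mode):
    """Total atom count for a run on `nodes` nodes (hypothetical helper)."""
    if mode == "strong":
        return base_atoms            # same total problem on every node count
    if mode == "weak":
        return base_atoms * nodes    # same problem per node
    raise ValueError("mode must be 'strong' or 'weak'")


if __name__ == "__main__":
    for nodes in (1, 2, 4, 8):
        print(nodes,
              atoms_for_run(32000, nodes, "strong"),
              atoms_for_run(32000, nodes, "weak"))
```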
Note that we generate these run scripts, for either interactive or batch submission, via Python scripts, which often produce a long list of runs to exercise a combination of options. To perform a quick benchmark calculation on your platform, you will typically want to run only a few commands from any of the run*.sh scripts.
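A generator of this kind could look roughly like the sketch below, which emits one mpirun line per combination of executable and MPI task count. It is not the script used to produce the run*.sh files. The function name generate_runs, the executable names, and the input file name in.lj are assumptions, and real runs would also need the accelerator-specific switches for each package.

```python
from itertools import product


def generate_runs(executables, task_counts, infile="in.lj"):
    """Return one mpirun command line per (executable, task count) pair."""
    lines = []
    for exe, ntasks in product(executables, task_counts):
        lines.append("mpirun -np {} ./{} -in {}".format(ntasks, exe, infile))
    return lines


if __name__ == "__main__":
    # Hypothetical executable names; use the ones build.py produced for you.
    for line in generate_runs(["lmp_cpu", "lmp_gpu_double"], [1, 16]):
        print(line)
```

Because the list grows as the product of the option counts, picking out one or two of the generated lines is usually enough for a quick check on a new machine.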