diff --git a/bench/FERMI/README b/bench/FERMI/README
index 247cbf9f4..d8c2f4749 100644
--- a/bench/FERMI/README
+++ b/bench/FERMI/README
@@ -1,78 +1,75 @@
 These are input scripts used to run versions of several of the
 benchmarks in the top-level bench directory using the GPU and
 USER-CUDA accelerator packages. The results of running these scripts
 on two different machines (a desktop with 2 Tesla GPUs and the ORNL
 Titan supercomputer) are shown on the "GPU (Fermi)" section of the
 Benchmark page of the LAMMPS WWW site: lammps.sandia.gov/bench.
 
 Examples are shown below of how to run these scripts. This assumes
 you have built 3 executables with both the GPU and USER-CUDA packages
 installed, e.g.
 
 lmp_linux_single
 lmp_linux_mixed
 lmp_linux_double
 
 The precision (single, mixed, double) refers to the GPU and USER-CUDA
-pacakge precision. See the README files in the lib/gpu and lib/cuda
+package precision. See the README files in the lib/gpu and lib/cuda
 directories for instructions on how to build the packages with
 different precisions. The GPU and USER-CUDA sub-sections of the
-doc/Section_accelerate.html file also describes this process.
+doc/Section_accelerate.html file also describe this process.
 
 ------------------------------------------------------------------------
 
 To run on just CPUs (without using the GPU or USER-CUDA styles),
 do something like the following:
 
 mpirun -np 1 lmp_linux_double -v x 8 -v y 8 -v z 8 -v t 100 < in.lj
-mpirun -np 12 lmp_linux_double -v x 16 -v y 16 -v z 16 -v t 100 < in.lj
+mpirun -np 12 lmp_linux_double -v x 16 -v y 16 -v z 16 -v t 100 < in.eam
 
 The "xyz" settings determine the problem size. The "t" setting
 determines the number of timesteps.
 
 These mpirun commands run on a single node. To run on multiple
 nodes, scale up the "-np" setting.
 
 ------------------------------------------------------------------------
 
 To run with the GPU package, do something like the following:
 
-mpirun -np 12 lmp_linux_single -sf gpu -pk gpu 1 -v x 32 -v y 32 -v z 64 -v t 100 < in.lj
-mpirun -np 8 lmp_linux_mixed -sf gpu -pk gpu 2 -v x 32 -v y 32 -v z 64 -v t 100 < in.lj
+mpirun -np 12 lmp_linux_single -sf gpu -v x 32 -v y 32 -v z 64 -v t 100 < in.lj
+mpirun -np 8 lmp_linux_mixed -sf gpu -pk gpu 2 -v x 32 -v y 32 -v z 64 -v t 100 < in.eam
 
 The "xyz" settings determine the problem size. The "t" setting
 determines the number of timesteps. The "np" setting determines how
-many MPI tasks (per node) the problem will run on, The numeric
-argument to the "-pk" setting is the number of GPUs (per node). Note
-that you can use more MPI tasks than GPUs (per node) with the GPU
-package.
+many MPI tasks (per node) the problem will run on. The numeric
+argument to the "-pk" setting is the number of GPUs (per node); 1 GPU
+is the default. Note that you can use more MPI tasks than GPUs (per
+node) with the GPU package.
 
-These mpirun commands run on a single node. To run on multiple
-nodes, scale up the "-np" setting, and control the number of
-MPI tasks per node via a "-ppn" setting.
+These mpirun commands run on a single node. To run on multiple nodes,
+scale up the "-np" setting, and control the number of MPI tasks per
+node via a "-ppn" setting.
 
 ------------------------------------------------------------------------
 
 To run with the USER-CUDA package, do something like the following:
 
-If the script has "cuda" in its name, it is meant to be run using
-the USER-CUDA package. For example:
-
-mpirun -np 1 ../lmp_linux_single -c on -sf cuda -v g 1 -v x 16 -v y 16 -v z 16 -v t 100 < in.lj.cuda
-
-mpirun -np 2 ../lmp_linux_double -c on -sf cuda -v g 2 -v x 32 -v y 64 -v z 64 -v t 100 < in.eam.cuda
+mpirun -np 1 lmp_linux_single -c on -sf cuda -v x 16 -v y 16 -v z 16 -v t 100 < in.lj
+mpirun -np 2 lmp_linux_double -c on -sf cuda -pk cuda 2 -v x 32 -v y 64 -v z 64 -v t 100 < in.eam
 
 The "xyz" settings determine the problem size. The "t" setting
 determines the number of timesteps. The "np" setting determines how
-many MPI tasks per compute node the problem will run on, and the "g"
-setting determines how many GPUs per compute node the problem will run
-on, i.e. 1 or 2 in this case. For the USER-CUDA package, the number
-of MPI tasks and GPUs (both per compute node) must be equal.
-
-These mpirun commands run on a single node. To run on multiple
-nodes, scale up the "-np" setting.
+many MPI tasks (per node) the problem will run on. The numeric
+argument to the "-pk" setting is the number of GPUs (per node); 1 GPU
+is the default. Note that the number of MPI tasks must equal the
+number of GPUs (both per node) with the USER-CUDA package.
+
+These mpirun commands run on a single node. To run on multiple nodes,
+scale up the "-np" setting, and control the number of MPI tasks per
+node via a "-ppn" setting.
 
 ------------------------------------------------------------------------
 
 If the script has "titan" in its name, it was run on the Titan
 supercomputer at ORNL.
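For a concrete multi-node sketch of combining "-np" with "-ppn" as the
text above describes: assuming 4 nodes, each with 12 cores and 2 GPUs,
and an MPICH-style mpirun (Open MPI spells the per-node option
"-npernode" rather than "-ppn"), the two packages might be launched as
follows:

mpirun -np 48 -ppn 12 lmp_linux_mixed -sf gpu -pk gpu 2 -v x 64 -v y 64 -v z 64 -v t 100 < in.lj
mpirun -np 8 -ppn 2 lmp_linux_double -c on -sf cuda -pk cuda 2 -v x 64 -v y 64 -v z 64 -v t 100 < in.eam

With the GPU package, the 12 tasks on each node share that node's 2
GPUs; with the USER-CUDA package, the per-node task count must equal
the per-node GPU count, hence "-ppn 2".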