<p>Note that if the “-sf gpu” switch is used, it also issues a default
<a class="reference internal" href="package.html"><span class="doc">package gpu 1</span></a> command, which sets the number of
GPUs/node to 1.</p>
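<p>For example, a minimal launch relying on that default might look like
the following (the executable name and input script are placeholders
for your own):</p>
<pre class="literal-block">
lmp_machine -sf gpu -in in.script
</pre>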
<p>Using the “-pk” switch explicitly allows you to set the number of
GPUs/node to use, as well as additional options. Its syntax is the
same as the “package gpu” command. See the <a class="reference internal" href="package.html"><span class="doc">package</span></a>
command doc page for details, including the default values used for
all its options if it is not specified.</p>
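<p>As an illustration, the following command line (with a placeholder
executable name and MPI task count) uses the “-pk” switch to request 2
GPUs/node while launching 12 MPI tasks per node:</p>
<pre class="literal-block">
mpirun -np 12 lmp_machine -sf gpu -pk gpu 2 -in in.script
</pre>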
<p>Note that the default for the <a class="reference internal" href="package.html"><span class="doc">package gpu</span></a> command is to
set the Newton flag to “off” for pairwise interactions. It does not
affect the setting for bonded interactions (the LAMMPS default is “on”).
The “off” setting for pairwise interactions is currently required by
GPU package pair styles.</p>
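<p>Equivalently, if you prefer to make the setting explicit in your input
script, a line like the following uses the <a class="reference internal" href="newton.html"><span class="doc">newton</span></a> command to turn
the pairwise Newton flag off while leaving the bonded flag on:</p>
<pre class="literal-block">
newton off on
</pre>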
<p><strong>Or run with the GPU package by editing an input script:</strong></p>
<p>The discussion above for the mpirun/mpiexec command, MPI tasks/node,
and use of multiple MPI tasks/GPU is the same.</p>
<p>Use the <a class="reference internal" href="suffix.html"><span class="doc">suffix gpu</span></a> command, or you can explicitly add a
“gpu” suffix to individual styles in your input script, e.g.</p>
<pre class="literal-block">
pair_style lj/cut/gpu 2.5
</pre>
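<p>Alternatively, a short sketch of the <a class="reference internal" href="suffix.html"><span class="doc">suffix gpu</span></a> approach, which leaves
the style names elsewhere in the script unchanged:</p>
<pre class="literal-block">
suffix gpu
pair_style lj/cut 2.5
</pre>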
<p>You must also use the <a class="reference internal" href="package.html"><span class="doc">package gpu</span></a> command to enable the
GPU package, unless the “-sf gpu” or “-pk gpu” <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switches</span></a> were used. It specifies the
number of GPUs/node to use, as well as other options.</p>
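<p>For example, this input-script line enables the GPU package with 2
GPUs/node and otherwise uses the default option values:</p>
<pre class="literal-block">
package gpu 2
</pre>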
<p><strong>Speed-ups to expect:</strong></p>
<p>The performance of a GPU versus a multi-core CPU is a function of your
hardware, which pair style is used, the number of atoms/GPU, and the
precision used on the GPU (double, single, mixed).</p>
<p>See the <a class="reference external" href="http://lammps.sandia.gov/bench.html">Benchmark page</a> of the
LAMMPS web site for performance of the GPU package on various
hardware, including the Titan HPC platform at ORNL.</p>
<p>You should also experiment with how many MPI tasks per GPU to use to
give the best performance for your problem and machine. This is also
a function of the problem size and the pair style being used.
Likewise, you should experiment with the precision setting for the GPU
library to see if single or mixed precision will give accurate
results, since they will typically be faster.</p>
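<p>The precision is typically selected when the GPU library in lib/gpu is
compiled. As a rough sketch only (the exact variable and flag names
vary by LAMMPS version and Makefile), a mixed-precision build might be
selected like this:</p>
<pre class="literal-block">
# mixed precision; alternatives: -D_SINGLE_SINGLE, -D_DOUBLE_DOUBLE
CUDA_PRECISION = -D_SINGLE_DOUBLE
</pre>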
<p><strong>Guidelines for best performance:</strong></p>
<ul class="simple">
<li>Using multiple MPI tasks per GPU will often give the best performance,
as allowed by most multi-core CPU/GPU configurations.</li>
<li>If the number of particles per MPI task is small (e.g. 100s of
particles), it can be more efficient to run with fewer MPI tasks per
GPU, even if you do not use all the cores on the compute node.</li>
<li>The <a class="reference internal" href="package.html"><span class="doc">package gpu</span></a> command has several options for tuning
performance. Neighbor lists can be built on the GPU or CPU. Force
calculations can be dynamically balanced across the CPU cores and
GPUs. GPU-specific settings can be made which can be optimized
for different hardware. See the <a class="reference internal" href="package.html"><span class="doc">package</span></a> command
doc page for details; a brief tuning sketch also follows this list.</li>
<li>As described by the <a class="reference internal" href="package.html"><span class="doc">package gpu</span></a> command, GPU
accelerated pair styles can perform computations asynchronously with
CPU computations. The “Pair” time reported by LAMMPS will be the
maximum of the time required to complete the CPU pair style
computations and the time required to complete the GPU pair style
computations. Any time a GPU-enabled pair style spends on
computations that run simultaneously with <a class="reference internal" href="bond_style.html"><span class="doc">bond</span></a>,
<a class="reference internal" href="angle_style.html"><span class="doc">angle</span></a>, <a class="reference internal" href="dihedral_style.html"><span class="doc">dihedral</span></a>,
<a class="reference internal" href="improper_style.html"><span class="doc">improper</span></a>, and <a class="reference internal" href="kspace_style.html"><span class="doc">long-range</span></a>
calculations will not be included in the “Pair” time.</li>
</ul>
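<p>As a hypothetical tuning example (the values are illustrative only and
should be adjusted for your hardware), the following builds neighbor
lists on the GPU and lets LAMMPS dynamically balance the force
computation between the CPU cores and 2 GPUs:</p>
<pre class="literal-block">
package gpu 2 neigh yes split -1.0
</pre>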