Working with MPI
================

Distributed memory parallelism in Tamaas is implemented with `MPI
<https://www.mpi-forum.org/>`_. Because the fast-Fourier transform is the
bottleneck of Tamaas' core routines, Tamaas adopts the data layout of `FFTW
<http://fftw.org/fftw3_doc/MPI-Data-Distribution.html#MPI-Data-Distribution>`_.
Tamaas therefore inherits some of FFTW's limitations, and MPI only works with
model types that have a 2D boundary, i.e. ``basic_2d``, ``surface_2d`` and
``volume_2d`` (which are the most important anyway, since rough contact
mechanics can yield different scaling laws in 1D).

MPI support in Tamaas is still experimental, but the following parts are
tested:

- Rough surface generation
- Surface statistics computation
- Elastic normal contact
- Elastic-plastic contact (with :py:class:`DFSANECXXSolver
  <tamaas.nonlinear_solvers.DFSANECXXSolver>`)
- Dumping models with :py:class:`H5Dumper <tamaas.dumpers.H5Dumper>` and
  :py:class:`NetCDFDumper <tamaas.dumpers.NetCDFDumper>`.

.. tip:: One can look at ``examples/plasticity.py`` for a full example of an
   elastic-plastic contact simulation that can run in MPI.

Transparent MPI context
-----------------------

Some parts of Tamaas work transparently with MPI; no additional work or logic
is needed.

.. warning:: ``MPI_Init()`` is automatically called when importing the
   ``tamaas`` module in Python. While this works transparently most of the
   time, in some situations, e.g. in Singularity containers, the program can
   hang if ``tamaas`` is imported first. It is therefore advised to run ``from
   mpi4py import MPI`` *before* ``import tamaas`` to avoid issues.
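
In practice, an MPI-aware script should therefore start with::

    from mpi4py import MPI  # make sure MPI is initialized via mpi4py first
    import tamaas as tm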

Creating a model
~~~~~~~~~~~~~~~~

The following snippet creates a model whose global shape is
``[16, 2048, 2048]``::

    import tamaas as tm

    model = tm.Model(tm.model_type.volume_2d,
                     [0.1, 1, 1], [16, 2048, 2048])
    print(model.shape, model.global_shape)

Running this code with ``mpirun -np 3`` will print the following (not
necessarily in this order)::

    [16, 683, 2048] [16, 2048, 2048]
    [16, 682, 2048] [16, 2048, 2048]
    [16, 683, 2048] [16, 2048, 2048]

Note that the partitioning occurs on the `x` dimension of the model (see below
for more information on the data layout imposed by FFTW).

Creating a rough surface
~~~~~~~~~~~~~~~~~~~~~~~~

Similarly, rough surface generators expect a global shape and return the
partitioned data::

    iso = tm.Isopowerlaw2D()
    iso.q0, iso.q1, iso.q2, iso.hurst = 4, 4, 32, .5

    gen = tm.SurfaceGeneratorRandomPhase2D([2048, 2048])
    gen.spectrum = iso
    surface = gen.buildSurface()
    print(surface.shape, tm.mpi.global_shape(surface.shape))

With ``mpirun -np 3`` this should print::

    (682, 2048) [2048, 2048]
    (683, 2048) [2048, 2048]
    (683, 2048) [2048, 2048]

Handling partitioning edge cases
................................

Under certain conditions, FFTW may assign a size of zero to the `x` dimension
of the local data on one or more processes. If that happens, the surface
generator raises a runtime error, which causes a deadlock because the processes
with zero data are not terminated. The correct way to handle this edge case
is::

    from mpi4py import MPI

    try:
        gen = tm.SurfaceGeneratorRandomPhase2D([128, 128])
    except RuntimeError as e:
        print(e)
        MPI.COMM_WORLD.Abort(1)

This correctly kills all processes. Alternatively, ``os._exit()`` can be used,
but ``sys.exit()`` should be avoided: it terminates the process by raising an
exception, which still results in a deadlock.

Computing statistics
~~~~~~~~~~~~~~~~~~~~

With a model's data distributed among independent processes, computing global
properties, like the true contact area, must be done in a collective fashion.
This is handled transparently by the :py:class:`Statistics
<tamaas._tamaas.Statistics2D>` class, e.g. with::

    contact = tm.Statistics2D.contact(model.traction)

This gives the correct contact fraction, whereas something like
``np.mean(model.traction > 0)`` gives a different result on each process.

Nonlinear solvers
~~~~~~~~~~~~~~~~~

The only nonlinear solver (for plastic contact) that works with MPI is
:py:class:`DFSANECXXSolver <tamaas.nonlinear_solvers.DFSANECXXSolver>`, which is
a C++ implementation of :py:class:`DFSANESolver
<tamaas.nonlinear_solvers.DFSANESolver>` that works in an MPI context.
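
As a rough sketch (assuming a plastic residual created with
``tm.ModelFactory.createResidual`` and placeholder material parameters; see
``examples/plasticity.py`` for the authoritative setup), the solver is
constructed from the residual and used like its Scipy counterpart::

    from tamaas.nonlinear_solvers import DFSANECXXSolver

    # Placeholder plasticity parameters -- adjust to the actual material
    residual = tm.ModelFactory.createResidual(model,
                                              sigma_y=0.1,
                                              hardening=0.01)
    epsolver = DFSANECXXSolver(residual)  # used in place of DFSANESolver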

.. note:: Scipy and Numpy use optimized BLAS routines for array operations,
   while Tamaas does not. As a result, the `serial` performance of the C++
   implementation of the DF-SANE algorithm is lower than that of the Scipy
   version.

Dumping models
~~~~~~~~~~~~~~

The only dumpers that work properly in MPI are :py:class:`H5Dumper
<tamaas.dumpers.H5Dumper>` and :py:class:`NetCDFDumper
<tamaas.dumpers.NetCDFDumper>`. Output is then as simple as::

    from tamaas.dumpers import H5Dumper

    H5Dumper('output', all_fields=True) << model

This is useful for doing post-processing separately from the main simulation:
the post-processing can then be done in serial.

MPI convenience methods
-----------------------

Not every use case can be handled transparently, and adapting existing scripts
to work in an MPI context can require some work, especially if those scripts
rely on numpy and scipy for pre- and post-processing (e.g. constructing a
parabolic surface for Hertzian contact, or computing the total contact area).
The module :py:mod:`mpi <tamaas._tamaas.mpi>` provides convenience functions to
make that task easier. The functions :py:func:`mpi.scatter
<tamaas._tamaas.mpi.scatter>` and :py:func:`mpi.gather
<tamaas._tamaas.mpi.gather>` scatter/gather 2D data using the partitioning
scheme expected by FFTW (see the figure below). The functions
:py:func:`mpi.rank <tamaas._tamaas.mpi.rank>` and :py:func:`mpi.size
<tamaas._tamaas.mpi.size>` return the local process rank and the total number
of processes, respectively.

If finer control is needed, :py:func:`mpi.local_shape
<tamaas._tamaas.mpi.local_shape>` gives the 2D shape of the local data given the
global 2D shape (its counterpart :py:func:`mpi.global_shape
<tamaas._tamaas.mpi.global_shape>` does the exact opposite), while
:py:func:`mpi.local_offset <tamaas._tamaas.mpi.local_offset>` gives the offset
of the local data in the global :math:`x` dimension. These functions mirror
FFTW's own `data distribution functions
<http://fftw.org/fftw3_doc/Basic-and-advanced-distribution-interfaces.html#Basic-and-advanced-distribution-interfaces>`_.

.. figure:: figures/mpi_data_distribution.svg
   :align: center
   :width: 75%

   2D data distribution scheme from FFTW. ``N0`` and ``N1`` are the number of
   points in the :math:`x` and :math:`y` directions respectively. The array
   ``local_N0``, indexed by the process rank, gives the local size of the
   :math:`x` dimension. The :py:func:`local_offset
   <tamaas._tamaas.mpi.local_offset>` function gives the offset in :math:`x`
   for each process rank.
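
For example, the local slab owned by each process can be located as follows (a
sketch re-using the ``surface`` generated above; consult the API reference of
:py:mod:`mpi <tamaas._tamaas.mpi>` for the exact scatter/gather semantics)::

    import tamaas as tm

    N = [2048, 2048]  # global 2D shape

    print(tm.mpi.rank(), tm.mpi.size())
    print(tm.mpi.local_shape(N),   # shape of the slab owned by this process
          tm.mpi.local_offset(N))  # start of the slab in the global x dimension

    # collect a partitioned 2D field into a single array, e.g. for serial
    # post-processing
    full_surface = tm.mpi.gather(surface)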

The :py:mod:`mpi <tamaas._tamaas.mpi>` module also contains a function
:py:func:`sequential <tamaas._tamaas.mpi.sequential>` whose return value is
meant to be used as a context manager. Within the sequential context, the
default communicator is ``MPI_COMM_SELF`` instead of ``MPI_COMM_WORLD``.
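
For instance, one can evaluate statistics on the local partition alone (a
sketch, re-using the ``model`` from above)::

    with tm.mpi.sequential():
        # the default communicator is MPI_COMM_SELF here, so this should give
        # the contact fraction of the local partition only
        local_contact = tm.Statistics2D.contact(model.traction)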

Any other MPI functionality not covered by Tamaas can be accessed with `mpi4py
<https://mpi4py.readthedocs.io>`_, which in conjunction with the methods in
:py:mod:`mpi <tamaas._tamaas.mpi>` should handle just about any use case.
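
For example, a quantity computed by hand on the local partition can be reduced
with mpi4py (a sketch; ``local_area`` stands for a hypothetical per-process
value)::

    from mpi4py import MPI

    # sum the locally computed quantity over all processes
    total_area = MPI.COMM_WORLD.allreduce(local_area, op=MPI.SUM)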
