STORMM Source Documentation
Loading...
Searching...
No Matches
stormm::mm::CellGrid< T, Tacc, Tcalc, T4 > Class Template Reference

An object to manage the spatial decomposition of a system of particles. The general strategy is to arrange particles in a grid of cells, each at least half the direct-space cutoff in width in all directions. A work unit entails computing all interactions assigned to one of the cells. This is done using a neutral-territory decomposition the lines of the "tower-plate" arrangement, which consists of seventeen cells: More...

#include <cellgrid.h>

Public Member Functions

int getSystemCount () const
 Get the number of systems in the object.
 
int getTotalCellCount () const
 Get the total number of cells allocated in the object.
 
uint getCellBaseCapacity () const
 Get the base capacity of any particular cell in the grid. All systems' cells share the same base capacity.
 
double getCutoff () const
 Get the particle-particle cutoff defining the neighbor list.
 
double getPadding () const
 Get the padding incorporated into each neighbor list cell.
 
double getEffectiveCutoff () const
 Get the effective cutoff.
 
NonbondedTheme getTheme () const
 Get the non-bonded theme of this cell grid–does it hold atoms with only electrostatic or van-der Waals properties?
 
TinyBoxPresence getTinyBoxPresence () const
 Get the presence of a tiny unit cell anywhere in the synthesis.
 
CoordinateCycle getCyclePosition () const
 Get the position in the time cycle.
 
int getPositionScalingBits () const
 Get the number of bits in the fixed-precision format.
 
int getMeshSubdivisions () const
 Get the number of mesh elements spanning each spatial decomposition cell. The "mesh" refers to a particle-mesh interaction grid, e.g. the PME reciprocal space lattice.
 
const PhaseSpaceSynthesisgetCoordinateSynthesisPointer () const
 Get a const pointer to the coordinate synthesis served by this object.
 
const AtomGraphSynthesisgetTopologySynthesisPointer () const
 Get a const pointer to the topology synthesis served by this object.
 
const CellGrid< T, Tacc, Tcalc, T4 > * getSelfPointer () const
 Get a pointer to the object itself, useful if the object is only available by const reference.
 
void initializeForces (HybridTargetLevel tier=HybridTargetLevel::HOST, const GpuDetails &gpu=null_gpu)
 Initialize forces for the cell grid, on the CPU host or GPU device. This standalone feature provides a means for performing this activity at will, although in the most optimized code the process it performs will be fused with some other kernel. This will initialize forces for the current cell layout and should only be called once the contents of image_cell_limits or image_cell_limits_alt have been settled, whichever is appropriate to the current time cycle. There is only one array of forces in the CellGrid, as once they are accumulated in this object they are added to one of the two time-cycle dependent arrays in the accompanying PhaseSpaceSynthesis.
 
void markImageRelevance (HybridTargetLevel tier=HybridTargetLevel::HOST, const GpuDetails &gpu=null_gpu)
 Compute whether various atoms may be within range of the tower when they are in the plate of various neutral territory decompositions. Bitwise information (1 for relevant, 0 for irrelevant) is recorded in a 16-bit unsigned integer for each atom in the current image, as indicated by its stage in the coordinate cycle.
 
 CellGrid (const PhaseSpaceSynthesis *poly_ps_ptr_in, const AtomGraphSynthesis *poly_ag_ptr_in, double cutoff_in, double padding_in, int mesh_subdivisions_in, NonbondedTheme theme_in, uint cell_base_capacity_in=default_cellgrid_base_capacity, ExceptionResponse policy_in=ExceptionResponse::WARN)
 The constructor accepts an associated PhaseSpaceSynthesis object and a precision model. The system count and scaling factors will be set by the coordinate synthesis. The total cell count and cell layout can be modulated by specifying a cutoff and a padding related to how much larger each cell should be sized.
 
 CellGrid (const PhaseSpaceSynthesis *poly_ps_ptr_in, const AtomGraphSynthesis &poly_ag_ptr_in, double cutoff_in, double padding_in, int mesh_subdivisions_in, NonbondedTheme theme_in, uint cell_base_capacity_in=default_cellgrid_base_capacity, ExceptionResponse policy_in=ExceptionResponse::WARN)
 
 CellGrid (const PhaseSpaceSynthesis &poly_ps_ptr_in, const AtomGraphSynthesis &poly_ag_ptr_in, double cutoff_in, double padding_in, int mesh_subdivisions_in, NonbondedTheme theme_in, uint cell_base_capacity_in=default_cellgrid_base_capacity, ExceptionResponse policy_in=ExceptionResponse::WARN)
 
 CellGrid (const CellGrid &original)
 The presence of POINTER-kind Hybrid objects in the cell origin rulers invalidates the default copy and move constructors as well as assignment operators. Manual implementations are needed.
 
 CellGrid (CellGrid &&original)
 
CellGridoperator= (const CellGrid &original)
 
CellGridoperator= (CellGrid &&original)
 
int getCellCount (int index) const
 Get the number of cells allotted to any one system.
 
int getCellCount (int index, UnitCellAxis axis) const
 
int getCellCount (int index, CartesianDimension axis) const
 
uint getImageIndex (int system_index, int atom_index, CoordinateCycle orientation) const
 Find the image location of an atom based on its system and topological atom number.
 
uint getImageIndex (int system_index, int atom_index) const
 
int4 getCellLocation (int system_index, int atom_index, CoordinateCycle orientation) const
 Find the cell grid location of an atom based on its system and topological atom number. The result will be returned as the index of the cell within the grid belonging to the system of interest, with the indices along the A, B, and C axes placed in the "x", "y", and "z" members of the tuple. The overall index of the cell in the entire CellGrid object (up to 2^28 cells) will be returned in the "w" member of the tuple. All of this information is available in bit-packed form in the system_cell_limits array. Overloading and descriptions of parameters follow from getImageIndex() above.
 
int4 getCellLocation (int system_index, int atom_index) const
 
int2 getChainLocation (int system_index, int atom_index, CoordinateCycle orientation) const
 Get the chain index of an atom based on its system and topological atom index. The chain index is returned in the "x" member of the tuple while the offset within the chain is returned in the "y" member. Overloading and descriptions of parameters follow from getImageIndex() above.
 
int2 getChainLocation (int system_index, int atom_index) const
 
const PsSynthesisBorders getUnitCellTransforms (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST) const
 Get a miniature abstract of the underlying coordinate synthesis containing pointers to the arays of unit cell transformation matrices for each system.
 
const PsSynthesisBorders getUnitCellTransforms (HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
CellGridWriter< T, Tacc, Tcalc, T4 > data (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST)
 Obtain the object's abstract in order to access its members in C-programming fashion, whether on the CPU host or GPU device.
 
CellGridWriter< T, Tacc, Tcalc, T4 > data (HybridTargetLevel tier=HybridTargetLevel::HOST)
 
const CellGridReader< T, Tacc, Tcalc, T4 > data (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
const CellGridReader< T, Tacc, Tcalc, T4 > data (HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
CellGridWriter< void, void, void, void > templateFreeData (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST)
 Obtain the object's abstract with templating removed. Overloading and descriptions of parameters follow from data() above.
 
CellGridWriter< void, void, void, void > templateFreeData (HybridTargetLevel tier=HybridTargetLevel::HOST)
 
const CellGridReader< void, void, void, void > templateFreeData (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
const CellGridReader< void, void, void, void > templateFreeData (HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
const CellOriginsReader getRulers (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST) const
 Obtain the object's rulers for determining the origins of each neighbor list cell in terms of the fixed-precision coordinate system of the accompanying PhaseSpaceSynthesis.
 
const CellOriginsReader getRulers (HybridTargetLevel tier=HybridTargetLevel::HOST) const
 
CellOriginsWriter getRulers (CoordinateCycle orientation, HybridTargetLevel tier=HybridTargetLevel::HOST)
 
CellOriginsWriter getRulers (HybridTargetLevel tier=HybridTargetLevel::HOST)
 
void contributeForces (PhaseSpaceSynthesis *dest, HybridTargetLevel tier, const GpuDetails &gpu) const
 Contribute forces accumulated in the cell grid back to a coordinate synthesis with its own force accumulators and forces from other sources.
 
void contributeForces (HybridTargetLevel tier=HybridTargetLevel::HOST, const GpuDetails &gpu=null_gpu) const
 
void updatePositions (PhaseSpaceSynthesis *dest, HybridTargetLevel tier, const GpuDetails &gpu)
 Update the positions and cells of residence for all particles based on the current positions in the associated synthesis. Overloading and descriptions of input parameters follow from contributeForces(), above.
 
void updatePositions (HybridTargetLevel tier=HybridTargetLevel::HOST, const GpuDetails &gpu=null_gpu)
 
void updateCyclePosition ()
 Update the object's cycle position as it participates in the WHITE:BLACK tick-tock mechanics of other coordinate objects.
 
void updateCyclePosition (CoordinateCycle time_point)
 

Detailed Description

template<typename T, typename Tacc, typename Tcalc, typename T4>
class stormm::mm::CellGrid< T, Tacc, Tcalc, T4 >

An object to manage the spatial decomposition of a system of particles. The general strategy is to arrange particles in a grid of cells, each at least half the direct-space cutoff in width in all directions. A work unit entails computing all interactions assigned to one of the cells. This is done using a neutral-territory decomposition the lines of the "tower-plate" arrangement, which consists of seventeen cells:

      ---X--- ^                    ---X--- ^                    ------- ^
      ---X--- |                    ---X--- |                    ------- |

View along A: —XXX- c View along B: -XXXXX- c View along C: —XXX- b —X— —X— -XXXXX- —X— b-> —X— a-> -XXXXX- a->

As with other grids in STORMM, the cell grid index increments most rapidly along the A axis. Furthermore, the atoms of all cells in a stack along A are arranged in contiguous memory, with no spacing between them. Each row of cells along a mesh's A axis is arranged with a degree of padding so that the population of any particular row may fluctuate within some anticipated bounds. Memory access is optimized by arranging the tower along the C axis, providing continuous atom flow for 3, 5, and 5 cells when loading the larger plate. Atoms in the tower will interact with the most other atoms and require one global write (atomicAdd) to commit their forces back to the totals at the end of the work unit. Therefore, it is most efficient to have these atoms be the least contiguous of the whole.

In all cells, cache behavior is optimized by storing coordinates and properties as four-tuples such that only eight atoms in single-precision mode (or four in double-precision mode) are required to fill a cache line.

The work cycle is as follows: 1 .) A KERNEL COMPUTING VALENCE INTERACTIONS will accumulate all forces, including non-bonded contributions originating in this kernel on a previous cycle, and move particles. Following moves, this VALENCE INTERACTION KERNEL will compute new cell addresses for each atom and populate the vestibules of particles entering and leaving each cell. The VALENCE INTERACTION KERNEL will also deposit the new, scaled coordinates of each cell in the previous cycle's atoms array, which is no longer in use. 2.) A NEIGHBOR LIST UPDATE KERNEL will fire off ahead of the non-bonded kernels. This will handle re-imaging of coordinates and updates to cells. Taking the (now complete) vestibules of entering and exiting particles, the kernel begins by performing an out-of-place sort to adjust each cell in the mesh. Each warp performs a prefix sum over the adjustments in the particular cell's column in order to arrive at the limits within which to transfer particles, first from the cell's contents in the old atoms array (so long as they are to be retained) and then by referencing the prior indices of particles that are moving in and transferring the particles from their prior cells, but will likely be hidden by the memory accesses happening at the same time. Particles exit cells, causing contractions of the list of bitmasks and then of the bitmasks themselves. Particles leaving the plate cell prunes the array of bitmasks while particles leaving the tower prunes the bitmasks themselves–see notes in this library's template implementation file. Likewise, particles entering the tower cells will add bits to the back of each mask, and particles entering cells of the plate will expand the arrays of bitmasks in various places. The exact list of particles in each cell will be in flux, but with the identities of particles entering and the indices (positions within the cell grid image) of particles exiting already established in the preceding kernel, the non-bonded lists can be updated without risk of any race conditions. One warp will be tasked with updating each cell's atom coordinates while a second will be tasked with updating its exclusion masks. Both the atom coordinate update and the neighbor list update are performed out-of-place. 3 .) Taking the newly updated neighbor lists, the RECIPROCAL-SPACE KERNEL will proceed to assemble the the FFT pencils and use strong barriers in order to complete the convolution. 4 .) The DIRECT-SPACE KERNEL will launch to complete the work, and perform interpolation of long-ranged particle-particle interactions if that is required.

Constructor & Destructor Documentation

◆ CellGrid() [1/2]

template<typename T, typename Tacc, typename Tcalc, typename T4>
stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::CellGrid ( const PhaseSpaceSynthesis * poly_ps_ptr_in,
const AtomGraphSynthesis * poly_ag_ptr_in,
double cutoff_in,
double padding_in,
int mesh_subdivisions_in,
NonbondedTheme theme_in,
uint cell_base_capacity_in = default_cellgrid_base_capacity,
ExceptionResponse policy_in = ExceptionResponse::WARN )

The constructor accepts an associated PhaseSpaceSynthesis object and a precision model. The system count and scaling factors will be set by the coordinate synthesis. The total cell count and cell layout can be modulated by specifying a cutoff and a padding related to how much larger each cell should be sized.

Parameters
mesh_subdivisions_inThe number of mesh elements along each axis of one of the spatial decomposition cell. This value, times the cell grid dimensions themselves, defines the size of a particle-mesh interaction grid associated with the neighbor list, i.e. the PME reciprocal space grid.
theme_inThe non-bonded potential that will be computed based on the neighbor list represented in the cell grid
gpuDetails of the GPU that will utilize and manipulate the cell grid
cell_base_capacity_inThe atom capacity of an individual decomposition cell. This is set to be larger than the atom content is likely to go, and in all cases where it might be relevant multiple cells will pool their capacities to ensure that overflow is extremely unlikely.
policy_inCourse of action in the case of bad input

◆ CellGrid() [2/2]

template<typename T, typename Tacc, typename Tcalc, typename T4>
stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::CellGrid ( const CellGrid< T, Tacc, Tcalc, T4 > & original)

The presence of POINTER-kind Hybrid objects in the cell origin rulers invalidates the default copy and move constructors as well as assignment operators. Manual implementations are needed.

Parameters
originalThe original object to copy or move
otherAnother object placed on the right hand side of the assignment statement

Member Function Documentation

◆ contributeForces()

template<typename T, typename Tacc, typename Tcalc, typename T4>
void stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::contributeForces ( PhaseSpaceSynthesis * dest,
HybridTargetLevel tier,
const GpuDetails & gpu ) const

Contribute forces accumulated in the cell grid back to a coordinate synthesis with its own force accumulators and forces from other sources.

Overloaded:

  • Contribute the forces to an arbitrary PhaseSpaceSynthesis
  • Contribute the forces back to the CellGrid's internally referenced synthesis
Parameters
destSpecify a different synthesis to receive the accumulated forces
tierIndicate whether to initialize forces on the CPU host or GPU device
gpuDetails of the GPU that will perform the initialization of device memory

◆ data()

template<typename T, typename Tacc, typename Tcalc, typename T4>
CellGridWriter< T, Tacc, Tcalc, T4 > stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::data ( CoordinateCycle orientation,
HybridTargetLevel tier = HybridTargetLevel::HOST )

Obtain the object's abstract in order to access its members in C-programming fashion, whether on the CPU host or GPU device.

Overloaded:

  • Return a const reader from a const CellGrid object
  • Return an editable writer from a non-const CellGrid object
  • Indicate the point in the time cycle or take the device's internal setting
Parameters
orientationThe point in the time cycle at which to obtain some pointers
tierIndicate whether to target pointers to memory on the CPU host or GPU device

◆ getCellCount()

template<typename T, typename Tacc, typename Tcalc, typename T4>
int stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::getCellCount ( int index) const

Get the number of cells allotted to any one system.

Overloaded:

  • Get the total number of cells in the system.
  • Get the number cells along one of the system's simulation box axes.
Parameters
indexThe system of interest
axisThe axis of interest

◆ getImageIndex()

template<typename T, typename Tacc, typename Tcalc, typename T4>
uint stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::getImageIndex ( int system_index,
int atom_index,
CoordinateCycle orientation ) const

Find the image location of an atom based on its system and topological atom number.

Overloaded:

  • Check in the most recent image
  • Specify the primary or alternate image to check for the atom
Parameters
system_indexThe system of interest within the associated synthesis
atom_indexTopological atom index within the system
orientationSpecify whether to look in the primary or

◆ getRulers()

template<typename T, typename Tacc, typename Tcalc, typename T4>
const CellOriginsReader stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::getRulers ( CoordinateCycle orientation,
HybridTargetLevel tier = HybridTargetLevel::HOST ) const

Obtain the object's rulers for determining the origins of each neighbor list cell in terms of the fixed-precision coordinate system of the accompanying PhaseSpaceSynthesis.

Overloaded:

  • Obtain the rulers for a particular stage of the time cycle, or use the object's internal timekeeping
  • Obtain a mutable collection of rulers from a non-const object, or a read-only collection or rulers from a const object
Parameters
orientationThe point in the time cycle at which to obtain the rulers
tierIndicate whether to target pointers to memory on the CPU host or GPU device

◆ getUnitCellTransforms()

template<typename T, typename Tacc, typename Tcalc, typename T4>
const PsSynthesisBorders stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::getUnitCellTransforms ( CoordinateCycle orientation,
HybridTargetLevel tier = HybridTargetLevel::HOST ) const

Get a miniature abstract of the underlying coordinate synthesis containing pointers to the arays of unit cell transformation matrices for each system.

Overloaded:

  • Specify a stage of the coordinate cycle at which to retrieve box information
  • Take the CellGrid's own stage of the coordinate cycle (the CellGrid and its underlying synthesis should be aligned in their cycle positions in most situations)
Parameters
orientationIndicate whether to get the transformation matrices for the WHITE or BLACK stages of the coordinate cycle in the underlying synthesis. If unspecified, the coordinate cycle position of the CellGrid itself will be submitted.
tierIndicate whether to retrieve pointers to memory on the CPU host or GPU device

◆ initializeForces()

template<typename T, typename Tacc, typename Tcalc, typename T4>
void stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::initializeForces ( HybridTargetLevel tier = HybridTargetLevel::HOST,
const GpuDetails & gpu = null_gpu )

Initialize forces for the cell grid, on the CPU host or GPU device. This standalone feature provides a means for performing this activity at will, although in the most optimized code the process it performs will be fused with some other kernel. This will initialize forces for the current cell layout and should only be called once the contents of image_cell_limits or image_cell_limits_alt have been settled, whichever is appropriate to the current time cycle. There is only one array of forces in the CellGrid, as once they are accumulated in this object they are added to one of the two time-cycle dependent arrays in the accompanying PhaseSpaceSynthesis.

Parameters
tierIndicate whether to initialize forces on the CPU host or GPU device
gpuDetails of the GPU that will perform the initialization of device memory

◆ markImageRelevance()

template<typename T, typename Tacc, typename Tcalc, typename T4>
void stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::markImageRelevance ( HybridTargetLevel tier = HybridTargetLevel::HOST,
const GpuDetails & gpu = null_gpu )

Compute whether various atoms may be within range of the tower when they are in the plate of various neutral territory decompositions. Bitwise information (1 for relevant, 0 for irrelevant) is recorded in a 16-bit unsigned integer for each atom in the current image, as indicated by its stage in the coordinate cycle.

On GPU resources, the functionality in this routine is fused with particle migration.

The bitwise map is such that any work unit with a given home cell and twelve surrounding "plate" cells 0 (-2 cells along the A axis, -2 cells along the B axis relative to the home cell), 1 (-1, -2 cells along the A and B axes), ..., 4 (+2, -2), ..., 9 (+2, -1), 10 (-2, 0) and 11 (-1, 0) will look in bits 0, 1, ..., 11 to determine whether the atom is relevant. The relevance of atoms in the "tower" cells 0 (-2 along the C axis relative to the home cell), 1 (-1 along the C axis), 3 (+1 along the C axis), and 4 (+2 along the C axis) can be tested by looking at bits 12, 13, 14, and 15, respectively. Only atoms in the bottom and top cells of the tower might be culled due to being more than the cutoff from the nearest plane bounding the plate. Similarly, atoms in the "inner" four cells of the plate (6, 7, 8, and 10) will almost always be relevant. These facts will be utilized when filling the bit mask: only two cells of the tower (for which the calculation is a simple comparison of the Cartesian Z position and the height of a neighbor list cell) and eight cells of the plate are needed. If the unit cell is orthorhombic, the calculations for eight plate cells are likewise very simple. If it is not, the calculations involve finding the location of a point along which to situate a line parallel to the unit cell C axis, then finding the distance of the particle to that line.

Parameters
tierIndicate whether to update the relevance array on the CPU host or GPU device
gpuSpecifications of the GPU that will carry out calculations

◆ updateCyclePosition()

template<typename T, typename Tacc, typename Tcalc, typename T4>
void stormm::energy::CellGrid< T, Tacc, Tcalc, T4 >::updateCyclePosition ( )

Update the object's cycle position as it participates in the WHITE:BLACK tick-tock mechanics of other coordinate objects.

Overloaded:

  • Increment the cycle position one step forward (this toggles between the WHITE and BLACK states, and is the same as decrementing the cycle position unless the time cycle takes on a third possible stage)
  • Set the time cycle to a specific point
Parameters
time_pointThe point in the time cycle which the object is to take as "current"

The documentation for this class was generated from the following file: