Python

The Python bindings aim to simplify the use of the library, allowing users without low-level programming experience to use it in their applications. Python is a natural choice because of its wide adoption and its simplicity. Using the DistDir library from a Python script hides the complexity of the underlying language and algorithm from end users.

A Python module, pydistdir, can be generated by adding the -DENABLE_PYTHON=ON option to the configuration command described in the Installation page. Once the module is installed, the folder where it is installed needs to be added to the PYTHONPATH environment variable. Generating the module requires cython and mpi4py, both of which can be installed with pip.
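A quick way to confirm that PYTHONPATH is set correctly is to ask Python whether it can locate the module before importing it. The sketch below uses only the standard library; the helper name module_available is illustrative and is not part of pydistdir.

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


# Directories listed in PYTHONPATH are added to sys.path at interpreter startup,
# so a correctly exported PYTHONPATH makes the module discoverable here.
if module_available("pydistdir"):
    print("pydistdir found")
else:
    print("pydistdir not found: check PYTHONPATH")
```

If the module is reported as missing, the install folder is most likely absent from PYTHONPATH in the shell that launched Python.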

This section describes the usage of the library in a Python script. The basic principles of the library are the same, but the interface is slightly different because Python is an object-oriented language.

First of all, the module must be loaded:

import pydistdir

Then the library needs to be initialized by calling the constructor of the distdir class:

dd = pydistdir.distdir()

Finalization is handled by Python once the garbage collector has called the destructors of all the generated objects, so the user does not need to destroy objects or finalize the library explicitly.

A new group can be created using the group method of the main distdir class:

dd.group(new_comm, work_comm, id)

where work_comm is the communicator containing all the processes running the application, id is an integer which defines the group each process belongs to, and new_comm is the new communicator.
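The role of the id argument can be pictured with a plain-Python sketch: processes that pass the same id end up in the same group. This only illustrates the grouping rule; it creates no MPI communicator and does not reflect how the library implements it. The function name split_by_id is hypothetical.

```python
from collections import defaultdict


def split_by_id(ids):
    """Group process ranks by id, where ids[rank] is the id passed by that rank."""
    groups = defaultdict(list)
    for rank, group_id in enumerate(ids):
        groups[group_id].append(rank)
    return dict(groups)


# Four processes: ranks 0 and 1 pass id 0, ranks 2 and 3 pass id 1
print(split_by_id([0, 0, 1, 1]))  # {0: [0, 1], 1: [2, 3]}
```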

The distdir class also contains methods to set up the library:

dd.verbose(pydistdir.pydistdir_verbose.verbose_true)

dd.exchanger(pydistdir.pydistdir_exchanger.IsendIrecv1)

The verbose method takes a member of the pydistdir_verbose class, which is an integer enumerator, and the exchanger method takes a member of the pydistdir_exchanger class, which is also an integer enumerator. They are defined as follows:

class pydistdir_verbose(IntEnum):
    verbose_true  = 0
    verbose_false = 1

class pydistdir_exchanger(IntEnum):
    IsendIrecv1       = 0
    IsendIrecv2       = 1
    IsendRecv1        = 2
    IsendRecv2        = 3
    IsendIrecv1NoWait = 4
    IsendIrecv2NoWait = 5
    IsendRecv1NoWait  = 6
    IsendRecv2NoWait  = 7
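Because both classes derive from IntEnum, their members behave like integers and can be looked up by name or by value. The sketch below repeats the pydistdir_exchanger definition locally so that it runs without the module installed; reading the choice from a configuration string is an illustrative use case, not something the library requires.

```python
from enum import IntEnum


# Local copy of the enumerator listed above, so the snippet runs without pydistdir
class pydistdir_exchanger(IntEnum):
    IsendIrecv1 = 0
    IsendIrecv2 = 1
    IsendRecv1 = 2
    IsendRecv2 = 3
    IsendIrecv1NoWait = 4
    IsendIrecv2NoWait = 5
    IsendRecv1NoWait = 6
    IsendRecv2NoWait = 7


# Members compare equal to their integer values
print(pydistdir_exchanger.IsendRecv1 == 2)  # True

# Look a member up by name, e.g. when the choice comes from a configuration string
choice = pydistdir_exchanger["IsendIrecv2NoWait"]
print(int(choice))  # 5

# Recover the name from an integer value
print(pydistdir_exchanger(7).name)  # IsendRecv2NoWait
```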

An idxlist class is defined in the bindings. It creates an index list object, which can be built easily by passing a numpy array or a tuple:

src_idxlist = pydistdir.idxlist([0, 1, 4, 5, 8, 9, 12, 13])
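The indices in this example can be read as the global indices of one block of a decomposed 2D domain. As a sketch, assuming a 4x4 domain numbered in row-major order and split into a left and a right half (an assumption made here for illustration), the list above is the left half. The helper block_indices is hypothetical and not part of pydistdir.

```python
def block_indices(ncols, row_range, col_range):
    """Global row-major indices of a rectangular block of a 2D domain."""
    return [row * ncols + col for row in row_range for col in col_range]


# 4x4 global domain: the left half owns columns 0-1, the right half columns 2-3
left = block_indices(4, range(4), range(0, 2))
right = block_indices(4, range(4), range(2, 4))

print(left)   # [0, 1, 4, 5, 8, 9, 12, 13]
print(right)  # [2, 3, 6, 7, 10, 11, 14, 15]
```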

An empty index list object can be created by not passing any array to the constructor of the idxlist class:

dst_idxlist = pydistdir.idxlist()

A map object can be created by passing the source and destination idxlist objects and the MPI communicator:

map = pydistdir.map(src_idxlist=src_idxlist, dst_idxlist=dst_idxlist, comm=MPI.COMM_WORLD)

In addition the stride argument can be passed explicitly (the default stride is -1 which means that striding is not used):

map = pydistdir.map(src_idxlist=src_idxlist, dst_idxlist=dst_idxlist, comm=MPI.COMM_WORLD, stride=8)

If the map object is generated by expanding a 2D map, the following constructor can be called:

map = pydistdir.map(map2d=map2d, nlevels=10)

where the map2d object was generated by a previous call to pydistdir.map.

An exchanger object can be generated by passing the map object, the hardware type and the MPI datatype:

exchanger = pydistdir.exchanger(map, hw, type)

where the hw argument is a class defined as an integer enumerator:

class pydistdir_hardware(IntEnum):
    CPU        = 0
    GPU_NVIDIA = 1
    GPU_AMD    = 2

and type is an MPI datatype. The following MPI datatypes are supported: MPI.DOUBLE, MPI.FLOAT and MPI.INT.

The default hardware is the CPU and the default type is MPI.DOUBLE, so the exchanger can also be created by providing only the map object:

exchanger = pydistdir.exchanger(map)

The data to be exchanged can be allocated as a numpy array:

data = np.array([0, 1, 2, 3, 4, 5, 6, 7], dtype = np.double)

Finally, the exchanger class has a go method which performs the actual exchange:

exchanger.go(data, data)

If a transformation is provided on the data layout, the following method can be used:

exchanger.go(data, data, transform_src, transform_dst)

where transform_src and transform_dst are integer numpy arrays.

The destructors of the idxlist, map and exchanger classes free the memory internally allocated by the library. They can be triggered explicitly inside a Python script, or the garbage collector can be left to call them.
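The two options can be illustrated with a stand-in class, since the snippet is meant to run without pydistdir. It assumes CPython, where dropping the last reference with del runs the destructor immediately; the Resource class and its freed list are hypothetical and only mimic an object that owns library memory.

```python
class Resource:
    """Hypothetical stand-in for an object that owns library-side memory."""

    freed = []  # records which objects have been destroyed

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # A real binding would release the library-side memory here
        Resource.freed.append(self.name)


exchanger = Resource("exchanger")
del exchanger  # last reference dropped: CPython runs __del__ right away
print(Resource.freed)  # ['exchanger']
```

Objects that are never deleted explicitly are cleaned up in the same way when they go out of scope or when the interpreter exits.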

Overall, the Python bindings provide the following classes:

distdir: initialize the library, create a new group and handle the library settings
idxlist: create and destroy an index list
map: create and destroy a map
exchanger: create and destroy an exchanger and perform the actual exchange

Compared with the C API described in the Getting started page, the Python interface is significantly easier to use, although it has minor limitations which will be tackled in future developments.

All the C examples provided in the examples folder are replicated in Python in the bindings/python/examples folder.

C++

A thin wrapper around the library is generated to provide C++ bindings. This is not strictly necessary because a C library can be called directly from C++, but since C++ is an object-oriented language, a class-based interface is provided. The header file distdir.hpp should be included; it is installed in the include folder when -DENABLE_CXX=ON is added to the configuration command. The classes belonging to the C++ interface are all defined in the distdir namespace.

The design of the C++ bindings is similar to that of the Python bindings for consistency, since both are object-oriented languages.

The library is initialized by calling the constructor of the distdir class:

distdir::distdir::Ptr distdir( new distdir::distdir() );

And it is finalized by calling the destructor:

distdir.reset();

A new group can be created using the group method of the distdir class:

distdir->group(new_comm, work_comm, id);

where work_comm is the communicator containing all the processes running the application, id is an integer which defines the group each process belongs to, and new_comm is the new communicator.

The distdir class also contains methods to set up the library:

distdir->set_verbose(verbose_true);

distdir->set_exchanger(IsendIrecv1);

The verbose type is specified using the distdir_verbose enumerator and the exchanger type is specified using the distdir_exchanger enumerator.

An idxlist class is defined to create index lists. An idxlist object can be created by passing a vector to the constructor:

distdir::idxlist::Ptr idxlist( new distdir::idxlist(list) );

where list is a std::vector<int>. The design of the C++ bindings is based on shared pointers, so each class defines a shared pointer type to itself, for example distdir::idxlist::Ptr.

An empty idxlist object can be created by not passing any array to the constructor:

distdir::idxlist::Ptr idxlist_empty( new distdir::idxlist() );

The destructor of the idxlist class frees the internally allocated memory:

idxlist.reset();

A map class is defined to create maps. A map object can be created by passing shared pointers to the source and destination idxlist objects and the MPI communicator:

distdir::map::Ptr map( new distdir::map(idxlist, idxlist_empty, MPI_COMM_WORLD) );

In addition, the stride argument can be passed:

distdir::map::Ptr map( new distdir::map(idxlist, idxlist_empty, stride, MPI_COMM_WORLD) );

The default stride is -1 which means that striding is not used internally to generate the map.

If the map object is generated by expanding a 2D map, the constructor can be used as follows:

distdir::map::Ptr map( new distdir::map(map2d, nlevels) );

where map2d is a shared pointer to a map object generated previously.

The destructor of the map class frees the internally allocated memory:

map.reset();

An exchanger object can be generated by passing the shared pointer to the map object, the MPI datatype and the hardware type:

distdir::exchanger<int>::Ptr exchanger ( new distdir::exchanger<int>(map, MPI_INT, CPU) );

The exchanger class is templated. The example instantiates it for int, and the template type has to match the MPI datatype passed to the constructor. The hardware type is specified using the distdir_hardware enumerator defined in the C library. If not specified, the default hardware type is the CPU:

distdir::exchanger<int>::Ptr exchanger ( new distdir::exchanger<int>(map, MPI_INT) );

The main difference with the Python bindings is that the exchanger class is templated in C++ and there is no default MPI datatype.

The destructor of the exchanger class frees the internally allocated memory:

exchanger.reset();

The data to be exchanged has to be defined in a vector. Finally, the go method of the exchanger class can be called:

exchanger->go(data, data);

If a transformation is provided on the data layout, the following method can be used:

exchanger->go(data, data, transform_src, transform_dst);

where transform_src and transform_dst are vectors of type int.

Compared with the C API described in the Getting started page, the C++ interface has an object-oriented design.

All the C examples provided in the examples folder are replicated in C++ in the bindings/C++/examples folder.

Fortran

Fortran bindings are generated by passing -DENABLE_FORTRAN=ON to the configuration command; the distdir_mod module is then installed in the modules folder. The module can be used in a Fortran program with:

USE distdir_mod

The library is initialized and finalized as follows:

CALL distdir_initialize()
...
CALL distdir_finalize()

A new group can be created by calling the following subroutine:

CALL new_group(new_comm, work_comm, id)

where work_comm is the communicator containing all the processes running the application, id is an integer which defines the group each process belongs to, and new_comm is the new communicator.

The following subroutines can be used to set up the library:

CALL set_config_verbose(verbose_type)

CALL set_config_exchanger(exchanger_type)

The verbose type is specified using one of the following parameters:

DISTDIR_VERBOSE_TRUE

DISTDIR_VERBOSE_FALSE

The exchanger type is specified using one of the following parameters:

DISTDIR_EXCHANGER_IsendIrecv1
DISTDIR_EXCHANGER_IsendIrecv2
DISTDIR_EXCHANGER_IsendRecv1
DISTDIR_EXCHANGER_IsendRecv2
DISTDIR_EXCHANGER_IsendIrecv1NoWait
DISTDIR_EXCHANGER_IsendIrecv2NoWait
DISTDIR_EXCHANGER_IsendRecv1NoWait
DISTDIR_EXCHANGER_IsendRecv2NoWait

A new index list can be created by passing a 1D integer array with the list of global indices and its size:

CALL new_idxlist(idxlist, list, npoints_local)

where idxlist is a variable of type TYPE(t_idxlist).

An empty index list can be generated without passing any list:

CALL new_idxlist(idxlist_empty)

The memory internally allocated for an index list can be freed by calling the following subroutine:

CALL delete_idxlist(idxlist)

A map can be generated by passing the source and destination index lists and the MPI communicator:

CALL new_map(map, idxlist, idxlist_empty, MPI_COMM_WORLD)

where map is a variable of type TYPE(t_map). In addition, the stride argument can be passed:

CALL new_map(map, idxlist, idxlist_empty, stride, comm)

The default stride is -1 which means that striding is not used internally.

If the map is generated by expanding a 2D map, the new_map subroutine can be called as follows:

CALL new_map(map, map2d, nlevels)

where map2d is a variable of type TYPE(t_map).

The memory internally allocated for a map can be freed by calling the following subroutine:

CALL delete_map(map)

An exchanger can be generated by passing the previously created map, the MPI datatype and the hardware type:

CALL new_exchanger(exchanger, map, type, hw)

The exchanger variable is of type TYPE(t_exchanger). The type is an MPI datatype and the hw variable is an integer which can have the following values:

DISTDIR_HW_CPU
DISTDIR_HW_GPU_NVIDIA
DISTDIR_HW_GPU_AMD

The default hardware is the CPU, so the exchanger can also be created as follows:

CALL new_exchanger(exchanger, map, type)

The memory internally allocated for an exchanger can be freed by calling the following subroutine:

CALL delete_exchanger(exchanger)

The data to be exchanged has to be defined in a 1D array. Finally, the exchanger_go subroutine can be called:

CALL exchanger_go(exchanger, C_LOC(data(1)), C_LOC(data(1)))

If a transformation is provided on the data layout, the following subroutine can be used:

CALL exchanger_go(exchanger, C_LOC(data(1)), C_LOC(data(1)), transform_src, transform_dst)

where transform_src and transform_dst are 1D integer arrays.

The design of the Fortran bindings closely follows the C interface, with the addition of function overloading, which makes it more compact.

All the C examples provided in the examples folder are replicated in Fortran in the bindings/Fortran/examples folder.

Julia

The development of the Julia bindings has followed a different strategy. Since Julia package development is standardized, a DistDir.jl package was created, which is a submodule in the main DistDir repository. Since Julia is just-in-time (JIT) compiled, the module needs to open the library (using dlopen), so LOAD_PATH or LD_LIBRARY_PATH needs to be set to the location where the library is installed. In addition, JULIA_LOAD_PATH needs to be set to the location of the DistDir.jl folder in order to import the module in a Julia application. This step is needed because DistDir.jl is not an official package.

The library is initialized and finalized by calling the following functions:

DistDir.initialize()
...
DistDir.finalize()

A new group can be created by calling the following function:

DistDir.new_group(new_comm, work_comm, id)

where work_comm is the communicator containing all the processes running the application, id is an integer which defines the group each process belongs to, and new_comm is the new communicator.

The following functions can be used to set up the library:

DistDir.set_config_verbose(verbose_type)

DistDir.set_config_exchanger(exchanger_type)

The verbose_type argument is an enumerator which replicates the one defined in the C library:

@enum distdir_verbose begin
    verbose_true = 0
    verbose_false = 1
end

The same concept is applied to the exchanger_type argument:

@enum distdir_exchanger begin
    IsendIrecv1 = 0
    IsendIrecv2 = 1
    IsendRecv1 = 2
    IsendRecv2 = 3
    IsendIrecv1NoWait = 4
    IsendIrecv2NoWait = 5
    IsendRecv1NoWait = 6
    IsendRecv2NoWait = 7
end

An index list can be created by passing an integer vector with the list of global indices:

idxlist::DistDir.t_idxlist_jl = DistDir.new_idxlist(list)

where idxlist is a variable of type t_idxlist_jl.

An empty index list can be generated without passing any list:

idxlist_empty::DistDir.t_idxlist_jl = DistDir.new_idxlist()

The memory internally allocated for an index list can be freed by calling the following function:

DistDir.delete_idxlist(idxlist)

A map can be created by passing the source and destination index lists and the MPI communicator:

map :: DistDir.t_map_jl = DistDir.new_map(idxlist, idxlist_empty, MPI.COMM_WORLD)

where map is a variable of type t_map_jl. In addition, the stride argument can be passed:

map :: DistDir.t_map_jl = DistDir.new_map(idxlist, idxlist_empty, stride, MPI.COMM_WORLD)

The default stride is -1 which means that striding is not used internally.

If the map is generated by expanding a previously generated 2D map, the following function can be used:

map :: DistDir.t_map_jl = DistDir.new_map(map2d, nlevels)

where map2d is a t_map_jl variable.

The memory internally allocated for a map can be freed by calling the following function:

DistDir.delete_map(map)

An exchanger can be generated by passing the map, the MPI datatype and the hardware type:

exchanger :: DistDir.t_exchanger_jl = DistDir.new_exchanger(map, MPI.Datatype(Float64), hw)

where the exchanger variable is of type t_exchanger_jl. The type is an MPI datatype, like MPI.Datatype(Float64) for doubles, and the hardware is an integer which can be defined using the following enumerator:

@enum distdir_hardware begin
    CPU = 0
    GPU_NVIDIA = 1
    GPU_AMD = 2
end

The default hardware is the CPU, so the exchanger can also be created as follows:

exchanger :: DistDir.t_exchanger_jl = DistDir.new_exchanger(map, MPI.Datatype(Float64))

The memory internally allocated for an exchanger can be freed by calling the following function:

DistDir.delete_exchanger(exchanger)

The data to be exchanged has to be defined in a vector. Finally, the exchanger_go function can be called:

DistDir.exchanger_go(exchanger, data, data)

If a transformation is provided on the data layout, the following function can be used:

DistDir.exchanger_go(exchanger, data, data, transform_src, transform_dst)

where transform_src and transform_dst are integer vectors.

Finally, a Julia script can be launched as follows:

mpiexecjl -n 4 julia example_basic1.jl

The Julia interface closely resembles the Fortran one. All the C examples provided in the examples folder are replicated in Julia in the bindings/Julia/examples folder.
