Full Transcript

Message Passing Interface (MPI)
High Performance Computing
Master in Applied Artificial Intelligence
Nuno Lopes

Motivation
- Parallel architectures can scale by adding new nodes to the system.
- Each node has a processor and memory independent of the other nodes.
- In the message-passing paradigm:
  - processes only send and receive messages to and from each other;
  - there is no memory sharing.
- The Message Passing Interface standard defines an API for processes to exchange data among themselves.
- Single Program Multiple Data (SPMD) approach.

Open MPI Library
- Open MPI (https://www.open-mpi.org) is an open-source implementation of the standard, available for common platforms (Windows, Mac, Linux).
- Header file include:
    #include <mpi.h>
- Script to compile MPI programs, a wrapper for gcc:
    $ mpicc
- Script to execute the program:
    $ mpirun (or mpiexec)

MPI API Initialization
- int MPI_Init(int *argc, char ***argv)
  - Must be invoked before any other MPI function in the program.
  - Receives the addresses of the main function's parameters, or NULL.
- int MPI_Finalize()
  - Terminates the MPI library in the process; from that point on, no further MPI functions should be invoked.

MPI API Initialization
- int MPI_Comm_rank(MPI_Comm comm, int *rank)
  - Returns the identifier of the calling process within the process set.
  - rank is an output parameter.
- int MPI_Comm_size(MPI_Comm comm, int *size)
  - Returns the size of the process set.
  - size is an output parameter.
- MPI_COMM_WORLD
  - Constant representing the set of all processes in an execution.

MPI Basic Example
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        MPI_Init(NULL, NULL);
        int rank;
        int world;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world);
        printf("Hello: rank %d, world: %d\n", rank, world);
        MPI_Finalize();
    }

MPI Compilation and Execution
- Compile using the wrapper around the system compiler, typically gcc:
    $ mpicc exemplo.c -o exemplo
- To execute with 4 processes:
    $ mpirun -n 4 ./exemplo
- Option, if necessary: --use-hwthread-cpus

Point-to-Point Communication

Communication between Processes
- Communication happens by sending and receiving messages.
- As the code is the same in both processes, the send and receive operations must be aligned in their execution.
- Each process executes a different part of the same code, selected through "if" statements.
- Each process is identified by its rank value.
- One process executes the send function: "Send";
- another process executes the receive function: "Recv".

MPI Send
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
- Buffer: memory pointer to the data
- Count: number of elements in the message
- Datatype: type of the data sent (MPI constant)
- Destination: rank of the destination process
- Tag: tag (integer value) used to distinguish message channels
- Comm: process group (in general: MPI_COMM_WORLD)

MPI Receive
    int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
- Buffer: memory pointer where the message is received
- Count: maximum number of elements that can be received
- Datatype: type of the data in the message
- Source: rank of the sender process (in general: MPI_ANY_SOURCE)
- Tag: message tag (in general: MPI_ANY_TAG)
- Comm: set of processes in communication (in general: MPI_COMM_WORLD)
- Status: status of the result of the operation, to be consulted later.

MPI DataTypes
- Predefined constants such as MPI_CHAR, MPI_INT, MPI_FLOAT and MPI_DOUBLE correspond to the C types char, int, float and double.

MPI Count
    int MPI_Get_count(const MPI_Status *status, MPI_Datatype datatype, int *count)
- Returns the number of elements received by the last message.
- Status: status of the receive operation
- Datatype: type of the data in the message
- Count: number of elements received.

Message Exchange Example
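A minimal sketch of such an exchange, showing how MPI_Send, MPI_Recv and MPI_Get_count fit together; it assumes two processes, where rank 0 sends four integers to rank 1, which receives them into a larger buffer and queries the actual element count (the values, buffer sizes and tag are illustrative):

    /* Sketch: rank 0 sends an array of integers to rank 1,
       which checks how many elements actually arrived. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, world;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world);

        if (rank == 0) {
            /* Sender: 4 integers, tag 0, destination rank 1 */
            int data[4] = {10, 20, 30, 40};
            MPI_Send(data, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Receiver: buffer large enough for up to 10 integers */
            int buf[10];
            MPI_Status status;
            MPI_Recv(buf, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);

            int count;
            MPI_Get_count(&status, MPI_INT, &count);  /* elements actually received */
            printf("Rank 1 received %d integers, first = %d\n", count, buf[0]);
        }

        MPI_Finalize();
        return 0;
    }

Assuming the file is named troca.c (an illustrative name), it could be compiled with $ mpicc troca.c -o troca and executed with $ mpirun -n 2 ./troca; at least two processes are required, since both rank 0 and rank 1 must exist.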
Synchronization Model
- The Send function can have a synchronous or deferred sending behaviour:
  - synchronous, if it blocks until the receiver receives the message;
  - deferred, if it returns at the sender before the receiver receives the message (it does not block).
- The Recv function is always synchronous, that is, it blocks until it receives a matching message.
- If the next message received does not match the reception parameters, the program may block!
- The matching parameters are the sender rank, the tag, the message datatype and count, and the comm group.

Synchronization Model
- The Ssend function has the same behaviour as the Send function, but it is always synchronous and blocking: it only returns when the message reaches the destination.
- It has the same parameters as the Send function, differing only in the name: MPI_Ssend.

Sender/Receiver Symmetry
- For a message communication to succeed, the sending function and the receiving function must be aligned.
- Both functions must be symmetrical in sender and receiver;
- if one function is not matched, the process may block!
- The message must be of the same type.
- The receiver must have room to receive the sent message.

Collectives

Motivation for Collectives
- Sometimes the need arises to exchange messages among all processes, not just between two of them.
- This group communication can be optimised by the library implementation and by the communication hardware itself, through specific functions:
  - Barrier: synchronisation of processes;
  - Broadcast: sending messages to everyone, including oneself;
  - Scatter/Gather: sending from one to all, sending from all to one;
  - Reduce: aggregation of values from all processes into a single result.

MPI Barrier
    int MPI_Barrier(MPI_Comm comm)
- Barrier: used for synchronisation of processes.
- All processes invoke this function;
- processes are blocked until all processes in the comm group have called the function.

MPI Broadcast (Bcast)
    int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- All processes invoke this function;
- the root process sends the data;
- the other processes receive the data.
- buffer: memory address with the data
- count: number of data elements to send
- datatype: type of the data to send
- root: rank of the process that sends the data
- comm: communication group.

Broadcast/Scatter/Gather Comparison
(figure comparing the collectives; source: mpitutorial.com)

MPI Reduce
    int MPI_Reduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
- Reduce function: collects a value from all processes, applies an aggregation function and merges the result in the root process.
- sendbuf: memory pointer to the data to be collected in all processes;
- recvbuf: memory pointer to the final aggregate value (in the root process);
- count: number of items in the buffer;
- datatype: type of the data to send;
- op: operation to apply to aggregate the results (e.g. MPI_SUM);
- root: rank of the process which will hold the single global result;
- comm: communication group.
- Example: MPI_Reduce(&x, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD)

MPI Reduce - Operations
- Predefined aggregation operations include MPI_SUM, MPI_PROD, MPI_MAX, MPI_MIN, MPI_LAND and MPI_LOR.
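A minimal sketch built around the MPI_Reduce example call above; it assumes each process contributes its own rank as the value x, so the root (rank 0) ends up with the sum of all ranks (the choice of rank as the contributed value is illustrative):

    /* Sketch: sum the ranks of all processes at the root with MPI_Reduce. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, world;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world);

        int x = rank;       /* value contributed by this process */
        int result = 0;     /* only meaningful at the root after the call */
        MPI_Reduce(&x, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("Sum of ranks 0..%d = %d\n", world - 1, result);

        MPI_Finalize();
        return 0;
    }

Only rank 0 prints, because only the root process holds the aggregated result; running with, for example, $ mpirun -n 4 would print the sum 0 + 1 + 2 + 3 = 6.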
