
Fundamentals of parallelization


Message Passing Interface (MPI)

Message Passing Interface

  • Application Programming Interface for inter-process communication
  • Takes advantage of HPC hardware:

    • TCP/IP: 50 $\mu \text{s}$ latency
    • Remote Direct Memory Access (RDMA): <2 $\mu \text{s}$ (low-latency network)
  • Portable
  • Each vendor has its own implementation adapted to the hardware
  • Standard in HPC
  • Initially designed for a fixed number of processes:

    • Peer discovery is not a problem
    • Fast collective communications
  • Single Program Multiple Data (SPMD) paradigm

Communicators

  • Group of processes that can communicate together
  • Each process has an ID in the communicator: no need for IP addresses and port numbers
  • MPI_COMM_WORLD: Global communicator, default
  • size: number of processes in the communicator
  • rank: ID of the process in the communicator

Point-to-point communication

Python

  • Send: comm.send(data, dest, tag)
  • Receive: comm.recv(source, tag)

Fortran

  • Send: MPI_SEND(buffer, count, datatype, destination, tag, communicator, ierror)
  • Receive: MPI_RECV(buffer, count, datatype, source, tag, communicator, status, ierror)

Point-to-point communication (Python)

from mpi4py import MPI

def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    if rank == 0:
        data = 42
        print("Before: Rank: %d    Size: %d    Data: %d"%(rank, size, data))
        comm.send(data, dest=1, tag=11)
        print("After : Rank: %d    Size: %d    Data: %d"%(rank, size, data))
    elif rank == 1:
        data = 0
        print("Before: Rank: %d    Size: %d    Data: %d"%(rank, size, data))
        data = comm.recv(source=0, tag=11)
        print("After : Rank: %d    Size: %d    Data: %d"%(rank, size, data))

if __name__ == "__main__": main()

Point-to-point communication (Python)

$ mpiexec -n 4 python mpi_rank.py 
Before: Rank: 0    Size: 4    Data: 42
Before: Rank: 1    Size: 4    Data: 0
After : Rank: 0    Size: 4    Data: 42
After : Rank: 1    Size: 4    Data: 42

In Fortran, compile using mpif90 and execute using mpiexec (or mpirun).
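
For example, assuming the Fortran program shown next is saved as mpi_rank.f90 (the file name is illustrative):

$ mpif90 mpi_rank.f90 -o mpi_rank
$ mpiexec -n 4 ./mpi_rank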

Point-to-point communication (Fortran)

program test_rank
   use mpi
   implicit none
   integer :: rank, size, data, ierr, status(mpi_status_size)

   call MPI_INIT(ierr)      ! Initialize library (required)
   if (ierr /= MPI_SUCCESS) then
      call MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
   end if
   
   call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
   if (ierr /= MPI_SUCCESS) then
      call MPI_ABORT(MPI_COMM_WORLD, 2, ierr)
   end if

   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   if (ierr /= MPI_SUCCESS) then
      call MPI_ABORT(MPI_COMM_WORLD, 3, ierr)
   end if

Point-to-point communication (Fortran)

   if (rank == 0) then
      data = 42
      print *, "Before: Rank:", rank, "Size:", size, "Data: ", data
      call MPI_SEND(data, 1, MPI_INTEGER, 1, 11, MPI_COMM_WORLD, ierr)
      print *, "After : Rank:", rank, "Size:", size, "Data: ", data

   else if (rank == 1) then
      data = 0
      print *, "Before: Rank:", rank, "Size:", size, "Data: ", data
      call MPI_RECV(data, 1, MPI_INTEGER, 0, 11, MPI_COMM_WORLD, &
                    status, ierr)
      print *, "After : Rank:", rank, "Size:", size, "Data: ", data

   end if
   call MPI_FINALIZE(ierr)      ! De-initialize library (required)
end program

Collective communications

One-to-all

  • Broadcast: send same data to all
  • Scatter: distribute an array

All-to-one

  • Reduction: Sum/product/… of data coming from all ranks
  • Gather: collect a distributed array

All-to-all

  • Allreduce: reduction followed by a broadcast, so every rank gets the result
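
These operations can be sketched in Fortran; a minimal example (the broadcast value and variable names are illustrative) where rank 0 broadcasts a value to every rank, and the sum of all contributions is then collected back on rank 0:

program test_collective
   use mpi
   implicit none
   integer          :: rank, size, ierr
   double precision :: x, total

   call MPI_INIT(ierr)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

   ! One-to-all: rank 0 broadcasts x to every rank of the communicator
   if (rank == 0) then
      x = 1.d0
   else
      x = 0.d0
   end if
   call MPI_BCAST(x, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)

   ! All-to-one: sum the contributions of all ranks on rank 0
   call MPI_REDUCE(x, total, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
                   MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'size =', size, '  sum =', total

   call MPI_FINALIZE(ierr)
end program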

Deadlocks
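
With blocking point-to-point calls, a classic deadlock occurs when both ranks send before receiving. A minimal sketch (the array size is illustrative; small messages may still go through internal buffering):

program deadlock
   use mpi
   implicit none
   integer, parameter            :: n = 10000000
   integer                       :: rank, ierr, status(MPI_STATUS_SIZE)
   double precision, allocatable :: a(:), b(:)

   call MPI_INIT(ierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   allocate(a(n), b(n))
   a = dble(rank)

   ! Both ranks block in MPI_SEND, each waiting for a receive that is
   ! never posted: the program hangs.
   if (rank == 0) then
      call MPI_SEND(a, n, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, ierr)
      call MPI_RECV(b, n, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, status, ierr)
   else if (rank == 1) then
      call MPI_SEND(a, n, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, ierr)
      call MPI_RECV(b, n, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, status, ierr)
   end if

   call MPI_FINALIZE(ierr)
end program

Swapping the send/receive order on one of the ranks, or using MPI_SENDRECV or the non-blocking MPI_ISEND/MPI_IRECV with MPI_WAIT, avoids the deadlock.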

OpenMP
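
As a minimal sketch of the OpenMP shared-memory model (the loop body is illustrative): the iterations of a loop are shared among threads, and a reduction clause combines the per-thread partial sums. Compile with, e.g., gfortran -fopenmp.

program omp_sum
   use omp_lib
   implicit none
   integer          :: i
   double precision :: total

   total = 0.d0
   ! Iterations are distributed over the threads; each thread keeps a
   ! private partial sum, combined into 'total' at the end of the loop.
   !$omp parallel do reduction(+:total)
   do i = 1, 1000000
      total = total + 1.d0/dble(i)**2
   end do
   !$omp end parallel do

   print *, 'Threads:', omp_get_max_threads(), '  Sum:', total
end program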

Exercises

Monte Carlo

  1. Write a Fortran function
    double precision function compute_pi(M)
    that computes $\pi$ with the Monte Carlo algorithm using $M$ samples:
    draw $M$ points uniformly in the unit square and count the fraction
    $n_{\text{in}}/M$ falling inside the quarter circle, so that
    $\pi \approx 4\, n_{\text{in}}/M$
  2. Call it like this:

    program pi_mc
    implicit none
    integer          :: M
    logical          :: iterate
    double precision :: sample
    double precision, external   :: compute_pi
    
    call random_seed()  ! Initialize random number generator
    read (*,*) M        ! Read number of samples in compute_pi
    
    iterate = .True.
    do while (iterate)  ! Compute pi over M samples until 'iterate=.False.'
       sample = compute_pi(M)
       write(*,*) sample
       read (*,*) iterate
    end do
    end program pi_mc

Monte Carlo (solution)

double precision function compute_pi(M)
implicit none
integer, intent(in) :: M
double precision    :: x, y, n_in
integer             :: i

n_in = 0.d0
do i=1, M
  call random_number(x)
  call random_number(y)
  if (x*x + y*y <= 1.d0) then
     n_in = n_in+1.d0
  end if
end do
compute_pi = 4.d0*n_in/dble(M)

end function compute_pi
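
As a possible follow-up (a sketch, not part of the exercise statement): each MPI rank calls compute_pi independently, and rank 0 averages the estimates with a reduction.

program pi_mpi
   use mpi
   implicit none
   integer          :: rank, size, ierr, M
   double precision :: sample, total
   double precision, external :: compute_pi

   call MPI_INIT(ierr)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   call random_seed()     ! in practice, each rank should use a distinct seed

   M = 1000000            ! number of samples per rank (illustrative)
   sample = compute_pi(M) ! every rank computes its own estimate

   ! Sum the estimates on rank 0 and average them
   call MPI_REDUCE(sample, total, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
                   MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'pi ~', total/dble(size)

   call MPI_FINALIZE(ierr)
end program pi_mpi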