At this point your instinctive response to this question is likely that it differentiates communication by IP address. While this would be a reasonable assumption to make and would likely be a reliable way to differentiate single-core systems, many modern systems are multi-core, meaning that several processor cores on each node can be used for a distributed task. An IP address identifies a node, not the individual processes running on it.
To resolve this identification issue, MPI assigns each process it starts a unique identification number. This unique number is called the rank of the process.
Now, let's look at how this works in code.
In our last tutorial you developed a simple Python program to show that the cluster does, in fact, function and to get you familiar with the commands used to operate the cluster properly. That example did not actually take advantage of the MPI library or any of the true functionality provided by the cluster. This tutorial differs in its example by using the mpi4py library to access and return the rank of each process of a program running in the cluster environment.
To get started, we need to create a new file; in this case we'll call it RankedHelloWorld.py. Open this file in a text editor.
First and foremost, the mpi4py library must be imported. This can be done with the following line:
from mpi4py import MPI
At this point we need a way to ask MPI about the processes that were created, so that one process can be told apart from another. To do this we want to access a feature of MPI called the communicator. The communicator's purpose is to group all of the processes at the start of an MPI program.
mpi4py provides a default communicator for easy access called MPI.COMM_WORLD (the counterpart of MPI_COMM_WORLD in the C API). This is an ordinary Python object and can be stored in a variable by assigning MPI.COMM_WORLD to it, as seen below.
comm = MPI.COMM_WORLD
From this point the rank can be accessed through the comm.rank attribute. Reading this attribute returns the rank of the current process; the method call comm.Get_rank() returns the same value.
So in theory, when this code is run on the cluster, the rank stored in comm.rank should be a different integer for each process, counting up from 0.
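The steps so far can be sketched as follows. This is a minimal sketch, assuming mpi4py is installed on every node; the helper name describe is hypothetical and exists only to format the output, while Get_rank(), Get_size() and MPI.Get_processor_name() are standard mpi4py calls. Printing the node name next to the rank illustrates why an address alone is not enough: on a multi-core node, several ranks can report the same host.

```python
try:
    from mpi4py import MPI
except ImportError:
    MPI = None  # lets the formatting helper be tried on a machine without MPI


def describe(rank, size, host):
    """Hypothetical helper: build one line identifying a process."""
    return "rank %d of %d on %s" % (rank, size, host)


def main():
    if MPI is None:
        print("mpi4py is not installed; run this on the cluster")
        return
    comm = MPI.COMM_WORLD            # default communicator: every process
    rank = comm.Get_rank()           # same value as the comm.rank attribute
    size = comm.Get_size()           # number of processes in the communicator
    host = MPI.Get_processor_name()  # name of the node this process runs on
    print(describe(rank, size, host))


if __name__ == "__main__":
    main()
```

Each process runs the same script, so every process prints its own line.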
At this point we can easily test our code by adding a print statement that displays the rank of our process (written with parentheses so that it runs under both Python 2 and Python 3).
print('rank: %d' % comm.rank)
At this point if the code is run on the cluster you can expect the following output:
You may notice that your rank numbers print in what seems to be a random order and will likely differ from the output shown above. This is because the processes run independently of one another, so each line appears whenever its process happens to reach the print statement.
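If ordered output ever matters, one option is to let a single process do all the printing. The sketch below is an assumption about how you might do that and is not part of the program built in this tutorial: every process hands its line to rank 0 with comm.gather, which returns the collected items in rank order on the root process and None everywhere else. The helper name rank_line is hypothetical.

```python
try:
    from mpi4py import MPI
except ImportError:
    MPI = None  # lets the formatting helper be tried on a machine without MPI


def rank_line(rank):
    """Hypothetical helper: the line a single process wants to print."""
    return "rank: %d" % rank


def main():
    if MPI is None:
        print("mpi4py is not installed; run this on the cluster")
        return
    comm = MPI.COMM_WORLD
    # Every process contributes one line; only the root receives the list.
    lines = comm.gather(rank_line(comm.rank), root=0)
    if comm.rank == 0:
        # The gathered list is ordered by rank, so the output is too.
        for line in lines:
            print(line)


if __name__ == "__main__":
    main()
```

The cost of this approach is an extra communication step, which is why the simple version just lets each process print on its own.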
Now that we can see the rank of our processes, we can print some dummy data along with them for the sake of making this truly a Hello World program.
To do this we simply declare a variable, in this case data, to hold our information, and set it to the string Hello World, that is, data = 'Hello World'.
At this point only the print statement needs to be modified to print both the rank and the data.
print('Rank: %d Data: %s' % (comm.rank, data))
At this point the code can be run on the cluster and the expected output is seen below:
Again, the ranks will appear out of order; this is normal and expected.
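For reference, the pieces from this tutorial can be assembled into one file. This is a sketch of what RankedHelloWorld.py might look like, not a definitive listing: the hello_line helper and the main function are hypothetical additions that keep the formatting separate from the MPI calls, and the fallback for a missing mpi4py is there only so the file can be opened and tried away from the cluster.

```python
# RankedHelloWorld.py
try:
    from mpi4py import MPI
except ImportError:
    MPI = None  # lets the formatting helper be tried on a machine without MPI


def hello_line(rank, data):
    """Hypothetical helper: format the rank together with the data."""
    return "Rank: %d Data: %s" % (rank, data)


def main():
    if MPI is None:
        print("mpi4py is not installed; run this on the cluster")
        return
    comm = MPI.COMM_WORLD   # default communicator grouping every process
    rank = comm.rank        # unique number of this process, starting at 0
    data = "Hello World"    # dummy data printed alongside the rank
    print(hello_line(rank, data))


if __name__ == "__main__":
    main()
```

Run it with the same commands you used to launch the program in the last tutorial.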