Tutorial 1: Hello World

How Does a Cluster Say Hello World?

Part 1:

In most programming languages, the first program written by someone new to the language is a "Hello World" program, a sanity check that the language's resources have been installed successfully. Clustering is much the same. For the sake of simplicity, these tutorials use the Python programming language.

The purpose of this tutorial is to demonstrate that the cluster is communicating properly with the child nodes and to get you familiar with how to execute things on the cluster.

Before using any of the advanced helper functions, we want to become familiar with how the cluster's commands work under the hood, so the only helper function we will use in this tutorial is the sendToCluster filename command.

To begin, we need our "Hello World" program. In Python this is a trivial matter of a single line:

print("Hello World!")

Create a file, which for the sake of simplicity we'll call HelloWorld.py, and put that print statement in it.

Run the command:

python ./HelloWorld.py

This invokes the Python interpreter to execute the script. The output should be as follows.

Hello World!

Part 2:

So that was cool... kind of. A running Python program is something, but simply invoking Python won't actually run the code on the cluster. This is where MPI and MPICH come into play.

So, instead of invoking Python directly, how do we let the message passing interface know that we want to run a Python program on multiple processors? The process is fairly simple, but the command itself can be fairly intimidating.

In order to execute a command on the cluster, the mpirun command should be used.

The full command is as follows:

mpirun.openmpi -np NUMPROCESS -machinefile IPfile python file

For this command there are several required arguments:

-np : The number of processes to launch, one for each processor we want to use.
-machinefile : A file that holds the IP addresses of the child machines. This file has been provided for you and is located at ~/.clusterConfig/nodesips
python file : The interpreter for the language the script is written in, always followed by the script's name.

For our purposes we have twenty available processors, because the cluster has five RPis and each one has four cores available for use. The script we are using is called HelloWorld.py.

So the command we need to execute is:

mpirun.openmpi -np 20 -machinefile ~/.clusterConfig/nodesips python ./HelloWorld.py
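To make the shape of that command easier to see, here is a small sketch that assembles the same argument list in Python. The function name build_mpirun_command is made up for illustration and is not one of the cluster's helper commands. The sketch only builds and prints the command; it does not run it.

```python
import os
import shlex


def build_mpirun_command(num_processes, machinefile, script):
    """Assemble the mpirun argument list (hypothetical helper, for illustration)."""
    return [
        "mpirun.openmpi",
        "-np", str(num_processes),                         # number of processes to launch
        "-machinefile", os.path.expanduser(machinefile),   # file listing the child nodes
        "python", script,                                  # interpreter, then the script
    ]


command = build_mpirun_command(20, "~/.clusterConfig/nodesips", "./HelloWorld.py")
# Print the command as it would be typed at the shell.
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list of separate arguments, rather than one long string, makes it easy to change a single piece such as the process count.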

Part 3:

Hello World!... with an Error?

So... we got an error. After the script ran on four of the twenty available processors, it couldn't be found on the rest. This is because when mpirun is executed it tells all the nodes in the cluster to run the script provided, but it doesn't actually copy the script to them. To remedy this, a helper function has been made available to you to assist in transferring files to all the nodes in the cluster.

This command is sendToCluster scriptname.

Output of sendToCluster
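The tutorial doesn't show how sendToCluster is implemented, so the following is only a sketch of how a helper like it could work, not the real thing. It assumes the machine file lists one address per line (optionally followed by Open MPI's slots=N setting), that files are copied with scp, and that the script goes to the same relative path in each node's home directory. The function names and the IP addresses are made up for illustration. The sketch builds the copy commands without running them.

```python
def read_hosts(machinefile_text):
    """Return the host entries from machine file contents, one per non-empty line."""
    hosts = []
    for line in machinefile_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and surrounding spaces
        if line:
            hosts.append(line.split()[0])      # keep the address, ignore e.g. slots=4
    return hosts


def build_copy_commands(script, hosts):
    """Build one scp command per host (assumption: scp is how files are copied)."""
    return [["scp", script, "{}:{}".format(host, script)] for host in hosts]


# Made-up machine file contents; the real nodesips file may look different.
example_machinefile = """\
# child nodes
192.168.0.11 slots=4
192.168.0.12 slots=4
"""

for copy_command in build_copy_commands("HelloWorld.py", read_hosts(example_machinefile)):
    print(" ".join(copy_command))
```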

So now our script exists on all of our children as well as the master. Let's see what mpirun outputs this time.

Working Output of Hello World!

Excellent! Our cluster sent back the response twenty times, meaning that each processor responded successfully.
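All twenty lines look identical, so the output alone doesn't show which node each one came from. As an optional variation (not part of the original tutorial), the script could also print the name of the machine it is running on, using Python's standard socket module. The file name HelloWorldHost.py is just a suggestion, and like any script it would need to be sent to the nodes with sendToCluster before mpirun can run it.

```python
import socket


def greeting():
    """Return the hello message tagged with the name of the machine running it."""
    return "Hello World! from " + socket.gethostname()


print(greeting())
```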

Part 4:

Now that we have reviewed how mpirun works and what it requires to run properly, we can move on to the simplified cluster commands. These commands were designed to streamline the process of running scripts on the cluster in various configurations for benchmarking purposes.

Take note of the "Cluster Nodes Enabled: #" line at the top of your screen when you log in to the cluster. This value can be set to any number of nodes up to 5, which provides up to 20 processors.

This is adjusted using the setNodeNumber number command.

Node count set to two
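The link between the node count and the number of processors is simple arithmetic: four cores per RPi, multiplied by the number of enabled nodes. A minimal sketch of that calculation follows. The function name is made up for illustration, and the lower limit of one node is an assumption, since the tutorial only states the upper limit of five.

```python
CORES_PER_NODE = 4   # each RPi has four cores
MAX_NODES = 5        # the cluster allows up to 5 nodes


def processors_for(nodes):
    """Return how many processors are available with the given number of nodes."""
    if not 1 <= nodes <= MAX_NODES:   # assumption: at least one node must be enabled
        raise ValueError("node count must be between 1 and {}".format(MAX_NODES))
    return nodes * CORES_PER_NODE


for nodes in range(1, MAX_NODES + 1):
    print(nodes, "node(s):", processors_for(nodes), "processors")
```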

Now that we've set our node count to two, the output of mpirun will be much different: two nodes with four cores each give only 8 instances of "Hello World!" when the next helper command is run.

That command is: runOnCluster file

This is the all-in-one cluster script execution command that we'll be using in future tutorials. It takes the script and sends it to all of the child nodes, much as sendToCluster file does. It then notes the time at the beginning of execution, runs the script, and saves the output, stamped with the total execution time and the number of processors used, to a file called clusterUse.log.
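The timing and logging steps can be sketched in Python as follows. This is only an illustration of the idea, not the actual runOnCluster implementation: the function name run_and_log and the layout of the log line are assumptions, and a plain Python one-liner stands in for the mpirun command so that the sketch runs on any machine. The demo writes to a temporary directory rather than to the real clusterUse.log.

```python
import os
import subprocess
import sys
import tempfile
import time


def run_and_log(command, num_processors, log_path):
    """Run a command, then append its output, run time and processor count to a log."""
    start = time.time()                        # note the time before execution
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    elapsed = time.time() - start              # total execution time in seconds
    with open(log_path, "a") as log:           # append so earlier runs are kept
        log.write(result.stdout)
        # Assumed log layout; the real clusterUse.log may be formatted differently.
        log.write("processors: {}  time: {:.3f}s\n".format(num_processors, elapsed))
    return result.stdout, elapsed


# Stand-in command; on the cluster this would be the mpirun command instead.
demo_log = os.path.join(tempfile.mkdtemp(), "clusterUse.log")
output, seconds = run_and_log([sys.executable, "-c", "print('Hello World!')"], 1, demo_log)
print(output, end="")
```

Recording the processor count next to the run time is what makes the log useful for benchmarking, since the same script can then be compared across different node counts.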