Using IPython

The official documentation is here:

Different ways of using IPython on Guillimin

There are four ways of using IPython on Guillimin:

  • Single node, interactive mode
  • Single node, automatic mode
  • Multiple nodes, interactive mode
  • Multiple nodes, automatic mode

In all cases, there is one controller and multiple engines (1 + n processes). In this documentation, they are all started with the ipcluster command.

[Single, Multiple] - The modules

IPython 0.13.2 is part of the python/2.7.3 module. It also needs ZeroMQ. So, in your ~/.bashrc file, you must include these lines:

module add python/2.7.3
module add zeromq

If you are using multiple nodes, you also need to add an MPI module. The following combination works:

module add gcc/4.5.3
module add openmpi/1.4.3-gcc

TODO: testing with Intel and PGI compilers, testing with OpenMPI 1.6.2 and MVAPICH2.

[Single, Multiple] - Setting up the IPython Parallel profiles

If you run multiple IPython Parallel jobs simultaneously, you need one IPython Parallel profile per job. Otherwise, a single parallel profile is enough, since it is reusable.

Single node

This is how to create the local0 (arbitrary name) profile:

ipython profile create --parallel \
    --profile=local0

The local0 profile will be in profile_local0:

cd ~/.ipython/profile_local0/

Multiple nodes

This is how to create the mpi0 (arbitrary name) profile:

ipython profile create --parallel \
    --profile=mpi0

The mpi0 profile will be in profile_mpi0:

cd ~/.ipython/profile_mpi0/

Since the engines will be launched on one or many worker nodes, the controller must be able to listen for connections from any IP address. So, the following line must be in ipcontroller_config.py:

c.HubFactory.ip = '*'
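
For context, this is roughly what the relevant part of the generated ipcontroller_config.py looks like once the line is added; the rest of the generated file is omitted in this sketch:

# ~/.ipython/profile_local0/ipcontroller_config.py (relevant part only)
c = get_config()

# Engines run on worker nodes and the client on a login node,
# so the controller must accept connections on all interfaces.
c.HubFactory.ip = '*'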

Finally, in the case of profile_mpi0, the IPCluster must launch the engines with MPI. So, the following line must be in ipcluster_config.py:

c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'
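
Once the mpi0 cluster is running (see the sections below), you can check that the engines were really launched as a single MPI job. This is only a minimal sketch, and it assumes that the mpi4py package is available with the loaded Python and MPI modules:

from IPython.parallel import Client

rc = Client(profile='mpi0')
view = rc[:]

# Import mpi4py on every engine and collect each engine's MPI rank.
view.execute('from mpi4py import MPI', block=True)
view.execute('rank = MPI.COMM_WORLD.Get_rank()', block=True)
print(view['rank'])   # one rank per engine, e.g. [0, 1, 2, ...]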

[Single] - The PBS script for a single node

Here is the content of the PBS script for each type of execution:

Interactive

#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=00:10:00
#PBS -V
#PBS -N IPython

Note: there will be no output file and no error file.

Automatic

#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=00:10:00
#PBS -j oe
#PBS -V
#PBS -N IPython

ipcluster start --n=12 --profile=local0

In both cases, ppn=12 means that a full node is reserved, so the job has no impact on other users. However, if only 3 engines are needed, the PBS script should reserve ppn=4, because the IPController counts as one additional process.

Here is how to submit the job for each type of execution:

Interactive

qsub -I -q sw script.sh

Wait for the interactive shell. Then, launch ipcluster:

ipcluster start --n=12 --profile=local0

Automatic

qsub -q sw script.sh

Wait for the job to start. Use showq -u <username>. The ipcluster command will be executed automatically.

ipcluster needs some time to get ready. In interactive mode, you will see the message "Engines appear to have started successfully". In automatic mode, you have to redirect (>) the output of ipcluster to a file and monitor that file.
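
In automatic mode, instead of watching the redirected log file by hand, you can also poll the controller from a Python script on the login node until all engines have registered. This is only a minimal sketch; the profile name and the expected number of engines must match your own job:

import time
from IPython.parallel import Client

expected = 12   # must match the --n value passed to ipcluster

# Retry until the controller has written its connection file and is reachable.
while True:
    try:
        rc = Client(profile='local0')
        break
    except Exception:
        time.sleep(5)

# Then wait until all the engines have registered with the controller.
while len(rc.ids) < expected:
    time.sleep(5)

print('%d engines ready' % len(rc.ids))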

[Multiple] - The PBS script for multiple nodes

Here is the content of the PBS script for each type of execution:

Interactive

#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=00:10:00
#PBS -V
#PBS -N IPythonMPI0

Note: there will be no output file and no error file.

Automatic

#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=00:10:00
#PBS -j oe
#PBS -V
#PBS -N IPythonMPI0

ipcluster start --n=24 --profile=mpi0

In this example, 2 full nodes will be reserved. The IPController will run on the first node reserved by Moab.

Here is how to submit the job for each type of execution:

Interactive

qsub -I -q hb script.sh

Wait for the interactive shell. Then, launch ipcluster:

ipcluster start --n=24 --profile=mpi0

Automatic

qsub -q hb script.sh

Wait for the job to start. Use showq -u <username>. The ipcluster command will be executed automatically.

As in the single-node case, ipcluster needs some time to get ready. In interactive mode, you will see the message "Engines appear to have started successfully"; in automatic mode, redirect (>) the output of ipcluster to a file and monitor that file.

[Single, Multiple] - Using IPython Parallel

Connect to Guillimin with a second terminal. On the login node, launch ipython with the appropriate profile:

Single node

ipython --profile=local0

Multiple nodes

ipython --profile=mpi0

To get access to the engines:

In [1]: from IPython.parallel import Client

In [2]: c = Client()

In [3]: c.ids

To get the host name of each engine:

In [4]: with c[:].sync_imports(): import socket

In [5]: %px print(socket.gethostname())

To exit IPython:

In [6]: exit
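
Beyond checking host names, the same client can be used to run an actual computation on the engines. Here is a minimal sketch using a DirectView; the square function and the input range are arbitrary examples:

from IPython.parallel import Client

rc = Client(profile='local0')   # or profile='mpi0'
dview = rc[:]                   # DirectView addressing all engines

def square(x):
    return x * x

# map_sync splits the inputs across the engines and blocks until all results are back.
results = dview.map_sync(square, range(24))
print(results[:6])   # [0, 1, 4, 9, 16, 25]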

[Single, Multiple] - Stopping the job

Interactive

Use Ctrl+C to kill ipcluster. Then, enter exit to stop the job.

Automatic

Use canceljob to kill the remote processes and stop the job.

Conclusion

The interactive mode tends to give a cleaner job that is easier to monitor than the automatic mode. On the other hand, the automatic mode does not require an open terminal waiting for the job to start. The interactive mode is the recommended method.