© Copyright 1994
Maui High Performance Computing Center.
Parallel Operating Environment Overview
- What is the Parallel Operating Environment?
- Understanding the Parallel Operating Environment
- Partition
- Partition Manager
- Resource Manager
- Executing Parallel Programs Using the POE
- Set your Path
- Compiling and Linking a Parallel Program
- Fortran (mpxlf)
- C (mpcc)
- Setting up the Execution Environment
- Environment Variables
- Some Preset POE Environments
- Creating a .rhosts File
- Creating a host.list File
- Invoking the Executable
- Recommendations for Running on the SP2
- Miscellaneous Environment Variables
- Acknowledgements and References
What is the Parallel Operating Environment?
- The Parallel Operating Environment (POE) is IBM's environment for
developing and running distributed-memory parallel Fortran, C, or C++
programs. It runs on the IBM RS/6000 platform under AIX.
- The POE consists of components for developing, executing, debugging,
profiling, and tuning parallel programs:
- Parallel Compiler Scripts
- PE Environment Variables
- Program Marker Array
- System Status Array
- Parallel Debugging Facility
- Parallel Profiling Capability
Understanding the Parallel Operating Environment
- Partition:
- The group of processor nodes on
which you run your program is called your partition.
- Partition Manager:
- The Partition Manager establishes and controls your partition.
It consists of a set of subroutines that are linked into your program
and an Internet daemon process called pmd. The Partition
Manager is responsible for:
- Requesting the nodes for your parallel job
- Acquiring the nodes needed for your parallel job (if the Resource
Manager is not used)
- Copying the executables from the initiating node to each node in your
partition
- Loading the executable on each node in your partition
- Setting up stdout and stdin
- Resource Manager:
- The Resource Manager keeps track of the nodes that are
currently processing a parallel task. When requested by the
Partition Manager, the Resource Manager will allocate
nodes for your use. It attempts to enforce a "one parallel task
per node" rule.
- Processor Pool
- The local systems administrator will divide the processor nodes into
disjoint pools of processors for management purposes. You may request
that your parallel tasks run on specific pools.
You may determine the available pools by typing:
jm_status -P
- User Space Protocol
- A fast method of communication between nodes. It can be used only
with the high-performance switch.
- Internet Protocol
- A slower method of communication between nodes. It can be used with
Ethernet or the high-performance switch.
- Communication SubSystem (CSS)
- The Communication SubSystem (CSS) is the set of library routines
which implement either of the above communication protocols.
Executing Parallel Programs Using the POE
To execute a parallel program, you need to:
- Compile and link the program using shell scripts that call the C or
Fortran compilers and link in the Partition Manager and MPL
subroutines.
- Set up your execution environment. This includes setting the number
of tasks and specifying the method of node allocation.
- Optional: start either of the POE X-Windows analysis tools:
- Program Marker Array
- System Status Array
- Load and execute the parallel program on the processor nodes of your
partition. You can either:
- Load the same executable on all nodes; this is the normal procedure
for SPMD programs.
- Individually load separate executables onto the nodes; this is the
normal procedure for MPMD programs.
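Set your Path
Before you execute any POE commands, make sure that your path includes
/usr/lpp/poe/bin. You can do this by typing one of the following at the
Unix prompt, or by adding it to the startup file for your shell (.cshrc
or .login for the C shell, .profile for the Korn shell):
C Shell: set path = ($path /usr/lpp/poe/bin)
Korn Shell: export PATH=$PATH:/usr/lpp/poe/bin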
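If the path is set in a startup file that may be read more than once, it can help to add the directory only when it is missing. The following is a minimal sketch for the Korn shell (it should also work in other POSIX shells); the variable name POE_BIN is just an illustrative choice.

```shell
# Directory that holds the POE commands (taken from the text above)
POE_BIN=/usr/lpp/poe/bin

# Append POE_BIN to PATH only if it is not already present
case ":$PATH:" in
  *":$POE_BIN:"*) ;;                        # already in the path; nothing to do
  *) PATH="$PATH:$POE_BIN"; export PATH ;;
esac

echo "$PATH"
```

Running the fragment a second time leaves PATH unchanged, so it is safe to keep in .profile.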
Compiling and Linking a Parallel Program
- mpxlf and mpcc invoke the corresponding compilers (xlf and cc) for
compilation and link in the Partition Manager initialization routines.
See the mpxlf and mpcc man pages for more detailed information.
- Fortran:
mpxlf options {-us,-ip} filename
- C:
mpcc options {-us,-ip} filename
- -ip causes the IP CSS library to be statically bound with the
executable. Communication during execution will use the Internet
Protocol.
- -us causes the US CSS library to be statically bound with the
executable. This CSS library uses the US protocol for dedicated
use of the high-performance switch. It allows you to drive the
switch adaptor directly from your parallel tasks.
- If neither flag is set, a CSS library is dynamically linked with the
executable at run time. The library that is linked is determined by
the MP_EUILIB environment variable.
- options can be any of the options normally accepted by the xlf or cc
compiler, including:
- -v produces a "verbose" listing of the mpxlf or mpcc shell
script
- -g compiles the program for debugging with pdbx or xpdbx. This
option is also necessary to use the Source Code view in the
Visualization Tool.
- -c compiles only (does not link the object files)
- -o exename names the executable to
exename
- -l (note: this is lower case L) names additional libraries
to be searched. Several libraries are automatically included.
See the mpxlf and mpcc man pages for more detailed
information.
- -L<pathname> places <pathname> into the library search path.
Directories are searched in the order of their occurrence on the
command line.
- -p permits profiling with prof or gprof
- -I (note: this is upper case i) names directories for
additional include files. /usr/lpp/poe/include
is automatically included.
- -O optimizes the generated code.
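As an illustration of how these options combine, the sketch below builds two compile command lines: an optimized build that leaves the CSS library to be chosen at run time, and a debug build. The file names hello.c, hello, and hello_dbg are hypothetical, and the commands are executed only if mpcc and the source file are actually present.

```shell
# Hypothetical source file name
SRC=hello.c

# Optimized build; neither -us nor -ip is given, so the CSS library is
# chosen at run time through MP_EUILIB
BUILD_OPT="mpcc -O -o hello $SRC"

# Debug build for use with pdbx or xpdbx
BUILD_DBG="mpcc -g -o hello_dbg $SRC"

for cmd in "$BUILD_OPT" "$BUILD_DBG"; do
  if command -v mpcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    $cmd                      # run the compile on a system that has POE
  else
    echo "would run: $cmd"    # otherwise just show the command line
  fi
done
```

A Fortran build would follow the same pattern with mpxlf in place of mpcc.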
Setting up the Execution Environment
Environment Variables
There are many environment variables and command line flags that you can
set to influence the operation of the PE tools and the execution of
parallel programs. A complete list of the PE environment variables can be
found in the IBM AIX Parallel Environment Operation and Use manual. Some
that critically affect program execution are:
- MP_PROCS specifies the number of program tasks. Because the Resource
Manager tries to run one parallel task per node, this is normally also
the number of nodes allocated for your partition.
- MP_RESD specifies whether or not the Partition Manager should
connect to the POWERparallel system Resource Manager to allocate
nodes. Can be either:
- yes
- no
- MP_HOSTFILE specifies the name of a host file for node
allocation.
- Can be any file name, NULL or ""
- The default host list file is host.list in
the current directory. You do not need to set MP_HOSTFILE if
your host list file is the default host.list.
- You must specify a host list file
if you:
- need specific node allocation
- request non-specific node allocation from a number of system
pools.
- use a host list file named something other than the default
host.list.
- MP_EUILIB Specifies which CSS library implementation to use for
communication. Can be either:
- ip The Internet Protocol CSS.
- us The User Space CSS; lets you drive the high-performance
switch directly from your parallel tasks without going through the
kernel or operating system; use this library on the MHPCC SP2.
- MP_EUIDEVICE The adaptor set to use for message passing; either
Ethernet, FDDI, token-ring, or the high-performance switch adaptor
(note: this variable is ignored if the US CSS library is used).
- en0 Ethernet
- fi0 FDDI
- tr0 token-ring
- css0 high-performance switch; use this on the MHPCC SP2.
- MP_RMPOOL The number of the POWERparallel system pool that
should be used by the Resource Manager for non-specific node
allocation. This is only valid if you are using the
POWERparallel Resource Manager for non-specific node
allocation (from a single pool) without a host list file. You
may obtain information about available pools by typing
"jm_status -P" at the Unix prompt.
- SP_NAME Specifies the name of the Control Workstation. This
variable is used by the System Status Array tool.
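Putting several of these variables together, a Korn shell setup for non-specific node allocation through the Resource Manager might look like the sketch below. The task count (4) and pool number (0) are placeholder values, not MHPCC settings; use jm_status -P to see which pools actually exist.

```shell
# Number of parallel tasks (placeholder value)
export MP_PROCS=4
# Let the Partition Manager ask the Resource Manager for nodes
export MP_RESD=yes
# No host list file: non-specific allocation from a single pool
export MP_HOSTFILE=NULL
# Pool to allocate from (placeholder value)
export MP_RMPOOL=0
# User Space CSS over the high-performance switch
export MP_EUILIB=us
# Adaptor set; ignored while MP_EUILIB is us
export MP_EUIDEVICE=css0

# Show what was set
env | grep '^MP_' | sort
```

C shell users would write each line as setenv NAME value instead of export NAME=value.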
Some Preset POE Environments
The following four sets of POE environment variables should cover most users'
needs and are provided to help simplify getting started:
Invoking the Executable
Once the environment is set up and the executables are created, invoking
the executables is relatively easy.
- Start any X-Windows analysis tools (e.g., pmarray) before the
executable is invoked.
- For SPMD programs, simply issue the name of the executable,
specifying any command line flags that may be required. Command
line flags may be used to temporarily override any MP environment
variables that have been set.
- For MPMD programs, execution begins once you have loaded the
partition using poe.
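For an SPMD program this amounts to running the executable by name. The sketch below assumes a hypothetical executable called hello in the current directory and uses the -procs flag, which is assumed here to be the command line counterpart of MP_PROCS; check the poe man page on your system for the flags it accepts.

```shell
# Hypothetical SPMD executable built with mpcc or mpxlf
EXE=./hello

# Default task count for this login session
export MP_PROCS=4

# Override the task count for this run only (assumed flag: -procs)
RUN="$EXE -procs 8"

if [ -x "$EXE" ]; then
  $RUN                       # start the parallel job
else
  echo "would run: $RUN"     # executable not present; show the command
fi
```

The environment variable keeps its value after the run; only the single invocation uses the overriding flag.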
Recommendations for Running on the SP2
- Use dynamic linking of the communication libraries instead of static
linking.
- During program development, use the IP communication libraries over
the switch.
- In production mode, use US communication over the switch.
- If you are using User Space communications and not enough nodes are
available, use IP communications instead.
- Write scratch files to local scratch space whenever possible. At the
MHPCC, each node has a /localscratch directory; it is faster than
writing scratch files to your home directory.
- Use non-specific node allocation (currently just one pool).
- Inform the Resource Manager about your job.
- Set MP_PROCS to the number of nodes you wish to use.
- Start the executable from an SP2 node.
- During program development it may be helpful to set MP_INFOLEVEL to
a higher value (5) and to run with MP_EUILIB set to ip.
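Several of these recommendations come down to switching a few variables between development and production runs. One way to do that is a small Korn shell function, sketched below. The function name poe_mode is made up for this example, and unsetting MP_INFOLEVEL in production mode is an assumption that the system default level is acceptable.

```shell
# Usage: poe_mode development | production
poe_mode() {
  case "$1" in
    development)
      export MP_EUILIB=ip         # IP communication ...
      export MP_EUIDEVICE=css0    # ... over the high-performance switch
      export MP_INFOLEVEL=5       # extra diagnostics while debugging
      ;;
    production)
      export MP_EUILIB=us         # User Space communication over the switch
      unset MP_INFOLEVEL          # assumption: fall back to the default level
      ;;
    *)
      echo "usage: poe_mode development|production" >&2
      return 1
      ;;
  esac
}

poe_mode development
echo "MP_EUILIB=$MP_EUILIB MP_INFOLEVEL=$MP_INFOLEVEL"
```

Because the executable is linked dynamically, switching modes this way does not require recompiling.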
Miscellaneous Environment Variables
- MP_RETRY The period of time (in seconds) between processor node
allocation retries if there are not enough processor nodes immediately
available.
C Shell: setenv MP_RETRY 10
Korn Shell: export MP_RETRY=10
- MP_RETRYCOUNT The number of times that the Partition Manager
should attempt to allocate processor nodes before returning without
running your program.
C Shell: setenv MP_RETRYCOUNT 15
Korn Shell: export MP_RETRYCOUNT=15
- MP_SAVEHOSTFILE The name of an output host list file to be
generated by the Partition Manager.
C Shell: setenv MP_SAVEHOSTFILE progname.hosts.used
Korn Shell: export MP_SAVEHOSTFILE=progname.hosts.used
- MP_EUIDEVELOP Causes MPL to do more detailed checking during
program execution.
C Shell: setenv MP_EUIDEVELOP yes
Korn Shell: export MP_EUIDEVELOP=yes
- MP_STDOUTMODE Enables you to manage the STDOUT from your
parallel tasks. Set it to ordered, unordered, or the number of a
single task whose output you want to see (task 6 in the last example
below).
C Shell: setenv MP_STDOUTMODE ordered
Korn Shell: export MP_STDOUTMODE=ordered
-or-
C Shell: setenv MP_STDOUTMODE unordered
Korn Shell: export MP_STDOUTMODE=unordered
-or-
C Shell: setenv MP_STDOUTMODE 6
Korn Shell: export MP_STDOUTMODE=6
- MP_INFOLEVEL The amount of diagnostic information displayed as
your program runs; set to an integer between 0 and 5 (a lower value
means less diagnostic information).
C Shell: setenv MP_INFOLEVEL 2
Korn Shell: export MP_INFOLEVEL=2
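MP_RETRY and MP_RETRYCOUNT together bound how long the Partition Manager keeps trying to obtain nodes. Assuming MP_RETRY is measured in seconds, the values used in the examples above give roughly the following upper limit:

```shell
export MP_RETRY=10         # wait between allocation attempts
export MP_RETRYCOUNT=15    # number of attempts before giving up

# Approximate total time spent retrying, ignoring the time each attempt takes
MAX_WAIT=$((MP_RETRY * MP_RETRYCOUNT))
echo "will retry for about $MAX_WAIT seconds"
```

With these settings the job is abandoned after about two and a half minutes if nodes never become available.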
"IBM AIX Parallel Environment Operation and Use, Release 1.0". IBM
Corporation.
"IBM AIX Parallel Environment Operation and Use, Release 2.0". IBM
Corporation.
We gratefully acknowledge the IBM Corporation for providing much of the
original material included in this document.
© Copyright 1994, Maui High Performance Computing Center.
All rights reserved.
Documents located on the Maui High Performance Computing Center's WWW server
are copyrighted by the MHPCC. Educational institutions are encouraged to
reproduce and distribute these materials for educational use as long as
credit and notification are provided. Please retain this copyright notice
and include this statement with any copies that you make. Also, the MHPCC
requests that you send notification of their use to help@mail.mhpcc.edu.
Commercial use of these materials is prohibited without prior written
permission.
Last revised: 2/13/95 Blaise Barney