HP Fortran
Release Notes for Tru64 UNIX Systems



1.8.3 Version 5.3 ECO 01 HPF New Features

The following information pertains to HPF using MPI.

Overview of HPF and MPI

The Compaq Fortran compiler now generates code that uses MPI as its message-passing library instead of PSE's HPF-specific support. The compiler provides a choice of three variants of MPI: one for Compaq's SC supercomputer systems, one that supports shared-memory and Memory Channel interconnects, and a public-domain MPI for other interconnects, including Ethernet and FDDI.

It is now possible to write HPF programs that also call or use MPI (such as distributed-memory libraries that invoke MPI). The compiler's MPI runtime library uses its own private MPI communicator, so it does not interfere with other MPI code. A new example program, /usr/examples/hpf/call_mpi.f90, illustrates this.
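The usual pattern is to make the MPI calls from inside an EXTRINSIC (HPF_LOCAL) routine, where each peer runs ordinary local code. The following is only a minimal sketch of that idea, not the shipped call_mpi.f90 example; the routine name is illustrative, and the sketch assumes the caller declares it with an explicit EXTRINSIC (HPF_LOCAL) interface and that the HPF runtime has already initialized MPI (it checks with MPI_INITIALIZED to be safe).


! Sketch only -- not the shipped call_mpi.f90 example.
      EXTRINSIC(HPF_LOCAL) SUBROUTINE report_rank()
      INCLUDE 'mpif.h'
      INTEGER :: rank, ierr
      LOGICAL :: flag

      ! The HPF runtime normally initializes MPI already; check first.
      CALL MPI_INITIALIZED(flag, ierr)
      IF (.NOT. flag) CALL MPI_INIT(ierr)

      ! MPI_COMM_WORLD can be used freely here because the HPF runtime
      ! communicates over its own private communicator.
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

      ! Note: output from peers other than peer 0 may be discarded,
      ! depending on the MPI variant (see "Changing HPF Programs for
      ! MPI" later in this section).
      WRITE (*,*) 'Hello from MPI rank ', rank
      END SUBROUTINE report_rank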

You enable the new MPI-based runtime library, which supports Compaq Fortran's HPF directives, by adding the -wsf_target option. This option, which requires an argument, must appear on both the compile and link commands.

Compiling HPF Programs for MPI

You must now specify which variant of MPI support you want for HPF programs by including the -wsf_target option with an MPI selection (the target argument) in the f90 compile command. The following example selects Compaq MPI:


% f90 -wsf 2 -wsf_target cmpi -c lu.f90

The following expanded example invokes both the compiler and the linker:


% f90 -wsf 2 -wsf_target cmpi -o lu lu.f90

The values of target for the -wsf_target target option are as follows:

target  Explanation

smpi    SC (Quadrics) MPI
        This MPI comes installed on SC-series systems. It works with the SC's RMS software, which provides a set of commands for launching MPI jobs, scheduling those jobs on SC clusters, and performing other miscellaneous tasks.

cmpi    Compaq MPI
        This MPI is a version specifically tuned for Alpha systems and is distributed as a Compaq layered product. Compaq MPI supports only Memory Channel clusters and shared-memory (SMP) machines.

gmpi    Generic MPI
        This target is for use with MPICH V1.2.0 or other compatible libraries. MPICH is a public-domain implementation of the MPI specification, available for many platforms from http://www-unix.mcs.anl.gov/mpi/mpich/. MPICH V1.2.0 supports many interconnection networks, including Ethernet, FDDI, and other hardware. Using Compaq Fortran and HPF with this MPI is not officially supported, and Compaq does not guarantee to fix problems caused by specifying -wsf_target gmpi; however, Compaq remains interested in receiving problem reports and will attempt to respond to them.

If the command to the f90 compiler includes -wsf_target target, then the command must also include -wsf.

Instead of using the -wsf_target option, you can specify the MPI variant to the compiler by setting the environment variable DECF90_WSF_TARGET to one of the values in the first column of the previous table. For example, the command


% f90 -wsf 2 -wsf_target cmpi -c lu.f90

is equivalent to the commands


% setenv DECF90_WSF_TARGET cmpi
% f90 -wsf 2 -c lu.f90

If an f90 command contains -wsf_target with a value (such as cmpi) and the environment variable DECF90_WSF_TARGET is set to a different value, the value on the f90 command line overrides the environment variable.

Using the environment variable to select the desired MPI variant is the recommended method: it requires the fewest changes to existing scripts for building HPF programs and makes it easier to generate code for more than one MPI variant. Compaq also recommends setting the environment variable in your shell initialization file (for example, .cshrc if you use csh), particularly if you usually use only one MPI variant.

The following table shows all changes to HPF-related compiler options between Fortran V5.3 and V5.3 ECO 01:

Fortran V5.3          Fortran V5.3 ECO 01
-assume bigarrays     No change
-assume nozsize       No change
-hpf_matmul           Deleted
-nearest_neighbor     No change
-nowsf_main           No change (but currently does not work)
-pprof                Use only with -wsf_target pse
-show hpf*            No change
-show wsfinfo         No change
-wsf                  No change
(none)                -wsf_target target (new option)

Linking HPF Programs with MPI

You must also specify which variant of MPI support you want for HPF programs by including the -wsf_target option with an MPI selection (the target argument) in the link command. For example:


% f90 -wsf 2 -wsf_target cmpi -o lu lu.o

The values of target come from the table in the section "Compiling HPF Programs for MPI".

If you specified generic MPI at compilation time, either by including the -wsf_target gmpi option or by setting the environment variable DECF90_WSF_TARGET to gmpi, you must specify a path to the desired generic MPI library during linking. Do this in one of these ways:

The following is an example of a link command for a generic MPI library:


% f90 -wsf 2 -wsf_target gmpi -o lu lu.o /usr/users/me/libmpich.a

In addition, the Developer's Tool Kit software must be installed on your computer to link properly with the -wsf_target gmpi option.

Finally, programs linked with -wsf_target and an MPI target must be linked with -call_shared (which is the default); the -non_shared option does not link correctly.

Running HPF Programs Linked with MPI

The dmpirun command executes program files created with the -wsf_target cmpi option. Include the -n n option on the command line, where n is the value given with -wsf n on the compilation command line; if no value was given with -wsf, set n to the desired number of peers. Also include the name of the program file.

The following example assumes that the compilation command line included -wsf 4 and that the program file is named heat8:


% dmpirun -n 4 heat8

If your AlphaServer SC system is running with Revision A of the Quadrics switch, your boot log will contain the message:


  elan0: Rev A Elite network detected - disabling adaptive routing (1) 

To make MPI programs (including HPF programs generated with the "-wsf_target smpi" option) run properly with Revision A hardware, you need to set the LIBELAN_GROUP_HWBCAST environment variable to DISABLE; for example, from csh:


% setenv LIBELAN_GROUP_HWBCAST DISABLE

The dmpirun manpage contains a full description of this command.

The prun command executes program files created with the -wsf_target smpi option. Include the -n n option on the command line, where n is the value given with -wsf n on the compilation command line; if no value was given with -wsf, set n to the desired number of peers. Also include the name of the program file.

The following example assumes that the compilation command line included -wsf 4 and that the program file is named heat8:


% prun -n 4 -N 4 heat8

The mpirun command executes program files created with the -wsf_target gmpi option. Include the -np n option on the command line, where n is the value given with -wsf n on the compilation command line. Also include the name of the program file. The location of the mpirun command varies according to where you installed the generic MPI.

The following example assumes that the compilation command line included -wsf 4 and that the program file is named heat8:


% /usr/users/me/mpirun -np 4 heat8

The /usr/examples/hpf directory contains a sample script, hpfrun, that launches an HPF program for any variant of MPI. The script even determines the number of processors a source program was compiled for (if that was specified at compile time) and invokes the proper MPI run command with that number of processors. Portions of the script, or the entire script, may be useful if you are automating the building and running of HPF programs.

Cleaning up After Running HPF Programs Linked with MPI

Execution of the dmpirun command (but not the prun and mpirun commands) may leave various system resources allocated after the program has completed. To free them, run the mpiclean command with no arguments:


% mpiclean

Changing HPF Programs for MPI

There are two changes you should make to Fortran source files before compiling them for MPI. First, if a module contains an EXTRINSIC (HPF_LOCAL) procedure and it executes on a peer other than peer 0, output intended for stdout may (depending on the variant of MPI used) go to /dev/null instead. Change such modules, or your execution commands, so that the extrinsic subroutine performs input/output only from peer 0.
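For example (a minimal sketch, assuming that MY_PROCESSOR() from HPF_LOCAL_LIBRARY numbers peers starting at 0 and that the caller declares the routine with an explicit EXTRINSIC (HPF_LOCAL) interface), a local routine can guard its output so that only peer 0 writes to stdout:


      EXTRINSIC(HPF_LOCAL) SUBROUTINE report_progress(step)
      USE HPF_LOCAL_LIBRARY            ! provides MY_PROCESSOR()
      INTEGER, INTENT(IN) :: step

      ! Only peer 0 writes to stdout; output from other peers may be
      ! discarded (sent to /dev/null), depending on the MPI variant.
      IF (MY_PROCESSOR() == 0) THEN
         WRITE (*,*) 'Completed step ', step
      END IF
      END SUBROUTINE report_progress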

In addition, the ability to call parallel HPF subprograms from non-parallel (Fortran or non-Fortran) main programs is not supported in this release. For more information, see Chapter 6 of the DIGITAL High Performance Fortran 90 HPF and PSE Manual.

1.8.4 Version 5.3 New Features

The following new Compaq Fortran features are now supported:

1.8.5 Version 5.3 Important Information

Some important information to note about this release:

1.8.6 Version 5.3 Corrections

From version X5.2-829-4296F ECO 01 to FT1 T5.3-860-4498G, the following corrections have been made:

From version FT1 T5.3-860-4498G to FT2 T5.3-893-4499U, the following corrections have been made:

From version FT2 T5.3-893-4499U to V5.3-915-449BB, the following corrections have been made:

1.8.7 HPF in Compaq Fortran Version 5.3

As in Fortran 90 Version 5.2, the HPFLIBS subset replaces the old PSESHPF subset. If you previously installed the PSESHPF subset, you do not need to delete it. If you choose to delete it, do so before you install the Fortran 90 V5.3 HPFLIBS170 subset. If you delete the PSESHPF subset after you install the Fortran HPFLIBS170 subset, you must delete the HPFLIBS170 subset and then reinstall it. For information on using the setld command to check for and delete subsets, see the Compaq Fortran Installation Guide for Tru64 UNIX Systems.

To execute HPF programs compiled with the -wsf switch, you must have both PSE160 and Fortran 90 Version 5.3 with the HPFLIBS170 subset installed. For this release the installation order is important: install PSE160 first, then install Fortran 90 Version 5.3 with the HPFLIBS170 subset. The HPFLIBS170 subset must be installed last.

If you also need to use the latest versions of MPI and PVM, you must install PSE180. PSE180 contains only MPI and PVM support; support for HPF programs compiled with the -wsf option is found only in PSE160. Therefore you must install both versions of PSE, and you must install PSE180 after PSE160.

To install Compaq Fortran with HPF, MPI, and PVM, install the components in the following order. The order is very important.

  1. Delete any old versions that you wish to delete.
  2. Install PSE160.
  3. Install Compaq Fortran Version 5.3 including the HPFLIBS170 subset.
  4. Install PSE180.

The HPF runtime libraries in Compaq Fortran Version 5.3 are compatible only with PSE Version 1.6. Programs compiled with this version will not run correctly with older versions of PSE. In addition, code compiled with older compilers will no longer run correctly when linked with code compiled with this version. Relinking is not sufficient; programs must be recompiled and relinked.

If you cannot install these in the order described, follow these directions to correct the installation:

For more information about installing PSE160, see the Compaq Parallel Software Environment Release Notes, Version 1.6.

For more information about installing PSE180, see the Compaq Parallel Software Environment Release Notes, Version 1.8.

1.8.8 Version 5.3 Known Problems

The following known problems exist with Compaq Fortran Version 5.3:

1.9 New Features, Corrections, and Known Problems in Version 5.2

Version 5.2 is a minor release that includes corrections to problems discovered since Version 5.1 was released and certain new features.

The following topics are discussed:

1.9.1 Version 5.2 ECO 01 New Features

The following new Compaq Fortran (DIGITAL Fortran 90) features are now supported:

Some important information to note about this release:

From version V5.2-705-428BH to X5.2-829-4296F, the following corrections have been made:

1.9.2 Version 5.2 New Features

Version 5.2 supports the following new features:

1.9.3 Version 5.2 Important Information

Some important information to note about this release:

1.9.4 Version 5.2 Corrections

From version V5.1-594-3882K to FT1 T5.2-682-4289P, the following corrections have been made:

From version FT1 T5.2-682-4289P to FT2 T5.2-695-428AU, the following corrections have been made:

From version FT2 T5.2-695-428AU to V5.2-705-428BH, the following corrections have been made:

1.10 High Performance Fortran (HPF) Support in Version 5.2

Compaq Fortran (DIGITAL Fortran 90) Version 5.2 supports the entire High Performance Fortran (HPF) Version 2.0 specification with the following exceptions:

In addition, the compiler supports many HPF Version 2.0 approved extensions including:

1.10.1 Optimization

This section contains release notes relevant to increasing code performance. You should also refer to Chapter 7 of the DIGITAL High Performance Fortran 90 HPF and PSE Manual for more detail.

1.10.1.1 The -fast Compile-Time Option

To get optimal performance from the compiler, use the -fast option if possible.

Use of the -fast option is not permitted in certain cases, such as programs with zero-sized data objects or with very small nearest-neighbor arrays.

For More Information:

1.10.1.2 Non-Parallel Execution of Code

The following constructs are not handled in parallel:

If an expression contains a non-parallel construct, the entire statement containing the expression is executed in a non-parallel fashion. Using such constructs can degrade performance. Compaq recommends avoiding constructs to which the above conditions apply in the computationally intensive kernel of a routine or program.

1.10.1.3 INDEPENDENT DO Loops Currently Parallelized

Not all INDEPENDENT DO loops are currently parallelized. Use the -show hpf or -show hpf_indep compile-time option, which reports a message whenever a loop marked INDEPENDENT is not parallelized.

Currently, a nest of INDEPENDENT DO loops is parallelized whenever the following conditions are met:

When the entire loop nest is encapsulated in an ON HOME RESIDENT region, only the first two restrictions apply.

For More Information:

1.10.1.4 Nearest-Neighbor Optimization

The following is a list of conditions that must be satisfied in an array assignment, FORALL statement, or INDEPENDENT DO loop in order to take advantage of the nearest-neighbor optimization:

Compile with the -show hpf or -show hpf_nearest switch to see which lines are treated as nearest-neighbor.
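As an illustration, the following sketch (array names and sizes are illustrative) shows the kind of statement the optimization targets: a BLOCK-distributed array updated from elements at most one position away in each dimension. Use the -show hpf_nearest output to confirm which lines the compiler actually treats as nearest-neighbor.


      PROGRAM jacobi_sketch
      INTEGER, PARAMETER :: n = 512
      INTEGER :: i, j
      REAL, DIMENSION(n,n) :: u, unew
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: u, unew

      u = 0.0
      u(1,:) = 1.0                  ! boundary condition

      ! Each interior element is updated from its four immediate
      ! neighbors -- a classic nearest-neighbor (stencil) computation.
      FORALL (i=2:n-1, j=2:n-1)  unew(i,j) =                           &
         0.25 * (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))

      PRINT *, unew(n/2, n/2)
      END PROGRAM jacobi_sketch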

Nearest-neighbor communications are not profiled by the pprof profiler. See the section about the pprof Profile Analysis Tool in the Parallel Software Environment (PSE) Version 1.6 release notes.

For More Information:

1.10.1.5 Widths Given with the SHADOW Directive Should Agree with Automatically Generated Widths

When compiler-determined shadow widths do not agree with the widths given in the SHADOW directive, less efficient code is usually generated.

To avoid this problem, create a version of your program without the SHADOW directive and compile it with the -show hpf or -show hpf_nearest option. The compiler will generate messages that include the sizes of the compiler-determined shadow widths. Make sure that any widths you specify with the SHADOW directive match the compiler-generated widths.
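For example, in the following sketch (illustrative names; verify the widths against the -show output for your own program), the kernel reads one element on each side of a, so a SHADOW width of 1 per side should agree with what the compiler determines:


      PROGRAM shadow_sketch
      INTEGER, PARAMETER :: n = 1000
      INTEGER :: i
      REAL, DIMENSION(n) :: a, b
!HPF$ DISTRIBUTE (BLOCK) :: a, b
      ! The kernel below reads a(i-1) and a(i+1), so one shadow element
      ! on each side matches the compiler-determined widths.
!HPF$ SHADOW a(1:1)

      a = 1.0
      b = 0.0
      FORALL (i=2:n-1)  b(i) = (a(i-1) + a(i+1)) / 2.0
      PRINT *, b(n/2)
      END PROGRAM shadow_sketch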

1.10.1.6 Using EOSHIFT Intrinsic for Nearest Neighbor Calculations

In the current version, the compiler does not always recognize nearest-neighbor calculations coded using EOSHIFT. Also, EOSHIFT is sometimes converted into a series of statements, only some of which may be eligible for the nearest-neighbor optimization.

To avoid these problems, Compaq recommends using CSHIFT or FORALL instead of EOSHIFT if these alternatives meet the needs of your program.
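For instance, an interior stencil written with EOSHIFT can often be rewritten as a FORALL (or, for periodic boundaries, with CSHIFT). The sketch below is illustrative only; note that the boundary handling of the FORALL form differs from the zero-fill behavior of EOSHIFT.


      PROGRAM shift_sketch
      INTEGER, PARAMETER :: n = 512
      INTEGER :: i
      REAL, DIMENSION(n) :: a, b
!HPF$ DISTRIBUTE (BLOCK) :: a, b

      a = 1.0
      b = 0.0

      ! EOSHIFT form; may not be recognized as nearest-neighbor:
      !    b = 0.5 * (EOSHIFT(a, -1) + EOSHIFT(a, 1))

      ! FORALL form over the interior, which the compiler can treat
      ! as a nearest-neighbor computation:
      FORALL (i=2:n-1)  b(i) = 0.5 * (a(i-1) + a(i+1))

      PRINT *, SUM(b)
      END PROGRAM shift_sketch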

1.10.2 New Features

This section describes the new HPF features in this release of Compaq Fortran.

1.10.2.1 RANDOM_NUMBER Executes in Parallel

The RANDOM_NUMBER intrinsic subroutine now executes in parallel for mapped data. The result is a significant decrease in execution time.
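For example, filling a mapped array now runs in parallel (a minimal sketch; the names and distribution shown are illustrative):


      PROGRAM random_sketch
      INTEGER, PARAMETER :: n = 1000000
      REAL, DIMENSION(n) :: x
!HPF$ DISTRIBUTE x(BLOCK)

      ! RANDOM_NUMBER now executes in parallel for mapped data such
      ! as the BLOCK-distributed array x.
      CALL RANDOM_NUMBER(x)
      PRINT *, 'mean = ', SUM(x) / n
      END PROGRAM random_sketch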

1.10.2.2 Improved Performance of TRANSPOSE Intrinsic

The TRANSPOSE intrinsic will execute faster for most arrays that are mapped either * or BLOCK in all dimensions.
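For example (a sketch; BLOCK in all dimensions is one of the mappings covered by this improvement):


      PROGRAM transpose_sketch
      INTEGER, PARAMETER :: n = 1024
      REAL, DIMENSION(n,n) :: a, b
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: a, b

      a = 1.0
      ! Both a and b are BLOCK-mapped in all dimensions, so this
      ! TRANSPOSE benefits from the faster implementation.
      b = TRANSPOSE(a)
      PRINT *, b(1,n)
      END PROGRAM transpose_sketch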

1.10.2.3 Improved Performance of DO Loops Marked as INDEPENDENT

Certain induction variables are now recognized as affine functions of the INDEPENDENT DO loop indices, thus meeting the requirements listed in Section 1.10.1.3. The compiler can now parallelize array references that contain such variables as subscripts. An example follows:


!     Compiler now recognizes a loop as INDEPENDENT because it 
!        knows that variable k1 is k+1. 
      PROGRAM gauss 
      INTEGER, PARAMETER    :: n = 1024 
      REAL, DIMENSION (n,n) :: A 
      !HPF$ DISTRIBUTE A(*,CYCLIC) 
 
      DO k = 1, n-1 
         k1 = k+1 
         !HPF$ INDEPENDENT, NEW(i) 
         DO j = k1, n 
            DO i = k1, n 
               A(i,j) = A(i,j) - A(i,k) * A(k,j) 
            ENDDO 
         ENDDO 
      ENDDO 
      END PROGRAM gauss 

1.10.3 Corrections

This section lists problems in previous versions that have been fixed in this version.

1.10.4 Known Problems

1.10.4.1 "Variable used before its value has been defined" Warning

The compiler may inappropriately issue a "Variable is used before its value has been defined" warning. If the variable named in the warning does not appear in your program (e.g. var$0354), you should ignore the warning.

1.10.4.2 Mask Expressions Referencing Multiple FORALL Indices

FORALL statements containing mask expressions referencing more than seven FORALL indices do not work properly.

1.10.5 Unsupported Features

This section lists unsupported features in this release of Compaq Fortran.

1.10.5.1 SMP Decomposition (OpenMP) not Currently Compatible with HPF

Manual decomposition directives for SMP (such as the OpenMP directives enabled with the -omp option, or the directives enabled with the -mp option) are not currently compatible with the -wsf option.

1.10.5.2 Command Line Options not Compatible with the -wsf Option

The following command line options may not be used with the -wsf option:

1.10.5.3 HPF_LOCAL Routines

Arguments passed to HPF_LOCAL procedures cannot be distributed CYCLIC(n). Furthermore, they can have neither the INHERIT attribute nor a transcriptive distribution.
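For illustration, a caller-side interface that stays within these restrictions might give each distributed dummy a prescriptive BLOCK mapping. The following fragment is a sketch only (the routine and argument names are hypothetical), not a complete program:


      INTERFACE
         EXTRINSIC(HPF_LOCAL) SUBROUTINE local_work(x)
            REAL, DIMENSION(:) :: x
            ! A prescriptive BLOCK mapping; CYCLIC(n), the INHERIT
            ! attribute, and transcriptive mappings are not allowed here.
            !HPF$ DISTRIBUTE x(BLOCK)
         END SUBROUTINE local_work
      END INTERFACE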

Also, the following procedures in the HPF Local Routine Library are not supported in the current release:

1.10.5.4 SORT_UP and SORT_DOWN Functions

The SORT_UP and SORT_DOWN HPF library procedures are not supported. Instead, use GRADE_UP and GRADE_DOWN, respectively.
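For example, GRADE_UP returns the permutation that sorts an array into ascending order; applying that permutation gives the result SORT_UP would have produced (a minimal sketch with illustrative names):


      PROGRAM grade_sketch
      USE HPF_LIBRARY                  ! provides GRADE_UP and GRADE_DOWN
      INTEGER, PARAMETER :: n = 8
      REAL,    DIMENSION(n) :: x = (/ 3., 1., 4., 1., 5., 9., 2., 6. /)
      INTEGER, DIMENSION(n) :: order
      REAL,    DIMENSION(n) :: sorted

      ! GRADE_UP yields the sorting permutation; x(order) is the
      ! ascending sequence that SORT_UP would have returned.
      order  = GRADE_UP(x, DIM=1)
      sorted = x(order)
      PRINT '(8F5.1)', sorted
      END PROGRAM grade_sketch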

1.10.5.5 Restricted Definition of PURE

In addition to the restrictions on PURE functions listed in the Fortran 95 language standard and in the High Performance Fortran Language Specification, Compaq Fortran adds the restriction that PURE functions must be resident. "Resident" means that the function can execute on each processor without reading or writing any data that is not local to that processor.

Non-resident PURE functions are not handled. They will probably cause failure of the executable at run-time if used in FORALLs or in INDEPENDENT DO loops.

1.10.5.6 Restrictions on Procedure Calls in INDEPENDENT DO and FORALL

In order to execute in parallel, procedure calls from FORALL and DO INDEPENDENT constructs must be resident. "Resident" means that the function can execute on each processor without reading or writing any data that is not local to that processor. The compiler requires an explicit assertion that all procedure calls are resident. You can make this assertion in one of two ways:

  1. by labeling every procedure called by the FORALL or INDEPENDENT DO loop as PURE
  2. by encapsulating the entire body of the loop in an ON HOME RESIDENT region.

Because of the restricted definition of PURE in Compaq Fortran (see Section 1.10.5.5), the compiler interprets PURE as an assertion by the program that a procedure is resident.

Unlike procedures called from inside FORALLs, procedures called from inside INDEPENDENT DO loops are not required to be PURE. To assert to the compiler that any non-PURE procedures called from the loop are resident, you can encapsulate the entire body of the loop in an ON HOME RESIDENT region.

If you incorrectly assert that a procedure is resident (using either PURE or ON HOME RESIDENT), the program will either fail at run time or produce incorrect results.

Here is an example of an INDEPENDENT DO loop containing an ON HOME RESIDENT directive and a procedure call:


!HPF$ INDEPENDENT 
DO i = 1, 10 
   !HPF$ ON HOME (B(i)), RESIDENT  BEGIN 
   A(i) = addone(B(i)) 
   !HPF$ END ON 
END DO 
. 
. 
. 
 
CONTAINS 
  FUNCTION addone(x) 
    INTEGER, INTENT(IN) :: x 
    INTEGER addone 
    addone = x + 1 
  END FUNCTION addone 

The ON HOME RESIDENT region does not impose any syntactic restrictions. It is merely an assertion that inter-processor communication will not actually be required at run time.

For More Information:

1.10.5.7 Restrictions on Routines Compiled with -nowsf_main

The following are restrictions on dummy arguments to routines compiled with the -nowsf_main compile-time option:

Failure to adhere to these restrictions may result in program failure, or incorrect program results.

1.10.5.8 RAN and SECNDS Are Not PURE

The intrinsic functions RAN and SECNDS are serialized (not executed in parallel). As a result, they are not PURE functions, and cannot be used within a FORALL construct or statement.
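If you need random values inside a FORALL, one work-around (a sketch, not a drop-in replacement for RAN's generator) is to fill a mapped array with the RANDOM_NUMBER intrinsic subroutine first, which does execute in parallel (see Section 1.10.2.1), and reference that array from the FORALL:


      PROGRAM ran_workaround
      INTEGER, PARAMETER :: n = 100000
      INTEGER :: i
      REAL, DIMENSION(n) :: r, y
!HPF$ DISTRIBUTE (BLOCK) :: r, y

      ! Generate the random values up front with RANDOM_NUMBER (which
      ! runs in parallel for mapped data) rather than calling RAN,
      ! which is not PURE, from inside the FORALL.
      CALL RANDOM_NUMBER(r)
      FORALL (i=1:n)  y(i) = 2.0 * r(i) - 1.0
      PRINT *, SUM(y) / n
      END PROGRAM ran_workaround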

1.10.5.9 Nonadvancing I/O on stdin and stdout

Nonadvancing I/O does not work correctly on stdin and stdout. For example, the following program is supposed to print a prompt ending with a colon and keep the cursor on that line. Unfortunately, the prompt does not appear until after the input is entered.


PROGRAM SIMPLE 
 
 
        INTEGER STOCKPRICE 
 
        WRITE (6,'(A)',ADVANCE='NO') 'Stock price1   : ' 
        READ  (5, *) STOCKPRICE 
 
        WRITE (6,200) 'The number you entered was ', STOCKPRICE 
200     FORMAT(A,I) 
 
 
END PROGRAM SIMPLE 

The work-around for this bug is to insert a CLOSE statement after the WRITE to stdout. This effectively flushes the buffer.


PROGRAM SIMPLE 
 
 
        INTEGER STOCKPRICE 
 
        WRITE (6,'(A)',ADVANCE='NO') 'Stock price1   : ' 
        CLOSE (6)                        ! Add close to get around bug 
        READ  (5, *) STOCKPRICE 
 
        WRITE (6,200) 'The number you entered was ', STOCKPRICE 
200     FORMAT(A,I) 
 
 
END PROGRAM SIMPLE 

1.10.5.10 WHERE and Nested FORALL

The following statements are not currently supported:

When nested DO loops are converted into FORALLs, nesting is ordinarily not necessary. For example,


DO x=1, 6 
  DO y=1, 6 
    A(x, y) = B(x) + C(y) 
  END DO 
END DO 

can be converted into


FORALL (x=1:6, y=1:6)  A(x, y) = B(x) + C(y) 

In this example, both indices (x and y) can be defined in a single FORALL statement that produces the same result as the nested DO loops.

In general, nested FORALLs are required only when the outer index is used in the definition of the inner index. For example, consider the following DO loop nest, which adds 3 to the elements in the upper triangle of a 6 x 6 array:


DO x=1, 6 
  DO y=x, 6 
    A(x, y) = A(x, y) + 3 
  END DO 
END DO 

In Fortran 95/90, this DO loop nest can be replaced with the following nest of FORALL constructs:


FORALL (x=1:6) 
  FORALL (y=x:6) 
    A(x, y) = A(x, y) + 3 
  END FORALL 
END FORALL 

However, nested FORALL is not currently supported in parallel (i.e. with the -wsf option).

A work-around is to use the INDEPENDENT directive:


      integer, parameter :: n=6 
      integer, dimension (n,n) :: A 
!hpf$ distribute A(block,block) 
 
      A = 8 
 
      !hpf$ independent, new(i) 
      do j=1,n 
         !hpf$ independent 
         do i=j,n 
            A(i,j) = A(i,j) + 3 
         end do 
      end do 
 
      print "(6i3)", A 
 
      end 

All three of these code fragments would convert a matrix like this:
     8  8  8  8  8  8
     8  8  8  8  8  8
     8  8  8  8  8  8
     8  8  8  8  8  8
     8  8  8  8  8  8
     8  8  8  8  8  8

into this matrix:
    11 11 11 11 11 11
     8 11 11 11 11 11
     8  8 11 11 11 11
     8  8  8 11 11 11
     8  8  8  8 11 11
     8  8  8  8  8 11

