The ASSERT directive provides information to the compiler about the characteristics of DO loops. This assists the compiler in optimizing the source code.
The directive only takes effect if you specify the -qhot compiler option.
Syntax
>>-ASSERT--(--assertion_list--)-------------------------------->< |
Rules
The first noncomment line (not including other directives) following the ASSERT directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The ASSERT directive applies only to the DO loop immediately following the directive, and not to any nested DO loops.
ITERCNT provides an estimate to the compiler about roughly how many iterations the DO loop will typically run. There is no requirement that the value be accurate; ITERCNT will only affect performance, never correctness.
When NODEPS is specified, the user is explicitly declaring to the compiler that no loop-carried dependencies exist within the DO loop or any procedures invoked from within the DO loop. A loop-carried dependency involves two iterations within a DO loop interfering with one another. Interference occurs in the following situations:
It is possible for two complementary ASSERT directives to apply to any given DO loop. However, an ASSERT directive cannot be followed by a contradicting ASSERT directive for a given DO loop:
!IBM* ASSERT (ITERCNT(10))
!IBM* INDEPENDENT, REDUCTION (A)
!IBM* ASSERT (ITERCNT(20)) ! invalid
DO I = 1, N
A(I) = A(I) * I
END DO
In the example above, the ASSERT(ITERCNT(20)) directive contradicts the ASSERT(ITERCNT(10)) directive and is invalid.
The ASSERT directive overrides the -qassert compiler option for the DO loop on which the ASSERT directive is specified.
Examples
Example 1:
! An example of the
ASSERT directive with NODEPS.
PROGRAM EX1
INTEGER A(100)
!IBM* ASSERT (NODEPS)
DO I = 1, 100
A(I) = A(I) * FNC1(I)
END DO
END PROGRAM EX1
FUNCTION FNC1(I)
FNC1 = I * I
END FUNCTION FNC1
Example 2:
! An example of the ASSERT directive with NODEPS and
ITERCNT.
SUBROUTINE SUB2 (N)
INTEGER A(N)
!IBM* ASSERT (NODEPS,ITERCNT(100))
DO I = 1, N
A(I) = A(I) * FNC2(I)
END DO
END SUBROUTINE SUB2
FUNCTION FNC2 (I)
FNC2 = I * I
END FUNCTION FNC2
Related Information
When the CNCALL directive is placed before a DO loop, you are explicitly declaring to the compiler that no loop-carried dependencies exist within any procedure called from the DO loop.
The directive only takes effect if you specify the -qhot compiler option.
Syntax
>>-CNCALL------------------------------------------------------>< |
Rules
The first noncomment line (not including other directives) that is following the CNCALL directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The CNCALL directive applies only to the DO loop that is immediately following the directive and not to any nested DO loops.
When specifying the CNCALL directive, you are explicitly declaring to the compiler that no procedures invoked within the DO loop have any loop-carried dependencies. If the DO loop invokes a procedure, separate iterations of the loop must be able to concurrently call that procedure. The CNCALL directive does not assert that other operations in the loop do not have dependencies, it is only an assertion about procedure references.
A loop-carried dependency occurs when two iterations within a DO loop interfere with one another. See the ASSERT directive for the definition of interference.
Examples
! An example of CNCALL where the procedure invoked has
! no loop-carried dependency but the code within the
! DO loop itself has a loop-carried dependency.
PROGRAM EX3
INTEGER A(100)
!IBM* CNCALL
DO I = 1, N
A(I) = A(I) * FNC3(I)
A(I) = A(I) + A(I-1) ! This has loop-carried dependency
END DO
END PROGRAM EX3
FUNCTION FNC3 (I)
FNC3 = I * I
END FUNCTION FNC3
Related Information
The COLLAPSE directive reduces an entire array dimension to a single element by specifying that only the element in the lower bound of an array dimension is accessible. If you do not specify a lower bound, the default lower bound is one.
Used with discretion, the COLLAPSE directive can facilitate an increase in performance by reducing repetitive memory access associated with multiple-dimension arrays.
Syntax
>>-COLLAPSE--(--collapse_array_list--)------------------------->< |
where collapse_array is:
>>-array_name--(--expression_list--)--------------------------->< |
where expression_list is a comma separated list of expression.
Rules
The COLLAPSE directive must contain at least one array.
The COLLAPSE directive applies only to the scoping unit in which it is specified. The declarations of arrays contained in a COLLAPSE directive must appear in the same scoping unit as the directive. An array that is accessible in a scoping unit by use or host association must not specified in a COLLAPSE directive in that scoping unit.
The lowest value you can specify in expression_list is one. The highest value must not be greater than the number of dimensions in the corresponding array.
A single scoping unit can contain multiple COLLAPSE declarations, though you can only specify an array once for a particular scoping unit.
You can not specify an array in both a COLLAPSE directive and an EQUIVALENCE statement.
You can not use the COLLAPSE directive with arrays that are components of derived types.
If you apply both the COLLAPSE and SUBSCRIPTORDER directives to an array, you must specify the SUBSCRIPTORDER directive first.
The COLLAPSE directive applies to:
Examples
Example 1: In the following example, the COLLAPSE directive is applied to the explicit-shape arrays A and B. Referencing A(m,2:100,2:100) and B(m,2:100,2:100) in the inner loops, become A(m,1,1) and B(m,1,1).
!IBM* COLLAPSE(A(2,3),B(2,3))
REAL*8 A(5,100,100), B(5,100,100), c(5,100,100)
DO I=1,100
DO J=1,100
DO M=1,5
A(M,J,I) = SIN(C(M,J,I))
B(M,J,I) = COS(C(M,J,I))
END DO
DO M=1,5
DO N=1,M
C(M,J,I) = C(M,J,I) + A(N,J,I)*B(6-N,J,I)
END DO
END DO
END DO
END DO
END
Related Information
For more information on the SUBSCRIPTORDER directive, seeSUBSCRIPTORDER
EJECT directs the compiler to start a new full page of the source listing. If there has been no source listing requested, the compiler will ignore this directive.
Syntax
>>-EJECT------------------------------------------------------->< |
Rules
The EJECT compiler directive can have an inline comment and a label. However, if you specify a statement label, the compiler discards it. Therefore, you must not reference any label on an EJECT directive. An example of using the directive would be placing it before a DO loop that you do not want split across pages in the listing. If you send the source listing to a printer, the EJECT directive provides a page break.
The INCLUDE compiler directive inserts a specified statement or a group of statements into a program unit.
Syntax
>>-INCLUDE--+-char_literal_constant-+--+---+-------------------><
'-(--name--)------------' '-n-'
|
You are not required to specify the full path of the desired file, but must specify the file extension if one exists.
name must contain only characters allowable in the XL Fortran character set. See Characters for the character set supported by XL Fortran.
char_literal_constant is a character literal constant.
A feature called conditional INCLUDE provides a means for selectively activating INCLUDE compiler directives within the Fortran source during compilation. You specify the included files by means of the -qci compiler option.
In fixed source form, the INCLUDE compiler directive must start after column 6, and can have a label.
You can add an inline comment to the INCLUDE line.
Rules
An included file can contain any complete Fortran source statements and compiler directives, including other INCLUDE compiler directives. Recursive INCLUDE compiler directives are not allowed. An END statement can be part of the included group. The first and last included lines must not be continuation lines. The statements in the include file are processed with the source form of the including file.
If the SOURCEFORM directive appears in an include file, the source form reverts to that of the including file once processing of the include file is complete. After the inclusion of all groups, the resulting Fortran program must follow all of the Fortran rules for statement order.
For an INCLUDE compiler directive with the left and right parentheses syntax, XL Fortran translates the file name to lowercase unless the -qmixed compiler option is on.
The file system locates the specified filename as follows:
Examples
INCLUDE '/u/userid/dc101' ! full absolute file name specified INCLUDE '/u/userid/dc102.inc' ! INCLUDE file name has an extension INCLUDE 'userid/dc103' ! relative path name specified
INCLUDE (ABCdef) ! includes file abcdef
INCLUDE '../Abc' ! includes file Abc from parent directory
! of directory being searched
Related Information
-qci Option in the User's Guide
The INDEPENDENT directive, if used, must precede a DO loop, FORALL statement, or FORALL construct. It specifies that each operation in the FORALL statement or FORALL construct, can be executed in any order without affecting the semantics of the program. It also specifies each iteration of the DO loop, can be executed without affecting the semantics of the program.
The directive only takes effect if you specify the -qhot compiler option.
Syntax
.---------------------------------------------.
V |
>>-INDEPENDENT----+-----------------------------------------+-+-><
+-,--NEW--(--named_variable_list--)-------+
'-,--REDUCTION--(--named_variable_list--)-'
|
Rules
The first noncomment line (not including other directives) following the INDEPENDENT directive must be a DO loop, FORALL statement, or the first statement of a FORALL construct. This line cannot be an infinite DO or DO WHILE loop. The INDEPENDENT directive applies only to the DO loop that is immediately following the directive and not to any nested DO loops.
An INDEPENDENT directive can have at most one NEW clause and at most one REDUCTION clause.
If the directive applies to a DO loop, no iteration of the loop can interfere with any other iteration. Interference occurs in the following situations:
If the NEW clause is specified, the directive must apply to a DO loop. The NEW clause modifies the directive and any surrounding INDEPENDENT directives by accepting any assertions made by such directive(s) as true. It does this even if the variables specified in the NEW clause are modified by each iteration of the loop. Variables specified in the NEW clause behave as if they are private to the body of the DO loop. That is, the program is unaffected if these variables (and any variables associated with them) were to become undefined both before and after each iteration of the loop.
Any variable you specify in the NEW clause or REDUCTION clause must not:
For FORALL, no combination of index values affected by the INDEPENDENT directive assigns to an atomic storage unit that is required by another combination. If a DO loop, FORALL statement, or FORALL construct all have the same body and each is preceded by an INDEPENDENT directive, they behave the same way.
The REDUCTION clause asserts that updates to named variables will occur within REDUCTION statements in the INDEPENDENT loop. Furthermore, the intermediate values of the REDUCTION variables are not used within the parallel section, other than in the updates themselves. Thus, the value of the REDUCTION variable after the construct is the result of a reduction tree.
If you specify the REDUCTION clause, the directive must apply to a DO loop. The only reference to a REDUCTION variable in an INDEPENDENT DO loop must be within a reduction statement.
A REDUCTION variable must be of intrinsic type, but must not be of type character. A REDUCTION variable must not be an allocatable array.
A REDUCTION variable must not occur in:
A REDUCTION statement can have one of the following forms:
>>-reduction_var_ref--=--expr--reduction_op--reduction_var_ref->< >>-reduction_var_ref--=--reduction_var_ref--reduction_op--expr->< >>-reduction_var_ref =--reduction_function--(expr,--reduction_var_ref)->< >>-reduction_var_ref =--reduction_function--(reduction_var_ref,--expr)->< |
where:
The following rules apply to REDUCTION statements:
>>-reduction_var_ref-- = --expr-- - --reduction_var_ref-------->< |
Examples
Example 1:
INTEGER A(10),B(10,12),F
!IBM* INDEPENDENT ! The NEW clause cannot be
FORALL (I=1:9:2) A(I)=A(I+1) ! specified before a FORALL
!IBM* INDEPENDENT, NEW(J)
DO M=1,10
J=F(M) ! 'J' is used as a scratch
A(M)=J*J ! variable in the loop
!IBM* INDEPENDENT, NEW(N)
DO N=1,12 ! The first executable statement
B(M,N)=M+N*N ! following the INDEPENDENT must
END DO ! be either a DO or FORALL
END DO
END
Example 2:
X=0
!IBM* INDEPENDENT, REDUCTION(X)
DO J = 1, M
X = X + J**2
END DO
Example 3:
INTEGER A(100), B(100, 100)
!IBM* INDEPENDENT, REDUCTION(A), NEW(J) ! Example showing an array used
DO I=1,100 ! for a reduction variable
DO J=1, 100
A(I)=A(I)+B(J, I)
END DO
END DO
Related Information
The #line directive associates code that is created by cpp or any other Fortran source code generator with input code created by the programmer. Because the preprocessor may cause lines of code to be inserted or deleted, the #line directive can be useful in error reporting and debugging because it identifies which lines in the original source caused the preprocessor to generate the corresponding lines in the intermediate file.
Syntax
>>-#line--line_number--+----------+----------------------------><
'-filename-'
|
The #line directive is a noncomment directive and follows the syntax rules for this type of directive.
Rules
The #line directive follows the same rules as other noncomment directives, with the following exceptions:
The #line directive indicates the origin of all code following the directive in the current file. Another #line directive will override a previous one.
If you supply a filename, the subsequent code in the current file will be as if it originated from that filename. If you omit the filename, and no previous #line directive with a specified filename exists in the current file, the code in the current file is treated as if it originated from the current file at the line number specified. If a previous #line directive with a specified filename does exist in the current file, the filename from the previous directive is used.
line_number indicates the position, in the appropriate file, of the line of code following the directive. Subsequent lines in that file are assumed to have a one to one correspondence with subsequent lines in the source file until another #line directive is specified or the file ends.
When XL Fortran invokes cpp for a file, the preprocessor will emit #line directives unless you also specify the -d option.
Examples
The file test.F contains:
! File test.F, Line 1 #include "test.h" PRINT*, "test.F Line 3" ... PRINT*, "test.F Line 6" #include "test.h" PRINT*, "test.F Line 8" END
The file test.h contains:
! File test.h line 1 RRINT*,1 ! Syntax Error PRINT*,2
After the C preprocessor (/lib/cpp) processes the file test.F with the default options:
#line 1 "test.F" ! File test.F, Line 1 #line 1 "test.h" ! File test.h Line 1 RRINT*,1 ! Syntax Error PRINT*,2 #line 3 "test.F" PRINT*, "test.F Line 3" ... #line 6 PRINT*, "test.F Line 6" #line 1 "test.h" ! File test.h Line 1 RRINT*,1 ! Syntax Error PRINT*,2 #line 8 "test.F" PRINT*, "test.F Line 8" END
The compiler displays the following messages after it processes the file that is created by the C preprocessor:
2 2 |RRINT*,1
!Syntax error
......a................
a - "test.h", line 2.6: 1515-019 (S) Syntax is incorrect.
4 2 |RRINT*,1 !Syntax error
......a................
a - "test.h", line 2.6: 1515-019 (S) Syntax is incorrect.
Related Information
The PERMUTATION directive specifies that the elements of each array that is listed in the integer_array_name_list have no repeated values. This directive is useful when you use array elements as subscripts for other array references.
The PERMUTATION directive only takes effect if you specify the -qhot compiler option.
Syntax
>>-PERMUTATION--(--integer_array_name_list--)------------------>< |
Rules
The first noncomment line (not including other directives) that is following the PERMUTATION directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The PERMUTATION directive applies only to the DO loop that is immediately following the directive, and not to any nested DO loops.
Examples
PROGRAM EX3
INTEGER A(100), B(100)
!IBM* PERMUTATION (A)
DO I = 1, 100
A(I) = I
B(A(I)) = B(A(I)) + A(I)
END DO
END PROGRAM EX3
Related Information
You can specify compiler options to affect an individual compilation unit by putting the @PROCESS compiler directive in the source file. It can override options that are specified in the configuration file, in the default settings, or on the command line.
Syntax
.-+---+----------------------------.
| '-,-' |
V |
>>-@PROCESS----option--+----------------------+-+--------------><
'-(--suboption_list--)-'
|
Rules
In fixed source form, @PROCESS can start in column 1 or after column 6. In free source form, the @PROCESS compiler directive can start in any column.
You cannot place a statement label or inline comment on the same line as an @PROCESS compiler directive.
By default, any option settings you designate with the @PROCESS compiler directive are effective only for the compilation unit in which the statement appears. If the file has more than one compilation unit, the option setting is reset to its original state before the next unit is compiled. Trigger constants specified by the DIRECTIVE option are in effect until the end of the file (or until NODIRECTIVE is processed).
The @PROCESS compiler directive must usually appear before the first statement of a compilation unit. The only exceptions are for SOURCE and NOSOURCE, which you can put in @PROCESS directives anywhere in the compilation unit.
Related Information
See Compiler Option Details in the User's Guide for details on compiler options.
You can use the SNAPSHOT directive to specify a safe location where a breakpoint can be set with a debug program, and provide a set of variables that must remain visible to the debug program.
There may be a slight performance hit at the point where the SNAPSHOT directive is set, because the variables must be kept in memory for the debug program to access. Variables made visible by the SNAPSHOT directive are read-only. Undefined behavior will occur if these variables are modified through the debugger. Use with discretion.
Syntax
>>-SNAPSHOT--(--named_variable_list--)------------------------->< |
Rules
To use the SNAPSHOT directive, you must specify the -qdbg compiler option at compilation.
Related Information
See the User's Guide for details on the -qdbg compiler option.
The SOURCEFORM compiler directive indicates that all subsequent lines are to be processed in the specified source form until the end of the file is reached or until an @PROCESS directive or another SOURCEFORM directive specifies a different source form.
Syntax
>>-SOURCEFORM--(--source--)------------------------------------>< |
Rules
The SOURCEFORM directive can appear anywhere within a file. An include file is compiled with the source form of the including file. If the SOURCEFORM directive appears in an include file, the source form reverts to that of the including file once processing of the include file is complete.
The SOURCEFORM directive cannot specify a label.
| Tip |
|---|
|
To modify your existing files to Fortran 90 free source form where include files exist:
|
Examples
@PROCESS DIRECTIVE(CONVERT*)
PROGRAM MAIN ! Main program not yet converted
A=1; B=2
INCLUDE 'freeform.f'
PRINT *, RESULT ! Reverts to fixed form
END
where file freeform.f contains:
!CONVERT* SOURCEFORM(FREE(F90)) RESULT = A + B
The STREAM_UNROLL directive instructs the compiler to apply the combined functionality of software prefetch and loop unrolling to DO loops with a large iteration count. Stream unrolling functionality is available only on PowerPC 970 platforms or higher, and optimizes DO loops to use multiple streams. You can specify STREAM_UNROLL for both inner and outer DO loops, and the compiler will use an optimal number of streams to perform stream unrolling where applicable. Applying STREAM_UNROLL to a loop with dependencies will produce unexpected results.
Syntax
>>---STREAM_UNROLL--+---------------------+--------------------><
'-(--unroll_factor--)-'
|
Rules
You must specify the -qhot or -qipa=level=2 compiler option to enable stream unrolling. An optimization level of -O4 or higher also allows the compiler to perform stream unrolling.
For stream unrolling to occur, a STREAM_UNROLL directive must precede a DO loop.
You must not specify the STREAM_UNROLL directive more than once, or combine the directive with UNROLL, NOUNROLL, UNROLL_AND_FUSE, or NOUNROLL_AND_FUSE directives for the same DO construct.
You must not specify the STREAM_UNROLL directive for a DO WHILE loop or an infinite DO loop.
Examples
The following is an example of how STREAM_UNROLL can increase performance.
integer, dimension(1000) :: a, b, c
integer i, m, n
!IBM* stream_unroll(4)
do i =1, n
a(i) = b(i) + c(i)
enddo
end
An unroll factor reduces the number of iterations from n to n/4, as follows:
m = n/4
do i =1, n/4
a(i) = b(i) + c(i)
a(i+m) = b(i+m) + c(i+m)
a(i+2*m) = b(i+2*m) + c(i+2*m)
a(i+3*m) = b(i+3*m) + c(i+3*m)
enddo
The increased number of read and store operations are distributed among a number of streams determined by the compiler, reducing computation time and boosting performance.
Related Information
The SUBSCRIPTORDER directive rearranges the subscripts of an array. This results in a new array shape, since the directive changes the order of array dimensions in the declaration. All references to the array are correspondingly rearranged to match the new array shape.
Used with discretion, the SUBSCRIPTORDER directive may improve performance by increasing the number of cache hits and the amount of data prefetching. You may have to experiment with this directive until you find the arrangement that yields the most performance benefits. You may find SUBSCRIPTORDER especially useful when porting code originally intended for a non-cached hardware architecture.
In a cached hardware architecture, such as the PowerPC, an entire cache line of data is often loaded into the processor in order to access each data element. Changing the storage arrangement can be used to ensure that consecutively accessed elements are stored adjacently. This may result in a performance improvement, as there are more element accesses for each cache line referenced. Additionally, adjacently storing array elements which are consecutively accessed may help to better exploit the processor's prefetching facility.
Syntax
.-,--------------------.
V |
>>-SUBSCRIPTORDER--(----subscriptorder_array-+--)--------------><
|
where subscriptorder_array is:
.-,---------------------.
V |
>>-array_name--(----subscriptorder_number-+--)-----------------><
|
Rules
The SUBSCRIPTORDER directive must appear in a scoping unit preceding all declarations and references to the arrays in the subscriptorder_array list. The directive only applies to that scoping unit and must contain at least one array. If multiple scoping units share an array, then you must apply the SUBSCRIPTORDER directive to each of the applicable scoping units with identical subscript arrangements. Examples of methods of array sharing between scoping units include COMMON statements, USE statements, and subroutine arguments.
The lowest subscript number in a subscriptorder_number list must be 1. The highest number must be equal to the number of dimensions in the corresponding array. Every integer number between these two limits, including the limits, signifies a subscript number prior to rearrangement and must be included exactly once in the list.
You must not apply a SUBSCRIPTORDER directive multiple times to a particular array in a scoping unit.
You must maintain array shape conformance in passing arrays as actual arguments to elemental procedures, if one of the arrays appears in a SUBSCRIPTORDER directive. You must also adjust the actual arguments of the SHAPE, SIZE, LBOUND, and UBOUND inquiry intrinsic procedures and of most transformational intrinsic procedures.
You must manually modify data in input data files and in explicit initializations for arrays that appear in the SUBSCRIPTORDER directive.
On arrays to which the COLLAPSE directive is also applied, the COLLAPSE directive always refers to the pre-subscriptorder dimension numbers.
You must not rearrange the last dimension of an assumed-size array.
Examples
Example 1: In the following example, the SUBSCRIPTORDER directive is applied to an explicit-shape array and swaps the subscripts in every reference to the array, without affecting the program output.
!IBM* SUBSCRIPTORDER(A(2,1))
INTEGER COUNT/1/, A(3,2)
DO J = 1, 3
DO K = 1, 2
! Inefficient coding: innermost index is accessing rightmost
! dimension. The subscriptorder directive compensates by
! swapping the subscripts in the array's declaration and
! access statements.
!
A(J,K) = COUNT
PRINT*, J, K, A(J,K)
COUNT = COUNT + 1
END DO
END DO
Without the directive above, the array shape is (3,2) and the array elements would be stored in the following order:
A(1,1) A(2,1) A(3,1) A(1,2) A(2,2) A(3,2)
With the directive, the array shape is (2,3) and the array elements are stored in the following order:
A(1,1) A(2,1) A(1,2) A(2,2) A(1,3) A(2,3)
Related Information
For more information on the COLLAPSE directive, see COLLAPSE
The UNROLL directive instructs the compiler to attempt loop unrolling where applicable. Loop unrolling replicates the body of the DO loop to reduce the number of iterations required to complete the loop.
You can control loop unrolling for an entire file using the -qunroll compiler option. Specifying the directive for a particular DO loop always overrides the compiler option.
Syntax
>>-+-UNROLL--+---------------------+-+------------------------->< | '-(--unroll_factor--)-' | '-NOUNROLL------------------------' |
Rules
For loop unrolling to occur, an UNROLL directive must precede a DO loop.
You must not specify the UNROLL directive more than once, or combine the directive with NOUNROLL, STREAM_UNROLL, UNROLL_AND_FUSE, or NOUNROLL_AND_FUSE directives for the same DO construct.
You must not specify the UNROLL directive for a DO WHILE loop or an infinite DO loop.
Examples
Example 1: In this example, the UNROLL(2) directive is used to tell the compiler that the body of the loop can be replicated so that the work of two iterations is performed in a single iteration. Instead of performing 1000 iterations, if the compiler unrolls the loop, it will only perform 500 iterations.
!IBM* UNROLL(2)
DO I = 1, 1000
A(I) = I
END DO
If the compiler chooses to unroll the previous loop, the compiler translates the loop so that it is essentially equivalent to the following:
DO I = 1, 1000, 2
A(I) = I
A(I+1) = I + 1
END DO
Example 2: In the first DO loop, UNROLL(3) is used. If unrolling is performed, the compiler will unroll the loop so that the work of three iterations is done in a single iteration. In the second DO loop, the compiler determines how to unroll the loop for maximum performance.
PROGRAM GOODUNROLL
INTEGER I, X(1000)
REAL A, B, C, TEMP, Y(1000)
!IBM* UNROLL(3)
DO I = 1, 1000
X(I) = X(I) + 1
END DO
!IBM* UNROLL
DO I = 1, 1000
A = -I
B = I + 1
C = I + 2
TEMP = SQRT(B*B - 4*A*C)
Y(I) = (-B + TEMP) / (2*A)
END DO
END PROGRAM GOODUNROLL
Related Information
The UNROLL_AND_FUSE directive instructs the compiler to attempt a loop unroll and fuse where applicable. Loop unrolling replicates the body of multiple DO loops and combines the necessary iterations into a single unrolled loop. Using a fused loop can minimize the required number of loop iterations, while reducing the frequency of cache misses. Applying UNROLL_AND_FUSE to a loop with dependencies will produce unexpected results.
Syntax
>>-+-UNROLL_AND_FUSE--+---------------------+-+---------------->< | '-(--unroll_factor--)-' | '-NOUNROLL_AND_FUSE------------------------' |
Rules
For loop unrolling to occur, an UNROLL_AND_FUSE directive must precede a DO loop.
You must not specify the UNROLL_AND_FUSE directive for the innermost DO loop.
You must not specify the UNROLL_AND_FUSE directive more than once, or combine the directive with NOUNROLL_AND_FUSE, NOUNROLL, UNROLL, or STREAM_UNROLL directives for the same DO construct.
You must not specify the UNROLL_AND_FUSE directive for a DO WHILE loop or an infinite DO loop.
Examples
Example 1: In the the following example, the UNROLL_AND_FUSE directive replicates and fuses the body of the loop. This reduces the number of cache misses for Array B.
INTEGER, DIMENSION(1000, 1000) :: A, B, C
!IBM* UNROLL_AND_FUSE(2)
DO I = 1, 1000
DO J = 1, 1000
A(J,I) = B(I,J) * C(J,I)
END DO
END DO
END
The DO loop below shows a possible result of applying the UNROLL_AND_FUSE directive.
DO I = 1, 1000, 2
DO J = 1, 1000
A(J,I) = B(I,J) * C(J,I)
A(J,I+1) = B(I+1, J) * C(J, I+1)
END DO
END DO
Example 2: The following example uses multiple UNROLL_AND_FUSE directives:
INTEGER, DIMENSION(1000, 1000) :: A, B, C, D, H
!IBM* UNROLL_AND_FUSE(4)
DO I = 1, 1000
!IBM* UNROLL_AND_FUSE(2)
DO J = 1, 1000
DO k = 1, 1000
A(J,I) = B(I,J) * C(J,I) + D(J,K)*H(I,K)
END DO
END DO
END DO
END
Related Information
+----------------------------End of IBM Extension----------------------------+
