Absoft's Auto-Vectorizing Compilers & Tools
Vectorize Your Code Quickly & Automatically

Auto-Vectorization on x86 & x64 Processors:

Absoft compilers include auto-vectorization capabilities that use the SIMD instructions of the host processor to restructure code so that multiple operations execute simultaneously. This is done automatically: the programmer only needs to invoke the auto-vectorization option.

Auto-vectorization is most effective on loops, where the same operation is applied to many array elements, and in some cases it yields significant speed increases.
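To illustrate what "executing multiple operations simultaneously" means for a loop, here is a minimal sketch in C (chosen for illustration; the same loop shape applies in Fortran). The function names are hypothetical, and the grouping by four assumes a 128-bit SIMD register holding four 32-bit floats. It is a hand-written model of the idea, not the code an Absoft compiler actually emits.

```c
#include <stddef.h>

/* Scalar form: one multiply-add per iteration. */
void saxpy_scalar(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Conceptual SIMD form: the work is done in groups of four floats.
 * The four statements in the main loop are independent of each other,
 * so a vectorizer could cover them with one SIMD multiply and one SIMD add.
 * A scalar remainder loop handles the last n % 4 elements. */
void saxpy_by_four(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     = a * x[i]     + y[i];
        y[i + 1] = a * x[i + 1] + y[i + 1];
        y[i + 2] = a * x[i + 2] + y[i + 2];
        y[i + 3] = a * x[i + 3] + y[i + 3];
    }
    for (; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; the second only makes the four-wide grouping explicit.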

Absoft compilers can also generate an auto-vectorization report showing which code segments were vectorized, which were not, and why. The programmer can then review the report and, if desired, modify the existing code for additional performance gains.
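As a rough illustration of why one loop might be vectorized and another not, the sketch below contrasts a loop with independent iterations against one with a loop-carried dependence. It is written in C with hypothetical function names and does not reproduce the format or wording of an actual Absoft report.

```c
#include <stddef.h>

/* Independent iterations: each c[i] depends only on a[i] and b[i].
 * The restrict qualifiers state that the arrays do not overlap,
 * which is the kind of guarantee a vectorizer needs. */
void add_arrays(size_t n, const float *restrict a,
                const float *restrict b, float *restrict c)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Loop-carried dependence: a[i] needs the value of a[i - 1] computed
 * in the previous iteration, so the iterations cannot simply run
 * side by side. A vectorizer would typically leave this loop scalar. */
void running_sum(size_t n, float *a)
{
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

Restructuring a loop like the second one so that its iterations become independent is the sort of source change a vectorization report can point toward.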

Absoft auto-vectorization support is included in Absoft compilers v10.0 and higher for AMD and Intel x86 and x64 processors.

Vector Examples on x86_64:

Performance Graph





Vectorization Report Example


The Vectorization Report shows which loops were vectorized, which were not, and why.




Auto-Vectorization on POWER & G4/G5 Processors:

IBM POWER and Apple G4/G5 processors include hardware vector units that can accelerate application performance. Absoft has partnered with Crescent Bay Software to offer optional VAST vectorization tools tuned for the POWER architecture. These tools make the vectorization process automatic and retain the original source code.

VAST-F/Vector is a preprocessor that examines source code for loops or other code segments that can benefit from vectorization. It then automatically generates new source code that includes vector calls. The original source code is preserved. The new source code is then compiled in the normal manner with Absoft Fortran.
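The source-to-source idea can be sketched as follows. This is an illustration in C only: the `vec4` type and the `vec4_load`, `vec4_add` and `vec4_store` helpers are hypothetical stand-ins for hardware vector calls, and the "generated" function is hand-written, not actual VAST output.

```c
#include <stddef.h>

/* Hypothetical stand-in for a 128-bit vector register of four floats. */
typedef struct { float lane[4]; } vec4;

static vec4 vec4_load(const float *p)
{
    vec4 v;
    for (int k = 0; k < 4; ++k) v.lane[k] = p[k];
    return v;
}

static vec4 vec4_add(vec4 a, vec4 b)
{
    vec4 r;
    for (int k = 0; k < 4; ++k) r.lane[k] = a.lane[k] + b.lane[k];
    return r;
}

static void vec4_store(float *p, vec4 v)
{
    for (int k = 0; k < 4; ++k) p[k] = v.lane[k];
}

/* Original source: kept unchanged by the preprocessor. */
void add_original(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Sketch of generated source: the loop body is replaced by vector
 * calls on four elements at a time, followed by a scalar cleanup loop. */
void add_generated(size_t n, const float *a, const float *b, float *c)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        vec4_store(&c[i], vec4_add(vec4_load(&a[i]), vec4_load(&b[i])));
    for (; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Because the generated file is ordinary source code, it goes through the regular compiler, and the untouched original remains available for maintenance.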

Vector Examples on POWER and G4/G5:

Performance Graph



The graph below illustrates the performance benefits of using VAST Vectorization Tools on a Mac G4 system.

For this test, longer is better:



VAST-F/Vector Information

VAST Vector:

  • Optimization of entire loop nests, not just inner loops. Critical optimizations include loop fusion (merging multiple loops into one loop), outer loop unrolling (unrolling an outer loop inside an inner loop), loop collapse (making one long loop from a multidimensional loop nest), and loop interchange (changing the order of the loops in a loop nest to get more efficient memory access).

  • Unrolled vector loops. Unrolling vectorized loops is very important in making sure that the vector instructions are overlapped to the maximum extent possible.

  • Vectorization of reduction loops. Includes array summations, dot products, minimum and maximum element of an array, product of array elements, etc. These operations take a large fraction of the CPU time for many programs.

  • Vectorization of conditional loops. "if" statements and conditional operators are vectorized.

  • Non-aligned vectors can be vectorized efficiently. VAST introduces "permute" operations to align vectors "on the fly" prior to computation.

  • 32-bit float and 8-, 16- and 32-bit integer vectorization. Integers can be signed or unsigned. VAST can also vectorize loops that contain mixed data sizes.

  • ALIGNED pragma, with which the user can inform VAST-C about arrays that are aligned on 16-byte boundaries. A -Valigned command line switch is also provided.

  • -Vmessages switch to get vectorization messages for all loops in the program. Find out what constructs are inhibiting vectorization of your important loops.

  • DISJOINT, NODEPCHK pragmas for disambiguating data dependencies. Especially useful if the target program uses lots of pointers rather than array notation.

  • -L parameter for assertion levels to allow vectorization in the presence of pointer arguments. Can be very useful if the program is written to pass most of the data as pointer arguments.

  • Vector load lifting. Move all loads to the top of the loop, as far as they will go (safely). Allows the compiler to do a better job of instruction scheduling.

  • Vectorization of complex data types. Uses the permute instructions to reorder interleaved complex data so that it can be operated on with the vector unit.

  • Testing for stride one on loops with variable stride. Inserts a run-time test to see if variable array strides are all one; executes a vector version of the loop if the strides are one, otherwise executes the original scalar loop.

  • Partial vectorization of loops with strided or gather/scatter vectors.

  • Vectorization of "table lookup" loops. Loops that have a branch out of the loop can be vectorized in certain cases.
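Three of the transformations listed above (reduction loops, conditional loops, and the run-time stride-one test) can be sketched in plain C. The function names are hypothetical and the code is hand-written to show the shape of each rewrite; it is not VAST output, and it does not use the ALIGNED, DISJOINT or NODEPCHK pragmas, whose exact syntax is not given here.

```c
#include <stddef.h>

/* Reduction loop: four partial sums model four vector lanes and are
 * combined after the main loop. This reorders the floating-point
 * additions, so the result can differ in the last bits from a strict
 * left-to-right sum. */
float dot_partial_sums(size_t n, const float *x, const float *y)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)          /* scalar remainder */
        s += x[i] * y[i];
    return s;
}

/* Conditional loop: the "if" is expressed as a select, so every
 * element takes the same path and the branch disappears. */
void clamp_negative_to_zero(size_t n, float *a)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = (a[i] < 0.0f) ? 0.0f : a[i];
}

/* Stride-one test: a run-time check picks a contiguous loop that is
 * easy to vectorize when the stride is one, and otherwise falls back
 * to the original strided scalar loop. */
void scale_strided(size_t n, float k, float *a, size_t stride)
{
    if (stride == 1) {
        for (size_t i = 0; i < n; ++i)
            a[i] *= k;
    } else {
        for (size_t i = 0; i < n; ++i)
            a[i * stride] *= k;
    }
}
```

In each case the rewritten loop keeps the meaning of the original while exposing independent, contiguous operations that a vector unit can process together.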




All Absoft Compilers Include FREE Technical Support!


Experienced Support Engineers are available by phone at 248-853-0095 or by email, 9am to 4pm EST (M-F), to answer your Absoft Fortran questions!

 


© 1996-2011 Absoft  Corporation 2781 Bond Street Rochester Hills Michigan 48309  
 Voice: 248-853-0050   Fax: 248-853-0108