AMD/Vista Compiler Comparisons
Fortran Execution Time Benchmarks


Updated 04/2008

Polyhedron 2007  Absoft   FTN95    FTN95.NET  G95     Intel     Lahey    NAG     PGI
Benchmark        10.0.8   5.20.0   5.20.0     0.91    10.1.011  7.10.02  5.0     7.1-6
AC               17.48    19.27    17.27      20.48   9.87      22.78    33.83   25.12
AERMOD           29.52    40.07    47.45      51.38   26.59     34.27    56.68   27.56
AIR              15.41    23.55    20.95      17.98   12.40     19.24    15.33   15.59
CAPACITA         59.93    84.84    116.52     92.25   80.39     102.14   100.65  56.02
CHANNEL          13.43    20.41    21.14      14.97   12.37     14.54    12.90   11.88
DODUC            40.04    59.23    44.86      46.51   31.39     52.33    66.32   38.03
FATIGUE          7.12     23.98    40.81      55.54   12.00     17.07    23.06   10.09
GAS_DYN          6.17     24.28    20.01      26.90   6.10      12.35    24.66   8.27
INDUCT           30.49    94.31    51.41      52.73   65.96     79.93    79.11   29.37
LINPK            21.33    21.41    22.95      23.13   23.29     21.06    21.41   22.07
MDBX             17.73    30.90    23.52      25.10   17.96     27.16    21.30   18.61
NF               24.64    45.47    62.88      46.22   25.45     37.76    27.64   25.03
PROTEIN          45.75    90.67    82.26      68.82   46.46     80.81    62.81   55.73
RNFLOW           30.86    44.33    50.44      40.08   34.55     36.28    48.05   44.79
TEST_FPU         18.55    26.03    48.25      32.24   18.16     20.51    20.48   18.61
TFFT             6.97     8.23     7.58       7.36    7.28      7.21     7.48    7.51
Geometric Mean   19.86    33.60    34.54      32.67   20.67     28.45    30.99   21.69
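
The Geometric Mean row can be cross-checked from the individual timings. The sketch below is not part of Polyhedron's harness; it recomputes the figure for the Absoft column, using the sixteen Absoft timings listed above in benchmark order (AC to TFFT). The names absoft_times and geometric_mean are illustrative only.

```python
import math

# Absoft timings in seconds, copied from the table in benchmark order
# (AC, AERMOD, AIR, CAPACITA, CHANNEL, DODUC, FATIGUE, GAS_DYN,
#  INDUCT, LINPK, MDBX, NF, PROTEIN, RNFLOW, TEST_FPU, TFFT).
absoft_times = [
    17.48, 29.52, 15.41, 59.93, 13.43, 40.04, 7.12, 6.17,
    30.49, 21.33, 17.73, 24.64, 45.75, 30.86, 18.55, 6.97,
]


def geometric_mean(values):
    """Return the n-th root of the product of n positive values.

    Summing logarithms avoids overflow when many values are multiplied.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


# The result should land close to the 19.86 reported for Absoft.
print(f"Absoft geometric mean: {geometric_mean(absoft_times):.2f}")
```

A geometric mean weights relative differences equally across benchmarks, so a long-running test such as CAPACITA does not dominate the summary the way it would in an arithmetic mean.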

 

Compiler Switches
Absoft     f95 -V -m32 -Ofast -speed_math=9 -WOPT:if_conv=off -LNO:fu=9:full_unroll_size=7000 -march=host -xINTEGER -stack:0x8000000
FTN95      ftn95 /p6 /optimise (slink was used to increase the stack size)
FTN95.NET  ftn95 /clr /optimise (dbk_link was used to increase the stack size)
G95        g95 -march=opteron -ffast-math -funroll-loops -O3
Intel      ifort /O3 /Qipo /QxO /Qprec-div- /link /stack:64000000
Lahey      lf95 -inline (35) -o1 -sse2 -nstchk -tp4 -ntrace -unroll (6) -zfm
NAG        f95 -O4 -V
PGI        pgf90 -Bstatic -V -fastsse -Munroll=n:4 -Mipa=fast,inline -tp k8-32
 
Notes  
All figures are execution times in seconds, measured on a Dell Dimension E521 with an AMD X2 5600 processor (2.8 GHz) and 4 x 1024 MB of 533 MHz DDR2 memory, running Windows Vista Business. Each figure is the average over at least 10 runs (many more for some). Measurement error is typically <1%. Green cells highlight figures within 10% of the fastest. Red cells indicate figures that are more than 150% of the fastest.
As far as possible, we have used the compiler switches that give the best overall results. We have not attempted to tune individual benchmarks, and in particular cases different switch settings may give better results.
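
The green/red highlighting rule in the notes can be expressed as a small classifier. This is a sketch under two assumptions that the notes do not spell out: "within 10% of the fastest" is read as a time no greater than 1.10 times the row minimum, and "more than 150% of the fastest" as a time greater than 1.50 times the row minimum. The function name classify_row and the label "plain" are illustrative only.

```python
def classify_row(times, near=1.10, far=1.50):
    """Label each timing in one benchmark row relative to the fastest.

    Assumed reading of the notes:
      green: time <= near * fastest  (within 10% of the fastest)
      red:   time >  far * fastest   (more than 150% of the fastest)
      plain: anything in between
    """
    fastest = min(times)
    labels = []
    for t in times:
        if t <= fastest * near:
            labels.append("green")
        elif t > fastest * far:
            labels.append("red")
        else:
            labels.append("plain")
    return labels


# TFFT row, in column order:
# Absoft, FTN95, FTN95.NET, G95, Intel, Lahey, NAG, PGI.
tfft = [6.97, 8.23, 7.58, 7.36, 7.28, 7.21, 7.48, 7.51]
for compiler, label in zip(
    ["Absoft", "FTN95", "FTN95.NET", "G95", "Intel", "Lahey", "NAG", "PGI"],
    classify_row(tfft),
):
    print(f"{compiler:10s} {label}")
```

Under this reading, every compiler except FTN95 falls in the green band on TFFT, while on AC only Intel does and all the others are red.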

Thanks are due to Jos Bergervoet for permission to use his CAPACITA benchmark, to Quetzal Associates for permission to use their CHANNEL, FATIGUE, GAS_DYN, INDUCT, PROTEIN and RNFLOW benchmarks, to David Frank for his TEST_FPU benchmark, and to Ted Addison of McVehil-Monnett Associates for permission to use AERMOD, an air quality model used by the US Environmental Protection Agency.

All the benchmarks have been modified slightly to fit into our benchmarking harness.

The NF benchmark uses "nested factorization", a little-known but very effective iterative linear solver for huge finite difference matrices. A paper describing nested factorization, and comparing it to other methods, is available here.

This benchmark comparison was produced by Polyhedron Ltd., and this page is reproduced with permission from Polyhedron Ltd.