
Posts Tagged ‘Fortran’

The Fortran compiler supports several kinds of floating-point exceptions; a summary of their masked (or default) responses is given below:

The -fpen option, where n is 0, 1, or 3, allows some control over the results of floating-point exceptions.

-fpe0 restricts floating-point exceptions as follows:

  • Enables the overflow, the divide-by-zero, and the invalid floating-point exceptions. The program will print an error message and abort if any of these exceptions occurs. If a floating-point underflow occurs, the result is set to zero and execution continues. This is called flush-to-zero. This option sets -IPF_fp_speculationstrict  if no specific -IPF_fp_speculation option is specified.  The -fpe0 option sets -ftz. To get more detailed location information about where the exception occurred, use -traceback.

-fpe1 restricts only floating-point underflow:

  • Floating-point overflow, floating-point divide-by-zero, and floating-point invalid produce exceptional values (NaN and signed Infinities) and execution continues. If a floating-point underflow occurs, the result is set to zero and execution continues. The -fpe1 option sets -ftz.

-fpe3, the default, allows full floating-point exception behavior:

  • Floating overflow, floating divide-by-zero, and floating invalid produce exceptional values (NaN and signed Infinities) and execution continues. Floating underflow is gradual:  denormalized values are produced until the result becomes 0.
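The three modes differ mainly in which values a computation is left holding when execution continues. The following is a minimal sketch, not taken from the compiler documentation, that classifies a result as NaN, infinity, denormal, zero, or normal using the standard Fortran 2003 ieee_arithmetic module. The program name classify_results and the helper report are made up for illustration, and the volatile attribute is only there to discourage the compiler from evaluating the arithmetic at compile time.

```fortran
program classify_results
  use, intrinsic :: ieee_arithmetic
  implicit none
  ! volatile: ask the compiler to do the arithmetic at run time
  real, volatile :: small, big, zero, scale
  real :: r

  small = 1.0e-30
  big   = 1.0e30
  zero  = 0.0
  scale = 1.0e-10

  r = small * scale      ! too small for a normal single-precision value
  call report('small*scale', r)
  r = big * big          ! too large for single precision
  call report('big*big', r)
  r = -big / zero
  call report('-big/zero', r)
  r = zero / zero
  call report('zero/zero', r)

contains

  ! Hypothetical helper: print which class of value x belongs to.
  subroutine report(label, x)
    character(len=*), intent(in) :: label
    real, intent(in) :: x
    character(len=16) :: kind_of_value

    if (ieee_is_nan(x)) then
      kind_of_value = 'NaN'
    else if (.not. ieee_is_finite(x)) then
      kind_of_value = 'infinity'
    else if (ieee_class(x) == ieee_positive_denormal .or. &
             ieee_class(x) == ieee_negative_denormal) then
      kind_of_value = 'denormal'
    else if (x == 0.0) then
      kind_of_value = 'zero'
    else
      kind_of_value = 'normal'
    end if
    print '(a12, a, a)', label, ' -> ', trim(kind_of_value)
  end subroutine report

end program classify_results
```

Going by the descriptions above, the first line would be expected to report a denormal with -fpe3 and a zero with -fpe1, while with -fpe0 the program should abort at the overflow before printing the second line.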


The -fpe option affects the Fortran main program only.  The floating-point exception behavior set by the Fortran main program remains in effect throughout the execution of the entire program unless changed by the programmer. If the main program is not Fortran, the user can use the Fortran intrinsic FOR_SET_FPE to set the floating-point exception behavior.
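FOR_SET_FPE is specific to the Intel run-time library, and its exact arguments are not shown here. As a portable alternative (a different technique, not FOR_SET_FPE itself), the standard Fortran 2003 ieee_exceptions module lets a program turn on halting for individual exceptions at run time. The sketch below enables halting for overflow, divide-by-zero, and invalid, which is roughly the trapping part of -fpe0. It does not flush underflows to zero, and whether halting can be changed at all depends on the compiler and hardware, which is why ieee_support_halting is checked first. The program name enable_halting is made up for illustration.

```fortran
program enable_halting
  use, intrinsic :: ieee_exceptions
  implicit none
  integer :: i
  real, volatile :: big
  real :: r

  ! ieee_usual is the standard named constant array holding
  ! ieee_overflow, ieee_divide_by_zero and ieee_invalid.
  do i = 1, size(ieee_usual)
    if (ieee_support_halting(ieee_usual(i))) then
      call ieee_set_halting_mode(ieee_usual(i), .true.)
    else
      print *, 'halting cannot be controlled for exception number', i
    end if
  end do

  big = 1.0e30
  r = big * big   ! overflow: the program is expected to stop here if halting is on
  print *, 'only reached if halting on overflow is not active: ', r
end program enable_halting
```

The halting mode set this way stays in effect for the rest of the run unless it is changed again, much like the behavior described above for the setting made by the Fortran main program.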

When compiling different routines in a program separately, use the same value of n for -fpen in every compilation.

An example follows:

      IMPLICIT NONE
      real*4 res_uflow, res_oflow
      real*4 res_dbyz, res_inv
      real*4 small, big, zero, scale

      small = 1.0e-30
      big   = 1.0e30
      zero  = 0.0
      scale = 1.0e-10

!      IEEE underflow condition (Underflow Raised)
      res_uflow = small * scale
      write(6,100)"Underflow: ",small, " *", scale, " = ", res_uflow

!      IEEE overflow condition (Overflow Raised)
      res_oflow = big * big
      write(6,100)"Overflow: ", big, " *", big, " = ", res_oflow

!      IEEE divide-by-zero condition (Divide by Zero Raised)
      res_dbyz = -big / zero
      write(6,100)"Div-by-zero: ", -big, " /", zero, " = ", res_dbyz

!      IEEE invalid condition (Invalid Raised)
      res_inv = zero / zero
      write(6,100)"Invalid: ", zero, " /", zero, " = ", res_inv

100   format(A14,E8.1,A2,E8.1,A2,E10.1)
      end

Consider the following command line:

$ ifort fpe.f -fpe0 -g

The following output is produced:

$ ./a.out
Underflow:  0.1E-29 * 0.1E-09 =   0.0E+00
forrtl: error (72): floating overflow
Image       PC        Routine     Line        Source
a.out       0804A063  Unknown     Unknown  Unknown
a.out       08049E78  Unknown     Unknown  Unknown
Unknown     B746B748  Unknown     Unknown  Unknown
a.out       08049D31  Unknown     Unknown  Unknown
Aborted

The following command line uses -fpe1:

$ ifort fpe.f -fpe1 -g

The following output is produced:

$ ./a.out
Underflow:  0.1E-29 * 0.1E-09 =   0.0E+00
Overflow:  0.1E+31 * 0.1E+31 = Infinity
Div-by-zero: -0.1E+31 / 0.0E+00 = -Infinity
Invalid:  0.0E+00 / 0.0E+00 = NaN

The following command line uses -fpe3:

$ ifort fpe.f -fpe3 -g

The following output is produced:

$ ./a.out
Underflow:  0.1E-29 * 0.1E-09 =   0.1E-39
Overflow:  0.1E+31 * 0.1E+31 = Infinity
Div-by-zero: -0.1E+31 / 0.0E+00 = -Infinity
Invalid:  0.0E+00 / 0.0E+00 = NaN
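With -fpe1 and -fpe3 the program keeps running, so an exception shows up only through the value that gets printed. Another way to notice it is to query the IEEE exception flags directly. The following is a minimal sketch using the standard Fortran 2003 ieee_exceptions module. It is not part of the original example, and the program name check_flags is made up. The flags are read in the main program rather than in a helper routine, because the standard lets a processor quiet the flags on entry to a procedure.

```fortran
program check_flags
  use, intrinsic :: ieee_exceptions
  implicit none
  ! volatile: ask the compiler to do the arithmetic at run time
  real, volatile :: small, big, zero, scale, r
  logical :: raised

  small = 1.0e-30
  big   = 1.0e30
  zero  = 0.0
  scale = 1.0e-10

  ! Each step: clear the flag, do the operation, then read the flag.
  call ieee_set_flag(ieee_underflow, .false.)
  r = small * scale
  call ieee_get_flag(ieee_underflow, raised)
  print *, 'underflow raised:      ', raised

  call ieee_set_flag(ieee_overflow, .false.)
  r = big * big
  call ieee_get_flag(ieee_overflow, raised)
  print *, 'overflow raised:       ', raised

  call ieee_set_flag(ieee_divide_by_zero, .false.)
  r = -big / zero
  call ieee_get_flag(ieee_divide_by_zero, raised)
  print *, 'divide-by-zero raised: ', raised

  call ieee_set_flag(ieee_invalid, .false.)
  r = zero / zero
  call ieee_get_flag(ieee_invalid, raised)
  print *, 'invalid raised:        ', raised
end program check_flags
```

This is only useful when the exceptions are not trapped: built with -fpe0, the program would be expected to abort at the overflow, just as in the first run above.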


Debug
-g Compile with debugging symbols (does not affect optimization level in the Intel compilers).
-traceback (ifort only) produce traceback information at runtime if the code terminates abnormally.
Optimization
-O0 Turn off optimizer.
-O[1-3] Turn on optimizer, level 1-3 aggressiveness (default is -O2; O1 and O2 are equivalent).
-no-prefetch Disable software prefetching (prefetching is turned on by default in ifort with -O3).
-scalar_rep (ifort only) Enable scalar replacement (scalar replacement is turned off by default in ifort with -O3).
-march=pentium4 (icc only) Generate code exclusively for a Pentium 4 or Xeon processor (employ SSE2 scalar instructions).
-unroll[n] Set maximum number of times to unroll loops. Omit n to use default heuristics. Use n=0 to disable loop unroller.
-xN Utilize SSE2 instructions (works only with Pentium 4 and higher); turns on auto-vectorizer.
-axN Same as above, but build “fat” executable that provides alternate instructions for non-P4 architecture.
-fno-alias Assume no pointer aliasing in program.
-fno-fnalias Assume no aliasing within functions, but assume aliasing across function calls.
-ansi_alias Assume ANSI compliance for optimization purposes (no out of bounds references, no casting of pointers to non-pointer types, no aliasing of objects with different types).
-alias_args Assume function arguments are not aliased.
-safe_cray_ptr (ifort only) Cray pointers do not alias with other variables.
-ip Enable single-file interprocedural optimizations.
Profile-Guided Optimization
-prof_gen Instrument program for profile-guided optimization.
-prof_use Enable use of PGO instrumentation during optimization (build with -prof_use after you have collected stats with a binary built with -prof_gen).
Layout
-pad (ifort only) Enable changing variable and array memory layout.
Floating Point
-fp-model precise Enables value-safe optimizations on floating-point data. Disables optimizations that can change the result of floating-point calculations.
-nolib-inline Disables inline expansion of standard library or intrinsic functions. This flag is sometimes required to maintain floating-point consistency between -O0 and -O2 builds.
-pc32, -pc64 (default), -pc80 Set precision of the x87 FPU to 32 bits, 64 bits, or 80 bits by modifying the FPU control word. The default, -pc64, gives a 53-bit significand, or approximately 16 significant digits. Note that this can only be set for a program that includes a main() function or subroutine.
-fpe[0,1] (ifort only) Set floating point exception handler to flush underflows to zero and (-fpe0) error out on other exceptions or (-fpe1) continue processing after other exceptions.
OpenMP
-openmp Turns on OpenMP.
-openmp_stubs Ignore OpenMP directives and functions by using stubs library.
Variable Treatment
-auto (ifort only) Make all local variables AUTOMATIC (default when using -openmp).
-auto_scalar (ifort only) Make only scalar local variables AUTOMATIC (default when not using -openmp).
-fpscomp logicals (ifort only) Very important if calling MPI subroutines that use logical arguments such as MPI_CART_SUB. Set internal representation of logical variables to use 1 for .TRUE. and 0 for .FALSE. instead of default.
Compiler Messages
-w Turn off warning messages.
-w95 (ifort only) Turn off warnings about Fortran 90/95 compliance.
-cm (ifort only) Turn off all comment messages.
-vec_report2 Indicate vectorized/non-vectorized loops.
-opt_report Generate an optimization report to stderr.
Preprocessing
-fpp[0-3] (ifort only) Use Fortran preprocessor with level 0, 1, 2, or 3.
Linking
-static Create static executable.
-shared Create a shared object (shared library) rather than an executable. Without -static, executables are linked against shared system libraries by default.
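To see what the -fpscomp logicals entry in the table above changes, it can help to look at the integer bit pattern behind a logical value. The following is a small sketch that uses the standard transfer intrinsic to reinterpret a default logical as a default integer. The program name logical_repr is made up, and the values printed depend on the compiler and flags. As far as I know, ifort stores .TRUE. as -1 by default and as 1 with -fpscomp logicals, but treat that as an assumption to check against your own compiler.

```fortran
program logical_repr
  implicit none
  logical :: t, f

  t = .true.
  f = .false.

  ! transfer copies the bits of the logical into a default integer
  print *, '.true.  is stored as ', transfer(t, 0)
  print *, '.false. is stored as ', transfer(f, 0)
end program logical_repr
```

Building it once with and once without -fpscomp logicals and comparing the two outputs shows whether the representation matches what a C-based library such as MPI expects.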

Typical Intel C debugging command line example:

$icc -g -O0 -oa.out programname.c

Typical Intel Fortran optimized command line example:

$ifort -O3 -xN -ip -fno-alias -safe_cray_ptr -oa.out programname.F

I personally use the following options for compilation:

$ifort -c -w -cm -tpp7 -save programname.F

and to create an executable:

$ifort -arW -align -w -cm -tpp7 -save programname.F -o programname.x

See also https://computing.llnl.gov/linux/linux_basics.html and http://scv.bu.edu/computation/linuxcluster/manpages/ifort.html
