Function inlining replaces a call to a function or a subroutine with the body of the function or subroutine. This can speed up execution by eliminating parameter passing and function/subroutine call and return overhead. It also allows the compiler to optimize the function with the rest of the code. Note that using function inlining indiscriminately can result in much larger code size and no increase in execution speed.
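The effect described above can be pictured with a small hand-written sketch in C. It is not compiler output, and the names scale_add, sum_with_call and sum_inlined are hypothetical, chosen only for this illustration. The second function is what the first would look like if the call to scale_add were replaced by its body.

```c
#include <assert.h>
#include <stddef.h>

/* Small helper of the kind that is a typical inlining candidate. */
static long scale_add(long a, long x, long y) {
    return a * x + y;
}

/* Explicit call: every iteration passes three arguments and pays
   call/return overhead unless the compiler inlines scale_add. */
long sum_with_call(const long *x, const long *y, size_t n, long a) {
    assert(x != NULL && y != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scale_add(a, x[i], y[i]);
    return total;
}

/* Hand-inlined equivalent: the call is replaced by the helper's body,
   so the loop can be optimized together with the surrounding code. */
long sum_inlined(const long *x, const long *y, size_t n, long a) {
    assert(x != NULL && y != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a * x[i] + y[i];
    return total;
}
```

Both functions return the same value for the same inputs; only the call overhead and the optimization opportunities differ. Each call site that is inlined also receives its own copy of the body, which is where the growth in code size comes from.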
The PGI compilers provide two categories of inlining:
Automatic inlining - During the compilation process a hidden pass precedes the compilation pass. This hidden pass extracts functions that are candidates for inlining. The inlining of functions occurs as the source files are compiled.
Inline libraries - You create inline libraries, for example using the pgf90 command and the -Mextract and -o options. There is no hidden extract pass, but you must ensure that any files that depend on the inline library use the latest version of the inline library.
There are important restrictions on inlining. Inlining only applies to certain types of functions. Refer to Section 4.5, Restrictions on Inlining, at the end of this chapter for more details on function inlining limitations.
To invoke the function inliner, use the -Minline option. If you do not specify an inline library, the compiler performs a special prepass on all source files named on the compiler command line before it compiles any of them. This pass extracts functions that meet the requirements for inlining and puts them in a temporary inline library for use by the compilation pass.
Several -Minline options allow you to determine the selection criteria for functions to be inlined. These selection criteria include:
If you specify both a function name and a size n, the compiler inlines functions that match the function name or have n or fewer statements.
Note: if a keyword name:, lib: or size: is omitted, then a name with a period is assumed to be an inline library, a number is assumed to be a size, and a name without a period is assumed to be a function name.
In the following example, the compiler inlines functions with approximately 100 or fewer statements in the source file myprog.f and writes the executable code in the default output file a.out.
$ pgf90 -Minline=size:100 myprog.f
Refer to Chapter 7, Command-line Options, for more information on the -Minline options.
If you specify one or more inline libraries on the command line with the -Minline option, the compiler does not perform an initial extract pass. The compiler selects functions to inline from the specified inline library. If you also specify a size or function name, all functions in the inline library meeting the selection criteria are selected for inline expansion at points in the source text where they are called.
If you do not specify a function name or a size limitation for the -Minline option, the compiler inlines every function in the inline library that matches a function in the source text.
In the following example, the compiler inlines the function proc from the inline library lib.il and writes the executable code in the default output file a.out.
$ pgf90 -Minline=name:proc,lib:lib.il myprog.f
The following command line is equivalent to the line above. The only difference is that the name: and lib: inline keywords are not used. The keywords are provided so you can avoid name conflicts if you use an inline library name that does not contain a period. Without the keywords, a period tells the compiler that the name on the command line is an inline library.
$ pgf90 -Minline=proc,lib.il myprog.f
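The rule for -Minline arguments given without keywords can be sketched as a small classifier in C. This is only an illustration of the rule as stated in the text, not the compiler's actual parser; the function classify_inline_arg and the enumeration names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical categories for a keyword-less -Minline argument. */
typedef enum { ARG_LIBRARY, ARG_SIZE, ARG_FUNCTION } inline_arg_kind;

/* Classify one comma-separated argument according to the stated rule:
   a name with a period is an inline library, a number is a size,
   and a name without a period is a function name. */
inline_arg_kind classify_inline_arg(const char *arg) {
    assert(arg != NULL && arg[0] != '\0');

    /* Any period marks the argument as an inline library. */
    if (strchr(arg, '.') != NULL)
        return ARG_LIBRARY;

    /* An argument made up only of digits is a size. */
    const char *p = arg;
    while (*p != '\0' && isdigit((unsigned char)*p))
        p++;
    if (*p == '\0')
        return ARG_SIZE;

    /* Everything else is taken to be a function name. */
    return ARG_FUNCTION;
}
```

Under this sketch, lib.il is classified as a library, 100 as a size, and proc as a function name. A library whose name has no period would be classified as a function name, which is the conflict the lib: keyword is meant to avoid.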
You can create or update an inline library using the -Mextract command-line option. If you do not specify selection criteria along with the -Mextract option, the compiler attempts to extract all subprograms.
When you use the -Mextract option, only the extract phase is performed; the compile and link phases are not performed. The output of an extract pass is a library of functions available for inlining. It is placed in the inline library file specified on the command line with the -o filename specification. If the library file exists, new information is appended to it. If the file does not exist, it is created.
You can use the -Minline option with the -Mextract option. In this case, the extracted library of functions can have other functions inlined into the library. Using both options enables you to obtain more than one level of inlining. In this situation, if you do not specify a library with the -Minline option, the inline process consists of two extract passes. The first pass is a hidden pass implied by the -Minline option, during which the compiler extracts functions and places them into a temporary library. The second pass uses the results of the first pass but puts its results into the library that you specify with the -o option.
An inline library is implemented as a directory with each inline function in the library stored as a file using an encoded form of the inlinable function.
A special file named TOC in the inline library directory serves as a table of contents for the inline library. This printable ASCII file can be examined for information about the library contents, such as the names and sizes of functions, the source file from which they were extracted, and the version number of the extractor that created each entry.
Libraries and their elements can be manipulated using ordinary system commands.
When a library is created or updated using one of the PGI compilers, the last-change date of the library directory is updated. This allows a library to be listed as a dependence in a makefile (and ensures that the necessary compilations will be performed when a library is changed).
If you use inline libraries, you need to be certain that they remain up to date with the source files they are inlined into. One way to ensure that inline libraries are updated is to include them in your makefiles.
The makefile fragment shown in Example 4-1 assumes that the file utils.f contains a number of small functions that are used in the files parser.f and alloc.f. The makefile also maintains the inline library utils.il. Note that the makefile updates the library whenever you change utils.f or one of the include files it uses. In turn, the makefile compiles parser.f and alloc.f whenever you update the library.
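# Source directory, Fortran compiler, and common flags
SRC = mydir
FC = pgf90
FFLAGS = -O2
main.o: $(SRC)/main.f $(SRC)/global.h
	$(FC) $(FFLAGS) -c $(SRC)/main.f
utils.o: $(SRC)/utils.f $(SRC)/global.h $(SRC)/utils.h
	$(FC) $(FFLAGS) -c $(SRC)/utils.f
# Rebuild the inline library whenever utils.f or one of its include files changes
utils.il: $(SRC)/utils.f $(SRC)/global.h $(SRC)/utils.h
	$(FC) $(FFLAGS) -Mextract=15 -o utils.il $(SRC)/utils.f
# parser.o and alloc.o depend on utils.il, so they are recompiled when the library is updated
parser.o: $(SRC)/parser.f $(SRC)/global.h utils.il
	$(FC) $(FFLAGS) -Minline=utils.il -c $(SRC)/parser.f
alloc.o: $(SRC)/alloc.f $(SRC)/global.h utils.il
	$(FC) $(FFLAGS) -Minline=utils.il -c $(SRC)/alloc.f
# Link the final executable
myprog: main.o utils.o parser.o alloc.o
	$(FC) -o myprog main.o utils.o parser.o alloc.o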
You can use the -Minfo=inline option to request inlining information from the compiler when you invoke the inliner. For example:
$ pgf90 -Minline=mylib.il -Minfo=inline myext.f
Assume the program dhry consists of a single source file dhry.f. The following command line builds an executable file for dhry in which proc7 is inlined wherever it is called:
$ pgf90 dhry.f -Minline=proc7
The following command lines build an executable file for dhry in which proc7 plus any functions of approximately 10 or fewer statements are inlined (one level only). Note that the specified functions are inlined only if they have previously been placed in the inline library, temp.il, during the extract phase.
$ pgf90 dhry.f -Mextract -o temp.il
$ pgf90 dhry.f -Minline=10,proc7,temp.il
Assume the program fibo.f contains a single function fibo that calls itself recursively. The following command line creates the file fibo.o in which fibo is inlined into itself:
$ pgf90 fibo.f -c -Mrecursive -Minline=fibo
Because this version of fibo recurses only half as deeply, it executes noticeably faster.
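The halving of the recursion depth can be seen in a hand-written C sketch. It assumes the usual two-term Fibonacci recurrence, which the text does not spell out, and it is not what the compiler generates. The names fibo_inlined and max_depth_of and the depth counters are hypothetical bookkeeping added only for this illustration.

```c
#include <assert.h>

/* Depth bookkeeping, present only to make the effect observable. */
static int cur_depth = 0;
static int max_depth = 0;

static void enter(void) { if (++cur_depth > max_depth) max_depth = cur_depth; }
static void leave(void) { --cur_depth; }

/* Plain recursive version. */
long fibo(int n) {
    assert(n >= 0);
    enter();
    long r = (n < 2) ? n : fibo(n - 1) + fibo(n - 2);
    leave();
    return r;
}

/* fibo with one level of itself expanded in place: each remaining
   call skips a level, so the call chain is about half as deep. */
long fibo_inlined(int n) {
    assert(n >= 0);
    enter();
    long r;
    if (n < 2) {
        r = n;
    } else {
        /* body of fibo(n - 1), expanded in place */
        long a = (n - 1 < 2) ? (n - 1)
                             : fibo_inlined(n - 2) + fibo_inlined(n - 3);
        /* body of fibo(n - 2), expanded in place */
        long b = (n - 2 < 2) ? (n - 2)
                             : fibo_inlined(n - 3) + fibo_inlined(n - 4);
        r = a + b;
    }
    leave();
    return r;
}

/* Run f(n) from a clean state and report the deepest call chain seen. */
int max_depth_of(long (*f)(int), int n) {
    cur_depth = 0;
    max_depth = 0;
    (void)f(n);
    return max_depth;
}
```

Both versions compute the same values, but the expanded version reaches a smaller maximum call depth because each call it makes steps down by at least two.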
Using the same source file dhry.f, the following example builds an executable for dhry in which all functions of roughly ten or fewer statements are inlined. Two levels of inlining are performed. This means that if function A calls function B, and B calls C, and both B and C are inlinable, then the version of B which is inlined into A will have had C inlined into it.
$ pgf90 dhry.f -Minline=size:10,levels:2
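The difference between one and two levels of inlining can be sketched by hand in C. The functions a_fn, b_fn and c_fn stand in for A, B and C in the description above; their bodies and the remaining names are hypothetical, and the expanded versions are written manually rather than produced by the compiler.

```c
#include <assert.h>

/* C: the innermost function. */
static int c_fn(int x) { return x + 1; }

/* B calls C. */
static int b_fn(int x) { return 2 * c_fn(x); }

/* A calls B, as written in the source. */
int a_fn(int x) { return b_fn(x) - 3; }

/* One level of inlining: B's body replaces the call in A,
   but the call to C inside that body remains. */
int a_one_level(int x) { return 2 * c_fn(x) - 3; }

/* Two levels of inlining: the copy of B placed into A has
   already had C inlined into it, so no calls remain. */
int a_two_levels(int x) { return 2 * (x + 1) - 3; }

/* Sanity check that all three forms compute the same result. */
void check_levels(int x) {
    assert(a_fn(x) == a_one_level(x));
    assert(a_one_level(x) == a_two_levels(x));
}
```

All three forms of A are equivalent in result; they differ only in how many calls survive at run time.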
The following Fortran subprograms cannot be extracted:
A Fortran subprogram is not inlined if any of the following applies:
The following types of C and C++ functions cannot be inlined:
Certain C/C++ functions can only be inlined into the file which contains their definition: