A variety of performance analysis tools are available
for examining the execution of a program:
- elapsed time: Measuring the CPU time and wall time of a program is
the most important part of performance analysis. Unfortunately,
these totals alone do not provide much insight into how performance
can be improved. On Unix systems they can be measured with the time
command.
- event counting: This is a generalization of elapsed time
measurement. Instead of counting only clock ticks, some
architectures make it possible to count other events, such as the
number of floating-point operations or the number of cache misses.
The PerfAPI project has designed a general API for all
architectures that support performance monitoring.
- function call analysis: This usually involves recompiling the
program with flags that cause instrumentation code to be inserted
around each function. That code records the time spent in each
function and the function call graph. This method is quite useful
and is available on many Unix systems. See the manual page
for the gprof command for more information.
- statistical event sampling: This method causes periodic events to
result in interrupts that record the current program counter.
These program counters are then converted to the source code line
numbers and the data is tabulated in a human readable form. Since
this method would be too invasive if every event caused an
interrupt, only every nth event causes an interrupt, and thus the
method is not exact but statistical. Most programs spend a
significant fraction of time in a few lines of code, so these
statistical methods are usually accurate enough. Many Unix
systems implement this functionality with the prof command. Most
commonly, the events that are sampled are clock ticks. The
PerfAPI project
has developed a uniform interface to low level drivers that can
statistically sample other events such as cache misses and flops
on certain architectures. For Intel P6 and AMD K7 processors
running Linux, the perfctr package can be used to sample a wide
range of events.
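As a rough illustration of the elapsed time item above, the following
Python sketch measures both wall time and CPU time around a single call.
The helper name measure and the workload busy_sum are hypothetical, chosen
only for this example. The time command reports the same kind of
quantities for a whole process.

```python
import time


def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call of func."""
    wall_start = time.perf_counter()   # monotonic wall-clock timer
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def busy_sum(n):
    """Hypothetical CPU-bound workload used only for illustration."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    _, wall, cpu = measure(busy_sum, 1_000_000)
    print(f"wall {wall:.3f} s, cpu {cpu:.3f} s")
```

A large gap between the two numbers usually means the program spent its
time waiting (I/O, sleeping, other processes) rather than computing, which
is about as much as this kind of measurement can reveal.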
The visual profiler falls into the statistical event sampling category of
profiling tools. It can sample clock ticks using the profil system call, or
it can sample a wide range of system events using the PerfAPI (PAPI)
package. Note that the PAPI software does not perform true statistical
event sampling on all platforms. On platforms where it does not, results
for sampling events other than time will be inaccurate. If an Intel P6 or
AMD K7 processor running Linux is used, then the perfctr package can be
used to accurately sample events other than clock ticks.
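The clock tick sampling idea can be sketched in a few lines of Python.
This is not how the visual profiler itself is implemented. It is a minimal
illustration, assuming a Unix system and code running in the main thread,
that uses the ITIMER_PROF timer from the standard signal module in place of
the profil system call. The names sample_profile and hot_loop are
hypothetical.

```python
import collections
import signal


def sample_profile(func, *args, interval=0.001):
    """Run func(*args); return (result, Counter of (function, line) samples).

    Assumes Unix and the main thread. A SIGPROF signal is raised every
    `interval` seconds of CPU time consumed by the process.
    """
    samples = collections.Counter()

    def on_tick(signum, frame):
        # Python hands the handler the frame that was executing, so samples
        # are attributed to source lines rather than raw program counters.
        if frame is not None:
            samples[(frame.f_code.co_name, frame.f_lineno)] += 1

    old_handler = signal.signal(signal.SIGPROF, on_tick)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        result = func(*args)
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0)      # disarm the timer first
        signal.signal(signal.SIGPROF, old_handler)   # then restore the handler
    return result, samples


def hot_loop(n):
    """Hypothetical workload: nearly all time is spent in the loop body."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    _, samples = sample_profile(hot_loop, 3_000_000)
    for (name, line), count in samples.most_common(5):
        print(f"{count:6d}  {name}:{line}")
```

Because only one tick per interval is recorded, the counts are estimates.
Lines that account for a large share of the run time collect many samples,
while rarely executed lines may collect none at all.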