SPECint

SPECint is a computer benchmark specification for CPU integer processing power. It is maintained by the Standard Performance Evaluation Corporation (SPEC). SPECint is the integer performance testing component of the SPEC test suite. The first SPEC test suite, CPU92, was announced in 1992. It was followed by CPU95, CPU2000, and CPU2006. The latest standard of SPECint is CINT2006 (also known as SPECint2006).

CPU2006 is a set of benchmarks designed to test the CPU performance of a modern server computer system. It is split into two components: CINT2006 for integer testing and CFP2006 (SPECfp) for floating-point testing.

SPEC defines a reference runtime for each of the 12 benchmark programs. For SPECint2006, these reference times are measured in thousands of seconds. Each benchmark is timed on the system under test, and the reference time is divided by the measured time to give a ratio. That ratio becomes the SPECint score for that benchmark. (This differs from the rating in SPECint2000, which multiplies the ratio by 100.)

As an example for SPECint2006, consider a processor that can run 400.perlbench in 2000 seconds. The reference machine takes 9770 seconds to run the same benchmark, so the ratio is 9770 / 2000 = 4.885. A ratio is computed in this way for each benchmark, and the geometric mean of those ratios gives the overall value.
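The calculation described above can be sketched in a few lines of Python. This is an illustration only, not SPEC's own tooling: the function names `spec_ratio` and `geometric_mean` are made up for this sketch, and every ratio other than the 400.perlbench figure is a placeholder rather than a measured result.

```python
from math import prod


def spec_ratio(reference_seconds, measured_seconds):
    """Reference time divided by measured time; a larger ratio means a faster system."""
    return reference_seconds / measured_seconds


def geometric_mean(ratios):
    """The n-th root of the product of n ratios."""
    return prod(ratios) ** (1.0 / len(ratios))


# The 400.perlbench example from the text: 9770 s reference, 2000 s measured.
perlbench_ratio = spec_ratio(9770, 2000)

# Placeholder ratios for two other benchmarks (hypothetical, not measurements).
example_ratios = [perlbench_ratio, 4.0, 6.0]
overall = geometric_mean(example_ratios)

print(f"400.perlbench ratio: {perlbench_ratio:.3f}")
print(f"overall (geometric mean of {len(example_ratios)} ratios): {overall:.3f}")
```

A real SPECint2006 result would take the geometric mean over all 12 benchmark ratios. Because the geometric mean is used, doubling one benchmark's ratio scales the overall value by the same factor regardless of which benchmark it is.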

For a fee, SPEC distributes source code files to users wanting to test their systems. These files are written in a standard programming language and are then compiled for each particular CPU architecture and operating system. Thus, the performance measured is that of the CPU, RAM, and compiler; I/O, networking, and graphics are not tested.

Two metrics are reported for a particular benchmark, "base" and "peak". Compiler options account for the difference between the two numbers. As the SPEC benchmarks are distributed as source code, it is up to the party performing the test to compile this code. There is agreement that the benchmarks should be compiled in the same way as a user would compile a program, but there is no consistent method for user compilation; it varies from system to system. SPEC therefore defines the two reference points. Base has a stricter set of compilation rules than peak: less optimization can be done, the compiler flags must be the same for each benchmark and in the same order, and only a limited number of flags may be used. Base, then, is closest to how a user would compile a program with standard flags. The peak metric can be obtained with maximum compiler optimization, even to the extent of different optimizations for each benchmark. This number represents maximum system performance, achieved by full compiler optimization.
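One of the base rules, that every benchmark is compiled with the same flags in the same order, is simple enough to express as a check. The sketch below is hypothetical: `uses_uniform_flags` is not part of any SPEC tool, the flag lists are illustrative GCC-style options rather than a recommended configuration, and the limit on the number of flags is left out because the text does not say what it is.

```python
from typing import Mapping, Sequence


def uses_uniform_flags(flags_by_benchmark: Mapping[str, Sequence[str]]) -> bool:
    """Return True if every benchmark has the identical flag list, in the same order."""
    flag_lists = [list(flags) for flags in flags_by_benchmark.values()]
    return all(flags == flag_lists[0] for flags in flag_lists)


# Base-style configuration: one flag list shared by every benchmark (illustrative).
base_flags = {
    "400.perlbench": ["-O2"],
    "401.bzip2": ["-O2"],
    "403.gcc": ["-O2"],
}

# Peak-style configuration: flags tuned separately per benchmark (illustrative).
peak_flags = {
    "400.perlbench": ["-O3", "-funroll-loops"],
    "401.bzip2": ["-O3"],
    "403.gcc": ["-O2"],
}

print(uses_uniform_flags(base_flags))  # True
print(uses_uniform_flags(peak_flags))  # False
```

Passing this check is necessary but not sufficient for a base run under the rules described above, since the rules also restrict how much optimization is allowed and how many flags may be used.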

...
Wikipedia