Diffstat (limited to 'zpu/roadshow/roadshow/dhrystone/VARIATIONS')
-rw-r--r-- zpu/roadshow/roadshow/dhrystone/VARIATIONS 157
1 files changed, 157 insertions, 0 deletions
diff --git a/zpu/roadshow/roadshow/dhrystone/VARIATIONS b/zpu/roadshow/roadshow/dhrystone/VARIATIONS
new file mode 100644
index 0000000..3046cbd
--- /dev/null
+++ b/zpu/roadshow/roadshow/dhrystone/VARIATIONS
@@ -0,0 +1,157 @@
+
+ Understanding Variations in Dhrystone Performance
+
+
+
+ By Reinhold P. Weicker, Siemens AG, AUT E 51, Erlangen
+
+
+
+ April 1989
+
+
+ This article has appeared in:
+
+
+ Microprocessor Report, May 1989 (Editor: M. Slater), pp. 16-17
+
+
+
+
+Microprocessor manufacturers tend to credit all the performance measured by
+benchmarks to the speed of their processors; they often don't even mention the
+programming language and compiler used. Their detailed documents, usually
+called "performance brief" or "performance report," generally do give more
+details. However, these details are often lost in the press releases and other
+marketing statements. For serious performance evaluation, it is necessary to
+study the code generated by the various compilers.
+
+Dhrystone was originally published in Ada (Communications of the ACM, Oct.
+1984). However, since good Ada compilers were rare at that time and, together
+with UNIX, C became more and more popular, the C version of Dhrystone is the
+one now mainly used in industry. There are "official" versions 2.1 for Ada,
+Pascal, and C, which are as close together as the languages' semantic
+differences permit.
+
+Dhrystone contains two statements where the programming language and its
+translation play a major part in the execution time measured by the benchmark:
+
+ o String assignment (in procedure Proc_0 / main)
+ o String comparison (in function Func_2)
+
+In Ada and Pascal, strings are arrays of characters where the length of the
+string is part of the type information known at compile time. In C, strings
+are also arrays of characters, but there are no operators defined in the
+language for assignment and comparison of strings. Instead, functions
+"strcpy" and "strcmp" are used. These functions are defined for strings of
+arbitrary length, and make use of the fact that strings in C have to end with
+a terminating null byte. For general-purpose calls to these functions, the
+implementor can assume nothing about the length and the alignment of the
+strings involved.
+
+The C version of Dhrystone spends a relatively large amount of time in these
+two functions. Some time ago, I made measurements on a VAX 11/785 with the
+Berkeley UNIX (4.2) compilers (often-used compilers, but certainly not the
+most advanced). In the C version, 23% of the time was spent in the string
+functions; in the Pascal version, only 10%. On good RISC machines (where less
+time is spent in the procedure calling sequence than on a VAX) and with better
+optimizing compilers, the percentage is higher; MIPS has reported 34% for an
+R3000. Because of this effect, Pascal and Ada Dhrystone results are usually
+better than C results (except when the optimization quality of the C compiler
+is considerably better than that of the other compilers).
+
+Several people have noted that the string operations are over-represented in
+Dhrystone, mainly because the strings occurring in Dhrystone are longer than
+average strings. I admit that this is true, and have said so in my SIGPLAN
+Notices paper (Aug. 1988); however, I didn't want to generate confusion by
+changing the string lengths from version 1 to version 2.
+
+Even if they are somewhat over-represented in Dhrystone, string operations are
+frequent enough that it makes sense to implement them in the most efficient
+way possible, not only for benchmarking purposes. This means that they can
+and should be written in assembly language code. ANSI C also explicitly allows
+the string functions to be implemented as macros, i.e. by inline code.
+
+There is also a third way to speed up the "strcpy" statement in Dhrystone: For
+this particular "strcpy" statement, the source of the assignment is a string
+constant. Therefore, in contrast to calls to "strcpy" in the general case, the
+compiler knows the length and alignment of the strings involved at compile
+time and can generate code in the same efficient way as a Pascal compiler
+(word instructions instead of byte instructions).
+
+This is not allowed in the case of the "strcmp" call: Here, the addresses are
+formal procedure parameters, and no assumptions can be made about the length
+or alignment of the strings. Any such assumptions would indicate an incorrect
+implementation. They might work for Dhrystone, where the strings are in fact
+word-aligned with typical compilers, but other programs would deliver
+incorrect results.
+
+So, for an apples-to-apples comparison between processors, and not between
+several possible (legal or illegal) degrees of compiler optimization, one
+should check that the systems are comparable with respect to the following
+three points:
+
+ (1) String functions in assembly language vs. in C
+
+ Frequently used functions such as the string functions can and should be
+ written in assembly language, and all serious C language systems known
+ to me do this. (I list this point for completeness only.) Note that
+ processors with an instruction that checks a word for a null byte (such
+ as AMD's 29000 and Intel's 80960) have an advantage here. (This
+ advantage decreases relatively if optimization (3) is applied.) Due to
+ the length of the strings involved in Dhrystone, this advantage may be
+ considered disproportionately large, but it is certainly legal to use
+ such instructions - after all, these situations are what they were
+ invented for.
+
+ (2) String function code inline vs. as library functions.
+
+ ANSI C has created a new situation, compared with the older
+ Kernighan/Ritchie C. In the original C, the definitions of the string
+ functions were not part of the language. Now they are, and inlining is
+ explicitly allowed. I probably should have stated more clearly in my
+ SIGPLAN Notices paper that the rule "No procedure inlining for
+ Dhrystone" referred to the user level procedures only and not to the
+ library routines.
+
+ (3) Fixed-length and alignment assumptions for the strings
+
+ Compilers should be allowed to optimize in these cases if (and only if)
+ it is safe to do so. For Dhrystone, this is the "strcpy" statement, but
+ not the "strcmp" statement (unless, of course, the "strcmp" code
+ explicitly checks the alignment at execution time and branches
+ accordingly). A "Dhrystone switch" for the compiler that causes the
+ generation of code that may not work under certain circumstances is
+ certainly inappropriate for comparisons. It has been reported on Usenet
+ that some C compilers provide such a compiler option; since I don't have
+ access to all C compilers involved, I cannot verify this.
+
+ If the fixed-length and word-alignment assumption can be used, a wide
+ bus that permits fast multi-word load instructions certainly does help;
+ however, this fact by itself should not make a really big difference.
+
+A check of these points - something that is necessary for a thorough
+evaluation and comparison of the Dhrystone performance claims - requires
+object code listings as well as listings for the string functions (strcpy,
+strcmp) that are possibly called by the program.
+
+I don't pretend that Dhrystone is a perfect tool to measure the integer
+performance of microprocessors. The more it is used and discussed, the more I
+myself learn about aspects that I hadn't noticed yet when I wrote the program.
+And of course, the very success of a benchmark program is a danger in that
+people may tune their compilers and/or hardware to it, and with this action
+make it less useful.
+
+Whetstone and Linpack have their critical points also: The Whetstone rating
+depends heavily on the speed of the mathematical functions (sine, sqrt, ...),
+and Linpack is sensitive to data alignment for some cache configurations.
+
+Introduction of a standard set of public domain benchmark software (something
+the SPEC effort attempts) is certainly a worthwhile thing. In the meantime,
+people will continue to use whatever is available and widely distributed, and
+Dhrystone ratings are probably still better than MIPS ratings if these are -
+as often in industry - based on no reproducible derivation. However, any
+serious performance evaluation requires more than just a comparison of raw
+numbers; one has to make sure that the numbers have been obtained in a
+comparable way.
+
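
The three comparison points discussed in the article can be illustrated in C.
What follows is a minimal sketch under stated assumptions, not the code of any
actual C library or compiler. The names word_has_null_byte,
COPY_STRING_CONSTANT, and bounded_strcmp are hypothetical, and so is the extra
"cap" parameter (the number of readable bytes in both buffers), which is added
only so that the word-wise loop never reads outside the arrays. Real library
routines would typically do this work in assembly language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Point (1): software analogue of an instruction that checks a word for a
   null byte. Returns nonzero if any of the four bytes of w is zero. */
static int word_has_null_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Point (3), "strcpy" case: the source is a string constant, so its length
   (including the terminating null byte) is known at compile time through
   sizeof. A fixed-length copy can then replace the byte-by-byte search for
   the terminator. Assumption: dst is a char array, not a pointer. */
#define COPY_STRING_CONSTANT(dst, lit)              \
    do {                                            \
        assert(sizeof(lit) <= sizeof(dst));         \
        memcpy((dst), (lit), sizeof(lit));          \
    } while (0)

/* Point (3), "strcmp" case: the arguments are plain pointers, so alignment
   may only be exploited after checking it at execution time. If both
   pointers are word-aligned, compare a word at a time; otherwise, and for
   the tail, fall back to bytes. Assumption: at least cap bytes are readable
   at both a and b. */
static int bounded_strcmp(const char *a, const char *b, size_t cap)
{
    size_t i = 0;

    if ((((uintptr_t)a | (uintptr_t)b) & (sizeof(uint32_t) - 1)) == 0) {
        while (i + sizeof(uint32_t) <= cap) {
            uint32_t wa, wb;
            memcpy(&wa, a + i, sizeof wa);
            memcpy(&wb, b + i, sizeof wb);
            /* Stop at the first differing word or at the terminator. */
            if (wa != wb || word_has_null_byte(wa))
                break;
            i += sizeof wa;
        }
    }

    /* Byte-wise comparison: the general case and the final word. */
    while (i < cap && a[i] != '\0' && a[i] == b[i])
        i++;
    if (i == cap)
        return 0;
    return (unsigned char)a[i] - (unsigned char)b[i];
}
```

The alignment test in bounded_strcmp is the kind of execution-time check and
branch that makes a word-wise "strcmp" safe for arbitrary callers; dropping
that test and assuming alignment is the unsafe shortcut the article objects
to. The macro, by contrast, relies only on facts the compiler already has.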