diff options
Diffstat (limited to 'zpu/roadshow/roadshow/dhrystone/RATIONALE')
-rw-r--r-- | zpu/roadshow/roadshow/dhrystone/RATIONALE | 361 |
1 files changed, 361 insertions, 0 deletions
diff --git a/zpu/roadshow/roadshow/dhrystone/RATIONALE b/zpu/roadshow/roadshow/dhrystone/RATIONALE new file mode 100644 index 0000000..926e046 --- /dev/null +++ b/zpu/roadshow/roadshow/dhrystone/RATIONALE @@ -0,0 +1,361 @@ + + + Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules + + [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62] + + + Reinhold P. Weicker + Siemens AG, E STE 35 + [now: Siemens AG, AUT E 51] + Postfach 3220 + D-8520 Erlangen + Germany (West) + + + + +1. Why a Version 2 of Dhrystone? + +The Dhrystone benchmark program [1] has become a popular benchmark for +CPU/compiler performance measurement, in particular in the area of +minicomputers, workstations, PC's and microprocesors. It apparently satisfies +a need for an easy-to-use integer benchmark; it gives a first performance +indication which is more meaningful than MIPS numbers which, in their literal +meaning (million instructions per second), cannot be used across different +instruction sets (e.g. RISC vs. CISC). With the increasing use of the +benchmark, it seems necessary to reconsider the benchmark and to check whether +it can still fulfill this function. Version 2 of Dhrystone is the result of +such a re-evaluation, it has been made for two reasons: + +o Dhrystone has been published in Ada [1], and Versions in Ada, Pascal and C + have been distributed by Reinhold Weicker via floppy disk. However, the + version that was used most often for benchmarking has been the version made + by Rick Richardson by another translation from the Ada version into the C + programming language, this has been the version distributed via the UNIX + network Usenet [2]. + + There is an obvious need for a common C version of Dhrystone, since C is at + present the most popular system programming language for the class of + systems (microcomputers, minicomputers, workstations) where Dhrystone is + used most. There should be, as far as possible, only one C version of + Dhrystone such that results can be compared without restrictions. In the + past, the C versions distributed by Rick Richardson (Version 1.1) and by + Reinhold Weicker had small (though not significant) differences. + + Together with the new C version, the Ada and Pascal versions have been + updated as well. + +o As far as it is possible without changes to the Dhrystone statistics, + optimizing compilers should be prevented from removing significant + statements. It has turned out in the past that optimizing compilers + suppressed code generation for too many statements (by "dead code removal" + or "dead variable elimination"). This has lead to the danger that + benchmarking results obtained by a naive application of Dhrystone - without + inspection of the code that was generated - could become meaningless. + +The overall policiy for version 2 has been that the distribution of +statements, operand types and operand locality described in [1] should remain +unchanged as much as possible. (Very few changes were necessary; their impact +should be negligible.) Also, the order of statements should remain unchanged. +Although I am aware of some critical remarks on the benchmark - I agree with +several of them - and know some suggestions for improvement, I didn't want to +change the benchmark into something different from what has become known as +"Dhrystone"; the confusion generated by such a change would probably outweight +the benefits. If I were to write a new benchmark program, I wouldn't give it +the name "Dhrystone" since this denotes the program published in [1]. +However, I do recognize the need for a larger number of representative +programs that can be used as benchmarks; users should always be encouraged to +use more than just one benchmark. + +The new versions (version 2.1 for C, Pascal and Ada) will be distributed as +widely as possible. (Version 2.1 differs from version 2.0 distributed via the +UNIX Network Usenet in March 1988 only in a few corrections for minor +deficiencies found by users of version 2.0.) Readers who want to use the +benchmark for their own measurements can obtain a copy in machine-readable +form on floppy disk (MS-DOS or XENIX format) from the author. + + +2. Overall Characteristics of Version 2 + +In general, version 2 follows - in the parts that are significant for +performance measurement, i.e. within the measurement loop - the published +(Ada) version and the C versions previously distributed. Where the versions +distributed by Rick Richardson [2] and Reinhold Weicker have been different, +it follows the version distributed by Reinhold Weicker. (However, the +differences have been so small that their impact on execution time in all +likelihood has been negligible.) The initialization and UNIX instrumentation +part - which had been omitted in [1] - follows mostly the ideas of Rick +Richardson [2]. However, any changes in the initialization part and in the +printing of the result have no impact on performance measurement since they +are outside the measaurement loop. As a concession to older compilers, names +have been made unique within the first 8 characters for the C version. + +The original publication of Dhrystone did not contain any statements for time +measurement since they are necessarily system-dependent. However, it turned +out that it is not enough just to inclose the main procedure of Dhrystone in a +loop and to measure the execution time. If the variables that are computed +are not used somehow, there is the danger that the compiler considers them as +"dead variables" and suppresses code generation for a part of the statements. +Therefore in version 2 all variables of "main" are printed at the end of the +program. This also permits some plausibility control for correct execution of +the benchmark. + +At several places in the benchmark, code has been added, but only in branches +that are not executed. The intention is that optimizing compilers should be +prevented from moving code out of the measurement loop, or from removing code +altogether. Statements that are executed have been changed in very few places +only. In these cases, only the role of some operands has been changed, and it +was made sure that the numbers defining the "Dhrystone distribution" +(distribution of statements, operand types and locality) still hold as much as +possible. Except for sophisticated optimizing compilers, execution times for +version 2.1 should be the same as for previous versions. + +Because of the self-imposed limitation that the order and distribution of the +executed statements should not be changed, there are still cases where +optimizing compilers may not generate code for some statements. To a certain +degree, this is unavoidable for small synthetic benchmarks. Users of the +benchmark are advised to check code listings whether code is generated for all +statements of Dhrystone. + +Contrary to the suggestion in the published paper and its realization in the +versions previously distributed, no attempt has been made to subtract the time +for the measurement loop overhead. (This calculation has proven difficult to +implement in a correct way, and its omission makes the program simpler.) +However, since the loop check is now part of the benchmark, this does have an +impact - though a very minor one - on the distribution statistics which have +been updated for this version. + + +3. Discussion of Individual Changes + +In this section, all changes are described that affect the measurement loop +and that are not just renamings of variables. All remarks refer to the C +version; the other language versions have been updated similarly. + +In addition to adding the measurement loop and the printout statements, +changes have been made at the following places: + +o In procedure "main", three statements have been added in the non-executed + "then" part of the statement + + if (Enum_Loc == Func_1 (Ch_Index, 'C')) + + they are + + strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING"); + Int_2_Loc = Run_Index; + Int_Glob = Run_Index; + + The string assignment prevents movement of the preceding assignment to + Str_2_Loc (5'th statement of "main") out of the measurement loop (This + probably will not happen for the C version, but it did happen with another + language and compiler.) The assignment to Int_2_Loc prevents value + propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of + Int_Glob possibly dependent from the value of Run_Index. + +o In the three arithmetic computations at the end of the measurement loop in + "main ", the role of some variables has been exchanged, to prevent the + division from just cancelling out the multiplication as it was in [1]. A + very smart compiler might have recognized this and suppressed code + generation for the division. + +o For Proc_2, no code has been changed, but the values of the actual parameter + have changed due to changes in "main". + +o In Proc_4, the second assignment has been changed from + + Bool_Loc = Bool_Loc | Bool_Glob; + + to + + Bool_Glob = Bool_Loc | Bool_Glob; + + It now assigns a value to a global variable instead of a local variable + (Bool_Loc); Bool_Loc would be a "dead variable" which is not used + afterwards. + +o In Func_1, the statement + + Ch_1_Glob = Ch_1_Loc; + + was added in the non-executed "else" part of the "if" statement, to prevent + the suppression of code generation for the assignment to Ch_1_Loc. + +o In Func_2, the second character comparison statement has been changed to + + if (Ch_Loc == 'R') + + ('R' instead of 'X') because a comparison with 'X' is implied in the + preceding "if" statement. + + Also in Func_2, the statement + + Int_Glob = Int_Loc; + + has been added in the non-executed part of the last "if" statement, in order + to prevent Int_Loc from becoming a dead variable. + +o In Func_3, a non-executed "else" part has been added to the "if" statement. + While the program would not be incorrect without this "else" part, it is + considered bad programming practice if a function can be left without a + return value. + + To compensate for this change, the (non-executed) "else" part in the "if" + statement of Proc_3 was removed. + +The distribution statistics have been changed only by the addition of the +measurement loop iteration (1 additional statement, 4 additional local integer +operands) and by the change in Proc_4 (one operand changed from local to +global). The distribution statistics in the comment headers have been updated +accordingly. + + +4. String Operations + +The string operations (string assignment and string comparison) have not been +changed, to keep the program consistent with the original version. + +There has been some concern that the string operations are over-represented in +the program, and that execution time is dominated by these operations. This +was true in particular when optimizing compilers removed too much code in the +main part of the program, this should have been mitigated in version 2. + +It should be noted that this is a language-dependent issue: Dhrystone was +first published in Ada, and with Ada or Pascal semantics, the time spent in +the string operations is, at least in all implementations known to me, +considerably smaller. In Ada and Pascal, assignment and comparison of strings +are operators defined in the language, and the upper bounds of the strings +occuring in Dhrystone are part of the type information known at compilation +time. The compilers can therefore generate efficient inline code. In C, +string assignemt and comparisons are not part of the language, so the string +operations must be expressed in terms of the C library functions "strcpy" and +"strcmp". (ANSI C allows an implementation to use inline code for these +functions.) In addition to the overhead caused by additional function calls, +these functions are defined for null-terminated strings where the length of +the strings is not known at compilation time; the function has to check every +byte for the termination condition (the null byte). + +Obviously, a C library which includes efficiently coded "strcpy" and "strcmp" +functions helps to obtain good Dhrystone results. However, I don't think that +this is unfair since string functions do occur quite frequently in real +programs (editors, command interpreters, etc.). If the strings functions are +implemented efficiently, this helps real programs as well as benchmark +programs. + +I admit that the string comparison in Dhrystone terminates later (after +scanning 20 characters) than most string comparisons in real programs. For +consistency with the original benchmark, I didn't change the program despite +this weakness. + + +5. Intended Use of Dhrystone + +When Dhrystone is used, the following "ground rules" apply: + +o Separate compilation (Ada and C versions) + + As mentioned in [1], Dhrystone was written to reflect actual programming + practice in systems programming. The division into several compilation + units (5 in the Ada version, 2 in the C version) is intended, as is the + distribution of inter-module and intra-module subprogram calls. Although on + many systems there will be no difference in execution time to a Dhrystone + version where all compilation units are merged into one file, the rule is + that separate compilation should be used. The intention is that real + programming practice, where programs consist of several independently + compiled units, should be reflected. This also has implies that the + compiler, while compiling one unit, has no information about the use of + variables, register allocation etc. occuring in other compilation units. + Although in real life compilation units will probably be larger, the + intention is that these effects of separate compilation are modeled in + Dhrystone. + + A few language systems have post-linkage optimization available (e.g., final + register allocation is performed after linkage). This is a borderline case: + Post-linkage optimization involves additional program preparation time + (although not as much as compilation in one unit) which may prevent its + general use in practical programming. I think that since it defeats the + intentions given above, it should not be used for Dhrystone. + + Unfortunately, ISO/ANSI Pascal does not contain language features for + separate compilation. Although most commercial Pascal compilers provide + separate compilation in some way, we cannot use it for Dhrystone since such + a version would not be portable. Therefore, no attempt has been made to + provide a Pascal version with several compilation units. + +o No procedure merging + + Although Dhrystone contains some very short procedures where execution would + benefit from procedure merging (inlining, macro expansion of procedures), + procedure merging is not to be used. The reason is that the percentage of + procedure and function calls is part of the "Dhrystone distribution" of + statements contained in [1]. This restriction does not hold for the string + functions of the C version since ANSI C allows an implementation to use + inline code for these functions. + +o Other optimizations are allowed, but they should be indicated + + It is often hard to draw an exact line between "normal code generation" and + "optimization" in compilers: Some compilers perform operations by default + that are invoked in other compilers only when optimization is explicitly + requested. Also, we cannot avoid that in benchmarking people try to achieve + results that look as good as possible. Therefore, optimizations performed + by compilers - other than those listed above - are not forbidden when + Dhrystone execution times are measured. Dhrystone is not intended to be + non-optimizable but is intended to be similarly optimizable as normal + programs. For example, there are several places in Dhrystone where + performance benefits from optimizations like common subexpression + elimination, value propagation etc., but normal programs usually also + benefit from these optimizations. Therefore, no effort was made to + artificially prevent such optimizations. However, measurement reports + should indicate which compiler optimization levels have been used, and + reporting results with different levels of compiler optimization for the + same hardware is encouraged. + +o Default results are those without "register" declarations (C version) + + When Dhrystone results are quoted without additional qualification, they + should be understood as results obtained without use of the "register" + attribute. Good compilers should be able to make good use of registers even + without explicit register declarations ([3], p. 193). + +Of course, for experimental purposes, post-linkage optimization, procedure +merging and/or compilation in one unit can be done to determine their effects. +However, Dhrystone numbers obtained under these conditions should be +explicitly marked as such; "normal" Dhrystone results should be understood as +results obtained following the ground rules listed above. + +In any case, for serious performance evaluation, users are advised to ask for +code listings and to check them carefully. In this way, when results for +different systems are compared, the reader can get a feeling how much +performance difference is due to compiler optimization and how much is due to +hardware speed. + + +6. Acknowledgements + +The C version 2.1 of Dhrystone has been developed in cooperation with Rick +Richardson (Tinton Falls, NJ), it incorporates many ideas from the "Version +1.1" distributed previously by him over the UNIX network Usenet. Through his +activity with Usenet, Rick Richardson has made a very valuable contribution to +the dissemination of the benchmark. I also thank Chaim Benedelac (National +Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan +Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with +comments on earlier versions of the benchmark. + + +7. Bibliography + +[1] + Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark. + Communications of the ACM 27, 10 (Oct. 1984), 1013-1030 + +[2] + Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text) + Informal Distribution via "Usenet", Last Version Known to me: Sept. 21, + 1987 + +[3] + Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language. + Prentice-Hall, Englewood Cliffs (NJ) 1978 + |