author Bert Lange <b.lange@hzdr.de> 2015-04-15 13:36:55 +0200
committer Bert Lange <b.lange@hzdr.de> 2015-04-15 13:36:55 +0200
commit a1c964908b51599bf624bd2d253419c7e629f195
tree 06125d59e83b7dde82d1bb57bc0e09ca83451b98 /zpu/roadshow/roadshow/dhrystone/RATIONALE
parent bbfe29a15f11548eb7c9fa71dcb4d2d18c164a53
parent 8679e4f91dcae05aef40f96629f33f0f4161f14a
Merge branch 'master' of https://github.com/zylin/zpu
Diffstat (limited to 'zpu/roadshow/roadshow/dhrystone/RATIONALE')
-rw-r--r-- zpu/roadshow/roadshow/dhrystone/RATIONALE 361
1 files changed, 361 insertions, 0 deletions
diff --git a/zpu/roadshow/roadshow/dhrystone/RATIONALE b/zpu/roadshow/roadshow/dhrystone/RATIONALE
new file mode 100644
index 0000000..926e046
--- /dev/null
+++ b/zpu/roadshow/roadshow/dhrystone/RATIONALE
@@ -0,0 +1,361 @@
+
+
+ Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules
+
+ [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62]
+
+
+ Reinhold P. Weicker
+ Siemens AG, E STE 35
+ [now: Siemens AG, AUT E 51]
+ Postfach 3220
+ D-8520 Erlangen
+ Germany (West)
+
+
+
+
+1. Why a Version 2 of Dhrystone?
+
+The Dhrystone benchmark program [1] has become a popular benchmark for
+CPU/compiler performance measurement, in particular in the area of
+minicomputers, workstations, PCs and microprocessors. It apparently satisfies
+a need for an easy-to-use integer benchmark; it gives a first performance
+indication which is more meaningful than MIPS numbers which, in their literal
+meaning (million instructions per second), cannot be used across different
+instruction sets (e.g. RISC vs. CISC). With the increasing use of the
+benchmark, it seems necessary to reconsider the benchmark and to check whether
+it can still fulfill this function. Version 2 of Dhrystone is the result of
+such a re-evaluation; it has been made for two reasons:
+
+o Dhrystone has been published in Ada [1], and versions in Ada, Pascal and C
+ have been distributed by Reinhold Weicker via floppy disk. However, the
+ version that was used most often for benchmarking has been the version made
+ by Rick Richardson by another translation from the Ada version into the C
+ programming language; this has been the version distributed via the UNIX
+ network Usenet [2].
+
+ There is an obvious need for a common C version of Dhrystone, since C is at
+ present the most popular system programming language for the class of
+ systems (microcomputers, minicomputers, workstations) where Dhrystone is
+ used most. There should be, as far as possible, only one C version of
+ Dhrystone such that results can be compared without restrictions. In the
+ past, the C versions distributed by Rick Richardson (Version 1.1) and by
+ Reinhold Weicker had small (though not significant) differences.
+
+ Together with the new C version, the Ada and Pascal versions have been
+ updated as well.
+
+o As far as it is possible without changes to the Dhrystone statistics,
+ optimizing compilers should be prevented from removing significant
+ statements. It has turned out in the past that optimizing compilers
+ suppressed code generation for too many statements (by "dead code removal"
+ or "dead variable elimination"). This has led to the danger that
+ benchmarking results obtained by a naive application of Dhrystone - without
+ inspection of the code that was generated - could become meaningless.
+
+The overall policy for version 2 has been that the distribution of
+statements, operand types and operand locality described in [1] should remain
+unchanged as much as possible. (Very few changes were necessary; their impact
+should be negligible.) Also, the order of statements should remain unchanged.
+Although I am aware of some critical remarks on the benchmark - I agree with
+several of them - and know some suggestions for improvement, I didn't want to
+change the benchmark into something different from what has become known as
+"Dhrystone"; the confusion generated by such a change would probably outweigh
+the benefits. If I were to write a new benchmark program, I wouldn't give it
+the name "Dhrystone" since this denotes the program published in [1].
+However, I do recognize the need for a larger number of representative
+programs that can be used as benchmarks; users should always be encouraged to
+use more than just one benchmark.
+
+The new versions (version 2.1 for C, Pascal and Ada) will be distributed as
+widely as possible. (Version 2.1 differs from version 2.0 distributed via the
+UNIX Network Usenet in March 1988 only in a few corrections for minor
+deficiencies found by users of version 2.0.) Readers who want to use the
+benchmark for their own measurements can obtain a copy in machine-readable
+form on floppy disk (MS-DOS or XENIX format) from the author.
+
+
+2. Overall Characteristics of Version 2
+
+In general, version 2 follows - in the parts that are significant for
+performance measurement, i.e. within the measurement loop - the published
+(Ada) version and the C versions previously distributed. Where the versions
+distributed by Rick Richardson [2] and Reinhold Weicker have been different,
+it follows the version distributed by Reinhold Weicker. (However, the
+differences have been so small that their impact on execution time in all
+likelihood has been negligible.) The initialization and UNIX instrumentation
+part - which had been omitted in [1] - follows mostly the ideas of Rick
+Richardson [2]. However, any changes in the initialization part and in the
+printing of the result have no impact on performance measurement since they
+are outside the measurement loop. As a concession to older compilers, names
+have been made unique within the first 8 characters for the C version.
+
+The original publication of Dhrystone did not contain any statements for time
+measurement since they are necessarily system-dependent. However, it turned
+out that it is not enough just to enclose the main procedure of Dhrystone in a
+loop and to measure the execution time. If the variables that are computed
+are not used somehow, there is the danger that the compiler considers them as
+"dead variables" and suppresses code generation for a part of the statements.
+Therefore in version 2 all variables of "main" are printed at the end of the
+program. This also permits some plausibility control for correct execution of
+the benchmark.
+
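The dead-variable problem described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the benchmark: the names work, measure and report are hypothetical, and the arithmetic in work is an arbitrary stand-in for one pass of benchmark statements. The point is that the value computed inside the loop is handed to printf afterwards, so a compiler cannot treat it as dead and drop the loop body.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one pass of benchmark work (not Dhrystone code). */
static int work(int run_index, int seed)
{
    return (seed * 3 + run_index) % 1000;
}

/* Runs the measurement loop and returns the last computed value. */
int measure(int number_of_runs)
{
    int result = 0;
    int run_index;

    assert(number_of_runs >= 0);
    for (run_index = 1; run_index <= number_of_runs; ++run_index)
        result = work(run_index, result);
    return result;
}

/* Printing the value after the loop makes it observable, so it is not a
   "dead variable"; the printout also allows a plausibility check. */
void report(int number_of_runs)
{
    printf("result after %d runs: %d\n",
           number_of_runs, measure(number_of_runs));
}
```

If report were replaced by a call that ignored the return value of measure, an optimizing compiler would be free to remove the whole loop, which is the situation version 2 tries to avoid.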
+At several places in the benchmark, code has been added, but only in branches
+that are not executed. The intention is that optimizing compilers should be
+prevented from moving code out of the measurement loop, or from removing code
+altogether. Statements that are executed have been changed in very few places
+only. In these cases, only the role of some operands has been changed, and it
+was made sure that the numbers defining the "Dhrystone distribution"
+(distribution of statements, operand types and locality) still hold as much as
+possible. Except for sophisticated optimizing compilers, execution times for
+version 2.1 should be the same as for previous versions.
+
+Because of the self-imposed limitation that the order and distribution of the
+executed statements should not be changed, there are still cases where
+optimizing compilers may not generate code for some statements. To a certain
+degree, this is unavoidable for small synthetic benchmarks. Users of the
+benchmark are advised to check code listings to verify that code is generated
+for all statements of Dhrystone.
+
+Contrary to the suggestion in the published paper and its realization in the
+versions previously distributed, no attempt has been made to subtract the time
+for the measurement loop overhead. (This calculation has proven difficult to
+implement in a correct way, and its omission makes the program simpler.)
+However, since the loop check is now part of the benchmark, this does have an
+impact - though a very minor one - on the distribution statistics which have
+been updated for this version.
+
+
+3. Discussion of Individual Changes
+
+In this section, all changes are described that affect the measurement loop
+and that are not just renamings of variables. All remarks refer to the C
+version; the other language versions have been updated similarly.
+
+In addition to adding the measurement loop and the printout statements,
+changes have been made at the following places:
+
+o In procedure "main", three statements have been added in the non-executed
+ "then" part of the statement
+
+ if (Enum_Loc == Func_1 (Ch_Index, 'C'))
+
+ they are
+
+ strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
+ Int_2_Loc = Run_Index;
+ Int_Glob = Run_Index;
+
+ The string assignment prevents movement of the preceding assignment to
+ Str_2_Loc (5'th statement of "main") out of the measurement loop. (This
+ probably will not happen for the C version, but it did happen with another
+ language and compiler.) The assignment to Int_2_Loc prevents value
+ propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of
+ Int_Glob possibly dependent on the value of Run_Index.
+
+o In the three arithmetic computations at the end of the measurement loop in
+ "main", the role of some variables has been exchanged, to prevent the
+ division from just cancelling out the multiplication as it was in [1]. A
+ very smart compiler might have recognized this and suppressed code
+ generation for the division.
+
+o For Proc_2, no code has been changed, but the values of the actual parameter
+ have changed due to changes in "main".
+
+o In Proc_4, the second assignment has been changed from
+
+ Bool_Loc = Bool_Loc | Bool_Glob;
+
+ to
+
+ Bool_Glob = Bool_Loc | Bool_Glob;
+
+ It now assigns a value to a global variable instead of a local variable
+ (Bool_Loc); Bool_Loc would be a "dead variable" which is not used
+ afterwards.
+
+o In Func_1, the statement
+
+ Ch_1_Glob = Ch_1_Loc;
+
+ was added in the non-executed "else" part of the "if" statement, to prevent
+ the suppression of code generation for the assignment to Ch_1_Loc.
+
+o In Func_2, the second character comparison statement has been changed to
+
+ if (Ch_Loc == 'R')
+
+ ('R' instead of 'X') because a comparison with 'X' is implied in the
+ preceding "if" statement.
+
+ Also in Func_2, the statement
+
+ Int_Glob = Int_Loc;
+
+ has been added in the non-executed part of the last "if" statement, in order
+ to prevent Int_Loc from becoming a dead variable.
+
+o In Func_3, a non-executed "else" part has been added to the "if" statement.
+ While the program would not be incorrect without this "else" part, it is
+ considered bad programming practice if a function can be left without a
+ return value.
+
+ To compensate for this change, the (non-executed) "else" part in the "if"
+ statement of Proc_3 was removed.
+
+The distribution statistics have been changed only by the addition of the
+measurement loop iteration (1 additional statement, 4 additional local integer
+operands) and by the change in Proc_4 (one operand changed from local to
+global). The distribution statistics in the comment headers have been updated
+accordingly.
+
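Two of the ideas discussed in this section can be illustrated with a short C sketch. It is not taken from the benchmark source: the function names cancelling, non_cancelling and guarded are hypothetical, and only the identifier Int_Glob is borrowed from the text. The first pair of functions shows why a division by the same operand that was just used as a multiplier may be simplified away, while a division by a different operand cannot. The third function shows the effect of an assignment to a global in a branch that is not taken at run time: the compiler generally cannot prove the branch is never taken, so it still has to generate code for the local value.

```c
#include <assert.h>

int Int_Glob = 0;   /* global, potentially visible to other compilation units */

/* Multiply, then divide by the same operand: absent overflow the quotient
   equals a, so a smart compiler may emit no division at all. */
int cancelling(int a, int b)
{
    int product, quotient;

    assert(b != 0);
    product = a * b;
    quotient = product / b;
    return quotient;
}

/* Divide by a different operand: the division cannot be cancelled. */
int non_cancelling(int a, int b, int c)
{
    int product;

    assert(c != 0);
    product = a * b;
    return product / c;
}

/* The store to Int_Glob sits in a branch the caller normally does not take.
   Because flag is only known at run time, code for int_loc must still exist. */
int guarded(int run_index, int flag)
{
    int int_loc = run_index * 2 + 1;

    if (flag)
        Int_Glob = int_loc;
    return run_index;
}
```

Whether a particular compiler really removes the division in cancelling depends on its optimization level and on how it treats signed overflow, so inspecting the generated code remains the only reliable check.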
+
+4. String Operations
+
+The string operations (string assignment and string comparison) have not been
+changed, to keep the program consistent with the original version.
+
+There has been some concern that the string operations are over-represented in
+the program, and that execution time is dominated by these operations. This
+was true in particular when optimizing compilers removed too much code in the
+main part of the program; this should have been mitigated in version 2.
+
+It should be noted that this is a language-dependent issue: Dhrystone was
+first published in Ada, and with Ada or Pascal semantics, the time spent in
+the string operations is, at least in all implementations known to me,
+considerably smaller. In Ada and Pascal, assignment and comparison of strings
+are operators defined in the language, and the upper bounds of the strings
+occurring in Dhrystone are part of the type information known at compilation
+time. The compilers can therefore generate efficient inline code. In C,
+string assignment and comparisons are not part of the language, so the string
+operations must be expressed in terms of the C library functions "strcpy" and
+"strcmp". (ANSI C allows an implementation to use inline code for these
+functions.) In addition to the overhead caused by additional function calls,
+these functions are defined for null-terminated strings where the length of
+the strings is not known at compilation time; the function has to check every
+byte for the termination condition (the null byte).
+
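The difference between the two kinds of string comparison can be sketched in C as follows. This is a simplified illustration, not the library implementation of "strcmp": compare_terminated and compare_fixed are hypothetical names, and STR_LEN is an assumed fixed length chosen to match 30-character strings like the one quoted in section 3. The first function has to test every byte for the null terminator; the second relies on a length known at compilation time, which is closer to the situation in Ada or Pascal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_LEN 30  /* assumed fixed string length, known at compile time */

/* Null-terminated comparison: each step checks for the terminating null
   byte in addition to comparing the two characters. */
int compare_terminated(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Fixed-length comparison: no terminator check is needed because the
   length is part of the interface; both buffers must hold STR_LEN bytes. */
int compare_fixed(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, STR_LEN);
}
```

For two 30-character strings that differ only in their 20th character, both functions scan the same 20 bytes, but the fixed-length form gives the compiler the chance to expand the comparison inline with word-sized operations.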
+Obviously, a C library which includes efficiently coded "strcpy" and "strcmp"
+functions helps to obtain good Dhrystone results. However, I don't think that
+this is unfair since string functions do occur quite frequently in real
+programs (editors, command interpreters, etc.). If the string functions are
+implemented efficiently, this helps real programs as well as benchmark
+programs.
+
+I admit that the string comparison in Dhrystone terminates later (after
+scanning 20 characters) than most string comparisons in real programs. For
+consistency with the original benchmark, I didn't change the program despite
+this weakness.
+
+
+5. Intended Use of Dhrystone
+
+When Dhrystone is used, the following "ground rules" apply:
+
+o Separate compilation (Ada and C versions)
+
+ As mentioned in [1], Dhrystone was written to reflect actual programming
+ practice in systems programming. The division into several compilation
+ units (5 in the Ada version, 2 in the C version) is intended, as is the
+ distribution of inter-module and intra-module subprogram calls. Although on
+ many systems there will be no difference in execution time to a Dhrystone
+ version where all compilation units are merged into one file, the rule is
+ that separate compilation should be used. The intention is that real
+ programming practice, where programs consist of several independently
+ compiled units, should be reflected. This also implies that the
+ compiler, while compiling one unit, has no information about the use of
+ variables, register allocation etc. occurring in other compilation units.
+ Although in real life compilation units will probably be larger, the
+ intention is that these effects of separate compilation are modeled in
+ Dhrystone.
+
+ A few language systems have post-linkage optimization available (e.g., final
+ register allocation is performed after linkage). This is a borderline case:
+ Post-linkage optimization involves additional program preparation time
+ (although not as much as compilation in one unit) which may prevent its
+ general use in practical programming. I think that since it defeats the
+ intentions given above, it should not be used for Dhrystone.
+
+ Unfortunately, ISO/ANSI Pascal does not contain language features for
+ separate compilation. Although most commercial Pascal compilers provide
+ separate compilation in some way, we cannot use it for Dhrystone since such
+ a version would not be portable. Therefore, no attempt has been made to
+ provide a Pascal version with several compilation units.
+
+o No procedure merging
+
+ Although Dhrystone contains some very short procedures where execution would
+ benefit from procedure merging (inlining, macro expansion of procedures),
+ procedure merging is not to be used. The reason is that the percentage of
+ procedure and function calls is part of the "Dhrystone distribution" of
+ statements contained in [1]. This restriction does not hold for the string
+ functions of the C version since ANSI C allows an implementation to use
+ inline code for these functions.
+
+o Other optimizations are allowed, but they should be indicated
+
+ It is often hard to draw an exact line between "normal code generation" and
+ "optimization" in compilers: Some compilers perform operations by default
+ that are invoked in other compilers only when optimization is explicitly
+ requested. Also, we cannot avoid that in benchmarking people try to achieve
+ results that look as good as possible. Therefore, optimizations performed
+ by compilers - other than those listed above - are not forbidden when
+ Dhrystone execution times are measured. Dhrystone is not intended to be
+ non-optimizable but is intended to be similarly optimizable as normal
+ programs. For example, there are several places in Dhrystone where
+ performance benefits from optimizations like common subexpression
+ elimination, value propagation etc., but normal programs usually also
+ benefit from these optimizations. Therefore, no effort was made to
+ artificially prevent such optimizations. However, measurement reports
+ should indicate which compiler optimization levels have been used, and
+ reporting results with different levels of compiler optimization for the
+ same hardware is encouraged.
+
+o Default results are those without "register" declarations (C version)
+
+ When Dhrystone results are quoted without additional qualification, they
+ should be understood as results obtained without use of the "register"
+ attribute. Good compilers should be able to make good use of registers even
+ without explicit register declarations ([3], p. 193).
+
+Of course, for experimental purposes, post-linkage optimization, procedure
+merging and/or compilation in one unit can be done to determine their effects.
+However, Dhrystone numbers obtained under these conditions should be
+explicitly marked as such; "normal" Dhrystone results should be understood as
+results obtained following the ground rules listed above.
+
+In any case, for serious performance evaluation, users are advised to ask for
+code listings and to check them carefully. In this way, when results for
+different systems are compared, the reader can get a feeling for how much
+performance difference is due to compiler optimization and how much is due to
+hardware speed.
+
+
+6. Acknowledgements
+
+The C version 2.1 of Dhrystone has been developed in cooperation with Rick
+Richardson (Tinton Falls, NJ); it incorporates many ideas from the "Version
+1.1" distributed previously by him over the UNIX network Usenet. Through his
+activity with Usenet, Rick Richardson has made a very valuable contribution to
+the dissemination of the benchmark. I also thank Chaim Benedelac (National
+Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan
+Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with
+comments on earlier versions of the benchmark.
+
+
+7. Bibliography
+
+[1]
+ Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark.
+ Communications of the ACM 27, 10 (Oct. 1984), 1013-1030
+
+[2]
+ Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text)
+ Informal Distribution via "Usenet", Last Version Known to me: Sept. 21,
+ 1987
+
+[3]
+ Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language.
+ Prentice-Hall, Englewood Cliffs (NJ) 1978
+