author | bde <bde@FreeBSD.org> | 2007-11-29 02:01:21 +0000 |
---|---|---|
committer | bde <bde@FreeBSD.org> | 2007-11-29 02:01:21 +0000 |
commit | 723157380283ac617fbbbb23fb4ae6120293007b (patch) | |
tree | dafc46e7794ee0a35610452328fb817732ffbd66 | |
parent | 35b85a2fdb998527546900f82b6aca6f012c6561 (diff) | |
download | FreeBSD-src-723157380283ac617fbbbb23fb4ae6120293007b.zip FreeBSD-src-723157380283ac617fbbbb23fb4ae6120293007b.tar.gz |
Don't use plain "ret" instructions at targets of jump instructions,
since the branch caches on at least Athlon XP through Athlon 64 CPUs
don't understand such instructions and guarantee a cache miss taking
at least 10 cycles. Use the documented workaround "ret $0" instead
("nop; ret" also works, but "ret $0" is probably faster on old CPUs).
Normal code (even asm code) doesn't branch to "ret", since there is
usually some cleanup to do, but the __mcount, .mcount and .mexitcount
entry points were optimized down to the minimum number of
instructions (3 instructions each if profiling is not enabled), so
they did branch to it. I didn't see a significant number of cache misses for
.mexitcount, but for the shared "ret" for __mcount and .mcount I
observed cache misses costing 26 cycles each. For a send(2) syscall
that makes about 70 function calls, the cost of these cache misses
alone increased the syscall time from about 4000 cycles to about 7000
cycles. 4000 is for a profiling (GUPROF) kernel with profiling disabled;
after this fix, configuring profiling only costs about 600 cycles in the
4000, which is consistent with almost perfect branch prediction in the
mcounting calls.
-rw-r--r-- | sys/amd64/amd64/prof_machdep.c | 4 |
-rw-r--r-- | sys/i386/isa/prof_machdep.c | 4 |
2 files changed, 4 insertions, 4 deletions
diff --git a/sys/amd64/amd64/prof_machdep.c b/sys/amd64/amd64/prof_machdep.c
index cdccc92..f49eb11 100644
--- a/sys/amd64/amd64/prof_machdep.c
+++ b/sys/amd64/amd64/prof_machdep.c
@@ -135,7 +135,7 @@ __mcount: \n\
 	popq %rdx \n\
 	popq %rax \n\
 .mcount_exit: \n\
-	ret \n\
+	ret $0 \n\
 ");
 #else /* !__GNUCLIKE_ASM */
 #error "this file needs to be ported to your compiler"
@@ -187,7 +187,7 @@ GMON_PROF_HIRES = 4 \n\
 	popq %rdx \n\
 	popq %rax \n\
 .mexitcount_exit: \n\
-	ret \n\
+	ret $0 \n\
 ");
 #endif /* __GNUCLIKE_ASM */
diff --git a/sys/i386/isa/prof_machdep.c b/sys/i386/isa/prof_machdep.c
index 8fd0fc0..7c7edf4 100644
--- a/sys/i386/isa/prof_machdep.c
+++ b/sys/i386/isa/prof_machdep.c
@@ -113,7 +113,7 @@ __mcount: \n\
 	addl $8,%esp \n\
 	popfl \n\
 .mcount_exit: \n\
-	ret \n\
+	ret $0 \n\
 ");
 #else /* !__GNUCLIKE_ASM */
 #error "this file needs to be ported to your compiler"
@@ -157,7 +157,7 @@ GMON_PROF_HIRES = 4 \n\
 	popl %eax \n\
 	popl %edx \n\
 .mexitcount_exit: \n\
-	ret \n\
+	ret $0 \n\
 ");
 #endif /* __GNUCLIKE_ASM */