| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang emits SSE instructions on amd64 in the common path of
pthread_mutex_unlock. If the thread does not otherwise use SSE,
this usage incurs a context-switch of the FPU/SSE state, which
reduces the performance of multiple real-world applications by a
non-trivial amount (3-5% in one application).
Instead of this change, I experimented with eagerly switching the
FPU state at context-switch time. This did not help. Most of the
cost seems to be in the read/write of memory--as kib@ stated--and
not in the #NM handling. I tested on machines with and without
XSAVEOPT.
One counter-argument to this change is that most applications already
use SIMD, and the number of applications and amount of SIMD usage
are only increasing. This is absolutely true. I agree that--in
general and in principle--this change is in the wrong direction.
However, there are applications that do not use enough SSE to offset
the extra context-switch cost. SSE does not provide a clear benefit
in the current libthr code with the current compiler, but it does
provide a clear loss in some cases. Therefore, disabling SSE in
libthr is a non-loss for most, and a gain for some.
I refrained from disabling SSE in libc--as was suggested--because
I can't make the above argument for libc. It provides a wide variety
of code; each case should be analyzed separately.
https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html
Suggestions from: dim, jmg, rpaulo
Sponsored by: Dell Inc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
versions of pthread_md.h have a special case of dereferencing a null
pointer. Clang warns about this with:
In file included from lib/libthr/arch/i386/i386/pthread_md.c:36:
lib/libthr/arch/i386/include/pthread_md.h:96:10: error: indirection of non-volatile null pointer will be deleted, not trap [-Werror,-Wnull-dereference]
return (TCB_GET32(tcb_self));
^~~~~~~~~~~~~~~~~~~
lib/libthr/arch/i386/include/pthread_md.h:73:13: note: expanded from:
: "m" (*(u_int *)(__tcb_offset(name)))); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lib/libthr/arch/i386/include/pthread_md.h:96:10: note: consider using __builtin_trap() or qualifying pointer with 'volatile'
Since this indirection is done relative to the fs or gs segment, to
retrieve thread-specific data, it is an exception to the rule.
Therefore, add a volatile qualifier to tell the compiler we really want
to dereference a zero address.
MFC after: 1 week
|
| |
|
|
|
|
|
|
|
|
|
|
| |
For all libthr contexts, use ${MACHINE_CPUARCH}
for all libc contexts, use ${MACHINE_ARCH} if it exists, otherwise use
${MACHINE_CPUARCH}
Move some common code up a layer (the .PATH statement was the same in
all the arch submakefiles).
# Hope she hasn't busted powerpc64 with this...
|
|
|
|
|
|
| |
returns errno, because errno can be mucked by user's signal handler and
most of pthread api heavily depends on errno to be correct, this change
should improve stability of the thread library.
|
| |
|
|
|
|
|
| |
- Rename _thr_smp_cpus to boolean variable _thr_is_smp.
- Define CPU_SPINWAIT macro for each arch, only X86 supports it.
|
|
|
|
| |
needed.
|
|
|
|
|
| |
already allocate thread pointer space in tls block for initial thread.
Only i386 and amd64 have been done, others still have to be tested.
|
|
|
|
| |
in libc.
|
| |
|
| |
|
| |
|
|
|
|
| |
clock cycles and has 8191 threads limitation.
|
|
|
|
| |
functions, otherwise user ports have to be rebuilt.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works ( using signal ).
4. thread synchronization is fully based on umtx, mainly, condition
variable and other synchronization objects were rewritten by using
umtx directly. those objects can be shared between processes via
shared memory, it has to change ABI which does not happen yet.
5. default stack size is increased to 1M on 32 bits platform, 2M for
64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.
Okayed by: jeff, mtm, rwatson, scottl
|
|
|
|
| |
Submitted by: kan
|
|
|
|
|
|
|
| |
run as a 32 bit support library for an amd64 kernel. 32 bit consumers of
libthr have zero chance of running on an amd64 kernel since we don't
implement the i386_set_ldt() family of functions. Note that this commit
doesn't make it actually work, it just removes one more obstacle.
|
|
|
|
|
| |
itself before it can execute any other code, so new thread should be
created with all signals are masked until after fsbase is set.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that this provokes. "Wherever possible" means "In the kernel OR NOT
C++" (implying C).
There are places where (void *) pointers are not valid, such as for
function pointers, but in the special case of (void *)0, agreement
settles on it being OK.
Most of the fixes were NULL where an integer zero was needed; many
of the fixes were NULL where ascii <nul> ('\0') was needed, and a
few were just "other".
Tested on: i386 sparc64
|
| |
|
|
|
|
| |
Approved by: re (scottl)
|
|
|
|
|
|
|
| |
exit function has invalidated the need for _spin[un]lock_pthread().
The _spin[un]lock() functions can now dereference curthread without
the danger that the ldtentry containing the pointer to the thread
has been cleared out from under them.
|
| |
|
|
|
|
| |
Approved by: re/jhb
|
|
|
|
|
|
| |
threads per process has been reached. Return EAGAIN, as per spec.
Approved by: re/blanket libthr
|
|
|
|
|
|
|
|
|
|
| |
that take the address of a struct pthread as their first argument.
_spin[un]lock() just become wrappers arround these two functions.
These new functions are for use in situations where curthread can't be
used. One example is _thread_retire(), where we invalidate the array index
curthread uses to get its pointer..
Approved by: re/blanket libthr
|
|
|
|
|
|
|
|
| |
o removed unused variables
o explicit inclusion of header files
o prototypes for externally defined functions
Approved by: re/blanket libthr
|
|
|
|
|
|
|
|
|
|
|
|
| |
in thr_private.h
o Lock down the ldt_entries array and ldt_free, which points to
the next free slot. As noted in the comments, it's necessary
to special case the initial_thread because %gs is not setup
for it yet. This is ok because that early in the program there
won't be any reentrancy issues anyways.
Approved by: re/blanket libthr
|
|
|
|
|
|
|
|
| |
the number of threads exceeds the number of open slots
in ldt_entries[].
Approved by: markm (mentor)(implicit)
Reviewed by: jeff
|
|
|
|
|
|
|
|
| |
as curthread in the new context, so that it will be set automatically when
the thread is switched to. This fixes a race where we'd run for a little
while with curthread unset in _thread_start.
Reviewed by: jeff
|
|
|
|
| |
Submitted by: gordan@freebsd.org
|
| |
|
|
adaptation of libc_r for the thr system call interface. This is beta
quality code.
|