path: root/lib/libthr
Commit log (each entry shows the commit message followed by author, date, files changed, and -deleted/+added lines):
* The SUSv4tc1 requires that pthread_setcancelstate() shall not be a cancellation point.  (kib, 2013-06-19, 1 file, -1/+2)
  When enabling cancellation, only process the pending cancellation for
  asynchronous mode.
  Reported and reviewed by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
* Since the cause of the problems with __fillcontextx() was identified, unify the code of check_deferred_signal() for all architectures, making the variant under #ifdef x86 common.  (kib, 2013-06-03, 1 file, -15/+7)
  Tested by: marius (sparc64)
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
* The getcontext() from the __fillcontextx() call in check_deferred_signal() returns twice, since handle_signal() emulates the return from a normal signal handler by sigreturn(2)ing the passed context.  (kib, 2013-05-28, 1 file, -2/+7)
  The second return is performed on the destroyed stack frame, because
  __fillcontextx() has already returned.  This causes undefined and bad
  behaviour; usually the victim thread gets SIGSEGV.
  Avoid the nested frame and the need to return from it by calling
  getcontext() directly in check_deferred_signal() and using a new private
  libc helper __fillcontextx2() to complement the context with the extended
  CPU state if the deferred signal is still present.  __fillcontextx() is now
  unused, but is kept to allow older libthr.so to be used with the new libc.
  Mark __fillcontextx() as returning twice [1].
  Reported by: pgj
  Pointy hat to: kib
  Discussed with: dim
  Tested by: pgj, dim
  Suggested by: jilles [1]
  MFC after: 1 week
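  For readers unfamiliar with the pitfall being fixed here: getcontext(3)
  effectively returns twice, and the second return resumes in the stack frame
  that made the call, which is why that frame must still exist at that point.
  A minimal, self-contained demonstration of the mechanism (ordinary
  application code, not libthr source):

      /* The saved context resumes right after the getcontext() call,
       * inside the frame that made it. */
      #include <stdio.h>
      #include <ucontext.h>

      static ucontext_t uc;
      static volatile int resumed;

      int
      main(void)
      {
              getcontext(&uc);         /* "returns twice": now, and via setcontext() */
              if (!resumed) {
                      resumed = 1;
                      printf("first return from getcontext()\n");
                      setcontext(&uc); /* jump back; main()'s frame is still alive */
              }
              printf("second return from getcontext()\n");
              return (0);
      }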
* Partially apply the capitalization of the heading word of the sequence and fix a typo.  (kib, 2013-05-27, 1 file, -3/+3)
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
* Return a one-based key so that the user can check whether the key was ever allocated in the first place.  (davidxu, 2013-05-16, 1 file, -4/+7)
  Initial patch submitted by: phk
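  The practical effect, shown as a caller-side sketch (illustration only: it
  relies on pthread_key_t being an integer type as on FreeBSD, and the helper
  name below is invented):

      /* With one-based keys, a zero-initialized key can be recognized as
       * "never created". */
      #include <pthread.h>
      #include <stdlib.h>
      #include <string.h>

      static pthread_key_t key;       /* static storage => zero-initialized */

      void
      stash_thread_copy(const char *s)
      {
              if (key == 0)           /* detectable only because keys start at 1 */
                      (void)pthread_key_create(&key, free);
              (void)pthread_setspecific(key, strdup(s));
      }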
* Fix the return value for setcontext and swapcontext.  (davidxu, 2013-05-09, 1 file, -4/+8)
* Add accept4() system call.  (jilles, 2013-05-01, 2 files, -0/+27)
  The accept4() function, compared to accept(), allows setting the new file
  descriptor atomically close-on-exec and explicitly controlling the
  non-blocking status on the new socket.  (Note that the latter point means
  that accept() is not equivalent to any form of accept4().)
  The linuxulator's accept4 implementation leaves a race window where the new
  file descriptor is not close-on-exec because it calls sys_accept().  This
  implementation leaves no such race window (by using falloc() flags).  The
  linuxulator could be fixed and simplified by using the new code.
  Like accept(), accept4() is async-signal-safe, a cancellation point and
  permitted in capability mode.
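  A short, self-contained usage sketch of the new interface (the helper name
  and the IPv4 address family are arbitrary choices for illustration):

      /* Accept a connection with the new descriptor atomically marked
       * close-on-exec and non-blocking, with no race window. */
      #include <sys/socket.h>
      #include <netinet/in.h>

      int
      accept_nonblock_cloexec(int listen_fd)
      {
              struct sockaddr_in peer;
              socklen_t len = sizeof(peer);

              /* With plain accept() the flags would have to be added
               * afterwards with fcntl(2), leaving a window in which a
               * concurrent fork/exec could leak the descriptor. */
              return (accept4(listen_fd, (struct sockaddr *)&peer, &len,
                  SOCK_CLOEXEC | SOCK_NONBLOCK));
      }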
* Remove extra code for SA_RESETHAND; it is not needed because the kernel has already done this.  (davidxu, 2013-04-28, 1 file, -7/+0)
* libthr: Fix a parameter name in an internal header file.  (jilles, 2013-04-27, 1 file, -1/+1)
* Remove debug code.  (davidxu, 2013-04-18, 1 file, -1/+0)
* Avoid copying memory if SIGCANCEL is not masked.  (davidxu, 2013-04-18, 1 file, -4/+14)
* Revert revision 249323; PR 177624 is confusing.  That bug is caused by using buggy getcontext/setcontext on the same stack, while swapcontext normally works on a different stack, where there is no such problem.  (davidxu, 2013-04-18, 1 file, -1/+10)
* libthr: Remove _thr_rtld_fini(), unused since r245630.  (jilles, 2013-04-12, 2 files, -12/+0)
* The swapcontext wrapper cannot be implemented in C: the stack pointer saved in the context becomes invalid when the function returns.  As with setjmp, it must be implemented in assembly language; see the discussion in PR misc/177624.  (davidxu, 2013-04-10, 1 file, -10/+1)
* libthr: Always use the threaded rtld lock implementation.  (jilles, 2013-01-18, 2 files, -5/+6)
  The threaded rtld lock implementation is faster even in the single-threaded
  case because it postpones signal handlers via THR_CRITICAL_ENTER and
  THR_CRITICAL_LEAVE instead of calling sigprocmask(2).  As a result,
  exception handling becomes faster in single-threaded applications linked
  with libthr.
  Reviewed by: kib
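  A conceptual sketch of the difference (not the libthr source; the structure
  and field names here are assumptions made for illustration): a per-thread
  counter defers signal handling without any system call, whereas blocking
  and unblocking signals costs a sigprocmask(2) call on each side of the
  critical section.

      #include <signal.h>

      struct thr {
              int critical_count;                  /* nesting depth */
              volatile sig_atomic_t deferred_sig;  /* set by the signal wrapper
                                                    * while critical_count > 0 */
      };

      static void
      critical_enter(struct thr *t)
      {
              t->critical_count++;                 /* no syscall needed */
      }

      static void
      critical_leave(struct thr *t)
      {
              if (--t->critical_count == 0 && t->deferred_sig != 0) {
                      int sig = t->deferred_sig;

                      t->deferred_sig = 0;
                      raise(sig);                  /* replay the postponed signal */
              }
      }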
* In suspend_common(), don't wait for a thread which is still being created, because pthread_suspend_all_np() may have already suspended its parent thread.  (davidxu, 2012-08-27, 5 files, -4/+64)
  Add locking code in pthread_suspend_all_np() so that only one thread at a
  time can suspend other threads; this eliminates a deadlock where two or
  more threads try to suspend each other.
* Eliminate redundant code: _thr_spinlock_init() has already been called in init_private(), so don't call it again in the fork() wrapper.  (davidxu, 2012-08-23, 1 file, -3/+0)
* Implement the clock_getcpuclockid2 syscall, so we can get a clock id for a process, thread, or anything else we want to support.  Use the syscall to implement the POSIX APIs clock_getcpuclockid and pthread_getcpuclockid.  (davidxu, 2012-08-17, 1 file, -1/+3)
  PR: 168417
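  A small caller-side example of the POSIX API mentioned above (standard
  pthread usage; nothing libthr-internal is assumed):

      /* Report the CPU time consumed so far by a given thread. */
      #include <pthread.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <time.h>

      void
      report_thread_cpu(pthread_t tid)
      {
              clockid_t cid;
              struct timespec ts;

              if (pthread_getcpuclockid(tid, &cid) == 0 &&
                  clock_gettime(cid, &ts) == 0)
                      printf("thread CPU time: %jd.%09ld s\n",
                          (intmax_t)ts.tv_sec, ts.tv_nsec);
      }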
* Merging of projects/armv6, part 2: handle TLS for ARMv6 and ARMv7.  (gonzo, 2012-08-15, 1 file, -1/+13)
* Do deferred mutex wakeup once.  (davidxu, 2012-08-12, 1 file, -0/+1)
* MFp4: Further decrease unexpected context switches by deferring mutex wakeup until the internal sleep-queue lock is released.  (davidxu, 2012-08-11, 5 files, -24/+38)
* Don't forget to initialize the return value.  (davidxu, 2012-07-20, 1 file, -1/+1)
* Simplify code by replacing _thr_ref_add() with _thr_find_thread().  (davidxu, 2012-07-20, 1 file, -5/+1)
* Eliminate duplicated code.  (davidxu, 2012-07-20, 1 file, -19/+10)
* Don't assign the same value.  (davidxu, 2012-07-20, 2 files, -6/+4)
* Eliminate duplicated code.  (davidxu, 2012-07-20, 1 file, -29/+14)
* Eliminate duplicated code.  (davidxu, 2012-07-20, 1 file, -30/+16)
* Don't forget to release a thread reference count: replace _thr_ref_add() with _thr_find_thread(), so the reference count is no longer needed.  (davidxu, 2012-07-20, 1 file, -4/+2)
  MFC after: 3 days
* Return EBUSY for PTHREAD_MUTEX_ADAPTIVE_NP too when the mutex could not be acquired.  (davidxu, 2012-05-27, 1 file, -0/+1)
  PR: 168317
  MFC after: 3 days
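  For context, a caller-side sketch of the behaviour involved (an assumption:
  the non-blocking pthread_mutex_trylock(3) path is shown, since the commit
  message does not name the call site):

      #include <errno.h>
      #include <pthread.h>
      #include <stdbool.h>

      /* m is assumed to have been initialized with type
       * PTHREAD_MUTEX_ADAPTIVE_NP via pthread_mutexattr_settype(). */
      static bool
      try_adaptive_lock(pthread_mutex_t *m)
      {
              switch (pthread_mutex_trylock(m)) {
              case 0:
                      return (true);   /* lock acquired */
              case EBUSY:
                      return (false);  /* could not be acquired right now */
              default:
                      return (false);  /* some other error */
              }
      }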
* Create a common function lookup() to search for a chan; this eliminates redundant SC_LOOKUP() calls.  (davidxu, 2012-05-10, 1 file, -5/+9)
* Fix a mis-merged line: move the SC_LOOKUP() call to the upper level.  (davidxu, 2012-05-05, 1 file, -1/+1)
* MFp4: Enqueue threads in LIFO order; this can cause starvation, but it gives better performance.  Use _thr_queuefifo to control the frequency of FIFO vs. LIFO; the environment variable LIBPTHREAD_QUEUE_FIFO configures it.  (davidxu, 2012-05-03, 3 files, -1/+10)
* Set SIGCANCEL to SIGTHR as part of some cleanup of DTrace code.  (gnn, 2012-04-18, 1 file, -1/+1)
  Reviewed by: davidxu@
  MFC after: 1 week
* The umtx operation UMTX_OP_MUTEX_WAKE has a side effect: it accesses a mutex after a thread has unlocked it, and it even writes data to the mutex memory to clear the contention bit.  (davidxu, 2012-04-05, 2 files, -5/+18)
  There is a race where other threads can lock the mutex, unlock it, and then
  destroy it, so the kernel should not write data to the mutex memory if
  there isn't any waiter.  The new operation UMTX_OP_MUTEX_WAKE2 tries to fix
  the problem.  It requires the thread library to clear the lock word
  entirely, then call the WAKE2 operation to check whether there is any
  waiter in the kernel and try to wake up a thread; if necessary, the
  contention bit is set again by the operation.  This also mitigates the
  chance that other threads find the contention bit and enter the kernel to
  compete with each other to wake up the sleeping thread, which is
  unnecessary.
  With this change, the mutex owner no longer holds the mutex until it
  reaches a point where the kernel umtx queue is locked; it releases the
  mutex as soon as possible.  Performance is improved when the mutex is
  heavily contended.  On an Intel i3-2310M, the runtime of a benchmark
  program is reduced from 26.87 seconds to 2.39 seconds, which is even better
  than UMTX_OP_MUTEX_WAKE, now deprecated.
  http://people.freebsd.org/~davidxu/bench/mutex_perf.c
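  A heavily simplified sketch of the unlock ordering this enables (an assumed
  reduction, not the actual libthr code: the real implementation works on the
  umutex embedded in the pthread mutex and passes its stored flags):

      /* Release-then-wake: clear the lock word first, then let the kernel
       * deal with any waiters via UMTX_OP_MUTEX_WAKE2. */
      #include <sys/types.h>
      #include <sys/umtx.h>
      #include <machine/atomic.h>
      #include <stdint.h>

      static void
      release_then_wake(struct umutex *m, uint32_t flags)
      {
              uint32_t old;

              /* Clear the whole lock word; from this instant other threads
               * may lock, unlock, and even destroy the mutex safely. */
              do {
                      old = m->m_owner;
              } while (!atomic_cmpset_rel_32((volatile uint32_t *)&m->m_owner,
                  old, UMUTEX_UNOWNED));

              /* Enter the kernel only if contention was recorded; WAKE2
               * wakes one waiter and re-sets the contention bit itself when
               * more waiters remain, so userland never writes to mutex
               * memory it no longer owns. */
              if ((old & UMUTEX_CONTESTED) != 0)
                      (void)_umtx_op(m, UMTX_OP_MUTEX_WAKE2, flags, NULL, NULL);
      }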
* libthr: In the atfork handlers for signals, do not skip the last signal.  (jilles, 2012-03-26, 1 file, -3/+3)
  _SIG_MAXSIG works a bit unexpectedly: signals 1 till _SIG_MAXSIG are valid,
  both bounds inclusive.
  Reviewed by: davidxu
  MFC after: 1 week
* Use the clockid parameter instead of hard-coded CLOCK_REALTIME.  (davidxu, 2012-03-19, 1 file, -1/+1)
  Reported by: pjd
* Some software assumes a mutex can be destroyed right after acquiring it, for example using a serialization point like the following:  (davidxu, 2012-03-18, 1 file, -7/+0)
      pthread_mutex_lock(&mutex);
      pthread_mutex_unlock(&mutex);
      pthread_mutex_destroy(&mutex);
  The assumption is that a previous lock holder should have already left the
  mutex and is no longer referencing it, so it is destroyed.  To be maximally
  compatible with such code, use the IA64 version to unlock the mutex in the
  kernel and remove the two-step unlocking code.
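  A self-contained illustration of the pattern being accommodated
  (application-style code, not part of libthr; the structure and function
  names are invented for the example):

      #include <pthread.h>
      #include <stdlib.h>

      struct shared {
              pthread_mutex_t mtx;
              /* ... payload ... */
      };

      /* Worker: briefly holds the mutex while using the object. */
      void
      worker_touch(struct shared *s)
      {
              pthread_mutex_lock(&s->mtx);
              /* ... use s ... */
              pthread_mutex_unlock(&s->mtx);
      }

      /* Owner: uses the mutex purely as a serialization point, then assumes
       * no previous holder still references it. */
      void
      owner_teardown(struct shared *s)
      {
              pthread_mutex_lock(&s->mtx);
              pthread_mutex_unlock(&s->mtx);
              pthread_mutex_destroy(&s->mtx); /* relies on unlock not touching
                                               * the mutex memory afterwards */
              free(s);
      }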
* When destroying a barrier, wait for all threads to exit the barrier; this makes it possible for the thread that received PTHREAD_BARRIER_SERIAL_THREAD to immediately free the memory area of the barrier.  (davidxu, 2012-03-16, 2 files, -4/+31)
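  The usage pattern this makes safe, as a short application-side sketch (not
  libthr code; the helper name is invented):

      #include <pthread.h>
      #include <stdlib.h>

      /* Exactly one waiter receives PTHREAD_BARRIER_SERIAL_THREAD; with this
       * change it may tear the barrier down immediately, because destroy now
       * waits until every other thread has fully left the barrier. */
      void
      wait_and_maybe_destroy(pthread_barrier_t *bar)
      {
              if (pthread_barrier_wait(bar) == PTHREAD_BARRIER_SERIAL_THREAD) {
                      pthread_barrier_destroy(bar);
                      free(bar);
              }
      }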
* Switch to saving a non-offset pointer to the TLS block in order to keep things simple.  (gonzo, 2012-03-06, 1 file, -8/+4)
* Follow the changes made in revision 232144 and pass an absolute timeout to the kernel; this eliminates a clock_gettime() syscall.  (davidxu, 2012-02-27, 3 files, -30/+37)
* In revision 231989, we passed a 16-bit clock ID into the kernel; however, according to the POSIX document, clock IDs may be dynamically allocated, so they are unlikely to stay within 64K forever.  (davidxu, 2012-02-25, 1 file, -17/+32)
  To make this future-compatible, pack all timeout information into a new
  structure called _umtx_time, and use the fourth argument as a size
  indication: zero means old code using a timespec as the timeout value,
  while the new structure also includes flags and a clock ID, so its size
  argument differs from before and is non-zero.  With this change, a thread
  can sleep on any supported clock, though the current kernel code does not
  have such a POSIX clock driver system.
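  For reference, a sketch of the structure as described above (check
  sys/umtx.h for the authoritative definition; the field comments are
  interpretations of the commit message):

      /* Timeout descriptor passed through the fourth _umtx_op() argument;
       * a non-zero size tells the kernel this format is in use rather than
       * a bare struct timespec. */
      #include <time.h>
      #include <stdint.h>

      struct _umtx_time {
              struct timespec _timeout;   /* the timeout value itself */
              uint32_t        _flags;     /* e.g. absolute vs. relative time */
              uint32_t        _clockid;   /* clock the timeout is measured on */
      };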
* Use the unused fourth argument of umtx_op to pass flags to the kernel for the UMTX_OP_WAIT operation.  (davidxu, 2012-02-22, 1 file, -13/+3)
  The upper 16 bits are enough to hold a clock ID, and the lower 16 bits are
  used to pass flags.  The change saves a clock_gettime() syscall in libthr.
* Check that both seconds and nanoseconds are zero; only checking that nanoseconds are zero may trigger the timeout too early.  It seems to be a copy-and-paste bug.  (davidxu, 2012-02-19, 1 file, -1/+1)
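  The shape of the corrected check, as a sketch (the function and variable
  names are placeholders, not the libthr identifiers):

      #include <stdbool.h>
      #include <time.h>

      /* The condition of interest is "both fields zero"; testing only
       * tv_nsec would also match values such as {5, 0} and so fire the
       * timeout too early. */
      static bool
      timeout_is_zero(const struct timespec *ts)
      {
              return (ts->tv_sec == 0 && ts->tv_nsec == 0);
      }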
* Add thread-local storage support for arm:  (gonzo, 2012-02-14, 2 files, -4/+5)
  - Switch to Variant I TCB layout
  - Use function from rtld for TCB allocation/deallocation
* Make the code more stable by checking for NULL pointers.  (davidxu, 2012-02-11, 1 file, -2/+6)
* Switch the MIPS TLS implementation to Variant I: save the pointer to the TLS structure taking into account TP_OFFSET and the TCB structure size.  (gonzo, 2012-02-10, 2 files, -13/+20)
* Plug a memory leak.  When a cached thread is reused, don't clear the sleep queue pointers; just reuse them.  (davidxu, 2012-02-07, 2 files, -12/+19)
  PR: 164828
  MFC after: 1 week
* Use the getcontextx(3) internal API instead of getcontext(2) to provide the signal handlers with the context information in the deferred case.  (kib, 2012-01-21, 1 file, -4/+13)
  Only enable the use of getcontextx(3) in the deferred signal delivery code
  on amd64 and i386.  Sparc64 seems to have some undetermined issues with the
  interaction of alloca(3) and signal delivery.
  Tested by: flo (who also provided sparc64 hardware access for me), pho
  Discussed with: marius
  MFC after: 1 month
* The TCB_GET32() and TCB_GET64() macros in the i386- and amd64-specific versions of pthread_md.h have a special case of dereferencing a null pointer.  (dim, 2011-12-15, 2 files, -2/+2)
  Clang warns about this with:
      In file included from lib/libthr/arch/i386/i386/pthread_md.c:36:
      lib/libthr/arch/i386/include/pthread_md.h:96:10: error: indirection of
          non-volatile null pointer will be deleted, not trap
          [-Werror,-Wnull-dereference]
              return (TCB_GET32(tcb_self));
                      ^~~~~~~~~~~~~~~~~~~
      lib/libthr/arch/i386/include/pthread_md.h:73:13: note: expanded from:
              : "m" (*(u_int *)(__tcb_offset(name)))); \
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      lib/libthr/arch/i386/include/pthread_md.h:96:10: note: consider using
          __builtin_trap() or qualifying pointer with 'volatile'
  Since this indirection is done relative to the fs or gs segment, to
  retrieve thread-specific data, it is an exception to the rule.  Therefore,
  add a volatile qualifier to tell the compiler we really want to dereference
  a zero address.
  MFC after: 1 week
* Pass CVWAIT flags to the kernel; this should handle the timeout correctly for pthread_cond_timedwait when it uses a kernel-based condition variable.  (davidxu, 2011-11-17, 1 file, -3/+2)
  PR: 162403
  Submitted by: jilles
  MFC after: 3 days
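  A standard caller-side example of the kind of timed wait whose timeout
  handling is at stake here (ordinary POSIX usage; it assumes the condition
  variable was initialized with its clock set to CLOCK_MONOTONIC via
  pthread_condattr_setclock()):

      #include <errno.h>
      #include <pthread.h>
      #include <time.h>

      /* Wait up to one second for *predicate to become non-zero. */
      int
      wait_up_to_one_second(pthread_cond_t *cv, pthread_mutex_t *mtx,
          int *predicate)
      {
              struct timespec abstime;
              int error = 0;

              clock_gettime(CLOCK_MONOTONIC, &abstime);
              abstime.tv_sec += 1;

              pthread_mutex_lock(mtx);
              while (!*predicate && error != ETIMEDOUT)
                      error = pthread_cond_timedwait(cv, mtx, &abstime);
              pthread_mutex_unlock(mtx);
              return (error);
      }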