summaryrefslogtreecommitdiffstats
path: root/libexec/rtld-elf/amd64
Commit message (Collapse)AuthorAgeFilesLines
* MFC r315331:kib2017-03-292-14/+18
| | | | | | | | | | Implement LD_BIND_NOT knob for rtld. MFC r315337: Disable LD_BIND_NOT for setugid processes. MFC r315429 (by jilles): Document that LD_BIND_NOT is unset for setugid processes.
* MFC r312288: rtld: do not rely on a populated GOT on amd64emaste2017-01-262-2/+14
| | | | | | | | | | | | | | | | | | On rela architectures GNU BFD ld and gold store the relocation addend in GOT entries (in addition to the relocation's r_addend field). rtld previously relied on this to access its own _DYNAMIC symbol in order to apply its own relocations. However, recording addends in the GOT is not specified by the ABI, and some versions of LLVM's LLD linker leave the GOT uninitialized on rela architectures. BFD ld does not populate the GOT on sparc64, and sparc64 rtld has a machine-dependent rtld_dynamic_addr() function that returns the _DYNAMIC address. Use the same approach on amd64, obtaining the %rip- relative _DYNAMIC address following a suggestion from Rafael EspĂ­ndola. Architectures other than amd64 should be addressed in future work.
* MFC r308689:kib2016-11-232-1/+29
| | | | | | | | | | Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the ifunc resolvers on x86. MFC r308925: Adjust r308689 to make rtld compilable with either in-tree or (hopefully) stock gcc 4.2.1 on i386 and other arches.
* Do not call callbacks for dl_iterate_phdr(3) with the rtld bind andkib2016-01-201-1/+2
| | | | | | | | | | | | | | | | | | phdr locks locked. This allows to call rtld services from the callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to test existence of the library in the image, and for dlsym(). The later might still be not quite safe, due to the lazy resolution of filters. To allow dropping the locks around iteration in dl_iterate_phdr(3), we insert markers to track current position between relocks. The global objects list is converted to tailq and all iterators skip markers, globallist_next() and globallist_curr() helpers are added. Reported and tested by: davide Reviewed by: kan Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
* rtld: wrap a comment to 80 columnsemaste2016-01-051-2/+2
|
* Create a generalized exec hook that different architectures can hookimp2016-01-031-0/+2
| | | | | | into if they need to, but default to no action. Differential Review: https://reviews.freebsd.org/D2718
* Disable SSE in libthrvangyzen2015-08-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. If the thread does not otherwise use SSE, this usage incurs a context-switch of the FPU/SSE state, which reduces the performance of multiple real-world applications by a non-trivial amount (3-5% in one application). Instead of this change, I experimented with eagerly switching the FPU state at context-switch time. This did not help. Most of the cost seems to be in the read/write of memory--as kib@ stated--and not in the #NM handling. I tested on machines with and without XSAVEOPT. One counter-argument to this change is that most applications already use SIMD, and the number of applications and amount of SIMD usage are only increasing. This is absolutely true. I agree that--in general and in principle--this change is in the wrong direction. However, there are applications that do not use enough SSE to offset the extra context-switch cost. SSE does not provide a clear benefit in the current libthr code with the current compiler, but it does provide a clear loss in some cases. Therefore, disabling SSE in libthr is a non-loss for most, and a gain for some. I refrained from disabling SSE in libc--as was suggested--because I can't make the above argument for libc. It provides a wide variety of code; each case should be analyzed separately. https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html Suggestions from: dim, jmg, rpaulo Approved by: kib (mentor) MFC after: 2 weeks Sponsored by: Dell Inc.
* Change compiler setting to make default visibility of the symbols forkib2015-03-293-3/+4
| | | | | | | | | | | rtld on x86 to be hidden. This is a micro-optimization, which allows intrinsic references inside rtld to be handled without indirection through PLT. The visibility of rtld symbols for other objects in the symbol namespace is controlled by a version script. Reviewed by: kan, jilles Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Optimize r270798, only do the second pass over non-plt relocationskib2014-08-291-1/+3
| | | | | | | when the first pass found IFUNCs. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* IFUNC symbol type shall be processed for non-PLT relocations,kib2014-08-291-188/+161
| | | | | | | | | | | | | | | | | | | e.g. when a global variable is initialized with a pointer to ifunc. Add symbol type check and call resolver for STT_GNU_IFUNC symbol types when processing non-PLT relocations, but only after non-IFUNC relocations are done. The two-phase proceessing is required since resolvers may reference other symbols, which must be ready to use when resolver calls are done. Restructure reloc_non_plt() on x86 to call find_symdef() and handle IFUNC in single place. For non-x86 reloc_non_plt(), check for call for IFUNC relocation and do nothing, to avoid processing relocs twice. PR: 193048 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Add dwarf annotations to the amd64 _rtld_bind_start to allow debuggerskib2014-04-141-0/+43
| | | | | | | to unwind around the calls from PLT to binder. Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Add GNU hash support for rtld.kib2012-04-301-1/+1
| | | | | | | Based on dragonflybsd support for GNU hash by John Marino <draco marino st> Reviewed by: kan Tested by: bapt MFC after: 2 weeks
* Fix several problems with our ELF filters implementation.kib2012-03-201-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not relocate twice an object which happens to be needed by loaded binary (or dso) and some filtee opened due to symbol resolution when relocating need objects. Record the state of the relocation processing in Obj_Entry and short-circuit relocate_objects() if current object already processed. Do not call constructors for filtees loaded during the early relocation processing before image is initialized enough to run user-provided code. Filtees are loaded using dlopen_object(), which normally performs relocation and initialization. If filtee is lazy-loaded during the relocation of dso needed by the main object, dlopen_object() runs too earlier, when most runtime services are not yet ready. Postpone the constructors call to the time when main binary and depended libraries constructors are run, passing the new flag RTLD_LO_EARLY to dlopen_object(). Symbol lookups callers inform symlook_* functions about early stage of initialization with SYMLOOK_EARLY. Pass flags through all functions participating in object relocation. Use the opportunity and fix flags argument to find_symdef() in arch-specific reloc.c to use proper name SYMLOOK_IN_PLT instead of true, which happen to have the same numeric value. Reported and tested by: theraven Reviewed by: kan MFC after: 2 weeks
* Add support for preinit, init and fini arrays. Some ABIs, inkib2012-03-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | particular on ARM, do require working init arrays. Traditional FreeBSD crt1 calls _init and _fini of the binary, instead of allowing runtime linker to arrange the calls. This was probably done to have the same crt code serve both statically and dynamically linked binaries. Since ABI mandates that first is called preinit array functions, then init, and then init array functions, the init have to be called from rtld now. To provide binary compatibility to old FreeBSD crt1, which calls _init itself, rtld only calls intializers and finalizers for main binary if binary has a note indicating that new crt was used for linking. Add parsing of ELF notes to rtld, and cache p_osrel value since we parsed it anyway. The patch is inspired by init_array support for DragonflyBSD, written by John Marino. Reviewed by: kan Tested by: andrew (arm, previous version), flo (sparc64, previous version) MFC after: 3 weeks
* Remove unneeded dtv variable.ed2012-01-171-2/+0
| | | | | | | It is only assigned and not used at all. The object files stay identical when the variables are removed. Approved by: kib
* _rtld_bind() read-locks the bind lock, and possible plt resolutionkib2011-12-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | from the dispatcher would also acquire bind lock in read mode, which is the supported operation. plt is explicitely designed to allow safe multithreaded updates, so the shared lock do not cause problems. The error in r228435 is that it allows read lock acquisition after the write lock for the bind block. If we dlopened the shared object that contains IRELATIVE or jump slot which target is STT_GNU_IFUNC, then possible recursive plt resolve from the dispatcher would cause it. Postpone the resolution for irelative/ifunc right before initializers are called, and drop bind lock around calls to dispatcher. Use initlist to iterate over the objects instead of the ->next, due to drop of the bind lock in iteration. For i386/reloc.c:reloc_iresolve(), fix calculation of the dispatch function address for dso, by taking into account possible non-zero relocbase. MFC after: 3 weeks
* Add support for STT_GNU_IFUNC and R_MACHINE_IRELATIVE GNU extensions tokib2011-12-121-13/+96
| | | | | | | | | | | | | | | | | rtld on 386 and amd64. This adds runtime bits neccessary for the use of the dispatch functions from the dynamically-linked executables and shared libraries. To allow use of external references from the dispatch function, resolution of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries for PLT are prepared, and normal resolution of the GOT entries is finished. Similar to how it is done by GNU, IRELATIVE relocations are resolved in advance, instead of normal lazy handling for PLT. Move the init_pltgot() call before the relocations for the object are processed. MFC after: 3 weeks
* - change "is is" to "is" or "it is"eadler2011-10-161-1/+1
| | | | | | | | - change "the the" to "the" Approved by: lstewart Approved by: sahil (mentor) MFC after: 3 days
* When loading dso without PT_GNU_STACK phdr, only callkib2011-01-251-0/+3
| | | | | | | __pthread_map_stacks_exec() on architectures that allow executable stacks. Reported and tested by: marcel (ia64)
* Add section .note.GNU-stack for assembly files used by 386 and amd64.kib2011-01-071-0/+2
|
* Sort -mno-(mmx|3dnow|sse|sse2|sse3) options consistently throughout thedim2011-01-051-1/+1
| | | | | | tree. Submitted by: arundel
* On amd64 and i386, tell the compiler to refrain from generating SSE,dim2011-01-041-0/+1
| | | | | | | | | | | | | | | | 3DNow, MMX and floating point instructions in rtld-elf. Otherwise, _rtld_bind() (and whatever it calls) could possibly clobber function arguments that are passed in SSE/3DNow/MMX/FP registers, usually floating point values. This can happen, for example, when clang generates SSE code for memset() or memcpy() calls. One symptom of this is sshd dying early on amd64 with "PRNG not seeded", which is ultimately caused by libcrypto.so.6 calling RAND_add() with a double parameter. That parameter is passed via %xmm0, which gets wiped out by an SSE memset() in _rtld_bind(). Reviewed by: kib, kan
* Remove '-elf' from build flags for libexec/rtld-elf for amd64 and i386.dim2011-01-041-2/+0
| | | | ELF has been the default format for almost 12 years now.
* Implement support for ELF filters in rtld. Both normal and auxillarykib2010-12-251-20/+26
| | | | | | | | | | | | | | | | | filters are implemented. Filtees are loaded on demand, unless LD_LOADFLTR environment variable is set or -z loadfltr was specified during the linking. This forces rtld to upgrade read-locked rtld_bind_lock to write lock when it encounters an object with filter during symbol lookup. Consolidate common arguments of the symbol lookup functions in the SymLook structure. Track the state of the rtld locks in the RtldLockState structure. Pass local RtldLockState through the rtld symbol lookup calls to allow lock upgrades. Reviewed by: kan Tested by: Mykola Dzham <i levsha me>, nwhitehorn (powerpc)
* MFtbemd:imp2010-08-231-1/+3
| | | | | Prefer MACHNE_CPUARCH to MACHINE_ARCH in most contexts where you want to test of all the CPUs of a given family conform.
* Only use the cache after the early stage of loading. This isrdivacky2010-05-181-5/+6
| | | | | | | | | | because calling mmap() etc. may use GOT which is not set up yet. Use calloc() instead of mmap() in cases where this was the case before (sparc64, powerpc, arm). Submitted by: Dimitry Andric (dimitry andric com) Reviewed by: kan Approved by: ed (mentor)
* Now that the kernel defines CACHE_LINE_SIZE in machine/param.h, userwatson2009-04-191-2/+0
| | | | | | | | that definition in the custom locking code for the run-time linker rather than local definitions. Pointed out by: tinderbox MFC after: 2 weeks
* *thwack*! all the world's not i386.des2006-03-291-0/+2
| | | | Pointy hat to: des
* Allocate space for thread pointer, this allows thread library to accessdavidxu2006-03-281-1/+1
| | | | its pointer from begin, and simplifies _get_curthread() in libthr.
* Implement ELF symbol versioning using GNU semantics. This code aimskan2005-12-181-1/+3
| | | | | | | | | to be compatible with symbol versioning support as implemented by GNU libc and documented by http://people.redhat.com/~drepper/symbol-versioning and LSB 3.0. Implement dlvsym() function to allow lookups for a specific version of a given symbol.
* Explicitly cast ELF_R_TYPE() to the right type.marcel2005-12-181-2/+2
|
* Remove these unused files before any other archs include the same bogusjhb2004-11-121-202/+0
| | | | file.
* Add support for Thread Local Storage.dfr2004-08-032-0/+145
|
* More stack alignment fixes. Arrange so we call _rtld() in ld-elf.so.1peter2004-03-211-8/+9
| | | | | | | | | with the correct alignment. This is important because this calls to library static constructors are made from here. The bug in the old crt*.s files hid this because in this case, two wrongs do indeed make a right. Also, call _rtld_bind() with the correct alignment, because it calls back into the pthread library locking functions. If things happen just the wrong way, we get a SIG10 due to the broken stack alignment.
* Fix dynamic linking a bit more.. enough that mozilla-firebird works if youpeter2003-12-121-3/+3
| | | | | | dig up the patches for amd64 support for it. Note to self: do not put a 64 bit value in a 32 bit space.
* Revert last change. ../rtld.c uses CACHE_LINE_SIZE too.peter2003-12-111-0/+2
| | | | | | Change it to 64 while here. Reported by: ps
* Only define CACHE_LINE_SIZE in one place..peter2003-12-111-2/+0
|
* CACHE_LINE_SIZE is 64 on athlon and amd64 chips, not 32. This shouldpeter2003-12-111-1/+1
| | | | | probably be 128 since that is what the hardware prefetch fill size is on both the p3, p4 and athlon* cpus.
* Allow threading libraries to register their own lockingkan2003-05-291-21/+5
| | | | | | | | | | implementation in case default one provided by rtld is not suitable. Consolidate various identical MD lock implementation into a single file using appropriate machine/atomic.h. Approved by: re (scottl)
* Initial pass at supporting shared libraries on amd64. There are stillpeter2003-05-242-73/+101
| | | | | | | a few missing relocation types in amd64/reloc.c, but I have not found any of them in use yet. :-) Approved by: re (amd64/* blanket)
* Remove 80386 bandaids from code repocopied from i386. rtld_start.S stillpeter2003-04-301-78/+4
| | | | todo.
* No need to zero fill memory, mmapped anonymously. Kernel willkan2003-03-141-2/+0
| | | | | | return pre-zeroed pages itself. Noticed by: jake
* Fix the handling of high PLT entries (> 32764) on sparc64. This requirestmm2002-11-182-2/+3
| | | | | | | | additional arguments to reloc_jmpslot(), which is why MI code and MD code of other platforms had to be changed. Reviewed by: jake Approved by: re
* Remove the nanosleep calls from the spin loops in the locking code.jdp2002-07-061-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They provided little benefit (if any) and they caused some problems in OpenOffice, at least in post-KSE -current and perhaps in other environments too. The nanosleep calls prevented the profiling timer from advancing during the spinloops, thereby preventing the thread scheduler from ever pre-empting the spinning thread. Alexander Kabaev diagnosed this problem, Martin Blapp helped with testing, and Matt Dillon provided some helpful suggestions. This is a short-term fix for a larger problem. The use of spinlocking isn't guaranteed to work in all cases. For example, if the spinning thread has higher priority than all other threads, it may never be pre-empted, and the thread holding the lock may never progress far enough to release the lock. On the other hand, spinlocking is the only locking that can work with an arbitrary unknown threads package. I have some ideas for a much better fix in the longer term. It would eliminate all locking inside the dynamic linker by making it safe for symbol lookups and lazy binding to proceed in parallel with a call to dlopen or dlclose. This means that the only mutual exclusion needed would be to prevent multiple simultaneous calls to dlopen and/or dlclose. That mutual exclusion could be put into the native pthreads library. Applications using foreign threads packages would have to make their own arrangements to ensure that they did not have multiple threads in dlopen and/or dlclose -- a reasonable requirement in my opinion. MFC after: 3 days
* Update the asm statements to use the "+" modifier instead ofjdp2002-06-242-8/+8
| | | | | | | | | | matching constraints where appropriate. This makes the dynamic linker buildable at -O0 again. Thanks to Bruce Evans for identifying the cause of the build problem. MFC after: 1 week
* Correct a bug in the last commit. The whole point of creating a 'done:'dillon2002-06-101-3/+3
| | | | | | | goto target was so the cache could be freed. So free the cache after done: rather then before done: (!) Submitted by: Gavin Atkinson <gavin@ury.york.ac.uk>
* In tracking down an installation seg fault with then openoffice portdillon2002-06-101-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | Martin Blapp determined that the elf dynamic loader was at fault. In particular, the loader uses alloca() to allocate a symbol cache on the stack. Normally this would work just fine, but if the loader is called from a threaded program and the object being loaded is fairly large the alloca() can blow away the thread stack and effect other nearby thread stacks as well. My testing showed that the symbol cache can be as large as 250KBytes during the openoffice port build and install sequence. Martin was able to work around the problem by disabling the symbol cache (cache = NULL;). However, this solution is not adequate for commit because it can cause an enormous cpu burden for applications which do a lot of dynamic loading (e.g. like konqueror). The solution is to use anonymous mmap() to temporarily allocate space to hold the symbol cache. In testing I found that replacing the alloca() with mmap() has no observable degredation in performance. It should be noted that this bug does not necessarily cause an immediate crash but can instead result in long term corruption and instability in applications that load modules from threads. The bug is almost certainly responsible for some of the instabilities found in konqueror, for example, and possibly netscape too. Sleuthing work by: Martin Blapp <mb@imp.ch> X-MFC after: Before or after the 4.6 release depending on the release engineers
* Update rtld for the "new" ia64 ABI. In the old toolchain, thepeter2001-10-291-0/+3
| | | | | | | | | | | | | | | DT_INIT and DT_FINI tags pointed to fptr records. In 2.11.2, it points to the actuall address of the function. On IA64 you cannot just take an address of a function, store it in a function pointer variable and call it.. the function pointers point to a fptr data block that has the target gp and address in it. This is absolutely necessary for using the in-tree binutils toolchain, but (unfortunately) will not work with old shared libraries. Save your old ld-elf.so.1 if you want to use old ones still. Do not mix-and-match. This is a no-op change for i386 and alpha. Reviewed by: dfr
* Add ia64 support. Various adjustments were made to existing targets todfr2001-10-152-8/+17
| | | | | cope with a few interface changes required by the ia64. In particular, function pointers on ia64 need special treatment in rtld.
* Performance improvements for the ELF dynamic linker. Thesejdp2001-05-051-4/+9
| | | | | | | | | | | | | | | | | | | | particularly help programs which load many shared libraries with a lot of relocations. Large C++ programs such as are found in KDE are a prime example. While relocating a shared object, maintain a vector of symbols which have already been looked up, directly indexed by symbol number. Typically, symbols which are referenced by a relocation entry are referenced by many of them. This is the same optimization I made to the a.out dynamic linker in 1995 (rtld.c revision 1.30). Also, compare the first character of a sought-after symbol with its symbol table entry before calling strcmp(). On a PII/400 these changes reduce the start-up time of a typical KDE program from 833 msec (elapsed) to 370 msec. MFC after: 5 days
OpenPOWER on IntegriCloud