path: root/sys/amd64
Commit message | Author | Age | Files | Lines
* MFC: 323068 | jpaetzel | 2017-09-19 | 1 | -0/+1
    Allow kldload tcpmd5

    PR: 220170
* MFC r322940: | rlibby | 2017-09-12 | 2 | -4/+4
    amd64: drop q suffix from rd[fg]sbase for gas compatibility
* MFC r321284: | rlibby | 2017-09-12 | 1 | -0/+6
    efi: restrict visibility of EFIABI_ATTR-declared functions
* MFC r323032, r323053, r323058, r323059, r323084, r323114, r323127: | mav | 2017-09-11 | 1 | -3/+8
    Add NTB driver for PLX/Avago/Broadcom PCIe switches.

    This driver supports both NTB-to-NTB and NTB-to-Root Port modes (though the second with predictable complications on hot-plug and reboot events). I tested it with PEX 8717 and PEX 8733 chips, but expect it should work with many other compatible ones too. It supports up to two NT bridges per chip, each of which can have up to 2 64-bit or 4 32-bit memory windows, 6 or 12 scratchpad registers and 16 doorbells. There are also 4 DMA engines in those chips, but they are not yet supported.

    While there, rename Intel NTB driver from generic ntb_hw(4) to more specific ntb_hw_intel(4), so now it is on par with this new ntb_hw_plx(4) driver and alike to Linux naming.
* MFC r320901-r320902, r320996-r320997, r321002, r321048, r321400, r321743, | ian | 2017-09-11 | 1 | -6/+7
    r321745

    r320901: Protect access to the AT realtime clock with its own mutex.

    The mutex protecting access to the registered realtime clock should not be overloaded to protect access to the atrtc hardware, which might not even be the registered rtc. More importantly, the resettodr mutex needs to be eliminated to remove locking/sleeping restrictions on clock drivers, and that can't happen if MD code for amd64 depends on it. This change moves the protection into what's really being protected: access to the atrtc date and time registers.

    This change also adds protection when the clock is accessed from xentimer_settime(), which bypasses the resettodr locking.

    Differential Revision: https://reviews.freebsd.org/D11483

    r320902: Support multiple realtime clocks, and remove locking/sleeping restrictions on clock drivers.

    This tracks multiple concurrent realtime clock drivers in a list sorted by clock resolution. When system time changes (and periodically) the clock_settime() methods of all registered clocks are invoked. To initialize system time, each driver is tried in turn from best to worst resolution, until one successfully returns a valid time.

    The code no longer holds a mutex while calling the clock_settime() and clock_gettime() methods of the registered clocks. This allows clock drivers to do whatever kind of locking or sleeping is necessary (this is especially important for i2c clock chips since i2c drivers often need to sleep).

    A new clock_register_flags() function allows the clock driver to pass flags. The flags currently defined help support drivers that use their own techniques to avoid roundoff errors (prevents the 4/5 rounding done by the subr_rtc code). A driver which may need to wait for resources (such as bus ownership) may pass a flag to indicate that it will obtain system time for itself after waiting for resources; this is merely an optimization to avoid the common code retrieving a timespec that will never get used.

    Relnotes: yes
    Differential Revision: https://reviews.freebsd.org/D11484

    r320996: Allow setting debug.clocktime as a tunable. Print 64-bit time_t correctly on 32-bit systems.

    r320997: Minor optimization: instead of converting between days and years using loops that start in 1970, assume most conversions are going to be for recent dates and use a precomputed number of days through the end of 2016.

    r321002: Revert r320997. There are reports of it getting the wrong results, so clearly my testing was insufficient, and it's best to just revert it until I get it straightened out.

    r321048: Minor optimization: instead of converting between days and years using loops that start in 1970, assume most conversions are going to be for recent dates and use a precomputed number of days through the end of 2016.

    This is a do-over of r320997, hopefully this time with 100% more workiness.

    The first attempt had an off-by-one error, but instead of just adding another mysterious +1 adjustment, this rearranges the relationship between recent_base_year and recent_base_days so that the latter is the number of days that occurred before the start of the associated year (instead of the count thru the end of that year). This makes the recent_base stuff work more like the original loop logic that didn't need any +1 adjustments.

    r321400: Add common code to support realtime clocks that store year without century.

    Most realtime clocks store the year as 2 BCD digits. Some add a century bit to extend the range another hundred years. Every clock driver has its own code to determine the century and pass a full year value to clock_ct_to_ts(). Now clock drivers can just convert BCD to bin and store the result in the clocktime struct and let the common code figure out the century. Clocks with a century bit can just add 100 to year if the century bit is on.

    r321743: Add taskqueue_enqueue_timeout_sbt(), because sometimes you want more control over the scheduling precision than 'ticks' can offer, and because sometimes you're already working with sbintime_t units and it's dumb to convert them to ticks just so they can get converted back to sbintime_t under the hood.

    r321745: Add clock_schedule(), a feature that allows realtime clock drivers to request that their clock_settime() methods be called at a given offset from top-of-second. This adds a timeout_task to the rtc_instance so that each clock can be separately added to taskqueue_thread with the scheduling it prefers, instead of looping through all the clocks at once with a single task on taskqueue_thread. If a driver doesn't call clock_schedule() the default is the old behavior: clock_settime() is queued immediately.
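The r321048 and r321400 descriptions above can be illustrated with a small sketch. It is not the kernel code: the function names are hypothetical, the base constants are chosen to match the "days before the start of the base year" definition in the message (17167 days precede 2017-01-01), and the pivot of 70 for picking the century is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for this sketch. */
#define POSIX_BASE_YEAR		1970
#define RECENT_BASE_YEAR	2017
#define RECENT_BASE_DAYS	17167	/* days before 2017-01-01, counted from 1970-01-01 */

/* Convert one BCD byte (two decimal digits) to its binary value. */
int
bcd_to_bin(uint8_t bcd)
{
	assert((bcd >> 4) <= 9 && (bcd & 0x0f) <= 9);
	return ((bcd >> 4) * 10 + (bcd & 0x0f));
}

/*
 * Expand a year stored without century.  Assumed rule: values below 70
 * belong to 20xx, other values below 1900 are offsets from 1900.  A driver
 * whose chip has a century bit adds 100 before calling this.
 */
int
full_year(int year)
{
	if (year < 70)
		return (year + 2000);
	if (year < 1900)
		return (year + 1900);
	return (year);
}

int
days_in_year(int year)
{
	int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;

	return (leap ? 366 : 365);
}

/* Days from 1970-01-01 up to (not including) January 1 of the given year. */
int64_t
days_before_year(int year)
{
	int y = POSIX_BASE_YEAR;
	int64_t days = 0;

	assert(year >= POSIX_BASE_YEAR);
	if (year >= RECENT_BASE_YEAR) {
		/* Recent date: start from the precomputed base instead of 1970. */
		y = RECENT_BASE_YEAR;
		days = RECENT_BASE_DAYS;
	}
	for (; y < year; y++)
		days += days_in_year(y);
	return (days);
}
```

Because the base day count is defined as the days before the start of the base year, the shortcut path and the loop from 1970 share the same loop body and need no +1 correction.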
* MFC r322762, r322799, r322832, r322833: | kib | 2017-09-11 | 9 | -63/+194
    Make WRFSBASE and WRGSBASE instructions functional.

    Bump stable/11 __FreeBSD_version.
* MFC r322720,r322723: | kib | 2017-08-27 | 1 | -96/+86
    Simplify amd64 trap().
* MFC r322719: | kib | 2017-08-27 | 1 | -3/+2
    Trim excessive 'extern' and remove unused declaration.
* MFC r322718: | kib | 2017-08-27 | 1 | -8/+11
    Use ANSI C declaration for trap_pfault(). Style.
* MFC r322496: | kib | 2017-08-21 | 1 | -4/+18
    Print whole machine state on double fault.
* MFC r322495: | kib | 2017-08-21 | 1 | -0/+32
    Add {rd,wr}{fs,gs}base C wrappers for instructions.
* MFC r322494: | kib | 2017-08-17 | 1 | -5/+11
    Style.
* MFC r321899 | truckman | 2017-08-16 | 4 | -1/+52
    Lower the amd64 shared page, which contains the signal trampoline, from the top of user memory to one page lower on machines with the Ryzen (AMD Family 17h) CPU. This pushes ps_strings and the stack down by one page as well.

    On Ryzen there is some sort of interaction between code running at the top of user memory address space and interrupts that can cause FreeBSD to either hang or silently reset. This sounds similar to the problem found with DragonFly BSD that was fixed with this commit:
    https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b48dd28447fc8ef62fbc963accd301557fd9ac20
    but our signal trampoline location was already lower than the address that DragonFly moved their signal trampoline to.

    It also does not appear to be related to SMT as described here:
    https://www.phoronix.com/forums/forum/hardware/processors-memory/955368-some-ryzen-linux-users-are-facing-issues-with-heavy-compilation-loads?p=955498#post955498

    "Hi, Matt Dillon here. Yes, I did find what I believe to be a hardware issue with Ryzen related to concurrent operations. In a nutshell, for any given hyperthread pair, if one hyperthread is in a cpu-bound loop of any kind (can be in user mode), and the other hyperthread is returning from an interrupt via IRETQ, the hyperthread issuing the IRETQ can stall indefinitely until the other hyperthread with the cpu-bound loop pauses (aka HLT until next interrupt). After this situation occurs, the system appears to destabilize. The situation does not occur if the cpu-bound loop is on a different core than the core doing the IRETQ. The %rip the IRETQ returns to (e.g. userland %rip address) matters a *LOT*. The problem occurs more often with high %rip addresses such as near the top of the user stack, which is where DragonFly's signal trampoline traditionally resides. So a user program taking a signal on one thread while another thread is cpu-bound can cause this behavior. Changing the location of the signal trampoline makes it more difficult to reproduce the problem. I have not been able to completely mitigate it. When a cpu-thread stalls in this manner it appears to stall INSIDE the microcode for IRETQ. It doesn't make it to the return pc, and the cpu thread cannot take any IPIs or other hardware interrupts while in this state."

    since the system instability has been observed on FreeBSD with SMT disabled. Interrupts do appear to play a factor since running a signal-intensive process on the first CPU core, which handles most of the interrupts on my machine, is far more likely to trigger the problem than running such a process on any other core.

    Also lower sv_maxuser to prevent a malicious user from using mmap() to load and execute code in the top page of user memory that was made available when the shared page was moved down.

    Make the same changes to the 64-bit Linux emulator.

    PR: 219399
    Reported by: nbe@renzel.net
    Reviewed by: kib
    Reviewed by: dchagin (previous version)
    Tested by: nbe@renzel.net (earlier version)
    Differential Revision: https://reviews.freebsd.org/D11780
* MFC r322175: | kib | 2017-08-14 | 1 | -15/+35
    Avoid DI recursion when reclaim_pv_chunk() is called from pmap_advise() or pmap_remove().
* MFC r322171: | kib | 2017-08-14 | 1 | -0/+29
    Explain why delayed invalidation is not required in pmap_protect() and pmap_remove_pages().
* MFC 322323 by jkim | sephe | 2017-08-14 | 1 | -1/+2
    Split identify_cpu() into two functions for amd64 as we do for i386. This reduces diff between amd64 and i386. Also, it fixes a regression introduced in r322076, i.e., identify_hypervisor() failed to identify some hypervisors. This function assumes cpu_feature2 is already initialized.

    Reported by: dexuan
    Tested by: dexuan
* MFC r321924: | ed | 2017-08-11 | 1 | -1/+2
    Keep top page on CloudABI to work around AMD Ryzen stability issues.

    Similar to r321899, reduce sv_maxuser by one page inside of CloudABI. This ensures that the stack, the vDSO and any allocations cannot touch the top page of user virtual memory.

    Considering that CloudABI userspace is completely oblivious to virtual memory layout, don't bother making this conditional based on the CPU of the running system.
* MFC r321919: | kib | 2017-08-09 | 1 | -2/+2
    Do not call trapsignal() after handling usermode fault or interrupt, when a signal is not intended to be sent.
* MFC r321847: | markj | 2017-08-08 | 1 | -11/+3
    Batch updates to v_wire_count when freeing page table pages on x86.
* MFC: r322076 | jkim | 2017-08-07 | 1 | -0/+2
    Detect hypervisor early so that we set lower hz on it.
* MFC r321730: | kib | 2017-08-06 | 1 | -2/+0
    Remove unused symbols.
* MFC r302843: | mav | 2017-07-24 | 2 | -3/+5
    Increase number of I/O APIC pins from 24 to 32 to give PCI up to 16 IRQs.

    Move HPET to the top of the supported 0-31 range.
* MFC r320546 | alc | 2017-07-22 | 1 | -1/+1
    When "force" is specified to pmap_invalidate_cache_range(), the given start address is not required to be page aligned. However, the loop within pmap_invalidate_cache_range() that performs the actual cache line invalidations requires that the starting address be truncated to a multiple of the cache line size. This change corrects an error in that truncation.
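The truncation requirement described in r320546 can be sketched as follows. This is an illustration only, not the pmap code and not the actual fix: the function names are hypothetical, and the cache line size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of its cache line. */
uintptr_t
trunc_to_cache_line(uintptr_t addr, uintptr_t line_size)
{
	/* The mask trick below is only valid for power-of-two line sizes. */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	return (addr & ~(line_size - 1));
}

/*
 * Count the cache lines a flush loop visits for the range [sva, eva).
 * Starting from the truncated address guarantees that every line
 * overlapping the range is visited, even when sva is unaligned.
 */
size_t
cache_lines_visited(uintptr_t sva, uintptr_t eva, uintptr_t line_size)
{
	size_t n = 0;
	uintptr_t va;

	for (va = trunc_to_cache_line(sva, line_size); va < eva;
	    va += line_size)
		n++;	/* a real loop would flush the line at va here */
	return (n);
}
```

For example, with 64-byte lines the range 0x103f to 0x1041 overlaps two lines (0x1000 and 0x1040). A loop that stepped by 64 from the unaligned 0x103f would visit only one address and skip the second line.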
* MFC r319873: | kib | 2017-07-21 | 6 | -18/+33
    Move struct syscall_args syscall arguments parameters container into struct thread.
* MFC r319871: | kib | 2017-07-21 | 1 | -7/+7
    Make struct syscall_args visible to userspace compilation environment from machine/proc.h, consistently on all architectures.
* MFC r320936,r320937,r320938: | kib | 2017-07-20 | 1 | -2/+2
    Fix size argument to vm_pager_allocate().
* MFC r320595: | dchagin | 2017-07-15 | 2 | -0/+26
    Add support for musl consumers to the Linuxulator.

    PR: 213809
    Submitted by: Yonas Yanfa (for amd64)
    Reported by: Yonas Yanfa
    Relnotes: yes
* MFC r319057: | dchagin | 2017-07-15 | 1 | -22/+0
    In r246085 some MI bits were moved out into headers in compat/linux, but I missed that when I committed the x86_64 Linuxulator. So remove the duplicates.
* MFC r320308: | kib | 2017-07-01 | 1 | -6/+14
    Translate between abridged and full x87 tags for compat32 ptrace(PT_GETFPREGS).
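As background for r320308, the two x87 tag formats differ as follows: the full tag word (FSAVE layout) holds 2 bits per register (00 valid, 01 zero, 10 special, 11 empty), while the abridged tag (FXSAVE layout) holds 1 bit per register (1 non-empty, 0 empty). The sketch below is not the kernel routine and its names are hypothetical. The abridged-to-full direction is deliberately simplified: it marks every non-empty register as valid, whereas an exact conversion would have to inspect register contents to tell valid, zero and special apart.

```c
#include <assert.h>
#include <stdint.h>

#define X87_TAG_VALID	0	/* 00: valid */
#define X87_TAG_EMPTY	3	/* 11: empty */

/* Extract the 2-bit tag of one register from the full tag word. */
unsigned
x87_full_tag_of(uint16_t full, unsigned reg)
{
	assert(reg < 8);
	return ((full >> (2 * reg)) & 3);
}

/* Full (2 bits/register) to abridged (1 bit/register); lossless in this direction. */
uint8_t
x87_full_to_abridged(uint16_t full)
{
	uint8_t abridged = 0;
	unsigned reg;

	for (reg = 0; reg < 8; reg++)
		if (x87_full_tag_of(full, reg) != X87_TAG_EMPTY)
			abridged |= (uint8_t)(1u << reg);
	return (abridged);
}

/* Simplified abridged to full: non-empty registers are all tagged "valid". */
uint16_t
x87_abridged_to_full_simple(uint8_t abridged)
{
	uint16_t full = 0;
	unsigned reg;

	for (reg = 0; reg < 8; reg++) {
		unsigned tag = (abridged & (1u << reg)) != 0 ?
		    X87_TAG_VALID : X87_TAG_EMPTY;

		full |= (uint16_t)(tag << (2 * reg));
	}
	return (full);
}
```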
* MFC r314310 | alc | 2017-06-28 | 2 | -31/+40
    Refine the fix from r312954. Specifically, add a new PDE-only flag, PG_PROMOTED, that indicates whether lingering 4KB page mappings might need to be flushed on a PDE change that restricts or destroys a 2MB page mapping. This flag allows the pmap to avoid range invalidations that are both unnecessary and costly.

    Approved by: re (kib)
* MFC r318398: | trasz | 2017-06-06 | 1 | -1/+1
    Bump default MAXTSIZ (kern.maxtsiz) from 128MB to 32GB. The old limit prevents one from running e.g. clang built with debug; the new one is arbitrary (equal to MAXDSIZ) and... well, should be quite future-proof.

    Same fix might be applicable to other 64 bit architectures; I'll ask their respective maintainers to make sure it won't break anything.

    Approved by: re (kib)
* MFC r318318: | kib | 2017-05-29 | 2 | -4/+7
    Ensure that resume path on amd64 only accesses page tables for normal operation after processor is configured to allow all required features.
* MFC 310177: Enable EARLY_AP_STARTUP on amd64 and i386 kernels by default. | jhb | 2017-05-24 | 2 | -0/+2
    PR: 199321, 203682
    Discussed with: re (kib)
    Relnotes: yes
* MFC r308474, r308691, r309203, r309365, r309703, r309898, r310720, | markj | 2017-05-23 | 1 | -30/+12
    r308489, r308706: Add PQ_LAUNDRY and remove PG_CACHED pages.
* MFC efivar(8) (by imp): | kib | 2017-05-20 | 1 | -4/+3
    List of revisions merged:
    r307070 r307071 r307072 r307074 r307189 r307224 r307339 r307390 r307391
    r309776 r314231 r314232 r314615 r314616 r314617 r314618 r314619 r314620
    r314621 r314623 r314890 r314925 r314926 r314927 r314928 r315770 r315771

    Discussed with: gjb (re), imp
    Sponsored by: The FreeBSD Foundation
* MFC r318354 (by cem) | vangyzen | 2017-05-19 | 1 | -1/+1
    Correct page frame mask constant used in pmap_change_attr_locked

    This was introduced in r290156. It's present in 11.0, but not any 10.x release unless someone decided to MFC it.

    It affects ordinary pages right above the DMAP limit, which is effectively system memory rounded up to a 1 GB (3rd level superpage) boundary (or up to a minimum of 4 GB, on small systems).

    Sponsored by: Dell EMC
* MFC r313497: ixl(4): Update to 1.7.12-k. | erj | 2017-05-16 | 1 | -0/+1
    Refresh upstream driver before impending conversion to iflib.

    Major new features:
    - Support for Fortville-based 25G adapters
    - Support for I2C reads/writes (To prevent getting or sending corrupt data, you should set dev.ixl.0.debug.disable_fw_link_management=1 when using I2C [this will disable link!], then set it to 0 when done. The driver implements the SIOCGI2C ioctl, so ifconfig -v works for reading I2C data, but there are read_i2c and write_i2c sysctls under the .debug sysctl tree [the latter being useful for upper page support in QSFP+]).
    - Addition of an iWARP client interface (so the future iWARP driver for X722 devices can communicate with the base driver).
    - Add "options IXL_IW" to kernel config to enable this option.

    Sponsored by: Intel Corporation
* MFC 317786 | sephe | 2017-05-12 | 1 | -6/+7
    pcicfg: Fix direct calls of pci_cfg{read,write} on systems w/o PCI host bridge.

    Reported by: dexuan@
    Reviewed by: jhb@
    Sponsored by: Microsoft
    Differential Revision: https://reviews.freebsd.org/D10564
* MFC 313564: | jhb | 2017-05-10 | 8 | -8/+0
    Drop the "created from" line from files generated by makesyscalls.sh.

    This information is less useful when the generated files are included in source control along with the source. If needed it can be reconstructed from the $FreeBSD$ tag in the generated file. Removing this information from the generated output permits committing the generated files along with the change to the system call master list without having inconsistent metadata in the generated files.

    Regenerate the affected files along with the MFC.
* MFC r315957: | dchagin | 2017-04-29 | 2 | -2/+0
    Implement Linux mincore() system call. This is necessary for the upcoming drm-next.
* MFC r315505: | dchagin | 2017-04-23 | 2 | -2/+0
    Implement getrandom() syscall. Note: the GRND_RANDOM option is not supported for now.
* MFC r303261,r315059: | mmel | 2017-04-16 | 2 | -2/+12
    r303261: Add more UEFI/e820 memory types from latest specifications.

    r315059: Split overbloated machep.c to multiple files and do basic cleanup of these fragments.
* MFC r315501: | dchagin | 2017-04-15 | 2 | -75/+0
    To reduce code duplication move socket defines to the MI path.
* MFC r314866: | dchagin | 2017-04-15 | 4 | -307/+2
    Reduce code duplication between MD Linux code by moving SYSV IPC 64-bit related struct definitions out into the MI path.

    Invert the native ipc structs to the Linux ipc structs conversion logic. Since the 64-bit variant of the ipc structs has more precision, convert native ipc structs to the 64-bit Linux ipc structs and then truncate 64-bit values into the non-64-bit ones if needed. Unlike Linux, return EOVERFLOW if the values do not fit.

    Fix SYSV IPC for the 64-bit Linuxulator, which never sets the IPC_64 bit.
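The "convert to the 64-bit form, then truncate with an overflow check" approach described in r314866 can be sketched as below. The struct layouts and function names are hypothetical stand-ins, not the Linuxulator's definitions; only the pattern of returning EOVERFLOW when a wide value does not fit is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wide ("64-bit variant") and narrow permission structs. */
struct ipc64_perm_sketch {
	uint32_t	uid;
	uint32_t	gid;
	uint64_t	seq;
};

struct ipc_perm_sketch {
	uint16_t	uid;
	uint16_t	gid;
	uint16_t	seq;
};

/* Store wide into *narrow, or fail with EOVERFLOW if it does not fit. */
int
narrow_u64_to_u16(uint64_t wide, uint16_t *narrow)
{
	assert(narrow != NULL);
	if (wide > UINT16_MAX)
		return (EOVERFLOW);
	*narrow = (uint16_t)wide;
	return (0);
}

/* Truncate the wide struct field by field, stopping at the first overflow. */
int
ipc64_perm_to_ipc_perm(const struct ipc64_perm_sketch *in,
    struct ipc_perm_sketch *out)
{
	int error;

	if ((error = narrow_u64_to_u16(in->uid, &out->uid)) != 0)
		return (error);
	if ((error = narrow_u64_to_u16(in->gid, &out->gid)) != 0)
		return (error);
	return (narrow_u64_to_u16(in->seq, &out->seq));
}
```

Converting to the wide form first means the narrowing logic lives in one place, and a value that cannot be represented is reported to the caller instead of being silently truncated.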
* MFC 316644: | avatar | 2017-04-15 | 2 | -2/+4
    Trying to be more compatible with Linux if.h definitions:
    - renaming l_ifreq::ifru_metric to l_ifreq::ifru_ivalue;
    - adding a definition for ifr_ifindex which points to l_ifreq::ifru_ivalue.

    A quick search indicates that Linux already got the above changes since 2.1.14.

    Reviewed by: kib, marcel, dchagin
* MFC 314783: | mmokhi | 2017-04-11 | 10 | -290/+34
    Regenerated Linuxulator syscall tables for r314782

    Approved by: trasz
* MFC r314782: | mmokhi | 2017-04-11 | 4 | -33/+51
    Add UNIMPLEMENTED() placeholder macro for the syscalls that are not implemented in Linux kernel itself.

    Cleanup DUMMY() macros.

    Approved by: trasz
* Improvements for the brand detection and prioritization. | kib | 2017-04-06 | 2 | -1/+2
    MFC r315701 (by ed): Set the interpreter path to /nonexistent.

    MFC r315749: Adjust r314851 to not require every brand to specify interpreter path.

    MFC r315753: Add a flag BI_BRAND_ONLY_STATIC to specify that the brand only matches static binaries.

    MFC r315754: Update r315753 with the proper flag name.

    MFC r316211: A followup to r315749, two more places where brand->interp_path was accessed unconditionally.
* Bring kernel space CloudABI code in sync with HEAD. | ed | 2017-04-06 | 2 | -2/+2
    MFC r312353, r312354 and r312355: Sync in the latest CloudABI generated source files.

    Languages like C++17 and Go provide direct support for slice types: pointer/length pairs. The CloudABI generator now has more complete support for this, meaning that for the C binding, pointer/length pairs now use an automatic naming scheme of ${name} and ${name}_len.

    Apart from this change and some reformatting, the ABI definitions are identical. Binary compatibility is preserved entirely.

    MFC r315700: Make file descriptor passing work for CloudABI's sendmsg().

    Reduce the potential amount of code duplication between cloudabi32 and cloudabi64 by creating a cloudabi_sock_recv() utility function. The cloudabi32 and cloudabi64 modules will then only contain code to convert the iovecs to the native pointer size.

    In cloudabi_sock_recv(), we can now construct an SCM_RIGHTS cmsghdr in an mbuf and pass that on to kern_sendit().

    MFC r315736: Make file descriptor passing for CloudABI's recvmsg() work.

    Similar to the change for sendmsg(), create a pointer size independent implementation of recvmsg() and let cloudabi32 and cloudabi64 call into it. In case userspace requests one or more file descriptors, call kern_recvit() in such a way that we get the control message headers in an mbuf. Iterate over all of the headers and copy the file descriptors to userspace.
* MFC r315861: | ed | 2017-04-06 | 2 | -2/+0
    Stop providing the compat_3_brand.

    As of r315860, the ELF image activator works fine for CloudABI without it.