path: root/sys/i386
Commit message (author, date, files changed, lines -removed/+added)
* Enable the new PCI-PCI bridge driver on amd64 and i386 by default. It can be disabled via 'nooptions NEW_PCIB'.
| (jhb, 2011-05-03, 1 file, -0/+2)
* Reimplement how PCI-PCI bridges manage their I/O windows. Previously the driver would verify that requests for child devices were confined to any existing I/O windows, but the driver relied on the firmware to initialize the windows and would never grow the windows for new requests. Now the driver actively manages the I/O windows.
| This is implemented by allocating a bus resource for each I/O window from the parent PCI bus and suballocating that resource to child devices. The suballocations are managed by creating an rman for each I/O window. The suballocated resources are mapped by passing the bus_activate_resource() call up to the parent PCI bus. Windows are grown when needed by using bus_adjust_resource() to adjust the resource allocated from the parent PCI bus. If the adjust request succeeds, the window is adjusted and the suballocation request for the child device is retried.
| When growing a window, the rman_first_free_region() and rman_last_free_region() routines are used to determine if the front or end of the existing I/O window is free. Using that information, the smallest ranges that need to be added to either the front or back of the window are computed. The driver will first try to grow the window in whichever direction requires the smallest growth, followed by the other direction if that fails.
| Subtractive bridges will first attempt to satisfy requests for child resources from I/O windows (including attempts to grow the windows). If that fails, however, the request is passed up to the parent PCI bus directly.
| The PCI-PCI bridge driver will try to use firmware-assigned ranges for child BARs first and only allocate a "fresh" range if that specific range cannot be accommodated in the I/O window. This allows systems where the firmware assigns resources during boot but later wipes the I/O windows (some ACPI BIOSen are known to do this) to "rediscover" the original I/O window ranges.
| The ACPI Host-PCI bridge driver has been adjusted to correctly honor hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge makes a wildcard request for an I/O window range.
| The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option is enabled. This is a transition aid to allow platforms that do not yet support bus_activate_resource() and bus_adjust_resource() in their Host-PCI bridge drivers (and possibly other drivers as needed) to use the old driver for now. Once all platforms support the new driver, the kernel option and old driver will be removed.
| PR: kern/143874 kern/149306
| Tested by: mav
| (jhb, 2011-05-03, 1 file, -0/+1)
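The window-growth rule described in the commit above (compute the smallest extension needed at the front and at the back, then try the cheaper direction first) can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the struct and function names are hypothetical, alignment and bridge granularity are ignored, and the free space at each end is passed in directly instead of being obtained from rman_first_free_region() and rman_last_free_region().

```c
#include <stdint.h>

/* Hypothetical, simplified view of one bridge I/O window. */
struct window {
	uint64_t front_free;	/* free bytes at the start of the window */
	uint64_t back_free;	/* free bytes at the end of the window */
};

enum grow_dir { GROW_FRONT, GROW_BACK };

struct grow_plan {
	enum grow_dir first;	/* direction to try first */
	uint64_t front_need;	/* bytes to add in front of the window */
	uint64_t back_need;	/* bytes to add behind the window */
};

/* Bytes still missing once the already-free space is reused. */
static uint64_t
shortfall(uint64_t count, uint64_t free_bytes)
{
	return (count > free_bytes ? count - free_bytes : 0);
}

/* Decide how far each end would have to grow to fit `count` bytes. */
static struct grow_plan
plan_growth(const struct window *w, uint64_t count)
{
	struct grow_plan p;

	p.front_need = shortfall(count, w->front_free);
	p.back_need = shortfall(count, w->back_free);
	/* Try the direction that needs the smaller extension first. */
	p.first = (p.front_need < p.back_need) ? GROW_FRONT : GROW_BACK;
	return (p);
}
```

A caller would attempt the `first` direction and fall back to the other one if the parent bus refuses the adjustment, mirroring the retry order in the commit message.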
* All PCI based wireless drivers seem to be explicitly removed from the PAE kernel config, do that also for those added to GENERIC lately.
| (bschmidt, 2011-05-02, 1 file, -0/+6)
* Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver, generic PCI-PCI bridge driver, x86 nexus driver, and x86 Host to PCI bridge drivers.
| (jhb, 2011-05-02, 1 file, -0/+1)
* Add the remaining wireless drivers.
| Discussed with: joel
| (bschmidt, 2011-05-01, 1 file, -0/+10)
* Add urtw(4)
| (kevlo, 2011-04-29, 1 file, -0/+1)
* Define "Hypervisor Present" bit. This bit is used by several hypervisors to identify CPUs running under emulation. Currently QEMU-KVM, Xen-HVM, VMware, and MS Hyper-V are known to set this bit.
| MFC after: 3 days
| (jkim, 2011-04-28, 2 files, -1/+2)
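As a minimal sketch of how such a bit is consumed: on x86 the hypervisor-present flag is bit 31 of the ECX value returned by CPUID leaf 1. The macro and function names below are assumptions for illustration and are not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of CPUID.1:ECX; the macro name is an assumption. */
#define CPUID2_HV	0x80000000u

/* Return true if the given CPUID leaf 1 ECX value has the bit set. */
static bool
hypervisor_present(uint32_t cpuid1_ecx)
{
	return ((cpuid1_ecx & CPUID2_HV) != 0);
}
```

With GCC or Clang on x86, the ECX value can be obtained from `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` in `<cpuid.h>` and passed to the helper.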
* Add watchdog patting during the (shutdown time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let the watchdog fire if they take too long.
| I implemented the stubs this way because I really want the wdog_kern_* KPI not to depend on SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also to avoid calling them when not necessary (avoiding involuntary watchdog activations).
| Sponsored by: Sandvine Incorporated
| Discussed with: emaste, des
| MFC after: 2 weeks
| (attilio, 2011-04-28, 1 file, -0/+8)
* This patch changes head so that the default NFS client is now the new NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs".
| Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs".
| The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config.
| Discussed on freebsd-fs@.
| (rmacklem, 2011-04-27, 1 file, -2/+2)
* - Add a shim to simplify migration to the CAM-based ATA. For each new adaX device in /dev/, create a symbolic link with the adY name, trying to mimic the old ATA numbering. The imitation is not complete, but should be enough in most cases to mount file systems without touching /etc/fstab.
| - To know what behavior to mimic, restore the ATA_STATIC_ID option in cases where it was present before.
| - Add some more details to UPDATING.
| (mav, 2011-04-26, 2 files, -0/+2)
* Fix the experimental NFS client so that it does not bogusly set the f_flags field of "struct statfs". This had the interesting effect of making the NFSv4 mounts "disappear" after r221014, since NFSMNT_NFSV4 and MNT_IGNORE became the same bit.
| Move the files used for a diskless NFS root from sys/nfsclient to sys/nfs in preparation for them to be used by both NFS clients. Also, move the declaration of the three global data structures from sys/nfsclient/nfs_vfsops.c to sys/nfs/nfs_diskless.c so that they are defined when either client uses them.
| Reviewed by: jhb
| MFC after: 2 weeks
| (rmacklem, 2011-04-25, 1 file, -1/+1)
* Switch the GENERIC kernels for all architectures to the new CAM-based ATA stack. It means that all legacy ATA drivers are disabled and replaced by the respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them accordingly (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential numbers for each type in order of detection, unless configured otherwise with tunables, see cam(4)).
| ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load the geom_raid kernel module and use the graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX.
| (mav, 2011-04-24, 2 files, -17/+15)
* Do not invoke resume event handlers if suspend was successful.
| Pointy hat to: jkim
| (jkim, 2011-04-19, 1 file, -4/+6)
* Add suspend/resume event handlers for apm(4) as well.
| (jkim, 2011-04-19, 1 file, -6/+7)
* Make pmap_invalidate_cache_range() available for consumption on amd64.
| Add pmap_invalidate_cache_pages() method on x86. It flushes the CPU cache for the set of pages, which are not necessarily mapped. Since its supposed use is to prepare the move of the pages' ownership to a device that does not snoop all CPU accesses to the main memory (read GPU in GMCH), do not rely on the CPU self-snoop feature.
| The amd64 implementation takes advantage of the direct map. On i386, extract the helper pmap_flush_page() from pmap_page_set_memattr(), and use it to make a temporary mapping of the flushed page.
| Reviewed by: alc
| Sponsored by: The FreeBSD Foundation
| MFC after: 3 weeks
| (kib, 2011-04-18, 2 files, -11/+44)
* Add a function rdtsc32() to read the lower 32 bits from the TSC and discard the upper 32 bits. Sometimes the compiler inserts unnecessary instructions to preserve the unused upper 32 bits even when the result is cast to a 32-bit value. This function avoids such compiler mistakes where every cycle counts.
| (jkim, 2011-04-14, 1 file, -0/+9)
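A minimal sketch of what a function like rdtsc32() might look like with GCC-style inline assembly follows. It is an assumption about the general shape rather than a copy of the committed code: RDTSC returns the counter in EDX:EAX, so the helper keeps only EAX and tells the compiler that EDX is clobbered, which means no instructions are spent combining or preserving the upper half. The block is guarded so it compiles to nothing on non-x86 targets.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
/* Read only the low 32 bits of the time-stamp counter. */
static inline uint32_t
rdtsc32(void)
{
	uint32_t lo;

	/* RDTSC writes EDX:EAX; keep EAX and mark EDX as clobbered. */
	__asm__ __volatile__("rdtsc" : "=a" (lo) : : "edx");
	return (lo);
}
#endif
```

Because the value wraps roughly every second or so on a multi-GHz part, a helper like this only suits short interval measurements where the difference of two readings is taken modulo 2^32.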
* Consistently use __volatile as the rest of this file.
| (jkim, 2011-04-14, 1 file, -5/+5)
* Consistently use C99 standard integers as the rest of this file.
| (jkim, 2011-04-14, 1 file, -6/+6)
* Reduce errors in effective frequency calculation.
| (jkim, 2011-04-12, 1 file, -2/+3)
* Reinstate cpu_est_clockrate() support for P-state invariant TSC if APERF and MPERF MSRs are available. It was disabled in r216443. Remove the earlier hack to subtract 0.5% from the calibrated frequency as DELAY(9) is a little bit more reliable now.
| (jkim, 2011-04-12, 1 file, -25/+24)
* Add forgotten declarations for tsc_perf_stat from the previous commit.
| (jkim, 2011-04-12, 1 file, -0/+1)
* Probe capability to find effective frequency. When the TSC is P-state invariant, APERF/MPERF ratio can be used to find effective frequency.
| (jkim, 2011-04-12, 1 file, -1/+4)
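The ratio mentioned above can be illustrated with plain arithmetic. MPERF advances at a fixed reference rate while APERF advances at the actual core clock, so over a sampling interval the effective frequency is roughly the reference frequency scaled by the ratio of the two deltas. The sketch below assumes the reference rate equals the invariant TSC frequency; the function name and parameters are hypothetical, and reading the MSRs themselves is left out.

```c
#include <stdint.h>

/*
 * Estimate the effective frequency in Hz from counter deltas taken
 * over the same interval. Assumes tsc_freq * aperf_delta fits in
 * 64 bits, which holds for short sampling intervals.
 */
static uint64_t
effective_freq(uint64_t tsc_freq, uint64_t aperf_delta, uint64_t mperf_delta)
{
	if (mperf_delta == 0)
		return (0);	/* no reference ticks elapsed */
	return (tsc_freq * aperf_delta / mperf_delta);
}
```

For example, a 3 GHz invariant TSC with APERF advancing half as fast as MPERF yields an estimate of 1.5 GHz.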
* Add definitions for CPUID instruction 6, ECX information.
| (jkim, 2011-04-12, 1 file, -0/+6)
* Add tunables that mirror the functionality of sysctls machdep.panic_on_nmi and machdep.kdb_on_nmi.
| Approved by: emaste (mentor)
| MFC after: 1 week
| (rstone, 2011-04-08, 1 file, -0/+2)
* Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. Worse, it can easily be changed by the user with the 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add a "virtual TSC" with lower frequency if the maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).
| (jkim, 2011-04-07, 4 files, -19/+29)
* Implement atomic_load_acq_64(9) and atomic_store_rel_64(9) for i386. These functions are implemented with the CMPXCHG8B instruction where it is available, i.e., all Pentium-class and later processors. Note this instruction is also used for atomic_store_rel_64() because a simple XCHG-like instruction for 64-bit memory access does not exist, unfortunately. If the processor lacks the instruction, i.e., 80486-class CPUs, two 32-bit loads/stores are performed with interrupts temporarily disabled, assuming it does not support SMP. Although this assumption may be a little naive, it is true in reality. This implementation is inspired by Linux.
| (jkim, 2011-04-06, 2 files, -0/+104)
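The idea of building a 64-bit atomic load and store out of a compare-and-swap primitive can be sketched without inline assembly. The version below is an illustration only: it uses the GCC/Clang `__sync_val_compare_and_swap` builtin (which compilers typically lower to CMPXCHG8B on 32-bit x86 when targeting a Pentium or later) in place of the hand-written assembly the commit describes, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Load: a compare-and-swap whose expected and new values are equal
 * never changes *p, but it returns the current contents atomically.
 */
static uint64_t
load64_via_cas(volatile uint64_t *p)
{
	return (__sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0));
}

/*
 * Store: retry the compare-and-swap until the expected value matches
 * what is in memory, at which point the new value is written.
 */
static void
store64_via_cas(volatile uint64_t *p, uint64_t v)
{
	uint64_t old, seen;

	old = *p;	/* initial guess; may be torn, the loop corrects it */
	for (;;) {
		seen = __sync_val_compare_and_swap(p, old, v);
		if (seen == old)
			break;
		old = seen;
	}
}
```

The store needs a loop because compare-and-swap only writes when its guess about the current value is right, which is the reason the commit notes that no simple XCHG-like 64-bit instruction is available.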
* Add accounting for most of the memory-related resources.
| Sponsored by: The FreeBSD Foundation
| Reviewed by: kib (earlier version)
| (trasz, 2011-04-05, 1 file, -1/+3)
* Use cpu_ticks() for get_cyclecount(9) rather than checking existence of TSC at run-time on i386. cpu_ticks() is set to use RDTSC early enough on i386 where it is available. Otherwise, cpu_ticks() is driven by the current timecounter hardware as binuptime(9) does. This also avoids unnecessary namespace pollution from <machine/cputypes.h>.
| (jkim, 2011-04-04, 1 file, -7/+1)
* Revert r220032: linux compat: add SO_PASSCRED option with basic handling
| I have not properly thought through the commit. After r220031 (linux compat: improve and fix sendmsg/recvmsg compatibility) the basic handling for SO_PASSCRED is not sufficient as it breaks recvmsg functionality for SCM_CREDS messages because now we would need to handle sockcred data in addition to cmsgcred. And that is not implemented yet.
| Pointyhat to: avg
| (avg, 2011-03-31, 1 file, -1/+0)
* Break out the ath PCI logic into a separate device/module.
| Introduce the AHB glue for Atheros embedded systems. Right now it's hard-coded for the AR9130 chip whose support isn't yet in this HAL; it'll be added in a subsequent commit.
| Kernel configuration files now need both 'ath' and 'ath_pci' devices; both modules need to be loaded for the ath device to work.
| (adrian, 2011-03-31, 2 files, -1/+3)
* linux compat: add SO_PASSCRED option with basic handling
| This seems to have been a part of a bigger patch by dchagin that either has not been committed or was committed only partially.
| Submitted by: dchagin, nox
| MFC after: 2 weeks
| (avg, 2011-03-26, 1 file, -0/+1)
* linux compat: add non-dummy capget and capset system calls, regenerate
| And drop dummy definitions for those system calls. This may transiently break the build.
| PR: kern/149168
| Submitted by: John Wehle <john@feith.com>
| Reviewed by: netchild
| MFC after: 2 weeks
| (avg, 2011-03-26, 6 files, -12/+38)
* linux compat: add non-dummy capget and capset system calls
| PR: kern/149168
| Submitted by: John Wehle <john@feith.com>
| Reviewed by: netchild
| MFC after: 2 weeks
| (avg, 2011-03-26, 1 file, -2/+4)
* Export the correct AT_PLATFORM value.
| Since signal trampolines are copied to the shared page, there is no need to leave room for them on the stack. This was forgotten in the previous commit.
| MFC after: 1 Week
| (dchagin, 2011-03-26, 1 file, -2/+1)
* Improve CPU identification of various IDT/Centaur/VIA, Rise and Transmeta CPUs. These CPUs need explicit MSR configuration to expose certain CPU capabilities (e.g., CMPXCHG8B) to work around compatibility issues with ancient software. Unfortunately, Rise mP6 does not set the CX8 bit in CPUID and there is no MSR to expose the feature although all mP6 processors are capable of CMPXCHG8B according to datasheets I found from the Net. Clean up and simplify VIA PadLock detection while I am in the neighborhood.
| (jkim, 2011-03-26, 2 files, -83/+121)
* Modestly increase the maximum allowed size of the kmem map on i386. Also, express this new maximum as a fraction of the kernel's address space size rather than a constant so that increasing KVA_PAGES will automatically increase this maximum. As a side-effect of this change, kern.maxvnodes will automatically increase by a proportional amount.
| While I'm here ensure that this change doesn't result in an unintended increase in maxpipekva on i386. Calculate maxpipekva based upon the size of the kernel address space and the amount of physical memory instead of the size of the kmem map. The memory backing pipes is not allocated from the kmem map. It is allocated from its own submap of the kernel map. In short, it has no real connection to the kmem map. (In fact, the commit messages for the maxpipekva auto-sizing talk about using the kernel map size, cf. r117325 and r117391, even though the implementation actually used the kmem map size.)
| Although the calculation is now done differently, the resulting value for maxpipekva should remain almost the same on i386. However, on amd64, the value will be reduced by 2/3. This is intentional. The recent change to VM_KMEM_SIZE_SCALE on amd64 for the benefit of ZFS also had the unnecessary side-effect of increasing maxpipekva. This change is effectively restoring maxpipekva on amd64 to its prior value.
| Eliminate init_param3() since it is no longer used.
| (alc, 2011-03-23, 1 file, -1/+2)
* - Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.
| (jeff, 2011-03-21, 1 file, -29/+38)
* For now remove options FLOWTABLE from the remaining GENERIC kernel configurations and make it opt-in for those who want it. LINT will still build it.
| While it may be a perfect win in some scenarios, it still troubles users (see PRs) in general cases. In addition we are still allocating resources even if disabled by sysctl and still leak arp/nd6 entries in case of interface destruction.
| Discussed with: qingli (2010-11-24, just never executed)
| Discussed with: juli (OCTEON1)
| PR: kern/148018, kern/155604, kern/144917, kern/146792
| MFC after: 2 weeks
| (bz, 2011-03-19, 1 file, -1/+0)
* Rework r219679. Always check CPU class at run-time to make it predictable. Unfortunately, it pulls in <machine/cputypes.h> but it is small enough and namespace pollution is minimal, I hope.
| Pointed out by: bde
| Pointy hat: jkim
| (jkim, 2011-03-16, 1 file, -5/+5)
* Partially revert r219672. After r198295, the kernel needs to seed randomness as soon as possible for the stack protector. However, the dummy timecounter does not have enough entropy and we don't need to sacrifice Pentium class and later.
| Pointed out by: Maxim Dounin (mdounin at mdounin dot ru)
| (jkim, 2011-03-15, 1 file, -0/+4)
* Remove tsc_present from this file, really.
| (jkim, 2011-03-15, 1 file, -1/+0)
* Deprecate tsc_present as the last of its real consumers finally disappeared.
| (jkim, 2011-03-15, 1 file, -1/+1)
* Unconditionally use binuptime(9) for get_cyclecount(9) on i386. Since this function is almost exclusively used for random harvesting, there is no need for micro-optimization. Adjust the manual page accordingly.
| (jkim, 2011-03-15, 1 file, -7/+2)
* Make get_cyclecount(9) a little bit more useful where binuptime(9) is used.
| (jkim, 2011-03-14, 1 file, -2/+2)
* - Initial release of bxe(4) to support Broadcom NetXtreme II 10GbE. (BCM57710, BCM57711, BCM57711E)
| MFC after: One month
| (davidch, 2011-03-14, 1 file, -0/+1)
* Enable shared page use for amd64/linux32 and i386/linux binaries. Move signal trampoline code from the top of the stack to the shared page.
| MFC after: 2 Weeks
| (dchagin, 2011-03-13, 2 files, -18/+21)
* add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
| Regenerate system call and systrace support files.
| PR: kern/152822
| Submitted by: Artem Belevich <fbsdlist@src.cx>
| Reviewed by: jhb (earlier version)
| MFC after: 3 weeks
| (avg, 2011-03-12, 5 files, -4/+5728)
* add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
| This commit makes the necessary changes in the syscall/sysent generation infrastructure.
| PR: kern/152822
| Submitted by: Artem Belevich <fbsdlist@src.cx>
| Reviewed by: jhb (earlier version)
| MFC after: 3 weeks
| (avg, 2011-03-12, 3 files, -8/+10)
* Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as a CPU ticker. Note tsc_present does not change by this tunable.
| (jkim, 2011-03-11, 1 file, -9/+16)
* Detect NSC/AMD Geode SC1100 properly, not just Stepping 0. Although it is unclear that "TSC stops ticking with HLT instruction" problem is present with other steppings, it is limited to Stepping 0 for now.
| (jkim, 2011-03-10, 1 file, -2/+3)