summaryrefslogtreecommitdiffstats
path: root/sys
Commit message (Collapse)AuthorAgeFilesLines
* Add missing function trace for debug prints.njl2007-01-231-0/+2
|
* When exiting vfs_export(), delete the "export" option fromrodrigc2007-01-231-11/+31
| | | | | | | | | | | | | | | | | | the mount options list with vfs_deleteopt(). At this point, the export information is saved in mp->mnt_export, so we can delete the "export" mount option from mp->mnt_optnew and mp->mnt_opt. This fixes read-write/read-only update mounts (mount -u -o rw, mount -u -o ro) of NFS exported directories. For some reason, I could only reproduce the problem with a configuration supplied by Andre: - "options QUOTA" enabled in kernel config - "/ -maproot=root 10.0.1.105" in /etc/exports Reported by: kris, Andre Guibert de Bruet <andy siliconlandmark com>, Andrzej Tobola <ato iem pw edu pl> Tested by: Andre Guibert de Bruet
* Remove a PCI ID entry that conflicts with the AMR driver.scottl2007-01-231-1/+0
|
* It seems that enabling Tx and Rx before setting descriptor DMAyongari2007-01-231-15/+17
| | | | | | | | | | | | | addresses shall access invalid descriptor DMA addresses on PCIe hardwares and then panicked the system. To fix it set descriptor DMA addresses before enabling Tx and Rx such that hardware can see valid descriptor DMA addresses. Also set RL_EARLY_TX_THRESH before starting Tx and Rx. Reported by: steve.tell AT crashmail DOT de Tested by: steve.tell AT crashmail DOT de Obtained from: NetBSD MFC after: 1 week
* Clean up some of the various platform and release specific dma tagmjacob2007-01-233-47/+36
| | | | | | | stuff so it is centralized in isp_freebsd.h. Take out PCI posting flushed in qla2100/2200 register reads except for 2100s.
* Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.jhb2007-01-2219-27/+281
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indicies) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl
* Unbreak writes of 0 bytes. Zero byte writes happen when only ancillaryandre2007-01-222-2/+15
| | | | | | | | | | | control data but no payload data is passed. Change m_uiotombuf() to return at least one empty mbuf if the requested length was zero. Add comment to sosend_dgram and sosend_generic(). Diagnoses by: jhb Regression test by: rwatson Pointy hat to. andre
* Below is slightly edited description of the LOR by Tor Egge:kib2007-01-222-5/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -------------------------- [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode, that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode. A simplified scenario: root fs var fs / A / (/var) D /var B /log (/var/log) E vfs lock C vfs lock F Within each file system, the lock order is clear: C->A->B and F->D->E When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order: L1: C->A->B | +->F->D->E L2: F->D->E | +->C->A->B The lookup() process for namei("/var") mixes those two lock orders: VOP_LOOKUP() obtains B while A is held vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2) vput() releases lock on B VOP_UNLOCK() releases lock on A VFS_ROOT() obtains lock on D while shared lock on F is held vfs_unbusy() releases shared lock on F vn_lock() obtains lock on A while D is held (violates L1, follows L2) dounmount() follows L1 (B is locked while F is drained). Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock: p1: holds D, want A (in lookup()) p2: holds shared lock on F, want D (in VFS_ROOT()) p3: holds B, want drain lock on F (in dounmount()) p4: holds A, want B (in VOP_LOOKUP()) You can have more than one instance of p2. The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode. - Tor Egge To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks
* Add quirk for EasyMP3 EM732X usb 2.0 flash mp3 player.imp2007-01-221-0/+8
| | | | | | | (It appears that the quirk proceedures link has disappeared and that this PR complied with it, if there's a problem, please contact me). PR: usb/96546
* Change the remainder of the drivers for DMA'ing devices enabled in themarius2007-01-2115-40/+57
| | | | | | | | sparc64 GENERIC and the sound device drivers known working on sparc64 to use bus_get_dma_tag() to obtain the parent DMA tag so we can get rid of the sparc64_root_dma_tag kludge eventually. Except for ath(4), sk(4), stge(4) and ti(4) these changes are runtime tested (unless I booted up the wrong kernels again...).
* Correct a logic bug in the previous change.marius2007-01-211-1/+1
|
* Use a printf-modifier which doesn't need a cast.netchild2007-01-211-2/+2
| | | | Submitted by: scottl
* - Disable the long-term load balancer. I believe that steal_busy worksjeff2007-01-201-1/+1
| | | | better and gives more predictable results.
* Fix tinderbox build on amd64.netchild2007-01-201-2/+2
|
* Quiet GCC4 warnings regarding the width of printf()-arguments notmarius2007-01-201-2/+3
| | | | | matching the format. While at it limit the format to unsigned int as we're only interested in the 11 least significant bits anyway.
* The multicast hash table has 8 slots in the BCE hardware, not 4 slots likescottl2007-01-201-4/+4
| | | | | | | the BGE hardware. Adapt the driver for this. Submitted by: Mike Karels MFC After: 3 days
* - We do need to IPI the idlethread on some systems. It may be stuck injeff2007-01-201-7/+1
| | | | | | | a power saving mode otherwise. - If the thread is already bound in sched_bind() unbind it before re-binding it to a new cpu. I don't like these semantics but they are expected by some code in the tree. Patch by jkoshy.
* MFp4 (113077, 113083, 113103, 113124, 113097):netchild2007-01-203-17/+83
| | | | | | | | | | | | | | | | | | | | | | | | | Dont expose em->shared to the outside world before its properly initialized. Might not affect anything but its at least a better coding style. Dont expose em via p->p_emuldata until its properly initialized. This also enables us to get rid of some locking and simplify the code because we are workin on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb
* Add macros for the individual divisor bits as some MC146818A-compatiblemarius2007-01-201-4/+7
| | | | chips also use them for different purposes.
* Remove BUS_DMA_WAITOK from bus_dma_tag_create() invocations as it'smarius2007-01-204-7/+7
| | | | no valid flag there.
* - Use bus_get_dma_tag() to obtain the parent DMA tag so dma(4) willmarius2007-01-201-6/+6
| | | | | | | | | work when we start requiring this. - Don't specify an alignment when creating our own parent DMA tag; the supported DMA engines require no alignment constraint (f.e. the LANCE child does though) and it's no inherited by the child DMA tags anyway (which probably is a bug though). - Fix whitespace nits.
* Fix build. chkdquot() should not return anything.delphij2007-01-201-1/+1
|
* Add front-ends for the 'lebuffer' variants found on some SBus cards.marius2007-01-204-2/+714
| | | | | | | | | These are shared-memory variants based on Am79C90-compatible chips that apart from the missing DMA engine are similar to the 'ledma' variant including using a (pseudo-)bus/device for the buffer that the actual LANCE device hangs off from. The performance of these is close to that of the 'ledma' one, like expected at a few times the CPU load though.
* Quota system cleanup.mpp2007-01-204-30/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Do not do quota accounting for the actual quota data files or for file system snapshot files ("system" files). This prevents a deadlock descibed in PR kern/30958 if the kernel ever has to grow the quota file. Snapshot files were already exempt from the quota checks, but this change generalized the check. 2) Fix a cast that caused extremely large uids/gids to incorrectly write the quota information to the data file at a truncated value for a uint_t32 id value. The incorrect cast caused quota files in this case to be around 4GB in size, with the correct cast they can now be 131GB in size. Also related to PR kern/30958. 3) Check for what appear to be negative UIDs/GIDs and not account for them. This prevents the quota files from becoming 131GB in size and causing quotacheck to run forever at bootup. This could also cause the kernel to try and expand the quota file, which might deadlock due to the issue in #1. kern/30958 and kern/38156 (and some much older closed PR's). 4) With the deadlock problems gone, the kernel can now expand the size of the quota database files if it needs to. 5) Pass in the i-node count change value to chkiq and chkiqchg as an int, like it used to be before the common routine was split up into 2 different routines to increase / decrease the i-node in-use count. Prevents an underflow on the i-node count. Related to PR kern/89247. 6) Prevent the block usage from growing slowly if a file system is full and the write was denied due to that fact. PR kern/89247. Some of these changes require an updated quotacheck to prevent the creation of huge (131GB) quota data files (item #3). #1/#4 probably fixes a lot of the random hangs when quotas are enabled, possibly some of the jail hangs.
* Ooops, fix the ratelimit.netchild2007-01-201-1/+3
|
* Convert a KASSERT into a runtime warning (rate limited) + failsafe fallback.netchild2007-01-201-4/+12
| | | | | | | Because of a stupid bug (also fixed with this commit) the KASSERT was triggered when runnung the linux top. Pointy hat to: netchild
* For setting the port PCnet chips must be powered down or stopped andmarius2007-01-201-7/+15
| | | | | | | | | | unlike documented may not take effect without an initialization. So don't invoke (*sc_mediachange) directly in lance_mediachange() but go through lance_init_locked(). It's suboptimal to impose this for all chips but given that besides the affected PCI bus front-end the only other front-end which supports media selection is and likely ever will be the 'ledma' front-end I see not enough reason to break the in-driver API for this (though one could argue both ways here).
* Use bus_get_dma_tag() to obtain the parent DMA tag so le(4) works onmarius2007-01-203-3/+3
| | | | platforms requiring this.
* - In tdq_transfer() always set NEEDRESCHED when necessary regardless ofjeff2007-01-201-15/+25
| | | | | | | | | | the ipi settings. If NEEDRESCHED is set and an ipi is later delivered it will clear it rather than cause extra context switches. However, if we miss setting it we can have terrible latency. - In sched_bind() correctly implement bind. Also be slightly more tolerant of code which calls bind multiple times. However, we don't change binding if another call is made with a different cpu. This does not presently work with hwpmc which I believe should be changed.
* Grumble- let a linux-ism slip in and had an llx whichmjacob2007-01-201-9/+12
| | | | then choked on a 64 bit platforms. Oops.
* MFP4: Move default setting to the end of isp_reset instead of themjacob2007-01-207-93/+138
| | | | | | | | | | front of isp_init so we can read NVRAM even if we're role ISP_NONE. Prepare for reintroduction of channels (for FC) for N-Port Virtualization. Fix a botch in handle assignment that caused us to nuke one device when a new one arrives and end up with two devices with the same identity in the virtual target mapping table.
* - In miibus_attach() remove IFM_IMASK from the dontcare_mask of themarius2007-01-201-2/+17
| | | | | | | | | | | | | | | | ifmedia_init() invocation. IFM_IMASK makes only sense here when all of the maxium of 32 PHYs on each one MII bus support disjoint sets of media, which generally isn't the case (though it would be nice if we had a way to let NIC drivers indicate that for the few card models where the PHY configuration is known/fixed and IFM_IMASK actually makes sense). - Add and use a miibus_print_child() for the bus_print_child method which additionally prints the PHY number (which actually is the PHY address) so one can figure out the media instance <-> PHY number mapping from the PHY driver attach output. This is intented to be usefull in situations where the addresses of the PHYs on the bus are known (f.e. of internal/ integrated PHYs) so one can feed the appropriate media instance number to ifconfig(8) (with the upcoming change for ifconfig(8)). This is more or less inspired by the NetBSD mii_print().
* - Don't set MIIF_NOISOLATE so ukphy(4) can be used in configurations withmarius2007-01-201-3/+1
| | | | | | | | | multiple PHYs. In case some PHYs currently driven by ukphy(4) exhibit problems when isolating due to incomplete implementations or silicon bugs we'll need to add specific drivers for these. Looking at NetBSD and OpenBSD I don't expect problems here though (quite the contrary; we still seem to set MIIF_NOISOLATE without good reason in a bunch of PHY drivers). - Fix a style(9) whitespace nit.
* - Change the PCI-X registers constants to be relative to the PCI-X PCIjhb2007-01-193-22/+88
| | | | | | | | | capability rather than hardcoded offsets for a particular card. While I'm here, expand the constants some. - Change the ahd(4) driver to use pci_find_extcap() to locate the PCI-X capability to keep up with the first change. Reviewed by: scottl, gibbs (earlier version)
* Major revamp of ULE's cpu load balancing:jeff2007-01-191-237/+290
| | | | | | | | | | | | | | | | | | | | | | | - Switch back to direct modification of remote CPU run queues. This added a lot of complexity with questionable gain. It's easy enough to reimplement if it's shown to help on huge machines. - Re-implement the old tdq_transfer() call as tdq_pickidle(). Change sched_add() so we have selectable cpu choosers and simplify the logic a bit here. - Implement tdq_pickpri() as the new default cpu chooser. This algorithm is similar to Solaris in that it tries to always run the threads with the best priorities. It is actually slightly more complex than solaris's algorithm because we also tend to favor the local cpu over other cpus which has a boost in latency but also potentially enables cache sharing between the waking thread and the woken thread. - Add a bunch of tunables that can be used to measure effects of different load balancing strategies. Most of these will go away once the algorithm is more definite. - Add a new mechanism to steal threads from busy cpus when we idle. This is enabled with kern.sched.steal_busy and kern.sched.busy_thresh. The threshold is the required length of a tdq's run queue before another cpu will be able to steal runnable threads. This prevents most queue imbalances that contribute the long latencies.
* Remove remnants from the sparc64 origin of this file and which aremarius2007-01-191-5/+0
| | | | unlikely to be ever used and misplaced on sun4v respectively.
* Convert the remainder of the low hanging fruits regarding includingmarius2007-01-1924-139/+141
| | | | | | | | headers in .S directly rather than getting to their macros through genassym.c/assym.s so there are less headers genassym.c has to be kept in sync with. While at it fix some stytle(9) bugs (indentation, prototype format, sort headers, etc) and remove trailing whitespace.
* Cope gracefully with device_get_children returning an error.imp2007-01-191-2/+5
| | | | | Obtained from: Hans Petter Selasky P4: http://perforce.freebsd.org/chv.cgi?CH=112957
* - Add a uart_rxready() and corresponding device-specific implementationsmarius2007-01-187-48/+45
| | | | | | | | | | | | | | | that can be used to check whether receive data is ready, i.e. whether the subsequent call of uart_poll() should return a char, and unlike uart_poll() doesn't actually receive data. - Remove the device-specific implementations of uart_poll() and implement uart_poll() in terms of uart_getc() and the newly added uart_rxready() in order to minimize code duplication. - In sunkbd(4) take advantage of uart_rxready() and use it to implement the polled mode part of sunkbd_check() so we don't need to buffer a potentially read char in the softc. - Fix some mis-indentation in sunkbd_read_char(). Discussed with: marcel
* A less draconian fix to the build.mjacob2007-01-181-3/+1
|
* - Probe the CS4231 in USIII machines.marius2007-01-181-6/+4
| | | | | | - Remove unused variables. [1] Reported by: Coverity Prevent (CID 700, 701) [1]
* Temporarily comment out the KASSERT that broke the kernel build.obrien2007-01-181-0/+2
|
* - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)marius2007-01-185-42/+42
| | | | | | | | | | may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which in turn reflects a JBus bus or something like that...). - In the both the sparc64 and sun4v bus_machdep.c use __FBSDID. - Spell SBus the official way in comments. - Replace hardcoded function names (all of which were actually outdated) in panic and status strings with __func__. - Fix whitespace nits.
* Revise the ng_ppp(4) node, so that code flow is more clear. All non-linkglebius2007-01-181-514/+805
| | | | | | | | hooks get their per hook rcvdata methods, and all functions are organized corresponding to protocol stack model. Submitted by: Alexander Motin <mav alkar.net> Reviewed by: archie, julian
* Remove the compat shims for the ISA old-stlye in{b,w,l}()/out{b,w,l}()marius2007-01-186-189/+0
| | | | | | | | | | | | | | | | and friends along with all hacks required to implement them. None of the drivers currently built (as part of GENERIC, LINT or modules) on sparc64 or sun4v and none of those we might want to use there in future uses them, AFAICT there actually never was a driver hooked up to the sparc64 or sun4v build that correctly used these functions (and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and the other functions working on a memory handle never actually worked on sun4v). All they ever were good for on sparc64 and sun4v was erroneously dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4) and syscons(4) in source files that supposedly were bus-neutral and hiding issues with drivers like f.e. ng_bt3c(4) that used these functions with busses other than isa(4) and therefore couldn't work on these platforms.
* Wrap the EISA-specific parts of the dpt(4) and si(4) back-ends inmarius2007-01-185-0/+23
| | | | | | | the newly added DEV_EISA. This is done so that these back-ends can be compiled on platforms not providing in{b,w,l}()/out{b,w,l}() and friends (but may wish to use them together with bus front-ends other than the EISA one).
* On sparc64 also use the fillw() this header provides for ia64 somarius2007-01-181-2/+4
| | | | | the sparc64 MD code doesn't need to provide a memsetw() along with the ISA compat cruft.
* - most all includes (#include <>) migrate to the sctp_os_bsd.h filerrs2007-01-1828-883/+193
| | | | | | | | | | | | | | | - Finally all splxx() are removed - Count error fixed in mapping array which might cause a wrong cumack generation. - Invariants around panic for case D + printf when no invariants. - one-to-one model race condition fixed by using a pre-formed connection and then completing the work so accept won't happen on a non-formed association. - Some additional paranoia checks in sctp_output. - Locks that were missing in the accept code. Approved by: gnn
* Add support for LINUX_O_DIRECT, LINUX_O_DIRECT and LINUX_O_NOFOLLOW flagskib2007-01-181-17/+46
| | | | | | | | | | to open() [1]. Improve locking for accessing session control structures [2]. Try to document (most likely harmless) races in the code [3]. Based on submission by: Intron (intron at intron ac) [1] Reviewed by: jhb [2] Discussed with: netchild, rwatson, jhb [3]
* Set topology change propagation on all ports _except_ the caller.thompsa2007-01-181-1/+1
|
OpenPOWER on IntegriCloud