FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message	Author	Age	Files	Lines
*	Update birth entry for Warren Zevon with his birthplace, and add an	dougb	2007-01-24	1	-1/+2
	entry for his death. Both per Wikipedia.
*	- Add a horrible bit of code to detect tsc differences between processors.	jeff	2007-01-24	1	-28/+112
	This only works if there is no significant drift and all processors are running at the same frequency. Fortunately, schedgraph traces on MP machines tend to cover less than a second so drift shouldn't be an issue.
	- KTRFile::synchstamp() iterates once over the whole list to determine the lowest tsc value and adjusts all other values to match. We assume that the first tick recorded on all cpus happened at the same instant to start with.
	- KTRFile::monostamp() iterates again over the whole file and checks for a cpu agnostic monotonically increasing clock. If the time ever goes backwards the cpu responsible is adjusted further to fit. This will make the possible incorrect delta between cpus as small as the shortest time between two events. This time can be fairly large due to sched_lock essentially protecting all events.
	- KTRFile::checkstamp() now returns an adjusted timestamp.
	- StateEvent::draw() detects states that occur out of order in time and draws them as 0 pixels after printing a warning.
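The first-pass adjustment described for KTRFile::synchstamp() can be sketched roughly as below. schedgraph itself is not written in C; this is only an illustration of the idea, and the struct, the function name and the MAXCPU limit are hypothetical. It assumes events are ordered oldest-first and that every cpu field is below MAXCPU.

```c
#include <stddef.h>
#include <stdint.h>

#define MAXCPU 4	/* hypothetical limit for this sketch */

struct ktr_event {
	int      cpu;	/* CPU that logged the event */
	uint64_t tsc;	/* raw timestamp counter value */
};

/*
 * Shift every event so that the first timestamp seen on each CPU lines up
 * with the lowest first timestamp across all CPUs.
 */
void
sync_stamps(struct ktr_event *ev, size_t n)
{
	uint64_t first[MAXCPU];
	int seen[MAXCPU] = { 0 };
	uint64_t lowest = UINT64_MAX;
	size_t i;

	/* Pass 1: record the first stamp per CPU and the lowest of those. */
	for (i = 0; i < n; i++) {
		if (!seen[ev[i].cpu]) {
			seen[ev[i].cpu] = 1;
			first[ev[i].cpu] = ev[i].tsc;
			if (ev[i].tsc < lowest)
				lowest = ev[i].tsc;
		}
	}
	/* Pass 2: remove each CPU's offset from the lowest first stamp. */
	for (i = 0; i < n; i++)
		ev[i].tsc -= first[ev[i].cpu] - lowest;
}
```

As the commit message notes, this kind of correction is only meaningful when drift is negligible and all processors run at the same frequency.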
*	- With a sleep time over 2097 seconds hzticks and slptime could end up	jeff	2007-01-24	1	-5/+6
	negative. Use unsigned integers for sleep and run time so this doesn't disturb sched_interact_score(). This should fix the invalid interactive priority panics reported by several users.
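The 2097-second figure is consistent with a tick count that is scaled by a 10-bit shift and stored in a signed 32-bit integer at hz = 1000, since 2^31 / 1024 / 1000 is about 2097.15. The shift width and the hz value are assumptions made for this sketch, not something the commit message states, and the function names are hypothetical.

```c
#include <stdint.h>

#define TICK_SHIFT 10	/* assumed fixed-point shift, not taken from the commit */

/*
 * Scaled sleep time kept in a signed 32-bit value. Once the scaled value
 * exceeds INT32_MAX the conversion wraps to a negative number on the usual
 * two's-complement targets (the conversion is implementation-defined in C).
 */
int32_t
scaled_signed(uint32_t ticks)
{
	return (int32_t)(ticks << TICK_SHIFT);
}

/* The same computation kept unsigned stays positive over this range. */
uint32_t
scaled_unsigned(uint32_t ticks)
{
	return ticks << TICK_SHIFT;
}
```

With 1000 ticks per second, 2097 seconds still fits in the signed value while 2098 seconds does not, which matches the threshold quoted in the commit.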
*	Fixes the MSG_PEEK for sctp_generic_recvmsg() the msg_flags	rrs	2007-01-24	1	-2/+10
	were not being copied in properly so PEEK and any other msg_flags input operation were not being performed right.
	Approved by: gnn
*	Bump .Dd for r1.313.	ceri	2007-01-24	1	-1/+1
*	Document LD_UTRACE.	jhb	2007-01-23	1	-2/+7
	MFC after: 3 days
*	- Print clock information so we know if something is not reported correctly	jeff	2007-01-23	1	-7/+7
	from the tsc.
	- Set skipnext = 1 for yielding and preempted events so we don't show the event that adds us back to the run queue. It used to be 2 so we would skip the ksegrp run queue addition and the system run queue addition but the ksegrp run queue has gone away.
	- Don't display down to nanosecond resolution for scheduling events right now. This can sometimes cause a division by zero.
*	Document new quota knobs.	mpp	2007-01-23	1	-3/+51
*	o introduce a flags 'errata' for HW bugs onto the softc.	bruno	2007-01-23	1	-42/+97
	o remove errata_a0 and introduce the corresponding flags into 'errata'.
	o introduce a new erratum for K8: some platforms set the PENDING_BIT but are not able to clear it. Also, don't loop forever waiting for the PENDING_BIT to be cleared.
	o try to introduce a workaround for the stuck PENDING_BIT problem.
	o support half multipliers for K8.
	Tested by: Abdullah Al-Marrie
	Approved by: njl
*	Use the more specific 'EM732X' designation rather than * to disable sync	imp	2007-01-23	1	-1/+1
	cache commands, per request from njl@.
*	Cylinder group bitmaps and blocks containing inode for a snapshot	kib	2007-01-23	10	-42/+189
	file are after snaplock, while other ffs device buffers are before snaplock in global lock order. By itself, this could cause deadlock when bdwrite() tries to flush dirty buffers on snapshotted ffs. If, during the flush, COW activity for snapshot needs to allocate block and ffs_alloccg() selects the cylinder group that is being written by bdwrite(), then kernel would panic due to recursive buffer lock acquisition.

	Avoid dealing with buffers in bdwrite() that are from the other side of the snaplock divisor in the lock order than the buffer being written. Add new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in the bdwrite(). Default implementation, bufbdflush(), refactors the code from bdwrite(). For ffs device buffers, specialized implementation is used.

	Reviewed by: tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes)
	Tested by: Peter Holm
	X-MFC after: 3 weeks (if ever: it changes ABI)
*	Remove mount_nfs4 from SUBDIR list. The mount_nfs Makefile	rodrigc	2007-01-23	1	-1/+0
	links mount_nfs to mount_nfs4 now.
*	Link mount_nfs -> mount_nfs4, and mount_nfs.8 -> mount_nfs4.8.	rodrigc	2007-01-23	1	-0/+3
	Suggested by: rees
*	- Catch up to setrunqueue/choosethread/etc. api changes.	jeff	2007-01-23	1	-39/+90
	- Define our own maybe_preempt() as sched_preempt(). We want to be able to preempt idlethread in all cases.
	- Define our idlethread to require preemption to exit.
	- Get the cpu estimation tick from sched_tick() so we don't have to worry about errors from a sampling interval that differs from the time domain. This was the source of sched_priority prints/panics and inaccurate pctcpu display in top.
*	Oops, pc98 is independent of i386 for clock.c and machdep.c but not	bde	2007-01-23	3	-25/+24
	for clock.h, so changing the i386 clock.h broke it.

	MFi386 (not tested):

	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported.

	Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers.

	This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.
*	- Remove setrunqueue and replace it with direct calls to sched_add().	jeff	2007-01-23	18	-208/+166
	setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons.
	- Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each.
	- Remove the long ifdef'd out remrunqueue code.
	- Remove the now redundant ts_state. Inspect the thread state directly.
	- Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler.
	- Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched.
	Discussed with: julian
	- Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers.
	Suggested by: jhb
	Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
*	- Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision	jeff	2007-01-23	2	-18/+12
	for this behavior on the initiator side.
*	Cleaned up declaration and initialization of clock_lock. It is only	bde	2007-01-23	7	-43/+46
	used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported.

	Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers.

	This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.
*	Add missing function trace for debug prints.	njl	2007-01-23	1	-0/+2
*	Merge mount_nfs4.c and mount_nfs.c into one program.	rodrigc	2007-01-23	2	-7/+248
	If argv[0] == "mount_nfs4", then default to mounting NFSv4, otherwise if argv[0] == "mount_nfs", default to the old mount_nfs behavior.
	- Add a -4 option.
	- Add the University of Michigan copyright from mount_nfs4.c, for the code merged from mount_nfs4.c.
	Reviewed by: rees
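Choosing a default from the program name, as this commit describes, might look something like the following. This is a minimal sketch rather than the code in mount_nfs.c: the enum, the function name, and the stripping of a leading directory from argv[0] are all assumptions made for illustration.

```c
#include <string.h>

enum nfs_mode { NFS_LEGACY, NFS_V4 };	/* hypothetical names */

/* Pick the default behavior from the name the program was invoked as. */
enum nfs_mode
default_mode(const char *argv0)
{
	const char *name = strrchr(argv0, '/');

	/* Assumption: compare only the last path component. */
	name = (name != NULL) ? name + 1 : argv0;
	return (strcmp(name, "mount_nfs4") == 0) ? NFS_V4 : NFS_LEGACY;
}
```

Because the two names are links to one binary, a single installed program can serve both roles, and a -4 option can still override the default chosen here.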
*	When exiting vfs_export(), delete the "export" option from	rodrigc	2007-01-23	1	-11/+31
	the mount options list with vfs_deleteopt(). At this point, the export information is saved in mp->mnt_export, so we can delete the "export" mount option from mp->mnt_optnew and mp->mnt_opt.

	This fixes read-write/read-only update mounts (mount -u -o rw, mount -u -o ro) of NFS exported directories. For some reason, I could only reproduce the problem with a configuration supplied by Andre:
	- "options QUOTA" enabled in kernel config
	- "/ -maproot=root 10.0.1.105" in /etc/exports

	Reported by: kris, Andre Guibert de Bruet <andy siliconlandmark com>, Andrzej Tobola <ato iem pw edu pl>
	Tested by: Andre Guibert de Bruet
*	Remove a PCI ID entry that conflicts with the AMR driver.	scottl	2007-01-23	1	-1/+0
*	Use fseeko to seek in the file, instead of fseek to prevent seek	mpp	2007-01-23	1	-2/+2
	errors for extremely large uids (e.g. in the billions range).
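The reason fseeko() helps is that it takes an off_t offset, whereas fseek() takes a long, which is only 32 bits wide on some platforms. The sketch below assumes a file indexed by id with a fixed record size; the 64-byte size and both function names are hypothetical and are not taken from the quota tools.

```c
#define _POSIX_C_SOURCE 200809L	/* for fseeko() and off_t */
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

#define RECORD_SIZE 64	/* hypothetical per-id record size */

/* Compute the byte offset in off_t so large ids do not truncate. */
off_t
record_offset(uint32_t id)
{
	return (off_t)id * RECORD_SIZE;
}

/* Position the stream at the record for the given id. */
int
seek_record(FILE *fp, uint32_t id)
{
	return fseeko(fp, record_offset(id), SEEK_SET);
}
```

For an id of three billion the offset is well beyond 2^31, so it only survives if off_t is a 64-bit type on the target platform.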
*	Make sure that unknown uids/gids that now have non-zero usage and	mpp	2007-01-23	1	-7/+24
	had a previously recorded usage of zero are correctly displayed in verbose mode. Generalize the print routine a little too.
*	It seems that enabling Tx and Rx before setting descriptor DMA	yongari	2007-01-23	1	-15/+17
	addresses causes PCIe hardware to access invalid descriptor DMA addresses and then panic the system. To fix it, set descriptor DMA addresses before enabling Tx and Rx so that the hardware sees valid descriptor DMA addresses. Also set RL_EARLY_TX_THRESH before starting Tx and Rx.
	Reported by: steve.tell AT crashmail DOT de
	Tested by: steve.tell AT crashmail DOT de
	Obtained from: NetBSD
	MFC after: 1 week
*	Clean up some of the various platform and release specific dma tag	mjacob	2007-01-23	3	-47/+36
	stuff so it is centralized in isp_freebsd.h. Take out PCI posting flushed in qla2100/2200 register reads except for 2100s.
*	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.	jhb	2007-01-22	19	-27/+281
	- First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages, for example, where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages.
	- Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than the default of using the first N slots (or indices) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ.
	Tested by: scottl
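The slot remapping in the { 1, 4 } example can be modelled in plain C as below. This is a userland model of the bookkeeping only, not the kernel API: the struct, both function names and the IRQ numbers used with it are hypothetical, and the real pci_remap_msix() signature is not reproduced here.

```c
#include <stddef.h>

#define NSLOTS 8	/* the example device supports 8 message slots */

/* slot_irq[s] holds the IRQ assigned to 1-based slot s, or 0 if unused. */
struct msix_model {
	int slot_irq[NSLOTS + 1];
};

/* Default layout: the first n slots receive the n allocated IRQs. */
void
model_alloc(struct msix_model *m, const int *irqs, size_t n)
{
	size_t s;

	for (s = 0; s <= NSLOTS; s++)
		m->slot_irq[s] = 0;
	for (s = 0; s < n; s++)
		m->slot_irq[s + 1] = irqs[s];
}

/* Move the n allocated IRQs to the 1-based slots listed in indices[]. */
void
model_remap(struct msix_model *m, const int *indices, size_t n)
{
	int irqs[NSLOTS];
	size_t i;

	/* Collect the IRQs from the default slots 1..n and clear them. */
	for (i = 0; i < n; i++) {
		irqs[i] = m->slot_irq[i + 1];
		m->slot_irq[i + 1] = 0;
	}
	/* Reassign them; the slot number doubles as the SYS_RES_IRQ rid. */
	for (i = 0; i < n; i++)
		m->slot_irq[indices[i]] = irqs[i];
}
```

After allocating two IRQs and remapping with { 1, 4 }, the model has the first IRQ at slot 1 and the second at slot 4, matching the rids the commit message says the driver would then use.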
*	Unbreak writes of 0 bytes. Zero byte writes happen when only ancillary	andre	2007-01-22	2	-2/+15
	control data but no payload data is passed. Change m_uiotombuf() to return at least one empty mbuf if the requested length was zero. Add comment to sosend_dgram and sosend_generic().
	Diagnosed by: jhb
	Regression test by: rwatson
	Pointy hat to: andre
*	Document the existence of the TCP_INFO socket option.	bms	2007-01-22	1	-1/+24
	Approved by: rwatson
*	Actually fully emulate NetBSD and print the media instance number	marius	2007-01-22	1	-2/+3
	only for non-zero instances so the typical output for IFM_IEEE80211 type media doesn't overflow 80 columns.
	Requested by: sam
*	Document exactly which functions access which NSS databases.	bms	2007-01-22	2	-6/+58
	Point out that FreeBSD libc has compat stubs for GNU glibc NSS modules which access NSDB_PASSWD/NSDB_GROUP, but not NSDB_HOSTS; based on painful experience porting nss_mdns.
	Reviewed by: ru
*	Below is slightly edited description of the LOR by Tor Egge:	kib	2007-01-22	2	-5/+41
	--------------------------
	[Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode (that is, the parent directory of the covered vnode) while holding an exclusive vnode lock on the covering vnode.

	A simplified scenario:

	root fs          var fs
	/         A      /     (/var)      D
	/var      B      /log  (/var/log)  E
	vfs lock  C      vfs lock          F

	Within each file system, the lock order is clear: C->A->B and F->D->E

	When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order:

	L1: C->A->B | +->F->D->E
	L2: F->D->E | +->C->A->B

	The lookup() process for namei("/var") mixes those two lock orders:

	  VOP_LOOKUP() obtains B while A is held
	  vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2)
	  vput() releases lock on B
	  VOP_UNLOCK() releases lock on A
	  VFS_ROOT() obtains lock on D while shared lock on F is held
	  vfs_unbusy() releases shared lock on F
	  vn_lock() obtains lock on A while D is held (violates L1, follows L2)

	dounmount() follows L1 (B is locked while F is drained).

	Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock:

	  p1: holds D, want A (in lookup())
	  p2: holds shared lock on F, want D (in VFS_ROOT())
	  p3: holds B, want drain lock on F (in dounmount())
	  p4: holds A, want B (in VOP_LOOKUP())

	You can have more than one instance of p2.

	The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.
	- Tor Egge

	To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp.

	Idea by: ups
	Reviewed by: tegge, ups, jeff, rwatson (mac interaction)
	Tested by: Peter Holm
	MFC after: 2 weeks
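The four-process deadlock above is a cycle in the wait-for graph: p1 waits on p4 (for A), p4 on p3 (for B), p3 on p2 (for F), and p2 on p1 (for D). A small check for such a cycle is sketched below. It is an illustration only, not kernel code; the function name and the array encoding are hypothetical, and it assumes each process waits on at most one other.

```c
#include <stddef.h>

/*
 * waits_for[i] is the index of the process holding the lock that process i
 * wants, or -1 if process i is not blocked.
 * Returns 1 if following the chain from 'start' loops back to 'start'.
 */
int
in_deadlock(const int *waits_for, size_t n, int start)
{
	int cur = waits_for[start];
	size_t steps;

	/* A chain longer than n hops cannot return to 'start'. */
	for (steps = 0; steps < n && cur != -1; steps++) {
		if (cur == start)
			return 1;
		cur = waits_for[cur];
	}
	return 0;
}
```

Encoding p1..p4 as indices 0..3 gives the array { 3, 0, 1, 2 }, which forms a cycle. Without the unmounting process p3 blocking on F, the entry for p3 becomes -1 and the chain ends, which mirrors the observation that the deadlock needs unmount activity.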
*	Add quirk for EasyMP3 EM732X usb 2.0 flash mp3 player.	imp	2007-01-22	1	-0/+8
	(It appears that the quirk procedures link has disappeared and that this PR complied with it. If there's a problem, please contact me.)
	PR: usb/96546
*	Change the remainder of the drivers for DMA'ing devices enabled in the	marius	2007-01-21	15	-40/+57
	sparc64 GENERIC and the sound device drivers known working on sparc64 to use bus_get_dma_tag() to obtain the parent DMA tag so we can get rid of the sparc64_root_dma_tag kludge eventually. Except for ath(4), sk(4), stge(4) and ti(4) these changes are runtime tested (unless I booted up the wrong kernels again...).
*	Correct a logic bug in the previous change.	marius	2007-01-21	1	-1/+1
*	Use a printf-modifier which doesn't need a cast.	netchild	2007-01-21	1	-2/+2
	Submitted by: scottl
*	Decrease to WARNS=3.	rodrigc	2007-01-20	1	-1/+1
*	Clean up compilation warnings. Set WARNS=6 in Makefile.	rodrigc	2007-01-20	10	-89/+35
	PR: 71659
	Submitted by: Dan Lukes <dan obluda cz>
*	- Disable the long-term load balancer. I believe that steal_busy works	jeff	2007-01-20	1	-1/+1
	better and gives more predictable results.
*	Fix tinderbox build on amd64.	netchild	2007-01-20	1	-2/+2
*	Quiet GCC4 warnings regarding the width of printf()-arguments not	marius	2007-01-20	1	-2/+3
	matching the format. While at it limit the format to unsigned int as we're only interested in the 11 least significant bits anyway.
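One way to make a printf() argument match its format while keeping only the low 11 bits is sketched below. The commit does not show the affected code, so the function name and the surrounding context are hypothetical; only the masking and the cast to unsigned int follow the description.

```c
#include <stdio.h>

/* Format the 11 least significant bits of value using a plain %u. */
int
format_low11(char *buf, size_t len, unsigned long value)
{
	/* 0x7ff keeps bits 0..10; the cast makes the argument match %u. */
	return snprintf(buf, len, "%u", (unsigned int)(value & 0x7ff));
}
```

Since the masked value always fits in an unsigned int, the cast loses nothing and the format no longer depends on the width of the original type.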
*	The multicast hash table has 8 slots in the BCE hardware, not 4 slots like	scottl	2007-01-20	1	-4/+4
	the BGE hardware. Adapt the driver for this.
	Submitted by: Mike Karels
	MFC After: 3 days
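The difference between 4 and 8 hash slots comes down to how many bits of the address hash are used to pick a register and a bit within it. The sketch below assumes 32-bit registers and a hash taken from the low bits of a CRC; the function name and the exact bit selection are assumptions for illustration and are not taken from either driver.

```c
#include <stdint.h>

/*
 * Set the filter bit for one address hash in a table of nregs 32-bit
 * registers. nregs must be a power of two (4 or 8 in this discussion).
 */
void
hash_set(uint32_t *regs, unsigned nregs, uint32_t crc)
{
	uint32_t h = crc & (nregs * 32 - 1);	/* 7 bits for 4 regs, 8 for 8 */

	/* Upper bits of h select the register, the low 5 bits select the bit. */
	regs[h >> 5] |= (uint32_t)1 << (h & 0x1f);
}
```

With 8 registers the same hash value can land in slots 4 through 7, which a 4-slot layout would never program.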
*	- We do need to IPI the idlethread on some systems. It may be stuck in	jeff	2007-01-20	1	-7/+1
	a power saving mode otherwise.
	- If the thread is already bound in sched_bind() unbind it before re-binding it to a new cpu. I don't like these semantics but they are expected by some code in the tree. Patch by jkoshy.
*	MFp4 (113077, 113083, 113103, 113124, 113097):	netchild	2007-01-20	3	-17/+83
	Don't expose em->shared to the outside world before it's properly initialized. Might not affect anything but it's at least a better coding style.

	Don't expose em via p->p_emuldata until it's properly initialized. This also enables us to get rid of some locking and simplify the code because we are working on a local copy.

	In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2].

	Reported by: Scot Hetzel [1]
	Suggested by: jhb [2]
	Reviewed by: jhb (at least some important parts)
	Submitted by: rdivacky
	Tested by: Scot Hetzel (on amd64)

	Change 2 comments (in the new code) to comply to style(9).

	Suggested by: jhb
*	Add macros for the individual divisor bits as some MC146818A-compatible	marius	2007-01-20	1	-4/+7
	chips also use them for different purposes.
*	Remove BUS_DMA_WAITOK from bus_dma_tag_create() invocations as it's	marius	2007-01-20	4	-7/+7
	not a valid flag there.
*	- Use bus_get_dma_tag() to obtain the parent DMA tag so dma(4) will	marius	2007-01-20	1	-6/+6
	work when we start requiring this.
	- Don't specify an alignment when creating our own parent DMA tag; the supported DMA engines require no alignment constraint (e.g. the LANCE child does though) and it's not inherited by the child DMA tags anyway (which probably is a bug though).
	- Fix whitespace nits.
*	Fix build. chkdquot() should not return anything.	delphij	2007-01-20	1	-1/+1
*	- For the sake of completeness mention back-end support for the ILACC	marius	2007-01-20	1	-29/+57
	and add a list of known-working PCI devices.
	- For consistency throughout this man page also talk about C-Bus and ISA adapters rather than cards.
	- Add missing .Tn.
	- Mention ifconfig(8) along with listing selectable media types.
	- Add/un-comment hardware notes for the newly supported 'lebuffer' variants (the transition from P/N 501-1860 to 501-1869 isn't a typo).
*	Add front-ends for the 'lebuffer' variants found on some SBus cards.	marius	2007-01-20	4	-2/+714
	These are shared-memory variants based on Am79C90-compatible chips that, apart from the missing DMA engine, are similar to the 'ledma' variant, including using a (pseudo-)bus/device for the buffer that the actual LANCE device hangs off. The performance of these is close to that of the 'ledma' one, though, as expected, at a few times the CPU load.