Commit message | Author | Age | Files | Lines
* Use the expression (x+0.0)-(y+0.0) instead of x+y when mixing NaN arg(s). | bde | 2008-02-14 | 2 | -10/+8
  This uses 2 tricks to improve consistency so that more serious problems aren't hidden in simple regression tests by noise for the NaNs:
  - for a signaling NaN, adding 0.0 generates the invalid exception and converts to a quiet NaN, and doesn't have too many effects for other types of args (it converts -0 to +0 in some rounding modes, but that hopefully doesn't change the result after adding the NaN arg). This avoids some inconsistencies on i386 and ia64. On these arches, the result of an operation on 2 NaNs is apparently the largest or the smallest of the NaNs as bits (consistently largest or smallest for each arch, but the opposite). I forget which way the comparison goes and if the sign bit affects it. The quiet bit is handled poorly by not always setting it before the comparison or ignoring it. Thus if one of the args was originally a signaling NaN and the other was originally a quiet NaN, then the result depends too much on whether the signaling NaN has been quieted at this point, which in turn depends on optimizations and promotions. E.g., passing float signaling NaNs to double functions must quiet them on conversion; on i387, loading a signaling NaN of type float or double (but not long double) into a register involves a conversion, so it quiets signaling NaNs, so if the addition has 2 register operands then it only sees quiet NaNs, but if the addition has a memory operand then it sees a signaling NaN iff it is in the memory operand.
  - subtraction instead of addition is used to avoid a dubious optimization in old versions of gcc. For SSE operations, mixing of NaNs apparently always gives the target operand. This is not as good as the i387 and ia64 behaviour. It doesn't mix NaNs at all, and makes addition not quite commutative. Old versions of gcc sometimes rewrite x+y to y+x and thus give different results (in bits) for NaNs. gcc-3.3.3 rewrites x+y to y+x for one of pow() and powf() but not the other, so starting from float NaN args x and y, powf(x, y) was almost always different from pow(x, y).
  These tricks won't give consistency of 2-arg float and double functions with long double ones on amd64, since long double ones use the i387 which has different semantics from SSE.
  Convert to __FBSDID().
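The expression named in the commit above can be sketched in isolation as follows. This is a minimal illustration only, not the code from the commit; the function name `nan_mix` is hypothetical.

```c
#include <math.h>

/*
 * Combine two arguments, at least one of which is expected to be a NaN.
 * Adding 0.0 to each argument quiets a signaling NaN before the two
 * values meet, and using subtraction instead of addition keeps a
 * compiler from treating the operation as commutative and swapping
 * the operands.
 * nan_mix is a hypothetical name used for illustration only.
 */
static double
nan_mix(double x, double y)
{
	return ((x + 0.0) - (y + 0.0));
}
```

With either argument a NaN the result is a NaN; which NaN bit pattern comes back still depends on the hardware, as the commit message explains.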
* Prefer NULL over integer 0 for pointer type. | yongari | 2008-02-14 | 1 | -11/+11
* Nuke local jumbo allocator and switch to use of UMA backed page | yongari | 2008-02-14 | 2 | -295/+142
  allocator for jumbo frame.
  o Removed unneeded jlist lock which was used to manage jumbo buffers.
  o Don't reinitialize hardware if MTU was not changed.
  o Added additional check for minimal MTU size.
  o Added a new tunable hw.skc.jumbo_disable to disable jumbo frame support for the driver. The tunable could be set for systems that do not need to use jumbo frames and it would save (9K * number of Rx descriptors) bytes kernel memory.
  o Jumbo buffer allocation failure is no longer a critical error for the operation of sk(4). If sk(4) encounters the allocation failure it just disables jumbo frame support and continues to work without user intervention.
  With these changes jumbo frame performance of sk(4) was slightly increased and users should not encounter jumbo buffer allocation failure. Previously sk(4) tried to allocate physically contiguous memory, 3388KB for 256 Rx descriptors. Sometimes a contiguous memory region of that size was not available on a running system, which in turn caused loading the driver to fail.
  Tested by: Cy Schubert < Cy.Schubert () komquats dot com >
* Remove debugging code under OLD_DIAGNOSTIC; this is all >10 years old and | rwatson | 2008-02-14 | 2 | -32/+3
  hasn't been used in that time.
  MFC after: 1 month
* In Coda, flush the attribute cache for a cnode when its fid is | rwatson | 2008-02-14 | 1 | -1/+4
  changed, as its synthesized inode number may have changed and we want stat(2) to pick up the new inode number.
  MFC after: 1 month
* Add minimally invasive shims to ease MFCs of mxge back as far | gallatin | 2008-02-14 | 3 | -10/+68
  as RELENG_6
  Sponsored by: Myricom, Inc.
* Add KASSERT()'s to catch attempts to recurse on spin mutexes that aren't | jhb | 2008-02-13 | 1 | -1/+9
  marked recursable either via mtx_lock_spin() or thread_lock().
  MFC after: 1 week
* Mark the syscons video spin mutex as recursable since it is currently | jhb | 2008-02-13 | 1 | -1/+2
  recursed in a few places.
  MFC after: 1 week
* Mark sleepqueue chain spin mutexes as recursable since the sleepq code | jhb | 2008-02-13 | 1 | -1/+1
  now recurses on them in sleepq_broadcast() and sleepq_signal() when resuming threads that are fully asleep.
  MFC after: 1 week
* Add a couple of assertions and KTR logging to thread_lock_flags() to | jhb | 2008-02-13 | 1 | -1/+7
  match mtx_lock_spin_flags().
  MFC after: 1 week
* Make the type of the firmware arrays match those | gallatin | 2008-02-13 | 2 | -6/+6
  in the other eth*_z8e.h files.
* Update manpage with lockmgr_assert() description. | attilio | 2008-02-13 | 2 | -4/+73
* Add an automatic kernel module version dependency to prevent loading | jhb | 2008-02-13 | 2 | -0/+15
  modules using invalid ABI versions (e.g. a 7.x module with an 8.x kernel) for a given kernel:
  - Add a 'kernel' module version whose value is __FreeBSD_version.
  - Add a version dependency on 'kernel' in every module that has an acceptable version range of __FreeBSD_version up to the end of the branch __FreeBSD_version is part of. E.g. a module compiled on 701000 would work on kernels with versions between 701000 and 799999 inclusive.
  Discussed on: arch@
  MFC after: 1 week
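The version range described above can be worked through with a small sketch. It assumes that a branch spans a block of 100000 version numbers, which is what the 701000 to 799999 example implies; `branch_end` and `module_fits_kernel` are hypothetical names, not the macros added by the commit.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: last version number of the branch that `version`
 * belongs to, assuming each branch covers 100000 consecutive numbers.
 */
static int
branch_end(int version)
{
	return (version / 100000 * 100000 + 99999);
}

/*
 * A module built against module_version is acceptable on any kernel from
 * that version up to the end of the same branch, inclusive.
 */
static bool
module_fits_kernel(int module_version, int kernel_version)
{
	return (kernel_version >= module_version &&
	    kernel_version <= branch_end(module_version));
}
```

For a module compiled on 701000 this accepts kernels 701000 through 799999 and rejects 800000, matching the example in the commit message.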
* Bump __FreeBSD_version after the introduction of: | attilio | 2008-02-13 | 1 | -1/+1
  - lockmgr_assert()
  - BUF_ASSERT_*() family functions
  which enriched the KPI.
* Improve conformance to the HTTP specification by using case-insensitive | cperciva | 2008-02-13 | 1 | -6/+6
  comparisons for header keywords. Apparently some proxies use creative capitalization.
  Weird proxy found by: brooks
  MFC after: 3 days
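A case-insensitive header keyword match of the kind described above might look like the following sketch. It is not the code from the commit; `header_is` is a hypothetical helper built on the POSIX `strncasecmp()` function.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/*
 * Return true if `line` starts with the header keyword `keyword`,
 * compared without regard to case, followed immediately by a colon.
 * header_is is a hypothetical name used for illustration only.
 */
static bool
header_is(const char *line, const char *keyword)
{
	size_t n = strlen(keyword);

	/* The colon is only examined once all n keyword bytes matched. */
	return (strncasecmp(line, keyword, n) == 0 && line[n] == ':');
}
```

A proxy that sends `content-LENGTH: 42` is then recognised as sending `Content-Length`, while a longer keyword such as `Content-Lengthy` is not.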
* - Add real assertions to lockmgr locking primitives. | attilio | 2008-02-13 | 8 | -57/+186
    A couple of notes for this:
    * WITNESS support, when enabled, is only used for shared locks in order to avoid problems with the "disowned" locks
    * KA_HELD and KA_UNHELD only exist in the lockmgr namespace in order to assert for a generic thread (not curthread) owning or not the lock. Really, this kind of check is bogus but it seems very widespread in the consumers code. So, for the moment, we cater to this untrusted behaviour until the consumers are fixed and the options can be removed (hopefully during 8.0-CURRENT lifecycle)
    * Implementing KA_HELD and KA_UNHELD (not supported natively by WITNESS) made necessary the introduction of LA_MASKASSERT which specifies the range for default lock assertion flags
    * About other aspects, lockmgr_assert() follows exactly what other locking primitives offer about this operation.
  - Build real assertions for buffer cache locks on the top of lockmgr_assert(). They can be used with the BUF_ASSERT_*(bp) paradigm.
  - Add checks at lock destruction time and use a cookie for verifying lock integrity at any operation.
  - Redefine BUF_LOCKFREE() in order to not use a direct assert but let it rely on the aforementioned destruction time check.
  The KPI is evidently broken by this, so a __FreeBSD_version bump and a manpage update are necessary and will be committed soon.
  Side note: lockmgr_assert() will be used soon in order to implement real assertions in the vnode namespace replacing the legacy and still bogus "VOP_ISLOCKED()" way.
  Tested by: kris (earlier version)
  Reviewed by: jhb
* Update cache flushing behavior in light of recent namecache and | rwatson | 2008-02-13 | 1 | -7/+0
  access cache improvements:
  - Flush just access control state on CODA_PURGEUSER, not the full namecache for /coda.
  - When replacing a fid on a cnode as a result of, e.g., reintegration after offline operation, we no longer need to purge the namecache entries associated with its vnode.
  MFC after: 1 month
* The hptrr driver first appeared in 6.3, not 5.3. | brueffer | 2008-02-13 | 1 | -1/+1
  PR: 120616
  Submitted by: Josh Paetzel <josh@tcbug.org>
  MFC after: 3 days
* Forced commit to note that the lost log message for the previous commit | bde | 2008-02-13 | 0 | -0/+0
  said that the previous commit was almost a null forced commit too. It just converted to __FBSDID(). I was going to change `huge' from its double precision value of 1e300, but that seems to be unnecessary since `huge' is only used to set FE_INEXACT, and any value with an exponent larger than LDBL_MANT_DIG will do for that, while initializing a really huge value in a portable way would require more code.
* s_ceill.c | bde | 2008-02-13 | 3 | -9/+6
  s_floorl.c
  s_truncl.c
* Use RTFREE_LOCKED() instead of rtfree() when releasing a reference on the | jhb | 2008-02-13 | 1 | -1/+1
  'rt' route in rtredirect() as 'rt' is always locked.
  MFC after: 1 week
  PR: kern/117913
  Submitted by: Stefan Lambrev stefan.lambrev of moneybookers.com
* On arches where long double is the same as double, alias ceil(), floor() | bde | 2008-02-13 | 4 | -6/+19
  and trunc() to the corresponding long double functions. This is not just an optimization for these arches. The full long double functions have a wrong value for `huge', and the arches without full long doubles depended on it being wrong.
* Remove coda_namecache from coda5 as well. We should probably GC coda5 | rwatson | 2008-02-13 | 1 | -1/+1
  entirely at this point as coda6 is considered the supported branch.
  MFC after: 1 month
* Remove coda_namecache from "options vcoda", it is no longer required. | rwatson | 2008-02-13 | 1 | -1/+0
  MFC after: 1 month
  Spotted by: Tinderbox
* Implement a rudimentary access cache for the Coda kernel module, | rwatson | 2008-02-13 | 3 | -28/+117
  modeled on the access cache found in NFS, smbfs, and the Linux coda module. This is a positive access cache of a single entry per file, tracking recently granted rights, but unlike NFS and smbfs, supporting explicit invalidation by the distributed file system.
  For each cnode, maintain a C_ACCCACHE flag indicating the validity of the cache, and a cached uid and mode tracking recently granted positive access control decisions.
  Prefer the cache to venus_access() in VOP_ACCESS() if it is valid, and when we must fall back to venus_access(), update the cache.
  Allow Venus to clear the access cache, either the whole cache on CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER. Unlike the Coda module on Linux, we don't flush all entries on a user purge using a generation number, we instead walk present cnodes and clear only entries for the specific user, meaning it is somewhat more expensive but won't hit all users.
  Since the Coda module is aggressive about not keeping around unopened cnodes, the utility of the cache is somewhat limited for files, but works well for directories. We should make Coda less aggressive about GCing cnodes in VOP_INACTIVE() in order to improve the effectiveness of in-kernel caching of attributes and access rights.
  MFC after: 1 month
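The single-entry positive access cache described above can be sketched as follows. This is a userland illustration under stated assumptions, not the Coda module's code: the structure, the `ACC_VALID` flag (standing in for C_ACCCACHE) and all function names are hypothetical, and accumulating rights for a repeated uid is an assumption of this sketch.

```c
#include <stdbool.h>
#include <sys/types.h>

#define ACC_VALID 0x1	/* hypothetical stand-in for the C_ACCCACHE flag */

/* One cache entry per file: who was granted which rights most recently. */
struct acc_entry {
	int	flags;
	uid_t	uid;
	mode_t	mode;
};

/* True if the cache alone is enough to grant `mode` to `uid`. */
static bool
acc_hit(const struct acc_entry *e, uid_t uid, mode_t mode)
{
	return ((e->flags & ACC_VALID) != 0 && e->uid == uid &&
	    (e->mode & mode) == mode);
}

/* Record a positive decision obtained from the slow path. */
static void
acc_store(struct acc_entry *e, uid_t uid, mode_t mode)
{
	if ((e->flags & ACC_VALID) != 0 && e->uid == uid) {
		e->mode |= mode;	/* assumption: accumulate rights */
	} else {
		e->uid = uid;
		e->mode = mode;
		e->flags |= ACC_VALID;
	}
}

/* Invalidate the entry only if it belongs to `uid`. */
static void
acc_purge_user(struct acc_entry *e, uid_t uid)
{
	if ((e->flags & ACC_VALID) != 0 && e->uid == uid)
		e->flags &= ~ACC_VALID;
}
```

Only grants are cached, so a miss always falls through to the slow path, and a purge for one uid leaves entries held by other users intact.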
* Fix the C version of ceill(x) for -1 < x <= -0 in all rounding modes. | bde | 2008-02-13 | 1 | -1/+1
  The result should be -0, but was +0.
* - Remove duplicate tputs.3 from MLINK. As we use termcap in the base, remove | rafan | 2008-02-13 | 1 | -1/+0
  the one that links to curs_terminfo.
  Submitted by: David Naylor <blackdragon at highveldmail.co.za>
  MFC after: 3 days
* Remove now-unused Coda namecache. | rwatson | 2008-02-13 | 2 | -905/+0
  MFC after: 1 month
* Rather than having the Coda module use its own namecache, use the global | rwatson | 2008-02-13 | 7 | -168/+114
  VFS namecache, as is done by the Coda module on Linux. Unlike the Coda namecache, the global VFS namecache isn't tagged by credential, so use more conservative flushing behavior (for now) when CODA_PURGEUSER is issued by Venus.
  This improves overall integration with the FreeBSD VFS, including allowing __getcwd() to work better, procfs/procstat monitoring, and so on. This improves shell behavior in many cases, and improves ".." handling. It may lead to some slowdown until we've implemented a specific access cache, which should net improve performance, but in the mean time, lookup access control now always goes to Venus, whereas previously it didn't.
  MFC after: 1 month
* Fix a lock leak in the ntfs locking scheme: | attilio | 2008-02-13 | 1 | -1/+1
  When ntfs_ntput() reaches 0 in the refcount the inode lockmgr is not released and directly destroyed. Fix this by unlocking the lockmgr() even in the case of zero-refcount.
  Reported by: dougb, yar, Scot Hetzel <swhetzel at gmail dot com>
  Submitted by: yar
* Fix exp2*(x) on signaling NaNs by returning x+x as usual. | bde | 2008-02-13 | 4 | -4/+4
  This has the side effect of confusing gcc-4.2.1's optimizer into more often doing the right thing. When it does the wrong thing here, it seems to be mainly making too many copies of x with dependency chains. This effect is tiny on amd64, but in some cases on i386 it is enormous. E.g., on i386 (A64) with -O1, the current version of exp2() should take about 50 cycles, but took 83 cycles before this change and 66 cycles after this change. exp2f() with -O1 only speeded up from 51 to 47 cycles. (exp2f() should take about 40 cycles, on an Athlon in either i386 or amd64 mode, and now takes 42 on amd64). exp2l() with -O1 slowed down from 155 cycles to 123 for some args; this is unimportant since the i386 exp2l() is a fake; the wrong thing for it seems to involve branch misprediction.
* Remove duplicate MLINK. | brueffer | 2008-02-13 | 1 | -1/+0
  Submitted by: David Naylor <blackdragon@highveldmail.co.za>
* Rearrange the polynomial evaluation for better parallelism. This is | bde | 2008-02-13 | 1 | -3/+4
  faster on all machines tested (old Celeron (P2), A64 (amd64 and i386) and ia64) except on ia64 when compiled with -O1. It takes 2 more multiplications, so it will be slower on old machines. The speedup is about 8 cycles = 17% on A64 (amd64 and i386) with best CFLAGS and some parallelism in the caller.
  Move the evaluation of 2**k up a bit so that it doesn't compete too much with the new polynomial evaluation. Unlike the previous optimization, this rearrangement cannot change the result, so compilers and CPU schedulers can do it, but they don't do it quite right yet. This saves a whole 1 or 2 cycles on A64.
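The trade-off described above (extra multiplications in exchange for shorter dependency chains) can be illustrated with a generic degree-5 polynomial. This sketch is not the polynomial or the exact rearrangement from the commit; the function names and the choice of degree are assumptions made for illustration.

```c
/*
 * Horner's rule: 5 multiplications, but every step depends on the
 * previous one, so nothing can run in parallel.
 */
static double
poly_horner(const double c[6], double x)
{
	return (((((c[5] * x + c[4]) * x + c[3]) * x + c[2]) * x + c[1]) *
	    x + c[0]);
}

/*
 * Split evaluation: the three pairs c[0]+c[1]*x, c[2]+c[3]*x and
 * c[4]+c[5]*x are independent of each other and of x*x, so a
 * superscalar CPU can compute them at the same time. This costs one
 * more multiplication than Horner's rule for this degree.
 */
static double
poly_split(const double c[6], double x)
{
	double x2 = x * x;
	double lo = c[0] + c[1] * x;
	double mid = c[2] + c[3] * x;
	double hi = c[4] + c[5] * x;

	return (lo + x2 * (mid + x2 * hi));
}
```

Both forms compute the same polynomial, although with general floating-point coefficients the last bits can differ because the rounding steps happen in a different order.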
* - mention new firmware images used in multi-slice mode | brueffer | 2008-02-13 | 1 | -4/+40
  - mention LRO support
  - describe multi-slice related tunables.
  - correct DIAGNOSTICS section to reflect that missing firmware is non-fatal.
  Submitted by: gallatin
* Use hardware remainder on amd64 since it is 5 to 10 times faster than | bde | 2008-02-13 | 3 | -1/+78
  software remainder and is already used for remquo().
* style.Makefile(5) | obrien | 2008-02-13 | 6 | -6/+6
* style(9) | obrien | 2008-02-13 | 2 | -6/+6
* Consolidate the code to generate a new XID for a NFS request into a | jhb | 2008-02-13 | 3 | -22/+25
  nfs_xid_gen() function instead of duplicating the logic in both nfsm_rpchead() and the NFS3ERR_JUKEBOX handling in nfs_request().
  MFC after: 1 week
  Submitted by: mohans (a long while ago)
* Remove SMP left-overs from NetBSD. | marcel | 2008-02-12 | 2 | -12/+3
* Make sure we restrict Linux only IPC calls from being executed | csjp | 2008-02-12 | 3 | -4/+22
  through the FreeBSD ABI. IPC_INFO, SHM_INFO, SHM_STAT were added specifically for Linux binary support. They are not documented as being a part of the FreeBSD ABI, also, the structures necessary for them have been hidden away from the users for a long time.
  Also, the Linux ABI layer uses its own structures to populate the responses back to the user to ensure that the ABI is consistent.
  I think there is a bit more separation work that needs to happen.
  Reviewed by: jhb
  Discussed with: jhb
  Discussed on: freebsd-arch@ (very briefly)
  MFC after: 1 month
* Regenerate for readlink(2). | ru | 2008-02-12 | 10 | -11/+11
* Change readlink(2)'s return type and type of the last argument | ru | 2008-02-12 | 6 | -13/+13
  to match POSIX.
  Prodded by: Alexey Lyashkov
* There's no need to suppress option GDB. | marcel | 2008-02-12 | 1 | -1/+0
* Add PIC support for IPIs. When registering an interrupt handler, | marcel | 2008-02-12 | 9 | -59/+143
  the PIC also informs the platform at which IRQ level it can start assigning IPIs, since this can depend on the number of IRQs supported for external interrupts.
* Fix remainder() and remainderf() in round-towards-minus-infinity mode | bde | 2008-02-12 | 2 | -8/+8
  when the result is +-0. IEEE754 requires (in all rounding modes) that if the result is +-0 then its sign is the same as that of the first arg, but in round-towards-minus-infinity mode an uncorrected implementation detail always reversed the sign. (The detail is that x-x with x's sign positive gives -0 in this mode only, but the algorithm assumed that x-x always has positive sign for such x.)
  remquo() and remquof() seem to need the same fix, but I cannot test them yet.
  Use long doubles when mixing NaN args. This trick improves consistency of results on at least amd64, so that more serious problems like the above aren't hidden in simple regression tests by noise for the NaNs. On amd64, hardware remainder should be used since it is about 10 times faster than software remainder and is already used for remquo(), but it involves using the i387 even for floats and doubles, and the i387 does NaN mixing which is better than but inconsistent with SSE NaN mixing. Software remainder() would probably have been inconsistent with software remainderl() for the same reason if the latter existed.
  Signaling NaNs cause further inconsistencies on at least ia64 and i386.
  Use __FBSDID().
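The IEEE754 rule quoted above (a zero result takes the sign of the first argument) can be expressed without relying on the sign that x-x happens to produce in the current rounding mode. The sketch below only illustrates that rule; it is not the fix made in the commit, and `zero_with_sign_of` is a hypothetical helper.

```c
#include <math.h>

/*
 * If `result` is a zero of either sign, replace it with a zero carrying
 * the sign of the first argument `x`; otherwise return it unchanged.
 * The sign is taken from x directly rather than from a subtraction such
 * as x-x, whose sign depends on the rounding mode.
 * zero_with_sign_of is a hypothetical name used for illustration only.
 */
static double
zero_with_sign_of(double result, double x)
{
	if (result == 0.0)
		return (signbit(x) ? -0.0 : 0.0);
	return (result);
}
```

Because +0 and -0 compare equal, the test `result == 0.0` catches both, and `signbit()` is needed to tell them apart afterwards.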
* If busdma is being used to realign dynamic buffers and the alignment is set to | scottl | 2008-02-12 | 2 | -4/+4
  PAGE_SIZE or less, the bounce page counting logic was flawed and wouldn't reserve any pages. Adjust to be correct. Review of other architectures is forthcoming.
  Submitted by: Joseph Golio
* Fix a typo when testing for the NO_C3 quirk. | jhb | 2008-02-12 | 1 | -1/+1
  MFC after: 3 days
* Fix typo. | raj | 2008-02-12 | 1 | -1/+1
  Approved by: cognet (mentor)
* Eliminate BUS_DMA <-> cache incoherencies in USB transfers. | raj | 2008-02-12 | 1 | -1/+2
  With write-allocate cache we get into the following scenario:
  1. data has been updated in the memory by the USB HC, but
  2. D-cache holds an un-flushed value of it
  3. when affected cache line is being replaced, the old (un-flushed) value is flushed and overwrites the newly arrived
  This is possible due to how write-allocate works with virtual caches (ARM for example).
  In case of USB transfers it leads to fatal tags discrepancies in umass(4) operation, which look like the following:
    umass0: Invalid CSW: tag 1 should be 2
    (probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
    (probe0:umass-sim0:0:0:0): Retrying Command
    umass0: Invalid CSW: tag 1 should be 3
    (probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
    (probe0:umass-sim0:0:0:0): Retrying Command
    umass0: Invalid CSW: tag 1 should be 4
    (probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
    (probe0:umass-sim0:0:0:0): Retrying Command
    umass0: Invalid CSW: tag 1 should be 5
    (probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
    (probe0:umass-sim0:0:0:0): Retrying Command
    umass0: Invalid CSW: tag 1 should be 6
    (probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
    (probe0:umass-sim0:0:0:0): error 5
    (probe0:umass-sim0:0:0:0): Retries Exausted
  To eliminate this, a BUS_DMASYNC_PREREAD sync operation is required in usbd_start_transfer().
  Credits for nailing this down go to Grzegorz Bernacki gjb AT semihalf DOT com.
  Reviewed by: imp
  Approved by: cognet (mentor)
* Add the -4 option to the synopsis. | ceri | 2008-02-12 | 1 | -1/+1