path: root/sys/fs
Commit message (Collapse)AuthorAgeFilesLines
* Fix interaction with Windows 2000/XP based servers:bp2005-11-221-1/+3
| | | | | | | | | | If the complete reply to the TRANS2_FIND_FIRST2 request fits exactly into one response packet, then the next call to TRANS2_FIND_NEXT2 will return zero entries and the server will close the current transaction. To avoid subsequent errors we should not perform a FIND_CLOSE2 request. PR: kern/78953 Submitted by: Jim Carroll
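The rule described in this commit can be sketched in userland C as follows. This is only an illustration under stated assumptions: `struct search_ctx`, `search_next_reply()` and `search_finish()` are hypothetical names, not the smbfs code, and the counter stands in for actually sending a FIND_CLOSE2 request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-search state; not the smbfs structures. */
struct search_ctx {
	int open;	/* nonzero while the server still holds the search */
	int close_sent;	/* number of FIND_CLOSE2 requests issued */
};

/* Record a FIND_NEXT2 reply: zero entries means the server closed the search. */
void
search_next_reply(struct search_ctx *ctx, int nentries)
{
	assert(ctx != NULL && nentries >= 0);
	if (nentries == 0)
		ctx->open = 0;
}

/* Finish the search, issuing FIND_CLOSE2 only if the search is still open. */
void
search_finish(struct search_ctx *ctx)
{
	assert(ctx != NULL);
	if (ctx->open) {
		ctx->close_sent++;	/* stand-in for sending FIND_CLOSE2 */
		ctx->open = 0;
	}
}
```

With this shape, a search whose FIND_NEXT2 reply carried zero entries never triggers a close request, while an ordinary search still closes exactly once.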
* Properly parse the nowin95 mount option.rodrigc2005-11-191-5/+4
| | | | Tested by: Rainer Hurling <rhurlin at gwdg dot de>
* Add "shortnames" and "longnames" mount options which arerodrigc2005-11-181-1/+5
| | | | | | | | | | synonyms for "shortname" and "longname" mount options. The old (before nmount()) mount_msdosfs program accepted "shortnames" and "longnames", but the kernel nmount() checked for "shortname" and "longname". So, make the kernel accept "shortnames", "longnames", "shortname", "longname" for forwards and backwards compatibility. Discovered by: Rainer Hurling <rhurlin at gwdg dot de>
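One way to accept both spellings is a small alias table, sketched below. The flag values, `struct opt_alias` and `lookup_name_opt()` are hypothetical and chosen for this example; the kernel change itself works on the nmount() option list and may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values for this sketch; not the kernel's constants. */
enum { OPT_NONE = 0, OPT_SHORTNAME = 1, OPT_LONGNAME = 2 };

struct opt_alias {
	const char *name;
	int flag;
};

/* Singular and plural spellings map to the same flag. */
static const struct opt_alias name_opts[] = {
	{ "shortname",  OPT_SHORTNAME },
	{ "shortnames", OPT_SHORTNAME },
	{ "longname",   OPT_LONGNAME },
	{ "longnames",  OPT_LONGNAME },
};

/* Return the flag for a mount option name, or OPT_NONE if it is unknown. */
int
lookup_name_opt(const char *opt)
{
	size_t i;

	assert(opt != NULL);
	for (i = 0; i < sizeof(name_opts) / sizeof(name_opts[0]); i++) {
		if (strcmp(opt, name_opts[i].name) == 0)
			return (name_opts[i].flag);
	}
	return (OPT_NONE);
}
```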
* - Add errmsg to the list of smbfs mount options.rodrigc2005-11-161-7/+23
| | | | | | - Use vfs_mount_error() to propagate smbfs mount errors back to userspace. Reviewed by: bp (smbfs maintainer)
* This is a workaround for a complicated issue involving VFS cookies and devfs.dwhite2005-11-091-0/+24
| | | | | | | | | | | | | The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only. PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days
* Normalize a significant number of kernel malloc type names:rwatson2005-10-3122-36/+36
| - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
| - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
| - Disambiguate some collisions by adding subsystem prefixes to some memory types.
| - Generally prefer lower case to upper case.
| - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
| Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
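The naming conventions above were applied by editing the malloc type definitions by hand. Purely as an illustration of the first, second and fourth conventions, a userland helper could look like the sketch below; `normalize_type_name()` is a hypothetical name and nothing like it is part of the commit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrite a type name in place: spaces become '_', '/' characters are
 * dropped, and upper case becomes lower case. Hypothetical helper.
 */
void
normalize_type_name(char *name)
{
	char *src, *dst;

	assert(name != NULL);
	for (src = dst = name; *src != '\0'; src++) {
		if (*src == '/')
			continue;
		if (*src == ' ')
			*dst++ = '_';
		else
			*dst++ = (char)tolower((unsigned char)*src);
	}
	*dst = '\0';
}
```

The subsystem-prefix convention is not covered here, since choosing a prefix needs knowledge of where the type is defined.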
* Use correct criteria for determining which directory entries we canphk2005-10-181-1/+1
| | | | | | purge right away and which we merely can hide. Beaten into my skull by: kris
* Implement the full range of ISO9660 number conversion routines in iso.h.des2005-10-181-49/+35
| | | | MFC after: 2 weeks
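For readers unfamiliar with the ISO 9660 numeric encodings, the 32-bit variants can be sketched as below. The `sketch_` names are hypothetical and this is not the code from iso.h; it only assumes the standard's layout, in which section 7.3.1 is little-endian, 7.3.2 is big-endian, and 7.3.3 stores the value twice (little-endian first, then big-endian).

```c
#include <assert.h>
#include <stdint.h>

/* ISO 9660 7.3.1: 32-bit value stored little-endian in 4 bytes. */
uint32_t
sketch_isonum_731(const unsigned char *p)
{
	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* ISO 9660 7.3.2: 32-bit value stored big-endian in 4 bytes. */
uint32_t
sketch_isonum_732(const unsigned char *p)
{
	return ((uint32_t)p[3] | (uint32_t)p[2] << 8 |
	    (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24);
}

/*
 * ISO 9660 7.3.3: 8 bytes holding the value little-endian and then
 * big-endian. This sketch reads only the little-endian half.
 */
uint32_t
sketch_isonum_733(const unsigned char *p)
{
	return (sketch_isonum_731(p));
}
```

Reading byte by byte keeps the result independent of host endianness and alignment.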
* Unconditionally mount a CD9660 filesystem as read-only, instead ofrodrigc2005-10-171-2/+4
| | | | returning EROFS if we forget to mount it as read-only.
* Use the actual sector size of the media instead of hard-coding it to 2048.rodrigc2005-10-171-3/+12
| | | | | This eliminates KASSERTs in GEOM if we accidentally mount an audio CD as a cd9660 filesystem.
* Unconditionally mount a UDF filesystem as read-only, instead ofrodrigc2005-10-171-2/+4
| | | | returning an EROFS if we forget to mount it as read-only.
* - Fix typo.flz2005-10-171-1/+1
| | | | | Approved by: ssouhlal MFC after: 1 week
* Update nwfs_lookup() to match the current cache_lookup() API.truckman2005-10-161-26/+11
| | | | | | | | | cache_lookup() has returned a ref'ed and locked vnode since vfs_cache.c:1.96, dated Tue Mar 29 12:59:06 2005 UTC. This change is similar to the change made to smbfs_lookup() in smbfs_vnops.c:1.58. Tested by: "Antony Mawer" ant AT mawer.org MFC after: 2 weeks
* Reflect mpsafety of the underlying filesystem in the nullfs image.kris2005-10-161-0/+1
| | | | | | | | | | I benchmarked this by simultaneously extracting 4 large tarballs (basically world images) on a 4-processor AMD64 system, in a malloc-backed md. With this patch, system time was reduced by 43%, and wall clock time by 33%. Submitted by: jeff MFC after: 1 week
* Apply the same fix to a potential race in the ISDOTDOT code intruckman2005-10-161-3/+4
| | | | | | | | cd9660_lookup() that was used to fix an actual race in ufs_lookup.c:1.78. This is not currently a hazard, but the bug would be activated by marking cd9660 as MPSAFE. Requested by: bde
* In preparation for making the modules actually use opt_*.h filesyar2005-10-142-8/+0
| | | | | | | | | | | | | | | | | provided in the kernel build directory, fix modules that were failing to build this way due to not quite correct kernel option usage. In particular: ng_mppc.c uses two complementary options, both of which are listed in sys/conf/files. Ideally, there should be a separate option for including ng_mppc.c in kernel build, but now only NETGRAPH_MPPC_ENCRYPTION is usable anyway, the other one requires proprietary files. nwfs and smbfs were trying to ensure they were built with proper network components, but the check was rather questionable. Discussed with: ru
* 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, mostdavidxu2005-10-141-1/+1
| changes in MD code are trivial. Before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal values to user code.
| 2. Remove cpu_thread_siginfo; it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
| 3. Add p_sigqueue to the proc structure to hold shared signals which were blocked by all threads in the proc.
| 4. Add td_sigqueue to the thread structure to hold all signals delivered to the thread.
| 5. i386 and amd64 now return POSIX standard si_code; other arches will be fixed.
| 6. In this sigqueue implementation, the pending signal set is kept as before; an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before and won't fail even under memory pressure. The only exception is when deleting a signal: we should call sigqueue_delete to remove the signal from the sigqueue, not SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example, allowing a signal to be delivered but throwing away siginfo data if memory is not enough. SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if resources are unavailable, EAGAIN will be returned as the specification says. Just before a thread exits, signal queue memory will be freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals.
| Earlier patch reviewed by: jhb, deischen
| Tested on: i386, amd64
* - Do not hardcode the bsize to a sectorsize of 2048, even thoughrodrigc2005-10-091-5/+21
| the UDF specification specifies a logical sectorsize of 2048. Instead, get it from GEOM.
| - When reading the UDF Anchor Volume Descriptor, use the logical sectorsize of 2048 when calculating the offset to read from, but use the actual sectorsize to determine how much to read.
| - works with reading a DVD disk and a DVD disk image file via mdconfig
| - correctly returns EINVAL if we try to mount_udf an audio CD, instead of panicking inside GEOM when INVARIANTS is set
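The split between the two sector sizes can be sketched as follows. `struct read_req` and `anchor_read_request()` are hypothetical, and the sketch works in plain byte offsets rather than whatever block units the kernel read path uses; the logical sector number is passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>

#define UDF_LOGICAL_SECTOR_SIZE 2048u	/* logical sector size named in the commit */

/* Hypothetical description of one read: where to start and how much to read. */
struct read_req {
	uint64_t offset;	/* byte offset on the media */
	uint32_t length;	/* number of bytes to read */
};

/*
 * The offset is computed from the logical sector size, while the length
 * comes from the media's actual sector size (reported by GEOM in the kernel).
 */
struct read_req
anchor_read_request(uint32_t logical_sector, uint32_t media_sector_size)
{
	struct read_req req;

	assert(media_sector_size > 0);
	req.offset = (uint64_t)logical_sector * UDF_LOGICAL_SECTOR_SIZE;
	req.length = media_sector_size;
	return (req);
}
```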
* We don't need 'imp' here.pjd2005-10-071-1/+0
|
* Second attempt at a work-around for fifo-related socket panics duringrwatson2005-10-011-0/+4
| | | | | | | | make -j with high levels of parallelism: acquire Giant in fifo I/O routines. Discussed with: ups MFC after: 3 days
* The NWFS code in RELENG_6 is broken due to a typo inphk2005-09-301-1/+1
| sys/fs/nwfs/nwfs_vfsops.c, introduced with the conversion to nmount with revision 1.38.
| This causes mount_nwfs to fail with the error message:
| mount_nwfs: mount error: /mnt/netware: syserr = No such file or directory
| This is caused by a typo on line 178, which specifies "nwfw_args" rather than "nwfs_args".
| Submitted by: Antony Mawer <gnats@mawer.org>
| Fat fingers: phk
| PR: 86757
| MFC: 3 days
* Remove checks for BOOTSIG[23] from FAT32 bootblocks.peadar2005-09-292-8/+2
| | | | | | | There seems to be very little documentary evidence outside this implementation to suggest that these checks are necessary, and more than one camera-formatted flash disk fails the check, but mounts successfully on most other systems. Reviewed By: bde@
* Back out fifo_vnops.c:1.127, which introduced an sx lock around I/O onrwatson2005-09-271-16/+3
| | | | | | | | | a fifo. While this did indeed close the race, confirming suspicions about the nature of the problem, it causes difficulties with blocking I/O on fifos. Discussed with: ups Also spotted by: Peter Holm <peter at holm dot cc>
* Assert v_fifoinfo is non-NULL in fifo_close() in order to catchrwatson2005-09-261-0/+1
| | | | | | | non-conforming cases sooner. MFC after: 3 days Reported by: Peter Holm <peter at holm dot cc>
* Lock the read socket receive buffer when frobbing the sb_state flag onrwatson2005-09-251-2/+2
| | | | | | | | | that socket during open, not the write socket receive buffer. This might explain clearing of the sb_state SB_LOCK flag seen occasionally in soreceive() on fifos. MFC after: 3 days Spotted by: ups
* Make rule zero really magical, that way we don't have to do anythingphk2005-09-243-153/+99
| | | | | | | | | | | | | | | | when we mount and get zero cost if no rules are used in a mountpoint. Add code to deref rules on unmount. Switch from SLIST to TAILQ. Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead. Drop goto, a break will do. Reduce double pointers to single pointers. Combine reaping and destroying rulesets. Avoid memory leaks in some error cases.
* For reasons of consistency (and necessity), assert an exclusive vnoderwatson2005-09-231-0/+1
| | | | | | | lock on the fifo vnode in fifo_open(): we rely on the vnode lock to serialize access to v_fifoinfo. MFC after: 3 days
* Add fi_sx, an sx lock to serialize I/O operations on the socket pairrwatson2005-09-221-3/+16
| | | | | | | | | | | | | | | | | | underlying the POSIX fifo implementation. In 6.x/7.x, fifo access is moved from the VFS layer, where it was serialized using the vnode lock, to the file descriptor layer, where access is protected by a reference count but not serialized. This exposed socket buffer locking to high levels of parallelism in specific fifo workloads, such as make -j 32, which expose as yet unresolved socket buffer bugs. fi_sx re-adds serialization around the read and write routines, although not paths that simply test socket buffer mbuf queue state, such as the poll and kqueue methods. This restores the extra locking cost previously present in some cases, but is an effective workaround for the instability that has been experienced. This workaround should be removed once the bug in socket buffer handling has been fixed. Reported by: kris, jhb, Julien Gabel <jpeg at thilelli dot net>, Peter Holm <peter at holm dot cc>, others MFC after: 3 days
* Revamp DEVFS internals pretty severely [1].phk2005-09-196-446/+437
| Give DEVFS a proper inode called struct cdev_priv. It is important to keep in mind that this "inode" is shared between all DEVFS mountpoints, therefore it is protected by the global device mutex.
| Link the cdev_priv's into a list, protected by the global device mutex. Keep track of each cdev_priv's state with a flag bit and of references from mountpoints with a dedicated usecount.
| Reap the benefits of much improved kernel memory allocator and the generally better defined device driver APIs to get rid of the tables of pointers + serial numbers, their overflow tables, the atomics to muck about in them and all the trouble that resulted in. This makes RAM the only limit on how many devices we can have.
| The cdev_priv is actually a super struct containing the normal cdev as the "public" part, and therefore allocation and freeing has moved to devfs_devs.c from kern_conf.c.
| The overall responsibility is (to be) split such that kern/kern_conf.c is the stuff that deals with drivers and struct cdev and fs/devfs handles filesystems and struct cdev_priv and their private liaison exposed only in devfs_int.h.
| Move the inode number from cdev to cdev_priv and allocate inode numbers properly with unr. Local dirents in the mountpoints (directories, symlinks) allocate inodes from the same pool to guarantee against overlaps.
| Various other fields are going to migrate from cdev to cdev_priv in the future in order to hide them. A few fields may migrate from devfs_dirent to cdev_priv as well.
| Protect the DEVFS mountpoint with an sx lock instead of lockmgr, this lock also protects the directory tree of the mountpoint.
| Give each mountpoint a unique integer index, allocated with unr. Use it as an index into an array of devfs_dirent pointers in each cdev_priv. Initially the array points to a single element also inside cdev_priv, but as more devfs instances are mounted, the array is extended with malloc(9) as necessary when the filesystem populates its directory tree.
| Retire the cdev alias lists, the cdev_priv now knows about all the relevant devfs_dirents (and their vnodes) and devfs_revoke() will pick them up from there. We still spelunk into other mountpoints and fondle their data without 100% good locking. It may make better sense to vector the revoke event into the tty code and there do a destroy_dev/make_dev on the tty's devices, but that's for further study.
| Lots of shuffling of stuff and churn of bits for no good reason[2].
| XXX: There is still nothing preventing the dev_clone EVENTHANDLER from being invoked at the same time in two devfs mountpoints. It is not obvious what the best course of action is here.
| XXX: comment out an if statement that lost its body, until I can find out what should go there so it doesn't do damage in the meantime.
| XXX: Leave in a few extra malloc types and KASSERTS to help track down any remaining issues.
| Much testing provided by: Kris
| Much confusion caused by (races in): md(4)
| [1] You are not supposed to understand anything past this point.
| [2] This line should simplify life for the peanut gallery.
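The per-mountpoint array of dirent pointers described in this commit, which starts as a single embedded element and grows on demand, can be sketched in userland C as below. All names (`struct slot_table`, `slot_table_set()` and so on) are hypothetical, calloc()/free() stand in for malloc(9), and the locking the commit relies on is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-device table of dirent pointers. */
struct slot_table {
	void **slots;		/* points at inline_slot until the first growth */
	unsigned int nslots;	/* number of usable entries in slots */
	void *inline_slot;	/* the single embedded element */
};

void
slot_table_init(struct slot_table *t)
{
	t->inline_slot = NULL;
	t->slots = &t->inline_slot;
	t->nslots = 1;
}

/* Store de at mount index idx, growing the array if idx is out of range. */
int
slot_table_set(struct slot_table *t, unsigned int idx, void *de)
{
	void **grown;

	if (idx >= t->nslots) {
		grown = calloc((size_t)idx + 1, sizeof(*grown));
		if (grown == NULL)
			return (-1);
		memcpy(grown, t->slots, t->nslots * sizeof(*grown));
		if (t->slots != &t->inline_slot)
			free(t->slots);
		t->slots = grown;
		t->nslots = idx + 1;
	}
	t->slots[idx] = de;
	return (0);
}

/* Release any heap array and return to the single embedded element. */
void
slot_table_fini(struct slot_table *t)
{
	if (t->slots != &t->inline_slot)
		free(t->slots);
	slot_table_init(t);
}
```

The embedded element means the common case of a single devfs mount needs no extra allocation.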
* Assert that (vp) is locked in fifo_close(), since we rely on therwatson2005-09-181-0/+1
| | | | | | | exclusive vnode lock to synchronize the reference counts on struct fifoinfo. MFC after: 3 days
* Don't attempt to recurse lockmgr, it doesn't like it.phk2005-09-152-3/+6
|
* Handle a race condition where NULLFS vnode can be cleaned while threadskan2005-09-151-4/+28
| | | | | | | can still be asleep waiting for lowervp lock. Tested by: kkenn Discussed with: ssouhlal, jeffr
* The socket pointers in fifoinfo are not permitted to be NULL, sorwatson2005-09-151-5/+2
| | | | | | don't check if they are, it just confuses the fifo code more. MFC after: 3 days
* Various minor polishing.phk2005-09-153-22/+10
|
* Protect the devfs rule internal global lists with a sx lock, the perphk2005-09-151-1/+9
| | | | | mount locks are not enough. Finer granularity (x)locking could be implemented, but I prefer to keep it simple for now.
* Absolve devfs_rule.c from locking responsibility and call it withphk2005-09-153-19/+5
| | | | all necessary locking held.
* Close a race which could result in unwarranted "ruleset %d alreadyphk2005-09-153-44/+34
| | | | | | | | | | | | | | | | | running" panics. Previously, recursion through the "include" feature was prevented by marking each ruleset as "running" when applied. This doesn't work for the case where two DEVFS instances try to apply the same ruleset at the same time. Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which limits how many levels of "include" we will traverse. Be aware that traversal of "include" is recursive and kernel stack size is limited. MFC: after 3 days
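The depth-limited traversal can be sketched as below. `struct ruleset` and `apply_ruleset()` are hypothetical, and the sketch assumes that an "include" beyond the depth limit is silently skipped; the commit message does not say whether the kernel reports it instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ruleset: some rules plus at most one "include". */
struct ruleset {
	int include;	/* index of an included ruleset, or -1 for none */
	int nrules;	/* number of rules in this ruleset */
};

/*
 * Apply ruleset idx and follow its "include" while depth allows.
 * Returns the number of rules applied. The depth counter, not a
 * per-ruleset "running" mark, is what bounds the recursion.
 */
int
apply_ruleset(const struct ruleset *sets, int nsets, int idx, int depth)
{
	int applied;

	assert(sets != NULL && idx >= 0 && idx < nsets);
	applied = sets[idx].nrules;
	if (depth > 0 && sets[idx].include >= 0 && sets[idx].include < nsets)
		applied += apply_ruleset(sets, nsets, sets[idx].include,
		    depth - 1);
	return (applied);
}
```

Because no state is written into the rulesets, two callers can walk the same ruleset at the same time, and even mutually including rulesets terminate.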
* Trim down now (believed to be) unused fifo_ioctl() andrwatson2005-09-131-65/+75
| | | | | | | | | | | | | | | | | | | | | | fifo_kqfilter() VOP implementations, since they in theory are used only on open file descriptors, in which case the ioctls are via fifo_ioctl_f() and kqueue requests are via fifo_kqfilter_f(). Generate warnings if they are entered for now. These printf() calls should become panic() calls. Annotate and re-implement fifo_ioctl_f(): don't arbitrarily forward ioctls to the socket layer, only forward the ones we explicitly support for fifos. In the case of FIONREAD, don't forward the request to the write socket on a read-write fifo, or the read result is overwritten. Annotate a nasty case for the undefined POSIX O_RDWR on fifos, in which failure of the second ioctl will result in the socket pair being in an inconsistent state. Assert copyright as I find myself rewriting non-trivial parts of fifofs. MFC after: 3 days
* As a result of kqueue locking work, socket buffer locks will alwaysrwatson2005-09-131-18/+6
| | | | | | be held when entering a kqueue filter for fifos via a socket buffer event: as such, assert the lock unconditionally rather than acquiring it conditionally. MFC after: 3 days
* Annotate two issues:rwatson2005-09-131-0/+12
| | | | | | | | | | | | 1) fifo_kqfilter() is not actually ever used, it likely should be GC'd. 2) fifo_kqfilter_f() doesn't implement EVFILT_VNODE, so detecting events on the underlying vnode for a fifo no longer works (it did in 4.x). Likely, fifo_kqfilter_f() should forward the request to the VFS using fp->f_vnode, which would work once fifo_kqfilter() was detached from the vnode operation vector (removing the fifo override). Discussed with: phk
* Introduce no-op nosup fifo kqueue filter and detach routine, which arerwatson2005-09-121-1/+33
| | | | | | | | | | | | | | | | used when a read filter is requested on a write-only fifo descriptor, or a write filter is requested on a read-only fifo descriptor. This permits the filters to be registered, but never raises the event, which causes kqueue behavior for fifos to more closely match similar semantics for poll and select, which permit testing for the condition even though the condition will never be raised, and is consistent with POSIX's notion that a fifo has identical semantics to a one-way IPC channel created using pipe() on most operating systems. The fifo regression test suite can now run to completion on HEAD without errors. MFC after: 3 days
* When a request is made to register a filter on a fifo that doesn'trwatson2005-09-121-2/+2
| | | | | | | apply to the fifo (i.e., not EVFILT_READ or EVFILT_WRITE), reject it as EINVAL, not by returning 1 (EPERM). MFC after: 3 days
* Remove DFLAG_SEEKABLE from fifo file descriptors: fifos are not seekablerwatson2005-09-121-1/+1
| | | | | | according to POSIX, not to mention the fact that it doesn't make sense (and hence isn't really implemented). This causes the fifo_misc regression test to succeed.
* Only poll the fifo for read events if the fifo is attached to a readablerwatson2005-09-121-2/+2
| | | | | | | | | | | | file descriptor. Otherwise, the read end of a fifo might return that it is writable (which it isn't). Only poll the fifo for write events if the fifo is attached to a writable file descriptor. Otherwise, the write end of a fifo might return that it is readable (which it isn't). In the event that a file is FREAD|FWRITE (which is allowed by POSIX, but has undefined behavior), we poll for both. MFC after: 3 days
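The event filtering described here can be sketched as a pure function over the descriptor's open mode. `SK_FREAD`, `SK_FWRITE` and `fifo_poll_mask()` are hypothetical names for this example; the kernel uses the FREAD and FWRITE file flags and polls the underlying sockets.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical open-mode flags for this sketch. */
#define SK_FREAD	0x1
#define SK_FWRITE	0x2

/*
 * Reduce the requested poll events to those that make sense for the
 * descriptor: read events only for readers, write events only for writers.
 */
int
fifo_poll_mask(int fflag, int events)
{
	int mask = 0;

	if (fflag & SK_FREAD)
		mask |= events & POLLIN;
	if (fflag & SK_FWRITE)
		mask |= events & POLLOUT;
	return (mask);
}
```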
* After going to some trouble to identify only the write-related eventsrwatson2005-09-121-2/+2
| | | | | | | | | to poll the write socket for, the fifo polling code proceeded to poll for the complete set of events. Use 'levents' instead of 'events' as the argument to poll, and only poll the write socket if there is interest in write events. MFC after: 3 days
* When a writer opens a fifo, wake up the read socket for read, not therwatson2005-09-121-1/+1
| | | | | | write socket. MFC after: 3 days
* Add an assertion that fifo_open() doesn't race against other threadsrwatson2005-09-121-0/+2
| | | | | | | while sleeping to allocate fifo state: due to using the vnode lock to serialize access to a fifo during open, it shouldn't happen (tm). MFC after: 3 days
* Rather than reaching into the internals of the UNIX domain socket coderwatson2005-09-121-1/+1
| | | | | | | by calling uipc_connect2() to connect two socket endpoints to create a fifo, call soconnect2(). MFC after: 3 days
* Clean up prototypes.phk2005-09-121-258/+96
|
* Cast bf_sysid to const char * when passing it to strncmp(), becauserodrigc2005-09-111-1/+1
| | | | strncmp does not take an unsigned char *. Eliminates warning with GCC 4.0.
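The cast in question looks roughly like the sketch below. `sysid_matches()` is a hypothetical wrapper and the identifier strings in the usage are placeholders; only the `(const char *)` cast when handing an unsigned char buffer to strncmp() reflects the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare an on-disk identifier stored as unsigned char against a C
 * string. strncmp() takes const char *, so the buffer is cast explicitly
 * to avoid a pointer-signedness warning.
 */
int
sysid_matches(const unsigned char *sysid, const char *want, size_t len)
{
	assert(sysid != NULL && want != NULL);
	return (strncmp((const char *)sysid, want, len) == 0);
}
```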