summaryrefslogtreecommitdiffstats
path: root/sys/fs
Commit message (Collapse)AuthorAgeFilesLines
* Mark most often used sysctl's as MPSAFE.ed2009-01-281-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | After running a `make buildkernel', I noticed most of the Giant locks in sysctl are only caused by a very small amount of sysctl's: - sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like sysctl.oidfmt. - kern.ident, kern.osrelease, kern.version, etc. These are just constant strings. - kern.arandom, used by the stack protector. It is already protected by arc4_mtx. I also saw the following sysctl's show up. Not as often as the ones above, but still quite often: - security.jail.jailed. Also mark security.jail.list as MPSAFE. They don't need locking or already use allprison_lock. - kern.devname, used by devname(3), ttyname(3), etc. This seems to reduce Giant locking inside sysctl by ~75% in my primitive test setup.
* Use the correct field name for the size of the sierra_id. While thisimp2009-01-281-1/+1
| | | | | | is the same size as id, and is unlikely to change, it seems better to use the correct field here. There's no difference in the generated code.
* Mark cd9660 MPSAFE and add support for using shared vnode locks duringjhb2009-01-284-50/+114
| | | | | | | | | | | | | | | | | | | | | | pathname lookups. - Remove 'i_offset' and 'i_ino' from the ISO node structure and replace them with local variables in the lookup routine instead. - Cache a copy of 'i_diroff' for use during a lookup in a local variable. - Save a copy of the found directory entry in a malloc'd buffer after a successfull lookup before getting the vnode. This allows us to release the buffer holding the directory block before calling vget() which otherwise resulted in a LOR between "bufwait" and the vnode lock. - Use an inlined version of vn_vget_ino() to handle races with .. lookups. I had to inline the code here since cd9660 uses an internal vget routine to save a disk I/O that would otherwise re-read the directory block. - Honor the requested locking flags during lookups to allow for shared locking. - Honor the requested locking flags passed to VFS_ROOT() and VFS_VGET() similar to UFS. - Don't make every ISO 9660 vnode hold a reference on the vnode of the underlying device vnode of the mountpoint. The mountpoint already holds a suitable reference.
* Sync with ufs_vnops.c:1.245 and remove support for accessing device nodesjhb2009-01-281-3/+7
| | | | in ISO 9660 filesystems.
* Assert an exclusive vnode lock for fifo_cleanup() and fifo_close() sincejhb2009-01-281-2/+2
| | | | | | they change v_fifoinfo. Discussed with: ups (a while ago)
* Last step of splitting up minor and unit numbers: remove minor().ed2009-01-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. Ths commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() isn't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.
* The kernel may do unbalanced calls to fifo_close() for fifo vnode,kib2009-01-261-1/+4
| | | | | | | | | | | | | without corresponding number of fifo_open(). This causes assertion failure in fifo_close() due to vp->v_fifoinfo being NULL for kernel with INVARIANTS, or NULL pointer dereference otherwise. In fact, we may ignore excess calls to fifo_close() without bad consequences. Turn KASSERT() into the return, and print warning for now. Tested by: pho Reviewed by: rwatson MFC after: 2 weeks
* Turn a "panic: non-decreasing id" into an error printf. This seemstrasz2009-01-131-2/+5
| | | | | | | | | to be caused by a metadata corruption that occurs quite often after unplugging a pendrive during write activity. Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
* Fix msdosfs_print(), which in turn fixes "show lockedvnods" for msdosfstrasz2009-01-111-0/+1
| | | | | | | | vnodes. Reviewed by: kib Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
* Fix a deadlock which can occur due to a pseudofs vnode not getting unlocked.marcus2009-01-091-0/+1
| | | | | | Reported by: Richard Todd <rmtodd@ichotolot.servalan.com> Reviewed by: kib Approved by: kib
* Don't panic with "vinvalbuf: dirty bufs" when the mounted device that wastrasz2009-01-081-2/+18
| | | | | | | | being written to goes away. Reviewed by: kib, scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
* Add a VOP_VPTOCNP implementation for pseudofs which covers file systemsmarcus2008-12-301-0/+79
| | | | | | | | | | | | such as procfs and linprocfs. This implementation's locking was enhanced by kib. Reviewed by: kib des Approved by: des kib Tested by: pho
* When the insmntque() in the pfs_vncache_alloc() fails, vop_reclaim callskib2008-12-291-1/+2
| | | | | | | | | | | | pfs_vncache_free() that removes pvd from the list, while it is not yet put on the list. Prevent the invalid removal from the list by clearing pvd_next and pvd_prev for the newly allocated pvd, and only move pfs_vncache list head when the pvd was at the head. Suggested and approved by: des MFC after: 2 weeks
* vm_map_lock_read() does not increment map->timestamp, so we shouldkib2008-12-291-1/+1
| | | | | | | | | | compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though. Tested by: pho Approved by: des MFC after: 2 weeks
* Use curproc->p_sysent->sv_flags bit SV_ILP32 for detection of the 32 bitkib2008-12-291-11/+5
| | | | | | | | caller, instead of direct comparision with ia32_freebsd_sysvec. Tested by: pho Approved by: des MFC after: 2 weeks
* Drop the pseudofs vnode lock around call to pfs_read handler. The handlerkib2008-12-291-15/+18
| | | | | | | | | may need to lock arbitrary vnodes, causing either lock order reversal or recursive vnode lock acquisition. Tested by: pho Approved by: des MFC after: 2 weeks
* After the pfs_vncache_mutex is dropped, another thread may attempt tokib2008-12-291-13/+26
| | | | | | | | | | | | | | | | | | do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we could end up with two vnodes for the pair. Recheck the cache under the locked pfs_vncache_mutex after all sleeping operations are done [1]. This case mostly cannot happen now because pseudofs uses exclusive vnode locking for lookup. But it does drop the vnode lock for dotdot lookups, and Marcus' pseudofs_vptocnp implementation is vulnerable too. Do not call free() on the struct pfs_vdata after insmntque() failure, because vp->v_data points to the structure, and pseudofs_reclaim() frees it by the call to pfs_vncache_free(). Tested by: pho [1] Approved by: des MFC after: 2 weeks
* According to phk@, VOP_STRATEGY should never, _ever_, returntrasz2008-12-165-5/+5
| | | | | | | | | | | | | anything other than 0. Make it so. This fixes "panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648", encountered when writing to an orphaned filesystem. Reason for the panic was the following assert: KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp)); at vfs_bio:bufstrategy(). Reviewed by: scottl, phk Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
* Reference the vmspace of the process being inspected by procfs, linprocfskib2008-12-121-1/+8
| | | | | | | | and sysctl kern_proc_vmmap handlers. Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week
* Do not leak defs_de_interlock on error.kib2008-12-121-1/+3
| | | | Another pointy hat for my collection.
* Implement VOP_VPTOCNP for devfs. Directory and character device vnodes aremarcus2008-12-121-0/+65
| | | | | | | properly translated to their component names. Reviewed by: arch Approved by: kib
* Add a simple VOP_VPTOCNP implementation for deadfs which returns EBADF.marcus2008-12-121-0/+1
| | | | | Reviewed by: arch Approved by: kib
* Relock user map earlier, to have the lock held when break leaves thekib2008-12-101-2/+1
| | | | | | | loop earlier due to sbuf error. Pointy hat to: me Submitted by: dchagin
* Make two style changes to create new commit and document proper commitkib2008-12-081-1/+1
| | | | | | | | | | | | | | | | | | message for r185765. Noted by: rdivacky Requested by: des Commit message for r185765 should be: In procfs map handler, and in linprocfs maps handler, do not call vn_fullpath() while having vm map locked. This is done in anticipation of the vop_vptocnp commit, that would make vn_fullpath sometime acquire vnode lock. Also, in linprocfs, maps handler already acquires vnode lock. No objections from: des MFC after: 2 week
* Change the linprocfs <pid>/maps and procfs <pid>/map handlers to usekib2008-12-081-9/+28
| | | | | | | | | | | | sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. Patch is forward-ported des@' one, and was adopted to current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
* The timezone byte is a signed value, treat it as such.kientzle2008-11-271-1/+1
| | | | | | | | | Otherwise, time zone information for time zones west of GMT gets discarded. PR: kern/128934 Submitted by: J.R. Oldroyd MFC after: 4 days
* In null_lookup(), do the needed cleanup instead of panicing sayingkib2008-11-261-5/+4
| | | | | | | | the cleanup is needed. Reported by: kris, pho Tested by: pho MFC after: 2 weeks
* - Support IEEE_P1282 and IEEE_1282 tags in the rock ridge extensions record.lulf2008-11-261-2/+6
| | | | | PR: kern/128942 Submitted by: "J.R. Oldroyd" <fbsd - at - opal.com>
* Simplify mode_t check treatment (suggested by trasz).daichi2008-11-251-39/+2
| | | | | | | By semantical view, trasz's code is better than prior one. Submitted by: trasz Reviewed by: Masanori OZAWA <ozawa@ongs.co.jp>
* Fixes Unionfs socket issue reported as kern/118346.daichi2008-11-253-39/+191
| | | | | | | | PR: 118346 Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> Discussed at: devsummit Strassburg, EuroBSDCon2008 Discussed with: rwatson, gnn, hrs MFC after: 2 week
* - Fix a typo in a comment.jhb2008-11-183-9/+5
| | | | | | - Whitespace fix. - Remove #if 0'd BSD 4.x code for flushing busy buffers from a mountpoint during an unmount. FreeBSD uses vflush() for this.
* When looking up the vnode for the device to mount the filesystem on,jhb2008-11-181-12/+8
| | | | | | ask NDINIT to return a locked vnode instead of letting it drop the lock and return a referenced vnode and then relock the vnode a few lines down. This matches the behavior of other filesystem mount routines.
* Remove copy/paste code from UFS to handle sparse blocks. While Rockjhb2008-11-181-6/+0
| | | | | Ridge does support sparse files, the cd9660 code does not currently support them.
* Remove unused i_flags field and IN_ACCESS flag from cd9660 in-memoryjhb2008-11-183-6/+0
| | | | i-nodes. cd9660 doesn't support access times.
* Remove unnecessary locking around vn_fullpath(). The vnode lock for thejhb2008-11-042-11/+6
| | | | | | | | | | | | | | | | vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month
* Don't pass WANTPARENT to the pathname lookup of the mount point for ajhb2008-11-041-4/+1
| | | | | unionfs mount just so we can immediately drop the reference on the parent directory vnode without using it.
* Fix few missed accmode changes in coda.trasz2008-11-033-5/+7
| | | | Approved by: rwatson (mentor)
* Implement support for RPCSEC_GSS authentication to both the NFS clientdfr2008-11-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
* Catch up with netsmb locking: explicit thread arguments no longer required.rwatson2008-11-021-2/+2
|
* Remove the call to getinoquota() from ntfs_access. How did it get there?!trasz2008-11-021-7/+0
| | | | Approved by: rwatson (mentor)
* Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessarytrasz2008-10-2815-67/+69
| | | | | | | to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
* Fix a number of style issues in the MALLOC / FREE commit. I've tried todes2008-10-235-7/+10
| | | | | be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-2327-136/+131
| | | | MFC after: 3 months
* The locking in portalfs's socket connect code is no less correct thanrwatson2008-10-121-1/+0
| | | | | | identical code in connect(2), so remove XXX that it might be incorrect. MFC after: 3 days
* Remove the struct thread unuseful argument from bufobj interface.attilio2008-10-107-30/+27
| | | | | | | | | | | | | | | | | | | | | In particular following functions KPI results modified: - bufobj_invalbuf() - bufsync() and BO_SYNC() "virtual method" of the buffer objects set. Main consumers of bufobj functions are affected by this change too and, in particular, functions which changed their KPI are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Use soconnect2() rather than directly invoking uipc_connect2() torwatson2008-10-061-1/+1
| | | | | | interconnect two UNIX domain sockets. MFC after: 3 days
* Change the linprocfs <pid>/maps and procfs <pid>/map handlers to usekib2008-10-041-29/+7
| | | | | | | | | | | | sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. Patch is forward-ported des@' one, and was adopted to current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week
* Fix Vflags abuse in fdescfs. There should be no functional changes.trasz2008-10-031-3/+1
| | | | Approved by: rwatson (mentor)
* Fix Vflags abuse in cd9660. There should be no functional changes.trasz2008-10-031-7/+7
| | | | Approved by: rwatson (mentor)
* Step 1.5 of importing the network stack virtualization infrastructurezec2008-10-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
OpenPOWER on IntegriCloud