summaryrefslogtreecommitdiffstats
path: root/sys/dev/md
Commit message (Collapse)AuthorAgeFilesLines
* MFC r292128:kib2015-12-261-13/+27
| | | | | | | In md(4) over vnode, correct handling of the unaligned unmapped io requests which page alignment + size is greater than MAXPHYS. Split request up to the size of io which fits into pbuf KVA with alignment, and retry if a part of the bio is left unprocessed.
* MFC r291716, r291724, r291741, r291742ken2015-12-161-71/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In addition to those revisions, add this change to a file that is not in head: sys/ia64/include/bus.h: Guard kernel-only parts of the ia64 machine/bus.h header with #ifdef _KERNEL. This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t. ------------------------------------------------------------------------ r291716 | ken | 2015-12-03 15:54:55 -0500 (Thu, 03 Dec 2015) | 257 lines Add asynchronous command support to the pass(4) driver, and the new camdd(8) utility. CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and completed CCBs may be retrieved via the CAMIOGET ioctl. User processes can use poll(2) or kevent(2) to get notification when I/O has completed. While the existing CAMIOCOMMAND blocking ioctl interface only supports user virtual data pointers in a CCB (generally only one per CCB), the new CAMIOQUEUE ioctl supports user virtual and physical address pointers, as well as user virtual and physical scatter/gather lists. This allows user applications to have more flexibility in their data handling operations. Kernel memory for data transferred via the queued interface is allocated from the zone allocator in MAXPHYS sized chunks, and user data is copied in and out. This is likely faster than the vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in configurations with many processors (there are more TLB shootdowns caused by the mapping/unmapping operation) but may not be as fast as running with unmapped I/O. The new memory handling model for user requests also allows applications to send CCBs with request sizes that are larger than MAXPHYS. The pass(4) driver now limits queued requests to the I/O size listed by the SIM driver in the maxio field in the Path Inquiry (XPT_PATH_INQ) CCB. There are some things things would be good to add: 1. Come up with a way to do unmapped I/O on multiple buffers. Currently the unmapped I/O interface operates on a struct bio, which includes only one address and length. It would be nice to be able to send an unmapped scatter/gather list down to busdma. This would allow eliminating the copy we currently do for data. 2. Add an ioctl to list currently outstanding CCBs in the various queues. 3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do that. 4. Test physical address support. Virtual pointers and scatter gather lists have been tested, but I have not yet tested physical addresses or scatter/gather lists. 5. Investigate multiple queue support. At the moment there is one queue of commands per pass(4) device. If multiple processes open the device, they will submit I/O into the same queue and get events for the same completions. This is probably the right model for most applications, but it is something that could be changed later on. Also, add a new utility, camdd(8) that uses the asynchronous pass(4) driver interface. This utility is intended to be a basic data transfer/copy utility, a simple benchmark utility, and an example of how to use the asynchronous pass(4) interface. It can copy data to and from pass(4) devices using any target queue depth, starting offset and blocksize for the input and ouptut devices. It currently only supports SCSI devices, but could be easily extended to support ATA devices. It can also copy data to and from regular files, block devices, tape devices, pipes, stdin, and stdout. It does not support queueing multiple commands to any of those targets, since it uses the standard read(2)/write(2)/writev(2)/readv(2) system calls. The I/O is done by two threads, one for the reader and one for the writer. The reader thread sends completed read requests to the writer thread in strictly sequential order, even if they complete out of order. That could be modified later on for random I/O patterns or slightly out of order I/O. camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from the pass(4) driver and also to send request notifications internally. For pass(4) devcies, camdd(8) uses a single buffer (CAM_DATA_VADDR) per CAM CCB on the reading side, and a scatter/gather list (CAM_DATA_SG) on the writing side. In addition to testing both interfaces, this makes any potential reblocking of I/O easier. No data is copied between the reader and the writer, but rather the reader's buffers are split into multiple I/O requests or combined into a single I/O request depending on the input and output blocksize. For the file I/O path, camdd(8) also uses a single buffer (read(2), write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list (readv(2), writev(2), preadv(2), pwritev(2)) on writes. Things that would be nice to do for camdd(8) eventually: 1. Add support for I/O pattern generation. Patterns like all zeros, all ones, LBA-based patterns, random patterns, etc. Right Now you can always use /dev/zero, /dev/random, etc. 2. Add support for a "sink" mode, so we do only reads with no writes. Right now, you can use /dev/null. 3. Add support for automatic queue depth probing, so that we can figure out the right queue depth on the input and output side for maximum throughput. At the moment it defaults to 6. 4. Add support for SATA device passthrough I/O. 5. Add support for random LBAs and/or lengths on the input and output sides. 6. Track average per-I/O latency and busy time. The busy time and latency could also feed in to the automatic queue depth determination. sys/cam/scsi/scsi_pass.h: Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue and fetch asynchronous CAM CCBs respectively. Although these ioctls do not have a declared argument, they both take a union ccb pointer. If we declare a size here, the ioctl code in sys/kern/sys_generic.c will malloc and free a buffer for either the CCB or the CCB pointer (depending on how it is declared). Since we have to keep a copy of the CCB (which is fairly large) anyway, having the ioctl malloc and free a CCB for each call is wasteful. sys/cam/scsi/scsi_pass.c: Add asynchronous CCB support. Add two new ioctls, CAMIOQUEUE and CAMIOGET. CAMIOQUEUE adds a CCB to the incoming queue. The CCB is executed immediately (and moved to the active queue) if it is an immediate CCB, but otherwise it will be executed in passstart() when a CCB is available from the transport layer. When CCBs are completed (because they are immediate or passdone() if they are queued), they are put on the done queue. If we get the final close on the device before all pending I/O is complete, all active I/O is moved to the abandoned queue and we increment the peripheral reference count so that the peripheral driver instance doesn't go away before all pending I/O is done. The new passcreatezone() function is called on the first call to the CAMIOQUEUE ioctl on a given device to allocate the UMA zones for I/O requests and S/G list buffers. This may be good to move off to a taskqueue at some point. The new passmemsetup() function allocates memory and scatter/gather lists to hold the user's data, and copies in any data that needs to be written. For virtual pointers (CAM_DATA_VADDR), the kernel buffer is malloced from the new pass(4) driver malloc bucket. For virtual scatter/gather lists (CAM_DATA_SG), buffers are allocated from a new per-pass(9) UMA zone in MAXPHYS-sized chunks. Physical pointers are passed in unchanged. We have support for up to 16 scatter/gather segments (for the user and kernel S/G lists) in the default struct pass_io_req, so requests with longer S/G lists require an extra kernel malloc. The new passcopysglist() function copies a user scatter/gather list to a kernel scatter/gather list. The number of elements in each list may be different, but (obviously) the amount of data stored has to be identical. The new passmemdone() function copies data out for the CAM_DATA_VADDR and CAM_DATA_SG cases. The new passiocleanup() function restores data pointers in user CCBs and frees memory. Add new functions to support kqueue(2)/kevent(2): passreadfilt() tells kevent whether or not the done queue is empty. passkqfilter() adds a knote to our list. passreadfiltdetach() removes a knote from our list. Add a new function, passpoll(), for poll(2)/select(2) to use. Add devstat(9) support for the queued CCB path. sys/cam/ata/ata_da.c: Add support for the BIO_VLIST bio type. sys/cam/cam_ccb.h: Add a new enumeration for the xflags field in the CCB header. (This doesn't change the CCB header, just adds an enumeration to use.) sys/cam/cam_xpt.c: Add a new function, xpt_setup_ccb_flags(), that allows specifying CCB flags. sys/cam/cam_xpt.h: Add a prototype for xpt_setup_ccb_flags(). sys/cam/scsi/scsi_da.c: Add support for BIO_VLIST. sys/dev/md/md.c: Add BIO_VLIST support to md(4). sys/geom/geom_disk.c: Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size limiting code in g_disk_start() a bit. sys/kern/subr_bus_dma.c: Change _bus_dmamap_load_vlist() to take a starting offset and length. Add a new function, _bus_dmamap_load_pages(), that will load a list of physical pages starting at an offset. Update _bus_dmamap_load_bio() to allow loading BIO_VLIST bios. Allow unmapped I/O to start at an offset. sys/kern/subr_uio.c: Add two new functions, physcopyin_vlist() and physcopyout_vlist(). sys/pc98/include/bus.h: Guard kernel-only parts of the pc98 machine/bus.h header with #ifdef _KERNEL. This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t. sys/sys/bio.h: Add a new bio flag, BIO_VLIST. sys/sys/uio.h: Add prototypes for physcopyin_vlist() and physcopyout_vlist(). share/man/man4/pass.4: Document the CAMIOQUEUE and CAMIOGET ioctls. usr.sbin/Makefile: Add camdd. usr.sbin/camdd/Makefile: Add a makefile for camdd(8). usr.sbin/camdd/camdd.8: Man page for camdd(8). usr.sbin/camdd/camdd.c: The new camdd(8) utility. Sponsored by: Spectra Logic ------------------------------------------------------------------------ r291724 | ken | 2015-12-03 17:07:01 -0500 (Thu, 03 Dec 2015) | 6 lines Fix typos in the camdd(8) usage() function output caused by an error in my diff filter script. Sponsored by: Spectra Logic ------------------------------------------------------------------------ r291741 | ken | 2015-12-03 22:38:35 -0500 (Thu, 03 Dec 2015) | 10 lines Fix g_disk_vlist_limit() to work properly with deletes. Add a new bp argument to g_disk_maxsegs(), and add a new function, g_disk_maxsize() tha will properly determine the maximum I/O size for a delete or non-delete bio. Submitted by: will Sponsored by: Spectra Logic ------------------------------------------------------------------------ ------------------------------------------------------------------------ r291742 | ken | 2015-12-03 22:44:12 -0500 (Thu, 03 Dec 2015) | 5 lines Fix a style issue in g_disk_limit(). Noticed by: bdrewery ------------------------------------------------------------------------ Sponsored by: Spectra Logic
* MFC r258909:trasz2015-10-181-0/+39
| | | | | | | | Tweak mdconfig(8) manual page, in particular revise the EXAMPLES section. This removes stuff that doesn't really belong there, and simplifies examples for the basic operations. Sponsored by: The FreeBSD Foundation
* MFC r286720:ae2015-08-281-3/+6
| | | | | | Use g_conf_printf_escaped() to escape illegal symbols in file name. PR: 202289
* MFC r269190:kib2014-08-041-1/+3
| | | | | For md(4), posix shm(3) and tmpfs(5), free swap space used by paged in dirty page, which is written by the process.
* MFC Alexander Motin's GEOM direct dispatch work:scottl2014-01-071-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r256603: Introduce new function devstat_end_transaction_bio_bt(), adding new argument to specify present time. Use this function to move binuptime() out of lock, substantially reducing lock congestion when slow timecounter is used. r256606: Move g_io_deliver() out of the lock, as required for direct dispatch. Move g_destroy_bio() out too to reduce lock scope even more. r256607: Fix passing uninitialized bio_resid argument to g_trace(). r256610: Add unmapped I/O support to GEOM RAID. r256830: Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodonne() when unmapping temporary mapped buffer. That fixes double unmap if biodone() called twice for the same BIO (but with different done methods). r256880: Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O. r259247: Fix bug introduced at r256607. We have to recalculate bp_resid here since sizes of original and completed requests may differ due to end of media. Testing of the stable/10 merge was done by Netflix, but all of the credit goes to Alexander and iX Systems. Submitted by: mav Sponsored by: iX Systems
* Give the page allocations initiated by the swap-backed md(4) a higherkib2013-08-301-1/+1
| | | | | | | | priority. If the write is requested by a system daemon, sleeping there would starve resources and cause deadlock. Reported and tested by: pho Sponsored by: The FreeBSD Foundation
* Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9).kib2013-08-221-2/+1
| | | | | | | | The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation
* The soft and hard busy mechanism rely on the vm object lock to work.attilio2013-08-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
* Fix the data corruption on the swap-backed md.kib2013-05-241-1/+7
| | | | | | | | | | | Assign the rv variable a success code if the pager was not asked for the page. Using an error code from the previous processed page caused zeroing of the valid page, when e.g. the previous page was not available in the pager. Reported by: lstewart Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Do not declare that preloaded md(4) supports unmapped bio requests, itkib2013-04-021-1/+9
| | | | | | | does not. Reported by: <mh@kernel32.de> Sponsored by: The FreeBSD Foundation
* Support unmapped i/o for the md(4).kib2013-03-191-40/+204
| | | | | | | | | | The vnode-backed md(4) has to map the unmapped bio because VOP_READ() and VOP_WRITE() interfaces do not allow to pass unmapped requests to the filesystem. Vnode-backed md(4) uses pbufs instead of relying on the bio_transient_map, to avoid usual md deadlock. Sponsored by: The FreeBSD Foundation Tested by: pho, scottl
* Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() toattilio2013-02-201-8/+8
| | | | | | their "write" versions. Sponsored by: EMC / Isilon storage division
* Switch vm_object lock to be a rwlock.attilio2013-02-201-0/+1
| | | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* Print correct unit number when attaching preloaded memory disks.jh2012-11-211-6/+7
| | | | Retire now unused mdunits variable.
* Disallow attaching preloaded memory disks via ioctl.jh2012-11-211-23/+6
| | | | | | | | | | | | - The feature is dangerous because the kernel code didn't check validity of the memory address provided from user space. - It seems that mdconfig(8) never really supported attaching preloaded memory disks. - Preloaded memory disks are automatically attached during md(4) initialization. Thus there shouldn't be much use for the feature. PR: kern/169683 Discussed on: freebsd-hackers
* Zero the newly allocated md(4) swap-backed page to prevent randomkib2012-11-081-0/+9
| | | | | | | | | | | kernel memory leakage to userspace. For the typical use, when a filesystem put on the md disk, the change only results in CPU and memory bandwidth spent to zero the page, since filsystems make sure that user never see unwritten content. But if md disk is used as raw device by userspace, the garbage is exposed. Reported by: Paul Schenkeveld <freebsd@psconsult.nl> MFC after: 2 weeks
* Add a MD_ROOT_FSTYPE kernel option. The option specifies themarcel2012-11-031-1/+5
| | | | | file system part for the MD_ROOT mount string. Hardcoding the the file system type as "ufs" is too restrictive.
* Remove the support for using non-mpsafe filesystem modules.kib2012-10-221-15/+3
| | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
* After the PHYS_TO_VM_PAGE() function was de-inlined, the main reasonkib2012-08-051-2/+1
| | | | | | | | | | | | | to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
* Remove verbose unused commented out debugging printf.kib2012-08-041-6/+0
| | | | | MFC after: 1 week Reviewed by: alc
* Disallow sectorsize larger than MAXPHYS and mediasize smaller thanjh2012-08-021-6/+12
| | | | | | | | sectorsize. PR: 169947 Submitted by: Filip Palian (original version) Reviewed by: kib
* Make it possible to resize md(4) devices.trasz2012-07-071-1/+71
| | | | | Reviewed by: kib Sponsored by: FreeBSD Foundation
* Document a large number of currently undocumented sysctls. While hereeadler2011-12-131-2/+4
| | | | | | | | | | | | fix some style(9) issues and reduce redundancy. PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week
* Add information about MD_READONLY and MD_COMPRESS flags to theae2011-10-311-0/+5
| | | | | | configuration dump. MFC after: 1 week
* Include sys/sbuf.h directly.ae2011-07-111-0/+1
|
* Move the ZERO_REGION_SIZE to a machine-dependent file, as on manymdf2011-05-131-0/+2
| | | | | | | | | | | | | | | | | | architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853
* Usa a globally visible region of zeros for both /dev/zero and the mdmdf2011-05-131-5/+3
| | | | | | | | device. There are likely other kernel uses of "blob of zeros" than can be converted. Reviewed by: alc MFC after: 1 week
* Implement BIO_DELETE for vnode devices by simply overwriting the deleteddes2011-04-291-0/+42
| | | | | | | | | | | | sectors with all-zeroes. The zeroes come from a static buffer; null(4) uses a dynamic buffer for the same purpose (for /dev/zero). It might be a good idea to have a static, shared, read-only all-zeroes page somewhere in the kernel that md(4), null(4) and any other code that needs zeroes could use. Reviewed by: kib MFC after: 3 weeks
* Use the preload_fetch_addr() and preload_fetch_size() conveniencemarcel2011-02-091-10/+9
| | | | | | | functions and only create the MD device when we have a non-zero pointer and size. Sponsored by: Juniper Networks
* Add support for BIO_DELETE on swap-backed md(4). In the case of BIO_DELETEkib2011-01-271-6/+10
| | | | | | | | | | | covering the whole page, free the page. Otherwise, clear the region and mark it clean. Not marking the page dirty could reinstantiate cleared data, but it is allowed by BIO_DELETE specification and saves unneeded write to swap. Reviewed by: alc Tested by: pho MFC after: 2 weeks
* Bio shall not be accessed after g_io_deliver(9).kib2011-01-251-1/+1
| | | | | | Reported and tested by: pho Reviewed by: ae, phk MFC after: 1 week
* Add missed ().kib2011-01-191-2/+2
| | | | | Noted by: alc MFC after: 3 days
* There is no point in calling vm_object_set_writeable_dirty() on an objectalc2011-01-191-1/+0
| | | | | | | that is definitively known to be swap backed since its only effects are on vnode-backed objects. Reviewed by: kib
* Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4).kib2010-12-291-2/+3
| | | | | | | | Non-zero value of attribute means that device supports BIO_DELETE. Suggested and reviewed by: pjd Tested by: pho MFC after: 1 week
* Add sysctl vm.md_malloc_wait, non-zero value of which switches malloc-backedkib2010-12-291-3/+8
| | | | | | | | | | | md(4) to using M_WAITOK malloc calls. M_NOWAITOK allocations may fail when enough memory could be freed, but not immediately. E.g. SU UFS becomes quite unhappy when metadata write return error, that would happen for failed malloc() call. Reported and tested by: pho MFC after: 1 week
* Allow the MDIOCATTACH ioctl operation to originate from within the kernel.marcel2010-10-181-8/+16
| | | | | | To protect against malicious software, we demand that the file name is at a particular location (i.e. appended to the mdio structure) for it to be treated as in-kernel.
* - Remove some extra white space.jh2010-07-261-9/+7
| | | | - Wrap g_md_dumpconf() prototype to 80 columns.
* Convert md(4) to use alloc_unr(9) and alloc_unr_specific(9) for unitjh2010-07-221-12/+20
| | | | | | | number allocation. The old approach had some problems such as it allowed an overflow to occur in the unit number calculation. PR: kern/122288
* Calculate nshift only once.kib2010-07-061-4/+6
| | | | | Also noted by: avg MFC after: 1 week
* Eliminate unnecessary page queues locking.alc2010-06-151-3/+1
|
* Lock the page around vm_page_activate() and vm_page_deactivate() callskib2010-05-031-0/+2
| | | | | | | where it was missed. The wrapped fragments now protect wire_count with page lock. Reviewed by: alc
* Fix panic on invalid 'mdconfig -at preload' usage.trasz2010-02-271-0/+2
| | | | PR: kern/80136
* (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.antoine2009-12-281-1/+1
| | | | | | | | | Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month
* Implement global and per-uid accounting of the anonymous memory. Addkib2009-06-231-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
* Add cpu_flush_dcache() for use after non-DMA based I/O so that amarcel2009-05-181-3/+6
| | | | | | | | | | | | | | | | | | | | | possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.
* Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that ajhb2009-03-111-10/+20
| | | | | | | | | | | | | | | | | | | | | | | | filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors. - When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set. - Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT. - Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set. - Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock. - Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.) - Enable extended shared operations on UFS, cd9660, and UDF. Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month
* Remove unnecessary page queues locking around vm_page_wakeup(). (Thisalc2009-02-221-7/+1
| | | | | | change is applicable to RELENG_7 but not RELENG_6.) MFC after: 1 week
* Add the possibility to specify "-o force" with "mdconfig -du".trasz2009-01-101-2/+4
| | | | | | Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
* Fix forced mdconfig -du. E.g. the following would previouslytrasz2008-12-161-1/+4
| | | | | | | | | | | | result in panic: mdconfig -af blah.img -o force mount /dev/md0 /mnt mdconfig -du 0 Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation
OpenPOWER on IntegriCloud