path: root/sys/kern/subr_disk.c
Commit message  [Author, Age, Files, Lines]
* Clarify and reimplement the bioq API so that bioq_disksort() has  [luigi, 2009-02-13, 1 file, -65/+120]
  the correct behaviour (sorting by distance from the current head
  position in the scan direction) and bioq_insert_head() and
  bioq_insert_tail() have a well defined (and useful) behaviour,
  especially when intermixed with calls to bioq_disksort().

  In particular:
  - fix a bug in the existing bioq_disksort() that did not use the
    current head position correctly;
  - redefine semantics of bioq_insert_head() and bioq_insert_tail().
    bioq_insert_tail() can now be used as a barrier between previous
    and subsequent calls to bioq_disksort().

  The code is heavily documented in the source code so please refer
  to that for the details.

  Much of this code comes from Fabio Checconi. Also thanks to Kirk
  for feedback on the (re)definition of bioq_insert_tail().

  NOTE: in the current tree there is only a handful of files which
  intermix calls to bioq_disksort() with bioq_insert_head() and
  bioq_insert_tail(). The ordering of the queue in these situations
  was not specified (nor easy to figure out) before, so I doubt any
  of that code could be affected by the specification of the API.

  Also note that the current implementation is significantly simpler
  than the previous one (also used in ata_sort_queue()). It would be
  useful to reimplement ata_sort_queue() using the same code used in
  bioq_disksort().

  MFC after: 1 week
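  Editorial sketch, not the committed code: the semantics described in
  the entry above (sorted insertion by distance from the head in the
  scan direction, with bioq_insert_tail() acting as a barrier) can be
  modelled in userland roughly as follows. The struct layout, the
  fixed-size array, and every toyq_* name are hypothetical and chosen
  only for illustration; the kernel operates on struct bio lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOYQ_MAX 32	/* arbitrary capacity for the sketch */

/* Hypothetical userland model of a request queue; not the kernel's struct. */
struct toyq {
	uint64_t offs[TOYQ_MAX];	/* queued offsets, in service order */
	size_t	 len;			/* number of queued requests */
	size_t	 barrier;		/* sorted inserts land at index >= barrier */
	uint64_t head_pos;		/* assumed current head position */
};

static void
toyq_init(struct toyq *q, uint64_t head_pos)
{
	memset(q, 0, sizeof(*q));
	q->head_pos = head_pos;
}

/*
 * Distance from the head in the scan direction. Unsigned subtraction
 * wraps, so offsets behind the head get very large keys and sort last.
 */
static uint64_t
toyq_key(const struct toyq *q, uint64_t off)
{
	return (off - q->head_pos);
}

/* Insert off at index idx, shifting later entries up by one. */
static void
toyq_insert_at(struct toyq *q, size_t idx, uint64_t off)
{
	size_t i;

	assert(q->len < TOYQ_MAX && idx <= q->len);
	for (i = q->len; i > idx; i--)
		q->offs[i] = q->offs[i - 1];
	q->offs[idx] = off;
	q->len++;
}

/* Sorted insert: the new entry is placed among entries at or after the barrier. */
static void
toyq_disksort(struct toyq *q, uint64_t off)
{
	size_t i = q->barrier;

	while (i < q->len && toyq_key(q, q->offs[i]) <= toyq_key(q, off))
		i++;
	toyq_insert_at(q, i, off);
}

/* Append unconditionally and move the barrier past the new entry. */
static void
toyq_insert_tail(struct toyq *q, uint64_t off)
{
	toyq_insert_at(q, q->len, off);
	q->barrier = q->len;
}
```

  With the head at offset 100, sorted inserts of 150, 50 and 120 give
  the service order 120, 150, 50. A toyq_insert_tail() of 10 followed
  by sorted inserts of 110 and 105 appends 10, 105, 110, so nothing
  queued after the barrier is served before it.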
* Make bioq_disksort have an ANSI-C definition rather than a K&R definition.  [imp, 2009-02-03, 1 file, -3/+1]
* Add a new I/O request - BIO_FLUSH, which basically tells providers below to  [pjd, 2006-10-31, 1 file, -0/+1]
  flush their caches. For now will mostly be used by disks to flush
  their write cache.

  Sponsored by: home.pl
* Unexpand TAILQ_FIRST(foo) == NULL to TAILQ_EMPTY(foo).  [delphij, 2006-05-29, 1 file, -2/+2]
* When calling bioq_first() to see if a queue is empty in bioq_disksort(),  [rwatson, 2006-01-13, 1 file, -1/+1]
  don't save the return value as we won't use it.

  Noticed by: Coverity Prevent analysis tool
  MFC after: 3 days
* - Fix insertions of bios which represent data earlier than anything else  [jeff, 2005-06-15, 1 file, -7/+4]
  in the queue. The insertion sort assumed this had already been
  taken care of.

  Spotted by: Antoine Brodin
  Approved by: re (scottl)
* - Dramatically simplify bioqdisksort(). We no longer do ordered bios so  [jeff, 2005-06-12, 1 file, -85/+40]
  most of the code to deal with them has been dead for some time.
  Simplify the code by doing an insert sort hinted by the current
  head position.

  Met with apathy by: arch@
* /* -> /*- for copyright notices, minor format tweaks as necessary  [imp, 2005-01-06, 1 file, -1/+1]
* Add bioq_insert_head() function.  [pjd, 2004-12-13, 1 file, -0/+7]
  OK'd by: phk
* Add bioq_takefirst().  [phk, 2004-08-19, 1 file, -6/+11]
  If the bioq is empty, NULL is returned. Otherwise the front element
  is removed and returned.

  This can simplify locking in many drivers from:

      lock()
      bp = bioq_first(bq);
      if (bp == NULL) {
          unlock()
          return
      }
      bioq_remove(bp, bq)
      unlock

  to:

      lock()
      bp = bioq_takefirst(bq);
      unlock()
      if (bp == NULL)
          return;
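  Editorial sketch, not the committed code: a userland model of the
  second locking pattern in the entry above, assuming a pthread mutex
  stands in for the driver lock and a singly linked list stands in for
  the bioq. The reqq_* names, struct req, and next_request() are
  hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request and queue types standing in for bio and bioq. */
struct req {
	struct req *next;
	int	    id;
};

struct reqq {
	pthread_mutex_t	 lock;	/* stands in for the driver's lock */
	struct req	*first;	/* front of the queue, or NULL if empty */
	struct req     **lastp;	/* where the next tail insert is linked */
};

static void
reqq_init(struct reqq *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->first = NULL;
	q->lastp = &q->first;
}

/* Append a request; takes the lock itself. */
static void
reqq_insert_tail(struct reqq *q, struct req *r)
{
	assert(r != NULL);
	pthread_mutex_lock(&q->lock);
	r->next = NULL;
	*q->lastp = r;
	q->lastp = &r->next;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Remove and return the front request, or NULL if the queue is empty.
 * The caller must hold q->lock.
 */
static struct req *
reqq_takefirst(struct reqq *q)
{
	struct req *r = q->first;

	if (r != NULL) {
		q->first = r->next;
		if (q->first == NULL)
			q->lastp = &q->first;
		r->next = NULL;
	}
	return (r);
}

/* The simplified pattern: one lock/unlock pair, NULL check outside it. */
static struct req *
next_request(struct reqq *q)
{
	struct req *r;

	pthread_mutex_lock(&q->lock);
	r = reqq_takefirst(q);
	pthread_mutex_unlock(&q->lock);
	return (r);
}
```

  Because the emptiness check and the removal happen in one call, the
  caller needs only a single unlock path and can test for NULL after
  the lock has been dropped.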
* Report bio_pblkno instead of bio_blkno.  [phk, 2003-10-18, 1 file, -3/+3]
* Make bioq_disksort() sort on the bio_offset field instead of bio_pblkno.  [phk, 2003-10-18, 1 file, -9/+9]
* Made use of 'error' argument, which was unused (by mistake) before.  [phk, 2003-10-14, 1 file, -1/+1]
  Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
* Use __FBSDID().  [obrien, 2003-06-11, 1 file, -3/+3]
* Don't include <sys/disklabel.h>  [phk, 2003-04-16, 1 file, -1/+0]
* Remove BIO_SETATTR from non-GEOM part of kernel as well.  [phk, 2003-04-03, 1 file, -1/+0]
* #include <geom/geom_disk.h>  [phk, 2003-04-01, 1 file, -0/+1]
* Introduce bioq_flush() function.  [phk, 2003-04-01, 1 file, -0/+15]
* retire the "busy" field in bioqueues, it's served its purpose.  [phk, 2003-03-30, 1 file, -8/+0]
* Preparation commit before I start on the bioqueue lockdown:  [phk, 2003-03-30, 1 file, -0/+43]
  Collect all the bits of bioqueue handling in subr_disk.c, vfs_bio.c
  is big enough as it is and disksort already lives in subr_disk.c.
* Including <sys/stdint.h> is (almost?) universally only to be able to use  [phk, 2003-03-18, 1 file, -1/+0]
  %j in printfs, so put a nested include in <sys/systm.h> where the
  printf prototype lives and save everybody else the trouble.
* Don't pick up a name from the dev_t if it is not there.  [phk, 2003-03-03, 1 file, -1/+7]
* NO_GEOM cleanup: remove #ifdef  [phk, 2003-01-30, 1 file, -423/+0]
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.  [alfred, 2003-01-21, 1 file, -1/+1]
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Only include <sys/diskslice.h> ifdef NO_GEOM  [phk, 2003-01-20, 1 file, -1/+1]
* This checkin reimplements the io-request priority hack in a way  [mckusick, 2002-10-22, 1 file, -29/+0]
  that works in the new threaded kernel. It was commented out of the
  disksort routine earlier this year for the reasons given in
  kern/subr_disklabel.c (which is where this code used to reside
  before it moved to kern/subr_disk.c):

  ----------------------------
  revision 1.65
  date: 2002/04/22 06:53:20; author: phk; state: Exp; lines: +5 -0
  Comment out Kirk's io-request priority hack until we can do this in
  a civilized way which doesn't cause grief.

  The problem is that it is not generally safe to cast a
  "struct bio *" to a "struct buf *". Things like ccd, vinum, ata-raid
  and GEOM construct bio's which are not entrails of a struct buf.

  Also, curthread may or may not have anything to do with the I/O
  request at hand.

  The correct solution can either be to tag struct bio's with a
  priority derived from the requesting thread's nice and have disksort
  act on this field, this wouldn't address the "silly-seek syndrome"
  where two equal processes bang the diskheads from one edge to the
  other of the disk repeatedly.

  Alternatively, and probably better: a sleep should be introduced
  either at the time the I/O is requested or at the time it is
  completed where we can be sure to sleep in the right thread.

  The sleep also needs to be in constant timeunits, 1/hz can be
  practically any sub-second size, at high HZ the current code
  practically doesn't do anything.
  ----------------------------

  As suggested in this comment, it is no longer located in the disk
  sort routine, but rather now resides in spec_strategy where the disk
  operations are being queued by the thread that is associated with
  the process that is really requesting the I/O. At that point, the
  disk queues are not visible, so the I/O for positively niced
  processes is always slowed down whether or not there is other
  activity on the disk.

  On the issue of scaling HZ, I believe that the current scheme is
  better than using a fixed quantum of time. As machines and I/O
  subsystems get faster, the resolution on the clock also rises. So,
  ten years from now we will be slowing things down for shorter
  periods of time, but the proportional effect on the system will be
  about the same as it is today. So, I view this as a feature rather
  than a drawback. Hence this patch sticks with using HZ.

  Sponsored by: DARPA & NAI Labs.
  Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>
* One #include <sys/sysctl.h> should be enough.  [cognet, 2002-10-21, 1 file, -1/+0]
  Approved by: mux (mentor)
* Separate fields reported by disk_err() with spaces, so that output doesn't  [sobomax, 2002-10-17, 1 file, -7/+7]
  look cryptic.

  MFC after: 1 week
* Populate more fields of the disklabel for PC98.  [phk, 2002-10-14, 1 file, -0/+2]
  Submitted by: Kawanobe Koh <kawanobe@st.rim.or.jp>
* NB: This commit does *NOT* make GEOM the default in FreeBSD  [phk, 2002-10-05, 1 file, -2/+2]
  NB: But it will enable it in all kernels not having options "NO_GEOM"

  Put the GEOM related options into the intended order.

  Add "options NO_GEOM" to all kernel configs apart from NOTES.

  In some order of controlled fashion, the NO_GEOM options will be
  removed, architecture by architecture in the coming days.

  There are currently three known issues which may force people to
  need the NO_GEOM option:

  boot0cfg/fdisk:
      Tries to update the MBR while it is being used to control
      slices. GEOM does not allow this as a direct operation.

  SCSI floppy drives:
      Apparently the scsi-da driver returns "EBUSY" if no media is
      inserted. This is wrong, it should return ENXIO.

  PC98:
      It is unclear if GEOM correctly recognizes all variants of PC98
      disklabels. (Help Wanted! I have neither docs nor HW)

  These issues are all being worked.

  Sponsored by: DARPA & NAI Labs.
* If dsgetlabel() returns a label with a size of zero in diskdumpconf(),  [brian, 2002-10-05, 1 file, -0/+2]
  treat it as an invalid partition. This fixes a bug where
  ``dumpon <device>'' will configure the dump device at a random
  offset on the disk if <device> isn't a valid partition.

  Reviewed by: phk
* (This commit touches about 15 disk device drivers in a very consistent  [phk, 2002-09-20, 1 file, -1/+7]
  and predictable way, and I apologize if I have gotten it wrong
  anywhere, getting prior review on a patch like this is not feasible,
  considering the number of people involved and hardware availability
  etc.)

  If struct disklabel is the messenger: kill the messenger.

  Inside struct disk we had a struct disklabel which disk drivers used
  to communicate certain metrics to the disklayer above (GEOM or the
  disk mini-layer). This commit changes this communication to use four
  explicit fields instead.

  Amongst the benefits is that the fields do not get overwritten by
  wrong or bogus on-disk disklabels.

  Once that is clear, <sys/disk.h> which is included in the drivers no
  longer needs to pull <sys/disklabel.h> and <sys/diskslice.h> in, the
  few places that need them have gotten explicit #includes for them.

  The disklabel inside struct disk is now only for internal use in the
  disk mini-layer, so instead of embedding it, we malloc it as we need
  it.

  This concludes (modulo any mistakes) the series of disklabel related
  commits.

  I believe it all amounts to a NOP for all the rest of you :-)

  Sponsored by: DARPA & NAI Labs.
* Make FreeBSD "struct disklabel" agnostic, step 312 of 723:  [phk, 2002-09-20, 1 file, -0/+150]
  Rename bioqdisksort() to bioq_disksort().
  Keep a #define around to avoid changing all diskdrivers right now.
  Move it from subr_disklabel.c to subr_disk.c.
  Move prototype from <sys/disklabel.h> to <sys/bio.h>

  Sponsored by: DARPA and NAI Labs.
* Make FreeBSD "struct disklabel" agnostic, step 311 of 723:  [phk, 2002-09-20, 1 file, -3/+40]
  Rename diskerr() to disk_err() for naming consistency.

  Drop the by now entirely useless struct disklabel argument.

  Add a flag argument for new-line termination.

  Fix a couple of printf-format-casts to %j instead of %l.

  Correctly print the name of all bio commands.

  Move the function from subr_disklabel.c to subr_disk.c, and from
  <sys/disklabel.h> to <sys/disk.h>.

  Use the new disk_err() throughout, #include <sys/disk.h> as needed.

  Bump __FreeBSD_version for the sake of the aac disk drivers #ifdefs.

  Remove unused disklabel members of softc for aac, amr and mlx, which
  seem to originally have been intended for diskerr() use, but which
  only rotted and got Copy&Pasted at least two times too many.

  Sponsored by: DARPA & NAI Labs.
* Don't use "NULL" when "0" is really meant.  [archie, 2002-08-21, 1 file, -1/+1]
* Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start  [phk, 2002-04-09, 1 file, -0/+4]
  of the device magic stuff might occupy.

  Sponsored by: DARPA & NAI Labs.
* Rename DIOCGKERNELDUMP to DIOCSKERNELDUMP as it strictly speaking  [phk, 2002-04-09, 1 file, -1/+1]
  is a "set" not a "get" operation.

  Sponsored by: DARPA & NAI Labs.
* Here follows the new kernel dumping infrastructure.  [phk, 2002-03-31, 1 file, -19/+19]
  Caveats:

  The new savecore program is not complete in the sense that it
  emulates enough of the old savecore's features to do the job, but
  implements none of the options yet.

  I would appreciate if a userland hacker could help me out getting
  savecore to do what we want it to do from a user's point of view,
  compression, email-notification, space reservation etc etc. (send me
  email if you are interested).

  Currently, savecore will scan all devices marked as "swap" or "dump"
  in /etc/fstab _or_ any devices specified on the command-line.

  All architectures but i386 lack an implementation of dumpsys(), but
  looking at the i386 version it should be trivial for anybody
  familiar with the platform(s) to provide this function.

  Documentation is quite sparse at this time, more to come.

  Details:

  ATA and SCSI drivers should work as the dump formatting code has
  been removed. The IDA, TWE and AAC have not yet been converted.

  Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
  the device as dumpdev. To implement the "off" argument, /dev/null is
  used as the device.

  Savecore will fail if handed any options since they are not (yet)
  implemented. All devices marked "dump" or "swap" in /etc/fstab will
  be scanned and dumps found will be saved to diskfiles named from the
  MD5 hash of the header record. The header record is dumped in
  readable format in the .info file. The kernel is not saved. Only
  complete dumps will be saved.

  All maintainer rights for this code are disclaimed: feel free to
  improve and extend.

  Sponsored by: DARPA, NAI Labs
* Make the disk_clone() routine more robust for abuse.  [phk, 2002-03-11, 1 file, -33/+26]
  Sneak in a trivial bit of the GEOM stuff while we're here anyway.
* Fix a warning.  [robert, 2002-03-05, 1 file, -2/+0]
* Don't call cdevsw_add().  [phk, 2001-11-04, 1 file, -1/+0]
* Rename the top 7 bits of disk minors to spare bits, rather than type bits.  [phk, 2001-11-04, 1 file, -1/+1]
* Don't choke on old sd%d.ctl devices.  [phk, 2001-11-03, 1 file, -0/+2]
  Tripped over by: Jos Backus <josb@cncdsl.com>
* Turn the symlinks around, instead of ad0s1 -> ad0s1c, make it ad0s1c -> ad0s1.  [phk, 2001-11-02, 1 file, -13/+23]
  Requested by: peter
* Fix a problem in the disk related hack where device nodes for a physically  [phk, 2001-10-28, 1 file, -0/+2]
  non-existent disk in a legacy /dev on a DEVFS system would panic the
  system if stat(2)'ed.

  Do not whine about anonymous device nodes not having a si_devsw,
  they're not supposed to.
* Nudge the axe a bit closer to cdevsw[]:  [phk, 2001-10-27, 1 file, -2/+55]
  Make it a panic to repeat make_dev() or destroy_dev(), this check
  should maybe be neutered when -current goes -stable.

  Whine if devsw() is called on anon dev_t's in a devfs system.

  Make a hack to avoid our lazy-eval disk code triggering the above
  whine.

  Fix the multiple make_dev() in disk code by making
  ${disk}${unit}s${slice} an alias/symlink to
  ${disk}${unit}s${slice}c
* disk_clone() was a bit too eager to please: "md0s1ec" is not a valid  [phk, 2001-10-22, 1 file, -0/+2]
  device.

  Noticed by: Chad David <davidc@acns.ab.ca>
* KSE Milestone 2  [julian, 2001-09-12, 1 file, -7/+7]
  Note ALL MODULES MUST BE RECOMPILED

  make the kernel aware that there are smaller units of scheduling
  than the process. (but only allow one thread per process at this
  time). This is functionally equivalent to the previous -current
  except that there is a thread associated with each process.

  Sorry john! (your next MFC will be a doozy!)

  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
* Don't dump on the label sector or below. This avoids clobbering the  [bde, 2001-08-15, 1 file, -2/+2]
  label if the dump device overlaps the label (which is a slight
  misconfiguration). Dump routines don't use dscheck(), so the normal
  write protection of the label doesn't help.

  Reduced some nearby overflow bugs. In disk_dumpcheck(), there was
  (fatal but fail-safe) overflow on i386's with 4GB of memory, at
  least if Maxmem was the top page (can this happen?). The fix assumes
  that the sector size divides PAGE_SIZE (dump routines already assume
  this). In setdumpdev(), the corresponding overflow occurred with
  only about 2GB of memory on all machines with 32-bit ints. This
  allowed setdumpdev() to succeed when it shouldn't have, but then
  disk_dumpcheck() failed safe later. Except in old versions of
  FreeBSD like RELENG_3 where there is no disk_dumpcheck().

  PR: 28164 (label clobbering part)
  MFC after: 1 week
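  Editorial sketch, not the committed code: the kind of 32-bit
  overflow described in the entry above, and the reordering that
  avoids it when the sector size divides the page size. The function
  names and TOY_PAGE_SIZE are hypothetical, and unsigned arithmetic is
  assumed here only so that the wraparound is well defined in C; the
  expressions used in the kernel are not shown in the log.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/*
 * Naive conversion from pages to sectors: the byte count is formed
 * first, so with 2^20 pages (4GB) the 32-bit product wraps to 0.
 */
static uint32_t
dump_sectors_naive(uint32_t pages, uint32_t secsize)
{
	return (pages * TOY_PAGE_SIZE / secsize);
}

/*
 * Reordered conversion: divide the page size by the sector size
 * first. This relies on the sector size dividing the page size.
 */
static uint32_t
dump_sectors_safe(uint32_t pages, uint32_t secsize)
{
	assert(secsize != 0 && TOY_PAGE_SIZE % secsize == 0);
	return (pages * (TOY_PAGE_SIZE / secsize));
}
```

  For 1048576 pages and 512-byte sectors the naive form yields 0,
  while the reordered form yields 8388608 sectors, since the
  intermediate value never exceeds 32 bits.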
* Remove the hack-around for the slice/label code, it didn't  [phk, 2001-05-29, 1 file, -11/+1]
  cover the hole.