path: root/sys/geom
Commit message | Author | Age | Files | Lines
* Correct bioq_disksort so that bioq_insert_tail() offers barrier semantics. [gibbs, 2010-09-02, 2 files, -10/+28]
  Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

  The barrier semantics of bioq_insert_tail() were broken in two ways:
  o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio.
  o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice.

  sys/geom/sched/subr_disk.c:
  sys/kern/subr_disk.c:
  o Update last_offset in bioq_insert_tail().
  o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active.
  o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to its next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one no-op loop iteration in the while loop that immediately follows.
  o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction.

  sys/sys/bio.h:
  Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

  sys/cam/ata/ata_da.c:
  sys/cam/scsi/scsi_da.c:
  Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set.

  sys/cam/scsi/scsi_da.c:
  Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns.

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:
  sys/geom/geom_io.c:
  Mark bios with the BIO_FLUSH command as BIO_ORDERED.

  Sponsored by: Spectra Logic Corporation
  MFC after: 1 month
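The barrier rules described in this commit can be modelled in a few lines of user-space Python. This is only a sketch written from the commit message, not the kernel code: the `Bio` and `BioQueue` classes, the list-based queue, and the `demo_offsets()` helper are all hypothetical names introduced for illustration, and the 64-bit wrap-around of the sort key is an assumption about how the key is computed.

```python
from dataclasses import dataclass
from typing import List, Optional

U64 = 1 << 64  # assumption: sort keys wrap like an unsigned 64-bit offset


@dataclass(eq=False)  # compare bios by identity, as the kernel compares pointers
class Bio:
    offset: int
    length: int = 0
    ordered: bool = False  # stands in for the BIO_ORDERED flag


class BioQueue:
    """Hypothetical model of a bio queue with a barrier (insert_point)."""

    def __init__(self) -> None:
        self.queue: List[Bio] = []
        self.last_offset = 0
        self.insert_point: Optional[Bio] = None  # most recent barrier, if any

    def _key(self, bio: Bio) -> int:
        # Distance from last_offset; offsets "behind" it wrap to large keys.
        return (bio.offset - self.last_offset) % U64

    def insert_tail(self, bio: Bio) -> None:
        # Appending makes the bio a barrier and pins last_offset to it.
        self.queue.append(bio)
        self.insert_point = bio
        self.last_offset = bio.offset

    def remove(self, bio: Bio) -> None:
        if self.insert_point is None:
            # No barrier active: only removing the head moves last_offset.
            if self.queue and self.queue[0] is bio:
                self.last_offset = bio.offset + bio.length
        elif bio is self.insert_point:
            self.insert_point = None
        self.queue.remove(bio)

    def disksort(self, bio: Bio) -> None:
        if bio.ordered:
            # Ordered bios bypass the sort and become the new barrier.
            self.insert_tail(bio)
            return
        key = self._key(bio)
        # Never sort in front of the barrier: start scanning just after it.
        i = 0
        if self.insert_point is not None:
            i = self.queue.index(self.insert_point) + 1
        while i < len(self.queue) and key >= self._key(self.queue[i]):
            i += 1
        self.queue.insert(i, bio)


def demo_offsets() -> List[int]:
    q = BioQueue()
    for off in (100, 50):
        q.disksort(Bio(off))
    q.disksort(Bio(70, ordered=True))  # e.g. a flush request
    for off in (10, 80):
        q.disksort(Bio(off))
    return [b.offset for b in q.queue]
```

In this model the two bios queued after the ordered one are sorted relative to the barrier's offset (80 before 10), and neither can move ahead of the barrier even though offset 10 is lower than everything queued earlier.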
* Correct offset conversion to little endian. [pjd, 2010-08-28, 2 files, -7/+9]
  It was implemented in version 2, but because of a bug it was a no-op, so we were still using offsets in the host's native byte order. Do it properly this time, bump version to 4 and set the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.
  MFC after: 2 weeks
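The byte-order distinction in this commit can be illustrated with a short Python sketch. It is not the GELI code: `encode_offset()` is a hypothetical helper, and treating the offset as an unsigned 64-bit value is an assumption. It only shows the idea that metadata older than version 4 keeps the host's native byte order, while version 4 and later always use little endian.

```python
import struct


def encode_offset(offset: int, version: int) -> bytes:
    """Encode a 64-bit offset the way the commit describes (hypothetical helper).

    Versions below 4 keep the host's native byte order for compatibility;
    version 4 and later always use little endian.
    """
    fmt = "<Q" if version >= 4 else "=Q"  # "<" little endian, "=" native order
    return struct.pack(fmt, offset)
```

On a little-endian host both branches produce the same bytes, which is presumably why a no-op conversion could go unnoticed there; the difference only shows up on big-endian machines.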
* Remove bintime_cmp() function, unused since r200086. [mav, 2010-08-18, 1 file, -15/+0]
  MFC after: 1 week
* Check that gsp is not NULL before access. [ae, 2010-08-03, 1 file, -1/+1]
  It can be NULL in some cases.
  Approved by: kib (mentor)
  MFC after: 1 week
* Check that table is not NULL before access; it can be NULL in some cases. [ae, 2010-08-03, 1 file, -1/+1]
  Approved by: mav (mentor)
  MFC after: 2 weeks
* Forward ioctl requests to original geom. [ae, 2010-08-02, 1 file, -0/+19]
  PR: 148540
  Silence from: luigi
  Reviewed by: pjd
  Approved by: mav (mentor)
  MFC after: 2 weeks
* Release access for consumers that are opened but will be destroyed indirectly by the orphan method. [ae, 2010-08-02, 1 file, -0/+4]
  PR: 148688
  Silence from: marcel
  Approved by: mav (mentor)
  MFC after: 2 weeks
* Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to GEOM. [mav, 2010-07-25, 2 files, -0/+16]
  This information is needed to read and write soft-RAID on-disk metadata properly.
* Prevent access after free to a table entry when the user deletes a partition that has not yet been created (changes not yet committed to disk). [ae, 2010-07-23, 1 file, -8/+8]
  PR: 148687
  Approved by: mav (mentor)
  MFC after: 7 days
* Fixed decoding of the cache size read from a label. [ru, 2010-07-14, 1 file, -1/+1]
  PR: kern/144732
  Submitted by: Eugene Grosbein
  MFC after: 3 days
* Add NTFS partition type to GEOM_MBR. [rpaulo, 2010-06-26, 3 files, -2/+14]
* 'unit' can be negative, so use signed type for it. [pjd, 2010-06-14, 1 file, -1/+1]
  Found by: Coverity Prevent
  CID: 3731
  MFC after: 3 days
* BIO_DELETE contains the range we want to delete and doesn't provide any useful data, so there is no need to copy it to userland. [pjd, 2010-06-14, 1 file, -1/+1]
  MFC after: 3 days
* Fix a few cases where a string is passed via the format argument instead of via %s. [avg, 2010-06-11, 1 file, -1/+1]
  Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed an intermediate buffer to be dropped.
  Found by: clang
  MFC after: 2 weeks
* Untangle g_print_bio(), silencing Coverity. [trasz, 2010-06-10, 1 file, -8/+7]
  Found with: Coverity Prevent
  CID: 3566, 3567
* Try and narrow the gap in which you act on an event that has been canceled. [mjacob, 2010-06-08, 1 file, -0/+10]
  Obtained from: Jaako Heinonen
  MFC after: 1 month
* Make sure not to pass NULL to g_orphan_provider(). [trasz, 2010-06-05, 1 file, -1/+2]
  Found with: Coverity Prevent
  CID: 3411
* Don't leak memory on destruction. [marius, 2010-06-02, 2 files, -0/+12]
  Reviewed by: marcel
  MFC after: 3 days
* g_label: fix possible NULL pointer dereference in case glabel debug level is >= 1 and gp->provider list is empty for some reason. [avg, 2010-05-31, 1 file, -4/+2]
  Found by: clang static analyzer
  MFC after: 4 days
* Fix some whitespace nits. [marius, 2010-05-24, 1 file, -10/+11]
* Teach gpart about bootcode on APM. [nwhitehorn, 2010-05-16, 1 file, -0/+26]
* Yet another potential dereference of a dead provider. [mjacob, 2010-05-14, 1 file, -1/+1]
  Sponsored by: Panasas
  MFC after: 1 week
* Make sure to check that the active provider pointer points to something before dereferencing the pointer. [mjacob, 2010-05-14, 1 file, -1/+1]
  Sponsored by: Panasas
  MFC after: 1 week
* - Don't return EAGAIN from gv_unload(). It was used to work around the deadlock fixed in r207671. [jh, 2010-05-10, 4 files, -5/+20]
  - Wait for worker process to exit at class unload. The worker process was not guaranteed to exit before the linker unloaded the module.
  - Use 0 as the worker process exit status instead of ENXIO and style the NOTREACHED comment.
  Reviewed by: lulf
  X-MFC after: r207671
* In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case. [jh, 2010-05-10, 1 file, -1/+1]
  EBUSY was probably used as a workaround for the deadlock fixed in r207671.
  Approved by: pjd
  X-MFC after: r207671
* - Remove obsolete flags. [lulf, 2010-05-08, 1 file, -6/+0]
  MFC after: 1 week
* Fix deadlock between GEOM class unloading and withering. [jh, 2010-05-05, 2 files, -47/+54]
  Withering can't proceed while g_unload_class() blocks the event thread. Fix this by not running g_unload_class() as a GEOM event and dropping the topology lock when withering needs to proceed.
  PR: kern/139847
  Silence on: freebsd-geom
* Recalculate the geometry when reprobing as well. [marcel, 2010-04-25, 1 file, -0/+9]
  PR: kern/145452
  Reported by: "Andrey V. Elsukov" <bu7cher@yandex.ru>
* Fix undo for schemes that have internal partitions. [marcel, 2010-04-25, 1 file, -1/+14]
  Internal partitions do not constitute user-visible or active partitions and as such should not prevent undoing pending operations.

  While here, initialize the last usable sector for the placeholder geom based on the null scheme, created to allow undoing the destruction of a scheme. This gives consistent output with "gpart show".

  Based on a patch from: "Andrey V. Elsukov" <bu7cher@yandex.ru>
* Implement the resize verb and add support for resizing partitions for all schemes but EBR. [marcel, 2010-04-23, 8 files, -4/+222]
  Quality work by Andrey!
  Submitted by: "Andrey V. Elsukov" <bu7cher@yandex.ru>
* Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. [jh, 2010-04-19, 1 file, -7/+13]
  Don't assert that the topology lock is held when g_valid_obj() is called from the debugger.
  MFC after: 1 week
* Use lower priority for GELI worker threads. [pjd, 2010-04-15, 1 file, -3/+2]
  This improves system responsiveness under heavy GELI load.
  MFC after: 3 days
* g_io_check: respond to zero pp->mediasize with ENXIO. [avg, 2010-04-15, 1 file, -2/+2]
  Previously this condition was reported with EIO by the bio_offset > mediasize check. Perhaps that check should be extended to bio_offset+bio_length > mediasize.
  MFC after: 1 week
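The order of checks described in this commit can be sketched as follows. This is a hypothetical Python model, not the body of g_io_check(): `check_bounds()` and its `check_end` switch are names introduced here, and the stricter end-of-request test is only the extension the commit message speculates about, so it is off by default.

```python
import errno


def check_bounds(bio_offset: int, bio_length: int, mediasize: int,
                 check_end: bool = False) -> int:
    """Return 0 if the request is acceptable, otherwise an errno value."""
    # A provider with no media is reported as ENXIO before any offset test.
    if mediasize == 0:
        return errno.ENXIO
    # The existing check: the request must not start past the end of the media.
    if bio_offset > mediasize:
        return errno.EIO
    # Speculative extension from the commit message (assumption, disabled by default).
    if check_end and bio_offset + bio_length > mediasize:
        return errno.EIO
    return 0
```

With `check_end` left off, a request that starts inside the media but runs past its end is still accepted by this model, which is the gap the commit message points at.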
* Fix copyright format, as requested by Joel Dahl. [luigi, 2010-04-13, 4 files, -4/+8]
* Make code compile with KTR. [luigi, 2010-04-13, 1 file, -11/+4]
* Bring in geom_sched, support for scheduling disk I/O requests in a device independent manner. [luigi, 2010-04-12, 6 files, -0/+3330]
  Also include an example anticipatory scheduler, gsched_rr, which gives very nice performance improvements in the presence of competing random access patterns.

  This is joint work with Fabio Checconi, developed last year and presented at BSDCan 2009. You can find details in the README file or at http://info.iet.unipi.it/~luigi/geom_sched/
* g_vfs_open: allow only one mount per device vnode. [avg, 2010-04-03, 1 file, -1/+6]
  In other words, deny multiple read-only mounts of the same device. Shared read-only mounts should theoretically be possible, but, unfortunately, cannot be implemented correctly using the current buffer cache code/interface and result in an eventual system crash. Also, using nullfs seems to be a more efficient way to achieve the same goal.

  This gets us back to where we were before GEOM and where other BSDs are.

  Submitted by: pjd (idea for checking for shared mounting)
  Discussed with: phk, pjd
  Silence from: fs@, geom@
  MFC after: 2 weeks
* bo_bsize: revert r205860 and take an alternative approach in getblk. [avg, 2010-04-02, 1 file, -1/+1]
  In r205860 I missed the fact that there is code that strongly assumes that devvp bo_bsize is equal to the underlying provider's sectorsize. In those places it is hard to obtain the sectorsize in an alternative way if devvp bo_bsize is set to something else. So, I am reverting the bo_bsize assignment in g_vfs_open. Instead, in getblk I use DEV_BSIZE block size for b_offset calculation if vp is a disk vp as reported by vn_isdisk. This should coincide with vp being a devvp.

  Reported by: Mykola Dzham <i@levsha.me>
  Tested by: Mykola Dzham <i@levsha.me>
  Pointyhat to: avg
  MFC after: 2 weeks
  X-ToDo: convert bread(devvp) in all fs to use bo_bsize-d blocks
* g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE. [avg, 2010-03-29, 1 file, -1/+1]
  Because of how breadn -> bufstrategy -> g_vfs_strategy are currently implemented, bread on devvp always expects DEV_BSIZE block size. Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media properties or filesystem implementation details.
  Reviewed by: mckusick
  MFC after: 2 weeks
* Change how multipath labels are created and managed. [mjacob, 2010-03-29, 1 file, -76/+38]
  This makes it easier to support various storage boxes which really aren't active-active.

  We only write the label on the *first* provider. For all other providers we just "add" the disk. This also allows for an "add" verb.

  A usage implication is that you should specify the currently active storage path as the first provider.

  Note that this does not add RDAC-like functionality, but better allows for autovolumefailover configurations (additional checkins elsewhere will support this).

  Sponsored by: Panasas
  MFC after: 1 month
* Do not fetch precise time of request start when stats collection is disabled. [mav, 2010-03-24, 1 file, -1/+4]
  Reviewed by: pjd, phk
* Add 'rotate' and 'getactive' verbs to provide some control and information about what the currently active path is. [mjacob, 2010-03-21, 1 file, -0/+87]
  Sponsored by: Panasas
  MFC after: 1 month
* Escape characters unsafe for XML output in GEOM class, instance and provider names. [jh, 2010-03-20, 1 file, -3/+25]
  - Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced with '?'. Those characters are disallowed in XML.
  - '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are replaced with XML numeric character references.

  If the kern.geom.confxml sysctl provides invalid XML, libgeom geom_xml2tree() fails and utilities using it do not work. Unsafe characters are common in msdosfs and cd9660 labels.

  PR: kern/104389
  Submitted by: Doug Steinwand (original version)
  Reviewed by: pjd
  Discussed on: freebsd-geom
  MFC after: 3 weeks
* Simplify loops. [pjd, 2010-03-18, 1 file, -20/+10]
* - Set missing flag when initiating a plex rebuild with the rebuildparity command. [lulf, 2010-03-08, 1 file, -0/+15]
  - Check if plex is already syncing or rebuilding before initiating a parity rebuild or check.
* Please welcome HAST - Highly Available Storage. [pjd, 2010-02-18, 2 files, -74/+134]
  HAST allows data to be stored transparently on two physically separated machines connected over a TCP/IP network. HAST works in Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total.

  HAST operates on block level - it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD.

  For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST.

  Sponsored by: FreeBSD Foundation
  Sponsored by: OMCnet Internet Service GmbH
  Sponsored by: TransIP BV
* - Style fixes. [pjd, 2010-02-18, 1 file, -54/+32]
  - Prefer strlcpy() over strncpy().
* Correct comment. [pjd, 2010-02-18, 1 file, -1/+1]
* Log attach just like we log detach. [pjd, 2010-02-18, 1 file, -0/+1]
* - Give geom_redboot taste of flash/spi. [gonzo, 2010-02-03, 1 file, -1/+2]
  Now there is another provider of redboot partitions. This patch was missed during the merge from projects/mips.