path: root/sys/geom
Commit message (Collapse)  Author  Age  Files  Lines
* Close race between geom destruction on g_vfs_close() when softc destroyed  mav  2011-12-02  1  -1/+3
  and g_vfs_orphan() call that tries to access softc, introduced at r227015.
  PR: kern/162997
* Add an ability to increase number of allocated APM entries when we  ae  2011-11-28  1  -30/+47
  have reserved free space in the APM area. Also, instead of one write request per APM entry, use MAXPHYS-sized writes when updating the APM.
  MFC after: 1 month
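The batching described in this commit can be sketched roughly as follows. This is a minimal userland illustration, not the kernel code: `SKETCH_MAXPHYS` assumes the traditional 128 KiB value, and `write_fn`, `write_table_chunked()` and `count_bytes()` are hypothetical names.

```c
#include <stddef.h>

/* Assumption: 128 KiB, the traditional FreeBSD MAXPHYS default. */
#define SKETCH_MAXPHYS (128 * 1024)

/* Hypothetical callback standing in for one write request. */
typedef int (*write_fn)(size_t offset, size_t length, void *arg);

/*
 * Write `total` bytes of a partition map in chunks of at most
 * SKETCH_MAXPHYS bytes, instead of one request per entry.
 * Returns the number of requests issued, or -1 on the first error.
 */
static int
write_table_chunked(size_t total, write_fn do_write, void *arg)
{
	size_t off, len;
	int nreq = 0;

	for (off = 0; off < total; off += len) {
		len = total - off;
		if (len > SKETCH_MAXPHYS)
			len = SKETCH_MAXPHYS;
		if (do_write(off, len, arg) != 0)
			return (-1);
		nreq++;
	}
	return (nreq);
}

/* Illustrative writer: only accumulates the number of bytes "written". */
static int
count_bytes(size_t offset, size_t length, void *arg)
{
	(void)offset;
	*(size_t *)arg += length;
	return (0);
}
```

With 512-byte entries, a map of 1000 entries would go out in a handful of requests rather than 1000.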
* The size of APM could be bigger than number of already allocated entries.  ae  2011-11-28  1  -1/+1
  And the first usable sector should not start from the inside of APM area.
  MFC after: 1 month
* Temporary revert r227009 to fix freeze on UP systems without PREEMPTION.  mav  2011-11-14  1  -27/+12
  Before r215687, if some withered geom or provider could not be destroyed, the g_event thread went to sleep for 0.1s before retrying. After that change it just restarts immediately. r227009 made an orphaned (withered) provider detach not immediately, but only after a context switch. That made the loop inside the g_event thread infinite on UP systems without PREEMPTION.
  To address the original possible deadlock that r227009 targeted, we have to fix the r215687 change first, which needs some time to think through and test.
* Major GEOM MULTIPATH class rewrite:  mav  2011-11-12  2  -120/+616
  - Improved locking and destruction process to fix crashes.
  - Improved "automatic" configuration method to make it consistent and safe by reading metadata back from all specified paths after writing to one.
  - Added provider size check to reduce chance of ordering conflict with other GEOM classes.
  - Added "manual" configuration method without using on-disk metadata.
  - Added "add" and "remove" commands to allow managing paths manually.
  - Failed paths are no longer dropped from geom, but only marked as FAIL and excluded from I/O operations.
  - Automatically restore failed paths when all other paths are marked as failed, for example, because of device-caused (not transport) errors.
  - Added "fail" and "restore" commands to manually control FAIL flag.
  - geom is now destroyed on last path disconnection.
  - Added optional Active/Active mode support. Unlike Active/Passive mode, load is evenly distributed between all working paths. If supported by the device, it can significantly improve performance by utilizing the bandwidth of all paths. It is controlled by the -A option during creation. Disabled by default for now.
  - Improved `status` and `list` commands output.
  Sponsored by: iXsystems, inc.
  MFC after: 1 month
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.  ed  2011-11-07  19  -22/+35
  The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.  ed  2011-11-07  1  -1/+1
  This means that their use is restricted to a single C file.
* Add mutex and two flags to make orphan() call properly asynchronous:  mav  2011-11-02  1  -22/+65
  - delay consumer closing and detaching on orphan() until all I/Os complete;
  - prevent new I/Os submission after orphan() called.
  Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
* Make orphan() method in geom_dev asynchronous using destroy_dev_sched_cb()  mav  2011-11-01  1  -12/+27
  instead of destroy_dev(). It moves waiting for device destruction out of the topology lock and so fixes a deadlock between orphanization and closing. The real provider and geom destruction is called from swi context, after the device is destroyed, as the callback of destroy_dev_sched_cb().
* Refactor disk disconnection and geom destruction handling sequences.  mav  2011-11-01  2  -50/+44
  Do not close/destroy an opened consumer directly in case of disconnect. Instead keep it in existence until it is closed in the regular way in response to upstream provider destruction. Delay geom destruction in the same way.
  Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
* Refactor disk disconnection and geom destruction handling sequences.  mav  2011-11-01  1  -55/+46
  Do not close/destroy an opened consumer directly in case of disconnect. Instead keep it in existence until it is closed in the regular way in response to upstream provider destruction. Delay geom destruction in the same way.
  Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.
* Workaround the problem introduced by combination of r162200 and r215687.  mav  2011-11-01  1  -1/+1
  r162200 delays provider orphanization until all running requests complete, to work around a broken orphan() method implementation in some classes. r215687 removes persistent periodic (10Hz) event thread wake-ups. Together these changes can indefinitely delay orphanization until some other event wakes up the event thread. One consequence of this is the inability of CAM to destroy a device disconnected while busy and, as a result, to create a new one after reconnection.
  While the best solution would be to revert r162200, that is not easy, as some classes still look broken in that way. Instead, conditionally wake up the event thread if there are providers waiting for orphanization.
  MFC after: 1 week
* Our geom withering function could take some time before geom with its  ae  2011-10-28  1  -0/+4
  providers and consumers will be destroyed. Before taking any action on a geom, check that it is not destroyed at the moment.
  Tested by: nwhitehorn
  MFC after: 1 week
* Before this change when GELI detected hardware crypto acceleration it will  pjd  2011-10-27  2  -12/+5
  start only one worker thread. For software crypto it will start by default N worker threads where N is the number of available CPUs.
  This is not optimal if hardware crypto is AES-NI, which uses CPU for AES calculations.
  Change that to always start one worker thread for every available CPU. Number of worker threads per GELI provider can be easily reduced with kern.geom.eli.threads sysctl/tunable and even for software crypto it should be reduced when using more providers.
  While here, when number of threads exceeds number of CPUs available don't reduce this number, assume the user knows what he is doing.
  Reported by: Yuri Karaban <dev@dev97.com>
  MFC after: 3 days
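The resulting policy can be sketched as a small helper. This is an illustration under stated assumptions, not the GELI source: `eli_worker_count()` is a hypothetical name, and treating a tunable value of 0 as "not set" is an assumption made for the sketch.

```c
/*
 * Hypothetical helper: choose how many worker threads to start.
 * `tunable` models kern.geom.eli.threads (0 is assumed to mean
 * "not set"); `ncpus` is the number of available CPUs.
 */
static unsigned
eli_worker_count(unsigned tunable, unsigned ncpus)
{
	if (tunable == 0)
		return (ncpus);	/* one worker per CPU, hardware crypto or not */
	return (tunable);	/* not clamped to ncpus: trust the user */
}
```

Note that the hardware/software crypto distinction no longer appears as an input, which is the point of the change.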
* Clarify disks/volumes above 2TiB support in geom_raid:  mav  2011-10-26  3  -23/+66
  - add support for volumes above 2TiB with Promise metadata format;
  - enforce and document other limitations:
    - Intel and Promise metadata formats do not support disks above 2TiB;
    - NVIDIA metadata format does not support volumes above 2TiB.
  Sponsored by: iXsystems, Inc.
  MFC after: 2 weeks
* Allow upper layers to discover that BIO_DELETE and/or BIO_FLUSH is not  pjd  2011-10-25  1  -3/+3
  supported by returning EOPNOTSUPP instead of 0 or ENODEV.
  MFC after: 3 days
* Improve style a bit.  pjd  2011-10-25  1  -5/+7
  MFC after: 3 days
* Simplify disk_alloc().  pjd  2011-10-25  1  -4/+2
  MFC after: 3 days
* Add support for creating GELI devices with older metadata version for use  pjd  2011-10-25  2  -5/+56
  with older FreeBSD versions:
  - Add -V option to 'geli init' to specify version number. If no -V is given the most recent version is used.
  - If -V is given don't allow to use features not supported by this version.
  - Print version in 'geli list' output.
  - Update manual page and add table describing which GELI version is supported by which FreeBSD version, so one can use it when preparing GELI device for older FreeBSD version.
  Inspired by: Garrett Cooper <yanegomi@gmail.com>
  MFC after: 3 days
* When decoding metadata, check magic string, so we know this is not GELI device  pjd  2011-10-25  1  -0/+2
  before we check its version. We don't want to report that some garbage is an unsupported version if this is not even a GELI provider.
  MFC after: 3 days
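The ordering of the two checks can be sketched as follows. This is a minimal illustration with hypothetical names (`struct sketch_md`, `SKETCH_MAGIC`, `SKETCH_VERSION_MAX`, `sketch_md_decode()`); the EINVAL return for a bad magic string is an assumption, while EOPNOTSUPP for a too-new version follows another entry in this log.

```c
#include <errno.h>
#include <string.h>

#define SKETCH_MAGIC		"SKETCH::MD"	/* hypothetical magic string */
#define SKETCH_VERSION_MAX	7u		/* hypothetical highest supported version */

/* Hypothetical, simplified on-disk metadata header. */
struct sketch_md {
	char		magic[16];
	unsigned	version;
};

static int
sketch_md_decode(const struct sketch_md *md)
{
	/* Check the magic first: garbage is not "an unsupported version". */
	if (strncmp(md->magic, SKETCH_MAGIC, sizeof(md->magic)) != 0)
		return (EINVAL);	/* assumption: not our metadata at all */
	/* Only now is the version field meaningful. */
	if (md->version > SKETCH_VERSION_MAX)
		return (EOPNOTSUPP);	/* ours, but newer than we support */
	return (0);
}
```

Checking in this order keeps the two failure modes distinguishable for the caller.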
* Prefer G_ELI_VERSION_* defines for version numbers over plain digits.  pjd  2011-10-25  1  -3/+5
  MFC after: 3 days
* Fit lines into 80 chars.  pjd  2011-10-25  1  -4/+6
  MFC after: 3 days
* When metadata is at newer version than the highest supported, return  pjd  2011-10-25  1  -1/+1
  EOPNOTSUPP when decoding.
  MFC after: 3 days
* Add support for Boot Camp. The support is defined as follows:  marcel  2011-10-23  1  -97/+213
  o Detect when Boot Camp is enabled (i.e. the MBR mirrors the GPT).
  o When Boot Camp is enabled, update the MBR whenever we write the GPT.
  o Creation of a Boot Camp enabled GPT is not supported.
  o Automatically disable Boot Camp when the GPT has been changed so that there's either no EFI partition or no HFS+ partition.
  o The first 4 partitions (by index) get mirrored in the MBR.
  Requested by, discussed with and tested by: kris@pcbsd.org
  MFC after: 1 week
* Allow to dump on Solaris swap partitions.  marius  2011-10-18  1  -1/+2
  PR: 161764
  Submitted by: Peter Jeremy
* Add some spare fields to the g_class and g_geom structures needed to implement  pjd  2011-07-17  1  -0/+5
  direct I/O handling and provider's property changes handling.
* Remove include of sys/sbuf.h from geom/geom.h.  ae  2011-07-11  1  -1/+0
  sbuf support is not always required for geom/geom.h users, and there is no need to depend on it.
  PR: kern/158398
* Include sys/sbuf.h directly.  ae  2011-07-11  25  -0/+27
  Reviewed by: pjd
* Allow disk partitions associated with UFS read-only mounted  mckusick  2011-07-10  1  -1/+1
  filesystems to be opened for writing. This functionality used to be special-cased for just the root filesystem, but with this change is now available for all UFS filesystems. This change is needed for journaled soft updates recovery.
  Discussed with: Jeff Roberson
* Initialize elements of state array when creating the GPT table.  ae  2011-06-29  1  -34/+33
  This fixes the problem where the secondary GPT header is not erased when the partition table is destroyed. Move the operations common to g_part_gpt_create and g_part_gpt_recover into the separate function g_gpt_set_defaults.
  Reported by: dwhite
  MFC after: 1 week
* EBR could contain an early stage of boot code. But we do not support it.  ae  2011-06-27  1  -16/+20
  Remove the message about non-empty bootcode: we cannot break anything while GEOM_PART_EBR_COMPAT is defined. But without GEOM_PART_EBR_COMPAT any changes in EBR are allowed and we can accidentally wipe the boot code. To avoid breaking anything, save the first EBR chunk and keep it untouched each time we change the EBR. Note that we still do not support boot code for EBR.
  PR: kern/141235
  MFC after: 1 month
* MS Windows NT+ uses 4 bytes at offset 0x1b8 in the MBR to identify  ae  2011-06-27  1  -6/+8
  disk drive. The boot0cfg(8) utility preserves these 4 bytes when writing bootcode, to keep multiboot working. Change gpart's bootcode method to keep the DSN if it is not zero. Also do not allow writing bootcode with size not equal to MBRSIZE.
  PR: kern/157819
  Tested by: Eir Nym
  MFC after: 1 month
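The DSN-preserving update can be sketched in userland C as below. This is an illustration, not gpart's code: the function and macro names are hypothetical, the 512-byte size is an assumed value for MBRSIZE, and stopping the copy at the partition table (offset 0x1be in the classic MBR layout) is a choice made for this sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MBRSIZE	512	/* assumed value of MBRSIZE */
#define SKETCH_DSN_OFF	0x1b8	/* offset named in the commit message */
#define SKETCH_DSN_LEN	4
#define SKETCH_PART_OFF	0x1be	/* partition table start in the classic MBR layout */

/*
 * Hypothetical helper: install new boot code into an in-memory MBR,
 * keeping the existing disk serial number (DSN) if it is non-zero.
 */
static int
mbr_install_bootcode(uint8_t *mbr, const uint8_t *code, size_t codesize)
{
	static const uint8_t zero[SKETCH_DSN_LEN];
	uint8_t dsn[SKETCH_DSN_LEN];

	if (codesize != SKETCH_MBRSIZE)
		return (EINVAL);		/* reject wrongly sized bootcode */

	memcpy(dsn, mbr + SKETCH_DSN_OFF, sizeof(dsn));
	memcpy(mbr, code, SKETCH_PART_OFF);	/* partition table stays intact */
	if (memcmp(dsn, zero, sizeof(dsn)) != 0)
		memcpy(mbr + SKETCH_DSN_OFF, dsn, sizeof(dsn));
	return (0);
}
```

A zero DSN is treated as "no identifier assigned", so in that case whatever the new boot code carries at that offset is kept.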
* Change the way how we update bootcode for BSD scheme.  ae  2011-06-20  1  -13/+12
  Since the only parameter that we check is the size of the bootcode, allow only two sizes: the size of boot1 and the size of /boot/boot. This partially protects users from losing the ability to boot if incorrect bootcode is specified.
  Requested by: ru
* Plumb device physical path reporting from CAM devices, through GEOM and  gibbs  2011-06-14  6  -1/+128
  DEVFS, and make it accessible via the diskinfo utility.

  Extend GEOM's generic attribute query mechanism into generic disk consumers.

  sys/geom/geom_disk.c:
  sys/geom/geom_disk.h:
  sys/cam/scsi/scsi_da.c:
  sys/cam/ata/ata_da.c:
    - Allow disk providers to implement a new method which can override the default BIO_GETATTR response, d_getattr(struct bio *). This function returns -1 if not handled, otherwise it returns 0 or an errno to be passed to g_io_deliver().

  sys/cam/scsi/scsi_da.c:
  sys/cam/ata/ata_da.c:
    - Don't copy the serial number to dp->d_ident anymore, as the CAM XPT is now responsible for returning this information via d_getattr()->(a)dagetattr()->xpt_getattr().

  sys/geom/geom_dev.c:
    - Implement a new ioctl, DIOCGPHYSPATH, which returns the GEOM attribute "GEOM::physpath", if possible. If the attribute request returns a zero-length string, ENOENT is returned.

  usr.sbin/diskinfo/diskinfo.c:
    - If the DIOCGPHYSPATH ioctl is successful, report physical path data when diskinfo is executed with the '-v' option.

  Submitted by: will
  Reviewed by: gibbs
  Sponsored by: Spectra Logic Corporation

  Add generic attribute change notification support to GEOM.

  sys/sys/geom/geom.h:
    Add a new attrchanged method field to both g_class and g_geom.

  sys/sys/geom/geom.h:
  sys/geom/geom_event.c:
    - Provide the g_attr_changed() function that providers can use to advertise attribute changes.
    - Perform delivery of attribute change notifications from a thread context via the standard GEOM event mechanism.

  sys/geom/geom_subr.c:
    Inherit the attrchanged method from class to geom (class instance).

  sys/geom/geom_disk.c:
    Provide disk_attr_changed() to provide g_attr_changed() access to consumers of the disk API.

  sys/cam/scsi/scsi_pass.c:
  sys/cam/scsi/scsi_da.c:
  sys/geom/geom_dev.c:
  sys/geom/geom_disk.c:
    Use attribute changed events to track updates to physical path information.

  sys/cam/scsi/scsi_da.c:
    Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, and the updated buffer type references our physical path attribute, emit a GEOM attribute changed event via the disk_attr_changed() API.

  sys/cam/scsi/scsi_pass.c:
    Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, update the physical path devfs alias for this pass instance.

  Submitted by: gibbs
  Sponsored by: Spectra Logic Corporation
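The d_getattr() return convention described in this commit (-1 means "not handled", anything else is a final result) can be sketched as a simple dispatch. This is a userland illustration only: `struct sketch_bio`, `sketch_getattr()` and `driver_getattr()` are hypothetical stand-ins, and the ENOENT fallback is a placeholder for the default BIO_GETATTR handling, not the real behaviour.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct bio. */
struct sketch_bio {
	const char	*attr;		/* attribute being queried */
	char		 data[64];	/* buffer for the answer */
};

typedef int (*getattr_fn)(struct sketch_bio *);

/*
 * Give the driver's d_getattr method the first chance. A return of -1
 * means "not handled" and falls through to the default handling; any
 * other value (0 or an errno) is passed on as the final result.
 */
static int
sketch_getattr(struct sketch_bio *bp, getattr_fn d_getattr)
{
	int error;

	if (d_getattr != NULL) {
		error = d_getattr(bp);
		if (error != -1)
			return (error);
	}
	/* Placeholder default: this sketch knows no attributes itself. */
	return (ENOENT);
}

/* Example driver method that answers only the physical path query. */
static int
driver_getattr(struct sketch_bio *bp)
{
	if (strcmp(bp->attr, "GEOM::physpath") != 0)
		return (-1);			/* let the default handle it */
	strcpy(bp->data, "hypothetical/path");	/* made-up value */
	return (0);
}
```

Using -1 as the sentinel keeps it distinct from every valid errno, so a driver can still report a real error for an attribute it does handle.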
* MFC  attilio  2011-06-03  2  -1/+22
|\
| * Update disk's stripesize and stripeoffset parameters on provider open.  mav  2011-06-03  1  -0/+6
| | They are media-dependent and may change at run time, same as sectorsize and/or mediasize. SCSI devices return physical sector size and offset via the READ CAPACITY(16) command and so cannot report them until media is inserted, or at least until the probe sequence has completed. UNMAP support is also reported there.
| * Add diagnostic message about not aligned partitions.  ae  2011-06-03  1  -1/+16
| | Idea from: ivoras
* | MFC  attilio  2011-06-02  1  -4/+2
|\ \
| |/
| * Do not hide stripeoffset from libgeom(3), it may be useful even when  ae  2011-06-02  1  -4/+2
| | stripesize is zero.
| | MFC after: 1 week
* | MFC  attilio  2011-05-27  1  -2/+7
|\ \
| |/
| * Some partitioning tools may have a different opinion about disk  ae  2011-05-27  1  -2/+7
| | geometry and partitions may start from within the first track. If we find such partitions, then do not reserve the space of the first track, only the first sector.
* | MFC  attilio  2011-05-26  5  -19/+17
|\ \
| |/
| * Prevent non-aligned reading from provider while tasting. Reject  ae  2011-05-25  2  -0/+10
| | providers with unsupported sectorsize.
| | Reported by: Joerg Wunsch
| | MFC after: 1 week
| * Do not truncate available disk space to the closest track boundary.  ae  2011-05-25  1  -7/+3
| |
| * Do not truncate available disk space to the closest track boundary.  ae  2011-05-25  1  -2/+1
| |
| * Do not truncate available disk space to the closest track boundary.  ae  2011-05-25  1  -4/+3
| |
| * Remove unused variable.  ae  2011-05-24  1  -3/+0
| | MFC after: 1 week
| * Remove unused variable.  ae  2011-05-24  1  -3/+0
| | MFC after: 1 week
| * Recognize BIO_FLUSH requests and pass them to userland.  pjd  2011-05-23  1  -0/+3
| | MFC after: 1 week
| * Make diagnostic messages more specific. With bootverbose print out  ae  2011-05-16  1  -42/+83
| | all integrity inconsistencies in the partition table, not only the first one found.
| | Requested by: kib