path: root/sys/geom
Commit message | Author | Age | Files | Lines
...
* Make CAM return and GEOM DISK pass through new GEOM::lunid attribute. | mav | 2013-06-12 | 1 | -1/+24
    SPC-4 specification states that serial number may be property of device, but not a specific logical unit. People reported about FC storages using serial number in that way, making it unusable for purposes of LUN multipath detection. SPC-4 states that designators associated with logical unit from the VPD page 83h "Device Identification" should be used for that purpose. Report first of them in the new attribute in such preference order: NAA, EUI-64, T10 and SCSI name string.

    While there, make GEOM DISK properly report GEOM::ident in XML output also using d_getattr() method, if available. This fixes serial numbers reporting for SCSI disks in `geom disk list` output and confxml.

    Discussed with: gibbs, ken
    Sponsored by: iXsystems, Inc.
    MFC after: 2 weeks
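The preference order named in this commit (NAA, EUI-64, T10, SCSI name string) can be illustrated with a small userland sketch. This is not the CAM code: the enum, the struct, and the function name `pick_lunid` are hypothetical and exist only to show how "report the first designator in preference order" might be expressed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical designator types, declared in the preference order given
 * in the commit message: a lower value means a more preferred type.
 */
enum desig_type {
	DESIG_NAA,
	DESIG_EUI64,
	DESIG_T10,
	DESIG_SCSI_NAME
};

/* Hypothetical, simplified view of one logical-unit designator. */
struct designator {
	enum desig_type type;
	const char *value;
};

/*
 * Return the most preferred designator in the list, or NULL if the list
 * is empty. Among designators of the same type the earliest one wins.
 */
const struct designator *
pick_lunid(const struct designator *list, size_t n)
{
	const struct designator *best = NULL;

	assert(list != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (best == NULL || list[i].type < best->type)
			best = &list[i];
	}
	return (best);
}
```

Encoding the preference in the enum order keeps the selection loop free of special cases; a real implementation would parse the designators out of the VPD page 83h payload instead of receiving them as an array.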
* Don't update provider properties and don't set DISKFLAG_OPEN if d_open() | mav | 2013-06-11 | 1 | -0/+2
    disk method call returned error. GEOM considers devices in such case as still closed, and won't call symmetric d_close() for them.
* Change the set and unset ctlreqs by making the index argument optional. | marcel | 2013-06-09 | 5 | -35/+74
    This allows setting attributes on tables. One simply does not provide an index in that case. Otherwise the entry corresponding to the index has the attribute set or unset.

    Use this change to fix a relatively longstanding bug in our GPT scheme that's the result of rev 198097 (relatively harmless) followed by rev 237057 (damaging). The damaging part being that our GPT scheme always has the active flag set on the PMBR slice. This is in violation of EFI. Existing EFI implementations for both x86 and ia64 reject the GPT. As such, GPT disks created by us aren't usable under EFI because of that.

    After this change, GPT disks never have the active flag set on the PMBR slice. In order to make the GPT disk bootable under some x86 BIOSes, the reason of rev 198097, one must now set the active attribute on the gpt table. The kernel will apply this to the PMBR slice. For (S)ATA:
        gpart set -a active ada0

    To fix an existing GPT disk that has the active flag set in the PMBR, and that does not need the flag, use (again for (S)ATA):
        gpart unset -a active ada0

    The EBR, MBR & PC98 schemes, which also implement at least 1 attribute, now check to make sure the entry passed is valid. They do not have attributes that apply to the table.
* Remove stub implementation. | marcel | 2013-06-09 | 1 | -11/+0
* MFP4 @222836 | brooks | 2013-05-30 | 1 | -2/+5
    Add support for partitioning CFI disks from FDT using geom_flashmap.

    Sponsored by: DARPA, AFRL
* Remove an extra semicolon from the DOT language output. | jh | 2013-05-21 | 1 | -1/+1
    PR: kern/178540
    Submitted by: Trond Endrestol
    MFC after: 1 week
* Fix vdc->Secondary_Element_Count metadata field access from 16 to 8 bit. | mav | 2013-05-20 | 1 | -1/+1
    In some cases it could cause kernel panic during failed drive replacement.

    Reported by: trasz
    MFC after: 1 week
* - Use int8_t type for the mftrecsz field in g_label_ntfs. char type | stas | 2013-05-05 | 1 | -3/+4
    used previously caused probe failure on platforms where char is unsigned (e.g. ARM), as mftrecsz can be negative.

    Submitted by: Ilya Bakulin <ilya@bakulin.de>
    MFC after: 2 weeks
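Why the signedness of this one byte matters can be shown with a short sketch. It assumes the usual NTFS convention that a positive value counts clusters per MFT record while a negative value encodes the record size as a power of two; the function name and parameters are hypothetical and this is not the g_label_ntfs code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the MFT record size in bytes.
 *   raw: the on-disk byte, exactly as read
 *   bps: bytes per sector
 *   spc: sectors per cluster
 */
uint32_t
ntfs_mft_record_size(uint8_t raw, uint32_t bps, uint32_t spc)
{
	/*
	 * Force a signed interpretation. With plain char the result would
	 * depend on the platform: where char is unsigned, 0xF6 reads as
	 * 246 instead of -10 and the negative branch is never taken.
	 */
	int8_t mftrecsz = (int8_t)raw;

	if (mftrecsz > 0)
		return ((uint32_t)mftrecsz * bps * spc);

	assert(mftrecsz > -32);		/* keep the shift well defined */
	return ((uint32_t)1 << -mftrecsz);
}
```

For example, a raw byte of 0xF6 (-10 as int8_t) yields a 1024-byte record regardless of the cluster size.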
* Return "descr" field alike to "Intel RAID1 volume" for GEOM RAID to make | mav | 2013-04-27 | 1 | -0/+4
    it look better in bsdinstall.
* Teach GEOM and CAM about the difference between the max "size" of r/w and delete | smh | 2013-04-26 | 2 | -7/+23
    requests.

    sys/geom/geom_disk.h:
    - Added d_delmaxsize which represents the maximum size of individual device delete requests in bytes. This can be used by devices to inform geom of their size limitations regarding delete operations which are generally different from the read / write limits as data is not usually transferred from the host to physical device.

    sys/geom/geom_disk.c:
    - Use new d_delmaxsize to calculate the size of chunks passed through to the underlying strategy during deletes instead of using read / write optimised values. This defaults to d_maxsize if unset (0).
    - Moved d_maxsize default up so it can be used to default d_delmaxsize

    sys/cam/ata/ata_da.c:
    - Added d_delmaxsize calculations for TRIM and CFA

    sys/cam/scsi/scsi_da.c:
    - Added re-calculation of d_delmaxsize whenever delete_method is set.
    - Added kern.cam.da.X.delete_max sysctl which allows the max size for delete requests to be limited. This is useful in preventing timeouts on devices whose delete methods are slow. It should be noted that this limit is reset when the device delete method is changed and that it can only be lowered, not increased, from the device max.

    Reviewed by: mav
    Approved by: pjd (mentor)
* Added a sysctl (kern.geom.dev.delete_max_sectors) to control the maximum | smh | 2013-04-26 | 1 | -5/+22
    size of a delete request sent to the providing device performed by g_dev_ioctl.

    This allows the kernel and apps via ioctl e.g. newfs -E to request large LBA deletes which significantly improves performance.

    Previously this was hard coded to 65536 sectors, the new default is 262144 which doubles the throughput of deletes on commonly available SSDs.

    In tests on an Intel 520 120GB FW: 400i disk it improved the delete throughput from 1.6GB/s to over 2.6GB/s on a full disk delete such as that done via newfs -E

    For some SSDs where delete time is pretty much constant, no matter what the request, setting this to 0 will provide significantly better throughput e.g. Samsung 840 240GB FW DXT07B0Q @ 262144 = 79G/s, @ 0 = 2259G/s

    Reviewed by: mav
    Approved by: pjd (mentor)
    MFC after: 2 weeks
* Comment typo fix. | ivoras | 2013-04-16 | 1 | -2/+2
    Is aware of the importance of comments: dim
* Fix the buffer-overflow-fixing fixes. | ivoras | 2013-04-16 | 1 | -15/+19
    Pointy-hat to: me, for not realizing snprintf() is available in kernel.
    Thanks to: jh, for bringing me the good news of snprintf(), Pawel Worach, for noting that the panic can be provoked in i386 and not in amd64
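As a general illustration of why snprintf() helps with this class of bug, the sketch below formats into a fixed buffer and detects truncation from the return value. The function name, the format string, and the sample inputs are hypothetical; nothing here is taken from the commit's diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<prefix>/<name>" into buf. Returns false if the result did not
 * fit. snprintf() never writes past buflen and always NUL-terminates
 * when buflen > 0, so buf stays a valid string either way.
 */
bool
format_label(char *buf, size_t buflen, const char *prefix, const char *name)
{
	int n;

	assert(buf != NULL && buflen > 0);
	n = snprintf(buf, buflen, "%s/%s", prefix, name);
	return (n >= 0 && (size_t)n < buflen);
}
```

Checking the return value against the buffer size is the part that is easy to forget: a value greater than or equal to `buflen` means the output was cut short.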
* Partial MFP4 of 222836: | brooks | 2013-04-16 | 1 | -1/+1
    Only look for FDT partitions if our potential parent is a DISK device. Excluding direct recursion on the flashmap geoms was insufficient because it did not prevent the underlying device from being retrieved if flashmap geoms were further partitioned.

    Reviewed by: imp
    Sponsored by: DARPA, AFRL
* Introduce glabel labels based on GEOM ident attributes. In this initial | ivoras | 2013-04-15 | 3 | -1/+87
    implementation, err on the side of conservatism and only create labels for GEOMs of classes DISK and MULTIPATH.

    Discussed with: trasz
    Approved by: silence from freebsd-geom@
* Introduce a symbol for the GEOM class name instead of using the ad-hoc string | ivoras | 2013-04-15 | 3 | -2/+5
    constant.
* move the error report to a lower log level... Now you can see when it | jmg | 2013-04-13 | 2 | -4/+5
    returns an error without getting every single io that went through it..

    MFC after: 1 week
* Make it possible to submit FLUSH bios through geom_dev strategy. This | trasz | 2013-04-06 | 1 | -1/+2
    is required for CTL to work with device-backed LUNs.

    Reviewed by: mav
* Following r241022, replace iteration over the provider list on media events | mav | 2013-04-05 | 1 | -2/+10
    by taking first one and asserting that there are no others.

    MFC after: 1 week
* geom_slice.c and its consumers like GEOM_LABEL are not touching the data | mav | 2013-03-26 | 1 | -0/+8
    unless hotspots are used. Pass G_PF_ACCEPT_UNMAPPED flag through except such rare cases (obsolete GEOM_SUNLABEL and GEOM_BSD).
* GEOM NOP does not touch the data, so pass G_PF_ACCEPT_UNMAPPED flag through. | mav | 2013-03-26 | 1 | -0/+1
* Remove extra bio_data and bio_length copying to child request after calling | mav | 2013-03-26 | 3 | -6/+0
    g_clone_bio(), that already copied them.
* Do not pass unmapped buffers to drivers that cannot handle them | kan | 2013-03-26 | 1 | -1/+1
    In physio, check if device can handle unmapped IO and pass an appropriately mapped buffer to the driver strategy routine. The only driver in the tree that can handle unmapped buffers is one exposed by GEOM, so mark it as such with the new flag in the driver cdevsw structure.

    This fixes insta-panics on hosts, running dconschat, as /dev/fwmem is an example of the driver that makes use of physio routine, but bypasses the g_down thread, where the buffer gets mapped normally.

    Discussed with: kib (earlier version)
* Make GEOM MULTIPATH report unmapped bio support if the underlying path reports | mav | 2013-03-25 | 1 | -0/+2
    it. GEOM MULTIPATH itself never touches the data and so is transparent.
* In GEOM DISK: | mav | 2013-03-25 | 1 | -56/+28
    - Replace single done mutex with per-disk ones. On system with several disks on several HBAs that removes small, but measurable lock congestion.
    - Modify disk destruction process to not destroy the mutex prematurely.
    - Remove some extra pointer dereferences.
* Fix long known deadlock between geom dev destruction and d_close() call. | mav | 2013-03-24 | 1 | -71/+129
    Use destroy_dev_sched_cb() to not wait for device destruction while holding GEOM topology lock (that actually caused deadlock). Use request counting protected by mutex to properly wait for outstanding requests completion in cases of device closing and geom destruction. Unlike r227009, this code does not block taskqueue thread for indefinite time, waiting for completion.
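The idea of "request counting protected by mutex" can be sketched in userland with POSIX threads. This is only an analogy: the kernel uses its own mutex and sleep/wakeup primitives, and the struct and function names below are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical counter of requests that were started but not completed. */
struct req_counter {
	pthread_mutex_t	lock;
	pthread_cond_t	idle;	/* signalled when active drops to zero */
	unsigned	active;
};

void
req_counter_init(struct req_counter *rc)
{
	pthread_mutex_init(&rc->lock, NULL);
	pthread_cond_init(&rc->idle, NULL);
	rc->active = 0;
}

/* Called when a request is handed to the device. */
void
req_start(struct req_counter *rc)
{
	pthread_mutex_lock(&rc->lock);
	rc->active++;
	pthread_mutex_unlock(&rc->lock);
}

/* Called from the completion path; wakes waiters on the last request. */
void
req_done(struct req_counter *rc)
{
	pthread_mutex_lock(&rc->lock);
	assert(rc->active > 0);
	if (--rc->active == 0)
		pthread_cond_broadcast(&rc->idle);
	pthread_mutex_unlock(&rc->lock);
}

/* Called on close or destruction: block until nothing is outstanding. */
void
req_wait_idle(struct req_counter *rc)
{
	pthread_mutex_lock(&rc->lock);
	while (rc->active != 0)
		pthread_cond_wait(&rc->idle, &rc->lock);
	pthread_mutex_unlock(&rc->lock);
}
```

Because the waiter sleeps on a condition tied to the counter, it does not need to hold any wider lock while waiting, which is the property the commit relies on to avoid the deadlock.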
* Make g_wither_washer() to not loop by itself, but only when there was some | mav | 2013-03-24 | 3 | -29/+13
    more topology change done that may require its attention. Add few missing g_do_wither() calls in respective places to signal it.

    This fixes potential infinite loop here when some provider is withered, but still opened or connected for some reason and so can not be destroyed. For example, see r227009 and r227510.
* Correct the page count when excess length is trimmed from the bio. | kib | 2013-03-21 | 1 | -0/+9
    Reported and tested by: Ivan Klymenko <fidaj@ukr.net>
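An unmapped bio, as described elsewhere in this log, carries a page array, an offset into the first page, and a length, so shortening the length can also reduce the number of pages the request spans. The sketch below shows that arithmetic only; the 4096-byte page size and the function name are assumptions made for illustration, and this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for illustration */

/*
 * Number of pages spanned by `length` bytes that begin `offset` bytes
 * into the first page. A zero-length request spans no pages.
 */
size_t
pages_spanned(size_t offset, size_t length)
{
	assert(offset < SKETCH_PAGE_SIZE);
	if (length == 0)
		return (0);
	return ((offset + length + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}
```

A request of 8192 bytes starting 512 bytes into a page spans three pages; if it is trimmed to 3584 bytes it spans one, so a stale page count would overstate the request by two pages.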
* Assert that transient mapping of the bio is only done when unmapped | kib | 2013-03-21 | 1 | -0/+2
    buffers are allowed.

    Sponsored by: The FreeBSD Foundation
* The geom_part provider supports unmapped bio iff the underlying | kib | 2013-03-19 | 1 | -0/+1
    provider does so, since geom_part never inspects the bio_data.

    Sponsored by: The FreeBSD Foundation
    Tested by: pho
* A flag for the geom disk driver to indicate that it accepts the | kib | 2013-03-19 | 2 | -1/+20
    unmapped i/o requests.

    Sponsored by: The FreeBSD Foundation
    Tested by: pho
* Implement the concept of the unmapped VMIO buffers, i.e. buffers which | kib | 2013-03-19 | 3 | -5/+110
    do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.

    The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.

    Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.

    The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.

    Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.

    In the rework, filesystem metadata is not subject to the maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached.

    By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions.

    Sponsored by: The FreeBSD Foundation
    Discussed with: jeff (previous version)
    Tested by: pho, scottl (previous version), jhb, bf
    MFC after: 2 weeks
* We don't need buffer to handle BIO_DELETE, so don't check buffer size for it. | pjd | 2013-03-14 | 1 | -1/+1
    This fixes handling BIO_DELETE larger than MAXPHYS.
* Add legacy support to geom raid to create a /dev/arX device for support | sbruno | 2013-03-08 | 1 | -0/+22
    of upgrading older machines using ataraid(4) to newer releases.

    This optional parameter is controlled via kern.geom.raid.legacy_aliases and will create a /dev/ar0 device that will point at /dev/raid/r0 for example.

    Tested on Dell SC 1425 DDF-1 format software raid controllers installing from stable/7 and upgrading to stable/9 without having to adjust /etc/fstab

    Reviewed by: mav
    Obtained from: Yahoo!
    MFC after: 2 Weeks
* g_label_ntfs_taste: Abort taste if recsize == 0 | dumbbell | 2013-03-08 | 1 | -1/+1
    This will avoid a 0-byte read (in g_read_data()) leading to a panic, if previously read data are erroneous.

    Suggested by: John-Mark Gurney <jmg@funkthat.com>
* Support the FAT16 partition type in gpart(8) | gavin | 2013-03-07 | 3 | -0/+3
    PR: kern/174714
    Submitted by: 4721 at hushmail dot com
    MFC after: 1 week
* Fix panic when Secondary_Element_Count == 1 and Secondary_Element_Seq | mav | 2013-03-07 | 1 | -1/+4
    is not set (255).

    Reported by: sbruno
    MFC after: 1 week
* g_label_ntfs.c: Mark structures as __packed | dumbbell | 2013-03-05 | 1 | -3/+3
    Without this, read data is mis-interpreted. This could trigger a panic, as was the case on one computer where computed "recsize" was zero, leading to a call to g_read_page() asking for 0 bytes.
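The effect of packing on an on-disk structure can be seen with a made-up two-field header. The struct layouts and the helper name are hypothetical, and the sketch uses the GCC/Clang `__attribute__((packed))` spelling, which is what FreeBSD's `__packed` macro expands to on those compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical on-disk header: a 1-byte tag immediately followed by a
 * 32-bit value at byte offset 1.
 */
struct hdr_natural {
	uint8_t  tag;
	uint32_t value;		/* compiler may insert padding before this */
};

struct hdr_packed {
	uint8_t  tag;
	uint32_t value;		/* guaranteed to sit at byte offset 1 */
} __attribute__((packed));

/* Read the value field (host byte order) from 5 raw on-disk bytes. */
uint32_t
hdr_value_from_bytes(const uint8_t *raw)
{
	struct hdr_packed h;

	assert(raw != NULL);
	memcpy(&h, raw, sizeof(h));
	return (h.value);
}
```

With the unpacked layout, copying raw bytes over the struct would typically place `value` several bytes too late, so it would be filled from the wrong part of the sector; that is the kind of misinterpretation the commit describes.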
* Remove ntfs headers dependency for g_label_ntfs.c by redefining the | attilio | 2013-03-02 | 1 | -14/+65
    used structs and values.

    This patch is not targeted for MFC.
* Add barrier write capability to the VFS buffer interface. A barrier | mckusick | 2013-02-16 | 1 | -0/+4
    write is a disk write request that tells the disk that the buffer being written must be committed to the media along with any writes that preceded it before any future blocks may be written to the drive.

    Barrier writes are provided by adding the functions bbarrierwrite (bwrite with barrier) and babarrierwrite (bawrite with barrier).

    Following a bbarrierwrite the client knows that the requested buffer is on the media. It does not ensure that buffers written before that buffer are on the media. It only ensures that buffers written before that buffer will get to the media before any buffers written after that buffer. A flush command must be sent to the disk to ensure that all earlier written buffers are on the media.

    Reviewed by: kib
    Tested by: Peter Holm
* g_mirror: g_getattr() failure should not be fatal | avg | 2013-01-26 | 1 | -3/+1
    This allows to use gmirror e.g. on top of ZVOLs.

    PR: kern/175323
    Submitted by: Alexei.Volkov@softlynx.ru, mav
    Reported by: Alexei.Volkov@softlynx.ru
    Tested by: Alexei.Volkov@softlynx.ru
    Reviewed by: ae, mav, pjd
    MFC after: 1 week
* - Fix rebuild position broken at r245522. | mav | 2013-01-17 | 1 | -2/+5
    - Identify one more metadata field.
* For Promise/AMD metadata add support for disks with capacity above 2TiB | mav | 2013-01-17 | 1 | -55/+91
    and for volumes with sector size above 512 bytes.
* Recalculate volume size only for real CONCATs. For SINGLE trust volume | mav | 2013-01-17 | 1 | -1/+2
    size given by metadata, as it should be correct and in some cases can be smaller than subdisk size.
* Allow to insert new component to geom_raid3 without specifying number. | mav | 2013-01-15 | 1 | -16/+29
    PR: kern/160562
    MFC after: 2 weeks
* Alike to r242314 for GRAID make GRAID3 more aggressive in marking volumes | mav | 2013-01-15 | 1 | -9/+12
    as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping.

    ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes.

    MFC after: 2 weeks
* Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes | mav | 2013-01-15 | 1 | -9/+12
    as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping.

    ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes.

    PR: kern/113957
    MFC after: 2 weeks
* Keep value of orig_config_id metadata field. Windows driver writes there | mav | 2013-01-14 | 1 | -2/+5
    previous value of config_id when it is changed in some cases. I guess it may be used to avoid some split-brain conditions.
* Small cosmetic tuning of the IRRT status constants. | mav | 2013-01-14 | 1 | -4/+7
* Print some more metadata fields. | mav | 2013-01-14 | 1 | -6/+12