path: root/sys/geom/stripe/g_stripe.c
Commit message | Author | Age | Files | Lines
* MFC r306279: Use g_wither_provider() where applicable. | mav | 2016-10-06 | 1 | -2/+1
  It is just a helper function combining G_PF_WITHER setting with g_orphan_provider().
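The relationship between the helper and the two steps it combines can be sketched with mock types. This is only an illustration of what the commit message describes, not the kernel code: `struct mock_provider`, `MOCK_PF_WITHER`, `mock_orphan_provider()` and `mock_wither_provider()` are hypothetical stand-ins, and the flag value is made up.

```c
/* Hypothetical flag value; the real G_PF_WITHER is defined by GEOM. */
#define MOCK_PF_WITHER 0x2u

/* Minimal stand-in for a GEOM provider. */
struct mock_provider {
	unsigned flags;		/* provider flags */
	int error;		/* error recorded when orphaned */
	int orphan_calls;	/* how many times it was orphaned */
};

/* Stand-in for g_orphan_provider(): record the error. */
static void
mock_orphan_provider(struct mock_provider *pp, int error)
{
	pp->error = error;
	pp->orphan_calls++;
}

/* Stand-in for the helper: set the wither flag, then orphan. */
static void
mock_wither_provider(struct mock_provider *pp, int error)
{
	pp->flags |= MOCK_PF_WITHER;
	mock_orphan_provider(pp, error);
}
```

Callers that previously did both steps by hand need only the single helper call.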
* Pull in r267961 and r267973 again. Fix for issues reported will follow. | hselasky | 2014-06-28 | 1 | -6/+3
* Revert r267961, r267973: | gjb | 2014-06-27 | 1 | -3/+6
  These changes prevent sysctl(8) from returning proper output, such as:
  1) no output from sysctl(8)
  2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
     truss: can not get etype: Cannot allocate memory
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if | hselasky | 2014-06-27 | 1 | -6/+3
  there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable.

  The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path.

  The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.

  Other changes:
  - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
  - Removed redundant TUNABLE statements throughout the kernel.
  - Some minor code rewrites in connection to removing not needed TUNABLE statements.
  - Added a missing SYSCTL_DECL().
  - Wrapped two very long lines.
  - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
  - Bumped FreeBSD version to indicate SYSCTL API change.

  MFC after: 2 weeks
  Sponsored by: Mellanox Technologies
* Do not increment bio_data in case of BIO_DELETE. | mav | 2014-04-10 | 1 | -3/+10
  This fixes KASSERT() panic in g_io_request().
* Merge GEOM direct dispatch changes from the projects/camlock branch. | mav | 2013-10-22 | 1 | -19/+53
  When safety requirements are met, it allows avoiding passing I/O requests to the GEOM g_up/g_down threads, executing them directly in the caller context. That avoids CPU bottlenecks in the g_up/g_down threads, plus several context switches per I/O.

  The currently defined safety requirements are:
  - caller should not hold any locks and should be reenterable;
  - callee should not depend on GEOM dual-threaded concurrency semantics;
  - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable;
  - kernel thread stack usage should be below 50%.

  To keep compatibility with GEOM classes not meeting the above requirements, new provider and consumer flags were added:
  - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
  - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
  - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
  - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).

  A capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of the requirements are not met, the request is queued to the g_up or g_down thread same as before.

  The following GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc).

  To declare direct completion capability, the disk(9) KPI got a new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. The da(4) and ada(4) disk drivers got it set now thanks to earlier CAM locking work.

  This change more than doubles peak block storage performance on systems with many CPUs, together with earlier CAM locking changes reaching more than 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads).

  Sponsored by: iXsystems, Inc.
  MFC after: 2 months
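The safety requirements listed in this commit message amount to a yes/no decision per request. The following is a minimal sketch of that decision under stated assumptions, not the GEOM implementation: `struct dispatch_ctx` and `direct_dispatch_ok()` are hypothetical names, and the boolean fields merely stand in for the DIRECT_SEND/DIRECT_RECEIVE flags and the runtime conditions the message enumerates.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the conditions named in the commit message. */
struct dispatch_ctx {
	bool sender_direct;		/* caller side declared *_DIRECT_SEND */
	bool receiver_direct;		/* callee side declared *_DIRECT_RECEIVE */
	bool unmapped;			/* request carries unmapped data */
	bool callee_supports_unmapped;	/* callee can handle unmapped data */
	bool sleepable;			/* current context may sleep */
	unsigned stack_used_pct;	/* kernel stack usage, in percent */
};

/*
 * Return true if the request may be executed in the caller's context;
 * false means it should be queued to the g_up/g_down thread as before.
 */
static bool
direct_dispatch_ok(const struct dispatch_ctx *c)
{
	if (!c->sender_direct || !c->receiver_direct)
		return (false);
	if (c->stack_used_pct >= 50)
		return (false);
	if (c->unmapped && !c->callee_supports_unmapped && !c->sleepable)
		return (false);
	return (true);
}
```

Any failed check simply falls back to the queued path, which matches the message's statement that behaviour is unchanged when a requirement is not met.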
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. | ed | 2011-11-07 | 1 | -1/+2
  The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* Refactor disk disconnection and geom destruction handling sequences. | mav | 2011-11-01 | 1 | -55/+46
  Do not close/destroy an opened consumer directly in case of disconnect. Instead, keep it until it is closed in the regular way in response to upstream provider destruction. Delay geom destruction in the same way.

  The previous implementation could destroy consumers that still had active requests and worked only because of a global workaround made at the GEOM level.
* Include sys/sbuf.h directly. | ae | 2011-07-11 | 1 | -0/+1
  Reviewed by: pjd
* Remove "for a moment" assignment. struct g_geom zeroed when allocated. | ae | 2011-05-04 | 1 | -2/+0
  MFC after: 1 week
* Implement relaxed comparison for hardcoded provider names to make it | mav | 2011-04-27 | 1 | -1/+2
  ignore adX/adaY difference in both directions to simplify migration to the CAM-based ATA or back.
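A relaxed comparison of this kind can be sketched as follows. This is a hedged illustration, not the code from the commit: `ata_prefix_len()` and `names_match_relaxed()` are hypothetical names, and the sketch assumes that the unit number after `ad`/`ada` is skipped as well, since the message speaks of adX versus adaY.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Length of a leading "adN" or "adaN" device prefix, or 0 if absent. */
static size_t
ata_prefix_len(const char *name)
{
	size_t len;

	if (strncmp(name, "ada", 3) == 0)
		len = 3;
	else if (strncmp(name, "ad", 2) == 0)
		len = 2;
	else
		return (0);
	if (!isdigit((unsigned char)name[len]))
		return (0);
	while (isdigit((unsigned char)name[len]))
		len++;
	return (len);
}

/* Exact match, or match of everything after an adX/adaY prefix. */
static bool
names_match_relaxed(const char *a, const char *b)
{
	size_t la, lb;

	if (strcmp(a, b) == 0)
		return (true);
	la = ata_prefix_len(a);
	lb = ata_prefix_len(b);
	if (la == 0 || lb == 0)
		return (false);
	return (strcmp(a + la, b + lb) == 0);
}
```

With this sketch a name such as `ad4s1a` stored in metadata would still match a provider now called `ada0s1a`, while unrelated names such as `da0` only match themselves.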
* Add some FEATURE macros for various GEOM classes. | netchild | 2011-02-25 | 1 | -0/+1
  No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed.

  Sponsored by: Google Summer of Code 2010
  Submitted by: kibab
  Reviewed by: silence on geom@ during 2 weeks
  X-MFC after: to be determined in last commit with code from this project
* Correct comment. | pjd | 2010-02-18 | 1 | -1/+1
* Make geom_stripe report its stripe size to upper layers. | mav | 2009-12-24 | 1 | -0/+2
* If provider is open for writing when we taste it, skip it for classes that | pjd | 2009-10-09 | 1 | -0/+4
  depend on on-disk metadata. This way we won't attach to providers that are used by other classes. For example, we don't want to configure partitions on da0 if it is part of gmirror; what we really want is partitions on mirror/foo.

  During regular work it works like this: if a provider is open for writing, a class receives the spoiled event from GEOM and detaches; once the provider is closed, the taste event is sent again and the class can rediscover its metadata if it is still there. This doesn't work that way when a new class arrives, because GEOM gives all existing providers for it to taste, also those open for writing. Classes have to decide on their own if they want to deal with such providers (e.g. geom_dev) or not (classes modified by this commit).

  Reported by: des, Oliver Lehmann <lehmann@ans-netz.de>
  Tested by: des, Oliver Lehmann <lehmann@ans-netz.de>
  Discussed with: phk, marcel
  Reviewed by: marcel
  MFC after: 3 days
* Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS | mav | 2009-09-04 | 1 | -7/+6
  instead. It is a null change for the GENERIC kernel, but allows 'fast' mode to work on systems with increased MAXPHYS.
* Add sbuf_new_auto as a shortcut for the very common case of creating a | des | 2008-08-09 | 1 | -1/+1
  completely dynamic sbuf.

  Obtained from: Varnish
  MFC after: 2 weeks
* Despite several examples in the kernel, the third argument of | dwmalone | 2007-06-04 | 1 | -1/+1
  sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int.

  Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples.

  In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.
* Change spaces to tabs where needed. | pjd | 2006-11-01 | 1 | -3/+3
* Implement BIO_FLUSH handling by simply passing it down to the components. | pjd | 2006-10-31 | 1 | -3/+39
  Sponsored by: home.pl
* Remove trailing spaces. | pjd | 2006-02-01 | 1 | -2/+2
* Normalize a significant number of kernel malloc type names: | rwatson | 2005-10-31 | 1 | -1/+1
  - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
  - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
  - Disambiguate some collisions by adding subsystem prefixes to some memory types.
  - Generally prefer lower case to upper case.
  - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.

  Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
* Avoid code duplication and implement bitcount32() function in systm.h only. | pjd | 2005-08-19 | 1 | -1/+1
  Reviewed by: cperciva
  MFC after: 3 days
* Before calling g_orphan_provider(), add G_PF_WITHER flag, so GEOM will know | pjd | 2005-07-17 | 1 | -0/+1
  to destroy it.

  PR: kern/81758
  Submitted by: trasz <trasz@buziaczek.pl>
  MFC after: 3 days
* Check return value. | pjd | 2005-05-11 | 1 | -0/+4
  Found by: Coverity Prevent analysis tool
* - Add md_provsize field to metadata, which will help with | pjd | 2005-02-27 | 1 | -2/+9
    shared-last-sector problem. After this change, even if there is more than one provider with the same last sector, the proper one will be chosen based on its size. It still doesn't fix the 'c' partition problem (when da0s1 can be confused with da0s1c) and situation when 'a' partition starts at offset 0 (then da0s1a can be confused with da0s1 and da0s1c). One can use '-h' option there, when creating device or avoid sharing last sector. Actually, when providers share the same last sector and their size is equal, they provide exactly the same data, so the name (da0s1, da0s1a, da0s1c) isn't important at all.
  - Provide backward compatibility.
  - Update copyright's year.

  MFC after: 1 week
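The size check described here, together with the backward-compatibility note, can be sketched roughly as below. This is an assumption-laden illustration rather than the committed code: `struct mock_metadata`, `metadata_belongs_to()` and the version threshold `MOCK_FIRST_VERSION_WITH_PROVSIZE` are hypothetical; only the field name `md_provsize` comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: first metadata version that records the provider size. */
#define MOCK_FIRST_VERSION_WITH_PROVSIZE 3u

/* Hypothetical subset of the on-disk metadata. */
struct mock_metadata {
	uint32_t md_version;	/* metadata format version */
	uint64_t md_provsize;	/* size of the provider it was written to */
};

/*
 * Decide whether metadata found in a provider's last sector was written
 * for that provider, given the provider's media size in bytes.
 */
static bool
metadata_belongs_to(const struct mock_metadata *md, uint64_t mediasize)
{
	/* Older metadata has no recorded size, so accept it as before. */
	if (md->md_version < MOCK_FIRST_VERSION_WITH_PROVSIZE)
		return (true);
	return (md->md_provsize == mediasize);
}
```

Two providers that share a last sector but differ in size are told apart by the comparison; providers of equal size remain indistinguishable, which the message argues is harmless because they expose the same data.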
* Fix year in copyrights. | pjd | 2005-02-16 | 1 | -1/+1
* - Turn off 'fast' mode by default and increase maximum memory to consume | pjd | 2004-12-09 | 1 | -2/+2
    when this mode is used.
  - Manual page update.
* This is not needed anymore, it is forced in GEOM now. | pjd | 2004-09-20 | 1 | -3/+0
  Actually, it can even cause some problems, because GEOM requires sectorsize to be more than 0 on first access, not on provider creation, so we can skip valid providers by doing this check here.

  Reported by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
               Sven Willenberger <sven@dmv.com>
* Allow to configure debug level from /boot/loader.conf. | pjd | 2004-08-30 | 1 | -0/+1
* Skip providers with not defined sector size. | pjd | 2004-08-26 | 1 | -0/+3
  Reported by: kuriyama
* Dump disk number. | pjd | 2004-08-25 | 1 | -1/+2
* Increase default kern.geom.stripe.maxmem to 50 elements. | pjd | 2004-08-11 | 1 | -1/+1
* Fix one of the latest commits. This bio_caller1 should also be changed to | pjd | 2004-08-10 | 1 | -1/+1
  bio_driver1 (as all the rest). This introduced a small memory leak, but it wasn't really critical, because maximum memory for g_stripe_zone is always set, so after a few requests gstripe was working in "economic" mode.
* - Introduce option for hardcoding providers' names into metadata. | pjd | 2004-08-09 | 1 | -0/+10
    It allows fixing problems when the last provider's sector is shared between a few providers.
  - Bump version number for CONCAT and STRIPE and add code for backward compatibility.
  - Do not bump version number of MIRROR, as it wasn't officially introduced yet. Even if someone started to play with it, there is no big deal, because wrong MD5 sum of metadata will deny those providers.
  - Update manual pages.
  - Add version history to g_(stripe|concat).h files.
* Do not use g_wither_geom(9). It doesn't work in the way which is expected | pjd | 2004-08-09 | 1 | -2/+3
  here anymore (after g_wither_washer() was introduced), i.e. geom and consumer will not be immediately destroyed if possible.
* Tag all geom classes in the tree with a version number. | phk | 2004-08-08 | 1 | -0/+1
* Add and document kern.geom.stripe.fast_failed sysctl, which shows how | pjd | 2004-08-06 | 1 | -1/+7
  many times "fast" mode failed.
* Fields bio_caller[12] should be used by the consumer and fields | pjd | 2004-08-06 | 1 | -23/+23
  bio_driver[12] should be used by the provider!
* Fix I/O leakage. We're cloning bios in g_stripe_start_fast(), but when | pjd | 2004-08-06 | 1 | -0/+2
  something goes wrong while running in "fast" mode, we free all bios and fall back to "economic" mode. Freeing bios doesn't decrease bio_children, so bio_inbed could never be equal to bio_children and the request was never finished. Decrease bio_children manually when destroying bios.

  Reported by: Sam Lawrance <boris@brooknet.com.au>, simon
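The accounting problem described in this message can be modelled with two counters. The sketch below is a simplified stand-in, not the g_stripe code: `struct parent_req` and the `req_*()` functions are hypothetical, and only the roles of `bio_children` and `bio_inbed` are taken from the message.

```c
#include <stdbool.h>

/* Simplified stand-in for the two counters kept in a parent bio. */
struct parent_req {
	unsigned children;	/* clones created (role of bio_children) */
	unsigned inbed;		/* clones completed (role of bio_inbed) */
};

/* Creating a clone bumps the child count. */
static void
req_clone(struct parent_req *p)
{
	p->children++;
}

/* The fix: destroying an unsent clone must undo the bump. */
static void
req_destroy_clone(struct parent_req *p)
{
	p->children--;
}

/* A clone completed; the parent is done once every child is in. */
static bool
req_child_done(struct parent_req *p)
{
	p->inbed++;
	return (p->inbed == p->children);
}
```

If the clones abandoned by the "fast" path were freed without calling `req_destroy_clone()`, `children` would stay too high and `req_child_done()` could never return true for the clones issued afterwards in "economic" mode.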
* Improve geom(8)'s 'list' command to show geoms and their providers and | pjd | 2004-07-26 | 1 | -29/+30
  consumers. Teach STRIPE, CONCAT and NOP classes about this improvement.
* Change naming scheme from /dev/<name>.stripe to /dev/stripe/<name>. | pjd | 2004-07-26 | 1 | -23/+18
* M_WAITOK is ok here, while I'm using M_WAITOK later in this function. | pjd | 2004-07-26 | 1 | -8/+1
* Minor sysctl description fixes. | pjd | 2004-07-13 | 1 | -2/+2
  Submitted by: simon
* Implement "FAST" mode for GEOM_STRIPE class and turn it on by default. | pjd | 2004-07-09 | 1 | -80/+345
  In this mode you can set up even a very small stripe size and be sure that only one I/O request will be sent to every disk in the stripe. It consumes some more memory, but if allocation fails, it will fall back to "ECONOMIC" mode.

  It is about 10 times faster for small stripe sizes than "ECONOMIC" mode and other RAID0 implementations. It is even recommended to use this mode and a small stripe size, so our requests are always split.

  One can still use "ECONOMIC" mode by setting kern.geom.stripe.fast to 0. It is also possible to set the maximum memory which "FAST" mode can consume, by setting kern.geom.stripe.maxmem from /boot/loader.conf.
* - Add 'stop' command, which works just like 'destroy' command, but sounds | pjd | 2004-07-05 | 1 | -1/+2
    less dangerous.
  - Update manual pages and extend examples.
  - Bump versions.
* Dump some more information: | pjd | 2004-05-26 | 1 | -19/+29
  - device state
  - list of used providers
  - total number of disks
  - number of disks online

  Prodded by: Alex Deiter <tiamat@komi.mts.ru>
* Introduce STRIPE GEOM class. It implements RAID0 transformation and it | pjd | 2004-05-20 | 1 | -0/+900
  is intended to be fast. Just like CONCAT class it provides manual and auto configuration methods.

  Supported by: Wheel - Open Technologies - http://www.wheel.pl
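For readers unfamiliar with the RAID0 transformation mentioned in this commit, the address arithmetic can be sketched as follows. This is a generic illustration of striping, not an excerpt from g_stripe.c: `struct stripe_loc` and `stripe_map()` are hypothetical names, and offsets are assumed to be in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result: which component disk, and where on it. */
struct stripe_loc {
	unsigned disk;		/* index of the component disk */
	uint64_t offset;	/* byte offset on that disk */
};

/*
 * Map a byte offset on the striped device to a component disk and an
 * offset on that disk, for ndisks disks and a stripe size in bytes.
 */
static struct stripe_loc
stripe_map(uint64_t offset, uint64_t stripesize, unsigned ndisks)
{
	struct stripe_loc loc;
	uint64_t nstripe;

	assert(stripesize > 0 && ndisks > 0);
	nstripe = offset / stripesize;	/* global stripe number */
	loc.disk = (unsigned)(nstripe % ndisks);
	loc.offset = (nstripe / ndisks) * stripesize + offset % stripesize;
	return (loc);
}
```

Consecutive stripes rotate across the disks, so a request larger than one stripe is spread over several components and can be serviced by them in parallel.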