path: root/sys/geom/gate
Commit message | Author | Age | Files | Lines
* MFC Alexander Motin's GEOM direct dispatch work:scottl2014-01-071-11/+20
| r256603: Introduce new function devstat_end_transaction_bio_bt(), adding a new argument to specify the present time. Use this function to move binuptime() out of the lock, substantially reducing lock congestion when a slow timecounter is used.
| r256606: Move g_io_deliver() out of the lock, as required for direct dispatch. Move g_destroy_bio() out too, to reduce lock scope even more.
| r256607: Fix passing uninitialized bio_resid argument to g_trace().
| r256610: Add unmapped I/O support to GEOM RAID.
| r256830: Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping a temporarily mapped buffer. That fixes a double unmap if biodone() is called twice for the same BIO (but with different done methods).
| r256880: Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, it allows I/O requests to bypass the GEOM g_up/g_down threads and execute directly in the caller context. That avoids CPU bottlenecks in the g_up/g_down threads and saves several context switches per I/O.
| r259247: Fix bug introduced at r256607. We have to recalculate bp_resid here since sizes of original and completed requests may differ due to end of media.
|
| Testing of the stable/10 merge was done by Netflix, but all of the credit goes to Alexander and iX Systems.
|
| Submitted by: mav
| Sponsored by: iX Systems
* Remove extra bio_data and bio_length copying to child request after callingmav2013-03-261-2/+0
| | | | g_clone_bio(), that already copied them.
* We don't need buffer to handle BIO_DELETE, so don't check buffer size for it.pjd2013-03-141-1/+1
| | | | This fixes handling BIO_DELETE larger than MAXPHYS.
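The change above can be pictured with a small userland sketch. It is an illustration only, not the code in g_gate.c: the enum, the function name check_buffer() and the choice of ENOMEM as the "buffer too small" result are assumptions made for this example. The point is that a delete request only describes a range, so its length is never compared against a data buffer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes; the kernel uses BIO_READ, BIO_WRITE, BIO_DELETE. */
enum gate_cmd { CMD_READ, CMD_WRITE, CMD_DELETE };

/*
 * Return 0 if the request may proceed, or ENOMEM if the supplied buffer
 * is too small for the data the request carries. A delete request has
 * no payload, so its length is not checked against the buffer at all.
 */
int
check_buffer(enum gate_cmd cmd, size_t req_length, size_t buf_size)
{
	if (cmd == CMD_DELETE)
		return (0);
	return (req_length > buf_size ? ENOMEM : 0);
}
```

With this shape, a delete of any length passes even with a zero-sized buffer, while an oversized write is still rejected.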
* In g_gate_dumpconf() always check the result of g_gate_hold().trociny2012-08-071-1/+3
| This fixes the "Negative sc_ref" panic that is possible when sysctl_kern_geom_confxml() runs simultaneously with the destruction of a GATE device.
|
| Reviewed by: pjd
| MFC after: 3 days
* Reorder things in g_gate_create() so at the moment when g_new_geomf()trociny2012-07-281-77/+63
| is called name is properly initialized.
|
| Discussed with: pjd
| MFC after: 2 weeks
* Extend GEOM Gate class to handle read I/O requests directly within the kernel.pjd2012-07-042-35/+314
| This will allow HAST to read directly from the local component without even communicating with the userland daemon.
|
| Sponsored by: Panzura, http://www.panzura.com
| MFC after: 1 month
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.ed2011-11-071-1/+2
| | | | | | The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* Include sys/sbuf.h directly.ae2011-07-111-0/+1
| | | | Reviewed by: pjd
* Recognize BIO_FLUSH requests and pass them to userland.pjd2011-05-231-0/+3
| | | | MFC after: 1 week
* GEOM has an internal mechanism to deal with ENOMEM errors returned viapjd2011-04-021-1/+1
| g_io_deliver(). In such case it increases the 'pace' counter on each ENOMEM and reschedules the request. The 'pace' counter is decreased for each request going down, but as long as 'pace' is greater than zero, GEOM will handle at most 10 requests per second.
|
| For GEOM GATE users that are proxies to local GEOM providers (like ggatel(8) and HAST) we can end up with an almost permanent slowdown of the GEOM down queue. This is because once we reach the GEOM GATE queue limit, we return ENOMEM to GEOM. This means that we have, e.g., 1024 I/O requests in the GEOM GATE queue. To make room in the queue and stop returning ENOMEM we need to process the requests, of course, but those requests are handled by userland daemons that handle them by also reading/writing from/to local GEOM providers.
|
| For example with HAST, a new request comes to /dev/hast/data, which is a GEOM GATE provider. GEOM GATE passes the request to hastd(8) and hastd(8) reads/writes from/to /dev/da0. Once we reach the GEOM GATE queue limit, to free up a slot in the GEOM GATE queue, hastd(8) has to read/write from/to /dev/da0, but this request will also be very slow, because GEOM now slows down all the requests. We end up with a full queue that we can unload at the speed of 10 requests per second. This simply looks like a deadlock.
|
| Fix it by allowing userland daemons that work with both GEOM GATE and local GEOM providers to specify unlimited queue size, so GEOM GATE will never return ENOMEM to the GEOM.
|
| MFC after: 1 week
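A minimal sketch of the admission check this message describes, written as plain userland C. The struct, the function name gate_enqueue() and the convention that a limit of 0 means "unlimited" are assumptions made for this illustration; the commit message only says that daemons can request an unlimited queue size.

```c
#include <errno.h>

/* Hypothetical queue state; a limit of 0 is taken here to mean "unlimited". */
struct gate_queue {
	unsigned int count;	/* requests currently queued */
	unsigned int limit;	/* maximum queued requests, 0 = no limit */
};

/*
 * Try to queue one request. With a finite limit, a full queue yields
 * ENOMEM, which is the result that makes GEOM start pacing requests.
 * With limit == 0 the request is always accepted, so ENOMEM is never
 * returned and the pacing described above is never triggered.
 */
int
gate_enqueue(struct gate_queue *q)
{
	if (q->limit > 0 && q->count >= q->limit)
		return (ENOMEM);
	q->count++;
	return (0);
}
```

A daemon that itself issues I/O to local providers would pick the unlimited setting, trading bounded memory use for freedom from the slowdown.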
* Increase debug level on g_gate device destruction and add message ontrociny2011-03-301-1/+2
| device creation.
|
| Suggested by: danger
| Approved by: pjd (mentor)
| MFC after: 3 days
* In g_gate_create() there is a window between when g_gate_softc istrociny2011-03-272-2/+6
| registered in the g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for a provider name collision, a crash is possible.
|
| Fix this by adding an sc_name field to struct g_gate_softc. In g_gate_create(), when g_gate_softc is created but sc_provider is still not set, sc_name points to the provider name stored in the local array.
|
| Approved by: pjd (mentor)
| Reported by: Freddie Cash <fjwcash@gmail.com>
| MFC after: 1 week
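The race and its fix can be illustrated with a simplified userland sketch. The struct layouts and the function name_in_use() are hypothetical stand-ins, not the kernel definitions; only the field names sc_name and sc_provider come from the message above. The collision check reads sc_name, which is valid from the moment the softc is registered, and never dereferences sc_provider, which may still be NULL.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct provider {
	const char *name;
};

struct softc {
	const char *sc_name;		/* valid as soon as the softc is registered */
	struct provider *sc_provider;	/* filled in later; may still be NULL */
};

/*
 * Return 1 if some registered unit already uses 'name', 0 otherwise.
 * Only sc_name is consulted, so a unit that is registered but not yet
 * fully created cannot cause a NULL dereference.
 */
int
name_in_use(struct softc *const units[], size_t nunits, const char *name)
{
	for (size_t i = 0; i < nunits; i++) {
		if (units[i] == NULL)
			continue;	/* free slot */
		if (strcmp(units[i]->sc_name, name) == 0)
			return (1);
	}
	return (0);
}
```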
* Add some FEATURE macros for various GEOM classes.netchild2011-02-251-0/+2
| No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed.
|
| Sponsored by: Google Summer of Code 2010
| Submitted by: kibab
| Reviewed by: silence on geom@ during 2 weeks
| X-MFC after: to be determined in last commit with code from this project
* 'unit' can be negative, so use signed type for it.pjd2010-06-141-1/+1
| Found by: Coverity Prevent
| CID: 3731
| MFC after: 3 days
* BIO_DELETE contains range we want to delete and doesn't provide any usefulpjd2010-06-141-1/+1
| data, so there is no need to copy it to userland.
|
| MFC after: 3 days
* Simplify loops.pjd2010-03-181-20/+10
|
* Please welcome HAST - Highly Available Storage.pjd2010-02-182-74/+134
| HAST allows data to be stored transparently on two physically separated machines connected over a TCP/IP network. HAST works in Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only the Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total.
|
| HAST operates on block level - it provides disk-like devices in the /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There is no difference between using a HAST-provided device and a raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD.
|
| For more information please consult the hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST.
|
| Sponsored by: FreeBSD Foundation
| Sponsored by: OMCnet Internet Service GmbH
| Sponsored by: TransIP BV
* (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.antoine2009-12-281-1/+1
| Fix some wrong usages.
| Note: this does not affect generated binaries as this argument is not used.
|
| PR: 137213
| Submitted by: Eygene Ryabinkin (initial version)
| MFC after: 1 month
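For reference, a small example of the usage this commit describes, using the queue(3) macros from <sys/queue.h>. The struct item, the list name and the helper functions are made up for the illustration; the relevant line is the initializer, which is given the head variable itself.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type for the illustration. */
struct item {
	int value;
	LIST_ENTRY(item) link;
};

LIST_HEAD(item_list, item);

/* The initializer is given the list head variable being defined. */
struct item_list items = LIST_HEAD_INITIALIZER(items);

/* Insert an element at the front of the list. */
void
item_add(struct item *it)
{
	LIST_INSERT_HEAD(&items, it, link);
}

/* Count the elements currently on the list. */
int
item_count(void)
{
	struct item *it;
	int n = 0;

	LIST_FOREACH(it, &items, link)
		n++;
	return (n);
}
```

As the commit notes, the macro does not use its argument, so passing something else compiles to the same binary; passing the head simply matches the documented interface.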
* Bump copyright year.pjd2006-09-082-2/+2
|
* Use __FBSDID in .c files.pjd2006-09-081-2/+3
|
* Fix problems with destroy and forcible destroy functionality:pjd2006-09-052-75/+47
| - hold/release device in start/done routines; this will probably slow things down a bit, but the previous code was racy;
| - only release device if g_gate_destroy() failed; if it succeeded, the device is dead and there is nothing to release;
| - various other changes which make forcible destruction reliable.
|
| MFC after: 3 days
* Remove trailing spaces.pjd2006-02-012-2/+2
|
* Normalize a significant number of kernel malloc type names:rwatson2005-10-311-1/+1
| - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
| - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
| - Disambiguate some collisions by adding subsystem prefixes to some memory types.
| - Generally prefer lower case to upper case.
| - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
|
| Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
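The purely mechanical rules in this list (underscores for spaces, no '/' characters, lower case) can be sketched as a string transformation. This is only an illustration of the naming convention: the function normalize_type_name() is hypothetical, and the actual commit edited the names by hand in the source rather than transforming them at run time.

```c
#include <ctype.h>

/*
 * Normalize a memory type name in place, following three of the rules
 * listed above: replace ' ' with '_', drop '/' characters, and convert
 * upper case to lower case. Prefix disambiguation is not attempted.
 */
void
normalize_type_name(char *name)
{
	char *out = name;

	for (const char *in = name; *in != '\0'; in++) {
		unsigned char c = (unsigned char)*in;

		if (c == '/')
			continue;		/* unusable in file names */
		*out++ = (c == ' ') ? '_' : (char)tolower(c);
	}
	*out = '\0';
}
```

For example, a made-up name such as "GEOM/Gate Data" would come out as "geomgate_data".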
* Add CANCEL command which allows to remove one request from the queue orpjd2005-07-082-4/+53
| all requests from the queue if request number is not given.
| Bump version number.
|
| Approved by: re (scottl)
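The semantics of the CANCEL command, as described, can be sketched on a plain singly linked list. Everything here is hypothetical: the struct req type, the function cancel_requests(), and the use of sequence number 0 to stand for "no request number given". The kernel keeps its requests in bio queues and completes the cancelled ones with an error, which this sketch does not model.

```c
#include <stddef.h>

/* Hypothetical queued request, identified by a sequence number. */
struct req {
	unsigned long seq;
	struct req *next;
};

/*
 * Unlink the request with the given sequence number, or every request
 * when seq is 0 (assumed here to mean "no number given"). Returns the
 * number of requests removed from the queue.
 */
size_t
cancel_requests(struct req **head, unsigned long seq)
{
	struct req **pp = head;
	size_t removed = 0;

	while (*pp != NULL) {
		if (seq == 0 || (*pp)->seq == seq) {
			*pp = (*pp)->next;	/* unlink this request */
			removed++;
			if (seq != 0)
				break;		/* only one request was asked for */
		} else {
			pp = &(*pp)->next;
		}
	}
	return (removed);
}
```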
* Update copyright in files changed this year.pjd2005-02-162-2/+2
|
* Remove mutex assertion from g_gate_find(). We don't want g_gate_list_mtxpjd2005-02-161-1/+0
| | | | mutex to be held here, because we want speed here.
* Remove TDP_GEOM flag from thread after ggate device creation.pjd2005-02-161-0/+7
| This flag means "wait for all pending requests before returning to userland". There are pending events for sure, because we just created a new provider and other classes want to taste it, but we cannot answer I/O requests until we're here.
* Fix typo. We want to unlock mutex here.pjd2005-02-121-1/+1
| Submitted by: Andreas Kohn <andreas.kohn@gmail.com>
| MFC after: 1 week
* - Remove g_gate_hold()/g_gate_release() from start/done paths. It savespjd2005-02-092-59/+40
|   4 mutex operations per I/O request.
| - Use only one mutex to protect both (incoming and outgoing) queues. As MUTEX_PROFILING(9) shows, there is no big contention for this lock.
| - Protect sc_queue_count with the queue mutex, instead of doing atomic operations on it.
| - Remove DROP_GIANT()/PICKUP_GIANT() - ggate is marked as MPSAFE and no Giant there.
* - Use bioq_insert_tail()/bioq_insert_head() instead of bioq_disksort().pjd2005-02-051-3/+7
| - Improve mediasize checking.
|
| MFC after: 1 week
* - Add missing Giant drop before acquiring the topology lock.pjd2004-11-231-3/+6
| | | | - Move DROP_GIANT()/PICKUP_GIANT() to g_gate_ioctl().
* Unlock g_gate_list_mtx mutex when we cannot allocate unit number.pjd2004-10-021-0/+1
| MT5 candidate.
|
| PR: kern/72253
| Submitted by: Ivan Voras <ivoras@fer.hr>
* Tag all geom classes in the tree with a version number.phk2004-08-081-0/+1
|
* Do a pass over all modules in the kernel and make them return EOPNOTSUPPphk2004-07-151-0/+1
| | | | | | | | for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
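The convention this commit introduces can be sketched in userland as follows. The enum and the function names are stand-ins invented for the example (the kernel's events are MOD_LOAD, MOD_UNLOAD and so on, delivered to a module event handler), so this is a picture of the control flow rather than kernel code.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's module event types. */
enum mod_event { EV_LOAD, EV_UNLOAD, EV_SHUTDOWN, EV_QUIESCE };

/*
 * Event handler sketch: handle the events the module knows about and
 * return EOPNOTSUPP for everything else, instead of 0 or EINVAL.
 */
int
example_modevent(enum mod_event ev)
{
	switch (ev) {
	case EV_LOAD:
	case EV_UNLOAD:
		return (0);		/* handled */
	default:
		return (EOPNOTSUPP);	/* unknown event: not supported */
	}
}

/*
 * Caller-side tolerance described in the message: when quiescing,
 * both EOPNOTSUPP and EINVAL are read as "didn't do anything".
 */
int
quiesce_ok(int error)
{
	return (error == 0 || error == EOPNOTSUPP || error == EINVAL);
}
```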
* Remove unused argument for good.pjd2004-07-011-2/+2
|
* Introduce a hack that will make geom_gate to work with read-only mounts.pjd2004-06-271-0/+9
| Now, when trying to mount a file system in read-only mode, it tries to open the device for writing to be able to update to read-write mode later. Ehh.
|
| Discussed with: phk
* Don't hold topology lock while calling g_gate_release().pjd2004-06-211-0/+2
| | | | Found by: KASSERT()
* Do the dreaded s/dev_t/struct cdev */phk2004-06-161-2/+2
| | | | Bump __FreeBSD_version accordingly.
* Close some small wakeup<->msleep races.pjd2004-05-051-2/+4
|
* Turn off debugging by default.pjd2004-05-031-1/+1
|
* Prefer signed type over unsigned to be able to assert negativepjd2004-05-031-1/+1
| | | | reference count.
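Why the signed type matters can be shown with a short sketch. The struct and function names are hypothetical, with only the field name sc_ref borrowed from the log. With an unsigned counter, an extra release would wrap around to a huge positive value and a "count is not negative" assertion could never fire; with a signed counter the underflow is visible.

```c
#include <assert.h>

/* Hypothetical device state with a signed reference count. */
struct gate_dev {
	int sc_ref;
};

/* Take a reference. */
void
gate_hold(struct gate_dev *dev)
{
	dev->sc_ref++;
}

/*
 * Drop a reference. Because sc_ref is signed, an unbalanced release
 * shows up as a negative value and trips the assertion instead of
 * silently wrapping around.
 */
void
gate_release(struct gate_dev *dev)
{
	dev->sc_ref--;
	assert(dev->sc_ref >= 0);
}
```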
* - Hold g_gate_list_mtx lock while generating/checking unit number.pjd2004-05-031-5/+9
|   Found by: mtx_assert() g_gate.c:273
| - Set command before returning to userland with ENOMEM error value.
|   Found by: assert() ggatel.c:108
* Make it compile on 64-bit architectures.pjd2004-05-022-26/+26
| | | | | The biggest issue was that 16-bit atomic operations aren't supported on all architectures.
* Kernel bits of GEOM Gate.pjd2004-04-302-0/+775