path: root/sys/geom/geom_kern.c
Commit message | Author | Age | Files | Lines
* Remove unneeded Giant locking around kthreads creation. | kib | 2016-05-20 | 1 | -2/+0
|
| Sponsored by: The FreeBSD Foundation
* Remove asserts that Giant is not held on entrance into geom KPI, which | kib | 2016-05-20 | 1 | -3/+0
| outlived their usefulness. This allows removing the drop/pickup Giant
| wrappers around GEOM calls.
|
| Discussed with: alfred, imp, phk
| Sponsored by: The FreeBSD Foundation
* sys/geom: spelling fixes in comments. | pfg | 2016-04-29 | 1 | -1/+1
|
| No functional change.
* Fix multiple incorrect SYSCTL arguments in the kernel: | hselasky | 2014-10-21 | 1 | -5/+5
|
| - Wrong integer type was specified.
| - Wrong or missing "access" specifier. The "access" specifier
|   sometimes included the SYSCTL type, which it should not, except for
|   procedural SYSCTL nodes.
| - Logical OR where binary OR was expected.
| - Properly assert the "access" argument passed to all SYSCTL macros,
|   using the CTASSERT macro. This applies to both statically and
|   dynamically created SYSCTLs.
| - Properly assert the data type for both static and dynamic SYSCTLs.
|   In the case of static SYSCTLs we only assert that the data pointed
|   to by the SYSCTL data pointer has the correct size, since there is
|   no easy way to assert types in the C language outside a C function.
| - Rewrote some code which doesn't pass a constant "access" specifier
|   when creating dynamic SYSCTL nodes, which is now a requirement.
| - Updated "EXAMPLES" section in SYSCTL manual page.
|
| MFC after: 3 days
| Sponsored by: Mellanox Technologies
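The "logical OR where binary OR was expected" item above is easy to reproduce outside the kernel. The following is a minimal sketch, assuming made-up flag values (`EX_FLAG_RD`, `EX_FLAG_TUN` are hypothetical and are not the real `CTLFLAG_` constants); it only contrasts the two operators and does not reproduce any code from the commit.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration; not the real CTLFLAG_ values. */
#define EX_FLAG_RD	0x1u
#define EX_FLAG_TUN	0x8u

/* Intended behaviour: bitwise OR keeps every flag bit. */
static unsigned
ex_combine_bitwise(unsigned a, unsigned b)
{
	return (a | b);
}

/* The bug class: logical OR yields only 0 or 1, so the bits are lost. */
static unsigned
ex_combine_logical(unsigned a, unsigned b)
{
	return (a || b);
}

/* Small self-check contrasting the two operators. */
static void
ex_check_or(void)
{
	assert(ex_combine_bitwise(EX_FLAG_RD, EX_FLAG_TUN) == 0x9u);
	assert(ex_combine_logical(EX_FLAG_RD, EX_FLAG_TUN) == 1u);
}
```

Calling `ex_check_or()` from a test program passes both assertions: the bitwise form carries both bits, while the logical form collapses to 1, which happens to equal `EX_FLAG_RD` alone in this sketch.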
* Pull in r267961 and r267973 again. Fix for issues reported will follow. | hselasky | 2014-06-28 | 1 | -2/+1
|
* Revert r267961, r267973: | gjb | 2014-06-27 | 1 | -1/+2
|
| These changes prevent sysctl(8) from returning proper output,
| such as:
|
| 1) no output from sysctl(8)
| 2) erroneously returning ENOMEM with tools like truss(1)
|    or uname(1)
|    truss: can not get etype: Cannot allocate memory
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if | hselasky | 2014-06-27 | 1 | -2/+1
| there is an environment variable which shall initialize the SYSCTL
| during early boot. This works for all SYSCTL types, both statically
| and dynamically created ones, except for the SYSCTL NODE type and
| SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been
| added to be used in the case a tunable sysctl has a custom
| initialisation function, allowing the sysctl to still be marked as a
| tunable. The kernel SYSCTL API is mostly the same, with a few
| exceptions for some special operations like iterating the children of
| a static/extern SYSCTL node. This operation should probably be made
| into a factored-out common macro, since some device drivers use it.
| The reason for changing the SYSCTL API was the need for a SYSCTL
| parent OID pointer and not only the SYSCTL parent OID list pointer in
| order to quickly generate the sysctl path. The motivation behind this
| patch is to avoid parameter loading kludges inside the OFED driver
| subsystem. Instead of adding special code to the OFED driver
| subsystem to post-load tunables into dynamically created sysctls, we
| generalize this in the kernel.
|
| Other changes:
| - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
|   to "hw.pcic.intr_mask".
| - Removed redundant TUNABLE statements throughout the kernel.
| - Some minor code rewrites in connection with removing unneeded
|   TUNABLE statements.
| - Added a missing SYSCTL_DECL().
| - Wrapped two very long lines.
| - Avoid malloc()/free() inside sysctl string handling, in case it is
|   called to initialize a sysctl from a tunable, since malloc()/free()
|   is not ready when sysctls from the sysctl dataset are registered.
| - Bumped FreeBSD version to indicate SYSCTL API change.
|
| MFC after: 2 weeks
| Sponsored by: Mellanox Technologies
* Merge GEOM direct dispatch changes from the projects/camlock branch. | mav | 2013-10-22 | 1 | -0/+7
|
| When safety requirements are met, it allows I/O requests to be
| executed directly in the caller context instead of being passed to
| the GEOM g_up/g_down threads. That avoids CPU bottlenecks in the
| g_up/g_down threads and saves several context switches per I/O.
|
| The currently defined safety requirements are:
| - caller should not hold any locks and should be reentrant;
| - callee should not depend on GEOM dual-threaded concurrency
|   semantics;
| - on the way down, if request is unmapped while callee doesn't
|   support it, the context should be sleepable;
| - kernel thread stack usage should be below 50%.
|
| To keep compatibility with GEOM classes not meeting the above
| requirements, new provider and consumer flags were added:
| - G_CF_DIRECT_SEND -- consumer code meets caller requirements
|   (request);
| - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements
|   (done);
| - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
| - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements
|   (request).
| A capable GEOM class can set them, allowing direct dispatch in cases
| where it is safe. If any of the requirements are not met, the request
| is queued to the g_up or g_down thread, same as before.
|
| The following GEOM classes were reviewed and updated to support
| direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP,
| PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, and all classes
| based on the g_slice KPI (LABEL, MAP, FLASHMAP, etc).
|
| To declare direct completion capability, the disk(9) KPI got a new
| flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.
| The da(4) and ada(4) disk drivers now set it, thanks to earlier CAM
| locking work.
|
| This change more than doubles peak block storage performance on
| systems with many CPUs, together with earlier CAM locking changes
| reaching more than 1 million IOPS (512 byte raw reads from 16 SATA
| SSDs on 4 HBAs to 256 user-level threads).
|
| Sponsored by: iXsystems, Inc.
| MFC after: 2 months
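The safety requirements listed in this commit message can be read as a predicate evaluated per request. The following is a rough userland sketch of that reading, not the kernel implementation: the `EX_` flag values, `struct ex_io`, and both function names are hypothetical, and the "caller holds no locks" requirement is assumed to be covered by the class setting its flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the G_CF_ and G_PF_ direct dispatch flags. */
#define EX_CF_DIRECT_SEND	0x1u	/* consumer may call down directly */
#define EX_CF_DIRECT_RECEIVE	0x2u	/* consumer accepts direct completion */
#define EX_PF_DIRECT_SEND	0x1u	/* provider may complete directly */
#define EX_PF_DIRECT_RECEIVE	0x2u	/* provider accepts direct requests */

/* Hypothetical snapshot of the facts the decision depends on. */
struct ex_io {
	unsigned cflags;		/* consumer flags */
	unsigned pflags;		/* provider flags */
	size_t	 stack_used;		/* bytes of kernel stack in use */
	size_t	 stack_size;		/* total kernel stack size */
	bool	 unmapped;		/* request carries unmapped data */
	bool	 callee_unmapped_ok;	/* callee handles unmapped data */
	bool	 sleepable;		/* current context may sleep */
};

/* Stack usage must stay below 50%. */
static bool
ex_stack_ok(const struct ex_io *io)
{
	return (io->stack_used * 2 < io->stack_size);
}

/* Way down (request): consumer is the caller, provider the callee. */
static bool
ex_direct_down(const struct ex_io *io)
{
	if ((io->cflags & EX_CF_DIRECT_SEND) == 0 ||
	    (io->pflags & EX_PF_DIRECT_RECEIVE) == 0)
		return (false);
	if (!ex_stack_ok(io))
		return (false);
	/* Unmapped data the callee cannot take needs a sleepable context. */
	if (io->unmapped && !io->callee_unmapped_ok && !io->sleepable)
		return (false);
	return (true);
}

/* Way up (done): provider is the caller, consumer the callee. */
static bool
ex_direct_up(const struct ex_io *io)
{
	if ((io->pflags & EX_PF_DIRECT_SEND) == 0 ||
	    (io->cflags & EX_CF_DIRECT_RECEIVE) == 0)
		return (false);
	return (ex_stack_ok(io));
}
```

When either predicate returns false, the commit message says the request is queued to the g_down or g_up thread as before, so a class that sets none of the flags keeps its old behaviour.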
* Introduce a kern.geom.notaste sysctl that can be used to temporarily | des | 2013-09-24 | 1 | -0/+4
| disable GEOM tasting to avoid the "bouncing GEOM" problem where, when
| you shut down the consumer of a provider which can be viewed in
| multiple ways (typically a mirror whose members are labeled
| partitions), GEOM will immediately taste that provider's alter ego
| and reattach the consumer.
|
| Approved by: re (glebius)
* Move the three geom kprocs as threads under a single pid. | thompsa | 2011-05-11 | 1 | -46/+25
|
| Reviewed by: julian
* Use g_eventlock to protect against losing wakeups in the g_event process | jh | 2010-11-22 | 1 | -4/+2
| and replace tsleep(9) with msleep(9) which doesn't use a timeout. The
| previously used timeout caused the event process to wake up ten times
| per second on an idle system.
|
| one_event() is now called with the topology lock held and it returns
| with both the topology and event locks held when there are no more
| events in the queue.
|
| Reported by: mav, Marius Nünnerich
| Reviewed by: freebsd-geom
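The lost-wakeup protection described here follows a common pattern: check the queue and go to sleep under the same lock the poster takes, so a wakeup cannot slip in between the check and the sleep, and no polling timeout is needed. The following is a userland analogy using POSIX threads, not the kernel code; `struct ex_eventq`, `ex_post()` and `ex_wait_one()` are hypothetical names, and `pthread_cond_wait()` stands in for msleep(9).

```c
#include <pthread.h>

/* Hypothetical event queue reduced to a counter of pending events. */
struct ex_eventq {
	pthread_mutex_t	lock;		/* plays the role of the event lock */
	pthread_cond_t	cv;		/* sleep channel */
	int		pending;	/* events posted but not yet taken */
};

#define EX_EVENTQ_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Poster: update the queue and signal while holding the lock. */
static void
ex_post(struct ex_eventq *q)
{
	pthread_mutex_lock(&q->lock);
	q->pending++;
	pthread_cond_signal(&q->cv);
	pthread_mutex_unlock(&q->lock);
}

/*
 * Consumer: the emptiness check and the sleep happen under the same lock.
 * pthread_cond_wait() releases the lock and sleeps atomically, so a post
 * cannot be missed, and no timeout is required to recover from one.
 * Returns the number of events still pending after taking one.
 */
static int
ex_wait_one(struct ex_eventq *q)
{
	int left;

	pthread_mutex_lock(&q->lock);
	while (q->pending == 0)
		pthread_cond_wait(&q->cv, &q->lock);
	left = --q->pending;
	pthread_mutex_unlock(&q->lock);
	return (left);
}
```

An idle consumer blocked in `ex_wait_one()` stays asleep until `ex_post()` runs, which mirrors the commit's goal of removing the ten wakeups per second on an idle system. Build with `cc -pthread` when using it from multiple threads.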
* Add sbuf_new_auto as a shortcut for the very common case of creating a | des | 2008-08-09 | 1 | -3/+3
| completely dynamic sbuf.
|
| Obtained from: Varnish
| MFC after: 2 weeks
* Commit 14/14 of sched_lock decomposition. | jeff | 2007-06-05 | 1 | -6/+6
|
| - Use thread_lock() rather than sched_lock for per-thread scheduling
|   synchronization.
| - Use the per-process spinlock rather than the sched_lock for
|   per-process scheduling synchronization.
|
| Tested by: kris, current@
| Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
| Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
* Add sysctl descriptions. | le | 2005-11-25 | 1 | -7/+8
|
* Call g_waitidle() instead of GEOM using the root_mount_hold() KPI. | phk | 2005-04-19 | 1 | -6/+0
|
| GEOM could (and will) get events as a result of drivers coming in
| late so a one-shot method is not good enough for GEOM.
* Add a named reference-count KPI to hold off mounting of the root filesystem. | phk | 2005-04-18 | 1 | -0/+6
|
| While we wait for holds to be released, print a list of who holds us
| back once per second.
|
| Use the new KPI from GEOM instead of vfs_mount.c calling g_waitidle().
|
| Use the new KPI also from ata.
|
| With ATAmkIII's newbusification, ata could narrowly miss the window
| and ad0 would not exist when we tried to mount root.
* Make various random things static | phk | 2005-02-10 | 1 | -2/+2
|
* Stop explicitly touching td_base_pri outside of the scheduler and simply | jhb | 2004-12-30 | 1 | -4/+11
| set a thread's priority via sched_prio() when that is the desired
| action. The schedulers will start managing td_base_pri internally
| shortly.
* Make kern.geom.debugflags sysctl tunable from /boot/loader.conf. | pjd | 2004-09-13 | 1 | -0/+1
| It will help to debug problems when booting.
|
| Approved by: phk
* don't call sbuf_clear() right after sbuf_new(), it is not necessary. | phk | 2004-02-10 | 1 | -3/+0
|
* Sleep on "-" in our normal state to simplify debugging. | phk | 2003-06-18 | 1 | -1/+3
|
* Use __FBSDID(). | obrien | 2003-06-11 | 1 | -2/+3
|
| Approved by: phk
* Fix some easy, global, lint warnings. In most cases, this means | markm | 2003-04-30 | 1 | -1/+1
| making some local variables static. In a couple of cases, this means
| removing an unused variable.
* Introduce a g_waitfor_event() function which posts an event and waits for | phk | 2003-04-23 | 1 | -12/+3
| it to be run (or cancelled) and use this instead of home-rolled
| versions.
* More of the event stuff can now be private to geom_event.c | phk | 2003-04-23 | 1 | -2/+0
|
* Rename g_call_me() to g_post_event(), and give it a flag | phk | 2003-04-23 | 1 | -3/+3
| argument to determine if we can M_WAITOK in malloc.
* Move the shutdown eventhandler stuff to a more logical place. | phk | 2003-04-23 | 1 | -0/+11
|
* Change events to have an array of "void *" references, and give the | phk | 2003-04-02 | 1 | -3/+3
| event posting functions varargs to fill these.
|
| Attribute g_call_me() to appropriate g_geom's where necessary.
|
| Add a flag argument to g_call_me() methods which will be used to
| signal cancellation of events in the future.
|
| This commit should be a no-op.
* Turn /dev/geom.ctl from a GEOM class into a plain character device driver | phk | 2003-03-24 | 1 | -0/+1
| instead, it will never see a disk-I/O transaction, so this is a lot
| simpler.
* Retire the GEOM private statistics code and use devstat instead. | phk | 2003-03-18 | 1 | -4/+0
|
* Implement a bio-taskqueue to reduce number of context switches in | phk | 2003-02-11 | 1 | -10/+0
| disk I/O processing.
|
| The intent is that the disk driver in its hardware interrupt routine
| will simply schedule the bio on the task queue with a routine to
| finish off whatever needs done.
|
| The g_up thread will then schedule this routine, the likely outcome
| of which is a biodone() which queues the bio on g_up's regular queue
| where it will be picked up and processed.
|
| Compared to using the regular taskqueue, this saves one context
| switch.
|
| Change our scheduling of the g_up and g_down queues to be
| water-tight, at the cost of breaking the userland regression
| test-shims.
|
| Input and ideas from: scottl
* Remove another printf which does not say anything we didn't already know. | phk | 2003-02-11 | 1 | -1/+0
|
* Update the statistics collection code to track busy time instead of | phk | 2003-02-09 | 1 | -1/+1
| idle time.
|
| Statistics now default to "on" and can be turned off with
|     sysctl kern.geom.collectstats=0
|
| Performance impact of statistics collection is on the order of
| 800 nsec per consumer/provider set on a 700MHz Athlon.
* Move the g_stat struct to its own .h file, we will export it to other code. | phk | 2003-02-08 | 1 | -0/+2
|
| Instead of embedding a struct g_stat in consumers and providers,
| merely include a pointer.
|
| Remove a couple of <sys/time.h> includes now unneeded.
|
| Add a special allocator for struct g_stat. This allocator will
| allocate entire pages and hand out g_stat functions from there. The
| "id" field indicates free/used status.
|
| Add "/dev/geom.stats" device driver which exports the pages from the
| allocator to userland with mmap(2) in read-only mode.
|
| This mmap(2) interface should be considered a non-public interface
| and the functions in libgeom (not yet committed) should be used to
| access the statistics data.
* Commit the correct copy of the g_stat structure. | phk | 2003-02-07 | 1 | -0/+6
|
| Add debug.sizeof.g_stat sysctl.
|
| Set the id field of the g_stat when we create consumers and providers.
|
| Remove biocount from consumer, we will use the counters in the g_stat
| structure instead. Replace one field which will need to be atomically
| manipulated with two fields which will not (stat.nop and stat.nend).
|
| Add a companion field to bio_children, bio_inbed, for the exact same
| reason.
|
| Don't output the biocount in the confdot output.
|
| Fix KASSERT in g_io_request().
|
| Add sysctl kern.geom.collectstats defaulting to off.
|
| Collect the following raw statistics conditioned on this sysctl:
|
|     for each consumer and provider {
|         total number of operations started.
|         total number of operations completed.
|         time last operation completed.
|         sum of idle-time.
|         for each of BIO_READ, BIO_WRITE and BIO_DELETE {
|             number of operations completed.
|             number of bytes completed.
|             number of ENOMEM errors.
|             number of other errors.
|             sum of transaction time.
|         }
|     }
|
| API for getting hold of these statistics data not included yet.
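The raw statistics listed in this commit message map naturally onto a nested counter structure. The following is a sketch of one possible layout, not the actual g_stat definition: every name prefixed `ex_` is hypothetical, the "time last operation completed" and "sum of idle-time" fields are left out for brevity, and counting bytes only for successful operations is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical operation kinds, mirroring BIO_READ/BIO_WRITE/BIO_DELETE. */
enum ex_op { EX_READ, EX_WRITE, EX_DELETE, EX_NOPS };

/* Per-operation counters, one set per kind of request. */
struct ex_opstat {
	uint64_t nops;		/* operations completed */
	uint64_t nbytes;	/* bytes completed (assumed: successes only) */
	uint64_t nenomem;	/* ENOMEM errors */
	uint64_t nerrors;	/* all other errors */
	uint64_t time_ns;	/* sum of transaction time */
};

/* One of these would exist per consumer and per provider. */
struct ex_stat {
	uint64_t nstarted;	/* total operations started */
	uint64_t ncompleted;	/* total operations completed */
	struct ex_opstat ops[EX_NOPS];
};

/* Record that an operation has been issued. */
static void
ex_stat_start(struct ex_stat *st)
{
	st->nstarted++;
}

/* Record a completion, classifying the outcome by error code. */
static void
ex_stat_complete(struct ex_stat *st, enum ex_op op, uint64_t bytes,
    int error, uint64_t elapsed_ns)
{
	struct ex_opstat *os = &st->ops[op];

	st->ncompleted++;
	os->nops++;
	os->time_ns += elapsed_ns;
	if (error == ENOMEM)
		os->nenomem++;
	else if (error != 0)
		os->nerrors++;
	else
		os->nbytes += bytes;
}
```

Keeping separate started and completed counters, each written from only one side of the I/O path, matches the commit's stated aim of replacing one atomically manipulated field with two fields that need no atomic updates.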
* Fix some sleep strings to make more sense. | phk | 2003-02-07 | 1 | -3/+3
|
* Remove the "ascii" attribute from the sysctls so that "sysctl -a" will | phk | 2002-12-27 | 1 | -3/+3
| skip them.
* Use a mutex assert to document our locking circumstances. | phk | 2002-12-26 | 1 | -0/+3
|
* Fix a cut&past-o. | phk | 2002-12-01 | 1 | -1/+1
|
| Spotted by: yar
| Approved by: re (blanket)
* Add the remaining part of the new libdisk interaction. | phk | 2002-10-28 | 1 | -2/+23
|
| WARNING: This is not a published interface, it is a stopgap measure for
| WARNING: libdisk so we can get 5.0-R out of the door.
|
| Sponsored by: DARPA & NAI Labs
* Reduce the GEOM verbosity under bootverbose to something more sufferable. | phk | 2002-10-25 | 1 | -2/+0
|
| This is not quite the set of information I would want, but the tree
| where I have the "correct" version is messed up with conflicts.
|
| Sponsored by: DARPA & NAI Labs.
* No need to specify CTLTYPE_INT when we use SYSCTL_INT. | phk | 2002-10-20 | 1 | -7/+7
|
* Be consistent and return the NUL at the end of kern.geom.conf{xml,dot}. | phk | 2002-10-17 | 1 | -2/+2
|
| Spotted by: sam
* Properly isolate the locking domains of sysctl from the topology lock | phk | 2002-10-04 | 1 | -17/+25
| for the sysctls which report the configuration.
|
| Sponsored by: DARPA & NAI Labs.
* Move GEOM's sysctls under kern.geom. | phk | 2002-10-02 | 1 | -9/+11
|
| Sponsored by: DARPA & NAI Labs.
* Zero the local-variable mutexes before we call mtx_init() on them, | phk | 2002-09-28 | 1 | -0/+2
| failing to do this may lead mtx_init() to believe they have already
| been initialized.
|
| Detected by: Marc Recht <marc@informatik.uni-bremen.de>
* Style, whitespace and lint fixes. | phk | 2002-09-28 | 1 | -4/+5
|
| Sponsored by: DARPA & NAI Labs.
* Make the UP/DOWN threads hold on to their own private mutex while doing | phk | 2002-09-27 | 1 | -2/+26
| work. This prevents people from sleeping in the UP/DOWN I/O path by
| mistake or design (doing so almost invariably results in deadlocks
| since it stalls all I/O processing in the given direction).
|
| Sponsored by: DARPA & NAI Labs.
* Various no-ops: | phk | 2002-09-27 | 1 | -6/+0
|
| Add a __unused.
|
| Make the 2byte decoder functions return 16 bits for the benefits of
| picky lints.
|
| No need to grab giant around a tsleep() when we have a timeout.
|
| Sponsored by: DARPA & NAI Labs.
* Don't use the static thread.. it is going away. | julian | 2002-06-29 | 1 | -2/+2
|