summaryrefslogtreecommitdiffstats
path: root/include/linux/drbd.h
Commit message (Collapse)AuthorAgeFilesLines
* drbd: Minor cleanupsAndreas Gruenbacher2014-11-101-1/+1
| | | | | | | | | | | . Update comments . drbd_set_{in,out_of}_sync(): Remove unused parameters . Move common code into adm_del_resource() . Redefine ERR_MINOR_EXISTS -> ERR_MINOR_OR_VOLUME_EXISTS Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* drbd: silence underflow warning in read_in_block()Dan Carpenter2014-07-101-1/+1
| | | | | | | | | | My static checker warns that "data_size" could be negative and underflow the limit check. The code looks suspicious but I don't know if it is a real bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: device->ldev is not guaranteed on an D_ATTACHING diskPhilipp Reisner2014-07-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some parts of the code assumed that get_ldev_if_state(device, D_ATTACHING) is sufficient to access the ldev member of the device object. That was wrong. ldev may not be there or might be freed at any time if the device has a disk state of D_ATTACHING. bm_rw() Documented that drbd_bm_read() is only called from drbd_adm_attach. drbd_bm_write() is only called when a reference is held, and it is documented that a caller has to hold a reference before calling drbd_bm_write() drbd_bm_write_page() Use get_ldev() instead of get_ldev_if_state(device, D_ATTACHING) drbd_bmio_set_n_write() No longer use get_ldev_if_state(device, D_ATTACHING). All callers hold a reference to ldev now. drbd_bmio_clear_n_write() All callers where holding a reference of ldev anyways. Remove the misleading get_ldev_if_state(device, D_ATTACHING) drbd_reconsider_max_bio_size() Removed the get_ldev_if_state(device, D_ATTACHING). All callers now pass a struct drbd_backing_dev* when they have a proper reference, or a NULL pointer. Before this fix, the receiver could trigger a NULL pointer deref when in drbd_reconsider_max_bio_size() drbd_bump_write_ordering() Used get_ldev_if_state(device, D_ATTACHING) with the wrong assumption. Remove it, and allow the caller to pass in a struct drbd_backing_dev* when the caller knows that accessing this bdev is safe. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Move string function prototypes from linux/drbd.h to drbd_string.hAndreas Gruenbacher2014-02-171-6/+0
| | | | | Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
* drbd: Define the size of res_opts->cpu_mask in a single placeAndreas Gruenbacher2014-02-171-0/+2
| | | | | Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
* drbd: Allow online change of al-stripes and al-stripe-sizePhilipp Reisner2013-06-281-1/+5
| | | | | | | | | | | | | | | | | | | | | Allow to change the AL layout with an resize operation. For that the reisze command gets two new fields: al_stripes and al_stripe_size. In order to make the operation crash save: 1) Lock out all IO and MD-IO 2) Write the super block with MDF_PRIMARY_IND clear 3) write the bitmap to the new location (all zeros, since we allow only while connected) 4) Initialize the new AL-area 5) Write the super block with the restored MDF_PRIMARY_IND. 6) Unfreeze all IO Since the AL-layout has no influence on the protocol, this operation needs to be beforemed on both sides of a resource (if intended). Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* drbd: use sched_setscheduler()Philipp Reisner2013-03-281-1/+1
| | | | | | | | | | It was unnoticed for some time that assigning to current->policy is no longer sufficient to set a real time priority for a kernel thread. Reported-by: Charlie Suffin <Charlie.Suffin@stratus.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* drbd: Fix disconnect to keep the peer disk state if connection breaks during ↵Philipp Reisner2013-03-281-1/+2
| | | | | | | | | | | | | | | | | | operation The issue was that if the connection broke while we did the gracefull state change to C_DISCONNECTING (C_TEARDOWN), then we returned a success code from the state engine. (SS_CW_NO_NEED) The result of that is that we missed to call the fence-peer script in such a case. Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN). This one should never reach back into user space. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* drbd: Broadcast sync progress no more often than once per secondPhilipp Reisner2012-11-091-2/+2
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: disambiguation, s/ERR_DISCARD/ERR_DISCARD_IMPOSSIBLE/Lars Ellenberg2012-11-091-1/+1
| | | | | | | | | | | | | | | If for some reason (typically "split-brained" cluster manager) drbd replica data has diverged, we can chose a victim, and reconnect using "--discard-my-data", causing the victim to become sync-target, fetching all changed blocks from the peer. If we are Primary, we are potentially in use, and we refuse to "roll back" changes to the data below the page cache and other users. Rename the error symbol for this to ERR_DISCARD_IMPOSSIBLE. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: introduce stop-sector to online verifyLars Ellenberg2012-11-091-1/+1
| | | | | | | | | | | | | | | | We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Had to bump the protocol version differently, we are now 101. Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; } Also, the return value convention for worker callbacks has changed, we returned "true/false" for "keep the connection up" in 8.3, we return 0 for success and <= for failure in 8.4. Affected: receive_state() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: New disk option al-updatesPhilipp Reisner2012-11-081-0/+1
| | | | | | | | | | By disabling al-updates one might increase performace. The price for that is that in case a crashed primary (that had al-updates disabled) is reintegraded, it will receive a full-resync instead of a bitmap based resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Load balancing method: stripingPhilipp Reisner2012-11-081-0/+6
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Load balancing of read requestsPhilipp Reisner2012-11-081-0/+8
| | | | | | | | New config option for the disk secition "read-balancing", with the values: prefer-local, prefer-remote, round-robin, when-congested-remote. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: on attach, enforce clean meta dataLars Ellenberg2012-11-081-1/+9
| | | | | | | | | | | | | | | | | | | | | Detection of unclean shutdown has moved into user space. The kernel code will, whenever it updates the meta data, mark it as "unclean", and will refuse to attach to such unclean meta data. "drbdadm up" now schedules "drbdmeta apply-al", which will apply the activity log to the bitmap, and/or reinitialize it, if necessary, as well as set a "clean" indicator flag. This moves a bit code out of kernel space. As a side effect, it also prevents some 8.3 module from accidentally ignoring the 8.4 style activity log, if someone should downgrade, whether on purpose, or accidentally because he changed kernel versions without providing an 8.4 for the new kernel, and the new kernel comes with in-tree 8.3. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Use the terminology suggested by the command names in the source code ↵Andreas Gruenbacher2012-11-081-2/+2
| | | | | | | and messages Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: spelling fix: too smallLars Ellenberg2012-11-081-2/+2
| | | | | | | It is not "to small", but "too small". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Refuse to change network options online when...Philipp Reisner2012-11-081-0/+1
| | | | | | | | | | | | | | * the peer does not speak protocol_version 100 and the user wants to change one of: - wire_protocol - two_primaries - integrity_alg * the user wants to remove the allow_two_primaries flag when there are two primaries Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Fix the upper limit of resync-afterAndreas Gruenbacher2012-11-081-2/+2
| | | | | | | | | | The 32-bit resync_after netlink field takes a device minor number as parameter, which is no longer limited to 255. We cannot statically verify which device numbers are valid, so set the ummer limit to the highest possible signed 32-bit integer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Allow online change of replication protocol only with agreed_pv >= 100Philipp Reisner2012-11-081-0/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Introduce protocol version 100 headersAndreas Gruenbacher2012-11-081-0/+1
| | | | | | | | | | | | | | | | | | The 8 byte header finally becomes too small. With the protocol 100 header we have 16 bit for the volume number, proper 32 bit for the data length, and 32 bit for further extensions in the future. Previous versions of drbd are using version 80 headers for all packets short enough for protocol 80. They support both header versions in worker context, but only version 80 headers in asynchronous context. For backwards compatibility, continue to use version 80 headers for short packets before protocol version 100. From protocol version 100 on, use the same header version for all packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Converted drbd_try_outdate_peer() from mdev to tconnPhilipp Reisner2012-11-081-1/+2
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Allow volumes to become primary only on one sidePhilipp Reisner2012-11-041-1/+2
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: switch configuration interface from connector to genetlinkLars Ellenberg2012-11-041-34/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Implemented new commands to create/delete connections/minorsPhilipp Reisner2011-10-141-0/+3
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Preparing the connector interface to operator on connectionsPhilipp Reisner2011-10-141-4/+15
| | | | | | | | Up to now it only operated on minor numbers. Now it can work also on named connections. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: new on-disk activity log transaction formatLars Ellenberg2011-10-141-0/+4
| | | | | | | | | | | Use a new on-disk transaction format for the activity log, which allows for multiple changes to the active set per transaction. Using 4k transaction blocks, we can now get rid of the work-around code to deal with devices not supporting 512 byte logical block size. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Use new header layoutPhilipp Reisner2011-09-281-1/+1
| | | | | | | | | | | The new header layout will only be used if the peer supports it of course. For the first packet and the handshake packet the old (h80) layout is used for compatibility reasons. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Get rid of BE_DRBD_MAGIC and BE_DRBD_MAGIC_BIGAndreas Gruenbacher2011-08-251-2/+0
| | | | | | | Converting the constants happens at compile time. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Fix spellingBart Van Assche2011-05-241-4/+4
| | | | | | | | Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
* drbd: fix schedule in atomicLars Ellenberg2011-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occuring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet an other work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* Fix common misspellingsLucas De Marchi2011-03-311-2/+2
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* drbd: Remove unused function atodb_endio()Andreas Gruenbacher2011-03-101-1/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Provide hints with the error message when clearing the sync pause flagPhilipp Reisner2011-03-101-0/+2
| | | | | | | | When the user clears the sync-pause flag, and sync stays in pause state, give hints to the user, why it still is in pause state. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Rename enum drbd_state_ret_codes to enum drbd_state_rvAndreas Gruenbacher2011-03-101-2/+2
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Rename enum drbd_ret_codes to enum drbd_ret_codeAndreas Gruenbacher2011-03-101-1/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNCPhilipp Reisner2011-03-101-1/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Implemented two new connection states Ahead/BehindPhilipp Reisner2011-03-101-0/+4
| | | | | | | | In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: New configuration parameters for dealing with network congestionPhilipp Reisner2011-03-101-0/+7
| | | | | | | | | | | net { on_congestion {block|pull-ahead|disconnect}; congestion-fill {sectors}; congestion-extents {al-extents}; } Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Silenced an assertPhilipp Reisner2010-10-221-1/+1
| | | | | | | That assertion's condition needed adjustment for today's semantics Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: add race-breaker to drbd_go_disklessLars Ellenberg2010-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds a necessary race breaker to these commits: drbd: fix for possible deadlock on IO error during resync drbd: drop wrong debug asserts, fix recently introduced race What we do is get a refcount, check the state, then depending on the state and the requested minimum disk state, either hold it (success), or give it back immediately (failed "try lock"). Some code paths (flushing of drbd metadata) may still grab and hold a refcount even if we are D_FAILED (application IO won't). So even if we hit local_cnt == 0 once after being D_FAILED, we still need to wait for that again after we changed to D_DISKLESS. Once local_cnt reaches 0 while we are D_DISKLESS, we can be sure that no one will look at the protected members anymore, so only then is it safe to free them. We cannot easily convert to standard locking primitives here, as we want to be able to use it in atomic context (we always do a "try lock"), as well as hold references for a "long time" (from IO submission to completion callback). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: fix unlikely access after free and list corruptionLars Ellenberg2010-10-141-2/+2
| | | | | | | | | | | Various cleanup paths have been incomplete, for the very unlikely case that we cannot allocate enough bios from process context when submitting on behalf of the peer or resync process. Never observed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Track the reasons to suspend IO in dedicated state bitsPhilipp Reisner2010-10-141-3/+7
| | | | | | | | | | | | | | | | | | | There are three ways to get IO suspended: * Loss of any access to data * Fence-peer-handler running * User requested to suspend IO Track those in different bits, so that one condition clearing its state bit does not interfere with the other two conditions. Only when the user resumes IO he overrules all three bits. The fact is hidden from the user, he sees only a single suspend bit. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Sending of big packets, for payloads from 64KByte to 4GBytePhilipp Reisner2010-10-141-0/+2
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Do not allow a fencing-policy of resource-and-stonith with protocol APhilipp Reisner2010-10-141-0/+1
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Finished the "on-no-data-accessible suspend-io;" functionalityPhilipp Reisner2010-10-141-0/+5
| | | | | | | | | When no data is accessible (no connection to the peer, nor a local disk) allow the user to select to freeze all IO operations instead of getting IO errors. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: revert "delay probes", feature is being re-implemented differentlyLars Ellenberg2010-08-071-1/+1
| | | | | | | | | | | | | | | | It was a now abandoned attempt to throttle resync bandwidth based on the delay it causes on the bulk data socket. It has no userbase yet, and has been disabled by 9173465ccb51c09cc3102a10af93e9f469a0af6f already. This removes the now unused code. The basic feature, namely using up "idle" bandwith of network and disk IO subsystem, with minimal impact to application IO, is being reimplemented differently. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* drbd: Fixed a race between disk-attach and unexpected state changesPhilipp Reisner2010-06-141-1/+1
| | | | | | | | | | | | | This was a very hard to trigger race condition. If we got a state packet from the peer, after drbd_nl_disk() has already changed the disk state to D_NEGOTIATING but after_state_ch() was not yet run by the worker, then receive_state() might called drbd_sync_handshake(), which in turn crashed when accessing p_uuid. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* Preparing 8.3.8rc2Philipp Reisner2010-06-011-1/+1
| | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* drbd: This is now equivalent to drbd release 8.3.8rc1Philipp Reisner2010-05-211-2/+2
| | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
OpenPOWER on IntegriCloud