diff options
author | ken <ken@FreeBSD.org> | 2012-12-07 23:48:54 +0000 |
---|---|---|
committer | ken <ken@FreeBSD.org> | 2012-12-07 23:48:54 +0000 |
commit | b8327e28a33a6bb69cb12171c683fc2756305c0a (patch) | |
tree | 28c4c730a9cb2bfa02e5897d3d92e578e4ee1755 /sys/cam/cam_xpt.c | |
parent | 24b0c94606e6b293ab6908adf7e5489f94a6190b (diff) | |
download | FreeBSD-src-b8327e28a33a6bb69cb12171c683fc2756305c0a.zip FreeBSD-src-b8327e28a33a6bb69cb12171c683fc2756305c0a.tar.gz |
Fix a panic during CAM EDT traversal.
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtravsere(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
Diffstat (limited to 'sys/cam/cam_xpt.c')
-rw-r--r-- | sys/cam/cam_xpt.c | 16 |
1 files changed, 2 insertions, 14 deletions
diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c index b184dd1..304d8c7 100644 --- a/sys/cam/cam_xpt.c +++ b/sys/cam/cam_xpt.c @@ -2178,8 +2178,8 @@ xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, * invalidated, but not peripherals that are scheduled to * be freed. So instead of calling cam_periph_acquire(), * which will fail if the periph has been invalidated, we - * just check for the free flag here. If it is free, we - * skip to the next periph. + * just check for the free flag here. If it is in the + * process of being freed, we skip to the next periph. */ if (periph->flags & CAM_PERIPH_FREE) { next_periph = SLIST_NEXT(periph, periph_links); @@ -2192,16 +2192,9 @@ xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, */ periph->refcount++; - xpt_unlock_buses(); - retval = tr_func(periph, arg); /* - * We need the lock for list traversal. - */ - xpt_lock_buses(); - - /* * Grab the next peripheral before we release this one, so * our next pointer is still valid. */ @@ -2283,11 +2276,6 @@ xptpdperiphtraverse(struct periph_driver **pdrv, */ periph->refcount++; - /* - * XXX KDM we have the toplogy lock here, but in - * xptperiphtraverse(), we drop it before calling the - * traversal function. Which is correct? - */ retval = tr_func(periph, arg); /* |