From 85046579bde15e532983438f86b36856e358f417 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Fri, 20 Jan 2012 14:34:19 -0800 Subject: SHM_UNLOCK: fix long unpreemptible section scan_mapping_unevictable_pages() is used to make SysV SHM_LOCKed pages evictable again once the shared memory is unlocked. It does this with pagevec_lookup()s across the whole object (which might occupy most of memory), and takes 300ms to unlock 7GB here. A cond_resched() every PAGEVEC_SIZE pages would be good. However, KOSAKI-san points out that this is called under shmem.c's info->lock, and it's also under shm.c's shm_lock(), both spinlocks. There is no strong reason for that: we need to take these pages off the unevictable list soonish, but those locks are not required for it. So move the call to scan_mapping_unevictable_pages() from shmem.c's unlock handling up to shm.c's unlock handling. Remove the recently added barrier, not needed now we have spin_unlock() before the scan. Use get_file(), with subsequent fput(), to make sure we have a reference to mapping throughout scan_mapping_unevictable_pages(): that's something that was previously guaranteed by the shm_lock(). Remove shmctl's lru_add_drain_all(): we don't fault in pages at SHM_LOCK time, and we lazily discover them to be Unevictable later, so it serves no purpose for SHM_LOCK; and serves no purpose for SHM_UNLOCK, since pages still on pagevec are not marked Unevictable. The original code avoided redundant rescans by checking VM_LOCKED flag at its level: now avoid them by checking shp's SHM_LOCKED. The original code called scan_mapping_unevictable_pages() on a locked area at shm_destroy() time: perhaps we once had accounting cross-checks which required that, but not now, so skip the overhead and just let inode eviction deal with them. Put check_move_unevictable_page() and scan_mapping_unevictable_pages() under CONFIG_SHMEM (with stub for the TINY case when ramfs is used), more as comment than to save space; comment them used for SHM_UNLOCK. Signed-off-by: Hugh Dickins Reviewed-by: KOSAKI Motohiro Cc: Minchan Kim Cc: Rik van Riel Cc: Shaohua Li Cc: Eric Dumazet Cc: Johannes Weiner Cc: Michel Lespinasse Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- ipc/shm.c | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) (limited to 'ipc/shm.c') diff --git a/ipc/shm.c b/ipc/shm.c index 02ecf2c..854ab58 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -870,9 +870,7 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf) case SHM_LOCK: case SHM_UNLOCK: { - struct file *uninitialized_var(shm_file); - - lru_add_drain_all(); /* drain pagevecs to lru lists */ + struct file *shm_file; shp = shm_lock_check(ns, shmid); if (IS_ERR(shp)) { @@ -895,22 +893,31 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf) err = security_shm_shmctl(shp, cmd); if (err) goto out_unlock; - - if(cmd==SHM_LOCK) { + + shm_file = shp->shm_file; + if (is_file_hugepages(shm_file)) + goto out_unlock; + + if (cmd == SHM_LOCK) { struct user_struct *user = current_user(); - if (!is_file_hugepages(shp->shm_file)) { - err = shmem_lock(shp->shm_file, 1, user); - if (!err && !(shp->shm_perm.mode & SHM_LOCKED)){ - shp->shm_perm.mode |= SHM_LOCKED; - shp->mlock_user = user; - } + err = shmem_lock(shm_file, 1, user); + if (!err && !(shp->shm_perm.mode & SHM_LOCKED)) { + shp->shm_perm.mode |= SHM_LOCKED; + shp->mlock_user = user; } - } else if (!is_file_hugepages(shp->shm_file)) { - shmem_lock(shp->shm_file, 0, shp->mlock_user); - shp->shm_perm.mode &= ~SHM_LOCKED; - shp->mlock_user = NULL; + goto out_unlock; } + + /* SHM_UNLOCK */ + if (!(shp->shm_perm.mode & SHM_LOCKED)) + goto out_unlock; + shmem_lock(shm_file, 0, shp->mlock_user); + shp->shm_perm.mode &= ~SHM_LOCKED; + shp->mlock_user = NULL; + get_file(shm_file); shm_unlock(shp); + scan_mapping_unevictable_pages(shm_file->f_mapping); + fput(shm_file); goto out; } case IPC_RMID: -- cgit v1.1 From 245132643e1cfcd145bbc86a716c1818371fcb93 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Fri, 20 Jan 2012 14:34:21 -0800 Subject: SHM_UNLOCK: fix Unevictable pages stranded after swap Commit cc39c6a9bbde ("mm: account skipped entries to avoid looping in find_get_pages") correctly fixed an infinite loop; but left a problem that find_get_pages() on shmem would return 0 (appearing to callers to mean end of tree) when it meets a run of nr_pages swap entries. The only uses of find_get_pages() on shmem are via pagevec_lookup(), called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's scan_mapping_unevictable_pages(). The first is already commented, and not worth worrying about; but the second can leave pages on the Unevictable list after an unusual sequence of swapping and locking. Fix that by using shmem_find_get_pages_and_swap() (then ignoring the swap) instead of pagevec_lookup(). But I don't want to contaminate vmscan.c with shmem internals, nor shmem.c with LRU locking. So move scan_mapping_unevictable_pages() into shmem.c, renaming it shmem_unlock_mapping(); and rename check_move_unevictable_page() to check_move_unevictable_pages(), looping down an array of pages, oftentimes under the same lock. Leave out the "rotate unevictable list" block: that's a leftover from when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed handling involved looking at pages at tail of LRU. Was there significance to the sequence first ClearPageUnevictable, then test page_evictable, then SetPageUnevictable here? I think not, we're under LRU lock, and have no barriers between those. Signed-off-by: Hugh Dickins Reviewed-by: KOSAKI Motohiro Cc: Minchan Kim Cc: Rik van Riel Cc: Shaohua Li Cc: Eric Dumazet Cc: Johannes Weiner Cc: Michel Lespinasse Cc: [back to 3.1 but will need respins] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- ipc/shm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'ipc/shm.c') diff --git a/ipc/shm.c b/ipc/shm.c index 854ab58..b76be5b 100644 --- a/ipc/shm.c +++ b/ipc/shm.c @@ -916,7 +916,7 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf) shp->mlock_user = NULL; get_file(shm_file); shm_unlock(shp); - scan_mapping_unevictable_pages(shm_file->f_mapping); + shmem_unlock_mapping(shm_file->f_mapping); fput(shm_file); goto out; } -- cgit v1.1