author | Jan Kara <jack@suse.cz> | 2016-05-12 18:29:18 +0200 |
---|---|---|
committer | Ross Zwisler <ross.zwisler@linux.intel.com> | 2016-05-19 15:20:54 -0600 |
commit | ac401cc782429cc8560ce4840b1405d603740917 | |
tree | 44deea39b147b4f2e75286943e2ec1c838e7a2fa /mm/filemap.c | |
parent | 4f622938a5e2b7f1374ffb1e5fc212744898f513 | |
dax: New fault locking
Currently DAX page fault locking is racy.

CPU0 (write fault)                                CPU1 (read fault)

__dax_fault()                                     __dax_fault()
  get_block(inode, block, &bh, 0) -> not mapped
                                                    get_block(inode, block, &bh, 0)
                                                      -> not mapped
  if (!buffer_mapped(&bh))
    if (vmf->flags & FAULT_FLAG_WRITE)
      get_block(inode, block, &bh, 1) -> allocates blocks
  if (page) -> no
                                                    if (!buffer_mapped(&bh))
                                                      if (vmf->flags & FAULT_FLAG_WRITE) {
                                                      } else {
                                                        dax_load_hole();
                                                      }
  dax_insert_mapping()

At this point CPU0 fails in dax_radix_entry() with -EIO: CPU1 has
already instantiated a hole page at the index where CPU0 tries to
insert its DAX entry.
Another problem with the current DAX page fault locking is that there is
no race-free way to clear the dirty tag in the radix tree. We can always
end up with a clean radix tree and dirty data in the CPU cache.
We fix the first problem by introducing locking of exceptional radix
tree entries in DAX mappings. The entry lock acts very similarly to the
page lock and thus properly synchronizes faults against the same mapping
index. The same lock can later be used to avoid races when clearing the
radix tree dirty tag.
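
The entry-locking idea can be illustrated with a minimal userspace sketch.
This is not the kernel implementation: struct entry_slot, ENTRY_LOCK and the
slot_*() helpers are hypothetical names, and a pthread mutex and condition
variable stand in for mapping->tree_lock and the wait queue. It only assumes
that one bit of the entry value is reserved as a lock bit, and that unlocking
wakes any fault sleeping on the same index.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the lowest bit of an entry value is reserved as the lock bit. */
#define ENTRY_LOCK ((uintptr_t)1)

/* Hypothetical stand-in for one radix tree slot of a DAX mapping. */
struct entry_slot {
	uintptr_t entry;           /* entry value, lock bit included */
	pthread_mutex_t tree_lock; /* plays the role of mapping->tree_lock */
	pthread_cond_t waitq;      /* plays the role of the entry wait queue */
};

static void slot_init(struct entry_slot *s, uintptr_t entry)
{
	s->entry = entry & ~ENTRY_LOCK;
	pthread_mutex_init(&s->tree_lock, NULL);
	pthread_cond_init(&s->waitq, NULL);
}

/* Report whether the lock bit is currently set. */
static bool slot_is_locked(struct entry_slot *s)
{
	bool locked;

	pthread_mutex_lock(&s->tree_lock);
	locked = (s->entry & ENTRY_LOCK) != 0;
	pthread_mutex_unlock(&s->tree_lock);
	return locked;
}

/* Take the entry lock without sleeping; returns false if it is held. */
static bool slot_trylock(struct entry_slot *s)
{
	bool got;

	pthread_mutex_lock(&s->tree_lock);
	got = !(s->entry & ENTRY_LOCK);
	if (got)
		s->entry |= ENTRY_LOCK;
	pthread_mutex_unlock(&s->tree_lock);
	return got;
}

/* Sleep until the entry is unlocked, then lock it (like lock_page()). */
static void slot_lock(struct entry_slot *s)
{
	pthread_mutex_lock(&s->tree_lock);
	while (s->entry & ENTRY_LOCK)
		pthread_cond_wait(&s->waitq, &s->tree_lock);
	s->entry |= ENTRY_LOCK;
	pthread_mutex_unlock(&s->tree_lock);
}

/* Drop the entry lock and wake faults waiting on the same index. */
static void slot_unlock(struct entry_slot *s)
{
	pthread_mutex_lock(&s->tree_lock);
	assert(s->entry & ENTRY_LOCK); /* must be held by the caller */
	s->entry &= ~ENTRY_LOCK;
	pthread_cond_broadcast(&s->waitq);
	pthread_mutex_unlock(&s->tree_lock);
}
```

In this sketch a fault handler would call slot_lock() before looking up or
allocating blocks for an index and slot_unlock() once the mapping is
installed, so two faults on the same index can no longer interleave the way
the CPU0/CPU1 trace above shows.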
Reviewed-by: NeilBrown <neilb@suse.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Diffstat (limited to 'mm/filemap.c')
-rw-r--r-- | mm/filemap.c | 9 |
1 files changed, 7 insertions, 2 deletions
diff --git a/mm/filemap.c b/mm/filemap.c
index dfe55c2..7b9a4b1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -160,13 +160,15 @@ static void page_cache_tree_delete(struct address_space *mapping,
 			return;
 
 		/*
-		 * Track node that only contains shadow entries.
+		 * Track node that only contains shadow entries. DAX mappings contain
+		 * no shadow entries and may contain other exceptional entries so skip
+		 * those.
 		 *
 		 * Avoid acquiring the list_lru lock if already tracked. The
 		 * list_empty() test is safe as node->private_list is
 		 * protected by mapping->tree_lock.
 		 */
-		if (!workingset_node_pages(node) &&
+		if (!dax_mapping(mapping) && !workingset_node_pages(node) &&
 		    list_empty(&node->private_list)) {
 			node->private_data = mapping;
 			list_lru_add(&workingset_shadow_nodes, &node->private_list);
@@ -611,6 +613,9 @@ static int page_cache_tree_insert(struct address_space *mapping,
 			/* DAX accounts exceptional entries as normal pages */
 			if (node)
 				workingset_node_pages_dec(node);
+			/* Wakeup waiters for exceptional entry lock */
+			dax_wake_mapping_entry_waiter(mapping, page->index,
+						      false);
 		}
 	}
 	radix_tree_replace_slot(slot, page);