diff options
author | Christoph Lameter <clameter@sgi.com> | 2007-10-16 01:26:08 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@woody.linux-foundation.org> | 2007-10-16 09:43:01 -0700 |
commit | 4c93c355d5d563f300df7e61ef753d7a064411e9 (patch) | |
tree | 24bcdbed58a51c69640da9c8e220dd5ce0c054a7 /include/linux/slub_def.h | |
parent | ee3c72a14bfecdf783738032ff3c73ef6412f5b3 (diff) | |
download | op-kernel-dev-4c93c355d5d563f300df7e61ef753d7a064411e9.zip op-kernel-dev-4c93c355d5d563f300df7e61ef753d7a064411e9.tar.gz |
SLUB: Place kmem_cache_cpu structures in a NUMA aware way
The kmem_cache_cpu structures introduced are currently an array placed in the
kmem_cache struct. Meaning the kmem_cache_cpu structures are overwhelmingly
on the wrong node for systems with a higher amount of nodes. These are
performance critical structures since the per node information has
to be touched for every alloc and free in a slab.
In order to place the kmem_cache_cpu structure optimally we put an array
of pointers to kmem_cache_cpu structs in kmem_cache (similar to SLAB).
However, the kmem_cache_cpu structures can now be allocated in a more
intelligent way.
We would like to put per cpu structures for the same cpu but different
slab caches in cachelines together to save space and decrease the cache
footprint. However, the slab allocators itself control only allocations
per node. We set up a simple per cpu array for every processor with
100 per cpu structures which is usually enough to get them all set up right.
If we run out then we fall back to kmalloc_node. This also solves the
bootstrap problem since we do not have to use slab allocator functions
early in boot to get memory for the small per cpu structures.
Pro:
- NUMA aware placement improves memory performance
- All global structures in struct kmem_cache become readonly
- Dense packing of per cpu structures reduces cacheline
footprint in SMP and NUMA.
- Potential avoidance of exclusive cacheline fetches
on the free and alloc hotpath since multiple kmem_cache_cpu
structures are in one cacheline. This is particularly important
for the kmalloc array.
Cons:
- Additional reference to one read only cacheline (per cpu
array of pointers to kmem_cache_cpu) in both slab_alloc()
and slab_free().
[akinobu.mita@gmail.com: fix cpu hotplug offline/online path]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "Pekka Enberg" <penberg@cs.helsinki.fi>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux/slub_def.h')
-rw-r--r-- | include/linux/slub_def.h | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index 92e10cf..f74716b 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -16,8 +16,7 @@ struct kmem_cache_cpu { struct page *page; int node; unsigned int offset; - /* Lots of wasted space */ -} ____cacheline_aligned_in_smp; +}; struct kmem_cache_node { spinlock_t list_lock; /* Protect partial list and nr_partial */ @@ -62,7 +61,11 @@ struct kmem_cache { int defrag_ratio; struct kmem_cache_node *node[MAX_NUMNODES]; #endif - struct kmem_cache_cpu cpu_slab[NR_CPUS]; +#ifdef CONFIG_SMP + struct kmem_cache_cpu *cpu_slab[NR_CPUS]; +#else + struct kmem_cache_cpu cpu_slab; +#endif }; /* |