author | green <green@FreeBSD.org> | 2004-07-19 06:21:27 +0000 |
---|---|---|
committer | green <green@FreeBSD.org> | 2004-07-19 06:21:27 +0000 |
commit | c4b4a8f0485fc72cdb96016e2461ed2f31eece8e (patch) | |
tree | 44be1f3417ec13bd5a460473ed5ac140766f30cd /sys/kern | |
parent | 15d7825a5f77eea2647e578771fa0533edc0863d (diff) | |
Reimplement contigmalloc(9) with an algorithm which stands a greatly-
improved chance of working despite pressure from running programs.
Instead of trying to throw a bunch of pages out to swap and hope for
the best, only a range that can potentially fulfill contigmalloc(9)'s
request will have its contents paged out (potentially, not forcibly)
at a time.
The new contigmalloc operation still operates in three passes, but it
could potentially be tuned to use more or fewer. The first pass only looks
at pages in the cache and free pages, so they would be thrown out
without having to block. If this is not enough, the subsequent passes
page out any unwired memory. To combat memory pressure refragmenting
the section of memory being laundered, each page is removed from the
system's free memory queue once it has been freed so that blocking
later doesn't cause the memory laundered so far to get reallocated.
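The pass structure described above can be modelled in userland as follows. This is only a sketch under stated assumptions: the `page_state` enum, `page_reclaimable()`, `find_contig_run()` and `contig_alloc_model()` are hypothetical names invented for illustration, and an array scan stands in for the kernel's page queues and page-out machinery. The first pass accepts only free and cached pages; later passes also accept in-use pages that are not wired; a run that is found is marked as claimed so that it cannot be handed out again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page states for this model; not the kernel's vm_page flags. */
enum page_state { PS_FREE, PS_CACHE, PS_INUSE, PS_WIRED, PS_CLAIMED };

/*
 * Return nonzero if a page in state `st` may be taken during `pass`.
 * Pass 0 takes only free and cached pages (no blocking would be needed);
 * later passes also take in-use pages, which stand for unwired memory
 * that would have to be paged out first.
 */
static int
page_reclaimable(enum page_state st, int pass)
{
	switch (st) {
	case PS_FREE:
	case PS_CACHE:
		return (1);
	case PS_INUSE:
		return (pass > 0);
	default:
		return (0);	/* wired or already claimed */
	}
}

/*
 * Find the first run of `npages` consecutive reclaimable pages.
 * Returns the starting index, or -1 if this pass finds no such run.
 */
static long
find_contig_run(const enum page_state *pages, size_t total, size_t npages,
    int pass)
{
	size_t i, run = 0;

	if (npages == 0 || npages > total)
		return (-1);
	for (i = 0; i < total; i++) {
		if (page_reclaimable(pages[i], pass)) {
			if (++run == npages)
				return ((long)(i + 1 - npages));
		} else
			run = 0;
	}
	return (-1);
}

/* Try three passes in turn and claim the first run that fits. */
static long
contig_alloc_model(enum page_state *pages, size_t total, size_t npages)
{
	size_t i;
	long start;
	int pass;

	assert(pages != NULL);
	for (pass = 0; pass < 3; pass++) {
		start = find_contig_run(pages, total, npages, pass);
		if (start < 0)
			continue;
		/* Take the pages out of circulation so they stay contiguous. */
		for (i = 0; i < npages; i++)
			pages[start + i] = PS_CLAIMED;
		return (start);
	}
	return (-1);
}
```

In this model the second and third passes behave identically; the real code may distinguish them in ways the commit message does not spell out.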
The page-out operations are now blocking, as it would make little sense
to try to push out a page, then get its status immediately afterward
to remove it from the available free pages queue, if it's unlikely to
have been freed. Another change is that if KVA allocation fails, the
allocated memory segment will be freed and not leaked.
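The leak fix mentioned above follows a common error-path pattern: if a later step fails, release what an earlier step acquired before returning. The sketch below illustrates that pattern in userland only. `segment_alloc()`, `segment_map()`, `segment_release()` and the `live_segments` counter are hypothetical stand-ins, with `malloc()` playing the role of the physical allocation and a flag simulating KVA exhaustion; none of this is the kernel code itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of segments allocated but not yet released (model only). */
static int live_segments;

/* Stand-in for obtaining the physical memory segment. */
static void *
segment_alloc(size_t size)
{
	void *seg = malloc(size);

	if (seg != NULL)
		live_segments++;
	return (seg);
}

/* Stand-in for releasing the physical memory segment. */
static void
segment_release(void *seg)
{
	free(seg);
	live_segments--;
}

/* Stand-in for the KVA mapping step; fails when `can_map` is zero. */
static void *
segment_map(void *seg, int can_map)
{
	return (can_map ? seg : NULL);
}

/* Allocate and map; on mapping failure, release instead of leaking. */
static void *
alloc_and_map(size_t size, int can_map)
{
	void *seg, *va;

	seg = segment_alloc(size);
	if (seg == NULL)
		return (NULL);
	va = segment_map(seg, can_map);
	if (va == NULL) {
		segment_release(seg);	/* the step the old code skipped */
		return (NULL);
	}
	return (va);
}
```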
There is a sysctl/tunable, vm.old_contigmalloc, defaulting to on, which
causes the old contigmalloc() algorithm to be used. Nonetheless, I have been using
vm.old_contigmalloc=0 for over a month. It is safe to switch at
run-time to see the difference it makes.
A new interface has been used which does not require mapping the
allocated pages into KVA: vm_page.h functions vm_page_alloc_contig()
and vm_page_release_contig(). These are what vm.old_contigmalloc=0
uses internally, so the sysctl/tunable does not affect their operation.
When using the contigmalloc(9) and contigfree(9) interfaces, memory
is now tracked with malloc(9) stats. Several functions have been
exported from kern_malloc.c to allow other subsystems to use these
statistics, as well. This invalidates the BUGS section of the
contigmalloc(9) manpage.
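The accounting that the exported functions perform can be tried out in userland with the sketch below. It mirrors the logic of malloc_type_allocated() and malloc_type_freed() from the diff, but everything else is an assumption made for illustration: `struct type_stats`, `TYPE_STATS_INIT`, `type_stats_allocated()` and `type_stats_freed()` are hypothetical names, a pthread mutex stands in for ks_mtx, assert() stands in for KASSERT, and the per-zone ks_size bitmask is left out.

```c
#include <assert.h>
#include <pthread.h>

/* Userland stand-in for the statistics fields of struct malloc_type. */
struct type_stats {
	pthread_mutex_t	mtx;		/* stands in for ks_mtx */
	unsigned long	calls;		/* attempts, including failed ones */
	unsigned long	inuse;		/* live allocations */
	unsigned long	memuse;		/* bytes currently charged */
	unsigned long	maxused;	/* high-water mark of memuse */
};

#define	TYPE_STATS_INIT	{ PTHREAD_MUTEX_INITIALIZER, 0, 0, 0, 0 }

/* Charge `size` bytes to the type; a size of 0 records a failed attempt. */
static void
type_stats_allocated(struct type_stats *ts, unsigned long size)
{
	pthread_mutex_lock(&ts->mtx);
	ts->calls++;
	if (size != 0) {
		ts->memuse += size;
		ts->inuse++;
		if (ts->memuse > ts->maxused)
			ts->maxused = ts->memuse;
	}
	pthread_mutex_unlock(&ts->mtx);
}

/* Remove `size` bytes from the type's running totals. */
static void
type_stats_freed(struct type_stats *ts, unsigned long size)
{
	pthread_mutex_lock(&ts->mtx);
	assert(size <= ts->memuse);	/* freeing with the wrong type? */
	ts->memuse -= size;
	ts->inuse--;
	pthread_mutex_unlock(&ts->mtx);
}
```

A subsystem that allocates outside malloc(9) would call the first function after a successful allocation and the second when it returns the memory, passing the same size both times so the totals balance.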
Diffstat (limited to 'sys/kern')
-rw-r--r-- | sys/kern/kern_malloc.c | 74 |
1 files changed, 47 insertions, 27 deletions
diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
index e966a49..83a8283 100644
--- a/sys/kern/kern_malloc.c
+++ b/sys/kern/kern_malloc.c
@@ -176,6 +176,47 @@ malloc_last_fail(void)
 }
 
 /*
+ * Add this to the informational malloc_type bucket.
+ */
+static void
+malloc_type_zone_allocated(struct malloc_type *ksp, unsigned long size,
+    int zindx)
+{
+	mtx_lock(&ksp->ks_mtx);
+	ksp->ks_calls++;
+	if (zindx != -1)
+		ksp->ks_size |= 1 << zindx;
+	if (size != 0) {
+		ksp->ks_memuse += size;
+		ksp->ks_inuse++;
+		if (ksp->ks_memuse > ksp->ks_maxused)
+			ksp->ks_maxused = ksp->ks_memuse;
+	}
+	mtx_unlock(&ksp->ks_mtx);
+}
+
+void
+malloc_type_allocated(struct malloc_type *ksp, unsigned long size)
+{
+	malloc_type_zone_allocated(ksp, size, -1);
+}
+
+/*
+ * Remove this allocation from the informational malloc_type bucket.
+ */
+void
+malloc_type_freed(struct malloc_type *ksp, unsigned long size)
+{
+	mtx_lock(&ksp->ks_mtx);
+	KASSERT(size <= ksp->ks_memuse,
+		("malloc(9)/free(9) confusion.\n%s",
+		 "Probably freeing with wrong type, but maybe not here."));
+	ksp->ks_memuse -= size;
+	ksp->ks_inuse--;
+	mtx_unlock(&ksp->ks_mtx);
+}
+
+/*
  *	malloc:
  *
  *	Allocate a block of memory.
@@ -196,7 +237,6 @@ malloc(size, type, flags)
 #ifdef DIAGNOSTIC
 	unsigned long osize = size;
 #endif
-	register struct malloc_type *ksp = type;
 
 #ifdef INVARIANTS
 	/*
@@ -242,29 +282,16 @@ malloc(size, type, flags)
 		krequests[size >> KMEM_ZSHIFT]++;
 #endif
 		va = uma_zalloc(zone, flags);
-		mtx_lock(&ksp->ks_mtx);
-		if (va == NULL)
-			goto out;
-
-		ksp->ks_size |= 1 << indx;
-		size = keg->uk_size;
+		if (va != NULL)
+			size = keg->uk_size;
+		malloc_type_zone_allocated(type, va == NULL ? 0 : size, indx);
 	} else {
 		size = roundup(size, PAGE_SIZE);
 		zone = NULL;
 		keg = NULL;
 		va = uma_large_malloc(size, flags);
-		mtx_lock(&ksp->ks_mtx);
-		if (va == NULL)
-			goto out;
+		malloc_type_allocated(type, va == NULL ? 0 : size);
 	}
-	ksp->ks_memuse += size;
-	ksp->ks_inuse++;
-out:
-	ksp->ks_calls++;
-	if (ksp->ks_memuse > ksp->ks_maxused)
-		ksp->ks_maxused = ksp->ks_memuse;
-
-	mtx_unlock(&ksp->ks_mtx);
 	if (flags & M_WAITOK)
 		KASSERT(va != NULL, ("malloc(M_WAITOK) returned NULL"));
 	else if (va == NULL)
@@ -289,7 +316,6 @@ free(addr, type)
 	void *addr;
 	struct malloc_type *type;
 {
-	register struct malloc_type *ksp = type;
 	uma_slab_t slab;
 	u_long size;
 
@@ -297,7 +323,7 @@ free(addr, type)
 	if (addr == NULL)
 		return;
 
-	KASSERT(ksp->ks_memuse > 0,
+	KASSERT(type->ks_memuse > 0,
 		("malloc(9)/free(9) confusion.\n%s",
 		 "Probably freeing with wrong type, but maybe not here."));
 	size = 0;
@@ -334,13 +360,7 @@ free(addr, type)
 		size = slab->us_size;
 		uma_large_free(slab);
 	}
-	mtx_lock(&ksp->ks_mtx);
-	KASSERT(size <= ksp->ks_memuse,
-		("malloc(9)/free(9) confusion.\n%s",
-		 "Probably freeing with wrong type, but maybe not here."));
-	ksp->ks_memuse -= size;
-	ksp->ks_inuse--;
-	mtx_unlock(&ksp->ks_mtx);
+	malloc_type_freed(type, size);
 }
 
 /*