| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
prior to 3.0.0 release) as contrib/jemalloc, and integrate it into libc.
The code being imported by this commit diverged from
lib/libc/stdlib/malloc.c in March 2010, which means that a portion of
the jemalloc 1.0.0 ChangeLog entries are relevant, as are the entries
for all subsequent releases.
|
|
|
|
|
|
|
| |
MALLOC_OPTIONS environment variable, not JEMALLOC_OPTIONS.
Reviewed by: jasone
Approved by: emaste (mentor)
|
|
|
|
| |
They have no effect when coming in pairs, or before .Bl/.Bd
|
|
|
|
| |
Approved by: rrs (mentor)
|
|
|
|
| |
Approved by: rrs (mentor)
|
| |
|
| |
* Fix a race in chunk_dealloc_dss().
* Check for allocation failure before zeroing memory in base_calloc().
Merge enhancements from a divergent version of jemalloc:
* Convert thread-specific caching from magazines to an algorithm that is
more tunable, and implement incremental GC.
* Add support for medium size classes, [4KiB..32KiB], 2KiB apart by
default.
* Add dirty page tracking for pages within active small/medium object
runs. This allows malloc to track precisely which pages are in active
use, which makes dirty page purging more effective.
* Base maximum dirty page count on proportion of active memory.
* Use optional zeroing in arena_chunk_alloc() to avoid needless zeroing
of chunks. This is useful in the context of DSS allocation, since a
long-lived application may commonly recycle chunks.
* Increase the default chunk size from 1MiB to 4MiB.
Remove feature:
* Remove the dynamic rebalancing code, since thread caching reduces its
utility.
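The medium size class spacing described above can be illustrated with a small rounding helper. This is only a sketch: the constants come from the message ([4KiB..32KiB], 2KiB apart), while the helper name medium_size_class() and the choice to return 0 for out-of-range sizes are assumptions made for illustration, not jemalloc's actual code.

```c
#include <stddef.h>

#define MEDIUM_MIN     ((size_t)4 * 1024)   /* smallest medium class */
#define MEDIUM_MAX     ((size_t)32 * 1024)  /* largest medium class */
#define MEDIUM_SPACING ((size_t)2 * 1024)   /* default spacing between classes */

/*
 * Hypothetical helper: round a request up to the nearest medium size
 * class.  Returns 0 when the request falls outside the medium range.
 */
static size_t
medium_size_class(size_t size)
{
	if (size < MEDIUM_MIN || size > MEDIUM_MAX)
		return 0;
	/* MEDIUM_SPACING is a power of two, so a mask rounds up. */
	return (size + MEDIUM_SPACING - 1) & ~(MEDIUM_SPACING - 1);
}
```

With these defaults a 5000-byte request would land in the 6KiB class, so the worst-case internal fragmentation within the medium range stays just under 2KiB per object.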
|
| |
a large page size that is greater than malloc(3)'s default chunk size but
less than or equal to 4 MB, then increase the chunk size to match the large
page size.
Most often, using a chunk size that is less than the large page size is not
a problem. However, consider a long-running application that allocates and
frees significant amounts of memory. In particular, it frees enough memory
at times that some of that memory is munmap()ed. Up until the first
munmap(), a 1MB chunk size is just fine; it's not a problem for the virtual
memory system. Two adjacent 1MB chunks that are aligned on a 2MB boundary
will be promoted automatically to a superpage even though they were
allocated at different times. The trouble begins with the munmap():
releasing a 1MB chunk will trigger the demotion of the containing superpage,
leaving behind a half-used 2MB reservation. Now comes the real problem.
Unfortunately, when the application needs to allocate more memory, and it
recycles the previously munmap()ed address range, the implementation of
mmap() won't be able to reuse the reservation. Basically, the coalescing
rules in the virtual memory system don't allow this new range to combine
with its neighbor. The effect is that superpage promotion will not
recur for this range of addresses until both 1MB chunks are freed at some
point in the future.
Reviewed by: jasone
MFC after: 3 weeks
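The selection rule stated at the top of this message can be sketched as a pure function. The name choose_chunk_size() and both parameters are hypothetical; how the large page size is discovered on a given platform is deliberately left out, and the 4 MB bound is taken from the message.

```c
#include <stddef.h>

#define CHUNK_LIMIT ((size_t)4 << 20)   /* 4 MB upper bound from the message */

/*
 * Hypothetical helper: return the chunk size to use, given the default
 * chunk size and the platform's large page size (0 if there is none).
 */
static size_t
choose_chunk_size(size_t default_chunk, size_t large_page)
{
	/* Only grow the chunk, and never beyond the 4 MB limit. */
	if (large_page > default_chunk && large_page <= CHUNK_LIMIT)
		return large_page;
	return default_chunk;
}
```

For the case discussed above (1MB chunks, 2MB superpages) this would yield a 2MB chunk, so a single munmap() releases a whole superpage rather than half of one.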
|
|
|
|
|
|
|
| |
potential extreme contention in the kernel for multi-threaded applications
on SMP systems.
Reported by: kris
|
| |
This caching allows for completely lock-free allocation/deallocation in the
steady state, at the expense of likely increased memory use and
fragmentation.
Reduce the default number of arenas to 2*ncpus, since thread-specific
caching typically reduces arena contention.
Modify size class spacing to include ranges of 2^n-spaced, quantum-spaced,
cacheline-spaced, and subpage-spaced size classes. The advantages are:
fewer size classes, reduced false cacheline sharing, and reduced internal
fragmentation for allocations that are slightly over 512, 1024, etc.
Increase RUN_MAX_SMALL, in order to limit fragmentation for the
subpage-spaced size classes.
Add a size-->bin lookup table for small sizes to simplify translating sizes
to size classes. Include a hard-coded constant table that is used unless
custom size class spacing is specified at run time.
Add the ability to disable tiny size classes at compile time via
MALLOC_TINY.
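A size-->bin lookup table of the kind mentioned above might look roughly like the following. The size classes listed here are invented for illustration (one tiny class, a quantum-spaced range, and a cacheline-spaced range); the real table, its granularity, and the names bin_sizes, size2bin_init() and small_size2bin() are all assumptions rather than the committed code.

```c
#include <stddef.h>
#include <stdint.h>

#define LOOKUP_SHIFT 3      /* one table slot per 8 bytes (assumed) */
#define SMALL_MAX    512    /* largest size covered by the table (assumed) */

/* Hypothetical size classes, in increasing order, all multiples of 8. */
static const uint16_t bin_sizes[] = {
	8,                                  /* tiny */
	16, 32, 48, 64, 80, 96, 112, 128,   /* quantum-spaced (16 bytes) */
	192, 256, 320, 384, 448, 512        /* cacheline-spaced (64 bytes) */
};

static uint8_t size2bin[SMALL_MAX >> LOOKUP_SHIFT];

/* Fill the table: slot i covers sizes (i*8)+1 .. (i+1)*8. */
static void
size2bin_init(void)
{
	size_t bin = 0;

	for (size_t i = 0; i < sizeof(size2bin); i++) {
		size_t largest = (i + 1) << LOOKUP_SHIFT;

		/* Advance to the smallest class that fits the slot. */
		while (bin_sizes[bin] < largest)
			bin++;
		size2bin[i] = (uint8_t)bin;
	}
}

/* Translate a size to a bin index; requires 1 <= size <= SMALL_MAX. */
static size_t
small_size2bin(size_t size)
{
	return size2bin[(size - 1) >> LOOKUP_SHIFT];
}
```

The lookup replaces a chain of range comparisons with one shift and one load, which is the simplification the message refers to; a hard-coded constant table would simply be the result of size2bin_init() written out at compile time.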
|
|
|
|
|
|
|
|
| |
allocation patterns, number of CPUs, and MALLOC_OPTIONS settings indicate
that lazy deallocation has the potential to worsen throughput dramatically.
Performance degradation occurs when multiple threads try to clear the lazy
free cache simultaneously. Various experiments to avoid this bottleneck
failed to completely solve this problem, while adding yet more complexity.
|
| |
threshold, according to the 'F' MALLOC_OPTIONS flag. This obsoletes the
'H' flag.
Try to realloc() large objects in place. This substantially speeds up
incremental large reallocations in the common case.
Fix a bug in arena_ralloc() that caused relocation of sub-page objects
even if the old and new sizes were in the same size class.
Maintain trees of runs and simplify the per-chunk page map. This allows
logarithmic-time searching for sufficiently large runs in
arena_run_alloc(), whereas the previous algorithm required linear time
in the worst case.
Break various large functions into smaller sub-functions, and inline
only the functions that are in the fast path for small object
allocation/deallocation.
Remove an unnecessary check in base_pages_alloc_mmap().
Avoid integer division in choose_arena() for the NO_TLS case on
single-CPU systems.
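The logarithmic-time search for a sufficiently large run can be illustrated without a tree. The sketch below substitutes a sorted array and binary search for the trees of runs that the message describes; the ordering by (size, address), the struct layout, and the name run_best_fit() are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical description of a free run of pages. */
struct run {
	size_t npages;   /* length of the run, in pages */
	size_t addr;     /* starting address of the run */
};

/*
 * The runs array must be sorted by npages, then by addr.  Returns the
 * index of the smallest run with at least `need` pages (lowest address
 * among equals), or n if no run is large enough.  O(log n) comparisons.
 */
static size_t
run_best_fit(const struct run *runs, size_t n, size_t need)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (runs[mid].npages < need)
			lo = mid + 1;   /* too small: look to the right */
		else
			hi = mid;       /* fits: look for a smaller fit */
	}
	return lo;
}
```

A balanced tree gives the same search bound while also supporting logarithmic insertion and removal, which a sorted array does not; the array is used here only to keep the example short.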
|
|
|
|
|
|
|
|
|
| |
default. This has the disadvantage of rendering the datasize resource
limit irrelevant, but without this change, legitimate uses of more
memory than will fit in the data segment are thwarted by default.
Fix chunk_alloc_mmap() to work correctly if initial mapping is not
chunk-aligned and mapping extension fails.
|
| |
memory is acquired from the system via sbrk(2) and/or mmap(2). By default,
use sbrk(2) only, in order to support traditional use of resource limits.
Additionally, when both options are enabled, prefer the data segment to
anonymous mappings, in order to coexist better with large file mappings
in applications on 32-bit platforms. This change has the potential to
increase memory fragmentation due to the linear nature of the data
segment, but from a performance perspective this is mitigated by the use
of madvise(2). [1]
Add the ability to interpret integer prefixes in MALLOC_OPTIONS
processing. For example, MALLOC_OPTIONS=lllllllll can now be specified as
MALLOC_OPTIONS=9l.
Reported by: [1] rwatson
Design review: [1] alc, peter, rwatson
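The integer prefix handling described above (9l being equivalent to lllllllll) can be sketched as a small expansion routine. This is an illustration of the idea only: the function name expand_options(), the output-buffer interface, and the handling of a trailing count with no flag are assumptions, not the parser that was committed.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: expand repeat-count prefixes in an options
 * string, e.g. "9l" -> "lllllllll".  Returns the expanded length, or
 * (size_t)-1 if the result does not fit in out (including the NUL).
 */
static size_t
expand_options(const char *opts, char *out, size_t outsz)
{
	size_t len = 0;

	if (outsz == 0)
		return (size_t)-1;
	while (*opts != '\0') {
		size_t reps = 1;

		if (isdigit((unsigned char)*opts)) {
			reps = 0;
			while (isdigit((unsigned char)*opts)) {
				reps = reps * 10 + (size_t)(*opts - '0');
				opts++;
			}
			if (*opts == '\0')
				break;  /* count with no flag: ignored here */
		}
		for (size_t i = 0; i < reps; i++) {
			if (len + 1 >= outsz)
				return (size_t)-1;
			out[len++] = *opts;
		}
		opts++;
	}
	out[len] = '\0';
	return len;
}
```

A real option parser would more likely apply each flag reps times directly instead of building an expanded string; the buffer is used here only so the equivalence between the two spellings is easy to check.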
|
| |
|
|
|
|
| |
Submitted by: bmah, jhb
|
|
|
|
| |
enhancements.
|
| |
|
| |
|
| |
|
|
|
|
| |
Discussed with: arch@
|
| |
allocation patterns that involve a relatively even mixture of many
different size classes.
Reduce the chunk size from 16 MB to 2 MB. Since chunks are now carved up
using an address-ordered first best fit policy, VM map fragmentation is
much less likely, which makes smaller chunks not as much of a risk. This
reduces the virtual memory size of most applications.
Remove redzones, since program buffer overruns are no longer as likely to
corrupt malloc data structures.
Remove the C MALLOC_OPTIONS flag, and add H and S.
|
| |
|
|
|
|
|
|
|
| |
a scalable concurrent allocator implementation.
Reviewed by: current@
Approved by: phk, markm (mentor)
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Requested by: keramida
Bump .Dd
Requested by: ru
|
| |
|
| |
|
|
|
|
|
| |
PR: 43357
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
|
| |
|
| |
|
|
|
|
| |
Approved by: re
|
|
|
|
|
|
|
| |
are marked up in stdio(3), and because they are defined expressions
of type "FILE *".
Approved by: re
|
|
|
|
|
|
|
|
| |
It should now be clearer that the memory referenced by the
ptr argument of realloc(ptr,size) is freed and only the return value
of realloc() points to a valid memory area upon successful completion.
Submitted by: Martin Faxer <gmh003532@brfmasthugget.se>
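The realloc() behaviour described above is why callers usually assign the result to a temporary first. A minimal sketch, with grow_buffer() as a hypothetical wrapper name:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: resize *bufp to newsize bytes (newsize > 0).
 * Returns 0 on success and -1 on failure.  On failure the old block is
 * untouched and *bufp still points to it, so nothing is leaked.
 */
static int
grow_buffer(char **bufp, size_t newsize)
{
	char *tmp = realloc(*bufp, newsize);

	if (tmp == NULL)
		return -1;      /* old pointer remains valid */
	*bufp = tmp;            /* old pointer must not be used again */
	return 0;
}
```

Writing p = realloc(p, n) directly would overwrite the only reference to the original block when realloc() fails, and after a successful call any saved copy of the old pointer may refer to freed memory.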
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
different pointer than the one passed to it.
PR: docs/31925
Submitted by: Andrew <andrew@ugh.net.au>
|
|
|
|
|
| |
PR: 31365
Submitted by: SUZUKI Koichi <koich@cac.co.jp>
|
|
|
|
| |
Inspired by comment from: dd
|
|
|
|
|
|
|
|
| |
Backout previous revision. We should not expand plain text xrefs if
they appear in the literal text, e.g. in the error or warning message
of the library function. (Submitted by: bde)
Moved "out of memory" from warning to errors section.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|