author | jasone <jasone@FreeBSD.org> | 2016-02-29 19:10:32 +0000 |
---|---|---|
committer | jasone <jasone@FreeBSD.org> | 2016-02-29 19:10:32 +0000 |
commit | ac01d0e42d905f1758cecc124bcf65024cb3a2d4 (patch) | |
tree | 9f2709c1ddd21e02e5ee473059251e64d3bc457f | |
parent | 997362c1e3a4a3c1b28833f88702375860f6a8c4 (diff) | |

Update jemalloc to 4.1.0.

Add missing Symbol.map entry for __aligned_alloc.

Add weak-->strong symbol binding for
{malloc_stats_print,mallctl,mallctlnametomib,mallctlbymib} -->
{__malloc_stats_print,__mallctl,__mallctlnametomib,__mallctlbymib}. These
bindings complete the set necessary to allow applications to replace all
malloc-related symbols.

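
The weak-->strong binding described above can be illustrated with a minimal sketch. The diff itself uses FreeBSD's `__weak_reference(__mallctl, mallctl)` macro; the sketch below instead assumes a GCC or Clang toolchain targeting ELF and uses the `weak` and `alias` function attributes so that it builds outside the FreeBSD tree. The names `demo_impl_ctl`, `demo_ctl`, and `demo_check_binding` are hypothetical stand-ins and are not part of libc or jemalloc.

```c
/*
 * Sketch of weak-->strong symbol binding (assumes GCC/Clang on ELF).
 * All demo_* names are hypothetical illustrations.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Strong, implementation-side definition (plays the role of __mallctl). */
int
demo_impl_ctl(const char *name, size_t *lenp)
{
	if (name == NULL || lenp == NULL)
		return (-1);
	*lenp = strlen(name);	/* Stand-in for real work. */
	return (0);
}

/*
 * Weak public name (plays the role of mallctl).  It is an alias for the
 * strong definition above, which must live in the same translation unit.
 */
extern __typeof__(demo_impl_ctl) demo_ctl
    __attribute__((weak, alias("demo_impl_ctl")));

/* With no overriding definition linked in, both names share one address. */
void
demo_check_binding(void)
{
	assert(demo_ctl == demo_impl_ctl);
}
```

Because the public name is only weak, an application that links its own strong definition of that name would normally take precedence when object files are linked statically, while the implementation-side name stays available. How this plays out for shared libraries depends on the dynamic linker's symbol resolution rules, so treat this as an illustration of the idea rather than a description of libc's exact behavior.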
53 files changed, 3391 insertions, 1577 deletions
diff --git a/contrib/jemalloc/COPYING b/contrib/jemalloc/COPYING index 611968c..104b1f8 100644 --- a/contrib/jemalloc/COPYING +++ b/contrib/jemalloc/COPYING @@ -1,10 +1,10 @@ Unless otherwise specified, files in the jemalloc source distribution are subject to the following license: -------------------------------------------------------------------------------- -Copyright (C) 2002-2015 Jason Evans <jasone@canonware.com>. +Copyright (C) 2002-2016 Jason Evans <jasone@canonware.com>. All rights reserved. Copyright (C) 2007-2012 Mozilla Foundation. All rights reserved. -Copyright (C) 2009-2015 Facebook, Inc. All rights reserved. +Copyright (C) 2009-2016 Facebook, Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: diff --git a/contrib/jemalloc/ChangeLog b/contrib/jemalloc/ChangeLog index 8ed42cb..9cbfbf9 100644 --- a/contrib/jemalloc/ChangeLog +++ b/contrib/jemalloc/ChangeLog @@ -4,6 +4,79 @@ brevity. Much more detail can be found in the git revision history: https://github.com/jemalloc/jemalloc +* 4.1.0 (February 28, 2016) + + This release is primarily about optimizations, but it also incorporates a lot + of portability-motivated refactoring and enhancements. Many people worked on + this release, to an extent that even with the omission here of minor changes + (see git revision history), and of the people who reported and diagnosed + issues, so much of the work was contributed that starting with this release, + changes are annotated with author credits to help reflect the collaborative + effort involved. + + New features: + - Implement decay-based unused dirty page purging, a major optimization with + mallctl API impact. This is an alternative to the existing ratio-based + unused dirty page purging, and is intended to eventually become the sole + purging mechanism. 
New mallctls: + + opt.purge + + opt.decay_time + + arena.<i>.decay + + arena.<i>.decay_time + + arenas.decay_time + + stats.arenas.<i>.decay_time + (@jasone, @cevans87) + - Add --with-malloc-conf, which makes it possible to embed a default + options string during configuration. This was motivated by the desire to + specify --with-malloc-conf=purge:decay , since the default must remain + purge:ratio until the 5.0.0 release. (@jasone) + - Add MS Visual Studio 2015 support. (@rustyx, @yuslepukhin) + - Make *allocx() size class overflow behavior defined. The maximum + size class is now less than PTRDIFF_MAX to protect applications against + numerical overflow, and all allocation functions are guaranteed to indicate + errors rather than potentially crashing if the request size exceeds the + maximum size class. (@jasone) + - jeprof: + + Add raw heap profile support. (@jasone) + + Add --retain and --exclude for backtrace symbol filtering. (@jasone) + + Optimizations: + - Optimize the fast path to combine various bootstrapping and configuration + checks and execute more streamlined code in the common case. (@interwq) + - Use linear scan for small bitmaps (used for small object tracking). In + addition to speeding up bitmap operations on 64-bit systems, this reduces + allocator metadata overhead by approximately 0.2%. (@djwatson) + - Separate arena_avail trees, which substantially speeds up run tree + operations. (@djwatson) + - Use memoization (boot-time-computed table) for run quantization. Separate + arena_avail trees reduced the importance of this optimization. (@jasone) + - Attempt mmap-based in-place huge reallocation. This can dramatically speed + up incremental huge reallocation. (@jasone) + + Incompatible changes: + - Make opt.narenas unsigned rather than size_t. (@jasone) + + Bug fixes: + - Fix stats.cactive accounting regression. (@rustyx, @jasone) + - Handle unaligned keys in hash(). This caused problems for some ARM systems. 
+ (@jasone, Christopher Ferris) + - Refactor arenas array. In addition to fixing a fork-related deadlock, this + makes arena lookups faster and simpler. (@jasone) + - Move retained memory allocation out of the default chunk allocation + function, to a location that gets executed even if the application installs + a custom chunk allocation function. This resolves a virtual memory leak. + (@buchgr) + - Fix a potential tsd cleanup leak. (Christopher Ferris, @jasone) + - Fix run quantization. In practice this bug had no impact unless + applications requested memory with alignment exceeding one page. + (@jasone, @djwatson) + - Fix LinuxThreads-specific bootstrapping deadlock. (Cosmin Paraschiv) + - jeprof: + + Don't discard curl options if timeout is not defined. (@djwatson) + + Detect failed profile fetches. (@djwatson) + - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for + --disable-stats case. (@jasone) + * 4.0.4 (October 24, 2015) This bugfix release fixes another xallocx() regression. 
No other regressions diff --git a/contrib/jemalloc/FREEBSD-Xlist b/contrib/jemalloc/FREEBSD-Xlist index f090348..dff8d26 100644 --- a/contrib/jemalloc/FREEBSD-Xlist +++ b/contrib/jemalloc/FREEBSD-Xlist @@ -8,6 +8,7 @@ README autogen.sh autom4te.cache/ bin/ +build-aux/ config.* configure* coverage.sh @@ -26,6 +27,7 @@ include/jemalloc/internal/public_symbols.txt include/jemalloc/internal/public_unnamespace.h include/jemalloc/internal/public_unnamespace.sh include/jemalloc/internal/size_classes.sh +include/jemalloc/internal/smoothstep.sh include/jemalloc/jemalloc.h.in include/jemalloc/jemalloc.sh include/jemalloc/jemalloc_defs.h @@ -44,6 +46,7 @@ include/jemalloc/jemalloc_typedefs.h.in include/msvc_compat/ install-sh jemalloc.pc* +msvc/ src/valgrind.c src/zone.c test/ diff --git a/contrib/jemalloc/FREEBSD-diffs b/contrib/jemalloc/FREEBSD-diffs index 5700c51..e7484ea 100644 --- a/contrib/jemalloc/FREEBSD-diffs +++ b/contrib/jemalloc/FREEBSD-diffs @@ -1,5 +1,5 @@ diff --git a/doc/jemalloc.xml.in b/doc/jemalloc.xml.in -index 26a5e14..2a801b7 100644 +index bc5dbd1..ba182da 100644 --- a/doc/jemalloc.xml.in +++ b/doc/jemalloc.xml.in @@ -53,11 +53,23 @@ @@ -27,7 +27,7 @@ index 26a5e14..2a801b7 100644 <refsect2> <title>Standard API</title> <funcprototype> -@@ -2759,4 +2771,18 @@ malloc_conf = "lg_chunk:24";]]></programlisting></para> +@@ -2905,4 +2917,18 @@ malloc_conf = "lg_chunk:24";]]></programlisting></para> <para>The <function>posix_memalign<parameter/></function> function conforms to IEEE Std 1003.1-2001 (“POSIX.1”).</para> </refsect1> @@ -47,7 +47,7 @@ index 26a5e14..2a801b7 100644 + </refsect1> </refentry> diff --git a/include/jemalloc/internal/jemalloc_internal.h.in b/include/jemalloc/internal/jemalloc_internal.h.in -index 654cd08..ad5382d 100644 +index 3f54391..d240256 100644 --- a/include/jemalloc/internal/jemalloc_internal.h.in +++ b/include/jemalloc/internal/jemalloc_internal.h.in @@ -8,6 +8,9 @@ @@ -72,11 +72,11 @@ index 654cd08..ad5382d 100644 -#endif - ; 
+static const bool config_lazy_lock = true; + static const char * const config_malloc_conf = JEMALLOC_CONFIG_MALLOC_CONF; static const bool config_prof = #ifdef JEMALLOC_PROF - true diff --git a/include/jemalloc/internal/jemalloc_internal_decls.h b/include/jemalloc/internal/jemalloc_internal_decls.h -index a601d6e..e7094b2 100644 +index 2b8ca5d..42d97f2 100644 --- a/include/jemalloc/internal/jemalloc_internal_decls.h +++ b/include/jemalloc/internal/jemalloc_internal_decls.h @@ -1,6 +1,9 @@ @@ -111,10 +111,10 @@ index f051f29..561378f 100644 #endif /* JEMALLOC_H_EXTERNS */ diff --git a/include/jemalloc/internal/private_symbols.txt b/include/jemalloc/internal/private_symbols.txt -index a90021a..34904bf 100644 +index 5880996..6e94e03 100644 --- a/include/jemalloc/internal/private_symbols.txt +++ b/include/jemalloc/internal/private_symbols.txt -@@ -280,7 +280,6 @@ iralloct_realign +@@ -296,7 +296,6 @@ iralloct_realign isalloc isdalloct isqalloc @@ -124,10 +124,10 @@ index a90021a..34904bf 100644 jemalloc_postfork_child diff --git a/include/jemalloc/jemalloc_FreeBSD.h b/include/jemalloc/jemalloc_FreeBSD.h new file mode 100644 -index 0000000..737542e +index 0000000..433dab5 --- /dev/null +++ b/include/jemalloc/jemalloc_FreeBSD.h -@@ -0,0 +1,142 @@ +@@ -0,0 +1,160 @@ +/* + * Override settings that were generated in jemalloc_defs.h as necessary. + */ @@ -182,6 +182,9 @@ index 0000000..737542e +#elif defined(__powerpc__) +# define LG_SIZEOF_PTR 2 +#endif ++#ifdef __riscv__ ++# define LG_SIZEOF_PTR 3 ++#endif + +#ifndef JEMALLOC_TLS_MODEL +# define JEMALLOC_TLS_MODEL /* Default. */ @@ -205,17 +208,22 @@ index 0000000..737542e +/* Mangle. 
*/ +#undef je_malloc +#undef je_calloc -+#undef je_realloc -+#undef je_free +#undef je_posix_memalign +#undef je_aligned_alloc ++#undef je_realloc ++#undef je_free +#undef je_malloc_usable_size +#undef je_mallocx +#undef je_rallocx +#undef je_xallocx +#undef je_sallocx +#undef je_dallocx ++#undef je_sdallocx +#undef je_nallocx ++#undef je_mallctl ++#undef je_mallctlnametomib ++#undef je_mallctlbymib ++#undef je_malloc_stats_print +#undef je_allocm +#undef je_rallocm +#undef je_sallocm @@ -223,17 +231,22 @@ index 0000000..737542e +#undef je_nallocm +#define je_malloc __malloc +#define je_calloc __calloc -+#define je_realloc __realloc -+#define je_free __free +#define je_posix_memalign __posix_memalign +#define je_aligned_alloc __aligned_alloc ++#define je_realloc __realloc ++#define je_free __free +#define je_malloc_usable_size __malloc_usable_size +#define je_mallocx __mallocx +#define je_rallocx __rallocx +#define je_xallocx __xallocx +#define je_sallocx __sallocx +#define je_dallocx __dallocx ++#define je_sdallocx __sdallocx +#define je_nallocx __nallocx ++#define je_mallctl __mallctl ++#define je_mallctlnametomib __mallctlnametomib ++#define je_mallctlbymib __mallctlbymib ++#define je_malloc_stats_print __malloc_stats_print +#define je_allocm __allocm +#define je_rallocm __rallocm +#define je_sallocm __sallocm @@ -253,17 +266,22 @@ index 0000000..737542e + */ +__weak_reference(__malloc, malloc); +__weak_reference(__calloc, calloc); -+__weak_reference(__realloc, realloc); -+__weak_reference(__free, free); +__weak_reference(__posix_memalign, posix_memalign); +__weak_reference(__aligned_alloc, aligned_alloc); ++__weak_reference(__realloc, realloc); ++__weak_reference(__free, free); +__weak_reference(__malloc_usable_size, malloc_usable_size); +__weak_reference(__mallocx, mallocx); +__weak_reference(__rallocx, rallocx); +__weak_reference(__xallocx, xallocx); +__weak_reference(__sallocx, sallocx); +__weak_reference(__dallocx, dallocx); ++__weak_reference(__sdallocx, 
sdallocx); +__weak_reference(__nallocx, nallocx); ++__weak_reference(__mallctl, mallctl); ++__weak_reference(__mallctlnametomib, mallctlnametomib); ++__weak_reference(__mallctlbymib, mallctlbymib); ++__weak_reference(__malloc_stats_print, malloc_stats_print); +__weak_reference(__allocm, allocm); +__weak_reference(__rallocm, rallocm); +__weak_reference(__sallocm, sallocm); @@ -282,7 +300,7 @@ index f943891..47d032c 100755 +#include "jemalloc_FreeBSD.h" EOF diff --git a/src/jemalloc.c b/src/jemalloc.c -index 5a2d324..b6cbb79 100644 +index 0735376..a34b85c 100644 --- a/src/jemalloc.c +++ b/src/jemalloc.c @@ -4,6 +4,10 @@ @@ -296,7 +314,7 @@ index 5a2d324..b6cbb79 100644 /* Runtime configuration options. */ const char *je_malloc_conf JEMALLOC_ATTR(weak); bool opt_abort = -@@ -2490,6 +2494,107 @@ je_malloc_usable_size(JEMALLOC_USABLE_SIZE_CONST void *ptr) +@@ -2611,6 +2615,107 @@ je_malloc_usable_size(JEMALLOC_USABLE_SIZE_CONST void *ptr) */ /******************************************************************************/ /* @@ -404,7 +422,7 @@ index 5a2d324..b6cbb79 100644 * The following functions are used by threading libraries for protection of * malloc during fork(). 
*/ -@@ -2590,4 +2695,11 @@ jemalloc_postfork_child(void) +@@ -2717,4 +2822,11 @@ jemalloc_postfork_child(void) ctl_postfork_child(); } @@ -463,10 +481,10 @@ index 2d47af9..934d5aa 100644 +#endif +} diff --git a/src/util.c b/src/util.c -index 4cb0d6c..25b61c2 100644 +index 02673c7..116e981 100644 --- a/src/util.c +++ b/src/util.c -@@ -58,6 +58,22 @@ wrtmessage(void *cbopaque, const char *s) +@@ -66,6 +66,22 @@ wrtmessage(void *cbopaque, const char *s) JEMALLOC_EXPORT void (*je_malloc_message)(void *, const char *s); diff --git a/contrib/jemalloc/VERSION b/contrib/jemalloc/VERSION index f9b6da9..fd7c988 100644 --- a/contrib/jemalloc/VERSION +++ b/contrib/jemalloc/VERSION @@ -1 +1 @@ -4.0.4-0-g91010a9e2ebfc84b1ac1ed7fdde3bfed4f65f180 +4.1.0-1-g994da4232621dd1210fcf39bdf0d6454cefda473 diff --git a/contrib/jemalloc/doc/jemalloc.3 b/contrib/jemalloc/doc/jemalloc.3 index 57e163d..c47f417 100644 --- a/contrib/jemalloc/doc/jemalloc.3 +++ b/contrib/jemalloc/doc/jemalloc.3 @@ -2,12 +2,12 @@ .\" Title: JEMALLOC .\" Author: Jason Evans .\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/> -.\" Date: 10/24/2015 +.\" Date: 02/28/2016 .\" Manual: User Manual -.\" Source: jemalloc 4.0.4-0-g91010a9e2ebfc84b1ac1ed7fdde3bfed4f65f180 +.\" Source: jemalloc 4.1.0-1-g994da4232621dd1210fcf39bdf0d6454cefda473 .\" Language: English .\" -.TH "JEMALLOC" "3" "10/24/2015" "jemalloc 4.0.4-0-g91010a9e2ebf" "User Manual" +.TH "JEMALLOC" "3" "02/28/2016" "jemalloc 4.1.0-1-g994da4232621" "User Manual" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- @@ -31,7 +31,7 @@ jemalloc \- general purpose memory allocation functions .SH "LIBRARY" .PP -This manual describes jemalloc 4\&.0\&.4\-0\-g91010a9e2ebfc84b1ac1ed7fdde3bfed4f65f180\&. More information can be found at the +This manual describes jemalloc 4\&.1\&.0\-1\-g994da4232621dd1210fcf39bdf0d6454cefda473\&. 
More information can be found at the \m[blue]\fBjemalloc website\fR\m[]\&\s-2\u[1]\d\s+2\&. .PP The following configuration options are enabled in libc\*(Aqs built\-in jemalloc: @@ -244,7 +244,7 @@ function allocates at least bytes of memory, and returns a pointer to the base address of the allocation\&. Behavior is undefined if \fIsize\fR is -\fB0\fR, or if request size overflows due to size class and/or alignment constraints\&. +\fB0\fR\&. .PP The \fBrallocx\fR\fB\fR @@ -255,7 +255,7 @@ to be at least bytes, and returns a pointer to the base address of the resulting allocation, which may or may not have moved from its original location\&. Behavior is undefined if \fIsize\fR is -\fB0\fR, or if request size overflows due to size class and/or alignment constraints\&. +\fB0\fR\&. .PP The \fBxallocx\fR\fB\fR @@ -301,10 +301,12 @@ function allocates no memory, but it performs the same size computation as the \fBmallocx\fR\fB\fR function, and returns the real size of the allocation that would result from the equivalent \fBmallocx\fR\fB\fR -function call\&. Behavior is undefined if +function call, or +\fB0\fR +if the inputs exceed the maximum supported size class and/or alignment\&. Behavior is undefined if \fIsize\fR is -\fB0\fR, or if request size overflows due to size class and/or alignment constraints\&. +\fB0\fR\&. .PP The \fBmallctl\fR\fB\fR @@ -404,7 +406,8 @@ should not be depended on, since such behavior is entirely implementation\-depen .PP Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile\- or run\-time\&. 
.PP -The string pointed to by the global variable +The string specified via +\fB\-\-with\-malloc\-conf\fR, the string pointed to by the global variable \fImalloc_conf\fR, the \(lqname\(rq of the file referenced by the symbolic link named /etc/malloc\&.conf, and the value of the environment variable \fBMALLOC_CONF\fR, will be interpreted, in that order, from left to right as options\&. Note that @@ -414,8 +417,10 @@ may be read before is entered, so the declaration of \fImalloc_conf\fR should specify an initializer that contains the final value to be read by jemalloc\&. +\fB\-\-with\-malloc\-conf\fR +and \fImalloc_conf\fR -is a compile\-time setting, whereas +are compile\-time mechanisms, whereas /etc/malloc\&.conf and \fBMALLOC_CONF\fR @@ -451,11 +456,7 @@ In addition to multiple arenas, unless \fB\-\-disable\-tcache\fR is specified during configuration, this allocator supports thread\-specific caching for small and large objects, in order to make it possible to completely avoid synchronization for most allocation requests\&. Such caching allows very fast allocation in the common case, but it increases memory usage and fragmentation, since a bounded number of objects can remain allocated in each thread cache\&. .PP -Memory is conceptually broken into equal\-sized chunks, where the chunk size is a power of two that is greater than the page size\&. Chunks are always aligned to multiples of the chunk size\&. This alignment makes it possible to find metadata for user objects very quickly\&. -.PP -User objects are broken into three categories according to size: small, large, and huge\&. Small and large objects are managed entirely by arenas; huge objects are additionally aggregated in a single data structure that is shared by all threads\&. Huge objects are typically used by applications infrequently enough that this single data structure is not a scalability issue\&. 
-.PP -Each chunk that is managed by an arena tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object)\&. The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time\&. +Memory is conceptually broken into equal\-sized chunks, where the chunk size is a power of two that is greater than the page size\&. Chunks are always aligned to multiples of the chunk size\&. This alignment makes it possible to find metadata for user objects very quickly\&. User objects are broken into three categories according to size: small, large, and huge\&. Multiple small and large objects can reside within a single chunk, whereas huge objects each have one or more chunks backing them\&. Each chunk that contains small and/or large objects tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object)\&. The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time\&. .PP Small objects are managed in groups by page runs\&. Each run maintains a bitmap to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least sizeof(\fBdouble\fR)\&. All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes\&. Small size classes are smaller than four times the page size, large size classes are smaller than the chunk size (see the @@ -703,6 +704,13 @@ was specified during build configuration\&. was specified during build configuration\&. 
.RE .PP +"config\&.malloc_conf" (\fBconst char *\fR) r\- +.RS 4 +Embedded configure\-time\-specified run\-time options string, empty unless +\fB\-\-with\-malloc\-conf\fR +was specified during build configuration\&. +.RE +.PP "config\&.munmap" (\fBbool\fR) r\- .RS 4 \fB\-\-enable\-munmap\fR @@ -788,11 +796,20 @@ is supported by the operating system; \(lqdisabled\(rq otherwise\&. Virtual memory chunk size (log base 2)\&. If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size\&. The default chunk size is 2 MiB (2^21)\&. .RE .PP -"opt\&.narenas" (\fBsize_t\fR) r\- +"opt\&.narenas" (\fBunsigned\fR) r\- .RS 4 Maximum number of arenas to use for automatic multiplexing of threads and arenas\&. The default is four times the number of CPUs, or one if there is a single CPU\&. .RE .PP +"opt\&.purge" (\fBconst char *\fR) r\- +.RS 4 +Purge mode is \(lqratio\(rq (default) or \(lqdecay\(rq\&. See +"opt\&.lg_dirty_mult" +for details of the ratio mode\&. See +"opt\&.decay_time" +for details of the decay mode\&. +.RE +.PP "opt\&.lg_dirty_mult" (\fBssize_t\fR) r\- .RS 4 Per\-arena minimum ratio (log base 2) of active to dirty pages\&. Some dirty unused pages may be allowed to accumulate, within the limit set by the ratio (or one chunk worth of dirty pages, whichever is greater), before informing the kernel about some of those pages via @@ -804,6 +821,15 @@ and for related dynamic control options\&. .RE .PP +"opt\&.decay_time" (\fBssize_t\fR) r\- +.RS 4 +Approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. The pages are incrementally purged according to a sigmoidal decay curve that starts and ends with zero purge rate\&. A decay time of 0 causes all unused dirty pages to be purged immediately upon creation\&. A decay time of \-1 disables purging\&. The default decay time is 10 seconds\&. 
See +"arenas\&.decay_time" +and +"arena\&.<i>\&.decay_time" +for related dynamic control options\&. +.RE +.PP "opt\&.stats_print" (\fBbool\fR) r\- .RS 4 Enable/disable statistics printing at exit\&. If enabled, the @@ -914,7 +940,9 @@ option for final profile dumping\&. Profile output is compatible with the command, which is based on the \fBpprof\fR that is developed as part of the -\m[blue]\fBgperftools package\fR\m[]\&\s-2\u[3]\d\s+2\&. +\m[blue]\fBgperftools package\fR\m[]\&\s-2\u[3]\d\s+2\&. See +HEAP PROFILE FORMAT +for heap profile format documentation\&. .RE .PP "opt\&.prof_prefix" (\fBconst char *\fR) r\- [\fB\-\-enable\-prof\fR] @@ -1063,7 +1091,7 @@ macro to explicitly use the specified cache rather than the automatically manage "tcache\&.flush" (\fBunsigned\fR) \-w [\fB\-\-enable\-tcache\fR] .RS 4 Flush the specified thread\-specific cache (tcache)\&. The same considerations apply to this interface as to -"thread\&.tcache\&.flush", except that the tcache will never be automatically be discarded\&. +"thread\&.tcache\&.flush", except that the tcache will never be automatically discarded\&. .RE .PP "tcache\&.destroy" (\fBunsigned\fR) \-w [\fB\-\-enable\-tcache\fR] @@ -1073,10 +1101,18 @@ Flush the specified thread\-specific cache (tcache) and make the identifier avai .PP "arena\&.<i>\&.purge" (\fBvoid\fR) \-\- .RS 4 -Purge unused dirty pages for arena <i>, or for all arenas if <i> equals +Purge all unused dirty pages for arena <i>, or for all arenas if <i> equals "arenas\&.narenas"\&. .RE .PP +"arena\&.<i>\&.decay" (\fBvoid\fR) \-\- +.RS 4 +Trigger decay\-based purging of unused dirty pages for arena <i>, or for all arenas if <i> equals +"arenas\&.narenas"\&. The proportion of unused dirty pages to be purged depends on the current time; see +"opt\&.decay_time" +for details\&. 
+.RE +.PP "arena\&.<i>\&.dss" (\fBconst char *\fR) rw .RS 4 Set the precedence of dss allocation as related to mmap allocation for arena <i>, or for all arenas if <i> equals @@ -1092,6 +1128,13 @@ Current per\-arena minimum ratio (log base 2) of active to dirty pages for arena for additional information\&. .RE .PP +"arena\&.<i>\&.decay_time" (\fBssize_t\fR) rw +.RS 4 +Current per\-arena approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. Each time this interface is set, all currently unused dirty pages are considered to have fully decayed, which causes immediate purging of all unused dirty pages unless the decay time is set to \-1 (i\&.e\&. purging disabled)\&. See +"opt\&.decay_time" +for additional information\&. +.RE +.PP "arena\&.<i>\&.chunk_hooks" (\fBchunk_hooks_t\fR) rw .RS 4 Get or set the chunk management hook functions for arena <i>\&. The functions must be capable of operating on all extant chunks associated with arena <i>, usually by passing unknown chunks to the replaced functions\&. In practice, it is feasible to control allocation for arenas created via @@ -1332,6 +1375,15 @@ during arena creation\&. See for additional information\&. .RE .PP +"arenas\&.decay_time" (\fBssize_t\fR) rw +.RS 4 +Current default per\-arena approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused, used to initialize +"arena\&.<i>\&.decay_time" +during arena creation\&. See +"opt\&.decay_time" +for additional information\&. +.RE +.PP "arenas\&.quantum" (\fBsize_t\fR) r\- .RS 4 Quantum size\&. @@ -1511,6 +1563,13 @@ Minimum ratio (log base 2) of active to dirty pages\&. See for details\&. 
.RE .PP +"stats\&.arenas\&.<i>\&.decay_time" (\fBssize_t\fR) r\- +.RS 4 +Approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. See +"opt\&.decay_time" +for details\&. +.RE +.PP "stats\&.arenas\&.<i>\&.nthreads" (\fBunsigned\fR) r\- .RS 4 Number of threads currently assigned to arena\&. @@ -1712,6 +1771,71 @@ Cumulative number of allocation requests for this size class\&. .RS 4 Current number of huge allocations for this size class\&. .RE +.SH "HEAP PROFILE FORMAT" +.PP +Although the heap profiling functionality was originally designed to be compatible with the +\fBpprof\fR +command that is developed as part of the +\m[blue]\fBgperftools package\fR\m[]\&\s-2\u[3]\d\s+2, the addition of per thread heap profiling functionality required a different heap profile format\&. The +\fBjeprof\fR +command is derived from +\fBpprof\fR, with enhancements to support the heap profile format described here\&. +.PP +In the following hypothetical heap profile, +\fB[\&.\&.\&.]\fR +indicates elision for the sake of compactness\&. +.sp +.if n \{\ +.RS 4 +.\} +.nf +heap_v2/524288 + t*: 28106: 56637512 [0: 0] + [\&.\&.\&.] + t3: 352: 16777344 [0: 0] + [\&.\&.\&.] + t99: 17754: 29341640 [0: 0] + [\&.\&.\&.] +@ 0x5f86da8 0x5f5a1dc [\&.\&.\&.] 0x29e4d4e 0xa200316 0xabb2988 [\&.\&.\&.] + t*: 13: 6688 [0: 0] + t3: 12: 6496 [0: ] + t99: 1: 192 [0: 0] +[\&.\&.\&.] + +MAPPED_LIBRARIES: +[\&.\&.\&.] +.fi +.if n \{\ +.RE +.\} +.sp +The following matches the above heap profile, but most tokens are replaced with +\fB<description>\fR +to indicate descriptions of the corresponding fields\&. +.sp +.if n \{\ +.RS 4 +.\} +.nf +<heap_profile_format_version>/<mean_sample_interval> + <aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>] + [\&.\&.\&.] + <thread_3_aggregate>: <curobjs>: <curbytes>[<cumobjs>: <cumbytes>] + [\&.\&.\&.] 
+ <thread_99_aggregate>: <curobjs>: <curbytes>[<cumobjs>: <cumbytes>] + [\&.\&.\&.] +@ <top_frame> <frame> [\&.\&.\&.] <frame> <frame> <frame> [\&.\&.\&.] + <backtrace_aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>] + <backtrace_thread_3>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>] + <backtrace_thread_99>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>] +[\&.\&.\&.] + +MAPPED_LIBRARIES: +</proc/<pid>/maps> +.fi +.if n \{\ +.RE +.\} .SH "DEBUGGING MALLOC PROBLEMS" .PP When debugging, it is a good idea to configure/build jemalloc with the diff --git a/contrib/jemalloc/include/jemalloc/internal/arena.h b/contrib/jemalloc/include/jemalloc/internal/arena.h index 12c6179..3519873 100644 --- a/contrib/jemalloc/include/jemalloc/internal/arena.h +++ b/contrib/jemalloc/include/jemalloc/internal/arena.h @@ -23,6 +23,18 @@ */ #define LG_DIRTY_MULT_DEFAULT 3 +typedef enum { + purge_mode_ratio = 0, + purge_mode_decay = 1, + + purge_mode_limit = 2 +} purge_mode_t; +#define PURGE_DEFAULT purge_mode_ratio +/* Default decay time in seconds. */ +#define DECAY_TIME_DEFAULT 10 +/* Number of event ticks between time checks. */ +#define DECAY_NTICKS_PER_UPDATE 1000 + typedef struct arena_runs_dirty_link_s arena_runs_dirty_link_t; typedef struct arena_run_s arena_run_t; typedef struct arena_chunk_map_bits_s arena_chunk_map_bits_t; @@ -31,6 +43,7 @@ typedef struct arena_chunk_s arena_chunk_t; typedef struct arena_bin_info_s arena_bin_info_t; typedef struct arena_bin_s arena_bin_t; typedef struct arena_s arena_t; +typedef struct arena_tdata_s arena_tdata_t; #endif /* JEMALLOC_H_TYPES */ /******************************************************************************/ @@ -154,15 +167,14 @@ struct arena_chunk_map_misc_s { /* Profile counters, used for large object runs. */ union { - void *prof_tctx_pun; - prof_tctx_t *prof_tctx; + void *prof_tctx_pun; + prof_tctx_t *prof_tctx; }; /* Small region run metadata. 
*/ arena_run_t run; }; }; -typedef rb_tree(arena_chunk_map_misc_t) arena_avail_tree_t; typedef rb_tree(arena_chunk_map_misc_t) arena_run_tree_t; #endif /* JEMALLOC_ARENA_STRUCTS_A */ @@ -220,28 +232,28 @@ struct arena_chunk_s { */ struct arena_bin_info_s { /* Size of regions in a run for this bin's size class. */ - size_t reg_size; + size_t reg_size; /* Redzone size. */ - size_t redzone_size; + size_t redzone_size; /* Interval between regions (reg_size + (redzone_size << 1)). */ - size_t reg_interval; + size_t reg_interval; /* Total size of a run for this bin's size class. */ - size_t run_size; + size_t run_size; /* Total number of regions in a run for this bin's size class. */ - uint32_t nregs; + uint32_t nregs; /* * Metadata used to manipulate bitmaps for runs associated with this * bin. */ - bitmap_info_t bitmap_info; + bitmap_info_t bitmap_info; /* Offset of first region in a run for this bin's size class. */ - uint32_t reg0_offset; + uint32_t reg0_offset; }; struct arena_bin_s { @@ -251,13 +263,13 @@ struct arena_bin_s { * which may be acquired while holding one or more bin locks, but not * vise versa. */ - malloc_mutex_t lock; + malloc_mutex_t lock; /* * Current run being used to service allocations of this bin's size * class. */ - arena_run_t *runcur; + arena_run_t *runcur; /* * Tree of non-full runs. This tree is used when looking for an @@ -266,10 +278,10 @@ struct arena_bin_s { * objects packed well, and it can also help reduce the number of * almost-empty chunks. */ - arena_run_tree_t runs; + arena_run_tree_t runs; /* Bin statistics. */ - malloc_bin_stats_t stats; + malloc_bin_stats_t stats; }; struct arena_s { @@ -278,14 +290,14 @@ struct arena_s { /* * Number of threads currently assigned to this arena. This field is - * protected by arenas_lock. + * synchronized via atomic operations. 
*/ unsigned nthreads; /* * There are three classes of arena operations from a locking * perspective: - * 1) Thread assignment (modifies nthreads) is protected by arenas_lock. + * 1) Thread assignment (modifies nthreads) is synchronized via atomics. * 2) Bin-related operations are protected by bin locks. * 3) Chunk- and run-related operations are protected by this mutex. */ @@ -324,7 +336,7 @@ struct arena_s { /* Minimum ratio (log base 2) of nactive:ndirty. */ ssize_t lg_dirty_mult; - /* True if a thread is currently executing arena_purge(). */ + /* True if a thread is currently executing arena_purge_to_limit(). */ bool purging; /* Number of pages in active runs and huge regions. */ @@ -339,12 +351,6 @@ struct arena_s { size_t ndirty; /* - * Size/address-ordered tree of this arena's available runs. The tree - * is used for first-best-fit run allocation. - */ - arena_avail_tree_t runs_avail; - - /* * Unused dirty memory this arena manages. Dirty memory is conceptually * tracked as an arbitrarily interleaved LRU of dirty runs and cached * chunks, but the list linkage is actually semi-duplicated in order to @@ -375,6 +381,53 @@ struct arena_s { arena_runs_dirty_link_t runs_dirty; extent_node_t chunks_cache; + /* + * Approximate time in seconds from the creation of a set of unused + * dirty pages until an equivalent set of unused dirty pages is purged + * and/or reused. + */ + ssize_t decay_time; + /* decay_time / SMOOTHSTEP_NSTEPS. */ + nstime_t decay_interval; + /* + * Time at which the current decay interval logically started. We do + * not actually advance to a new epoch until sometime after it starts + * because of scheduling and computation delays, and it is even possible + * to completely skip epochs. In all cases, during epoch advancement we + * merge all relevant activity into the most recently recorded epoch. + */ + nstime_t decay_epoch; + /* decay_deadline randomness generator. */ + uint64_t decay_jitter_state; + /* + * Deadline for current epoch. 
This is the sum of decay_interval and + * per epoch jitter which is a uniform random variable in + * [0..decay_interval). Epochs always advance by precise multiples of + * decay_interval, but we randomize the deadline to reduce the + * likelihood of arenas purging in lockstep. + */ + nstime_t decay_deadline; + /* + * Number of dirty pages at beginning of current epoch. During epoch + * advancement we use the delta between decay_ndirty and ndirty to + * determine how many dirty pages, if any, were generated, and record + * the result in decay_backlog. + */ + size_t decay_ndirty; + /* + * Memoized result of arena_decay_backlog_npages_limit() corresponding + * to the current contents of decay_backlog, i.e. the limit on how many + * pages are allowed to exist for the decay epochs. + */ + size_t decay_backlog_npages_limit; + /* + * Trailing log of how many unused dirty pages were generated during + * each of the past SMOOTHSTEP_NSTEPS decay epochs, where the last + * element is the most recent epoch. Corresponding epoch times are + * relative to decay_epoch. + */ + size_t decay_backlog[SMOOTHSTEP_NSTEPS]; + /* Extant huge allocations. */ ql_head(extent_node_t) huge; /* Synchronizes all huge allocation/update/deallocation. */ @@ -402,6 +455,17 @@ struct arena_s { /* bins is used to store trees of free regions. */ arena_bin_t bins[NBINS]; + + /* + * Quantized address-ordered trees of this arena's available runs. The + * trees are used for first-best-fit run allocation. + */ + arena_run_tree_t runs_avail[1]; /* Dynamically sized. */ +}; + +/* Used in conjunction with tsd for fast arena-related context lookup. 
*/ +struct arena_tdata_s { + ticker_t decay_ticker; }; #endif /* JEMALLOC_ARENA_STRUCTS_B */ @@ -417,7 +481,10 @@ static const size_t large_pad = #endif ; +extern purge_mode_t opt_purge; +extern const char *purge_mode_names[]; extern ssize_t opt_lg_dirty_mult; +extern ssize_t opt_decay_time; extern arena_bin_info_t arena_bin_info[NBINS]; @@ -425,9 +492,15 @@ extern size_t map_bias; /* Number of arena chunk header pages. */ extern size_t map_misc_offset; extern size_t arena_maxrun; /* Max run size for arenas. */ extern size_t large_maxclass; /* Max large size class. */ +extern size_t run_quantize_max; /* Max run_quantize_*() input. */ extern unsigned nlclasses; /* Number of large size classes. */ extern unsigned nhclasses; /* Number of huge size classes. */ +#ifdef JEMALLOC_JET +typedef size_t (run_quantize_t)(size_t); +extern run_quantize_t *run_quantize_floor; +extern run_quantize_t *run_quantize_ceil; +#endif void arena_chunk_cache_maybe_insert(arena_t *arena, extent_node_t *node, bool cache); void arena_chunk_cache_maybe_remove(arena_t *arena, extent_node_t *node, @@ -445,9 +518,11 @@ bool arena_chunk_ralloc_huge_expand(arena_t *arena, void *chunk, size_t oldsize, size_t usize, bool *zero); ssize_t arena_lg_dirty_mult_get(arena_t *arena); bool arena_lg_dirty_mult_set(arena_t *arena, ssize_t lg_dirty_mult); +ssize_t arena_decay_time_get(arena_t *arena); +bool arena_decay_time_set(arena_t *arena, ssize_t decay_time); void arena_maybe_purge(arena_t *arena); -void arena_purge_all(arena_t *arena); -void arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, +void arena_purge(arena_t *arena, bool all); +void arena_tcache_fill_small(tsd_t *tsd, arena_t *arena, tcache_bin_t *tbin, szind_t binind, uint64_t prof_accumbytes); void arena_alloc_junk_small(void *ptr, arena_bin_info_t *bin_info, bool zero); @@ -461,8 +536,9 @@ extern arena_dalloc_junk_small_t *arena_dalloc_junk_small; void arena_dalloc_junk_small(void *ptr, arena_bin_info_t *bin_info); #endif void 
arena_quarantine_junk_small(void *ptr, size_t usize); -void *arena_malloc_small(arena_t *arena, size_t size, bool zero); -void *arena_malloc_large(arena_t *arena, size_t size, bool zero); +void *arena_malloc_large(tsd_t *tsd, arena_t *arena, szind_t ind, bool zero); +void *arena_malloc_hard(tsd_t *tsd, arena_t *arena, size_t size, szind_t ind, + bool zero, tcache_t *tcache); void *arena_palloc(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, bool zero, tcache_t *tcache); void arena_prof_promoted(const void *ptr, size_t size); @@ -470,8 +546,8 @@ void arena_dalloc_bin_junked_locked(arena_t *arena, arena_chunk_t *chunk, void *ptr, arena_chunk_map_bits_t *bitselm); void arena_dalloc_bin(arena_t *arena, arena_chunk_t *chunk, void *ptr, size_t pageind, arena_chunk_map_bits_t *bitselm); -void arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr, - size_t pageind); +void arena_dalloc_small(tsd_t *tsd, arena_t *arena, arena_chunk_t *chunk, + void *ptr, size_t pageind); #ifdef JEMALLOC_JET typedef void (arena_dalloc_junk_large_t)(void *, size_t); extern arena_dalloc_junk_large_t *arena_dalloc_junk_large; @@ -480,12 +556,13 @@ void arena_dalloc_junk_large(void *ptr, size_t usize); #endif void arena_dalloc_large_junked_locked(arena_t *arena, arena_chunk_t *chunk, void *ptr); -void arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr); +void arena_dalloc_large(tsd_t *tsd, arena_t *arena, arena_chunk_t *chunk, + void *ptr); #ifdef JEMALLOC_JET typedef void (arena_ralloc_junk_large_t)(void *, size_t, size_t); extern arena_ralloc_junk_large_t *arena_ralloc_junk_large; #endif -bool arena_ralloc_no_move(void *ptr, size_t oldsize, size_t size, +bool arena_ralloc_no_move(tsd_t *tsd, void *ptr, size_t oldsize, size_t size, size_t extra, bool zero); void *arena_ralloc(tsd_t *tsd, arena_t *arena, void *ptr, size_t oldsize, size_t size, size_t alignment, bool zero, tcache_t *tcache); @@ -493,10 +570,18 @@ dss_prec_t arena_dss_prec_get(arena_t 
*arena); bool arena_dss_prec_set(arena_t *arena, dss_prec_t dss_prec); ssize_t arena_lg_dirty_mult_default_get(void); bool arena_lg_dirty_mult_default_set(ssize_t lg_dirty_mult); -void arena_stats_merge(arena_t *arena, const char **dss, - ssize_t *lg_dirty_mult, size_t *nactive, size_t *ndirty, - arena_stats_t *astats, malloc_bin_stats_t *bstats, +ssize_t arena_decay_time_default_get(void); +bool arena_decay_time_default_set(ssize_t decay_time); +void arena_basic_stats_merge(arena_t *arena, unsigned *nthreads, + const char **dss, ssize_t *lg_dirty_mult, ssize_t *decay_time, + size_t *nactive, size_t *ndirty); +void arena_stats_merge(arena_t *arena, unsigned *nthreads, const char **dss, + ssize_t *lg_dirty_mult, ssize_t *decay_time, size_t *nactive, + size_t *ndirty, arena_stats_t *astats, malloc_bin_stats_t *bstats, malloc_large_stats_t *lstats, malloc_huge_stats_t *hstats); +unsigned arena_nthreads_get(arena_t *arena); +void arena_nthreads_inc(arena_t *arena); +void arena_nthreads_dec(arena_t *arena); arena_t *arena_new(unsigned ind); bool arena_boot(void); void arena_prefork(arena_t *arena); @@ -512,7 +597,7 @@ arena_chunk_map_bits_t *arena_bitselm_get(arena_chunk_t *chunk, size_t pageind); arena_chunk_map_misc_t *arena_miscelm_get(arena_chunk_t *chunk, size_t pageind); -size_t arena_miscelm_to_pageind(arena_chunk_map_misc_t *miscelm); +size_t arena_miscelm_to_pageind(const arena_chunk_map_misc_t *miscelm); void *arena_miscelm_to_rpages(arena_chunk_map_misc_t *miscelm); arena_chunk_map_misc_t *arena_rd_to_miscelm(arena_runs_dirty_link_t *rd); arena_chunk_map_misc_t *arena_run_to_miscelm(arena_run_t *run); @@ -552,17 +637,19 @@ bool arena_prof_accum_locked(arena_t *arena, uint64_t accumbytes); bool arena_prof_accum(arena_t *arena, uint64_t accumbytes); szind_t arena_ptr_small_binind_get(const void *ptr, size_t mapbits); szind_t arena_bin_index(arena_t *arena, arena_bin_t *bin); -unsigned arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, +size_t 
arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, const void *ptr); prof_tctx_t *arena_prof_tctx_get(const void *ptr); void arena_prof_tctx_set(const void *ptr, size_t usize, prof_tctx_t *tctx); void arena_prof_tctx_reset(const void *ptr, size_t usize, const void *old_ptr, prof_tctx_t *old_tctx); -void *arena_malloc(tsd_t *tsd, arena_t *arena, size_t size, bool zero, - tcache_t *tcache); +void arena_decay_ticks(tsd_t *tsd, arena_t *arena, unsigned nticks); +void arena_decay_tick(tsd_t *tsd, arena_t *arena); +void *arena_malloc(tsd_t *tsd, arena_t *arena, size_t size, szind_t ind, + bool zero, tcache_t *tcache, bool slow_path); arena_t *arena_aalloc(const void *ptr); size_t arena_salloc(const void *ptr, bool demote); -void arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache); +void arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache, bool slow_path); void arena_sdalloc(tsd_t *tsd, void *ptr, size_t size, tcache_t *tcache); #endif @@ -590,7 +677,7 @@ arena_miscelm_get(arena_chunk_t *chunk, size_t pageind) } JEMALLOC_ALWAYS_INLINE size_t -arena_miscelm_to_pageind(arena_chunk_map_misc_t *miscelm) +arena_miscelm_to_pageind(const arena_chunk_map_misc_t *miscelm) { arena_chunk_t *chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(miscelm); size_t pageind = ((uintptr_t)miscelm - ((uintptr_t)chunk + @@ -970,7 +1057,7 @@ arena_ptr_small_binind_get(const void *ptr, size_t mapbits) run = &miscelm->run; run_binind = run->binind; bin = &arena->bins[run_binind]; - actual_binind = bin - arena->bins; + actual_binind = (szind_t)(bin - arena->bins); assert(run_binind == actual_binind); bin_info = &arena_bin_info[actual_binind]; rpages = arena_miscelm_to_rpages(miscelm); @@ -987,16 +1074,15 @@ arena_ptr_small_binind_get(const void *ptr, size_t mapbits) JEMALLOC_INLINE szind_t arena_bin_index(arena_t *arena, arena_bin_t *bin) { - szind_t binind = bin - arena->bins; + szind_t binind = (szind_t)(bin - arena->bins); assert(binind < NBINS); return (binind); } -JEMALLOC_INLINE unsigned 
+JEMALLOC_INLINE size_t arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, const void *ptr) { - unsigned shift, diff, regind; - size_t interval; + size_t diff, interval, shift, regind; arena_chunk_map_misc_t *miscelm = arena_run_to_miscelm(run); void *rpages = arena_miscelm_to_rpages(miscelm); @@ -1011,12 +1097,12 @@ arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, const void *ptr) * Avoid doing division with a variable divisor if possible. Using * actual division here can reduce allocator throughput by over 20%! */ - diff = (unsigned)((uintptr_t)ptr - (uintptr_t)rpages - + diff = (size_t)((uintptr_t)ptr - (uintptr_t)rpages - bin_info->reg0_offset); /* Rescale (factor powers of 2 out of the numerator and denominator). */ interval = bin_info->reg_interval; - shift = jemalloc_ffs(interval) - 1; + shift = ffs_zu(interval) - 1; diff >>= shift; interval >>= shift; @@ -1038,9 +1124,9 @@ arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, const void *ptr) * divide by 0, and 1 and 2 are both powers of two, which are * handled above. 
*/ -#define SIZE_INV_SHIFT ((sizeof(unsigned) << 3) - LG_RUN_MAXREGS) -#define SIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s)) + 1) - static const unsigned interval_invs[] = { +#define SIZE_INV_SHIFT ((sizeof(size_t) << 3) - LG_RUN_MAXREGS) +#define SIZE_INV(s) (((ZU(1) << SIZE_INV_SHIFT) / (s)) + 1) + static const size_t interval_invs[] = { SIZE_INV(3), SIZE_INV(4), SIZE_INV(5), SIZE_INV(6), SIZE_INV(7), SIZE_INV(8), SIZE_INV(9), SIZE_INV(10), SIZE_INV(11), @@ -1051,8 +1137,8 @@ arena_run_regind(arena_run_t *run, arena_bin_info_t *bin_info, const void *ptr) SIZE_INV(28), SIZE_INV(29), SIZE_INV(30), SIZE_INV(31) }; - if (likely(interval <= ((sizeof(interval_invs) / - sizeof(unsigned)) + 2))) { + if (likely(interval <= ((sizeof(interval_invs) / sizeof(size_t)) + + 2))) { regind = (diff * interval_invs[interval - 3]) >> SIZE_INV_SHIFT; } else @@ -1157,35 +1243,48 @@ arena_prof_tctx_reset(const void *ptr, size_t usize, const void *old_ptr, } } +JEMALLOC_ALWAYS_INLINE void +arena_decay_ticks(tsd_t *tsd, arena_t *arena, unsigned nticks) +{ + ticker_t *decay_ticker; + + if (unlikely(tsd == NULL)) + return; + decay_ticker = decay_ticker_get(tsd, arena->ind); + if (unlikely(decay_ticker == NULL)) + return; + if (unlikely(ticker_ticks(decay_ticker, nticks))) + arena_purge(arena, false); +} + +JEMALLOC_ALWAYS_INLINE void +arena_decay_tick(tsd_t *tsd, arena_t *arena) +{ + + arena_decay_ticks(tsd, arena, 1); +} + JEMALLOC_ALWAYS_INLINE void * -arena_malloc(tsd_t *tsd, arena_t *arena, size_t size, bool zero, - tcache_t *tcache) +arena_malloc(tsd_t *tsd, arena_t *arena, size_t size, szind_t ind, bool zero, + tcache_t *tcache, bool slow_path) { assert(size != 0); - arena = arena_choose(tsd, arena); - if (unlikely(arena == NULL)) - return (NULL); - - if (likely(size <= SMALL_MAXCLASS)) { - if (likely(tcache != NULL)) { + if (likely(tcache != NULL)) { + if (likely(size <= SMALL_MAXCLASS)) { return (tcache_alloc_small(tsd, arena, tcache, size, - zero)); - } else - return 
(arena_malloc_small(arena, size, zero)); - } else if (likely(size <= large_maxclass)) { - /* - * Initialize tcache after checking size in order to avoid - * infinite recursion during tcache initialization. - */ - if (likely(tcache != NULL) && size <= tcache_maxclass) { + ind, zero, slow_path)); + } + if (likely(size <= tcache_maxclass)) { return (tcache_alloc_large(tsd, arena, tcache, size, - zero)); - } else - return (arena_malloc_large(arena, size, zero)); - } else - return (huge_malloc(tsd, arena, size, zero, tcache)); + ind, zero, slow_path)); + } + /* (size > tcache_maxclass) case falls through. */ + assert(size > tcache_maxclass); + } + + return (arena_malloc_hard(tsd, arena, size, ind, zero, tcache)); } JEMALLOC_ALWAYS_INLINE arena_t * @@ -1251,7 +1350,7 @@ arena_salloc(const void *ptr, bool demote) } JEMALLOC_ALWAYS_INLINE void -arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache) +arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache, bool slow_path) { arena_chunk_t *chunk; size_t pageind, mapbits; @@ -1268,9 +1367,10 @@ arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache) if (likely(tcache != NULL)) { szind_t binind = arena_ptr_small_binind_get(ptr, mapbits); - tcache_dalloc_small(tsd, tcache, ptr, binind); + tcache_dalloc_small(tsd, tcache, ptr, binind, + slow_path); } else { - arena_dalloc_small(extent_node_arena_get( + arena_dalloc_small(tsd, extent_node_arena_get( &chunk->node), chunk, ptr, pageind); } } else { @@ -1283,9 +1383,9 @@ arena_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache) if (likely(tcache != NULL) && size - large_pad <= tcache_maxclass) { tcache_dalloc_large(tsd, tcache, ptr, size - - large_pad); + large_pad, slow_path); } else { - arena_dalloc_large(extent_node_arena_get( + arena_dalloc_large(tsd, extent_node_arena_get( &chunk->node), chunk, ptr); } } @@ -1303,7 +1403,8 @@ arena_sdalloc(tsd_t *tsd, void *ptr, size_t size, tcache_t *tcache) if (config_prof && opt_prof) { size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> LG_PAGE; 
- assert(arena_mapbits_allocated_get(chunk, pageind) != 0); + assert(arena_mapbits_allocated_get(chunk, pageind) != + 0); if (arena_mapbits_large_get(chunk, pageind) != 0) { /* * Make sure to use promoted size, not request @@ -1319,21 +1420,23 @@ arena_sdalloc(tsd_t *tsd, void *ptr, size_t size, tcache_t *tcache) /* Small allocation. */ if (likely(tcache != NULL)) { szind_t binind = size2index(size); - tcache_dalloc_small(tsd, tcache, ptr, binind); + tcache_dalloc_small(tsd, tcache, ptr, binind, + true); } else { size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> LG_PAGE; - arena_dalloc_small(extent_node_arena_get( + arena_dalloc_small(tsd, extent_node_arena_get( &chunk->node), chunk, ptr, pageind); } } else { assert(config_cache_oblivious || ((uintptr_t)ptr & PAGE_MASK) == 0); - if (likely(tcache != NULL) && size <= tcache_maxclass) - tcache_dalloc_large(tsd, tcache, ptr, size); - else { - arena_dalloc_large(extent_node_arena_get( + if (likely(tcache != NULL) && size <= tcache_maxclass) { + tcache_dalloc_large(tsd, tcache, ptr, size, + true); + } else { + arena_dalloc_large(tsd, extent_node_arena_get( &chunk->node), chunk, ptr); } } diff --git a/contrib/jemalloc/include/jemalloc/internal/assert.h b/contrib/jemalloc/include/jemalloc/internal/assert.h new file mode 100644 index 0000000..6f8f7eb --- /dev/null +++ b/contrib/jemalloc/include/jemalloc/internal/assert.h @@ -0,0 +1,45 @@ +/* + * Define a custom assert() in order to reduce the chances of deadlock during + * assertion failure. 
+ */ +#ifndef assert +#define assert(e) do { \ + if (unlikely(config_debug && !(e))) { \ + malloc_printf( \ + "<jemalloc>: %s:%d: Failed assertion: \"%s\"\n", \ + __FILE__, __LINE__, #e); \ + abort(); \ + } \ +} while (0) +#endif + +#ifndef not_reached +#define not_reached() do { \ + if (config_debug) { \ + malloc_printf( \ + "<jemalloc>: %s:%d: Unreachable code reached\n", \ + __FILE__, __LINE__); \ + abort(); \ + } \ + unreachable(); \ +} while (0) +#endif + +#ifndef not_implemented +#define not_implemented() do { \ + if (config_debug) { \ + malloc_printf("<jemalloc>: %s:%d: Not implemented\n", \ + __FILE__, __LINE__); \ + abort(); \ + } \ +} while (0) +#endif + +#ifndef assert_not_implemented +#define assert_not_implemented(e) do { \ + if (unlikely(config_debug && !(e))) \ + not_implemented(); \ +} while (0) +#endif + + diff --git a/contrib/jemalloc/include/jemalloc/internal/atomic.h b/contrib/jemalloc/include/jemalloc/internal/atomic.h index a9aad35..3f15ea1 100644 --- a/contrib/jemalloc/include/jemalloc/internal/atomic.h +++ b/contrib/jemalloc/include/jemalloc/internal/atomic.h @@ -28,8 +28,8 @@ * callers. * * <t> atomic_read_<t>(<t> *p) { return (*p); } - * <t> atomic_add_<t>(<t> *p, <t> x) { return (*p + x); } - * <t> atomic_sub_<t>(<t> *p, <t> x) { return (*p - x); } + * <t> atomic_add_<t>(<t> *p, <t> x) { return (*p += x); } + * <t> atomic_sub_<t>(<t> *p, <t> x) { return (*p -= x); } * bool atomic_cas_<t>(<t> *p, <t> c, <t> s) * { * if (*p != c) diff --git a/contrib/jemalloc/include/jemalloc/internal/bitmap.h b/contrib/jemalloc/include/jemalloc/internal/bitmap.h index fcc6005..2594e3a 100644 --- a/contrib/jemalloc/include/jemalloc/internal/bitmap.h +++ b/contrib/jemalloc/include/jemalloc/internal/bitmap.h @@ -15,6 +15,15 @@ typedef unsigned long bitmap_t; #define BITMAP_GROUP_NBITS (ZU(1) << LG_BITMAP_GROUP_NBITS) #define BITMAP_GROUP_NBITS_MASK (BITMAP_GROUP_NBITS-1) +/* + * Do some analysis on how big the bitmap is before we use a tree. 
For a brute + * force linear search, if we would have to call ffsl more than 2^3 times, use a + * tree instead. + */ +#if LG_BITMAP_MAXBITS - LG_BITMAP_GROUP_NBITS > 3 +# define USE_TREE +#endif + /* Number of groups required to store a given number of bits. */ #define BITMAP_BITS2GROUPS(nbits) \ ((nbits + BITMAP_GROUP_NBITS_MASK) >> LG_BITMAP_GROUP_NBITS) @@ -48,6 +57,8 @@ typedef unsigned long bitmap_t; /* * Maximum number of groups required to support LG_BITMAP_MAXBITS. */ +#ifdef USE_TREE + #if LG_BITMAP_MAXBITS <= LG_BITMAP_GROUP_NBITS # define BITMAP_GROUPS_MAX BITMAP_GROUPS_1_LEVEL(BITMAP_MAXBITS) #elif LG_BITMAP_MAXBITS <= LG_BITMAP_GROUP_NBITS * 2 @@ -65,6 +76,12 @@ typedef unsigned long bitmap_t; (LG_BITMAP_MAXBITS / LG_SIZEOF_BITMAP) \ + !!(LG_BITMAP_MAXBITS % LG_SIZEOF_BITMAP) +#else /* USE_TREE */ + +#define BITMAP_GROUPS_MAX BITMAP_BITS2GROUPS(BITMAP_MAXBITS) + +#endif /* USE_TREE */ + #endif /* JEMALLOC_H_TYPES */ /******************************************************************************/ #ifdef JEMALLOC_H_STRUCTS @@ -78,6 +95,7 @@ struct bitmap_info_s { /* Logical number of bits in bitmap (stored at bottom level). */ size_t nbits; +#ifdef USE_TREE /* Number of levels necessary for nbits. */ unsigned nlevels; @@ -86,6 +104,10 @@ struct bitmap_info_s { * bottom to top (e.g. the bottom level is stored in levels[0]). */ bitmap_level_t levels[BITMAP_MAX_LEVELS+1]; +#else /* USE_TREE */ + /* Number of groups necessary for nbits. 
*/ + size_t ngroups; +#endif /* USE_TREE */ }; #endif /* JEMALLOC_H_STRUCTS */ @@ -93,9 +115,8 @@ struct bitmap_info_s { #ifdef JEMALLOC_H_EXTERNS void bitmap_info_init(bitmap_info_t *binfo, size_t nbits); -size_t bitmap_info_ngroups(const bitmap_info_t *binfo); -size_t bitmap_size(size_t nbits); void bitmap_init(bitmap_t *bitmap, const bitmap_info_t *binfo); +size_t bitmap_size(const bitmap_info_t *binfo); #endif /* JEMALLOC_H_EXTERNS */ /******************************************************************************/ @@ -113,10 +134,20 @@ void bitmap_unset(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit); JEMALLOC_INLINE bool bitmap_full(bitmap_t *bitmap, const bitmap_info_t *binfo) { - unsigned rgoff = binfo->levels[binfo->nlevels].group_offset - 1; +#ifdef USE_TREE + size_t rgoff = binfo->levels[binfo->nlevels].group_offset - 1; bitmap_t rg = bitmap[rgoff]; /* The bitmap is full iff the root group is 0. */ return (rg == 0); +#else + size_t i; + + for (i = 0; i < binfo->ngroups; i++) { + if (bitmap[i] != 0) + return (false); + } + return (true); +#endif } JEMALLOC_INLINE bool @@ -128,7 +159,7 @@ bitmap_get(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) assert(bit < binfo->nbits); goff = bit >> LG_BITMAP_GROUP_NBITS; g = bitmap[goff]; - return (!(g & (1LU << (bit & BITMAP_GROUP_NBITS_MASK)))); + return (!(g & (ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK)))); } JEMALLOC_INLINE void @@ -143,10 +174,11 @@ bitmap_set(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) goff = bit >> LG_BITMAP_GROUP_NBITS; gp = &bitmap[goff]; g = *gp; - assert(g & (1LU << (bit & BITMAP_GROUP_NBITS_MASK))); - g ^= 1LU << (bit & BITMAP_GROUP_NBITS_MASK); + assert(g & (ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK))); + g ^= ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK); *gp = g; assert(bitmap_get(bitmap, binfo, bit)); +#ifdef USE_TREE /* Propagate group state transitions up the tree. 
*/ if (g == 0) { unsigned i; @@ -155,13 +187,14 @@ bitmap_set(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) goff = bit >> LG_BITMAP_GROUP_NBITS; gp = &bitmap[binfo->levels[i].group_offset + goff]; g = *gp; - assert(g & (1LU << (bit & BITMAP_GROUP_NBITS_MASK))); - g ^= 1LU << (bit & BITMAP_GROUP_NBITS_MASK); + assert(g & (ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK))); + g ^= ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK); *gp = g; if (g != 0) break; } } +#endif } /* sfu: set first unset. */ @@ -174,15 +207,24 @@ bitmap_sfu(bitmap_t *bitmap, const bitmap_info_t *binfo) assert(!bitmap_full(bitmap, binfo)); +#ifdef USE_TREE i = binfo->nlevels - 1; g = bitmap[binfo->levels[i].group_offset]; - bit = jemalloc_ffsl(g) - 1; + bit = ffs_lu(g) - 1; while (i > 0) { i--; g = bitmap[binfo->levels[i].group_offset + bit]; - bit = (bit << LG_BITMAP_GROUP_NBITS) + (jemalloc_ffsl(g) - 1); + bit = (bit << LG_BITMAP_GROUP_NBITS) + (ffs_lu(g) - 1); } - +#else + i = 0; + g = bitmap[0]; + while ((bit = ffs_lu(g)) == 0) { + i++; + g = bitmap[i]; + } + bit = (bit - 1) + (i << 6); +#endif bitmap_set(bitmap, binfo, bit); return (bit); } @@ -193,7 +235,7 @@ bitmap_unset(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) size_t goff; bitmap_t *gp; bitmap_t g; - bool propagate; + UNUSED bool propagate; assert(bit < binfo->nbits); assert(bitmap_get(bitmap, binfo, bit)); @@ -201,10 +243,11 @@ bitmap_unset(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) gp = &bitmap[goff]; g = *gp; propagate = (g == 0); - assert((g & (1LU << (bit & BITMAP_GROUP_NBITS_MASK))) == 0); - g ^= 1LU << (bit & BITMAP_GROUP_NBITS_MASK); + assert((g & (ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK))) == 0); + g ^= ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK); *gp = g; assert(!bitmap_get(bitmap, binfo, bit)); +#ifdef USE_TREE /* Propagate group state transitions up the tree. 
*/ if (propagate) { unsigned i; @@ -214,14 +257,15 @@ bitmap_unset(bitmap_t *bitmap, const bitmap_info_t *binfo, size_t bit) gp = &bitmap[binfo->levels[i].group_offset + goff]; g = *gp; propagate = (g == 0); - assert((g & (1LU << (bit & BITMAP_GROUP_NBITS_MASK))) + assert((g & (ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK))) == 0); - g ^= 1LU << (bit & BITMAP_GROUP_NBITS_MASK); + g ^= ZU(1) << (bit & BITMAP_GROUP_NBITS_MASK); *gp = g; if (!propagate) break; } } +#endif /* USE_TREE */ } #endif diff --git a/contrib/jemalloc/include/jemalloc/internal/chunk_mmap.h b/contrib/jemalloc/include/jemalloc/internal/chunk_mmap.h index 7d8014c..6f2d0ac 100644 --- a/contrib/jemalloc/include/jemalloc/internal/chunk_mmap.h +++ b/contrib/jemalloc/include/jemalloc/internal/chunk_mmap.h @@ -9,8 +9,8 @@ /******************************************************************************/ #ifdef JEMALLOC_H_EXTERNS -void *chunk_alloc_mmap(size_t size, size_t alignment, bool *zero, - bool *commit); +void *chunk_alloc_mmap(void *new_addr, size_t size, size_t alignment, + bool *zero, bool *commit); bool chunk_dalloc_mmap(void *chunk, size_t size); #endif /* JEMALLOC_H_EXTERNS */ diff --git a/contrib/jemalloc/include/jemalloc/internal/ckh.h b/contrib/jemalloc/include/jemalloc/internal/ckh.h index 75c1c97..f75ad90 100644 --- a/contrib/jemalloc/include/jemalloc/internal/ckh.h +++ b/contrib/jemalloc/include/jemalloc/internal/ckh.h @@ -40,9 +40,7 @@ struct ckh_s { #endif /* Used for pseudo-random number generation. */ -#define CKH_A 1103515241 -#define CKH_C 12347 - uint32_t prng_state; + uint64_t prng_state; /* Total number of items. 
*/ size_t count; @@ -74,7 +72,7 @@ bool ckh_iter(ckh_t *ckh, size_t *tabind, void **key, void **data); bool ckh_insert(tsd_t *tsd, ckh_t *ckh, const void *key, const void *data); bool ckh_remove(tsd_t *tsd, ckh_t *ckh, const void *searchkey, void **key, void **data); -bool ckh_search(ckh_t *ckh, const void *seachkey, void **key, void **data); +bool ckh_search(ckh_t *ckh, const void *searchkey, void **key, void **data); void ckh_string_hash(const void *key, size_t r_hash[2]); bool ckh_string_keycomp(const void *k1, const void *k2); void ckh_pointer_hash(const void *key, size_t r_hash[2]); diff --git a/contrib/jemalloc/include/jemalloc/internal/ctl.h b/contrib/jemalloc/include/jemalloc/internal/ctl.h index 751c14b..9c5e932 100644 --- a/contrib/jemalloc/include/jemalloc/internal/ctl.h +++ b/contrib/jemalloc/include/jemalloc/internal/ctl.h @@ -35,8 +35,12 @@ struct ctl_arena_stats_s { unsigned nthreads; const char *dss; ssize_t lg_dirty_mult; + ssize_t decay_time; size_t pactive; size_t pdirty; + + /* The remainder are only populated if config_stats is true. */ + arena_stats_t astats; /* Aggregate stats for small size classes, based on bin stats. */ diff --git a/contrib/jemalloc/include/jemalloc/internal/hash.h b/contrib/jemalloc/include/jemalloc/internal/hash.h index bcead33..864fda8 100644 --- a/contrib/jemalloc/include/jemalloc/internal/hash.h +++ b/contrib/jemalloc/include/jemalloc/internal/hash.h @@ -1,6 +1,6 @@ /* * The following hash function is based on MurmurHash3, placed into the public - * domain by Austin Appleby. See http://code.google.com/p/smhasher/ for + * domain by Austin Appleby. See https://github.com/aappleby/smhasher for * details. */ /******************************************************************************/ @@ -49,6 +49,14 @@ JEMALLOC_INLINE uint32_t hash_get_block_32(const uint32_t *p, int i) { + /* Handle unaligned read. 
*/ + if (unlikely((uintptr_t)p & (sizeof(uint32_t)-1)) != 0) { + uint32_t ret; + + memcpy(&ret, &p[i], sizeof(uint32_t)); + return (ret); + } + return (p[i]); } @@ -56,6 +64,14 @@ JEMALLOC_INLINE uint64_t hash_get_block_64(const uint64_t *p, int i) { + /* Handle unaligned read. */ + if (unlikely((uintptr_t)p & (sizeof(uint64_t)-1)) != 0) { + uint64_t ret; + + memcpy(&ret, &p[i], sizeof(uint64_t)); + return (ret); + } + return (p[i]); } @@ -321,13 +337,18 @@ hash_x64_128(const void *key, const int len, const uint32_t seed, JEMALLOC_INLINE void hash(const void *key, size_t len, const uint32_t seed, size_t r_hash[2]) { + + assert(len <= INT_MAX); /* Unfortunate implementation limitation. */ + #if (LG_SIZEOF_PTR == 3 && !defined(JEMALLOC_BIG_ENDIAN)) - hash_x64_128(key, len, seed, (uint64_t *)r_hash); + hash_x64_128(key, (int)len, seed, (uint64_t *)r_hash); #else - uint64_t hashes[2]; - hash_x86_128(key, len, seed, hashes); - r_hash[0] = (size_t)hashes[0]; - r_hash[1] = (size_t)hashes[1]; + { + uint64_t hashes[2]; + hash_x86_128(key, (int)len, seed, hashes); + r_hash[0] = (size_t)hashes[0]; + r_hash[1] = (size_t)hashes[1]; + } #endif } #endif diff --git a/contrib/jemalloc/include/jemalloc/internal/huge.h b/contrib/jemalloc/include/jemalloc/internal/huge.h index ece7af9..cb6f69e 100644 --- a/contrib/jemalloc/include/jemalloc/internal/huge.h +++ b/contrib/jemalloc/include/jemalloc/internal/huge.h @@ -9,12 +9,12 @@ /******************************************************************************/ #ifdef JEMALLOC_H_EXTERNS -void *huge_malloc(tsd_t *tsd, arena_t *arena, size_t size, bool zero, +void *huge_malloc(tsd_t *tsd, arena_t *arena, size_t usize, bool zero, tcache_t *tcache); -void *huge_palloc(tsd_t *tsd, arena_t *arena, size_t size, size_t alignment, +void *huge_palloc(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, bool zero, tcache_t *tcache); -bool huge_ralloc_no_move(void *ptr, size_t oldsize, size_t usize_min, - size_t usize_max, bool zero); +bool 
huge_ralloc_no_move(tsd_t *tsd, void *ptr, size_t oldsize,
+    size_t usize_min, size_t usize_max, bool zero);
 void *huge_ralloc(tsd_t *tsd, arena_t *arena, void *ptr, size_t oldsize,
     size_t usize, size_t alignment, bool zero, tcache_t *tcache);
 #ifdef JEMALLOC_JET
diff --git a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h
index c34c237..7f77d12 100644
--- a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h
+++ b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h
@@ -46,6 +46,7 @@ static const bool config_fill =
 #endif
     ;
 static const bool config_lazy_lock = true;
+static const char * const config_malloc_conf = JEMALLOC_CONFIG_MALLOC_CONF;
 static const bool config_prof =
 #ifdef JEMALLOC_PROF
     true
@@ -253,9 +254,6 @@ typedef unsigned szind_t;
 # ifdef __powerpc__
 # define LG_QUANTUM 4
 # endif
-# ifdef __riscv__
-# define LG_QUANTUM 4
-# endif
 # ifdef __s390__
 # define LG_QUANTUM 4
 # endif
@@ -355,12 +353,15 @@ typedef unsigned szind_t;
 # define VARIABLE_ARRAY(type, name, count) type name[(count)]
 #endif
 
+#include "jemalloc/internal/nstime.h"
 #include "jemalloc/internal/valgrind.h"
 #include "jemalloc/internal/util.h"
 #include "jemalloc/internal/atomic.h"
 #include "jemalloc/internal/prng.h"
+#include "jemalloc/internal/ticker.h"
 #include "jemalloc/internal/ckh.h"
 #include "jemalloc/internal/size_classes.h"
+#include "jemalloc/internal/smoothstep.h"
 #include "jemalloc/internal/stats.h"
 #include "jemalloc/internal/ctl.h"
 #include "jemalloc/internal/mutex.h"
@@ -383,12 +384,15 @@ typedef unsigned szind_t;
 /******************************************************************************/
 #define JEMALLOC_H_STRUCTS
 
+#include "jemalloc/internal/nstime.h"
 #include "jemalloc/internal/valgrind.h"
 #include "jemalloc/internal/util.h"
 #include "jemalloc/internal/atomic.h"
 #include "jemalloc/internal/prng.h"
+#include "jemalloc/internal/ticker.h"
 #include "jemalloc/internal/ckh.h"
 #include "jemalloc/internal/size_classes.h"
+#include "jemalloc/internal/smoothstep.h"
 #include "jemalloc/internal/stats.h"
 #include "jemalloc/internal/ctl.h"
 #include "jemalloc/internal/mutex.h"
@@ -426,18 +430,24 @@ extern bool opt_redzone;
 extern bool opt_utrace;
 extern bool opt_xmalloc;
 extern bool opt_zero;
-extern size_t opt_narenas;
+extern unsigned opt_narenas;
 
 extern bool in_valgrind;
 
 /* Number of CPUs. */
-extern unsigned ncpus;
+extern unsigned ncpus;
+
+/*
+ * Arenas that are used to service external requests. Not all elements of the
+ * arenas array are necessarily used; arenas are created lazily as needed.
+ */
+extern arena_t **arenas;
 
 /*
  * index2size_tab encodes the same information as could be computed (at
  * unacceptable cost in some code paths) by index2size_compute().
  */
-extern size_t const index2size_tab[NSIZES];
+extern size_t const index2size_tab[NSIZES+1];
 /*
  * size2index_tab is a compact lookup table that rounds request sizes up to
  * size classes. In order to reduce cache footprint, the table is compressed,
@@ -445,35 +455,36 @@ extern size_t const index2size_tab[NSIZES];
  */
 extern uint8_t const size2index_tab[];
 
-arena_t *a0get(void);
 void *a0malloc(size_t size);
 void a0dalloc(void *ptr);
 void *bootstrap_malloc(size_t size);
 void *bootstrap_calloc(size_t num, size_t size);
 void bootstrap_free(void *ptr);
 arena_t *arenas_extend(unsigned ind);
-arena_t *arena_init(unsigned ind);
 unsigned narenas_total_get(void);
-arena_t *arena_get_hard(tsd_t *tsd, unsigned ind, bool init_if_missing);
+arena_t *arena_init(unsigned ind);
+arena_tdata_t *arena_tdata_get_hard(tsd_t *tsd, unsigned ind);
 arena_t *arena_choose_hard(tsd_t *tsd);
 void arena_migrate(tsd_t *tsd, unsigned oldind, unsigned newind);
-unsigned arena_nbound(unsigned ind);
 void thread_allocated_cleanup(tsd_t *tsd);
 void thread_deallocated_cleanup(tsd_t *tsd);
 void arena_cleanup(tsd_t *tsd);
-void arenas_cache_cleanup(tsd_t *tsd);
-void narenas_cache_cleanup(tsd_t *tsd);
-void arenas_cache_bypass_cleanup(tsd_t *tsd);
+void arenas_tdata_cleanup(tsd_t *tsd);
+void narenas_tdata_cleanup(tsd_t *tsd);
+void arenas_tdata_bypass_cleanup(tsd_t *tsd);
 void jemalloc_prefork(void);
 void jemalloc_postfork_parent(void);
 void jemalloc_postfork_child(void);
 
+#include "jemalloc/internal/nstime.h"
 #include "jemalloc/internal/valgrind.h"
 #include "jemalloc/internal/util.h"
 #include "jemalloc/internal/atomic.h"
 #include "jemalloc/internal/prng.h"
+#include "jemalloc/internal/ticker.h"
 #include "jemalloc/internal/ckh.h"
 #include "jemalloc/internal/size_classes.h"
+#include "jemalloc/internal/smoothstep.h"
 #include "jemalloc/internal/stats.h"
 #include "jemalloc/internal/ctl.h"
 #include "jemalloc/internal/mutex.h"
@@ -496,12 +507,15 @@ void jemalloc_postfork_child(void);
 /******************************************************************************/
 #define JEMALLOC_H_INLINES
 
+#include "jemalloc/internal/nstime.h"
 #include "jemalloc/internal/valgrind.h"
 #include "jemalloc/internal/util.h"
 #include "jemalloc/internal/atomic.h"
 #include "jemalloc/internal/prng.h"
+#include "jemalloc/internal/ticker.h"
 #include "jemalloc/internal/ckh.h"
 #include "jemalloc/internal/size_classes.h"
+#include "jemalloc/internal/smoothstep.h"
 #include "jemalloc/internal/stats.h"
 #include "jemalloc/internal/ctl.h"
 #include "jemalloc/internal/mutex.h"
@@ -526,8 +540,10 @@ size_t s2u_lookup(size_t size);
 size_t s2u(size_t size);
 size_t sa2u(size_t size, size_t alignment);
 arena_t *arena_choose(tsd_t *tsd, arena_t *arena);
-arena_t *arena_get(tsd_t *tsd, unsigned ind, bool init_if_missing,
+arena_tdata_t *arena_tdata_get(tsd_t *tsd, unsigned ind,
     bool refresh_if_missing);
+arena_t *arena_get(unsigned ind, bool init_if_missing);
+ticker_t *decay_ticker_get(tsd_t *tsd, unsigned ind);
 #endif
 
 #if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_C_))
@@ -537,27 +553,27 @@ size2index_compute(size_t size)
 
 #if (NTBINS != 0)
 	if (size <= (ZU(1) << LG_TINY_MAXCLASS)) {
-		size_t lg_tmin = LG_TINY_MAXCLASS - NTBINS + 1;
-		size_t lg_ceil = lg_floor(pow2_ceil(size));
+		szind_t lg_tmin = LG_TINY_MAXCLASS - NTBINS + 1;
+		szind_t lg_ceil = lg_floor(pow2_ceil_zu(size));
 		return (lg_ceil < lg_tmin ? 0 : lg_ceil - lg_tmin);
 	}
 #endif
 	{
-		size_t x = unlikely(ZI(size) < 0) ? ((size<<1) ?
+		szind_t x = unlikely(ZI(size) < 0) ? ((size<<1) ?
 		    (ZU(1)<<(LG_SIZEOF_PTR+3)) : ((ZU(1)<<(LG_SIZEOF_PTR+3))-1))
 		    : lg_floor((size<<1)-1);
-		size_t shift = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM) ? 0 :
+		szind_t shift = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM) ? 0 :
 		    x - (LG_SIZE_CLASS_GROUP + LG_QUANTUM);
-		size_t grp = shift << LG_SIZE_CLASS_GROUP;
+		szind_t grp = shift << LG_SIZE_CLASS_GROUP;
 
-		size_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
+		szind_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
 		    ? LG_QUANTUM : x - LG_SIZE_CLASS_GROUP - 1;
 
 		size_t delta_inverse_mask = ZI(-1) << lg_delta;
-		size_t mod = ((((size-1) & delta_inverse_mask) >> lg_delta)) &
+		szind_t mod = ((((size-1) & delta_inverse_mask) >> lg_delta)) &
 		    ((ZU(1) << LG_SIZE_CLASS_GROUP) - 1);
 
-		size_t index = NTBINS + grp + mod;
+		szind_t index = NTBINS + grp + mod;
 		return (index);
 	}
 }
@@ -568,8 +584,7 @@ size2index_lookup(size_t size)
 
 	assert(size <= LOOKUP_MAXCLASS);
 	{
-		size_t ret = ((size_t)(size2index_tab[(size-1) >>
-		    LG_TINY_MIN]));
+		szind_t ret = (size2index_tab[(size-1) >> LG_TINY_MIN]);
 		assert(ret == size2index_compute(size));
 		return (ret);
 	}
@@ -635,7 +650,7 @@ s2u_compute(size_t size)
 #if (NTBINS > 0)
 	if (size <= (ZU(1) << LG_TINY_MAXCLASS)) {
 		size_t lg_tmin = LG_TINY_MAXCLASS - NTBINS + 1;
-		size_t lg_ceil = lg_floor(pow2_ceil(size));
+		size_t lg_ceil = lg_floor(pow2_ceil_zu(size));
 		return (lg_ceil < lg_tmin ? (ZU(1) << lg_tmin) :
 		    (ZU(1) << lg_ceil));
 	}
@@ -727,17 +742,16 @@ sa2u(size_t size, size_t alignment)
 			return (usize);
 	}
 
-	/* Huge size class. Beware of size_t overflow. */
+	/* Huge size class. Beware of overflow. */
+
+	if (unlikely(alignment > HUGE_MAXCLASS))
+		return (0);
 
 	/*
 	 * We can't achieve subchunk alignment, so round up alignment to the
 	 * minimum that can actually be supported.
 	 */
 	alignment = CHUNK_CEILING(alignment);
-	if (alignment == 0) {
-		/* size_t overflow. */
-		return (0);
-	}
 
 	/* Make sure result is a huge size class. */
 	if (size <= chunksize)
@@ -776,32 +790,56 @@ arena_choose(tsd_t *tsd, arena_t *arena)
 	return (ret);
 }
 
-JEMALLOC_INLINE arena_t *
-arena_get(tsd_t *tsd, unsigned ind, bool init_if_missing,
-    bool refresh_if_missing)
+JEMALLOC_INLINE arena_tdata_t *
+arena_tdata_get(tsd_t *tsd, unsigned ind, bool refresh_if_missing)
 {
-	arena_t *arena;
-	arena_t **arenas_cache = tsd_arenas_cache_get(tsd);
-
-	/* init_if_missing requires refresh_if_missing. */
-	assert(!init_if_missing || refresh_if_missing);
+	arena_tdata_t *tdata;
+	arena_tdata_t *arenas_tdata = tsd_arenas_tdata_get(tsd);
 
-	if (unlikely(arenas_cache == NULL)) {
-		/* arenas_cache hasn't been initialized yet. */
-		return (arena_get_hard(tsd, ind, init_if_missing));
+	if (unlikely(arenas_tdata == NULL)) {
+		/* arenas_tdata hasn't been initialized yet. */
+		return (arena_tdata_get_hard(tsd, ind));
 	}
-	if (unlikely(ind >= tsd_narenas_cache_get(tsd))) {
+	if (unlikely(ind >= tsd_narenas_tdata_get(tsd))) {
 		/*
-		 * ind is invalid, cache is old (too small), or arena to be
+		 * ind is invalid, cache is old (too small), or tdata to be
 		 * initialized.
 		 */
-		return (refresh_if_missing ? arena_get_hard(tsd, ind,
-		    init_if_missing) : NULL);
+		return (refresh_if_missing ? arena_tdata_get_hard(tsd, ind) :
+		    NULL);
 	}
-	arena = arenas_cache[ind];
-	if (likely(arena != NULL) || !refresh_if_missing)
-		return (arena);
-	return (arena_get_hard(tsd, ind, init_if_missing));
+
+	tdata = &arenas_tdata[ind];
+	if (likely(tdata != NULL) || !refresh_if_missing)
+		return (tdata);
+	return (arena_tdata_get_hard(tsd, ind));
+}
+
+JEMALLOC_INLINE arena_t *
+arena_get(unsigned ind, bool init_if_missing)
+{
+	arena_t *ret;
+
+	assert(ind <= MALLOCX_ARENA_MAX);
+
+	ret = arenas[ind];
+	if (unlikely(ret == NULL)) {
+		ret = atomic_read_p((void *)&arenas[ind]);
+		if (init_if_missing && unlikely(ret == NULL))
+			ret = arena_init(ind);
+	}
+	return (ret);
+}
+
+JEMALLOC_INLINE ticker_t *
+decay_ticker_get(tsd_t *tsd, unsigned ind)
+{
+	arena_tdata_t *tdata;
+
+	tdata = arena_tdata_get(tsd, ind, true);
+	if (unlikely(tdata == NULL))
+		return (NULL);
+	return (&tdata->decay_ticker);
 }
 
 #endif
@@ -823,12 +861,14 @@ arena_get(tsd_t *tsd, unsigned ind, bool init_if_missing,
 #ifndef JEMALLOC_ENABLE_INLINE
 arena_t *iaalloc(const void *ptr);
 size_t isalloc(const void *ptr, bool demote);
-void *iallocztm(tsd_t *tsd, size_t size, bool zero, tcache_t *tcache,
-    bool is_metadata, arena_t *arena);
-void *imalloct(tsd_t *tsd, size_t size, tcache_t *tcache, arena_t *arena);
-void *imalloc(tsd_t *tsd, size_t size);
-void *icalloct(tsd_t *tsd, size_t size, tcache_t *tcache, arena_t *arena);
-void *icalloc(tsd_t *tsd, size_t size);
+void *iallocztm(tsd_t *tsd, size_t size, szind_t ind, bool zero,
+    tcache_t *tcache, bool is_metadata, arena_t *arena, bool slow_path);
+void *imalloct(tsd_t *tsd, size_t size, szind_t ind, tcache_t *tcache,
+    arena_t *arena);
+void *imalloc(tsd_t *tsd, size_t size, szind_t ind, bool slow_path);
+void *icalloct(tsd_t *tsd, size_t size, szind_t ind, tcache_t *tcache,
+    arena_t *arena);
+void *icalloc(tsd_t *tsd, size_t size, szind_t ind);
 void *ipallocztm(tsd_t *tsd, size_t usize, size_t alignment, bool zero,
     tcache_t *tcache, bool is_metadata, arena_t *arena);
 void *ipalloct(tsd_t *tsd, size_t usize, size_t alignment, bool zero,
@@ -837,10 +877,11 @@ void *ipalloc(tsd_t *tsd, size_t usize, size_t alignment, bool zero);
 size_t ivsalloc(const void *ptr, bool demote);
 size_t u2rz(size_t usize);
 size_t p2rz(const void *ptr);
-void idalloctm(tsd_t *tsd, void *ptr, tcache_t *tcache, bool is_metadata);
+void idalloctm(tsd_t *tsd, void *ptr, tcache_t *tcache, bool is_metadata,
+    bool slow_path);
 void idalloct(tsd_t *tsd, void *ptr, tcache_t *tcache);
 void idalloc(tsd_t *tsd, void *ptr);
-void iqalloc(tsd_t *tsd, void *ptr, tcache_t *tcache);
+void iqalloc(tsd_t *tsd, void *ptr, tcache_t *tcache, bool slow_path);
 void isdalloct(tsd_t *tsd, void *ptr, size_t size, tcache_t *tcache);
 void isqalloc(tsd_t *tsd, void *ptr, size_t size, tcache_t *tcache);
 void *iralloct_realign(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
@@ -850,8 +891,8 @@ void *iralloct(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
     size_t alignment, bool zero, tcache_t *tcache, arena_t *arena);
 void *iralloc(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
     size_t alignment, bool zero);
-bool ixalloc(void *ptr, size_t oldsize, size_t size, size_t extra,
-    size_t alignment, bool zero);
+bool ixalloc(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
+    size_t extra, size_t alignment, bool zero);
 #endif
 
 #if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_C_))
@@ -881,14 +922,14 @@ isalloc(const void *ptr, bool demote)
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-iallocztm(tsd_t *tsd, size_t size, bool zero, tcache_t *tcache, bool is_metadata,
-    arena_t *arena)
+iallocztm(tsd_t *tsd, size_t size, szind_t ind, bool zero, tcache_t *tcache,
+    bool is_metadata, arena_t *arena, bool slow_path)
 {
 	void *ret;
 
 	assert(size != 0);
 
-	ret = arena_malloc(tsd, arena, size, zero, tcache);
+	ret = arena_malloc(tsd, arena, size, ind, zero, tcache, slow_path);
 	if (config_stats && is_metadata && likely(ret != NULL)) {
 		arena_metadata_allocated_add(iaalloc(ret), isalloc(ret,
 		    config_prof));
@@ -897,31 +938,33 @@ iallocztm(tsd_t *tsd, size_t size, bool zero, tcache_t *tcache, bool is_metadata
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-imalloct(tsd_t *tsd, size_t size, tcache_t *tcache, arena_t *arena)
+imalloct(tsd_t *tsd, size_t size, szind_t ind, tcache_t *tcache, arena_t *arena)
 {
 
-	return (iallocztm(tsd, size, false, tcache, false, arena));
+	return (iallocztm(tsd, size, ind, false, tcache, false, arena, true));
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-imalloc(tsd_t *tsd, size_t size)
+imalloc(tsd_t *tsd, size_t size, szind_t ind, bool slow_path)
 {
 
-	return (iallocztm(tsd, size, false, tcache_get(tsd, true), false, NULL));
+	return (iallocztm(tsd, size, ind, false, tcache_get(tsd, true), false,
+	    NULL, slow_path));
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-icalloct(tsd_t *tsd, size_t size, tcache_t *tcache, arena_t *arena)
+icalloct(tsd_t *tsd, size_t size, szind_t ind, tcache_t *tcache, arena_t *arena)
 {
 
-	return (iallocztm(tsd, size, true, tcache, false, arena));
+	return (iallocztm(tsd, size, ind, true, tcache, false, arena, true));
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-icalloc(tsd_t *tsd, size_t size)
+icalloc(tsd_t *tsd, size_t size, szind_t ind)
 {
 
-	return (iallocztm(tsd, size, true, tcache_get(tsd, true), false, NULL));
+	return (iallocztm(tsd, size, ind, true, tcache_get(tsd, true), false,
+	    NULL, true));
 }
 
 JEMALLOC_ALWAYS_INLINE void *
@@ -954,8 +997,8 @@ JEMALLOC_ALWAYS_INLINE void *
 ipalloc(tsd_t *tsd, size_t usize, size_t alignment, bool zero)
 {
 
-	return (ipallocztm(tsd, usize, alignment, zero, tcache_get(tsd,
-	    NULL), false, NULL));
+	return (ipallocztm(tsd, usize, alignment, zero, tcache_get(tsd, true),
+	    false, NULL));
 }
 
 JEMALLOC_ALWAYS_INLINE size_t
@@ -997,7 +1040,8 @@ p2rz(const void *ptr)
 }
 
 JEMALLOC_ALWAYS_INLINE void
-idalloctm(tsd_t *tsd, void *ptr, tcache_t *tcache, bool is_metadata)
+idalloctm(tsd_t *tsd, void *ptr, tcache_t *tcache, bool is_metadata,
+    bool slow_path)
 {
 
 	assert(ptr != NULL);
@@ -1006,31 +1050,31 @@ idalloctm(tsd_t *tsd, void *ptr, tcache_t *tcache, bool is_metadata)
 		    config_prof));
 	}
 
-	arena_dalloc(tsd, ptr, tcache);
+	arena_dalloc(tsd, ptr, tcache, slow_path);
 }
 
 JEMALLOC_ALWAYS_INLINE void
 idalloct(tsd_t *tsd, void *ptr, tcache_t *tcache)
 {
 
-	idalloctm(tsd, ptr, tcache, false);
+	idalloctm(tsd, ptr, tcache, false, true);
 }
 
 JEMALLOC_ALWAYS_INLINE void
 idalloc(tsd_t *tsd, void *ptr)
 {
 
-	idalloctm(tsd, ptr, tcache_get(tsd, false), false);
+	idalloctm(tsd, ptr, tcache_get(tsd, false), false, true);
 }
 
 JEMALLOC_ALWAYS_INLINE void
-iqalloc(tsd_t *tsd, void *ptr, tcache_t *tcache)
+iqalloc(tsd_t *tsd, void *ptr, tcache_t *tcache, bool slow_path)
 {
 
-	if (config_fill && unlikely(opt_quarantine))
+	if (slow_path && config_fill && unlikely(opt_quarantine))
 		quarantine(tsd, ptr);
 	else
-		idalloctm(tsd, ptr, tcache, false);
+		idalloctm(tsd, ptr, tcache, false, slow_path);
 }
 
 JEMALLOC_ALWAYS_INLINE void
@@ -1058,7 +1102,7 @@ iralloct_realign(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
 	size_t usize, copysize;
 
 	usize = sa2u(size + extra, alignment);
-	if (usize == 0)
+	if (unlikely(usize == 0 || usize > HUGE_MAXCLASS))
 		return (NULL);
 	p = ipalloct(tsd, usize, alignment, zero, tcache, arena);
 	if (p == NULL) {
@@ -1066,7 +1110,7 @@ iralloct_realign(tsd_t *tsd, void *ptr, size_t oldsize, size_t size,
 			return (NULL);
 		/* Try again, without extra this time. */
 		usize = sa2u(size, alignment);
-		if (usize == 0)
+		if (unlikely(usize == 0 || usize > HUGE_MAXCLASS))
 			return (NULL);
 		p = ipalloct(tsd, usize, alignment, zero, tcache, arena);
 		if (p == NULL)
@@ -1114,8 +1158,8 @@ iralloc(tsd_t *tsd, void *ptr, size_t oldsize, size_t size, size_t alignment,
 }
 
 JEMALLOC_ALWAYS_INLINE bool
-ixalloc(void *ptr, size_t oldsize, size_t size, size_t extra, size_t alignment,
-    bool zero)
+ixalloc(tsd_t *tsd, void *ptr, size_t oldsize, size_t size, size_t extra,
+    size_t alignment, bool zero)
 {
 
 	assert(ptr != NULL);
@@ -1127,7 +1171,7 @@ ixalloc(void *ptr, size_t oldsize, size_t size, size_t extra, size_t alignment,
 		return (true);
 	}
 
-	return (arena_ralloc_no_move(ptr, oldsize, size, extra, zero));
+	return (arena_ralloc_no_move(tsd, ptr, oldsize, size, extra, zero));
 }
 #endif
 
diff --git a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_decls.h b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_decls.h
index e7094b2..42d97f2 100644
--- a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_decls.h
+++ b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_decls.h
@@ -21,6 +21,7 @@
 # endif
 # include <pthread.h>
 # include <errno.h>
+# include <sys/time.h>
 #endif
 
 #include <sys/types.h>
diff --git a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
index fa871fb..89c3c52 100644
--- a/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
+++ b/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
@@ -190,9 +190,10 @@
 #define JEMALLOC_TLS
 
 /*
- * ffs()/ffsl() functions to use for bitmapping. Don't use these directly;
- * instead, use jemalloc_ffs() or jemalloc_ffsl() from util.h.
+ * ffs*() functions to use for bitmapping. Don't use these directly; instead,
+ * use ffs_*() from util.h.
  */
+#define JEMALLOC_INTERNAL_FFSLL __builtin_ffsll
 #define JEMALLOC_INTERNAL_FFSL __builtin_ffsl
 #define JEMALLOC_INTERNAL_FFS __builtin_ffs
 
@@ -242,6 +243,9 @@
 /* sizeof(long) == 2^LG_SIZEOF_LONG. */
 #define LG_SIZEOF_LONG 3
 
+/* sizeof(long long) == 2^LG_SIZEOF_LONG_LONG. */
+#define LG_SIZEOF_LONG_LONG 3
+
 /* sizeof(intmax_t) == 2^LG_SIZEOF_INTMAX_T. */
 #define LG_SIZEOF_INTMAX_T 3
 
@@ -260,4 +264,7 @@
  */
 /* #undef JEMALLOC_EXPORT */
 
+/* config.malloc_conf options string. */
+#define JEMALLOC_CONFIG_MALLOC_CONF ""
+
 #endif /* JEMALLOC_INTERNAL_DEFS_H_ */
diff --git a/contrib/jemalloc/include/jemalloc/internal/nstime.h b/contrib/jemalloc/include/jemalloc/internal/nstime.h
new file mode 100644
index 0000000..bd04f04
--- /dev/null
+++ b/contrib/jemalloc/include/jemalloc/internal/nstime.h
@@ -0,0 +1,48 @@
+/******************************************************************************/
+#ifdef JEMALLOC_H_TYPES
+
+#define JEMALLOC_CLOCK_GETTIME defined(_POSIX_MONOTONIC_CLOCK) \
+    && _POSIX_MONOTONIC_CLOCK >= 0
+
+typedef struct nstime_s nstime_t;
+
+/* Maximum supported number of seconds (~584 years). */
+#define NSTIME_SEC_MAX 18446744072
+
+#endif /* JEMALLOC_H_TYPES */
+/******************************************************************************/
+#ifdef JEMALLOC_H_STRUCTS
+
+struct nstime_s {
+	uint64_t ns;
+};
+
+#endif /* JEMALLOC_H_STRUCTS */
+/******************************************************************************/
+#ifdef JEMALLOC_H_EXTERNS
+
+void nstime_init(nstime_t *time, uint64_t ns);
+void nstime_init2(nstime_t *time, uint64_t sec, uint64_t nsec);
+uint64_t nstime_ns(const nstime_t *time);
+uint64_t nstime_sec(const nstime_t *time);
+uint64_t nstime_nsec(const nstime_t *time);
+void nstime_copy(nstime_t *time, const nstime_t *source);
+int nstime_compare(const nstime_t *a, const nstime_t *b);
+void nstime_add(nstime_t *time, const nstime_t *addend);
+void nstime_subtract(nstime_t *time, const nstime_t *subtrahend);
+void nstime_imultiply(nstime_t *time, uint64_t multiplier);
+void nstime_idivide(nstime_t *time, uint64_t divisor);
+uint64_t nstime_divide(const nstime_t *time, const nstime_t *divisor);
+#ifdef JEMALLOC_JET
+typedef bool (nstime_update_t)(nstime_t *);
+extern nstime_update_t *nstime_update;
+#else
+bool nstime_update(nstime_t *time);
+#endif
+
+#endif /* JEMALLOC_H_EXTERNS */
+/******************************************************************************/
+#ifdef JEMALLOC_H_INLINES
+
+#endif /* JEMALLOC_H_INLINES */
+/******************************************************************************/
diff --git a/contrib/jemalloc/include/jemalloc/internal/private_namespace.h b/contrib/jemalloc/include/jemalloc/internal/private_namespace.h
index ffc4ea2..fb43a6b 100644
--- a/contrib/jemalloc/include/jemalloc/internal/private_namespace.h
+++ b/contrib/jemalloc/include/jemalloc/internal/private_namespace.h
@@ -1,8 +1,8 @@
 #define a0dalloc JEMALLOC_N(a0dalloc)
-#define a0get JEMALLOC_N(a0get)
 #define a0malloc JEMALLOC_N(a0malloc)
 #define arena_aalloc JEMALLOC_N(arena_aalloc)
 #define arena_alloc_junk_small JEMALLOC_N(arena_alloc_junk_small)
+#define arena_basic_stats_merge JEMALLOC_N(arena_basic_stats_merge)
 #define arena_bin_index JEMALLOC_N(arena_bin_index)
 #define arena_bin_info JEMALLOC_N(arena_bin_info)
 #define arena_bitselm_get JEMALLOC_N(arena_bitselm_get)
@@ -25,18 +25,23 @@
 #define arena_dalloc_large JEMALLOC_N(arena_dalloc_large)
 #define arena_dalloc_large_junked_locked JEMALLOC_N(arena_dalloc_large_junked_locked)
 #define arena_dalloc_small JEMALLOC_N(arena_dalloc_small)
+#define arena_decay_tick JEMALLOC_N(arena_decay_tick)
+#define arena_decay_ticks JEMALLOC_N(arena_decay_ticks)
+#define arena_decay_time_default_get JEMALLOC_N(arena_decay_time_default_get)
+#define arena_decay_time_default_set JEMALLOC_N(arena_decay_time_default_set)
+#define arena_decay_time_get JEMALLOC_N(arena_decay_time_get)
+#define arena_decay_time_set JEMALLOC_N(arena_decay_time_set)
 #define arena_dss_prec_get JEMALLOC_N(arena_dss_prec_get)
 #define arena_dss_prec_set JEMALLOC_N(arena_dss_prec_set)
 #define arena_get JEMALLOC_N(arena_get)
-#define arena_get_hard JEMALLOC_N(arena_get_hard)
 #define arena_init JEMALLOC_N(arena_init)
 #define arena_lg_dirty_mult_default_get JEMALLOC_N(arena_lg_dirty_mult_default_get)
 #define arena_lg_dirty_mult_default_set JEMALLOC_N(arena_lg_dirty_mult_default_set)
 #define arena_lg_dirty_mult_get JEMALLOC_N(arena_lg_dirty_mult_get)
 #define arena_lg_dirty_mult_set JEMALLOC_N(arena_lg_dirty_mult_set)
 #define arena_malloc JEMALLOC_N(arena_malloc)
+#define arena_malloc_hard JEMALLOC_N(arena_malloc_hard)
 #define arena_malloc_large JEMALLOC_N(arena_malloc_large)
-#define arena_malloc_small JEMALLOC_N(arena_malloc_small)
 #define arena_mapbits_allocated_get JEMALLOC_N(arena_mapbits_allocated_get)
 #define arena_mapbits_binind_get JEMALLOC_N(arena_mapbits_binind_get)
 #define arena_mapbits_decommitted_get JEMALLOC_N(arena_mapbits_decommitted_get)
@@ -47,9 +52,6 @@
 #define arena_mapbits_large_get JEMALLOC_N(arena_mapbits_large_get)
 #define arena_mapbits_large_set JEMALLOC_N(arena_mapbits_large_set)
 #define arena_mapbits_large_size_get JEMALLOC_N(arena_mapbits_large_size_get)
-#define arena_mapbitsp_get JEMALLOC_N(arena_mapbitsp_get)
-#define arena_mapbitsp_read JEMALLOC_N(arena_mapbitsp_read)
-#define arena_mapbitsp_write JEMALLOC_N(arena_mapbitsp_write)
 #define arena_mapbits_size_decode JEMALLOC_N(arena_mapbits_size_decode)
 #define arena_mapbits_size_encode JEMALLOC_N(arena_mapbits_size_encode)
 #define arena_mapbits_small_runind_get JEMALLOC_N(arena_mapbits_small_runind_get)
@@ -58,6 +60,9 @@
 #define arena_mapbits_unallocated_size_get JEMALLOC_N(arena_mapbits_unallocated_size_get)
 #define arena_mapbits_unallocated_size_set JEMALLOC_N(arena_mapbits_unallocated_size_set)
 #define arena_mapbits_unzeroed_get JEMALLOC_N(arena_mapbits_unzeroed_get)
+#define arena_mapbitsp_get JEMALLOC_N(arena_mapbitsp_get)
+#define arena_mapbitsp_read JEMALLOC_N(arena_mapbitsp_read)
+#define arena_mapbitsp_write JEMALLOC_N(arena_mapbitsp_write)
 #define arena_maxrun JEMALLOC_N(arena_maxrun)
 #define arena_maybe_purge JEMALLOC_N(arena_maybe_purge)
 #define arena_metadata_allocated_add JEMALLOC_N(arena_metadata_allocated_add)
@@ -67,10 +72,12 @@
 #define arena_miscelm_get JEMALLOC_N(arena_miscelm_get)
 #define arena_miscelm_to_pageind JEMALLOC_N(arena_miscelm_to_pageind)
 #define arena_miscelm_to_rpages JEMALLOC_N(arena_miscelm_to_rpages)
-#define arena_nbound JEMALLOC_N(arena_nbound)
 #define arena_new JEMALLOC_N(arena_new)
 #define arena_node_alloc JEMALLOC_N(arena_node_alloc)
 #define arena_node_dalloc JEMALLOC_N(arena_node_dalloc)
+#define arena_nthreads_dec JEMALLOC_N(arena_nthreads_dec)
+#define arena_nthreads_get JEMALLOC_N(arena_nthreads_get)
+#define arena_nthreads_inc JEMALLOC_N(arena_nthreads_inc)
 #define arena_palloc JEMALLOC_N(arena_palloc)
 #define arena_postfork_child JEMALLOC_N(arena_postfork_child)
 #define arena_postfork_parent JEMALLOC_N(arena_postfork_parent)
@@ -83,7 +90,7 @@
 #define arena_prof_tctx_reset JEMALLOC_N(arena_prof_tctx_reset)
 #define arena_prof_tctx_set JEMALLOC_N(arena_prof_tctx_set)
 #define arena_ptr_small_binind_get JEMALLOC_N(arena_ptr_small_binind_get)
-#define arena_purge_all JEMALLOC_N(arena_purge_all)
+#define arena_purge JEMALLOC_N(arena_purge)
 #define arena_quarantine_junk_small JEMALLOC_N(arena_quarantine_junk_small)
 #define arena_ralloc JEMALLOC_N(arena_ralloc)
 #define arena_ralloc_junk_large JEMALLOC_N(arena_ralloc_junk_large)
@@ -93,11 +100,14 @@
 #define arena_run_regind JEMALLOC_N(arena_run_regind)
 #define arena_run_to_miscelm JEMALLOC_N(arena_run_to_miscelm)
 #define arena_salloc JEMALLOC_N(arena_salloc)
-#define arenas_cache_bypass_cleanup JEMALLOC_N(arenas_cache_bypass_cleanup)
-#define arenas_cache_cleanup JEMALLOC_N(arenas_cache_cleanup)
 #define arena_sdalloc JEMALLOC_N(arena_sdalloc)
 #define arena_stats_merge JEMALLOC_N(arena_stats_merge)
 #define arena_tcache_fill_small JEMALLOC_N(arena_tcache_fill_small)
+#define arena_tdata_get JEMALLOC_N(arena_tdata_get)
+#define arena_tdata_get_hard JEMALLOC_N(arena_tdata_get_hard)
+#define arenas JEMALLOC_N(arenas)
+#define arenas_tdata_bypass_cleanup JEMALLOC_N(arenas_tdata_bypass_cleanup)
+#define arenas_tdata_cleanup JEMALLOC_N(arenas_tdata_cleanup)
 #define atomic_add_p JEMALLOC_N(atomic_add_p)
 #define atomic_add_u JEMALLOC_N(atomic_add_u)
 #define atomic_add_uint32 JEMALLOC_N(atomic_add_uint32)
@@ -122,7 +132,6 @@
 #define bitmap_full JEMALLOC_N(bitmap_full)
 #define bitmap_get JEMALLOC_N(bitmap_get)
 #define bitmap_info_init JEMALLOC_N(bitmap_info_init)
-#define bitmap_info_ngroups JEMALLOC_N(bitmap_info_ngroups)
 #define bitmap_init JEMALLOC_N(bitmap_init)
 #define bitmap_set JEMALLOC_N(bitmap_set)
 #define bitmap_sfu JEMALLOC_N(bitmap_sfu)
@@ -162,9 +171,9 @@
 #define chunk_purge_arena JEMALLOC_N(chunk_purge_arena)
 #define chunk_purge_wrapper JEMALLOC_N(chunk_purge_wrapper)
 #define chunk_register JEMALLOC_N(chunk_register)
+#define chunks_rtree JEMALLOC_N(chunks_rtree)
 #define chunksize JEMALLOC_N(chunksize)
 #define chunksize_mask JEMALLOC_N(chunksize_mask)
-#define chunks_rtree JEMALLOC_N(chunks_rtree)
 #define ckh_count JEMALLOC_N(ckh_count)
 #define ckh_delete JEMALLOC_N(ckh_delete)
 #define ckh_insert JEMALLOC_N(ckh_insert)
@@ -183,6 +192,7 @@
 #define ctl_postfork_child JEMALLOC_N(ctl_postfork_child)
 #define ctl_postfork_parent JEMALLOC_N(ctl_postfork_parent)
 #define ctl_prefork JEMALLOC_N(ctl_prefork)
+#define decay_ticker_get JEMALLOC_N(decay_ticker_get)
 #define dss_prec_names JEMALLOC_N(dss_prec_names)
 #define extent_node_achunk_get JEMALLOC_N(extent_node_achunk_get)
 #define extent_node_achunk_set JEMALLOC_N(extent_node_achunk_set)
@@ -234,6 +244,12 @@
 #define extent_tree_szad_reverse_iter_recurse JEMALLOC_N(extent_tree_szad_reverse_iter_recurse)
 #define extent_tree_szad_reverse_iter_start JEMALLOC_N(extent_tree_szad_reverse_iter_start)
 #define extent_tree_szad_search JEMALLOC_N(extent_tree_szad_search)
+#define ffs_llu JEMALLOC_N(ffs_llu)
+#define ffs_lu JEMALLOC_N(ffs_lu)
+#define ffs_u JEMALLOC_N(ffs_u)
+#define ffs_u32 JEMALLOC_N(ffs_u32)
+#define ffs_u64 JEMALLOC_N(ffs_u64)
+#define ffs_zu JEMALLOC_N(ffs_zu)
 #define get_errno JEMALLOC_N(get_errno)
 #define hash JEMALLOC_N(hash)
 #define hash_fmix_32 JEMALLOC_N(hash_fmix_32)
@@ -265,11 +281,11 @@
 #define idalloctm JEMALLOC_N(idalloctm)
 #define imalloc JEMALLOC_N(imalloc)
 #define imalloct JEMALLOC_N(imalloct)
+#define in_valgrind JEMALLOC_N(in_valgrind)
 #define index2size JEMALLOC_N(index2size)
 #define index2size_compute JEMALLOC_N(index2size_compute)
 #define index2size_lookup JEMALLOC_N(index2size_lookup)
 #define index2size_tab JEMALLOC_N(index2size_tab)
-#define in_valgrind JEMALLOC_N(in_valgrind)
 #define ipalloc JEMALLOC_N(ipalloc)
 #define ipalloct JEMALLOC_N(ipalloct)
 #define ipallocztm JEMALLOC_N(ipallocztm)
@@ -310,11 +326,25 @@
 #define map_misc_offset JEMALLOC_N(map_misc_offset)
 #define mb_write JEMALLOC_N(mb_write)
 #define mutex_boot JEMALLOC_N(mutex_boot)
-#define narenas_cache_cleanup JEMALLOC_N(narenas_cache_cleanup)
+#define narenas_tdata_cleanup JEMALLOC_N(narenas_tdata_cleanup)
 #define narenas_total_get JEMALLOC_N(narenas_total_get)
 #define ncpus JEMALLOC_N(ncpus)
 #define nhbins JEMALLOC_N(nhbins)
+#define nstime_add JEMALLOC_N(nstime_add)
+#define nstime_compare JEMALLOC_N(nstime_compare)
+#define nstime_copy JEMALLOC_N(nstime_copy)
+#define nstime_divide JEMALLOC_N(nstime_divide)
+#define nstime_idivide JEMALLOC_N(nstime_idivide)
+#define nstime_imultiply JEMALLOC_N(nstime_imultiply)
+#define nstime_init JEMALLOC_N(nstime_init)
+#define nstime_init2 JEMALLOC_N(nstime_init2)
+#define nstime_ns JEMALLOC_N(nstime_ns)
+#define nstime_nsec JEMALLOC_N(nstime_nsec)
+#define nstime_sec JEMALLOC_N(nstime_sec)
+#define nstime_subtract JEMALLOC_N(nstime_subtract)
+#define nstime_update JEMALLOC_N(nstime_update)
 #define opt_abort JEMALLOC_N(opt_abort)
+#define opt_decay_time JEMALLOC_N(opt_decay_time)
 #define opt_dss JEMALLOC_N(opt_dss)
 #define opt_junk JEMALLOC_N(opt_junk)
 #define opt_junk_alloc JEMALLOC_N(opt_junk_alloc)
@@ -333,6 +363,7 @@
 #define opt_prof_leak JEMALLOC_N(opt_prof_leak)
 #define opt_prof_prefix JEMALLOC_N(opt_prof_prefix)
 #define opt_prof_thread_active_init JEMALLOC_N(opt_prof_thread_active_init)
+#define opt_purge JEMALLOC_N(opt_purge)
 #define opt_quarantine JEMALLOC_N(opt_quarantine)
 #define opt_redzone JEMALLOC_N(opt_redzone)
 #define opt_stats_print JEMALLOC_N(opt_stats_print)
@@ -347,7 +378,11 @@
 #define pages_purge JEMALLOC_N(pages_purge)
 #define pages_trim JEMALLOC_N(pages_trim)
 #define pages_unmap JEMALLOC_N(pages_unmap)
-#define pow2_ceil JEMALLOC_N(pow2_ceil)
+#define pow2_ceil_u32 JEMALLOC_N(pow2_ceil_u32)
+#define pow2_ceil_u64 JEMALLOC_N(pow2_ceil_u64)
+#define pow2_ceil_zu JEMALLOC_N(pow2_ceil_zu)
+#define prng_lg_range JEMALLOC_N(prng_lg_range)
+#define prng_range JEMALLOC_N(prng_range)
 #define prof_active_get JEMALLOC_N(prof_active_get)
 #define prof_active_get_unlocked JEMALLOC_N(prof_active_get_unlocked)
 #define prof_active_set JEMALLOC_N(prof_active_set)
@@ -392,6 +427,7 @@
 #define prof_thread_active_set JEMALLOC_N(prof_thread_active_set)
 #define prof_thread_name_get JEMALLOC_N(prof_thread_name_get)
 #define prof_thread_name_set JEMALLOC_N(prof_thread_name_set)
+#define purge_mode_names JEMALLOC_N(purge_mode_names)
 #define quarantine JEMALLOC_N(quarantine)
 #define quarantine_alloc_hook JEMALLOC_N(quarantine_alloc_hook)
 #define quarantine_alloc_hook_work JEMALLOC_N(quarantine_alloc_hook_work)
@@ -412,6 +448,9 @@
 #define rtree_subtree_tryread JEMALLOC_N(rtree_subtree_tryread)
 #define rtree_val_read JEMALLOC_N(rtree_val_read)
 #define rtree_val_write JEMALLOC_N(rtree_val_write)
+#define run_quantize_ceil JEMALLOC_N(run_quantize_ceil)
+#define run_quantize_floor JEMALLOC_N(run_quantize_floor)
+#define run_quantize_max JEMALLOC_N(run_quantize_max)
 #define s2u JEMALLOC_N(s2u)
 #define s2u_compute JEMALLOC_N(s2u_compute)
 #define s2u_lookup JEMALLOC_N(s2u_lookup)
@@ -450,15 +489,20 @@
 #define tcache_get JEMALLOC_N(tcache_get)
 #define tcache_get_hard JEMALLOC_N(tcache_get_hard)
 #define tcache_maxclass JEMALLOC_N(tcache_maxclass)
-#define tcaches JEMALLOC_N(tcaches)
 #define tcache_salloc JEMALLOC_N(tcache_salloc)
+#define tcache_stats_merge JEMALLOC_N(tcache_stats_merge)
+#define tcaches JEMALLOC_N(tcaches)
 #define tcaches_create JEMALLOC_N(tcaches_create)
 #define tcaches_destroy JEMALLOC_N(tcaches_destroy)
 #define tcaches_flush JEMALLOC_N(tcaches_flush)
 #define tcaches_get JEMALLOC_N(tcaches_get)
-#define tcache_stats_merge JEMALLOC_N(tcache_stats_merge)
 #define thread_allocated_cleanup JEMALLOC_N(thread_allocated_cleanup)
 #define thread_deallocated_cleanup JEMALLOC_N(thread_deallocated_cleanup)
+#define ticker_copy JEMALLOC_N(ticker_copy)
+#define ticker_init JEMALLOC_N(ticker_init)
+#define ticker_read JEMALLOC_N(ticker_read)
+#define ticker_tick JEMALLOC_N(ticker_tick)
+#define ticker_ticks JEMALLOC_N(ticker_ticks)
 #define tsd_arena_get JEMALLOC_N(tsd_arena_get)
 #define tsd_arena_set JEMALLOC_N(tsd_arena_set)
 #define tsd_boot JEMALLOC_N(tsd_boot)
@@ -476,6 +520,8 @@
 #define tsd_init_finish JEMALLOC_N(tsd_init_finish)
 #define tsd_init_head JEMALLOC_N(tsd_init_head)
 #define tsd_nominal JEMALLOC_N(tsd_nominal)
+#define tsd_prof_tdata_get JEMALLOC_N(tsd_prof_tdata_get)
+#define tsd_prof_tdata_set JEMALLOC_N(tsd_prof_tdata_set)
 #define tsd_quarantine_get JEMALLOC_N(tsd_quarantine_get)
 #define tsd_quarantine_set JEMALLOC_N(tsd_quarantine_set)
 #define tsd_set JEMALLOC_N(tsd_set)
@@ -483,14 +529,12 @@
 #define tsd_tcache_enabled_set JEMALLOC_N(tsd_tcache_enabled_set)
 #define tsd_tcache_get JEMALLOC_N(tsd_tcache_get)
 #define tsd_tcache_set JEMALLOC_N(tsd_tcache_set)
-#define tsd_tls JEMALLOC_N(tsd_tls)
-#define tsd_tsd JEMALLOC_N(tsd_tsd)
-#define tsd_prof_tdata_get JEMALLOC_N(tsd_prof_tdata_get)
-#define tsd_prof_tdata_set JEMALLOC_N(tsd_prof_tdata_set)
 #define tsd_thread_allocated_get JEMALLOC_N(tsd_thread_allocated_get)
 #define tsd_thread_allocated_set JEMALLOC_N(tsd_thread_allocated_set)
 #define tsd_thread_deallocated_get JEMALLOC_N(tsd_thread_deallocated_get)
 #define tsd_thread_deallocated_set JEMALLOC_N(tsd_thread_deallocated_set)
+#define tsd_tls JEMALLOC_N(tsd_tls)
+#define tsd_tsd JEMALLOC_N(tsd_tsd)
 #define u2rz JEMALLOC_N(u2rz)
 #define valgrind_freelike_block JEMALLOC_N(valgrind_freelike_block)
 #define valgrind_make_mem_defined JEMALLOC_N(valgrind_make_mem_defined)
diff --git a/contrib/jemalloc/include/jemalloc/internal/prng.h b/contrib/jemalloc/include/jemalloc/internal/prng.h
index 216d0ef..5830f8b7 100644
--- a/contrib/jemalloc/include/jemalloc/internal/prng.h
+++ b/contrib/jemalloc/include/jemalloc/internal/prng.h
@@ -18,31 +18,9 @@
  * proportional to bit position. For example, the lowest bit has a cycle of 2,
  * the next has a cycle of 4, etc. For this reason, we prefer to use the upper
  * bits.
- *
- * Macro parameters:
- * uint32_t r : Result.
- * unsigned lg_range : (0..32], number of least significant bits to return.
- * uint32_t state : Seed value.
- * const uint32_t a, c : See above discussion.
  */
-#define prng32(r, lg_range, state, a, c) do { \
-	assert((lg_range) > 0); \
-	assert((lg_range) <= 32); \
-	\
-	r = (state * (a)) + (c); \
-	state = r; \
-	r >>= (32 - (lg_range)); \
-} while (false)
-
-/* Same as prng32(), but 64 bits of pseudo-randomness, using uint64_t. */
-#define prng64(r, lg_range, state, a, c) do { \
-	assert((lg_range) > 0); \
-	assert((lg_range) <= 64); \
-	\
-	r = (state * (a)) + (c); \
-	state = r; \
-	r >>= (64 - (lg_range)); \
-} while (false)
+#define PRNG_A UINT64_C(6364136223846793005)
+#define PRNG_C UINT64_C(1442695040888963407)
 
 #endif /* JEMALLOC_H_TYPES */
 /******************************************************************************/
@@ -56,5 +34,46 @@
 /******************************************************************************/
 #ifdef JEMALLOC_H_INLINES
 
+#ifndef JEMALLOC_ENABLE_INLINE
+uint64_t prng_lg_range(uint64_t *state, unsigned lg_range);
+uint64_t prng_range(uint64_t *state, uint64_t range);
+#endif
+
+#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_PRNG_C_))
+JEMALLOC_ALWAYS_INLINE uint64_t
+prng_lg_range(uint64_t *state, unsigned lg_range)
+{
+	uint64_t ret;
+
+	assert(lg_range > 0);
+	assert(lg_range <= 64);
+
+	ret = (*state * PRNG_A) + PRNG_C;
+	*state = ret;
+	ret >>= (64 - lg_range);
+
+	return (ret);
+}
+
+JEMALLOC_ALWAYS_INLINE uint64_t
+prng_range(uint64_t *state, uint64_t range)
+{
+	uint64_t ret;
+	unsigned lg_range;
+
+	assert(range > 1);
+
+	/* Compute the ceiling of lg(range). */
+	lg_range = ffs_u64(pow2_ceil_u64(range)) - 1;
+
+	/* Generate a result in [0..range) via repeated trial. */
+	do {
+		ret = prng_lg_range(state, lg_range);
+	} while (ret >= range);
+
+	return (ret);
+}
+#endif
+
 #endif /* JEMALLOC_H_INLINES */
 /******************************************************************************/
diff --git a/contrib/jemalloc/include/jemalloc/internal/prof.h b/contrib/jemalloc/include/jemalloc/internal/prof.h
index e5198c3..a25502a 100644
--- a/contrib/jemalloc/include/jemalloc/internal/prof.h
+++ b/contrib/jemalloc/include/jemalloc/internal/prof.h
@@ -436,16 +436,16 @@ prof_sample_accum_update(tsd_t *tsd, size_t usize, bool update,
 	cassert(config_prof);
 
 	tdata = prof_tdata_get(tsd, true);
-	if ((uintptr_t)tdata <= (uintptr_t)PROF_TDATA_STATE_MAX)
+	if (unlikely((uintptr_t)tdata <= (uintptr_t)PROF_TDATA_STATE_MAX))
 		tdata = NULL;
 
 	if (tdata_out != NULL)
 		*tdata_out = tdata;
 
-	if (tdata == NULL)
+	if (unlikely(tdata == NULL))
 		return (true);
 
-	if (tdata->bytes_until_sample >= usize) {
+	if (likely(tdata->bytes_until_sample >= usize)) {
 		if (update)
 			tdata->bytes_until_sample -= usize;
 		return (true);
diff --git a/contrib/jemalloc/include/jemalloc/internal/rb.h b/contrib/jemalloc/include/jemalloc/internal/rb.h
index 2ca8e59..3770342 100644
--- a/contrib/jemalloc/include/jemalloc/internal/rb.h
+++ b/contrib/jemalloc/include/jemalloc/internal/rb.h
@@ -42,7 +42,6 @@ struct { \
 #define rb_tree(a_type) \
 struct { \
     a_type *rbt_root; \
-    a_type rbt_nil; \
 }
 
 /* Left accessors. */
@@ -79,6 +78,15 @@ struct { \
     (a_node)->a_field.rbn_right_red = (a_type *) (((intptr_t) \
       (a_node)->a_field.rbn_right_red) & ((ssize_t)-2)); \
 } while (0)
+
+/* Node initializer. */
+#define rbt_node_new(a_type, a_field, a_rbt, a_node) do { \
+    /* Bookkeeping bit cannot be used by node pointer. */ \
+    assert(((uintptr_t)(a_node) & 0x1) == 0); \
+    rbtn_left_set(a_type, a_field, (a_node), NULL); \
+    rbtn_right_set(a_type, a_field, (a_node), NULL); \
+    rbtn_red_set(a_type, a_field, (a_node)); \
+} while (0)
 #else
 /* Right accessors. */
 #define rbtn_right_get(a_type, a_field, a_node) \
@@ -99,28 +107,26 @@ struct { \
 #define rbtn_black_set(a_type, a_field, a_node) do { \
     (a_node)->a_field.rbn_red = false; \
 } while (0)
-#endif
 
 /* Node initializer. */
 #define rbt_node_new(a_type, a_field, a_rbt, a_node) do { \
-    rbtn_left_set(a_type, a_field, (a_node), &(a_rbt)->rbt_nil); \
-    rbtn_right_set(a_type, a_field, (a_node), &(a_rbt)->rbt_nil); \
+    rbtn_left_set(a_type, a_field, (a_node), NULL); \
+    rbtn_right_set(a_type, a_field, (a_node), NULL); \
     rbtn_red_set(a_type, a_field, (a_node)); \
 } while (0)
+#endif
 
 /* Tree initializer. */
 #define rb_new(a_type, a_field, a_rbt) do { \
-    (a_rbt)->rbt_root = &(a_rbt)->rbt_nil; \
-    rbt_node_new(a_type, a_field, a_rbt, &(a_rbt)->rbt_nil); \
-    rbtn_black_set(a_type, a_field, &(a_rbt)->rbt_nil); \
+    (a_rbt)->rbt_root = NULL; \
 } while (0)
 
 /* Internal utility macros. */
 #define rbtn_first(a_type, a_field, a_rbt, a_root, r_node) do { \
     (r_node) = (a_root); \
-    if ((r_node) != &(a_rbt)->rbt_nil) { \
+    if ((r_node) != NULL) { \
 	for (; \
-	  rbtn_left_get(a_type, a_field, (r_node)) != &(a_rbt)->rbt_nil;\
+	  rbtn_left_get(a_type, a_field, (r_node)) != NULL; \
 	  (r_node) = rbtn_left_get(a_type, a_field, (r_node))) { \
 	} \
     } \
@@ -128,10 +134,9 @@ struct { \
 
 #define rbtn_last(a_type, a_field, a_rbt, a_root, r_node) do { \
     (r_node) = (a_root); \
-    if ((r_node) != &(a_rbt)->rbt_nil) { \
-	for (; rbtn_right_get(a_type, a_field, (r_node)) != \
-	  &(a_rbt)->rbt_nil; (r_node) = rbtn_right_get(a_type, a_field, \
-	  (r_node))) { \
+    if ((r_node) != NULL) { \
+	for (; rbtn_right_get(a_type, a_field, (r_node)) != NULL; \
+	  (r_node) = rbtn_right_get(a_type, a_field, (r_node))) { \
 	} \
     } \
 } while (0)
@@ -169,11 +174,11 @@ a_prefix##next(a_rbt_type *rbtree, a_type *node); \
 a_attr a_type * \
 a_prefix##prev(a_rbt_type *rbtree, a_type *node); \
 a_attr a_type * \
-a_prefix##search(a_rbt_type *rbtree, a_type *key); \
+a_prefix##search(a_rbt_type *rbtree, const a_type *key); \
 a_attr a_type * \
-a_prefix##nsearch(a_rbt_type *rbtree, a_type *key); \ +a_prefix##nsearch(a_rbt_type *rbtree, const a_type *key); \ a_attr a_type * \ -a_prefix##psearch(a_rbt_type *rbtree, a_type *key); \ +a_prefix##psearch(a_rbt_type *rbtree, const a_type *key); \ a_attr void \ a_prefix##insert(a_rbt_type *rbtree, a_type *node); \ a_attr void \ @@ -183,7 +188,10 @@ a_prefix##iter(a_rbt_type *rbtree, a_type *start, a_type *(*cb)( \ a_rbt_type *, a_type *, void *), void *arg); \ a_attr a_type * \ a_prefix##reverse_iter(a_rbt_type *rbtree, a_type *start, \ - a_type *(*cb)(a_rbt_type *, a_type *, void *), void *arg); + a_type *(*cb)(a_rbt_type *, a_type *, void *), void *arg); \ +a_attr void \ +a_prefix##destroy(a_rbt_type *rbtree, void (*cb)(a_type *, void *), \ + void *arg); /* * The rb_gen() macro generates a type-specific red-black tree implementation, @@ -254,7 +262,7 @@ a_prefix##reverse_iter(a_rbt_type *rbtree, a_type *start, \ * last/first. * * static ex_node_t * - * ex_search(ex_t *tree, ex_node_t *key); + * ex_search(ex_t *tree, const ex_node_t *key); * Description: Search for node that matches key. * Args: * tree: Pointer to an initialized red-black tree object. @@ -262,9 +270,9 @@ a_prefix##reverse_iter(a_rbt_type *rbtree, a_type *start, \ * Ret: Node in tree that matches key, or NULL if no match. * * static ex_node_t * - * ex_nsearch(ex_t *tree, ex_node_t *key); + * ex_nsearch(ex_t *tree, const ex_node_t *key); * static ex_node_t * - * ex_psearch(ex_t *tree, ex_node_t *key); + * ex_psearch(ex_t *tree, const ex_node_t *key); * Description: Search for node that matches key. If no match is found, * return what would be key's successor/predecessor, were * key in tree. @@ -312,6 +320,20 @@ a_prefix##reverse_iter(a_rbt_type *rbtree, a_type *start, \ * arg : Opaque pointer passed to cb(). * Ret: NULL if iteration completed, or the non-NULL callback return value * that caused termination of the iteration. 
+ * + * static void + * ex_destroy(ex_t *tree, void (*cb)(ex_node_t *, void *), void *arg); + * Description: Iterate over the tree with post-order traversal, remove + * each node, and run the callback if non-null. This is + * used for destroying a tree without paying the cost to + * rebalance it. The tree must not be otherwise altered + * during traversal. + * Args: + * tree: Pointer to an initialized red-black tree object. + * cb : Callback function, which, if non-null, is called for each node + * during iteration. There is no way to stop iteration once it + * has begun. + * arg : Opaque pointer passed to cb(). */ #define rb_gen(a_attr, a_prefix, a_rbt_type, a_type, a_field, a_cmp) \ a_attr void \ @@ -320,36 +342,30 @@ a_prefix##new(a_rbt_type *rbtree) { \ } \ a_attr bool \ a_prefix##empty(a_rbt_type *rbtree) { \ - return (rbtree->rbt_root == &rbtree->rbt_nil); \ + return (rbtree->rbt_root == NULL); \ } \ a_attr a_type * \ a_prefix##first(a_rbt_type *rbtree) { \ a_type *ret; \ rbtn_first(a_type, a_field, rbtree, rbtree->rbt_root, ret); \ - if (ret == &rbtree->rbt_nil) { \ - ret = NULL; \ - } \ return (ret); \ } \ a_attr a_type * \ a_prefix##last(a_rbt_type *rbtree) { \ a_type *ret; \ rbtn_last(a_type, a_field, rbtree, rbtree->rbt_root, ret); \ - if (ret == &rbtree->rbt_nil) { \ - ret = NULL; \ - } \ return (ret); \ } \ a_attr a_type * \ a_prefix##next(a_rbt_type *rbtree, a_type *node) { \ a_type *ret; \ - if (rbtn_right_get(a_type, a_field, node) != &rbtree->rbt_nil) { \ + if (rbtn_right_get(a_type, a_field, node) != NULL) { \ rbtn_first(a_type, a_field, rbtree, rbtn_right_get(a_type, \ a_field, node), ret); \ } else { \ a_type *tnode = rbtree->rbt_root; \ - assert(tnode != &rbtree->rbt_nil); \ - ret = &rbtree->rbt_nil; \ + assert(tnode != NULL); \ + ret = NULL; \ while (true) { \ int cmp = (a_cmp)(node, tnode); \ if (cmp < 0) { \ @@ -360,24 +376,21 @@ a_prefix##next(a_rbt_type *rbtree, a_type *node) { \ } else { \ break; \ } \ - assert(tnode != &rbtree->rbt_nil); 
\ + assert(tnode != NULL); \ } \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = (NULL); \ - } \ return (ret); \ } \ a_attr a_type * \ a_prefix##prev(a_rbt_type *rbtree, a_type *node) { \ a_type *ret; \ - if (rbtn_left_get(a_type, a_field, node) != &rbtree->rbt_nil) { \ + if (rbtn_left_get(a_type, a_field, node) != NULL) { \ rbtn_last(a_type, a_field, rbtree, rbtn_left_get(a_type, \ a_field, node), ret); \ } else { \ a_type *tnode = rbtree->rbt_root; \ - assert(tnode != &rbtree->rbt_nil); \ - ret = &rbtree->rbt_nil; \ + assert(tnode != NULL); \ + ret = NULL; \ while (true) { \ int cmp = (a_cmp)(node, tnode); \ if (cmp < 0) { \ @@ -388,20 +401,17 @@ a_prefix##prev(a_rbt_type *rbtree, a_type *node) { \ } else { \ break; \ } \ - assert(tnode != &rbtree->rbt_nil); \ + assert(tnode != NULL); \ } \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = (NULL); \ - } \ return (ret); \ } \ a_attr a_type * \ -a_prefix##search(a_rbt_type *rbtree, a_type *key) { \ +a_prefix##search(a_rbt_type *rbtree, const a_type *key) { \ a_type *ret; \ int cmp; \ ret = rbtree->rbt_root; \ - while (ret != &rbtree->rbt_nil \ + while (ret != NULL \ && (cmp = (a_cmp)(key, ret)) != 0) { \ if (cmp < 0) { \ ret = rbtn_left_get(a_type, a_field, ret); \ @@ -409,17 +419,14 @@ a_prefix##search(a_rbt_type *rbtree, a_type *key) { \ ret = rbtn_right_get(a_type, a_field, ret); \ } \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = (NULL); \ - } \ return (ret); \ } \ a_attr a_type * \ -a_prefix##nsearch(a_rbt_type *rbtree, a_type *key) { \ +a_prefix##nsearch(a_rbt_type *rbtree, const a_type *key) { \ a_type *ret; \ a_type *tnode = rbtree->rbt_root; \ - ret = &rbtree->rbt_nil; \ - while (tnode != &rbtree->rbt_nil) { \ + ret = NULL; \ + while (tnode != NULL) { \ int cmp = (a_cmp)(key, tnode); \ if (cmp < 0) { \ ret = tnode; \ @@ -431,17 +438,14 @@ a_prefix##nsearch(a_rbt_type *rbtree, a_type *key) { \ break; \ } \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = (NULL); \ - } \ return (ret); \ } \ a_attr a_type * \ 
-a_prefix##psearch(a_rbt_type *rbtree, a_type *key) { \ +a_prefix##psearch(a_rbt_type *rbtree, const a_type *key) { \ a_type *ret; \ a_type *tnode = rbtree->rbt_root; \ - ret = &rbtree->rbt_nil; \ - while (tnode != &rbtree->rbt_nil) { \ + ret = NULL; \ + while (tnode != NULL) { \ int cmp = (a_cmp)(key, tnode); \ if (cmp < 0) { \ tnode = rbtn_left_get(a_type, a_field, tnode); \ @@ -453,9 +457,6 @@ a_prefix##psearch(a_rbt_type *rbtree, a_type *key) { \ break; \ } \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = (NULL); \ - } \ return (ret); \ } \ a_attr void \ @@ -467,7 +468,7 @@ a_prefix##insert(a_rbt_type *rbtree, a_type *node) { \ rbt_node_new(a_type, a_field, rbtree, node); \ /* Wind. */ \ path->node = rbtree->rbt_root; \ - for (pathp = path; pathp->node != &rbtree->rbt_nil; pathp++) { \ + for (pathp = path; pathp->node != NULL; pathp++) { \ int cmp = pathp->cmp = a_cmp(node, pathp->node); \ assert(cmp != 0); \ if (cmp < 0) { \ @@ -487,7 +488,8 @@ a_prefix##insert(a_rbt_type *rbtree, a_type *node) { \ rbtn_left_set(a_type, a_field, cnode, left); \ if (rbtn_red_get(a_type, a_field, left)) { \ a_type *leftleft = rbtn_left_get(a_type, a_field, left);\ - if (rbtn_red_get(a_type, a_field, leftleft)) { \ + if (leftleft != NULL && rbtn_red_get(a_type, a_field, \ + leftleft)) { \ /* Fix up 4-node. */ \ a_type *tnode; \ rbtn_black_set(a_type, a_field, leftleft); \ @@ -502,7 +504,8 @@ a_prefix##insert(a_rbt_type *rbtree, a_type *node) { \ rbtn_right_set(a_type, a_field, cnode, right); \ if (rbtn_red_get(a_type, a_field, right)) { \ a_type *left = rbtn_left_get(a_type, a_field, cnode); \ - if (rbtn_red_get(a_type, a_field, left)) { \ + if (left != NULL && rbtn_red_get(a_type, a_field, \ + left)) { \ /* Split 4-node. */ \ rbtn_black_set(a_type, a_field, left); \ rbtn_black_set(a_type, a_field, right); \ @@ -535,7 +538,7 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ /* Wind. */ \ nodep = NULL; /* Silence compiler warning. 
*/ \ path->node = rbtree->rbt_root; \ - for (pathp = path; pathp->node != &rbtree->rbt_nil; pathp++) { \ + for (pathp = path; pathp->node != NULL; pathp++) { \ int cmp = pathp->cmp = a_cmp(node, pathp->node); \ if (cmp < 0) { \ pathp[1].node = rbtn_left_get(a_type, a_field, \ @@ -547,7 +550,7 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ /* Find node's successor, in preparation for swap. */ \ pathp->cmp = 1; \ nodep = pathp; \ - for (pathp++; pathp->node != &rbtree->rbt_nil; \ + for (pathp++; pathp->node != NULL; \ pathp++) { \ pathp->cmp = -1; \ pathp[1].node = rbtn_left_get(a_type, a_field, \ @@ -590,7 +593,7 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ } \ } else { \ a_type *left = rbtn_left_get(a_type, a_field, node); \ - if (left != &rbtree->rbt_nil) { \ + if (left != NULL) { \ /* node has no successor, but it has a left child. */\ /* Splice node out, without losing the left child. */\ assert(!rbtn_red_get(a_type, a_field, node)); \ @@ -610,33 +613,32 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ return; \ } else if (pathp == path) { \ /* The tree only contained one node. */ \ - rbtree->rbt_root = &rbtree->rbt_nil; \ + rbtree->rbt_root = NULL; \ return; \ } \ } \ if (rbtn_red_get(a_type, a_field, pathp->node)) { \ /* Prune red node, which requires no fixup. */ \ assert(pathp[-1].cmp < 0); \ - rbtn_left_set(a_type, a_field, pathp[-1].node, \ - &rbtree->rbt_nil); \ + rbtn_left_set(a_type, a_field, pathp[-1].node, NULL); \ return; \ } \ /* The node to be pruned is black, so unwind until balance is */\ /* restored. 
*/\ - pathp->node = &rbtree->rbt_nil; \ + pathp->node = NULL; \ for (pathp--; (uintptr_t)pathp >= (uintptr_t)path; pathp--) { \ assert(pathp->cmp != 0); \ if (pathp->cmp < 0) { \ rbtn_left_set(a_type, a_field, pathp->node, \ pathp[1].node); \ - assert(!rbtn_red_get(a_type, a_field, pathp[1].node)); \ if (rbtn_red_get(a_type, a_field, pathp->node)) { \ a_type *right = rbtn_right_get(a_type, a_field, \ pathp->node); \ a_type *rightleft = rbtn_left_get(a_type, a_field, \ right); \ a_type *tnode; \ - if (rbtn_red_get(a_type, a_field, rightleft)) { \ + if (rightleft != NULL && rbtn_red_get(a_type, a_field, \ + rightleft)) { \ /* In the following diagrams, ||, //, and \\ */\ /* indicate the path to the removed node. */\ /* */\ @@ -679,7 +681,8 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ pathp->node); \ a_type *rightleft = rbtn_left_get(a_type, a_field, \ right); \ - if (rbtn_red_get(a_type, a_field, rightleft)) { \ + if (rightleft != NULL && rbtn_red_get(a_type, a_field, \ + rightleft)) { \ /* || */\ /* pathp(b) */\ /* // \ */\ @@ -733,7 +736,8 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ left); \ a_type *leftrightleft = rbtn_left_get(a_type, a_field, \ leftright); \ - if (rbtn_red_get(a_type, a_field, leftrightleft)) { \ + if (leftrightleft != NULL && rbtn_red_get(a_type, \ + a_field, leftrightleft)) { \ /* || */\ /* pathp(b) */\ /* / \\ */\ @@ -759,7 +763,7 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ /* (b) */\ /* / */\ /* (b) */\ - assert(leftright != &rbtree->rbt_nil); \ + assert(leftright != NULL); \ rbtn_red_set(a_type, a_field, leftright); \ rbtn_rotate_right(a_type, a_field, pathp->node, \ tnode); \ @@ -782,7 +786,8 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ return; \ } else if (rbtn_red_get(a_type, a_field, pathp->node)) { \ a_type *leftleft = rbtn_left_get(a_type, a_field, left);\ - if (rbtn_red_get(a_type, a_field, leftleft)) { \ + if (leftleft != NULL && rbtn_red_get(a_type, a_field, \ + leftleft)) { \ 
/* || */\ /* pathp(r) */\ /* / \\ */\ @@ -820,7 +825,8 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ } \ } else { \ a_type *leftleft = rbtn_left_get(a_type, a_field, left);\ - if (rbtn_red_get(a_type, a_field, leftleft)) { \ + if (leftleft != NULL && rbtn_red_get(a_type, a_field, \ + leftleft)) { \ /* || */\ /* pathp(b) */\ /* / \\ */\ @@ -866,13 +872,13 @@ a_prefix##remove(a_rbt_type *rbtree, a_type *node) { \ a_attr a_type * \ a_prefix##iter_recurse(a_rbt_type *rbtree, a_type *node, \ a_type *(*cb)(a_rbt_type *, a_type *, void *), void *arg) { \ - if (node == &rbtree->rbt_nil) { \ - return (&rbtree->rbt_nil); \ + if (node == NULL) { \ + return (NULL); \ } else { \ a_type *ret; \ if ((ret = a_prefix##iter_recurse(rbtree, rbtn_left_get(a_type, \ - a_field, node), cb, arg)) != &rbtree->rbt_nil \ - || (ret = cb(rbtree, node, arg)) != NULL) { \ + a_field, node), cb, arg)) != NULL || (ret = cb(rbtree, node, \ + arg)) != NULL) { \ return (ret); \ } \ return (a_prefix##iter_recurse(rbtree, rbtn_right_get(a_type, \ @@ -886,8 +892,8 @@ a_prefix##iter_start(a_rbt_type *rbtree, a_type *start, a_type *node, \ if (cmp < 0) { \ a_type *ret; \ if ((ret = a_prefix##iter_start(rbtree, start, \ - rbtn_left_get(a_type, a_field, node), cb, arg)) != \ - &rbtree->rbt_nil || (ret = cb(rbtree, node, arg)) != NULL) { \ + rbtn_left_get(a_type, a_field, node), cb, arg)) != NULL || \ + (ret = cb(rbtree, node, arg)) != NULL) { \ return (ret); \ } \ return (a_prefix##iter_recurse(rbtree, rbtn_right_get(a_type, \ @@ -914,21 +920,18 @@ a_prefix##iter(a_rbt_type *rbtree, a_type *start, a_type *(*cb)( \ } else { \ ret = a_prefix##iter_recurse(rbtree, rbtree->rbt_root, cb, arg);\ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = NULL; \ - } \ return (ret); \ } \ a_attr a_type * \ a_prefix##reverse_iter_recurse(a_rbt_type *rbtree, a_type *node, \ a_type *(*cb)(a_rbt_type *, a_type *, void *), void *arg) { \ - if (node == &rbtree->rbt_nil) { \ - return (&rbtree->rbt_nil); \ + if (node == 
NULL) { \ + return (NULL); \ } else { \ a_type *ret; \ if ((ret = a_prefix##reverse_iter_recurse(rbtree, \ - rbtn_right_get(a_type, a_field, node), cb, arg)) != \ - &rbtree->rbt_nil || (ret = cb(rbtree, node, arg)) != NULL) { \ + rbtn_right_get(a_type, a_field, node), cb, arg)) != NULL || \ + (ret = cb(rbtree, node, arg)) != NULL) { \ return (ret); \ } \ return (a_prefix##reverse_iter_recurse(rbtree, \ @@ -943,8 +946,8 @@ a_prefix##reverse_iter_start(a_rbt_type *rbtree, a_type *start, \ if (cmp > 0) { \ a_type *ret; \ if ((ret = a_prefix##reverse_iter_start(rbtree, start, \ - rbtn_right_get(a_type, a_field, node), cb, arg)) != \ - &rbtree->rbt_nil || (ret = cb(rbtree, node, arg)) != NULL) { \ + rbtn_right_get(a_type, a_field, node), cb, arg)) != NULL || \ + (ret = cb(rbtree, node, arg)) != NULL) { \ return (ret); \ } \ return (a_prefix##reverse_iter_recurse(rbtree, \ @@ -972,10 +975,29 @@ a_prefix##reverse_iter(a_rbt_type *rbtree, a_type *start, \ ret = a_prefix##reverse_iter_recurse(rbtree, rbtree->rbt_root, \ cb, arg); \ } \ - if (ret == &rbtree->rbt_nil) { \ - ret = NULL; \ - } \ return (ret); \ +} \ +a_attr void \ +a_prefix##destroy_recurse(a_rbt_type *rbtree, a_type *node, void (*cb)( \ + a_type *, void *), void *arg) { \ + if (node == NULL) { \ + return; \ + } \ + a_prefix##destroy_recurse(rbtree, rbtn_left_get(a_type, a_field, \ + node), cb, arg); \ + rbtn_left_set(a_type, a_field, (node), NULL); \ + a_prefix##destroy_recurse(rbtree, rbtn_right_get(a_type, a_field, \ + node), cb, arg); \ + rbtn_right_set(a_type, a_field, (node), NULL); \ + if (cb) { \ + cb(node, arg); \ + } \ +} \ +a_attr void \ +a_prefix##destroy(a_rbt_type *rbtree, void (*cb)(a_type *, void *), \ + void *arg) { \ + a_prefix##destroy_recurse(rbtree, rbtree->rbt_root, cb, arg); \ + rbtree->rbt_root = NULL; \ } #endif /* RB_H_ */ diff --git a/contrib/jemalloc/include/jemalloc/internal/size_classes.h b/contrib/jemalloc/include/jemalloc/internal/size_classes.h index 4ca58a1..615586d 100644 --- 
a/contrib/jemalloc/include/jemalloc/internal/size_classes.h +++ b/contrib/jemalloc/include/jemalloc/internal/size_classes.h @@ -166,22 +166,17 @@ SC(104, 30, 28, 1, no, no) \ SC(105, 30, 28, 2, no, no) \ SC(106, 30, 28, 3, no, no) \ - SC(107, 30, 28, 4, no, no) \ - \ - SC(108, 31, 29, 1, no, no) \ - SC(109, 31, 29, 2, no, no) \ - SC(110, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 32 #define NBINS 39 -#define NSIZES 111 +#define NSIZES 107 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11)) #define LG_LARGE_MINCLASS 14 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 13) @@ -320,22 +315,17 @@ SC(104, 30, 28, 1, no, no) \ SC(105, 30, 28, 2, no, no) \ SC(106, 30, 28, 3, no, no) \ - SC(107, 30, 28, 4, no, no) \ - \ - SC(108, 31, 29, 1, no, no) \ - SC(109, 31, 29, 2, no, no) \ - SC(110, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 32 #define NBINS 43 -#define NSIZES 111 +#define NSIZES 107 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12)) #define LG_LARGE_MINCLASS 15 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 14) @@ -474,22 +464,17 @@ SC(104, 30, 28, 1, no, no) \ SC(105, 30, 28, 2, no, no) \ SC(106, 30, 28, 3, no, no) \ - SC(107, 30, 28, 4, no, no) \ - \ - SC(108, 31, 29, 1, no, no) \ - SC(109, 31, 29, 2, no, no) \ - SC(110, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 32 #define NBINS 47 -#define 
NSIZES 111 +#define NSIZES 107 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13)) #define LG_LARGE_MINCLASS 16 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 16) @@ -628,22 +613,17 @@ SC(104, 30, 28, 1, no, no) \ SC(105, 30, 28, 2, no, no) \ SC(106, 30, 28, 3, no, no) \ - SC(107, 30, 28, 4, no, no) \ - \ - SC(108, 31, 29, 1, no, no) \ - SC(109, 31, 29, 2, no, no) \ - SC(110, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 32 #define NBINS 55 -#define NSIZES 111 +#define NSIZES 107 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15)) #define LG_LARGE_MINCLASS 18 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 12) @@ -779,22 +759,17 @@ SC(101, 30, 28, 1, no, no) \ SC(102, 30, 28, 2, no, no) \ SC(103, 30, 28, 3, no, no) \ - SC(104, 30, 28, 4, no, no) \ - \ - SC(105, 31, 29, 1, no, no) \ - SC(106, 31, 29, 2, no, no) \ - SC(107, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 1 #define NLBINS 29 #define NBINS 36 -#define NSIZES 108 +#define NSIZES 104 #define LG_TINY_MAXCLASS 3 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11)) #define LG_LARGE_MINCLASS 14 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 13) @@ -930,22 
+905,17 @@ SC(101, 30, 28, 1, no, no) \ SC(102, 30, 28, 2, no, no) \ SC(103, 30, 28, 3, no, no) \ - SC(104, 30, 28, 4, no, no) \ - \ - SC(105, 31, 29, 1, no, no) \ - SC(106, 31, 29, 2, no, no) \ - SC(107, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 1 #define NLBINS 29 #define NBINS 40 -#define NSIZES 108 +#define NSIZES 104 #define LG_TINY_MAXCLASS 3 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12)) #define LG_LARGE_MINCLASS 15 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 14) @@ -1081,22 +1051,17 @@ SC(101, 30, 28, 1, no, no) \ SC(102, 30, 28, 2, no, no) \ SC(103, 30, 28, 3, no, no) \ - SC(104, 30, 28, 4, no, no) \ - \ - SC(105, 31, 29, 1, no, no) \ - SC(106, 31, 29, 2, no, no) \ - SC(107, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 1 #define NLBINS 29 #define NBINS 44 -#define NSIZES 108 +#define NSIZES 104 #define LG_TINY_MAXCLASS 3 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13)) #define LG_LARGE_MINCLASS 16 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 16) @@ -1232,22 +1197,17 @@ SC(101, 30, 28, 1, no, no) \ SC(102, 30, 28, 2, no, no) \ SC(103, 30, 28, 3, no, no) \ - SC(104, 30, 28, 4, no, no) \ - \ - SC(105, 31, 29, 1, no, no) \ - SC(106, 31, 29, 2, no, no) \ - SC(107, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 1 #define NLBINS 29 #define NBINS 52 -#define NSIZES 108 +#define NSIZES 104 #define LG_TINY_MAXCLASS 3 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define 
SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15)) #define LG_LARGE_MINCLASS 18 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 12) @@ -1381,22 +1341,17 @@ SC(100, 30, 28, 1, no, no) \ SC(101, 30, 28, 2, no, no) \ SC(102, 30, 28, 3, no, no) \ - SC(103, 30, 28, 4, no, no) \ - \ - SC(104, 31, 29, 1, no, no) \ - SC(105, 31, 29, 2, no, no) \ - SC(106, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 28 #define NBINS 35 -#define NSIZES 107 +#define NSIZES 103 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11)) #define LG_LARGE_MINCLASS 14 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 13) @@ -1530,22 +1485,17 @@ SC(100, 30, 28, 1, no, no) \ SC(101, 30, 28, 2, no, no) \ SC(102, 30, 28, 3, no, no) \ - SC(103, 30, 28, 4, no, no) \ - \ - SC(104, 31, 29, 1, no, no) \ - SC(105, 31, 29, 2, no, no) \ - SC(106, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 28 #define NBINS 39 -#define NSIZES 107 +#define NSIZES 103 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12)) #define LG_LARGE_MINCLASS 15 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 14) @@ -1679,22 +1629,17 @@ SC(100, 30, 28, 1, no, no) \ SC(101, 30, 28, 2, no, no) \ SC(102, 30, 28, 3, no, no) \ - SC(103, 30, 28, 4, no, no) 
\ - \ - SC(104, 31, 29, 1, no, no) \ - SC(105, 31, 29, 2, no, no) \ - SC(106, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 28 #define NBINS 43 -#define NSIZES 107 +#define NSIZES 103 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13)) #define LG_LARGE_MINCLASS 16 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 2 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 16) @@ -1828,22 +1773,17 @@ SC(100, 30, 28, 1, no, no) \ SC(101, 30, 28, 2, no, no) \ SC(102, 30, 28, 3, no, no) \ - SC(103, 30, 28, 4, no, no) \ - \ - SC(104, 31, 29, 1, no, no) \ - SC(105, 31, 29, 2, no, no) \ - SC(106, 31, 29, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 28 #define NBINS 51 -#define NSIZES 107 +#define NSIZES 103 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15)) #define LG_LARGE_MINCLASS 18 -#define HUGE_MAXCLASS ((((size_t)1) << 31) + (((size_t)3) << 29)) +#define HUGE_MAXCLASS ((((size_t)1) << 30) + (((size_t)3) << 28)) #endif #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 12) @@ -2142,22 +2082,17 @@ SC(232, 62, 60, 1, no, no) \ SC(233, 62, 60, 2, no, no) \ SC(234, 62, 60, 3, no, no) \ - SC(235, 62, 60, 4, no, no) \ - \ - SC(236, 63, 61, 1, no, no) \ - SC(237, 63, 61, 2, no, no) \ - SC(238, 63, 61, 3, no, no) \ #define SIZE_CLASSES_DEFINED #define NTBINS 0 #define NLBINS 32 #define NBINS 39 -#define NSIZES 239 +#define NSIZES 235 #define LG_TINY_MAXCLASS "NA" #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9)) #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11)) #define LG_LARGE_MINCLASS 14 -#define HUGE_MAXCLASS ((((size_t)1) 
<< 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 13)
@@ -2456,22 +2391,17 @@
     SC(232, 62, 60, 1, no, no) \
     SC(233, 62, 60, 2, no, no) \
     SC(234, 62, 60, 3, no, no) \
-    SC(235, 62, 60, 4, no, no) \
-    \
-    SC(236, 63, 61, 1, no, no) \
-    SC(237, 63, 61, 2, no, no) \
-    SC(238, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 32
 #define NBINS 43
-#define NSIZES 239
+#define NSIZES 235
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12))
 #define LG_LARGE_MINCLASS 15
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 14)
@@ -2770,22 +2700,17 @@
     SC(232, 62, 60, 1, no, no) \
     SC(233, 62, 60, 2, no, no) \
     SC(234, 62, 60, 3, no, no) \
-    SC(235, 62, 60, 4, no, no) \
-    \
-    SC(236, 63, 61, 1, no, no) \
-    SC(237, 63, 61, 2, no, no) \
-    SC(238, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 32
 #define NBINS 47
-#define NSIZES 239
+#define NSIZES 235
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13))
 #define LG_LARGE_MINCLASS 16
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 3 && LG_PAGE == 16)
@@ -3084,22 +3009,17 @@
     SC(232, 62, 60, 1, no, no) \
     SC(233, 62, 60, 2, no, no) \
     SC(234, 62, 60, 3, no, no) \
-    SC(235, 62, 60, 4, no, no) \
-    \
-    SC(236, 63, 61, 1, no, no) \
-    SC(237, 63, 61, 2, no, no) \
-    SC(238, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 32
 #define NBINS 55
-#define NSIZES 239
+#define NSIZES 235
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15))
 #define LG_LARGE_MINCLASS 18
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 12)
@@ -3395,22 +3315,17 @@
     SC(229, 62, 60, 1, no, no) \
     SC(230, 62, 60, 2, no, no) \
     SC(231, 62, 60, 3, no, no) \
-    SC(232, 62, 60, 4, no, no) \
-    \
-    SC(233, 63, 61, 1, no, no) \
-    SC(234, 63, 61, 2, no, no) \
-    SC(235, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 1
 #define NLBINS 29
 #define NBINS 36
-#define NSIZES 236
+#define NSIZES 232
 #define LG_TINY_MAXCLASS 3
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11))
 #define LG_LARGE_MINCLASS 14
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 13)
@@ -3706,22 +3621,17 @@
     SC(229, 62, 60, 1, no, no) \
     SC(230, 62, 60, 2, no, no) \
     SC(231, 62, 60, 3, no, no) \
-    SC(232, 62, 60, 4, no, no) \
-    \
-    SC(233, 63, 61, 1, no, no) \
-    SC(234, 63, 61, 2, no, no) \
-    SC(235, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 1
 #define NLBINS 29
 #define NBINS 40
-#define NSIZES 236
+#define NSIZES 232
 #define LG_TINY_MAXCLASS 3
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12))
 #define LG_LARGE_MINCLASS 15
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 14)
@@ -4017,22 +3927,17 @@
     SC(229, 62, 60, 1, no, no) \
     SC(230, 62, 60, 2, no, no) \
     SC(231, 62, 60, 3, no, no) \
-    SC(232, 62, 60, 4, no, no) \
-    \
-    SC(233, 63, 61, 1, no, no) \
-    SC(234, 63, 61, 2, no, no) \
-    SC(235, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 1
 #define NLBINS 29
 #define NBINS 44
-#define NSIZES 236
+#define NSIZES 232
 #define LG_TINY_MAXCLASS 3
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13))
 #define LG_LARGE_MINCLASS 16
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 3 && LG_QUANTUM == 4 && LG_PAGE == 16)
@@ -4328,22 +4233,17 @@
     SC(229, 62, 60, 1, no, no) \
     SC(230, 62, 60, 2, no, no) \
     SC(231, 62, 60, 3, no, no) \
-    SC(232, 62, 60, 4, no, no) \
-    \
-    SC(233, 63, 61, 1, no, no) \
-    SC(234, 63, 61, 2, no, no) \
-    SC(235, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 1
 #define NLBINS 29
 #define NBINS 52
-#define NSIZES 236
+#define NSIZES 232
 #define LG_TINY_MAXCLASS 3
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15))
 #define LG_LARGE_MINCLASS 18
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 12)
@@ -4637,22 +4537,17 @@
     SC(228, 62, 60, 1, no, no) \
     SC(229, 62, 60, 2, no, no) \
     SC(230, 62, 60, 3, no, no) \
-    SC(231, 62, 60, 4, no, no) \
-    \
-    SC(232, 63, 61, 1, no, no) \
-    SC(233, 63, 61, 2, no, no) \
-    SC(234, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 28
 #define NBINS 35
-#define NSIZES 235
+#define NSIZES 231
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 13) + (((size_t)3) << 11))
 #define LG_LARGE_MINCLASS 14
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 13)
@@ -4946,22 +4841,17 @@
     SC(228, 62, 60, 1, no, no) \
     SC(229, 62, 60, 2, no, no) \
     SC(230, 62, 60, 3, no, no) \
-    SC(231, 62, 60, 4, no, no) \
-    \
-    SC(232, 63, 61, 1, no, no) \
-    SC(233, 63, 61, 2, no, no) \
-    SC(234, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 28
 #define NBINS 39
-#define NSIZES 235
+#define NSIZES 231
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 14) + (((size_t)3) << 12))
 #define LG_LARGE_MINCLASS 15
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 14)
@@ -5255,22 +5145,17 @@
     SC(228, 62, 60, 1, no, no) \
     SC(229, 62, 60, 2, no, no) \
     SC(230, 62, 60, 3, no, no) \
-    SC(231, 62, 60, 4, no, no) \
-    \
-    SC(232, 63, 61, 1, no, no) \
-    SC(233, 63, 61, 2, no, no) \
-    SC(234, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 28
 #define NBINS 43
-#define NSIZES 235
+#define NSIZES 231
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 15) + (((size_t)3) << 13))
 #define LG_LARGE_MINCLASS 16
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #if (LG_SIZEOF_PTR == 3 && LG_TINY_MIN == 4 && LG_QUANTUM == 4 && LG_PAGE == 16)
@@ -5564,22 +5449,17 @@
     SC(228, 62, 60, 1, no, no) \
     SC(229, 62, 60, 2, no, no) \
     SC(230, 62, 60, 3, no, no) \
-    SC(231, 62, 60, 4, no, no) \
-    \
-    SC(232, 63, 61, 1, no, no) \
-    SC(233, 63, 61, 2, no, no) \
-    SC(234, 63, 61, 3, no, no) \
 
 #define SIZE_CLASSES_DEFINED
 #define NTBINS 0
 #define NLBINS 28
 #define NBINS 51
-#define NSIZES 235
+#define NSIZES 231
 #define LG_TINY_MAXCLASS "NA"
 #define LOOKUP_MAXCLASS ((((size_t)1) << 11) + (((size_t)4) << 9))
 #define SMALL_MAXCLASS ((((size_t)1) << 17) + (((size_t)3) << 15))
 #define LG_LARGE_MINCLASS 18
-#define HUGE_MAXCLASS ((((size_t)1) << 63) + (((size_t)3) << 61))
+#define HUGE_MAXCLASS ((((size_t)1) << 62) + (((size_t)3) << 60))
 #endif
 
 #ifndef SIZE_CLASSES_DEFINED
diff --git a/contrib/jemalloc/include/jemalloc/internal/smoothstep.h b/contrib/jemalloc/include/jemalloc/internal/smoothstep.h
new file mode 100644
index 0000000..c5333cc
--- /dev/null
+++ b/contrib/jemalloc/include/jemalloc/internal/smoothstep.h
@@ -0,0 +1,246 @@
+/*
+ * This file was generated by the following command:
+ *   sh smoothstep.sh smoother 200 24 3 15
+ */
+/******************************************************************************/
+#ifdef JEMALLOC_H_TYPES
+
+/*
+ * This header defines a precomputed table based on the smoothstep family of
+ * sigmoidal curves (https://en.wikipedia.org/wiki/Smoothstep) that grow from 0
+ * to 1 in 0 <= x <= 1. The table is stored as integer fixed point values so
+ * that floating point math can be avoided.
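As a rough illustration of the fixed-point encoding this comment describes, the sketch below evaluates the smootherstep polynomial 6x^5 - 15x^4 + 10x^3 in floating point and scales the result by 2^24, the number of fractional bits given as SMOOTHSTEP_BFP in this header. The helper name `smoother_fixed` is hypothetical and not part of jemalloc. Truncation toward zero is an assumption: it reproduces the table entries checked here (steps 1, 50, 100, 150, 199 and 200), but the generator script itself is not part of this diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not jemalloc code): fixed-point value of
 * smootherstep(step / nsteps) with bfp fractional bits.
 */
static uint64_t
smoother_fixed(unsigned step, unsigned nsteps, unsigned bfp)
{
	double x, y;

	assert(nsteps > 0 && step <= nsteps && bfp < 53);
	x = (double)step / (double)nsteps;
	/* Horner form of 6x^5 - 15x^4 + 10x^3. */
	y = x * x * x * (x * (x * 6.0 - 15.0) + 10.0);
	/* Scale by 2^bfp and truncate (assumed rounding mode). */
	return ((uint64_t)(y * (double)(UINT64_C(1) << bfp)));
}
```

Under these assumptions, `smoother_fixed(100, 200, 24)` corresponds to the `STEP( 100, ...)` row of the table, whose `h` column is the integer form of the `y` column.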
+ *
+ *                      3     2
+ *   smoothstep(x) = -2x  + 3x
+ *
+ *                       5      4      3
+ *   smootherstep(x) = 6x  - 15x  + 10x
+ *
+ *                          7      6      5      4
+ *   smootheststep(x) = -20x  + 70x  - 84x  + 35x
+ */
+
+#define SMOOTHSTEP_VARIANT "smoother"
+#define SMOOTHSTEP_NSTEPS 200
+#define SMOOTHSTEP_BFP 24
+#define SMOOTHSTEP \
+ /* STEP(step, h, x, y) */ \
+    STEP(   1, UINT64_C(0x0000000000000014), 0.005, 0.000001240643750) \
+    STEP(   2, UINT64_C(0x00000000000000a5), 0.010, 0.000009850600000) \
+    STEP(   3, UINT64_C(0x0000000000000229), 0.015, 0.000032995181250) \
+    STEP(   4, UINT64_C(0x0000000000000516), 0.020, 0.000077619200000) \
+    STEP(   5, UINT64_C(0x00000000000009dc), 0.025, 0.000150449218750) \
+    STEP(   6, UINT64_C(0x00000000000010e8), 0.030, 0.000257995800000) \
+    STEP(   7, UINT64_C(0x0000000000001aa4), 0.035, 0.000406555756250) \
+    STEP(   8, UINT64_C(0x0000000000002777), 0.040, 0.000602214400000) \
+    STEP(   9, UINT64_C(0x00000000000037c2), 0.045, 0.000850847793750) \
+    STEP(  10, UINT64_C(0x0000000000004be6), 0.050, 0.001158125000000) \
+    STEP(  11, UINT64_C(0x000000000000643c), 0.055, 0.001529510331250) \
+    STEP(  12, UINT64_C(0x000000000000811f), 0.060, 0.001970265600000) \
+    STEP(  13, UINT64_C(0x000000000000a2e2), 0.065, 0.002485452368750) \
+    STEP(  14, UINT64_C(0x000000000000c9d8), 0.070, 0.003079934200000) \
+    STEP(  15, UINT64_C(0x000000000000f64f), 0.075, 0.003758378906250) \
+    STEP(  16, UINT64_C(0x0000000000012891), 0.080, 0.004525260800000) \
+    STEP(  17, UINT64_C(0x00000000000160e7), 0.085, 0.005384862943750) \
+    STEP(  18, UINT64_C(0x0000000000019f95), 0.090, 0.006341279400000) \
+    STEP(  19, UINT64_C(0x000000000001e4dc), 0.095, 0.007398417481250) \
+    STEP(  20, UINT64_C(0x00000000000230fc), 0.100, 0.008560000000000) \
+    STEP(  21, UINT64_C(0x0000000000028430), 0.105, 0.009829567518750) \
+    STEP(  22, UINT64_C(0x000000000002deb0), 0.110, 0.011210480600000) \
+    STEP(  23, UINT64_C(0x00000000000340b1), 0.115, 0.012705922056250) \
+    STEP(  24, UINT64_C(0x000000000003aa67), 0.120, 0.014318899200000) \
+    STEP(  25, UINT64_C(0x0000000000041c00), 0.125, 0.016052246093750) \
+    STEP(  26, UINT64_C(0x00000000000495a8), 0.130, 0.017908625800000) \
+    STEP(  27, UINT64_C(0x000000000005178b), 0.135, 0.019890532631250) \
+    STEP(  28, UINT64_C(0x000000000005a1cf), 0.140, 0.022000294400000) \
+    STEP(  29, UINT64_C(0x0000000000063498), 0.145, 0.024240074668750) \
+    STEP(  30, UINT64_C(0x000000000006d009), 0.150, 0.026611875000000) \
+    STEP(  31, UINT64_C(0x000000000007743f), 0.155, 0.029117537206250) \
+    STEP(  32, UINT64_C(0x0000000000082157), 0.160, 0.031758745600000) \
+    STEP(  33, UINT64_C(0x000000000008d76b), 0.165, 0.034537029243750) \
+    STEP(  34, UINT64_C(0x0000000000099691), 0.170, 0.037453764200000) \
+    STEP(  35, UINT64_C(0x00000000000a5edf), 0.175, 0.040510175781250) \
+    STEP(  36, UINT64_C(0x00000000000b3067), 0.180, 0.043707340800000) \
+    STEP(  37, UINT64_C(0x00000000000c0b38), 0.185, 0.047046189818750) \
+    STEP(  38, UINT64_C(0x00000000000cef5e), 0.190, 0.050527509400000) \
+    STEP(  39, UINT64_C(0x00000000000ddce6), 0.195, 0.054151944356250) \
+    STEP(  40, UINT64_C(0x00000000000ed3d8), 0.200, 0.057920000000000) \
+    STEP(  41, UINT64_C(0x00000000000fd439), 0.205, 0.061832044393750) \
+    STEP(  42, UINT64_C(0x000000000010de0e), 0.210, 0.065888310600000) \
+    STEP(  43, UINT64_C(0x000000000011f158), 0.215, 0.070088898931250) \
+    STEP(  44, UINT64_C(0x0000000000130e17), 0.220, 0.074433779200000) \
+    STEP(  45, UINT64_C(0x0000000000143448), 0.225, 0.078922792968750) \
+    STEP(  46, UINT64_C(0x00000000001563e7), 0.230, 0.083555655800000) \
+    STEP(  47, UINT64_C(0x0000000000169cec), 0.235, 0.088331959506250) \
+    STEP(  48, UINT64_C(0x000000000017df4f), 0.240, 0.093251174400000) \
+    STEP(  49, UINT64_C(0x0000000000192b04), 0.245, 0.098312651543750) \
+    STEP(  50, UINT64_C(0x00000000001a8000), 0.250, 0.103515625000000) \
+    STEP(  51, UINT64_C(0x00000000001bde32), 0.255, 0.108859214081250) \
+    STEP(  52, UINT64_C(0x00000000001d458b), 0.260, 0.114342425600000) \
+    STEP(  53, UINT64_C(0x00000000001eb5f8), 0.265, 0.119964156118750) \
+    STEP(  54, UINT64_C(0x0000000000202f65), 0.270, 0.125723194200000) \
+    STEP(  55, UINT64_C(0x000000000021b1bb), 0.275, 0.131618222656250) \
+    STEP(  56, UINT64_C(0x0000000000233ce3), 0.280, 0.137647820800000) \
+    STEP(  57, UINT64_C(0x000000000024d0c3), 0.285, 0.143810466693750) \
+    STEP(  58, UINT64_C(0x0000000000266d40), 0.290, 0.150104539400000) \
+    STEP(  59, UINT64_C(0x000000000028123d), 0.295, 0.156528321231250) \
+    STEP(  60, UINT64_C(0x000000000029bf9c), 0.300, 0.163080000000000) \
+    STEP(  61, UINT64_C(0x00000000002b753d), 0.305, 0.169757671268750) \
+    STEP(  62, UINT64_C(0x00000000002d32fe), 0.310, 0.176559340600000) \
+    STEP(  63, UINT64_C(0x00000000002ef8bc), 0.315, 0.183482925806250) \
+    STEP(  64, UINT64_C(0x000000000030c654), 0.320, 0.190526259200000) \
+    STEP(  65, UINT64_C(0x0000000000329b9f), 0.325, 0.197687089843750) \
+    STEP(  66, UINT64_C(0x0000000000347875), 0.330, 0.204963085800000) \
+    STEP(  67, UINT64_C(0x0000000000365cb0), 0.335, 0.212351836381250) \
+    STEP(  68, UINT64_C(0x0000000000384825), 0.340, 0.219850854400000) \
+    STEP(  69, UINT64_C(0x00000000003a3aa8), 0.345, 0.227457578418750) \
+    STEP(  70, UINT64_C(0x00000000003c340f), 0.350, 0.235169375000000) \
+    STEP(  71, UINT64_C(0x00000000003e342b), 0.355, 0.242983540956250) \
+    STEP(  72, UINT64_C(0x0000000000403ace), 0.360, 0.250897305600000) \
+    STEP(  73, UINT64_C(0x00000000004247c8), 0.365, 0.258907832993750) \
+    STEP(  74, UINT64_C(0x0000000000445ae9), 0.370, 0.267012224200000) \
+    STEP(  75, UINT64_C(0x0000000000467400), 0.375, 0.275207519531250) \
+    STEP(  76, UINT64_C(0x00000000004892d8), 0.380, 0.283490700800000) \
+    STEP(  77, UINT64_C(0x00000000004ab740), 0.385, 0.291858693568750) \
+    STEP(  78, UINT64_C(0x00000000004ce102), 0.390, 0.300308369400000) \
+    STEP(  79, UINT64_C(0x00000000004f0fe9), 0.395, 0.308836548106250) \
+    STEP(  80, UINT64_C(0x00000000005143bf), 0.400, 0.317440000000000) \
+    STEP(  81, UINT64_C(0x0000000000537c4d), 0.405, 0.326115448143750) \
+    STEP(  82, UINT64_C(0x000000000055b95b), 0.410, 0.334859570600000) \
+    STEP(  83, UINT64_C(0x000000000057fab1), 0.415, 0.343669002681250) \
+    STEP(  84, UINT64_C(0x00000000005a4015), 0.420, 0.352540339200000) \
+    STEP(  85, UINT64_C(0x00000000005c894e), 0.425, 0.361470136718750) \
+    STEP(  86, UINT64_C(0x00000000005ed622), 0.430, 0.370454915800000) \
+    STEP(  87, UINT64_C(0x0000000000612655), 0.435, 0.379491163256250) \
+    STEP(  88, UINT64_C(0x00000000006379ac), 0.440, 0.388575334400000) \
+    STEP(  89, UINT64_C(0x000000000065cfeb), 0.445, 0.397703855293750) \
+    STEP(  90, UINT64_C(0x00000000006828d6), 0.450, 0.406873125000000) \
+    STEP(  91, UINT64_C(0x00000000006a842f), 0.455, 0.416079517831250) \
+    STEP(  92, UINT64_C(0x00000000006ce1bb), 0.460, 0.425319385600000) \
+    STEP(  93, UINT64_C(0x00000000006f413a), 0.465, 0.434589059868750) \
+    STEP(  94, UINT64_C(0x000000000071a270), 0.470, 0.443884854200000) \
+    STEP(  95, UINT64_C(0x000000000074051d), 0.475, 0.453203066406250) \
+    STEP(  96, UINT64_C(0x0000000000766905), 0.480, 0.462539980800000) \
+    STEP(  97, UINT64_C(0x000000000078cde7), 0.485, 0.471891870443750) \
+    STEP(  98, UINT64_C(0x00000000007b3387), 0.490, 0.481254999400000) \
+    STEP(  99, UINT64_C(0x00000000007d99a4), 0.495, 0.490625624981250) \
+    STEP( 100, UINT64_C(0x0000000000800000), 0.500, 0.500000000000000) \
+    STEP( 101, UINT64_C(0x000000000082665b), 0.505, 0.509374375018750) \
+    STEP( 102, UINT64_C(0x000000000084cc78), 0.510, 0.518745000600000) \
+    STEP( 103, UINT64_C(0x0000000000873218), 0.515, 0.528108129556250) \
+    STEP( 104, UINT64_C(0x00000000008996fa), 0.520, 0.537460019200000) \
+    STEP( 105, UINT64_C(0x00000000008bfae2), 0.525, 0.546796933593750) \
+    STEP( 106, UINT64_C(0x00000000008e5d8f), 0.530, 0.556115145800000) \
+    STEP( 107, UINT64_C(0x000000000090bec5), 0.535, 0.565410940131250) \
+    STEP( 108, UINT64_C(0x0000000000931e44), 0.540, 0.574680614400000) \
+    STEP( 109, UINT64_C(0x0000000000957bd0), 0.545, 0.583920482168750) \
+    STEP( 110, UINT64_C(0x000000000097d729), 0.550, 0.593126875000000) \
+    STEP( 111, UINT64_C(0x00000000009a3014), 0.555, 0.602296144706250) \
+    STEP( 112, UINT64_C(0x00000000009c8653), 0.560, 0.611424665600000) \
+    STEP( 113, UINT64_C(0x00000000009ed9aa), 0.565, 0.620508836743750) \
+    STEP( 114, UINT64_C(0x0000000000a129dd), 0.570, 0.629545084200000) \
+    STEP( 115, UINT64_C(0x0000000000a376b1), 0.575, 0.638529863281250) \
+    STEP( 116, UINT64_C(0x0000000000a5bfea), 0.580, 0.647459660800000) \
+    STEP( 117, UINT64_C(0x0000000000a8054e), 0.585, 0.656330997318750) \
+    STEP( 118, UINT64_C(0x0000000000aa46a4), 0.590, 0.665140429400000) \
+    STEP( 119, UINT64_C(0x0000000000ac83b2), 0.595, 0.673884551856250) \
+    STEP( 120, UINT64_C(0x0000000000aebc40), 0.600, 0.682560000000000) \
+    STEP( 121, UINT64_C(0x0000000000b0f016), 0.605, 0.691163451893750) \
+    STEP( 122, UINT64_C(0x0000000000b31efd), 0.610, 0.699691630600000) \
+    STEP( 123, UINT64_C(0x0000000000b548bf), 0.615, 0.708141306431250) \
+    STEP( 124, UINT64_C(0x0000000000b76d27), 0.620, 0.716509299200000) \
+    STEP( 125, UINT64_C(0x0000000000b98c00), 0.625, 0.724792480468750) \
+    STEP( 126, UINT64_C(0x0000000000bba516), 0.630, 0.732987775800000) \
+    STEP( 127, UINT64_C(0x0000000000bdb837), 0.635, 0.741092167006250) \
+    STEP( 128, UINT64_C(0x0000000000bfc531), 0.640, 0.749102694400000) \
+    STEP( 129, UINT64_C(0x0000000000c1cbd4), 0.645, 0.757016459043750) \
+    STEP( 130, UINT64_C(0x0000000000c3cbf0), 0.650, 0.764830625000000) \
+    STEP( 131, UINT64_C(0x0000000000c5c557), 0.655, 0.772542421581250) \
+    STEP( 132, UINT64_C(0x0000000000c7b7da), 0.660, 0.780149145600000) \
+    STEP( 133, UINT64_C(0x0000000000c9a34f), 0.665, 0.787648163618750) \
+    STEP( 134, UINT64_C(0x0000000000cb878a), 0.670, 0.795036914200000) \
+    STEP( 135, UINT64_C(0x0000000000cd6460), 0.675, 0.802312910156250) \
+    STEP( 136, UINT64_C(0x0000000000cf39ab), 0.680, 0.809473740800000) \
+    STEP( 137, UINT64_C(0x0000000000d10743), 0.685, 0.816517074193750) \
+    STEP( 138, UINT64_C(0x0000000000d2cd01), 0.690, 0.823440659400000) \
+    STEP( 139, UINT64_C(0x0000000000d48ac2), 0.695, 0.830242328731250) \
+    STEP( 140, UINT64_C(0x0000000000d64063), 0.700, 0.836920000000000) \
+    STEP( 141, UINT64_C(0x0000000000d7edc2), 0.705, 0.843471678768750) \
+    STEP( 142, UINT64_C(0x0000000000d992bf), 0.710, 0.849895460600000) \
+    STEP( 143, UINT64_C(0x0000000000db2f3c), 0.715, 0.856189533306250) \
+    STEP( 144, UINT64_C(0x0000000000dcc31c), 0.720, 0.862352179200000) \
+    STEP( 145, UINT64_C(0x0000000000de4e44), 0.725, 0.868381777343750) \
+    STEP( 146, UINT64_C(0x0000000000dfd09a), 0.730, 0.874276805800000) \
+    STEP( 147, UINT64_C(0x0000000000e14a07), 0.735, 0.880035843881250) \
+    STEP( 148, UINT64_C(0x0000000000e2ba74), 0.740, 0.885657574400000) \
+    STEP( 149, UINT64_C(0x0000000000e421cd), 0.745, 0.891140785918750) \
+    STEP( 150, UINT64_C(0x0000000000e58000), 0.750, 0.896484375000000) \
+    STEP( 151, UINT64_C(0x0000000000e6d4fb), 0.755, 0.901687348456250) \
+    STEP( 152, UINT64_C(0x0000000000e820b0), 0.760, 0.906748825600000) \
+    STEP( 153, UINT64_C(0x0000000000e96313), 0.765, 0.911668040493750) \
+    STEP( 154, UINT64_C(0x0000000000ea9c18), 0.770, 0.916444344200000) \
+    STEP( 155, UINT64_C(0x0000000000ebcbb7), 0.775, 0.921077207031250) \
+    STEP( 156, UINT64_C(0x0000000000ecf1e8), 0.780, 0.925566220800000) \
+    STEP( 157, UINT64_C(0x0000000000ee0ea7), 0.785, 0.929911101068750) \
+    STEP( 158, UINT64_C(0x0000000000ef21f1), 0.790, 0.934111689400000) \
+    STEP( 159, UINT64_C(0x0000000000f02bc6), 0.795, 0.938167955606250) \
+    STEP( 160, UINT64_C(0x0000000000f12c27), 0.800, 0.942080000000000) \
+    STEP( 161, UINT64_C(0x0000000000f22319), 0.805, 0.945848055643750) \
+    STEP( 162, UINT64_C(0x0000000000f310a1), 0.810, 0.949472490600000) \
+    STEP( 163, UINT64_C(0x0000000000f3f4c7), 0.815, 0.952953810181250) \
+    STEP( 164, UINT64_C(0x0000000000f4cf98), 0.820, 0.956292659200000) \
+    STEP( 165, UINT64_C(0x0000000000f5a120), 0.825, 0.959489824218750) \
+    STEP( 166, UINT64_C(0x0000000000f6696e), 0.830, 0.962546235800000) \
+    STEP( 167, UINT64_C(0x0000000000f72894), 0.835, 0.965462970756250) \
+    STEP( 168, UINT64_C(0x0000000000f7dea8), 0.840, 0.968241254400000) \
+    STEP( 169, UINT64_C(0x0000000000f88bc0), 0.845, 0.970882462793750) \
+    STEP( 170, UINT64_C(0x0000000000f92ff6), 0.850, 0.973388125000000) \
+    STEP( 171, UINT64_C(0x0000000000f9cb67), 0.855, 0.975759925331250) \
+    STEP( 172, UINT64_C(0x0000000000fa5e30), 0.860, 0.977999705600000) \
+    STEP( 173, UINT64_C(0x0000000000fae874), 0.865, 0.980109467368750) \
+    STEP( 174, UINT64_C(0x0000000000fb6a57), 0.870, 0.982091374200000) \
+    STEP( 175, UINT64_C(0x0000000000fbe400), 0.875, 0.983947753906250) \
+    STEP( 176, UINT64_C(0x0000000000fc5598), 0.880, 0.985681100800000) \
+    STEP( 177, UINT64_C(0x0000000000fcbf4e), 0.885, 0.987294077943750) \
+    STEP( 178, UINT64_C(0x0000000000fd214f), 0.890, 0.988789519400000) \
+    STEP( 179, UINT64_C(0x0000000000fd7bcf), 0.895, 0.990170432481250) \
+    STEP( 180, UINT64_C(0x0000000000fdcf03), 0.900, 0.991440000000000) \
+    STEP( 181, UINT64_C(0x0000000000fe1b23), 0.905, 0.992601582518750) \
+    STEP( 182, UINT64_C(0x0000000000fe606a), 0.910, 0.993658720600000) \
+    STEP( 183, UINT64_C(0x0000000000fe9f18), 0.915, 0.994615137056250) \
+    STEP( 184, UINT64_C(0x0000000000fed76e), 0.920, 0.995474739200000) \
+    STEP( 185, UINT64_C(0x0000000000ff09b0), 0.925, 0.996241621093750) \
+    STEP( 186, UINT64_C(0x0000000000ff3627), 0.930, 0.996920065800000) \
+    STEP( 187, UINT64_C(0x0000000000ff5d1d), 0.935, 0.997514547631250) \
+    STEP( 188, UINT64_C(0x0000000000ff7ee0), 0.940, 0.998029734400000) \
+    STEP( 189, UINT64_C(0x0000000000ff9bc3), 0.945, 0.998470489668750) \
+    STEP( 190, UINT64_C(0x0000000000ffb419), 0.950, 0.998841875000000) \
+    STEP( 191, UINT64_C(0x0000000000ffc83d), 0.955, 0.999149152206250) \
+    STEP( 192, UINT64_C(0x0000000000ffd888), 0.960, 0.999397785600000) \
+    STEP( 193, UINT64_C(0x0000000000ffe55b), 0.965, 0.999593444243750) \
+    STEP( 194, UINT64_C(0x0000000000ffef17), 0.970, 0.999742004200000) \
+    STEP( 195, UINT64_C(0x0000000000fff623), 0.975, 0.999849550781250) \
+    STEP( 196, UINT64_C(0x0000000000fffae9), 0.980, 0.999922380800000) \
+    STEP( 197, UINT64_C(0x0000000000fffdd6), 0.985, 0.999967004818750) \
+    STEP( 198, UINT64_C(0x0000000000ffff5a), 0.990, 0.999990149400000) \
+    STEP( 199, UINT64_C(0x0000000000ffffeb), 0.995, 0.999998759356250) \
+    STEP( 200, UINT64_C(0x0000000001000000), 1.000, 1.000000000000000) \
+
+#endif /* JEMALLOC_H_TYPES */
+/******************************************************************************/
+#ifdef JEMALLOC_H_STRUCTS
+
+
+#endif /* JEMALLOC_H_STRUCTS */
+/******************************************************************************/
+#ifdef JEMALLOC_H_EXTERNS
+
+
+#endif /* JEMALLOC_H_EXTERNS */
+/******************************************************************************/
+#ifdef JEMALLOC_H_INLINES
+
+
+#endif /* JEMALLOC_H_INLINES */
+/******************************************************************************/
diff --git a/contrib/jemalloc/include/jemalloc/internal/stats.h b/contrib/jemalloc/include/jemalloc/internal/stats.h
index c91dba9..705903a 100644
--- a/contrib/jemalloc/include/jemalloc/internal/stats.h
+++ b/contrib/jemalloc/include/jemalloc/internal/stats.h
@@ -167,15 +167,25 @@ stats_cactive_get(void)
 JEMALLOC_INLINE void
 stats_cactive_add(size_t size)
 {
+	UNUSED size_t cactive;
 
-	atomic_add_z(&stats_cactive, size);
+	assert(size > 0);
+	assert((size & chunksize_mask) == 0);
+
+	cactive = atomic_add_z(&stats_cactive, size);
+	assert(cactive - size < cactive);
 }
 
 JEMALLOC_INLINE void
 stats_cactive_sub(size_t size)
 {
+	UNUSED size_t cactive;
+
+	assert(size > 0);
+	assert((size & chunksize_mask) == 0);
 
-	atomic_sub_z(&stats_cactive, size);
+	cactive = atomic_sub_z(&stats_cactive, size);
+	assert(cactive + size > cactive);
 }
 #endif
 
diff --git a/contrib/jemalloc/include/jemalloc/internal/tcache.h b/contrib/jemalloc/include/jemalloc/internal/tcache.h
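The assertions added to stats_cactive_add() and stats_cactive_sub() in the stats.h hunk above compare the counter before and after the update, which detects unsigned wraparound as long as size is nonzero. The sketch below restates that check on a plain, non-atomic counter. The names `counter`, `counter_add_checked` and `counter_sub_checked` are hypothetical, and the helpers return a flag rather than asserting so that the failing case can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, non-atomic stand-in for stats_cactive; not jemalloc code. */
static size_t counter;

/* Add size (> 0) and report whether the addition stayed in range. */
static bool
counter_add_checked(size_t size)
{
	size_t after;

	assert(size > 0);
	after = counter + size;	/* Unsigned arithmetic wraps modulo 2^N. */
	counter = after;
	/* Same comparison as the diff: old value must be below new value. */
	return (after - size < after);
}

/* Subtract size (> 0) and report whether the counter did not underflow. */
static bool
counter_sub_checked(size_t size)
{
	size_t after;

	assert(size > 0);
	after = counter - size;
	counter = after;
	/* Same comparison as the diff: old value must be above new value. */
	return (after + size > after);
}
```

With size equal to zero both comparisons would be false even without wraparound, which is presumably why the diff asserts `size > 0` first.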
index 5079cd2..8357820 100644
--- a/contrib/jemalloc/include/jemalloc/internal/tcache.h
+++ b/contrib/jemalloc/include/jemalloc/internal/tcache.h
@@ -70,13 +70,20 @@ struct tcache_bin_s {
 	int low_water; /* Min # cached since last GC. */
 	unsigned lg_fill_div; /* Fill (ncached_max >> lg_fill_div). */
 	unsigned ncached; /* # of cached objects. */
+	/*
+	 * To make use of adjacent cacheline prefetch, the items in the avail
+	 * stack goes to higher address for newer allocations. avail points
+	 * just above the available space, which means that
+	 * avail[-ncached, ... -1] are available items and the lowest item will
+	 * be allocated first.
+	 */
 	void **avail; /* Stack of available objects. */
 };
 
 struct tcache_s {
 	ql_elm(tcache_t) link; /* Used for aggregating stats. */
 	uint64_t prof_accumbytes;/* Cleared after arena_prof_accum(). */
-	unsigned ev_cnt; /* Event count since incremental GC. */
+	ticker_t gc_ticker; /* Drives incremental GC. */
 	szind_t next_gc_bin; /* Next bin to GC. */
 	tcache_bin_t tbins[1]; /* Dynamically sized. */
 	/*
@@ -108,7 +115,7 @@ extern tcache_bin_info_t *tcache_bin_info;
  * Number of tcache bins. There are NBINS small-object bins, plus 0 or more
  * large-object bins.
  */
-extern size_t nhbins;
+extern unsigned nhbins;
 
 /* Maximum cached size class. */
 extern size_t tcache_maxclass;
@@ -126,7 +133,7 @@ extern tcaches_t *tcaches;
 size_t tcache_salloc(const void *ptr);
 void tcache_event_hard(tsd_t *tsd, tcache_t *tcache);
 void *tcache_alloc_small_hard(tsd_t *tsd, arena_t *arena, tcache_t *tcache,
-    tcache_bin_t *tbin, szind_t binind);
+    tcache_bin_t *tbin, szind_t binind, bool *tcache_success);
 void tcache_bin_flush_small(tsd_t *tsd, tcache_t *tcache, tcache_bin_t *tbin,
     szind_t binind, unsigned rem);
 void tcache_bin_flush_large(tsd_t *tsd, tcache_bin_t *tbin, szind_t binind,
@@ -155,15 +162,15 @@ void tcache_flush(void);
 bool tcache_enabled_get(void);
 tcache_t *tcache_get(tsd_t *tsd, bool create);
 void tcache_enabled_set(bool enabled);
-void *tcache_alloc_easy(tcache_bin_t *tbin);
+void *tcache_alloc_easy(tcache_bin_t *tbin, bool *tcache_success);
 void *tcache_alloc_small(tsd_t *tsd, arena_t *arena, tcache_t *tcache,
-    size_t size, bool zero);
+    size_t size, szind_t ind, bool zero, bool slow_path);
 void *tcache_alloc_large(tsd_t *tsd, arena_t *arena, tcache_t *tcache,
-    size_t size, bool zero);
+    size_t size, szind_t ind, bool zero, bool slow_path);
 void tcache_dalloc_small(tsd_t *tsd, tcache_t *tcache, void *ptr,
-    szind_t binind);
+    szind_t binind, bool slow_path);
 void tcache_dalloc_large(tsd_t *tsd, tcache_t *tcache, void *ptr,
-    size_t size);
+    size_t size, bool slow_path);
 tcache_t *tcaches_get(tsd_t *tsd, unsigned ind);
 #endif
 
@@ -240,51 +247,74 @@ tcache_event(tsd_t *tsd, tcache_t *tcache)
 	if (TCACHE_GC_INCR == 0)
 		return;
 
-	tcache->ev_cnt++;
-	assert(tcache->ev_cnt <= TCACHE_GC_INCR);
-	if (unlikely(tcache->ev_cnt == TCACHE_GC_INCR))
+	if (unlikely(ticker_tick(&tcache->gc_ticker)))
 		tcache_event_hard(tsd, tcache);
 }
 
 JEMALLOC_ALWAYS_INLINE void *
-tcache_alloc_easy(tcache_bin_t *tbin)
+tcache_alloc_easy(tcache_bin_t *tbin, bool *tcache_success)
 {
 	void *ret;
 
 	if (unlikely(tbin->ncached == 0)) {
 		tbin->low_water = -1;
+		*tcache_success = false;
 		return (NULL);
 	}
+	/*
+	 * tcache_success (instead of ret) should be checked upon the return of
+	 * this function. We avoid checking (ret == NULL) because there is
+	 * never a null stored on the avail stack (which is unknown to the
+	 * compiler), and eagerly checking ret would cause pipeline stall
+	 * (waiting for the cacheline).
+	 */
+	*tcache_success = true;
+	ret = *(tbin->avail - tbin->ncached);
 	tbin->ncached--;
+
 	if (unlikely((int)tbin->ncached < tbin->low_water))
 		tbin->low_water = tbin->ncached;
-	ret = tbin->avail[tbin->ncached];
+
 	return (ret);
 }
 
 JEMALLOC_ALWAYS_INLINE void *
 tcache_alloc_small(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size,
-    bool zero)
+    szind_t binind, bool zero, bool slow_path)
 {
 	void *ret;
-	szind_t binind;
-	size_t usize;
 	tcache_bin_t *tbin;
+	bool tcache_success;
+	size_t usize JEMALLOC_CC_SILENCE_INIT(0);
 
-	binind = size2index(size);
 	assert(binind < NBINS);
 	tbin = &tcache->tbins[binind];
-	usize = index2size(binind);
-	ret = tcache_alloc_easy(tbin);
-	if (unlikely(ret == NULL)) {
-		ret = tcache_alloc_small_hard(tsd, arena, tcache, tbin, binind);
-		if (ret == NULL)
+	ret = tcache_alloc_easy(tbin, &tcache_success);
+	assert(tcache_success == (ret != NULL));
+	if (unlikely(!tcache_success)) {
+		bool tcache_hard_success;
+		arena = arena_choose(tsd, arena);
+		if (unlikely(arena == NULL))
+			return (NULL);
+
+		ret = tcache_alloc_small_hard(tsd, arena, tcache, tbin, binind,
+		    &tcache_hard_success);
+		if (tcache_hard_success == false)
 			return (NULL);
 	}
-	assert(tcache_salloc(ret) == usize);
+
+	assert(ret);
+	/*
+	 * Only compute usize if required. The checks in the following if
+	 * statement are all static.
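The avail layout described in the tcache.h hunk above (avail points just past the slot array and cached items occupy avail[-ncached] through avail[-1]) can be sketched outside jemalloc as follows. This is only an illustration of the indexing: `bin_t`, `BIN_CAP`, `bin_init`, `bin_push` and `bin_pop` are hypothetical names, and the fixed-size array stands in for storage that jemalloc allocates elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIN_CAP 4	/* Hypothetical capacity for the sketch. */

typedef struct {
	void *slots[BIN_CAP];
	void **avail;		/* Points one past the end of slots. */
	unsigned ncached;	/* Number of cached pointers. */
} bin_t;

static void
bin_init(bin_t *bin)
{

	bin->avail = bin->slots + BIN_CAP;
	bin->ncached = 0;
}

/* Mirrors "tbin->ncached++; *(tbin->avail - tbin->ncached) = ptr;". */
static void
bin_push(bin_t *bin, void *ptr)
{

	assert(ptr != NULL && bin->ncached < BIN_CAP);
	bin->ncached++;
	*(bin->avail - bin->ncached) = ptr;
}

/* Mirrors the fast path: report success through a flag, not through ret. */
static void *
bin_pop(bin_t *bin, bool *success)
{
	void *ret;

	if (bin->ncached == 0) {
		*success = false;
		return (NULL);
	}
	*success = true;
	ret = *(bin->avail - bin->ncached);	/* Lowest occupied slot. */
	bin->ncached--;
	return (ret);
}
```

In this sketch each pop reads the lowest occupied slot, so successive pops walk toward higher addresses, which is the access pattern the comment in the diff associates with adjacent cacheline prefetch.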
+	 */
+	if (config_prof || (slow_path && config_fill) || unlikely(zero)) {
+		usize = index2size(binind);
+		assert(tcache_salloc(ret) == usize);
+	}
 
 	if (likely(!zero)) {
-		if (config_fill) {
+		if (slow_path && config_fill) {
 			if (unlikely(opt_junk_alloc)) {
 				arena_alloc_junk_small(ret,
 				    &arena_bin_info[binind], false);
@@ -292,7 +322,7 @@ tcache_alloc_small(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size,
 				memset(ret, 0, usize);
 		}
 	} else {
-		if (config_fill && unlikely(opt_junk_alloc)) {
+		if (slow_path && config_fill && unlikely(opt_junk_alloc)) {
 			arena_alloc_junk_small(ret, &arena_bin_info[binind],
 			    true);
 		}
@@ -309,28 +339,38 @@ tcache_alloc_small(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size,
 
 JEMALLOC_ALWAYS_INLINE void *
 tcache_alloc_large(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size,
-    bool zero)
+    szind_t binind, bool zero, bool slow_path)
 {
 	void *ret;
-	szind_t binind;
-	size_t usize;
 	tcache_bin_t *tbin;
+	bool tcache_success;
 
-	binind = size2index(size);
-	usize = index2size(binind);
-	assert(usize <= tcache_maxclass);
 	assert(binind < nhbins);
 	tbin = &tcache->tbins[binind];
-	ret = tcache_alloc_easy(tbin);
-	if (unlikely(ret == NULL)) {
+	ret = tcache_alloc_easy(tbin, &tcache_success);
+	assert(tcache_success == (ret != NULL));
+	if (unlikely(!tcache_success)) {
 		/*
 		 * Only allocate one large object at a time, because it's quite
 		 * expensive to create one and not use it.
*/ - ret = arena_malloc_large(arena, usize, zero); + arena = arena_choose(tsd, arena); + if (unlikely(arena == NULL)) + return (NULL); + + ret = arena_malloc_large(tsd, arena, binind, zero); if (ret == NULL) return (NULL); } else { + size_t usize JEMALLOC_CC_SILENCE_INIT(0); + + /* Only compute usize on demand */ + if (config_prof || (slow_path && config_fill) || + unlikely(zero)) { + usize = index2size(binind); + assert(usize <= tcache_maxclass); + } + if (config_prof && usize == LARGE_MINCLASS) { arena_chunk_t *chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ret); @@ -340,7 +380,7 @@ tcache_alloc_large(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size, BININD_INVALID); } if (likely(!zero)) { - if (config_fill) { + if (slow_path && config_fill) { if (unlikely(opt_junk_alloc)) memset(ret, 0xa5, usize); else if (unlikely(opt_zero)) @@ -360,14 +400,15 @@ tcache_alloc_large(tsd_t *tsd, arena_t *arena, tcache_t *tcache, size_t size, } JEMALLOC_ALWAYS_INLINE void -tcache_dalloc_small(tsd_t *tsd, tcache_t *tcache, void *ptr, szind_t binind) +tcache_dalloc_small(tsd_t *tsd, tcache_t *tcache, void *ptr, szind_t binind, + bool slow_path) { tcache_bin_t *tbin; tcache_bin_info_t *tbin_info; assert(tcache_salloc(ptr) <= SMALL_MAXCLASS); - if (config_fill && unlikely(opt_junk_free)) + if (slow_path && config_fill && unlikely(opt_junk_free)) arena_dalloc_junk_small(ptr, &arena_bin_info[binind]); tbin = &tcache->tbins[binind]; @@ -377,14 +418,15 @@ tcache_dalloc_small(tsd_t *tsd, tcache_t *tcache, void *ptr, szind_t binind) (tbin_info->ncached_max >> 1)); } assert(tbin->ncached < tbin_info->ncached_max); - tbin->avail[tbin->ncached] = ptr; tbin->ncached++; + *(tbin->avail - tbin->ncached) = ptr; tcache_event(tsd, tcache); } JEMALLOC_ALWAYS_INLINE void -tcache_dalloc_large(tsd_t *tsd, tcache_t *tcache, void *ptr, size_t size) +tcache_dalloc_large(tsd_t *tsd, tcache_t *tcache, void *ptr, size_t size, + bool slow_path) { szind_t binind; tcache_bin_t *tbin; @@ -396,7 +438,7 @@ 
tcache_dalloc_large(tsd_t *tsd, tcache_t *tcache, void *ptr, size_t size) binind = size2index(size); - if (config_fill && unlikely(opt_junk_free)) + if (slow_path && config_fill && unlikely(opt_junk_free)) arena_dalloc_junk_large(ptr, size); tbin = &tcache->tbins[binind]; @@ -406,8 +448,8 @@ tcache_dalloc_large(tsd_t *tsd, tcache_t *tcache, void *ptr, size_t size) (tbin_info->ncached_max >> 1), tcache); } assert(tbin->ncached < tbin_info->ncached_max); - tbin->avail[tbin->ncached] = ptr; tbin->ncached++; + *(tbin->avail - tbin->ncached) = ptr; tcache_event(tsd, tcache); } diff --git a/contrib/jemalloc/include/jemalloc/internal/ticker.h b/contrib/jemalloc/include/jemalloc/internal/ticker.h new file mode 100644 index 0000000..4696e56 --- /dev/null +++ b/contrib/jemalloc/include/jemalloc/internal/ticker.h @@ -0,0 +1,75 @@ +/******************************************************************************/ +#ifdef JEMALLOC_H_TYPES + +typedef struct ticker_s ticker_t; + +#endif /* JEMALLOC_H_TYPES */ +/******************************************************************************/ +#ifdef JEMALLOC_H_STRUCTS + +struct ticker_s { + int32_t tick; + int32_t nticks; +}; + +#endif /* JEMALLOC_H_STRUCTS */ +/******************************************************************************/ +#ifdef JEMALLOC_H_EXTERNS + +#endif /* JEMALLOC_H_EXTERNS */ +/******************************************************************************/ +#ifdef JEMALLOC_H_INLINES + +#ifndef JEMALLOC_ENABLE_INLINE +void ticker_init(ticker_t *ticker, int32_t nticks); +void ticker_copy(ticker_t *ticker, const ticker_t *other); +int32_t ticker_read(const ticker_t *ticker); +bool ticker_ticks(ticker_t *ticker, int32_t nticks); +bool ticker_tick(ticker_t *ticker); +#endif + +#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_TICKER_C_)) +JEMALLOC_INLINE void +ticker_init(ticker_t *ticker, int32_t nticks) +{ + + ticker->tick = nticks; + ticker->nticks = nticks; +} + +JEMALLOC_INLINE void 
+ticker_copy(ticker_t *ticker, const ticker_t *other)
+{
+
+	*ticker = *other;
+}
+
+JEMALLOC_INLINE int32_t
+ticker_read(const ticker_t *ticker)
+{
+
+	return (ticker->tick);
+}
+
+JEMALLOC_INLINE bool
+ticker_ticks(ticker_t *ticker, int32_t nticks)
+{
+
+	if (unlikely(ticker->tick < nticks)) {
+		ticker->tick = ticker->nticks;
+		return (true);
+	}
+	ticker->tick -= nticks;
+	return(false);
+}
+
+JEMALLOC_INLINE bool
+ticker_tick(ticker_t *ticker)
+{
+
+	return (ticker_ticks(ticker, 1));
+}
+#endif
+
+#endif /* JEMALLOC_H_INLINES */
+/******************************************************************************/
diff --git a/contrib/jemalloc/include/jemalloc/internal/tsd.h b/contrib/jemalloc/include/jemalloc/internal/tsd.h
index eed7aa0..16cc2f1 100644
--- a/contrib/jemalloc/include/jemalloc/internal/tsd.h
+++ b/contrib/jemalloc/include/jemalloc/internal/tsd.h
@@ -537,9 +537,9 @@ struct tsd_init_head_s {
     O(thread_deallocated, uint64_t) \
     O(prof_tdata, prof_tdata_t *) \
     O(arena, arena_t *) \
-    O(arenas_cache, arena_t **) \
-    O(narenas_cache, unsigned) \
-    O(arenas_cache_bypass, bool) \
+    O(arenas_tdata, arena_tdata_t *) \
+    O(narenas_tdata, unsigned) \
+    O(arenas_tdata_bypass, bool) \
     O(tcache_enabled, tcache_enabled_t) \
     O(quarantine, quarantine_t *) \
 
diff --git a/contrib/jemalloc/include/jemalloc/internal/util.h b/contrib/jemalloc/include/jemalloc/internal/util.h
index b2ea740..b8885bf 100644
--- a/contrib/jemalloc/include/jemalloc/internal/util.h
+++ b/contrib/jemalloc/include/jemalloc/internal/util.h
@@ -81,49 +81,7 @@
 # define unreachable()
 #endif
 
-/*
- * Define a custom assert() in order to reduce the chances of deadlock during
- * assertion failure.
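To make the countdown behaviour of the new ticker.h concrete, the sketch below restates ticker_init(), ticker_ticks() and ticker_tick() from the diff above without the JEMALLOC_INLINE and unlikely() macros so that it compiles on its own. The `sketch_` prefix is added here to mark these as stand-alone copies for illustration, not the jemalloc symbols themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-alone copy of the ticker logic shown in the diff. */
typedef struct {
	int32_t tick;	/* Remaining count before the next firing. */
	int32_t nticks;	/* Reload value. */
} sketch_ticker_t;

static void
sketch_ticker_init(sketch_ticker_t *ticker, int32_t nticks)
{

	ticker->tick = nticks;
	ticker->nticks = nticks;
}

/* Consume nticks; return true (and reload) once the count is exhausted. */
static bool
sketch_ticker_ticks(sketch_ticker_t *ticker, int32_t nticks)
{

	if (ticker->tick < nticks) {
		ticker->tick = ticker->nticks;
		return (true);
	}
	ticker->tick -= nticks;
	return (false);
}

static bool
sketch_ticker_tick(sketch_ticker_t *ticker)
{

	return (sketch_ticker_ticks(ticker, 1));
}
```

Read this way, a ticker initialized with n returns false for n single ticks and true on the following one, then starts over. In tcache_event() that true result is what triggers tcache_event_hard().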
- */
-#ifndef assert
-#define assert(e) do { \
-	if (unlikely(config_debug && !(e))) { \
-		malloc_printf( \
-		    "<jemalloc>: %s:%d: Failed assertion: \"%s\"\n", \
-		    __FILE__, __LINE__, #e); \
-		abort(); \
-	} \
-} while (0)
-#endif
-
-#ifndef not_reached
-#define not_reached() do { \
-	if (config_debug) { \
-		malloc_printf( \
-		    "<jemalloc>: %s:%d: Unreachable code reached\n", \
-		    __FILE__, __LINE__); \
-		abort(); \
-	} \
-	unreachable(); \
-} while (0)
-#endif
-
-#ifndef not_implemented
-#define not_implemented() do { \
-	if (config_debug) { \
-		malloc_printf("<jemalloc>: %s:%d: Not implemented\n", \
-		    __FILE__, __LINE__); \
-		abort(); \
-	} \
-} while (0)
-#endif
-
-#ifndef assert_not_implemented
-#define assert_not_implemented(e) do { \
-	if (unlikely(config_debug && !(e))) \
-		not_implemented(); \
-} while (0)
-#endif
+#include "jemalloc/internal/assert.h"
 /* Use to assert a particular configuration, e.g., cassert(config_debug). */
 #define cassert(c) do { \
@@ -163,10 +121,16 @@ void malloc_printf(const char *format, ...) JEMALLOC_FORMAT_PRINTF(1, 2);
 #ifdef JEMALLOC_H_INLINES
 #ifndef JEMALLOC_ENABLE_INLINE
-int jemalloc_ffsl(long bitmap);
-int jemalloc_ffs(int bitmap);
-size_t pow2_ceil(size_t x);
-size_t lg_floor(size_t x);
+unsigned ffs_llu(unsigned long long bitmap);
+unsigned ffs_lu(unsigned long bitmap);
+unsigned ffs_u(unsigned bitmap);
+unsigned ffs_zu(size_t bitmap);
+unsigned ffs_u64(uint64_t bitmap);
+unsigned ffs_u32(uint32_t bitmap);
+uint64_t pow2_ceil_u64(uint64_t x);
+uint32_t pow2_ceil_u32(uint32_t x);
+size_t pow2_ceil_zu(size_t x);
+unsigned lg_floor(size_t x);
 void set_errno(int errnum);
 int get_errno(void);
 #endif
@@ -174,27 +138,74 @@ int get_errno(void);
 #if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_UTIL_C_))
 /* Sanity check.
*/
-#if !defined(JEMALLOC_INTERNAL_FFSL) || !defined(JEMALLOC_INTERNAL_FFS)
-# error Both JEMALLOC_INTERNAL_FFSL && JEMALLOC_INTERNAL_FFS should have been defined by configure
+#if !defined(JEMALLOC_INTERNAL_FFSLL) || !defined(JEMALLOC_INTERNAL_FFSL) \
+    || !defined(JEMALLOC_INTERNAL_FFS)
+# error JEMALLOC_INTERNAL_FFS{,L,LL} should have been defined by configure
 #endif
-JEMALLOC_ALWAYS_INLINE int
-jemalloc_ffsl(long bitmap)
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_llu(unsigned long long bitmap)
+{
+
+	return (JEMALLOC_INTERNAL_FFSLL(bitmap));
+}
+
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_lu(unsigned long bitmap)
 {
 	return (JEMALLOC_INTERNAL_FFSL(bitmap));
 }
-JEMALLOC_ALWAYS_INLINE int
-jemalloc_ffs(int bitmap)
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_u(unsigned bitmap)
 {
 	return (JEMALLOC_INTERNAL_FFS(bitmap));
 }
-/* Compute the smallest power of 2 that is >= x. */
-JEMALLOC_INLINE size_t
-pow2_ceil(size_t x)
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_zu(size_t bitmap)
+{
+
+#if LG_SIZEOF_PTR == LG_SIZEOF_INT
+	return (ffs_u(bitmap));
+#elif LG_SIZEOF_PTR == LG_SIZEOF_LONG
+	return (ffs_lu(bitmap));
+#elif LG_SIZEOF_PTR == LG_SIZEOF_LONG_LONG
+	return (ffs_llu(bitmap));
+#else
+#error No implementation for size_t ffs()
+#endif
+}
+
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_u64(uint64_t bitmap)
+{
+
+#if LG_SIZEOF_LONG == 3
+	return (ffs_lu(bitmap));
+#elif LG_SIZEOF_LONG_LONG == 3
+	return (ffs_llu(bitmap));
+#else
+#error No implementation for 64-bit ffs()
+#endif
+}
+
+JEMALLOC_ALWAYS_INLINE unsigned
+ffs_u32(uint32_t bitmap)
+{
+
+#if LG_SIZEOF_INT == 2
+	return (ffs_u(bitmap));
+#else
+#error No implementation for 32-bit ffs()
+#endif
+	return (ffs_u(bitmap));
+}
+
+JEMALLOC_INLINE uint64_t
+pow2_ceil_u64(uint64_t x)
 {
 	x--;
@@ -203,15 +214,39 @@ pow2_ceil(size_t x)
 	x |= x >> 4;
 	x |= x >> 8;
 	x |= x >> 16;
-#if (LG_SIZEOF_PTR == 3)
 	x |= x >> 32;
-#endif
 	x++;
 	return (x);
 }
-#if (defined(__i386__) || defined(__amd64__) || defined(__x86_64__))
+JEMALLOC_INLINE uint32_t
+pow2_ceil_u32(uint32_t x)
+{
+
+	x--;
+	x |= x >> 1;
+	x |= x >> 2;
+	x |= x >> 4;
+	x |= x >> 8;
+	x |= x >> 16;
+	x++;
+	return (x);
+}
+
+/* Compute the smallest power of 2 that is >= x. */
 JEMALLOC_INLINE size_t
+pow2_ceil_zu(size_t x)
+{
+
+#if (LG_SIZEOF_PTR == 3)
+	return (pow2_ceil_u64(x));
+#else
+	return (pow2_ceil_u32(x));
+#endif
+}
+
+#if (defined(__i386__) || defined(__amd64__) || defined(__x86_64__))
+JEMALLOC_INLINE unsigned
 lg_floor(size_t x)
 {
 	size_t ret;
@@ -222,10 +257,11 @@ lg_floor(size_t x)
 	    : "=r"(ret) // Outputs.
 	    : "r"(x) // Inputs.
 	    );
-	return (ret);
+	assert(ret < UINT_MAX);
+	return ((unsigned)ret);
 }
 #elif (defined(_MSC_VER))
-JEMALLOC_INLINE size_t
+JEMALLOC_INLINE unsigned
 lg_floor(size_t x)
 {
 	unsigned long ret;
@@ -237,12 +273,13 @@ lg_floor(size_t x)
 #elif (LG_SIZEOF_PTR == 2)
 	_BitScanReverse(&ret, x);
 #else
-# error "Unsupported type sizes for lg_floor()"
+# error "Unsupported type size for lg_floor()"
 #endif
-	return (ret);
+	assert(ret < UINT_MAX);
+	return ((unsigned)ret);
 }
 #elif (defined(JEMALLOC_HAVE_BUILTIN_CLZ))
-JEMALLOC_INLINE size_t
+JEMALLOC_INLINE unsigned
 lg_floor(size_t x)
 {
@@ -253,11 +290,11 @@ lg_floor(size_t x)
 #elif (LG_SIZEOF_PTR == LG_SIZEOF_LONG)
 	return (((8 << LG_SIZEOF_PTR) - 1) - __builtin_clzl(x));
 #else
-# error "Unsupported type sizes for lg_floor()"
+# error "Unsupported type size for lg_floor()"
 #endif
 }
 #else
-JEMALLOC_INLINE size_t
+JEMALLOC_INLINE unsigned
 lg_floor(size_t x)
 {
@@ -268,20 +305,13 @@ lg_floor(size_t x)
 	x |= (x >> 4);
 	x |= (x >> 8);
 	x |= (x >> 16);
-#if (LG_SIZEOF_PTR == 3 && LG_SIZEOF_PTR == LG_SIZEOF_LONG)
+#if (LG_SIZEOF_PTR == 3)
 	x |= (x >> 32);
-	if (x == KZU(0xffffffffffffffff))
-		return (63);
-	x++;
-	return (jemalloc_ffsl(x) - 2);
-#elif (LG_SIZEOF_PTR == 2)
-	if (x == KZU(0xffffffff))
-		return (31);
-	x++;
-	return (jemalloc_ffs(x) - 2);
-#else
-# error "Unsupported type sizes for lg_floor()"
 #endif
+	if (x == SIZE_T_MAX)
+		return ((8 << LG_SIZEOF_PTR) - 1);
+	x++;
+	return
(ffs_zu(x) - 2);
 }
 #endif
diff --git a/contrib/jemalloc/include/jemalloc/jemalloc.h b/contrib/jemalloc/include/jemalloc/jemalloc.h
index d632d1e..2d0825a 100644
--- a/contrib/jemalloc/include/jemalloc/jemalloc.h
+++ b/contrib/jemalloc/include/jemalloc/jemalloc.h
@@ -39,6 +39,14 @@ extern "C" {
  */
 /* #undef JEMALLOC_USE_CXX_THROW */
+#ifdef _MSC_VER
+# ifdef _WIN64
+# define LG_SIZEOF_PTR_WIN 3
+# else
+# define LG_SIZEOF_PTR_WIN 2
+# endif
+#endif
+
 /* sizeof(void *) == 2^LG_SIZEOF_PTR. */
 #define LG_SIZEOF_PTR 3
@@ -79,19 +87,20 @@ extern "C" {
 #include <limits.h>
 #include <strings.h>
-#define JEMALLOC_VERSION "4.0.4-0-g91010a9e2ebfc84b1ac1ed7fdde3bfed4f65f180"
+#define JEMALLOC_VERSION "4.1.0-1-g994da4232621dd1210fcf39bdf0d6454cefda473"
 #define JEMALLOC_VERSION_MAJOR 4
-#define JEMALLOC_VERSION_MINOR 0
-#define JEMALLOC_VERSION_BUGFIX 4
-#define JEMALLOC_VERSION_NREV 0
-#define JEMALLOC_VERSION_GID "91010a9e2ebfc84b1ac1ed7fdde3bfed4f65f180"
+#define JEMALLOC_VERSION_MINOR 1
+#define JEMALLOC_VERSION_BUGFIX 0
+#define JEMALLOC_VERSION_NREV 1
+#define JEMALLOC_VERSION_GID "994da4232621dd1210fcf39bdf0d6454cefda473"
-# define MALLOCX_LG_ALIGN(la) (la)
+# define MALLOCX_LG_ALIGN(la) ((int)(la))
 # if LG_SIZEOF_PTR == 2
-# define MALLOCX_ALIGN(a) (ffs(a)-1)
+# define MALLOCX_ALIGN(a) ((int)(ffs(a)-1))
 # else
 # define MALLOCX_ALIGN(a) \
-    ((a < (size_t)INT_MAX) ? ffs(a)-1 : ffs(a>>32)+31)
+    ((int)(((a) < (size_t)INT_MAX) ?
ffs((int)(a))-1 : \
+    ffs((int)((a)>>32))+31))
 # endif
 # define MALLOCX_ZERO ((int)0x40)
 /*
@@ -111,32 +120,7 @@ extern "C" {
 # define JEMALLOC_CXX_THROW
 #endif
-#ifdef JEMALLOC_HAVE_ATTR
-# define JEMALLOC_ATTR(s) __attribute__((s))
-# define JEMALLOC_ALIGNED(s) JEMALLOC_ATTR(aligned(s))
-# ifdef JEMALLOC_HAVE_ATTR_ALLOC_SIZE
-# define JEMALLOC_ALLOC_SIZE(s) JEMALLOC_ATTR(alloc_size(s))
-# define JEMALLOC_ALLOC_SIZE2(s1, s2) JEMALLOC_ATTR(alloc_size(s1, s2))
-# else
-# define JEMALLOC_ALLOC_SIZE(s)
-# define JEMALLOC_ALLOC_SIZE2(s1, s2)
-# endif
-# ifndef JEMALLOC_EXPORT
-# define JEMALLOC_EXPORT JEMALLOC_ATTR(visibility("default"))
-# endif
-# ifdef JEMALLOC_HAVE_ATTR_FORMAT_GNU_PRINTF
-# define JEMALLOC_FORMAT_PRINTF(s, i) JEMALLOC_ATTR(format(gnu_printf, s, i))
-# elif defined(JEMALLOC_HAVE_ATTR_FORMAT_PRINTF)
-# define JEMALLOC_FORMAT_PRINTF(s, i) JEMALLOC_ATTR(format(printf, s, i))
-# else
-# define JEMALLOC_FORMAT_PRINTF(s, i)
-# endif
-# define JEMALLOC_NOINLINE JEMALLOC_ATTR(noinline)
-# define JEMALLOC_NOTHROW JEMALLOC_ATTR(nothrow)
-# define JEMALLOC_SECTION(s) JEMALLOC_ATTR(section(s))
-# define JEMALLOC_RESTRICT_RETURN
-# define JEMALLOC_ALLOCATOR
-#elif _MSC_VER
+#if _MSC_VER
 # define JEMALLOC_ATTR(s)
 # define JEMALLOC_ALIGNED(s) __declspec(align(s))
 # define JEMALLOC_ALLOC_SIZE(s)
@@ -162,6 +146,31 @@ extern "C" {
 # else
 # define JEMALLOC_ALLOCATOR
 # endif
+#elif defined(JEMALLOC_HAVE_ATTR)
+# define JEMALLOC_ATTR(s) __attribute__((s))
+# define JEMALLOC_ALIGNED(s) JEMALLOC_ATTR(aligned(s))
+# ifdef JEMALLOC_HAVE_ATTR_ALLOC_SIZE
+# define JEMALLOC_ALLOC_SIZE(s) JEMALLOC_ATTR(alloc_size(s))
+# define JEMALLOC_ALLOC_SIZE2(s1, s2) JEMALLOC_ATTR(alloc_size(s1, s2))
+# else
+# define JEMALLOC_ALLOC_SIZE(s)
+# define JEMALLOC_ALLOC_SIZE2(s1, s2)
+# endif
+# ifndef JEMALLOC_EXPORT
+# define JEMALLOC_EXPORT JEMALLOC_ATTR(visibility("default"))
+# endif
+# ifdef JEMALLOC_HAVE_ATTR_FORMAT_GNU_PRINTF
+# define JEMALLOC_FORMAT_PRINTF(s, i)
JEMALLOC_ATTR(format(gnu_printf, s, i))
+# elif defined(JEMALLOC_HAVE_ATTR_FORMAT_PRINTF)
+# define JEMALLOC_FORMAT_PRINTF(s, i) JEMALLOC_ATTR(format(printf, s, i))
+# else
+# define JEMALLOC_FORMAT_PRINTF(s, i)
+# endif
+# define JEMALLOC_NOINLINE JEMALLOC_ATTR(noinline)
+# define JEMALLOC_NOTHROW JEMALLOC_ATTR(nothrow)
+# define JEMALLOC_SECTION(s) JEMALLOC_ATTR(section(s))
+# define JEMALLOC_RESTRICT_RETURN
+# define JEMALLOC_ALLOCATOR
 #else
 # define JEMALLOC_ATTR(s)
 # define JEMALLOC_ALIGNED(s)
diff --git a/contrib/jemalloc/include/jemalloc/jemalloc_FreeBSD.h b/contrib/jemalloc/include/jemalloc/jemalloc_FreeBSD.h
index 1ab2ce5..433dab5 100644
--- a/contrib/jemalloc/include/jemalloc/jemalloc_FreeBSD.h
+++ b/contrib/jemalloc/include/jemalloc/jemalloc_FreeBSD.h
@@ -78,17 +78,22 @@ extern int __isthreaded;
 /* Mangle. */
 #undef je_malloc
 #undef je_calloc
-#undef je_realloc
-#undef je_free
 #undef je_posix_memalign
 #undef je_aligned_alloc
+#undef je_realloc
+#undef je_free
 #undef je_malloc_usable_size
 #undef je_mallocx
 #undef je_rallocx
 #undef je_xallocx
 #undef je_sallocx
 #undef je_dallocx
+#undef je_sdallocx
 #undef je_nallocx
+#undef je_mallctl
+#undef je_mallctlnametomib
+#undef je_mallctlbymib
+#undef je_malloc_stats_print
 #undef je_allocm
 #undef je_rallocm
 #undef je_sallocm
@@ -96,17 +101,22 @@ extern int __isthreaded;
 #undef je_nallocm
 #define je_malloc __malloc
 #define je_calloc __calloc
-#define je_realloc __realloc
-#define je_free __free
 #define je_posix_memalign __posix_memalign
 #define je_aligned_alloc __aligned_alloc
+#define je_realloc __realloc
+#define je_free __free
 #define je_malloc_usable_size __malloc_usable_size
 #define je_mallocx __mallocx
 #define je_rallocx __rallocx
 #define je_xallocx __xallocx
 #define je_sallocx __sallocx
 #define je_dallocx __dallocx
+#define je_sdallocx __sdallocx
 #define je_nallocx __nallocx
+#define je_mallctl __mallctl
+#define je_mallctlnametomib __mallctlnametomib
+#define je_mallctlbymib __mallctlbymib
+#define
je_malloc_stats_print __malloc_stats_print
 #define je_allocm __allocm
 #define je_rallocm __rallocm
 #define je_sallocm __sallocm
@@ -126,17 +136,22 @@ extern int __isthreaded;
  */
 __weak_reference(__malloc, malloc);
 __weak_reference(__calloc, calloc);
-__weak_reference(__realloc, realloc);
-__weak_reference(__free, free);
 __weak_reference(__posix_memalign, posix_memalign);
 __weak_reference(__aligned_alloc, aligned_alloc);
+__weak_reference(__realloc, realloc);
+__weak_reference(__free, free);
 __weak_reference(__malloc_usable_size, malloc_usable_size);
 __weak_reference(__mallocx, mallocx);
 __weak_reference(__rallocx, rallocx);
 __weak_reference(__xallocx, xallocx);
 __weak_reference(__sallocx, sallocx);
 __weak_reference(__dallocx, dallocx);
+__weak_reference(__sdallocx, sdallocx);
 __weak_reference(__nallocx, nallocx);
+__weak_reference(__mallctl, mallctl);
+__weak_reference(__mallctlnametomib, mallctlnametomib);
+__weak_reference(__mallctlbymib, mallctlbymib);
+__weak_reference(__malloc_stats_print, malloc_stats_print);
 __weak_reference(__allocm, allocm);
 __weak_reference(__rallocm, rallocm);
 __weak_reference(__sallocm, sallocm);
diff --git a/contrib/jemalloc/src/arena.c b/contrib/jemalloc/src/arena.c
index 43733cc..99e20fd 100644
--- a/contrib/jemalloc/src/arena.c
+++ b/contrib/jemalloc/src/arena.c
@@ -4,18 +4,32 @@
 /******************************************************************************/
 /* Data. */
+purge_mode_t opt_purge = PURGE_DEFAULT;
+const char *purge_mode_names[] = {
+	"ratio",
+	"decay",
+	"N/A"
+};
 ssize_t opt_lg_dirty_mult = LG_DIRTY_MULT_DEFAULT;
 static ssize_t lg_dirty_mult_default;
+ssize_t opt_decay_time = DECAY_TIME_DEFAULT;
+static ssize_t decay_time_default;
+
 arena_bin_info_t arena_bin_info[NBINS];
 size_t map_bias;
 size_t map_misc_offset;
 size_t arena_maxrun; /* Max run size for arenas. */
 size_t large_maxclass; /* Max large size class. */
-static size_t small_maxrun; /* Max run size used for small size classes.
*/
+size_t run_quantize_max; /* Max run_quantize_*() input. */
+static size_t small_maxrun; /* Max run size for small size classes. */
 static bool *small_run_tab; /* Valid small run page multiples. */
+static size_t *run_quantize_floor_tab; /* run_quantize_floor() memoization. */
+static size_t *run_quantize_ceil_tab; /* run_quantize_ceil() memoization. */
 unsigned nlclasses; /* Number of large size classes. */
 unsigned nhclasses; /* Number of huge size classes. */
+static szind_t runs_avail_bias; /* Size index for first runs_avail tree. */
+static szind_t runs_avail_nclasses; /* Number of runs_avail trees. */
 /******************************************************************************/
 /*
@@ -23,7 +37,7 @@ unsigned nhclasses; /* Number of huge size classes. */
  * definition.
  */
-static void arena_purge(arena_t *arena, bool all);
+static void arena_purge_to_limit(arena_t *arena, size_t ndirty_limit);
 static void arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty,
     bool cleaned, bool decommitted);
 static void arena_dalloc_bin_run(arena_t *arena, arena_chunk_t *chunk,
@@ -33,42 +47,12 @@ static void arena_bin_lower_run(arena_t *arena, arena_chunk_t *chunk,
 /******************************************************************************/
-#define CHUNK_MAP_KEY ((uintptr_t)0x1U)
-
-JEMALLOC_INLINE_C arena_chunk_map_misc_t *
-arena_miscelm_key_create(size_t size)
-{
-
-	return ((arena_chunk_map_misc_t *)(arena_mapbits_size_encode(size) |
-	    CHUNK_MAP_KEY));
-}
-
-JEMALLOC_INLINE_C bool
-arena_miscelm_is_key(const arena_chunk_map_misc_t *miscelm)
-{
-
-	return (((uintptr_t)miscelm & CHUNK_MAP_KEY) != 0);
-}
-
-#undef CHUNK_MAP_KEY
-
-JEMALLOC_INLINE_C size_t
-arena_miscelm_key_size_get(const arena_chunk_map_misc_t *miscelm)
-{
-
-	assert(arena_miscelm_is_key(miscelm));
-
-	return (arena_mapbits_size_decode((uintptr_t)miscelm));
-}
-
 JEMALLOC_INLINE_C size_t
-arena_miscelm_size_get(arena_chunk_map_misc_t *miscelm)
+arena_miscelm_size_get(const
arena_chunk_map_misc_t *miscelm)
 {
 	arena_chunk_t *chunk;
 	size_t pageind, mapbits;
-	assert(!arena_miscelm_is_key(miscelm));
-
 	chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(miscelm);
 	pageind = arena_miscelm_to_pageind(miscelm);
 	mapbits = arena_mapbits_get(chunk, pageind);
@@ -76,7 +60,8 @@ arena_miscelm_size_get(arena_chunk_map_misc_t *miscelm)
 }
 JEMALLOC_INLINE_C int
-arena_run_comp(arena_chunk_map_misc_t *a, arena_chunk_map_misc_t *b)
+arena_run_addr_comp(const arena_chunk_map_misc_t *a,
+    const arena_chunk_map_misc_t *b)
 {
 	uintptr_t a_miscelm = (uintptr_t)a;
 	uintptr_t b_miscelm = (uintptr_t)b;
@@ -89,10 +74,10 @@ arena_run_comp(arena_chunk_map_misc_t *a, arena_chunk_map_misc_t *b)
 /* Generate red-black tree functions. */
 rb_gen(static UNUSED, arena_run_tree_, arena_run_tree_t, arena_chunk_map_misc_t,
-    rb_link, arena_run_comp)
+    rb_link, arena_run_addr_comp)
 static size_t
-run_quantize(size_t size)
+run_quantize_floor_compute(size_t size)
 {
 	size_t qsize;
@@ -110,13 +95,13 @@ run_quantize(size_t size)
 	 */
 	qsize = index2size(size2index(size - large_pad + 1) - 1) + large_pad;
 	if (qsize <= SMALL_MAXCLASS + large_pad)
-		return (run_quantize(size - large_pad));
+		return (run_quantize_floor_compute(size - large_pad));
 	assert(qsize <= size);
 	return (qsize);
 }
 static size_t
-run_quantize_next(size_t size)
+run_quantize_ceil_compute_hard(size_t size)
 {
 	size_t large_run_size_next;
@@ -150,9 +135,9 @@ run_quantize_next(size_t size)
 }
 static size_t
-run_quantize_first(size_t size)
+run_quantize_ceil_compute(size_t size)
 {
-	size_t qsize = run_quantize(size);
+	size_t qsize = run_quantize_floor_compute(size);
 	if (qsize < size) {
 		/*
@@ -163,65 +148,89 @@ run_quantize_first(size_t size)
 		 * search would potentially find sufficiently aligned available
 		 * memory somewhere lower.
 		 */
-		qsize = run_quantize_next(size);
+		qsize = run_quantize_ceil_compute_hard(qsize);
 	}
 	return (qsize);
 }
-JEMALLOC_INLINE_C int
-arena_avail_comp(arena_chunk_map_misc_t *a, arena_chunk_map_misc_t *b)
+#ifdef JEMALLOC_JET
+#undef run_quantize_floor
+#define run_quantize_floor JEMALLOC_N(run_quantize_floor_impl)
+#endif
+static size_t
+run_quantize_floor(size_t size)
 {
-	int ret;
-	uintptr_t a_miscelm = (uintptr_t)a;
-	size_t a_qsize = run_quantize(arena_miscelm_is_key(a) ?
-	    arena_miscelm_key_size_get(a) : arena_miscelm_size_get(a));
-	size_t b_qsize = run_quantize(arena_miscelm_size_get(b));
+	size_t ret;
-	/*
-	 * Compare based on quantized size rather than size, in order to sort
-	 * equally useful runs only by address.
-	 */
-	ret = (a_qsize > b_qsize) - (a_qsize < b_qsize);
-	if (ret == 0) {
-		if (!arena_miscelm_is_key(a)) {
-			uintptr_t b_miscelm = (uintptr_t)b;
+	assert(size > 0);
+	assert(size <= run_quantize_max);
+	assert((size & PAGE_MASK) == 0);
-			ret = (a_miscelm > b_miscelm) - (a_miscelm < b_miscelm);
-		} else {
-			/*
-			 * Treat keys as if they are lower than anything else.
-			 */
-			ret = -1;
-		}
-	}
+	ret = run_quantize_floor_tab[(size >> LG_PAGE) - 1];
+	assert(ret == run_quantize_floor_compute(size));
+	return (ret);
+}
+#ifdef JEMALLOC_JET
+#undef run_quantize_floor
+#define run_quantize_floor JEMALLOC_N(run_quantize_floor)
+run_quantize_t *run_quantize_floor = JEMALLOC_N(run_quantize_floor_impl);
+#endif
+
+#ifdef JEMALLOC_JET
+#undef run_quantize_ceil
+#define run_quantize_ceil JEMALLOC_N(run_quantize_ceil_impl)
+#endif
+static size_t
+run_quantize_ceil(size_t size)
+{
+	size_t ret;
+
+	assert(size > 0);
+	assert(size <= run_quantize_max);
+	assert((size & PAGE_MASK) == 0);
+	ret = run_quantize_ceil_tab[(size >> LG_PAGE) - 1];
+	assert(ret == run_quantize_ceil_compute(size));
 	return (ret);
 }
+#ifdef JEMALLOC_JET
+#undef run_quantize_ceil
+#define run_quantize_ceil JEMALLOC_N(run_quantize_ceil)
+run_quantize_t *run_quantize_ceil = JEMALLOC_N(run_quantize_ceil_impl);
+#endif
-/* Generate red-black tree functions. */
-rb_gen(static UNUSED, arena_avail_tree_, arena_avail_tree_t,
-    arena_chunk_map_misc_t, rb_link, arena_avail_comp)
+static arena_run_tree_t *
+arena_runs_avail_get(arena_t *arena, szind_t ind)
+{
+
+	assert(ind >= runs_avail_bias);
+	assert(ind - runs_avail_bias < runs_avail_nclasses);
+
+	return (&arena->runs_avail[ind - runs_avail_bias]);
+}
 static void
 arena_avail_insert(arena_t *arena, arena_chunk_t *chunk, size_t pageind,
     size_t npages)
 {
-
+	szind_t ind = size2index(run_quantize_floor(arena_miscelm_size_get(
+	    arena_miscelm_get(chunk, pageind))));
 	assert(npages == (arena_mapbits_unallocated_size_get(chunk, pageind) >>
 	    LG_PAGE));
-	arena_avail_tree_insert(&arena->runs_avail, arena_miscelm_get(chunk,
-	    pageind));
+	arena_run_tree_insert(arena_runs_avail_get(arena, ind),
+	    arena_miscelm_get(chunk, pageind));
 }
 static void
 arena_avail_remove(arena_t *arena, arena_chunk_t *chunk, size_t pageind,
     size_t npages)
 {
-
+	szind_t ind = size2index(run_quantize_floor(arena_miscelm_size_get(
+	    arena_miscelm_get(chunk,
pageind))));
 	assert(npages == (arena_mapbits_unallocated_size_get(chunk, pageind) >>
 	    LG_PAGE));
-	arena_avail_tree_remove(&arena->runs_avail, arena_miscelm_get(chunk,
-	    pageind));
+	arena_run_tree_remove(arena_runs_avail_get(arena, ind),
+	    arena_miscelm_get(chunk, pageind));
 }
 static void
@@ -292,14 +301,14 @@ JEMALLOC_INLINE_C void *
 arena_run_reg_alloc(arena_run_t *run, arena_bin_info_t *bin_info)
 {
 	void *ret;
-	unsigned regind;
+	size_t regind;
 	arena_chunk_map_misc_t *miscelm;
 	void *rpages;
 	assert(run->nfree > 0);
 	assert(!bitmap_full(run->bitmap, &bin_info->bitmap_info));
-	regind = (unsigned)bitmap_sfu(run->bitmap, &bin_info->bitmap_info);
 	miscelm = arena_run_to_miscelm(run);
 	rpages = arena_miscelm_to_rpages(miscelm);
 	ret = (void *)((uintptr_t)rpages + (uintptr_t)bin_info->reg0_offset +
@@ -316,7 +325,7 @@ arena_run_reg_dalloc(arena_run_t *run, void *ptr)
 	size_t mapbits = arena_mapbits_get(chunk, pageind);
 	szind_t binind = arena_ptr_small_binind_get(ptr, mapbits);
 	arena_bin_info_t *bin_info = &arena_bin_info[binind];
-	unsigned regind = arena_run_regind(run, bin_info, ptr);
+	size_t regind = arena_run_regind(run, bin_info, ptr);
 	assert(run->nfree < bin_info->nregs);
 	/* Freeing an interior pointer can cause assertion failure.
*/
@@ -364,16 +373,30 @@ arena_run_page_validate_zeroed(arena_chunk_t *chunk, size_t run_ind)
 }
 static void
-arena_cactive_update(arena_t *arena, size_t add_pages, size_t sub_pages)
+arena_nactive_add(arena_t *arena, size_t add_pages)
 {
 	if (config_stats) {
-		ssize_t cactive_diff = CHUNK_CEILING((arena->nactive + add_pages
-		    - sub_pages) << LG_PAGE) - CHUNK_CEILING(arena->nactive <<
+		size_t cactive_add = CHUNK_CEILING((arena->nactive +
+		    add_pages) << LG_PAGE) - CHUNK_CEILING(arena->nactive <<
 		    LG_PAGE);
-		if (cactive_diff != 0)
-			stats_cactive_add(cactive_diff);
+		if (cactive_add != 0)
+			stats_cactive_add(cactive_add);
+	}
+	arena->nactive += add_pages;
+}
+
+static void
+arena_nactive_sub(arena_t *arena, size_t sub_pages)
+{
+
+	if (config_stats) {
+		size_t cactive_sub = CHUNK_CEILING(arena->nactive << LG_PAGE) -
+		    CHUNK_CEILING((arena->nactive - sub_pages) << LG_PAGE);
+		if (cactive_sub != 0)
+			stats_cactive_sub(cactive_sub);
 	}
+	arena->nactive -= sub_pages;
 }
 static void
@@ -394,8 +417,7 @@ arena_run_split_remove(arena_t *arena, arena_chunk_t *chunk, size_t run_ind,
 	arena_avail_remove(arena, chunk, run_ind, total_pages);
 	if (flag_dirty != 0)
 		arena_run_dirty_remove(arena, chunk, run_ind, total_pages);
-	arena_cactive_update(arena, need_pages, 0);
-	arena->nactive += need_pages;
+	arena_nactive_add(arena, need_pages);
 	/* Keep track of trailing unused pages for later use. */
 	if (rem_pages > 0) {
@@ -711,7 +733,6 @@ arena_chunk_alloc(arena_t *arena)
 			return (NULL);
 	}
-	/* Insert the run into the runs_avail tree. */
 	arena_avail_insert(arena, chunk, map_bias, chunk_npages-map_bias);
 	return (chunk);
@@ -732,10 +753,7 @@ arena_chunk_dalloc(arena_t *arena, arena_chunk_t *chunk)
 	assert(arena_mapbits_decommitted_get(chunk, map_bias) ==
 	    arena_mapbits_decommitted_get(chunk, chunk_npages-1));
-	/*
-	 * Remove run from the runs_avail tree, so that the arena does not use
-	 * it.
-	 */
+	/* Remove run from runs_avail, so that the arena does not use it.
*/
 	arena_avail_remove(arena, chunk, map_bias, chunk_npages-map_bias);
 	if (arena->spare != NULL) {
@@ -888,7 +906,7 @@ arena_chunk_alloc_huge_hard(arena_t *arena, chunk_hooks_t *chunk_hooks,
 			arena_huge_malloc_stats_update_undo(arena, usize);
 			arena->stats.mapped -= usize;
 		}
-		arena->nactive -= (usize >> LG_PAGE);
+		arena_nactive_sub(arena, usize >> LG_PAGE);
 		malloc_mutex_unlock(&arena->lock);
 	}
@@ -910,7 +928,7 @@ arena_chunk_alloc_huge(arena_t *arena, size_t usize, size_t alignment,
 		arena_huge_malloc_stats_update(arena, usize);
 		arena->stats.mapped += usize;
 	}
-	arena->nactive += (usize >> LG_PAGE);
+	arena_nactive_add(arena, usize >> LG_PAGE);
 	ret = chunk_alloc_cache(arena, &chunk_hooks, NULL, csize, alignment,
 	    zero, true);
@@ -920,8 +938,6 @@ arena_chunk_alloc_huge(arena_t *arena, size_t usize, size_t alignment,
 		    alignment, zero, csize);
 	}
-	if (config_stats && ret != NULL)
-		stats_cactive_add(usize);
 	return (ret);
 }
@@ -936,9 +952,8 @@ arena_chunk_dalloc_huge(arena_t *arena, void *chunk, size_t usize)
 	if (config_stats) {
 		arena_huge_dalloc_stats_update(arena, usize);
 		arena->stats.mapped -= usize;
-		stats_cactive_sub(usize);
 	}
-	arena->nactive -= (usize >> LG_PAGE);
+	arena_nactive_sub(arena, usize >> LG_PAGE);
 	chunk_dalloc_cache(arena, &chunk_hooks, chunk, csize, true);
 	malloc_mutex_unlock(&arena->lock);
@@ -955,17 +970,10 @@ arena_chunk_ralloc_huge_similar(arena_t *arena, void *chunk, size_t oldsize,
 	malloc_mutex_lock(&arena->lock);
 	if (config_stats)
 		arena_huge_ralloc_stats_update(arena, oldsize, usize);
-	if (oldsize < usize) {
-		size_t udiff = usize - oldsize;
-		arena->nactive += udiff >> LG_PAGE;
-		if (config_stats)
-			stats_cactive_add(udiff);
-	} else {
-		size_t udiff = oldsize - usize;
-		arena->nactive -= udiff >> LG_PAGE;
-		if (config_stats)
-			stats_cactive_sub(udiff);
-	}
+	if (oldsize < usize)
+		arena_nactive_add(arena, (usize - oldsize) >> LG_PAGE);
+	else
+		arena_nactive_sub(arena, (oldsize - usize) >> LG_PAGE);
 	malloc_mutex_unlock(&arena->lock);
 }
@@
-979,12 +987,10 @@ arena_chunk_ralloc_huge_shrink(arena_t *arena, void *chunk, size_t oldsize,
 	malloc_mutex_lock(&arena->lock);
 	if (config_stats) {
 		arena_huge_ralloc_stats_update(arena, oldsize, usize);
-		if (cdiff != 0) {
+		if (cdiff != 0)
 			arena->stats.mapped -= cdiff;
-			stats_cactive_sub(udiff);
-		}
 	}
-	arena->nactive -= udiff >> LG_PAGE;
+	arena_nactive_sub(arena, udiff >> LG_PAGE);
 	if (cdiff != 0) {
 		chunk_hooks_t chunk_hooks = CHUNK_HOOKS_INITIALIZER;
@@ -1014,7 +1020,7 @@ arena_chunk_ralloc_huge_expand_hard(arena_t *arena, chunk_hooks_t *chunk_hooks,
 			    usize);
 			arena->stats.mapped -= cdiff;
 		}
-		arena->nactive -= (udiff >> LG_PAGE);
+		arena_nactive_sub(arena, udiff >> LG_PAGE);
 		malloc_mutex_unlock(&arena->lock);
 	} else if (chunk_hooks->merge(chunk, CHUNK_CEILING(oldsize), nchunk,
 	    cdiff, true, arena->ind)) {
@@ -1042,7 +1048,7 @@ arena_chunk_ralloc_huge_expand(arena_t *arena, void *chunk, size_t oldsize,
 		arena_huge_ralloc_stats_update(arena, oldsize, usize);
 		arena->stats.mapped += cdiff;
 	}
-	arena->nactive += (udiff >> LG_PAGE);
+	arena_nactive_add(arena, udiff >> LG_PAGE);
 	err = (chunk_alloc_cache(arena, &arena->chunk_hooks, nchunk, cdiff,
 	    chunksize, zero, true) == NULL);
@@ -1058,26 +1064,28 @@ arena_chunk_ralloc_huge_expand(arena_t *arena, void *chunk, size_t oldsize,
 		err = true;
 	}
-	if (config_stats && !err)
-		stats_cactive_add(udiff);
 	return (err);
 }
 /*
  * Do first-best-fit run selection, i.e. select the lowest run that best fits.
- * Run sizes are quantized, so not all candidate runs are necessarily exactly
- * the same size.
+ * Run sizes are indexed, so not all candidate runs are necessarily exactly the
+ * same size.
  */
 static arena_run_t *
 arena_run_first_best_fit(arena_t *arena, size_t size)
 {
-	size_t search_size = run_quantize_first(size);
-	arena_chunk_map_misc_t *key = arena_miscelm_key_create(search_size);
-	arena_chunk_map_misc_t *miscelm =
-	    arena_avail_tree_nsearch(&arena->runs_avail, key);
-	if (miscelm == NULL)
-		return (NULL);
-	return (&miscelm->run);
+	szind_t ind, i;
+
+	ind = size2index(run_quantize_ceil(size));
+	for (i = ind; i < runs_avail_nclasses + runs_avail_bias; i++) {
+		arena_chunk_map_misc_t *miscelm = arena_run_tree_first(
+		    arena_runs_avail_get(arena, i));
+		if (miscelm != NULL)
+			return (&miscelm->run);
+	}
+
+	return (NULL);
 }
 static arena_run_t *
@@ -1204,16 +1212,194 @@ arena_lg_dirty_mult_set(arena_t *arena, ssize_t lg_dirty_mult)
 	return (false);
 }
-void
-arena_maybe_purge(arena_t *arena)
+static void
+arena_decay_deadline_init(arena_t *arena)
 {
+	assert(opt_purge == purge_mode_decay);
+
+	/*
+	 * Generate a new deadline that is uniformly random within the next
+	 * epoch after the current one.
+	 */
+	nstime_copy(&arena->decay_deadline, &arena->decay_epoch);
+	nstime_add(&arena->decay_deadline, &arena->decay_interval);
+	if (arena->decay_time > 0) {
+		nstime_t jitter;
+
+		nstime_init(&jitter, prng_range(&arena->decay_jitter_state,
+		    nstime_ns(&arena->decay_interval)));
+		nstime_add(&arena->decay_deadline, &jitter);
+	}
+}
+
+static bool
+arena_decay_deadline_reached(const arena_t *arena, const nstime_t *time)
+{
+
+	assert(opt_purge == purge_mode_decay);
+
+	return (nstime_compare(&arena->decay_deadline, time) <= 0);
+}
+
+static size_t
+arena_decay_backlog_npages_limit(const arena_t *arena)
+{
+	static const uint64_t h_steps[] = {
+#define STEP(step, h, x, y) \
+		h,
+		SMOOTHSTEP
+#undef STEP
+	};
+	uint64_t sum;
+	size_t npages_limit_backlog;
+	unsigned i;
+
+	assert(opt_purge == purge_mode_decay);
+
+	/*
+	 * For each element of decay_backlog, multiply by the corresponding
+	 * fixed-point smoothstep decay factor.
Sum the products, then divide
+	 * to round down to the nearest whole number of pages.
+	 */
+	sum = 0;
+	for (i = 0; i < SMOOTHSTEP_NSTEPS; i++)
+		sum += arena->decay_backlog[i] * h_steps[i];
+	npages_limit_backlog = (sum >> SMOOTHSTEP_BFP);
+
+	return (npages_limit_backlog);
+}
+
+static void
+arena_decay_epoch_advance(arena_t *arena, const nstime_t *time)
+{
+	uint64_t nadvance;
+	nstime_t delta;
+	size_t ndirty_delta;
+
+	assert(opt_purge == purge_mode_decay);
+	assert(arena_decay_deadline_reached(arena, time));
+
+	nstime_copy(&delta, time);
+	nstime_subtract(&delta, &arena->decay_epoch);
+	nadvance = nstime_divide(&delta, &arena->decay_interval);
+	assert(nadvance > 0);
+
+	/* Add nadvance decay intervals to epoch. */
+	nstime_copy(&delta, &arena->decay_interval);
+	nstime_imultiply(&delta, nadvance);
+	nstime_add(&arena->decay_epoch, &delta);
+
+	/* Set a new deadline. */
+	arena_decay_deadline_init(arena);
+
+	/* Update the backlog. */
+	if (nadvance >= SMOOTHSTEP_NSTEPS) {
+		memset(arena->decay_backlog, 0, (SMOOTHSTEP_NSTEPS-1) *
+		    sizeof(size_t));
+	} else {
+		memmove(arena->decay_backlog, &arena->decay_backlog[nadvance],
+		    (SMOOTHSTEP_NSTEPS - nadvance) * sizeof(size_t));
+		if (nadvance > 1) {
+			memset(&arena->decay_backlog[SMOOTHSTEP_NSTEPS -
+			    nadvance], 0, (nadvance-1) * sizeof(size_t));
+		}
+	}
+	ndirty_delta = (arena->ndirty > arena->decay_ndirty) ? arena->ndirty -
+	    arena->decay_ndirty : 0;
+	arena->decay_ndirty = arena->ndirty;
+	arena->decay_backlog[SMOOTHSTEP_NSTEPS-1] = ndirty_delta;
+	arena->decay_backlog_npages_limit =
+	    arena_decay_backlog_npages_limit(arena);
+}
+
+static size_t
+arena_decay_npages_limit(arena_t *arena)
+{
+	size_t npages_limit;
+
+	assert(opt_purge == purge_mode_decay);
+
+	npages_limit = arena->decay_backlog_npages_limit;
+
+	/* Add in any dirty pages created during the current epoch.
*/
+	if (arena->ndirty > arena->decay_ndirty)
+		npages_limit += arena->ndirty - arena->decay_ndirty;
+
+	return (npages_limit);
+}
+
+static void
+arena_decay_init(arena_t *arena, ssize_t decay_time)
+{
+
+	arena->decay_time = decay_time;
+	if (decay_time > 0) {
+		nstime_init2(&arena->decay_interval, decay_time, 0);
+		nstime_idivide(&arena->decay_interval, SMOOTHSTEP_NSTEPS);
+	}
+
+	nstime_init(&arena->decay_epoch, 0);
+	nstime_update(&arena->decay_epoch);
+	arena->decay_jitter_state = (uint64_t)(uintptr_t)arena;
+	arena_decay_deadline_init(arena);
+	arena->decay_ndirty = arena->ndirty;
+	arena->decay_backlog_npages_limit = 0;
+	memset(arena->decay_backlog, 0, SMOOTHSTEP_NSTEPS * sizeof(size_t));
+}
+
+static bool
+arena_decay_time_valid(ssize_t decay_time)
+{
+
+	return (decay_time >= -1 && decay_time <= NSTIME_SEC_MAX);
+}
+
+ssize_t
+arena_decay_time_get(arena_t *arena)
+{
+	ssize_t decay_time;
+
+	malloc_mutex_lock(&arena->lock);
+	decay_time = arena->decay_time;
+	malloc_mutex_unlock(&arena->lock);
+
+	return (decay_time);
+}
+
+bool
+arena_decay_time_set(arena_t *arena, ssize_t decay_time)
+{
+
+	if (!arena_decay_time_valid(decay_time))
+		return (true);
+
+	malloc_mutex_lock(&arena->lock);
+	/*
+	 * Restart decay backlog from scratch, which may cause many dirty pages
+	 * to be immediately purged. It would conceptually be possible to map
+	 * the old backlog onto the new backlog, but there is no justification
+	 * for such complexity since decay_time changes are intended to be
+	 * infrequent, either between the {-1, 0, >0} states, or a one-time
+	 * arbitrary change during initial arena configuration.
+	 */
+	arena_decay_init(arena, decay_time);
+	arena_maybe_purge(arena);
+	malloc_mutex_unlock(&arena->lock);
+
+	return (false);
+}
+
+static void
+arena_maybe_purge_ratio(arena_t *arena)
+{
+
+	assert(opt_purge == purge_mode_ratio);
+
 	/* Don't purge if the option is disabled. */
 	if (arena->lg_dirty_mult < 0)
 		return;
-	/* Don't recursively purge.
*/ - if (arena->purging) - return; + /* * Iterate, since preventing recursive purging could otherwise leave too * many dirty pages. @@ -1228,8 +1414,57 @@ arena_maybe_purge(arena_t *arena) */ if (arena->ndirty <= threshold) return; - arena_purge(arena, false); + arena_purge_to_limit(arena, threshold); + } +} + +static void +arena_maybe_purge_decay(arena_t *arena) +{ + nstime_t time; + size_t ndirty_limit; + + assert(opt_purge == purge_mode_decay); + + /* Purge all or nothing if the option is disabled. */ + if (arena->decay_time <= 0) { + if (arena->decay_time == 0) + arena_purge_to_limit(arena, 0); + return; } + + nstime_copy(&time, &arena->decay_epoch); + if (unlikely(nstime_update(&time))) { + /* Time went backwards. Force an epoch advance. */ + nstime_copy(&time, &arena->decay_deadline); + } + + if (arena_decay_deadline_reached(arena, &time)) + arena_decay_epoch_advance(arena, &time); + + ndirty_limit = arena_decay_npages_limit(arena); + + /* + * Don't try to purge unless the number of purgeable pages exceeds the + * current limit. + */ + if (arena->ndirty <= ndirty_limit) + return; + arena_purge_to_limit(arena, ndirty_limit); +} + +void +arena_maybe_purge(arena_t *arena) +{ + + /* Don't recursively purge. */ + if (arena->purging) + return; + + if (opt_purge == purge_mode_ratio) + arena_maybe_purge_ratio(arena); + else + arena_maybe_purge_decay(arena); } static size_t @@ -1267,35 +1502,15 @@ arena_dirty_count(arena_t *arena) } static size_t -arena_compute_npurge(arena_t *arena, bool all) -{ - size_t npurge; - - /* - * Compute the minimum number of pages that this thread should try to - * purge. - */ - if (!all) { - size_t threshold = (arena->nactive >> arena->lg_dirty_mult); - threshold = threshold < chunk_npages ? 
chunk_npages : threshold; - - npurge = arena->ndirty - threshold; - } else - npurge = arena->ndirty; - - return (npurge); -} - -static size_t -arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, bool all, - size_t npurge, arena_runs_dirty_link_t *purge_runs_sentinel, +arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, + size_t ndirty_limit, arena_runs_dirty_link_t *purge_runs_sentinel, extent_node_t *purge_chunks_sentinel) { arena_runs_dirty_link_t *rdelm, *rdelm_next; extent_node_t *chunkselm; size_t nstashed = 0; - /* Stash at least npurge pages. */ + /* Stash runs/chunks according to ndirty_limit. */ for (rdelm = qr_next(&arena->runs_dirty, rd_link), chunkselm = qr_next(&arena->chunks_cache, cc_link); rdelm != &arena->runs_dirty; rdelm = rdelm_next) { @@ -1307,6 +1522,11 @@ arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, bool all, bool zero; UNUSED void *chunk; + npages = extent_node_size_get(chunkselm) >> LG_PAGE; + if (opt_purge == purge_mode_decay && arena->ndirty - + (nstashed + npages) < ndirty_limit) + break; + chunkselm_next = qr_next(chunkselm, cc_link); /* * Allocate. 
chunkselm remains valid due to the @@ -1321,7 +1541,8 @@ arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, bool all, assert(zero == extent_node_zeroed_get(chunkselm)); extent_node_dirty_insert(chunkselm, purge_runs_sentinel, purge_chunks_sentinel); - npages = extent_node_size_get(chunkselm) >> LG_PAGE; + assert(npages == (extent_node_size_get(chunkselm) >> + LG_PAGE)); chunkselm = chunkselm_next; } else { arena_chunk_t *chunk = @@ -1334,6 +1555,9 @@ arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, bool all, arena_mapbits_unallocated_size_get(chunk, pageind); npages = run_size >> LG_PAGE; + if (opt_purge == purge_mode_decay && arena->ndirty - + (nstashed + npages) < ndirty_limit) + break; assert(pageind + npages <= chunk_npages); assert(arena_mapbits_dirty_get(chunk, pageind) == @@ -1359,7 +1583,8 @@ arena_stash_dirty(arena_t *arena, chunk_hooks_t *chunk_hooks, bool all, } nstashed += npages; - if (!all && nstashed >= npurge) + if (opt_purge == purge_mode_ratio && arena->ndirty - nstashed <= + ndirty_limit) break; } @@ -1499,11 +1724,20 @@ arena_unstash_purged(arena_t *arena, chunk_hooks_t *chunk_hooks, } } +/* + * NB: ndirty_limit is interpreted differently depending on opt_purge: + * - purge_mode_ratio: Purge as few dirty run/chunks as possible to reach the + * desired state: + * (arena->ndirty <= ndirty_limit) + * - purge_mode_decay: Purge as many dirty runs/chunks as possible without + * violating the invariant: + * (arena->ndirty >= ndirty_limit) + */ static void -arena_purge(arena_t *arena, bool all) +arena_purge_to_limit(arena_t *arena, size_t ndirty_limit) { chunk_hooks_t chunk_hooks = chunk_hooks_get(arena); - size_t npurge, npurgeable, npurged; + size_t npurge, npurged; arena_runs_dirty_link_t purge_runs_sentinel; extent_node_t purge_chunks_sentinel; @@ -1517,33 +1751,38 @@ arena_purge(arena_t *arena, bool all) size_t ndirty = arena_dirty_count(arena); assert(ndirty == arena->ndirty); } - assert((arena->nactive >> 
arena->lg_dirty_mult) < arena->ndirty || all); + assert(opt_purge != purge_mode_ratio || (arena->nactive >> + arena->lg_dirty_mult) < arena->ndirty || ndirty_limit == 0); - if (config_stats) - arena->stats.npurge++; - - npurge = arena_compute_npurge(arena, all); qr_new(&purge_runs_sentinel, rd_link); extent_node_dirty_linkage_init(&purge_chunks_sentinel); - npurgeable = arena_stash_dirty(arena, &chunk_hooks, all, npurge, + npurge = arena_stash_dirty(arena, &chunk_hooks, ndirty_limit, &purge_runs_sentinel, &purge_chunks_sentinel); - assert(npurgeable >= npurge); + if (npurge == 0) + goto label_return; npurged = arena_purge_stashed(arena, &chunk_hooks, &purge_runs_sentinel, &purge_chunks_sentinel); - assert(npurged == npurgeable); + assert(npurged == npurge); arena_unstash_purged(arena, &chunk_hooks, &purge_runs_sentinel, &purge_chunks_sentinel); + if (config_stats) + arena->stats.npurge++; + +label_return: arena->purging = false; } void -arena_purge_all(arena_t *arena) +arena_purge(arena_t *arena, bool all) { malloc_mutex_lock(&arena->lock); - arena_purge(arena, true); + if (all) + arena_purge_to_limit(arena, 0); + else + arena_maybe_purge(arena); malloc_mutex_unlock(&arena->lock); } @@ -1660,18 +1899,6 @@ arena_run_size_get(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, return (size); } -static bool -arena_run_decommit(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run) -{ - arena_chunk_map_misc_t *miscelm = arena_run_to_miscelm(run); - size_t run_ind = arena_miscelm_to_pageind(miscelm); - size_t offset = run_ind << LG_PAGE; - size_t length = arena_run_size_get(arena, chunk, run, run_ind); - - return (arena->chunk_hooks.decommit(chunk, chunksize, offset, length, - arena->ind)); -} - static void arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty, bool cleaned, bool decommitted) @@ -1687,8 +1914,7 @@ arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty, bool cleaned, assert(run_ind < chunk_npages); size = arena_run_size_get(arena, 
chunk, run, run_ind); run_pages = (size >> LG_PAGE); - arena_cactive_update(arena, 0, run_pages); - arena->nactive -= run_pages; + arena_nactive_sub(arena, run_pages); /* * The run is dirty if the caller claims to have dirtied it, as well as @@ -1750,15 +1976,6 @@ arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty, bool cleaned, } static void -arena_run_dalloc_decommit(arena_t *arena, arena_chunk_t *chunk, - arena_run_t *run) -{ - bool committed = arena_run_decommit(arena, chunk, run); - - arena_run_dalloc(arena, run, committed, false, !committed); -} - -static void arena_run_trim_head(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, size_t oldsize, size_t newsize) { @@ -1986,8 +2203,8 @@ arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin) } void -arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, szind_t binind, - uint64_t prof_accumbytes) +arena_tcache_fill_small(tsd_t *tsd, arena_t *arena, tcache_bin_t *tbin, + szind_t binind, uint64_t prof_accumbytes) { unsigned i, nfill; arena_bin_t *bin; @@ -2010,11 +2227,10 @@ arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, szind_t binind, /* * OOM. tbin->avail isn't yet filled down to its first * element, so the successful allocations (if any) must - * be moved to the base of tbin->avail before bailing - * out. + * be moved just before tbin->avail before bailing out. */ if (i > 0) { - memmove(tbin->avail, &tbin->avail[nfill - i], + memmove(tbin->avail - i, tbin->avail - nfill, i * sizeof(void *)); } break; @@ -2024,7 +2240,7 @@ arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, szind_t binind, true); } /* Insert such that low regions get used first. 
*/ - tbin->avail[nfill - 1 - i] = ptr; + *(tbin->avail - nfill + i) = ptr; } if (config_stats) { bin->stats.nmalloc += i; @@ -2035,6 +2251,7 @@ arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, szind_t binind, } malloc_mutex_unlock(&bin->lock); tbin->ncached = i; + arena_decay_tick(tsd, arena); } void @@ -2144,18 +2361,17 @@ arena_quarantine_junk_small(void *ptr, size_t usize) arena_redzones_validate(ptr, bin_info, true); } -void * -arena_malloc_small(arena_t *arena, size_t size, bool zero) +static void * +arena_malloc_small(tsd_t *tsd, arena_t *arena, szind_t binind, bool zero) { void *ret; arena_bin_t *bin; + size_t usize; arena_run_t *run; - szind_t binind; - binind = size2index(size); assert(binind < NBINS); bin = &arena->bins[binind]; - size = index2size(binind); + usize = index2size(binind); malloc_mutex_lock(&bin->lock); if ((run = bin->runcur) != NULL && run->nfree > 0) @@ -2174,7 +2390,7 @@ arena_malloc_small(arena_t *arena, size_t size, bool zero) bin->stats.curregs++; } malloc_mutex_unlock(&bin->lock); - if (config_prof && !isthreaded && arena_prof_accum(arena, size)) + if (config_prof && !isthreaded && arena_prof_accum(arena, usize)) prof_idump(); if (!zero) { @@ -2183,23 +2399,24 @@ arena_malloc_small(arena_t *arena, size_t size, bool zero) arena_alloc_junk_small(ret, &arena_bin_info[binind], false); } else if (unlikely(opt_zero)) - memset(ret, 0, size); + memset(ret, 0, usize); } - JEMALLOC_VALGRIND_MAKE_MEM_UNDEFINED(ret, size); + JEMALLOC_VALGRIND_MAKE_MEM_UNDEFINED(ret, usize); } else { if (config_fill && unlikely(opt_junk_alloc)) { arena_alloc_junk_small(ret, &arena_bin_info[binind], true); } - JEMALLOC_VALGRIND_MAKE_MEM_UNDEFINED(ret, size); - memset(ret, 0, size); + JEMALLOC_VALGRIND_MAKE_MEM_UNDEFINED(ret, usize); + memset(ret, 0, usize); } + arena_decay_tick(tsd, arena); return (ret); } void * -arena_malloc_large(arena_t *arena, size_t size, bool zero) +arena_malloc_large(tsd_t *tsd, arena_t *arena, szind_t binind, bool zero) { void 
*ret; size_t usize; @@ -2209,7 +2426,7 @@ arena_malloc_large(arena_t *arena, size_t size, bool zero) UNUSED bool idump; /* Large allocation. */ - usize = s2u(size); + usize = index2size(binind); malloc_mutex_lock(&arena->lock); if (config_cache_oblivious) { uint64_t r; @@ -2219,9 +2436,7 @@ arena_malloc_large(arena_t *arena, size_t size, bool zero) * that is a multiple of the cacheline size, e.g. [0 .. 63) * 64 * for 4 KiB pages and 64-byte cachelines. */ - prng64(r, LG_PAGE - LG_CACHELINE, arena->offset_state, - UINT64_C(6364136223846793009), - UINT64_C(1442695040888963409)); + r = prng_lg_range(&arena->offset_state, LG_PAGE - LG_CACHELINE); random_offset = ((uintptr_t)r) << LG_CACHELINE; } else random_offset = 0; @@ -2234,7 +2449,7 @@ arena_malloc_large(arena_t *arena, size_t size, bool zero) ret = (void *)((uintptr_t)arena_miscelm_to_rpages(miscelm) + random_offset); if (config_stats) { - szind_t index = size2index(usize) - NBINS; + szind_t index = binind - NBINS; arena->stats.nmalloc_large++; arena->stats.nrequests_large++; @@ -2258,9 +2473,26 @@ arena_malloc_large(arena_t *arena, size_t size, bool zero) } } + arena_decay_tick(tsd, arena); return (ret); } +void * +arena_malloc_hard(tsd_t *tsd, arena_t *arena, size_t size, szind_t ind, + bool zero, tcache_t *tcache) +{ + + arena = arena_choose(tsd, arena); + if (unlikely(arena == NULL)) + return (NULL); + + if (likely(size <= SMALL_MAXCLASS)) + return (arena_malloc_small(tsd, arena, ind, zero)); + if (likely(size <= large_maxclass)) + return (arena_malloc_large(tsd, arena, ind, zero)); + return (huge_malloc(tsd, arena, index2size(ind), zero, tcache)); +} + /* Only handles large allocations that require more than page alignment. 
*/ static void * arena_palloc_large(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, @@ -2344,6 +2576,7 @@ arena_palloc_large(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, else if (unlikely(opt_zero)) memset(ret, 0, usize); } + arena_decay_tick(tsd, arena); return (ret); } @@ -2356,7 +2589,8 @@ arena_palloc(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, if (usize <= SMALL_MAXCLASS && (alignment < PAGE || (alignment == PAGE && (usize & PAGE_MASK) == 0))) { /* Small; alignment doesn't require special run placement. */ - ret = arena_malloc(tsd, arena, usize, zero, tcache); + ret = arena_malloc(tsd, arena, usize, size2index(usize), zero, + tcache, true); } else if (usize <= large_maxclass && alignment <= PAGE) { /* * Large; alignment doesn't require special run placement. @@ -2364,7 +2598,8 @@ arena_palloc(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment, * the base of the run, so do some bit manipulation to retrieve * the base. */ - ret = arena_malloc(tsd, arena, usize, zero, tcache); + ret = arena_malloc(tsd, arena, usize, size2index(usize), zero, + tcache, true); if (config_cache_oblivious) ret = (void *)((uintptr_t)ret & ~PAGE_MASK); } else { @@ -2441,7 +2676,7 @@ arena_dalloc_bin_run(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, malloc_mutex_unlock(&bin->lock); /******************************/ malloc_mutex_lock(&arena->lock); - arena_run_dalloc_decommit(arena, chunk, run); + arena_run_dalloc(arena, run, true, false, false); malloc_mutex_unlock(&arena->lock); /****************************/ malloc_mutex_lock(&bin->lock); @@ -2528,7 +2763,7 @@ arena_dalloc_bin(arena_t *arena, arena_chunk_t *chunk, void *ptr, } void -arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr, +arena_dalloc_small(tsd_t *tsd, arena_t *arena, arena_chunk_t *chunk, void *ptr, size_t pageind) { arena_chunk_map_bits_t *bitselm; @@ -2540,6 +2775,7 @@ arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr, } bitselm 
= arena_bitselm_get(chunk, pageind); arena_dalloc_bin(arena, chunk, ptr, pageind, bitselm); + arena_decay_tick(tsd, arena); } #ifdef JEMALLOC_JET @@ -2584,7 +2820,7 @@ arena_dalloc_large_locked_impl(arena_t *arena, arena_chunk_t *chunk, } } - arena_run_dalloc_decommit(arena, chunk, run); + arena_run_dalloc(arena, run, true, false, false); } void @@ -2596,12 +2832,13 @@ arena_dalloc_large_junked_locked(arena_t *arena, arena_chunk_t *chunk, } void -arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr) +arena_dalloc_large(tsd_t *tsd, arena_t *arena, arena_chunk_t *chunk, void *ptr) { malloc_mutex_lock(&arena->lock); arena_dalloc_large_locked_impl(arena, chunk, ptr, false); malloc_mutex_unlock(&arena->lock); + arena_decay_tick(tsd, arena); } static void @@ -2802,14 +3039,22 @@ arena_ralloc_large(void *ptr, size_t oldsize, size_t usize_min, } bool -arena_ralloc_no_move(void *ptr, size_t oldsize, size_t size, size_t extra, - bool zero) +arena_ralloc_no_move(tsd_t *tsd, void *ptr, size_t oldsize, size_t size, + size_t extra, bool zero) { size_t usize_min, usize_max; + /* Calls with non-zero extra had to clamp extra. */ + assert(extra == 0 || size + extra <= HUGE_MAXCLASS); + + if (unlikely(size > HUGE_MAXCLASS)) + return (true); + usize_min = s2u(size); usize_max = s2u(size + extra); if (likely(oldsize <= large_maxclass && usize_min <= large_maxclass)) { + arena_chunk_t *chunk; + /* * Avoid moving the allocation if the size class can be left the * same. 
@@ -2817,23 +3062,24 @@ arena_ralloc_no_move(void *ptr, size_t oldsize, size_t size, size_t extra, if (oldsize <= SMALL_MAXCLASS) { assert(arena_bin_info[size2index(oldsize)].reg_size == oldsize); - if ((usize_max <= SMALL_MAXCLASS && - size2index(usize_max) == size2index(oldsize)) || - (size <= oldsize && usize_max >= oldsize)) - return (false); + if ((usize_max > SMALL_MAXCLASS || + size2index(usize_max) != size2index(oldsize)) && + (size > oldsize || usize_max < oldsize)) + return (true); } else { - if (usize_max > SMALL_MAXCLASS) { - if (!arena_ralloc_large(ptr, oldsize, usize_min, - usize_max, zero)) - return (false); - } + if (usize_max <= SMALL_MAXCLASS) + return (true); + if (arena_ralloc_large(ptr, oldsize, usize_min, + usize_max, zero)) + return (true); } - /* Reallocation would require a move. */ - return (true); + chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); + arena_decay_tick(tsd, extent_node_arena_get(&chunk->node)); + return (false); } else { - return (huge_ralloc_no_move(ptr, oldsize, usize_min, usize_max, - zero)); + return (huge_ralloc_no_move(tsd, ptr, oldsize, usize_min, + usize_max, zero)); } } @@ -2843,9 +3089,10 @@ arena_ralloc_move_helper(tsd_t *tsd, arena_t *arena, size_t usize, { if (alignment == 0) - return (arena_malloc(tsd, arena, usize, zero, tcache)); + return (arena_malloc(tsd, arena, usize, size2index(usize), zero, + tcache, true)); usize = sa2u(usize, alignment); - if (usize == 0) + if (unlikely(usize == 0 || usize > HUGE_MAXCLASS)) return (NULL); return (ipalloct(tsd, usize, alignment, zero, tcache, arena)); } @@ -2858,14 +3105,14 @@ arena_ralloc(tsd_t *tsd, arena_t *arena, void *ptr, size_t oldsize, size_t size, size_t usize; usize = s2u(size); - if (usize == 0) + if (unlikely(usize == 0 || size > HUGE_MAXCLASS)) return (NULL); if (likely(usize <= large_maxclass)) { size_t copysize; /* Try to avoid moving the allocation. 
*/ - if (!arena_ralloc_no_move(ptr, oldsize, usize, 0, zero)) + if (!arena_ralloc_no_move(tsd, ptr, oldsize, usize, 0, zero)) return (ptr); /* @@ -2928,25 +3175,72 @@ bool arena_lg_dirty_mult_default_set(ssize_t lg_dirty_mult) { + if (opt_purge != purge_mode_ratio) + return (true); if (!arena_lg_dirty_mult_valid(lg_dirty_mult)) return (true); atomic_write_z((size_t *)&lg_dirty_mult_default, (size_t)lg_dirty_mult); return (false); } -void -arena_stats_merge(arena_t *arena, const char **dss, ssize_t *lg_dirty_mult, - size_t *nactive, size_t *ndirty, arena_stats_t *astats, - malloc_bin_stats_t *bstats, malloc_large_stats_t *lstats, - malloc_huge_stats_t *hstats) +ssize_t +arena_decay_time_default_get(void) { - unsigned i; - malloc_mutex_lock(&arena->lock); + return ((ssize_t)atomic_read_z((size_t *)&decay_time_default)); +} + +bool +arena_decay_time_default_set(ssize_t decay_time) +{ + + if (opt_purge != purge_mode_decay) + return (true); + if (!arena_decay_time_valid(decay_time)) + return (true); + atomic_write_z((size_t *)&decay_time_default, (size_t)decay_time); + return (false); +} + +static void +arena_basic_stats_merge_locked(arena_t *arena, unsigned *nthreads, + const char **dss, ssize_t *lg_dirty_mult, ssize_t *decay_time, + size_t *nactive, size_t *ndirty) +{ + + *nthreads += arena_nthreads_get(arena); *dss = dss_prec_names[arena->dss_prec]; *lg_dirty_mult = arena->lg_dirty_mult; + *decay_time = arena->decay_time; *nactive += arena->nactive; *ndirty += arena->ndirty; +} + +void +arena_basic_stats_merge(arena_t *arena, unsigned *nthreads, const char **dss, + ssize_t *lg_dirty_mult, ssize_t *decay_time, size_t *nactive, + size_t *ndirty) +{ + + malloc_mutex_lock(&arena->lock); + arena_basic_stats_merge_locked(arena, nthreads, dss, lg_dirty_mult, + decay_time, nactive, ndirty); + malloc_mutex_unlock(&arena->lock); +} + +void +arena_stats_merge(arena_t *arena, unsigned *nthreads, const char **dss, + ssize_t *lg_dirty_mult, ssize_t *decay_time, size_t *nactive, + 
size_t *ndirty, arena_stats_t *astats, malloc_bin_stats_t *bstats, + malloc_large_stats_t *lstats, malloc_huge_stats_t *hstats) +{ + unsigned i; + + cassert(config_stats); + + malloc_mutex_lock(&arena->lock); + arena_basic_stats_merge_locked(arena, nthreads, dss, lg_dirty_mult, + decay_time, nactive, ndirty); astats->mapped += arena->stats.mapped; astats->npurge += arena->stats.npurge; @@ -2995,23 +3289,48 @@ arena_stats_merge(arena_t *arena, const char **dss, ssize_t *lg_dirty_mult, } } +unsigned +arena_nthreads_get(arena_t *arena) +{ + + return (atomic_read_u(&arena->nthreads)); +} + +void +arena_nthreads_inc(arena_t *arena) +{ + + atomic_add_u(&arena->nthreads, 1); +} + +void +arena_nthreads_dec(arena_t *arena) +{ + + atomic_sub_u(&arena->nthreads, 1); +} + arena_t * arena_new(unsigned ind) { arena_t *arena; + size_t arena_size; unsigned i; arena_bin_t *bin; + /* Compute arena size to incorporate sufficient runs_avail elements. */ + arena_size = offsetof(arena_t, runs_avail) + (sizeof(arena_run_tree_t) * + runs_avail_nclasses); /* * Allocate arena, arena->lstats, and arena->hstats contiguously, mainly * because there is no way to clean up if base_alloc() OOMs. 
*/ if (config_stats) { - arena = (arena_t *)base_alloc(CACHELINE_CEILING(sizeof(arena_t)) - + QUANTUM_CEILING(nlclasses * sizeof(malloc_large_stats_t) + + arena = (arena_t *)base_alloc(CACHELINE_CEILING(arena_size) + + QUANTUM_CEILING(nlclasses * sizeof(malloc_large_stats_t) + nhclasses) * sizeof(malloc_huge_stats_t)); } else - arena = (arena_t *)base_alloc(sizeof(arena_t)); + arena = (arena_t *)base_alloc(arena_size); if (arena == NULL) return (NULL); @@ -3023,11 +3342,11 @@ arena_new(unsigned ind) if (config_stats) { memset(&arena->stats, 0, sizeof(arena_stats_t)); arena->stats.lstats = (malloc_large_stats_t *)((uintptr_t)arena - + CACHELINE_CEILING(sizeof(arena_t))); + + CACHELINE_CEILING(arena_size)); memset(arena->stats.lstats, 0, nlclasses * sizeof(malloc_large_stats_t)); arena->stats.hstats = (malloc_huge_stats_t *)((uintptr_t)arena - + CACHELINE_CEILING(sizeof(arena_t)) + + + CACHELINE_CEILING(arena_size) + QUANTUM_CEILING(nlclasses * sizeof(malloc_large_stats_t))); memset(arena->stats.hstats, 0, nhclasses * sizeof(malloc_huge_stats_t)); @@ -3059,10 +3378,14 @@ arena_new(unsigned ind) arena->nactive = 0; arena->ndirty = 0; - arena_avail_tree_new(&arena->runs_avail); + for(i = 0; i < runs_avail_nclasses; i++) + arena_run_tree_new(&arena->runs_avail[i]); qr_new(&arena->runs_dirty, rd_link); qr_new(&arena->chunks_cache, cc_link); + if (opt_purge == purge_mode_decay) + arena_decay_init(arena, arena_decay_time_default_get()); + ql_new(&arena->huge); if (malloc_mutex_init(&arena->huge_mtx)) return (NULL); @@ -3117,8 +3440,7 @@ bin_info_run_size_calc(arena_bin_info_t *bin_info) * be twice as large in order to maintain alignment. 
*/ if (config_fill && unlikely(opt_redzone)) { - size_t align_min = ZU(1) << (jemalloc_ffs(bin_info->reg_size) - - 1); + size_t align_min = ZU(1) << (ffs_zu(bin_info->reg_size) - 1); if (align_min <= REDZONE_MINSIZE) { bin_info->redzone_size = REDZONE_MINSIZE; pad_size = 0; @@ -3138,18 +3460,19 @@ bin_info_run_size_calc(arena_bin_info_t *bin_info) * size). */ try_run_size = PAGE; - try_nregs = try_run_size / bin_info->reg_size; + try_nregs = (uint32_t)(try_run_size / bin_info->reg_size); do { perfect_run_size = try_run_size; perfect_nregs = try_nregs; try_run_size += PAGE; - try_nregs = try_run_size / bin_info->reg_size; + try_nregs = (uint32_t)(try_run_size / bin_info->reg_size); } while (perfect_run_size != perfect_nregs * bin_info->reg_size); assert(perfect_nregs <= RUN_MAXREGS); actual_run_size = perfect_run_size; - actual_nregs = (actual_run_size - pad_size) / bin_info->reg_interval; + actual_nregs = (uint32_t)((actual_run_size - pad_size) / + bin_info->reg_interval); /* * Redzones can require enough padding that not even a single region can @@ -3161,8 +3484,8 @@ bin_info_run_size_calc(arena_bin_info_t *bin_info) assert(config_fill && unlikely(opt_redzone)); actual_run_size += PAGE; - actual_nregs = (actual_run_size - pad_size) / - bin_info->reg_interval; + actual_nregs = (uint32_t)((actual_run_size - pad_size) / + bin_info->reg_interval); } /* @@ -3170,8 +3493,8 @@ bin_info_run_size_calc(arena_bin_info_t *bin_info) */ while (actual_run_size > arena_maxrun) { actual_run_size -= PAGE; - actual_nregs = (actual_run_size - pad_size) / - bin_info->reg_interval; + actual_nregs = (uint32_t)((actual_run_size - pad_size) / + bin_info->reg_interval); } assert(actual_nregs > 0); assert(actual_run_size == s2u(actual_run_size)); @@ -3179,8 +3502,8 @@ bin_info_run_size_calc(arena_bin_info_t *bin_info) /* Copy final settings. 
*/ bin_info->run_size = actual_run_size; bin_info->nregs = actual_nregs; - bin_info->reg0_offset = actual_run_size - (actual_nregs * - bin_info->reg_interval) - pad_size + bin_info->redzone_size; + bin_info->reg0_offset = (uint32_t)(actual_run_size - (actual_nregs * + bin_info->reg_interval) - pad_size + bin_info->redzone_size); if (actual_run_size > small_maxrun) small_maxrun = actual_run_size; @@ -3234,12 +3557,42 @@ small_run_size_init(void) return (false); } +static bool +run_quantize_init(void) +{ + unsigned i; + + run_quantize_max = chunksize + large_pad; + + run_quantize_floor_tab = (size_t *)base_alloc(sizeof(size_t) * + (run_quantize_max >> LG_PAGE)); + if (run_quantize_floor_tab == NULL) + return (true); + + run_quantize_ceil_tab = (size_t *)base_alloc(sizeof(size_t) * + (run_quantize_max >> LG_PAGE)); + if (run_quantize_ceil_tab == NULL) + return (true); + + for (i = 1; i <= run_quantize_max >> LG_PAGE; i++) { + size_t run_size = i << LG_PAGE; + + run_quantize_floor_tab[i-1] = + run_quantize_floor_compute(run_size); + run_quantize_ceil_tab[i-1] = + run_quantize_ceil_compute(run_size); + } + + return (false); +} + bool arena_boot(void) { unsigned i; arena_lg_dirty_mult_default_set(opt_lg_dirty_mult); + arena_decay_time_default_set(opt_decay_time); /* * Compute the header size such that it is large enough to contain the @@ -3281,7 +3634,15 @@ arena_boot(void) nhclasses = NSIZES - nlclasses - NBINS; bin_info_init(); - return (small_run_size_init()); + if (small_run_size_init()) + return (true); + if (run_quantize_init()) + return (true); + + runs_avail_bias = size2index(PAGE); + runs_avail_nclasses = size2index(run_quantize_max)+1 - runs_avail_bias; + + return (false); } void diff --git a/contrib/jemalloc/src/bitmap.c b/contrib/jemalloc/src/bitmap.c index c733372..b1e6627 100644 --- a/contrib/jemalloc/src/bitmap.c +++ b/contrib/jemalloc/src/bitmap.c @@ -3,6 +3,8 @@ /******************************************************************************/ +#ifdef 
USE_TREE + void bitmap_info_init(bitmap_info_t *binfo, size_t nbits) { @@ -32,20 +34,11 @@ bitmap_info_init(bitmap_info_t *binfo, size_t nbits) binfo->nbits = nbits; } -size_t +static size_t bitmap_info_ngroups(const bitmap_info_t *binfo) { - return (binfo->levels[binfo->nlevels].group_offset << LG_SIZEOF_BITMAP); -} - -size_t -bitmap_size(size_t nbits) -{ - bitmap_info_t binfo; - - bitmap_info_init(&binfo, nbits); - return (bitmap_info_ngroups(&binfo)); + return (binfo->levels[binfo->nlevels].group_offset); } void @@ -61,8 +54,7 @@ bitmap_init(bitmap_t *bitmap, const bitmap_info_t *binfo) * correspond to the first logical bit in the group, so extra bits * are the most significant bits of the last group. */ - memset(bitmap, 0xffU, binfo->levels[binfo->nlevels].group_offset << - LG_SIZEOF_BITMAP); + memset(bitmap, 0xffU, bitmap_size(binfo)); extra = (BITMAP_GROUP_NBITS - (binfo->nbits & BITMAP_GROUP_NBITS_MASK)) & BITMAP_GROUP_NBITS_MASK; if (extra != 0) @@ -76,3 +68,47 @@ bitmap_init(bitmap_t *bitmap, const bitmap_info_t *binfo) bitmap[binfo->levels[i+1].group_offset - 1] >>= extra; } } + +#else /* USE_TREE */ + +void +bitmap_info_init(bitmap_info_t *binfo, size_t nbits) +{ + size_t i; + + assert(nbits > 0); + assert(nbits <= (ZU(1) << LG_BITMAP_MAXBITS)); + + i = nbits >> LG_BITMAP_GROUP_NBITS; + if (nbits % BITMAP_GROUP_NBITS != 0) + i++; + binfo->ngroups = i; + binfo->nbits = nbits; +} + +static size_t +bitmap_info_ngroups(const bitmap_info_t *binfo) +{ + + return (binfo->ngroups); +} + +void +bitmap_init(bitmap_t *bitmap, const bitmap_info_t *binfo) +{ + size_t extra; + + memset(bitmap, 0xffU, bitmap_size(binfo)); + extra = (binfo->nbits % (binfo->ngroups * BITMAP_GROUP_NBITS)); + if (extra != 0) + bitmap[binfo->ngroups - 1] >>= (BITMAP_GROUP_NBITS - extra); +} + +#endif /* USE_TREE */ + +size_t +bitmap_size(const bitmap_info_t *binfo) +{ + + return (bitmap_info_ngroups(binfo) << LG_SIZEOF_BITMAP); +} diff --git a/contrib/jemalloc/src/chunk.c 
b/contrib/jemalloc/src/chunk.c index 6ba1ca7..b179d21 100644 --- a/contrib/jemalloc/src/chunk.c +++ b/contrib/jemalloc/src/chunk.c @@ -332,30 +332,20 @@ chunk_alloc_core(arena_t *arena, void *new_addr, size_t size, size_t alignment, bool *zero, bool *commit, dss_prec_t dss_prec) { void *ret; - chunk_hooks_t chunk_hooks = CHUNK_HOOKS_INITIALIZER; assert(size != 0); assert((size & chunksize_mask) == 0); assert(alignment != 0); assert((alignment & chunksize_mask) == 0); - /* Retained. */ - if ((ret = chunk_recycle(arena, &chunk_hooks, - &arena->chunks_szad_retained, &arena->chunks_ad_retained, false, - new_addr, size, alignment, zero, commit, true)) != NULL) - return (ret); - /* "primary" dss. */ if (have_dss && dss_prec == dss_prec_primary && (ret = chunk_alloc_dss(arena, new_addr, size, alignment, zero, commit)) != NULL) return (ret); - /* - * mmap. Requesting an address is not implemented for - * chunk_alloc_mmap(), so only call it if (new_addr == NULL). - */ - if (new_addr == NULL && (ret = chunk_alloc_mmap(size, alignment, zero, - commit)) != NULL) + /* mmap. */ + if ((ret = chunk_alloc_mmap(new_addr, size, alignment, zero, commit)) != + NULL) return (ret); /* "secondary" dss. */ if (have_dss && dss_prec == dss_prec_secondary && (ret = @@ -380,7 +370,7 @@ chunk_alloc_base(size_t size) */ zero = true; commit = true; - ret = chunk_alloc_mmap(size, chunksize, &zero, &commit); + ret = chunk_alloc_mmap(NULL, size, chunksize, &zero, &commit); if (ret == NULL) return (NULL); if (config_valgrind) @@ -418,9 +408,7 @@ chunk_arena_get(unsigned arena_ind) { arena_t *arena; - /* Dodge tsd for a0 in order to avoid bootstrapping issues. */ - arena = (arena_ind == 0) ? a0get() : arena_get(tsd_fetch(), arena_ind, - false, true); + arena = arena_get(arena_ind, false); /* * The arena we're allocating on behalf of must have been initialized * already. 
@@ -447,6 +435,21 @@ chunk_alloc_default(void *new_addr, size_t size, size_t alignment, bool *zero, return (ret); } +static void * +chunk_alloc_retained(arena_t *arena, chunk_hooks_t *chunk_hooks, void *new_addr, + size_t size, size_t alignment, bool *zero, bool *commit) +{ + + assert(size != 0); + assert((size & chunksize_mask) == 0); + assert(alignment != 0); + assert((alignment & chunksize_mask) == 0); + + return (chunk_recycle(arena, chunk_hooks, &arena->chunks_szad_retained, + &arena->chunks_ad_retained, false, new_addr, size, alignment, zero, + commit, true)); +} + void * chunk_alloc_wrapper(arena_t *arena, chunk_hooks_t *chunk_hooks, void *new_addr, size_t size, size_t alignment, bool *zero, bool *commit) @@ -454,10 +457,16 @@ chunk_alloc_wrapper(arena_t *arena, chunk_hooks_t *chunk_hooks, void *new_addr, void *ret; chunk_hooks_assure_initialized(arena, chunk_hooks); - ret = chunk_hooks->alloc(new_addr, size, alignment, zero, commit, - arena->ind); - if (ret == NULL) - return (NULL); + + ret = chunk_alloc_retained(arena, chunk_hooks, new_addr, size, + alignment, zero, commit); + if (ret == NULL) { + ret = chunk_hooks->alloc(new_addr, size, alignment, zero, + commit, arena->ind); + if (ret == NULL) + return (NULL); + } + if (config_valgrind && chunk_hooks->alloc != chunk_alloc_default) JEMALLOC_VALGRIND_MAKE_MEM_UNDEFINED(ret, chunksize); return (ret); @@ -716,7 +725,7 @@ chunk_boot(void) * so pages_map will always take fast path. 
*/ if (!opt_lg_chunk) { - opt_lg_chunk = jemalloc_ffs((int)info.dwAllocationGranularity) + opt_lg_chunk = ffs_u((unsigned)info.dwAllocationGranularity) - 1; } #else @@ -732,8 +741,8 @@ chunk_boot(void) if (have_dss && chunk_dss_boot()) return (true); - if (rtree_new(&chunks_rtree, (ZU(1) << (LG_SIZEOF_PTR+3)) - - opt_lg_chunk, chunks_rtree_node_alloc, NULL)) + if (rtree_new(&chunks_rtree, (unsigned)((ZU(1) << (LG_SIZEOF_PTR+3)) - + opt_lg_chunk), chunks_rtree_node_alloc, NULL)) return (true); return (false); diff --git a/contrib/jemalloc/src/chunk_mmap.c b/contrib/jemalloc/src/chunk_mmap.c index b9ba741..56b2ee4 100644 --- a/contrib/jemalloc/src/chunk_mmap.c +++ b/contrib/jemalloc/src/chunk_mmap.c @@ -32,7 +32,8 @@ chunk_alloc_mmap_slow(size_t size, size_t alignment, bool *zero, bool *commit) } void * -chunk_alloc_mmap(size_t size, size_t alignment, bool *zero, bool *commit) +chunk_alloc_mmap(void *new_addr, size_t size, size_t alignment, bool *zero, + bool *commit) { void *ret; size_t offset; @@ -53,9 +54,10 @@ chunk_alloc_mmap(size_t size, size_t alignment, bool *zero, bool *commit) assert(alignment != 0); assert((alignment & chunksize_mask) == 0); - ret = pages_map(NULL, size); - if (ret == NULL) - return (NULL); + ret = pages_map(new_addr, size); + if (ret == NULL || ret == new_addr) + return (ret); + assert(new_addr == NULL); offset = ALIGNMENT_ADDR2OFFSET(ret, alignment); if (offset != 0) { pages_unmap(ret, size); diff --git a/contrib/jemalloc/src/ckh.c b/contrib/jemalloc/src/ckh.c index 53a1c1e..3b423aa 100644 --- a/contrib/jemalloc/src/ckh.c +++ b/contrib/jemalloc/src/ckh.c @@ -99,7 +99,7 @@ ckh_try_bucket_insert(ckh_t *ckh, size_t bucket, const void *key, * Cycle through the cells in the bucket, starting at a random position. * The randomness avoids worst-case search overhead as buckets fill up. 
 	 */
-	prng32(offset, LG_CKH_BUCKET_CELLS, ckh->prng_state, CKH_A, CKH_C);
+	offset = (unsigned)prng_lg_range(&ckh->prng_state, LG_CKH_BUCKET_CELLS);
 	for (i = 0; i < (ZU(1) << LG_CKH_BUCKET_CELLS); i++) {
 		cell = &ckh->tab[(bucket << LG_CKH_BUCKET_CELLS) +
 		    ((i + offset) & ((ZU(1) << LG_CKH_BUCKET_CELLS) - 1))];
@@ -141,7 +141,8 @@ ckh_evict_reloc_insert(ckh_t *ckh, size_t argbucket, void const **argkey,
 		 * were an item for which both hashes indicated the same
 		 * bucket.
 		 */
-		prng32(i, LG_CKH_BUCKET_CELLS, ckh->prng_state, CKH_A, CKH_C);
+		i = (unsigned)prng_lg_range(&ckh->prng_state,
+		    LG_CKH_BUCKET_CELLS);
 		cell = &ckh->tab[(bucket << LG_CKH_BUCKET_CELLS) + i];
 		assert(cell->key != NULL);

@@ -247,8 +248,7 @@ ckh_grow(tsd_t *tsd, ckh_t *ckh)
 {
 	bool ret;
 	ckhc_t *tab, *ttab;
-	size_t lg_curcells;
-	unsigned lg_prevbuckets;
+	unsigned lg_prevbuckets, lg_curcells;

 #ifdef CKH_COUNT
 	ckh->ngrows++;
@@ -266,7 +266,7 @@ ckh_grow(tsd_t *tsd, ckh_t *ckh)

 		lg_curcells++;
 		usize = sa2u(sizeof(ckhc_t) << lg_curcells, CACHELINE);
-		if (usize == 0) {
+		if (unlikely(usize == 0 || usize > HUGE_MAXCLASS)) {
 			ret = true;
 			goto label_return;
 		}
@@ -283,12 +283,12 @@ ckh_grow(tsd_t *tsd, ckh_t *ckh)
 		ckh->lg_curbuckets = lg_curcells - LG_CKH_BUCKET_CELLS;

 		if (!ckh_rebuild(ckh, tab)) {
-			idalloctm(tsd, tab, tcache_get(tsd, false), true);
+			idalloctm(tsd, tab, tcache_get(tsd, false), true, true);
 			break;
 		}

 		/* Rebuilding failed, so back out partially rebuilt table. */
-		idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true);
+		idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true, true);
 		ckh->tab = tab;
 		ckh->lg_curbuckets = lg_prevbuckets;
 	}
@@ -302,8 +302,8 @@ static void
 ckh_shrink(tsd_t *tsd, ckh_t *ckh)
 {
 	ckhc_t *tab, *ttab;
-	size_t lg_curcells, usize;
-	unsigned lg_prevbuckets;
+	size_t usize;
+	unsigned lg_prevbuckets, lg_curcells;

 	/*
 	 * It is possible (though unlikely, given well behaved hashes) that the
@@ -312,7 +312,7 @@ ckh_shrink(tsd_t *tsd, ckh_t *ckh)
 	lg_prevbuckets = ckh->lg_curbuckets;
 	lg_curcells = ckh->lg_curbuckets + LG_CKH_BUCKET_CELLS - 1;
 	usize = sa2u(sizeof(ckhc_t) << lg_curcells, CACHELINE);
-	if (usize == 0)
+	if (unlikely(usize == 0 || usize > HUGE_MAXCLASS))
 		return;
 	tab = (ckhc_t *)ipallocztm(tsd, usize, CACHELINE, true, NULL, true,
 	    NULL);
@@ -330,7 +330,7 @@ ckh_shrink(tsd_t *tsd, ckh_t *ckh)
 	ckh->lg_curbuckets = lg_curcells - LG_CKH_BUCKET_CELLS;

 	if (!ckh_rebuild(ckh, tab)) {
-		idalloctm(tsd, tab, tcache_get(tsd, false), true);
+		idalloctm(tsd, tab, tcache_get(tsd, false), true, true);
 #ifdef CKH_COUNT
 		ckh->nshrinks++;
 #endif
@@ -338,7 +338,7 @@ ckh_shrink(tsd_t *tsd, ckh_t *ckh)
 	}

 	/* Rebuilding failed, so back out partially rebuilt table. */
-	idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true);
+	idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true, true);
 	ckh->tab = tab;
 	ckh->lg_curbuckets = lg_prevbuckets;
 #ifdef CKH_COUNT
@@ -387,7 +387,7 @@ ckh_new(tsd_t *tsd, ckh_t *ckh, size_t minitems, ckh_hash_t *hash,
 	ckh->keycomp = keycomp;

 	usize = sa2u(sizeof(ckhc_t) << lg_mincells, CACHELINE);
-	if (usize == 0) {
+	if (unlikely(usize == 0 || usize > HUGE_MAXCLASS)) {
 		ret = true;
 		goto label_return;
 	}
@@ -421,7 +421,7 @@ ckh_delete(tsd_t *tsd, ckh_t *ckh)
 	    (unsigned long long)ckh->nrelocs);
 #endif

-	idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true);
+	idalloctm(tsd, ckh->tab, tcache_get(tsd, false), true, true);
 	if (config_debug)
 		memset(ckh, 0x5a, sizeof(ckh_t));
 }
diff --git a/contrib/jemalloc/src/ctl.c b/contrib/jemalloc/src/ctl.c
index 3de8e60..17bd071 100644
--- a/contrib/jemalloc/src/ctl.c
+++ b/contrib/jemalloc/src/ctl.c
@@ -24,7 +24,7 @@ ctl_named_node(const ctl_node_t *node)
 }

 JEMALLOC_INLINE_C const ctl_named_node_t *
-ctl_named_children(const ctl_named_node_t *node, int index)
+ctl_named_children(const ctl_named_node_t *node, size_t index)
 {
 	const ctl_named_node_t *children = ctl_named_node(node->children);

@@ -77,6 +77,7 @@ CTL_PROTO(config_cache_oblivious)
 CTL_PROTO(config_debug)
 CTL_PROTO(config_fill)
 CTL_PROTO(config_lazy_lock)
+CTL_PROTO(config_malloc_conf)
 CTL_PROTO(config_munmap)
 CTL_PROTO(config_prof)
 CTL_PROTO(config_prof_libgcc)
@@ -91,7 +92,9 @@ CTL_PROTO(opt_abort)
 CTL_PROTO(opt_dss)
 CTL_PROTO(opt_lg_chunk)
 CTL_PROTO(opt_narenas)
+CTL_PROTO(opt_purge)
 CTL_PROTO(opt_lg_dirty_mult)
+CTL_PROTO(opt_decay_time)
 CTL_PROTO(opt_stats_print)
 CTL_PROTO(opt_junk)
 CTL_PROTO(opt_zero)
@@ -114,10 +117,12 @@ CTL_PROTO(opt_prof_accum)
 CTL_PROTO(tcache_create)
 CTL_PROTO(tcache_flush)
 CTL_PROTO(tcache_destroy)
+static void arena_i_purge(unsigned arena_ind, bool all);
 CTL_PROTO(arena_i_purge)
-static void arena_purge(unsigned arena_ind);
+CTL_PROTO(arena_i_decay)
 CTL_PROTO(arena_i_dss)
 CTL_PROTO(arena_i_lg_dirty_mult)
+CTL_PROTO(arena_i_decay_time)
 CTL_PROTO(arena_i_chunk_hooks)
 INDEX_PROTO(arena_i)
 CTL_PROTO(arenas_bin_i_size)
@@ -131,6 +136,7 @@ INDEX_PROTO(arenas_hchunk_i)
 CTL_PROTO(arenas_narenas)
 CTL_PROTO(arenas_initialized)
 CTL_PROTO(arenas_lg_dirty_mult)
+CTL_PROTO(arenas_decay_time)
 CTL_PROTO(arenas_quantum)
 CTL_PROTO(arenas_page)
 CTL_PROTO(arenas_tcache_max)
@@ -181,6 +187,7 @@ INDEX_PROTO(stats_arenas_i_hchunks_j)
 CTL_PROTO(stats_arenas_i_nthreads)
 CTL_PROTO(stats_arenas_i_dss)
 CTL_PROTO(stats_arenas_i_lg_dirty_mult)
+CTL_PROTO(stats_arenas_i_decay_time)
 CTL_PROTO(stats_arenas_i_pactive)
 CTL_PROTO(stats_arenas_i_pdirty)
 CTL_PROTO(stats_arenas_i_mapped)
@@ -241,6 +248,7 @@ static const ctl_named_node_t config_node[] = {
 	{NAME("debug"), CTL(config_debug)},
 	{NAME("fill"), CTL(config_fill)},
 	{NAME("lazy_lock"), CTL(config_lazy_lock)},
+	{NAME("malloc_conf"), CTL(config_malloc_conf)},
 	{NAME("munmap"), CTL(config_munmap)},
 	{NAME("prof"), CTL(config_prof)},
 	{NAME("prof_libgcc"), CTL(config_prof_libgcc)},
@@ -258,7 +266,9 @@ static const ctl_named_node_t opt_node[] = {
 	{NAME("dss"), CTL(opt_dss)},
 	{NAME("lg_chunk"), CTL(opt_lg_chunk)},
 	{NAME("narenas"), CTL(opt_narenas)},
+	{NAME("purge"), CTL(opt_purge)},
 	{NAME("lg_dirty_mult"), CTL(opt_lg_dirty_mult)},
+	{NAME("decay_time"), CTL(opt_decay_time)},
 	{NAME("stats_print"), CTL(opt_stats_print)},
 	{NAME("junk"), CTL(opt_junk)},
 	{NAME("zero"), CTL(opt_zero)},
@@ -288,8 +298,10 @@ static const ctl_named_node_t tcache_node[] = {

 static const ctl_named_node_t arena_i_node[] = {
 	{NAME("purge"), CTL(arena_i_purge)},
+	{NAME("decay"), CTL(arena_i_decay)},
 	{NAME("dss"), CTL(arena_i_dss)},
 	{NAME("lg_dirty_mult"), CTL(arena_i_lg_dirty_mult)},
+	{NAME("decay_time"), CTL(arena_i_decay_time)},
 	{NAME("chunk_hooks"), CTL(arena_i_chunk_hooks)}
 };
 static const ctl_named_node_t super_arena_i_node[] = {
@@ -339,6 +351,7 @@ static const ctl_named_node_t arenas_node[] = {
 	{NAME("narenas"), CTL(arenas_narenas)},
 	{NAME("initialized"), CTL(arenas_initialized)},
 	{NAME("lg_dirty_mult"), CTL(arenas_lg_dirty_mult)},
+	{NAME("decay_time"), CTL(arenas_decay_time)},
 	{NAME("quantum"), CTL(arenas_quantum)},
 	{NAME("page"), CTL(arenas_page)},
 	{NAME("tcache_max"), CTL(arenas_tcache_max)},
@@ -439,6 +452,7 @@ static const ctl_named_node_t stats_arenas_i_node[] = {
 	{NAME("nthreads"), CTL(stats_arenas_i_nthreads)},
 	{NAME("dss"), CTL(stats_arenas_i_dss)},
 	{NAME("lg_dirty_mult"), CTL(stats_arenas_i_lg_dirty_mult)},
+	{NAME("decay_time"), CTL(stats_arenas_i_decay_time)},
 	{NAME("pactive"), CTL(stats_arenas_i_pactive)},
 	{NAME("pdirty"), CTL(stats_arenas_i_pdirty)},
 	{NAME("mapped"), CTL(stats_arenas_i_mapped)},
@@ -519,8 +533,10 @@ static void
 ctl_arena_clear(ctl_arena_stats_t *astats)
 {

+	astats->nthreads = 0;
 	astats->dss = dss_prec_names[dss_prec_limit];
 	astats->lg_dirty_mult = -1;
+	astats->decay_time = -1;
 	astats->pactive = 0;
 	astats->pdirty = 0;
 	if (config_stats) {
@@ -542,16 +558,23 @@ ctl_arena_stats_amerge(ctl_arena_stats_t *cstats, arena_t *arena)
 {
 	unsigned i;

-	arena_stats_merge(arena, &cstats->dss, &cstats->lg_dirty_mult,
-	    &cstats->pactive, &cstats->pdirty, &cstats->astats, cstats->bstats,
-	    cstats->lstats, cstats->hstats);
-
-	for (i = 0; i < NBINS; i++) {
-		cstats->allocated_small += cstats->bstats[i].curregs *
-		    index2size(i);
-		cstats->nmalloc_small += cstats->bstats[i].nmalloc;
-		cstats->ndalloc_small += cstats->bstats[i].ndalloc;
-		cstats->nrequests_small += cstats->bstats[i].nrequests;
+	if (config_stats) {
+		arena_stats_merge(arena, &cstats->nthreads, &cstats->dss,
+		    &cstats->lg_dirty_mult, &cstats->decay_time,
+		    &cstats->pactive, &cstats->pdirty, &cstats->astats,
+		    cstats->bstats, cstats->lstats, cstats->hstats);
+
+		for (i = 0; i < NBINS; i++) {
+			cstats->allocated_small += cstats->bstats[i].curregs *
+			    index2size(i);
+			cstats->nmalloc_small += cstats->bstats[i].nmalloc;
+			cstats->ndalloc_small += cstats->bstats[i].ndalloc;
+			cstats->nrequests_small += cstats->bstats[i].nrequests;
+		}
+	} else {
+		arena_basic_stats_merge(arena, &cstats->nthreads, &cstats->dss,
+		    &cstats->lg_dirty_mult, &cstats->decay_time,
+		    &cstats->pactive, &cstats->pdirty);
 	}
 }

@@ -560,57 +583,68 @@ ctl_arena_stats_smerge(ctl_arena_stats_t *sstats, ctl_arena_stats_t *astats)
 {
 	unsigned i;

+	sstats->nthreads += astats->nthreads;
 	sstats->pactive += astats->pactive;
 	sstats->pdirty += astats->pdirty;

-	sstats->astats.mapped += astats->astats.mapped;
-	sstats->astats.npurge += astats->astats.npurge;
-	sstats->astats.nmadvise += astats->astats.nmadvise;
-	sstats->astats.purged += astats->astats.purged;
-
-	sstats->astats.metadata_mapped += astats->astats.metadata_mapped;
-	sstats->astats.metadata_allocated += astats->astats.metadata_allocated;
-
-	sstats->allocated_small += astats->allocated_small;
-	sstats->nmalloc_small += astats->nmalloc_small;
-	sstats->ndalloc_small += astats->ndalloc_small;
-	sstats->nrequests_small += astats->nrequests_small;
-
-	sstats->astats.allocated_large += astats->astats.allocated_large;
-	sstats->astats.nmalloc_large += astats->astats.nmalloc_large;
-	sstats->astats.ndalloc_large += astats->astats.ndalloc_large;
-	sstats->astats.nrequests_large += astats->astats.nrequests_large;
-
-	sstats->astats.allocated_huge += astats->astats.allocated_huge;
-	sstats->astats.nmalloc_huge += astats->astats.nmalloc_huge;
-	sstats->astats.ndalloc_huge += astats->astats.ndalloc_huge;
-
-	for (i = 0; i < NBINS; i++) {
-		sstats->bstats[i].nmalloc += astats->bstats[i].nmalloc;
-		sstats->bstats[i].ndalloc += astats->bstats[i].ndalloc;
-		sstats->bstats[i].nrequests += astats->bstats[i].nrequests;
-		sstats->bstats[i].curregs += astats->bstats[i].curregs;
-		if (config_tcache) {
-			sstats->bstats[i].nfills += astats->bstats[i].nfills;
-			sstats->bstats[i].nflushes +=
-			    astats->bstats[i].nflushes;
+	if (config_stats) {
+		sstats->astats.mapped += astats->astats.mapped;
+		sstats->astats.npurge += astats->astats.npurge;
+		sstats->astats.nmadvise += astats->astats.nmadvise;
+		sstats->astats.purged += astats->astats.purged;
+
+		sstats->astats.metadata_mapped +=
+		    astats->astats.metadata_mapped;
+		sstats->astats.metadata_allocated +=
+		    astats->astats.metadata_allocated;
+
+		sstats->allocated_small += astats->allocated_small;
+		sstats->nmalloc_small += astats->nmalloc_small;
+		sstats->ndalloc_small += astats->ndalloc_small;
+		sstats->nrequests_small += astats->nrequests_small;
+
+		sstats->astats.allocated_large +=
+		    astats->astats.allocated_large;
+		sstats->astats.nmalloc_large += astats->astats.nmalloc_large;
+		sstats->astats.ndalloc_large += astats->astats.ndalloc_large;
+		sstats->astats.nrequests_large +=
+		    astats->astats.nrequests_large;
+
+		sstats->astats.allocated_huge += astats->astats.allocated_huge;
+		sstats->astats.nmalloc_huge += astats->astats.nmalloc_huge;
+		sstats->astats.ndalloc_huge += astats->astats.ndalloc_huge;
+
+		for (i = 0; i < NBINS; i++) {
+			sstats->bstats[i].nmalloc += astats->bstats[i].nmalloc;
+			sstats->bstats[i].ndalloc += astats->bstats[i].ndalloc;
+			sstats->bstats[i].nrequests +=
+			    astats->bstats[i].nrequests;
+			sstats->bstats[i].curregs += astats->bstats[i].curregs;
+			if (config_tcache) {
+				sstats->bstats[i].nfills +=
+				    astats->bstats[i].nfills;
+				sstats->bstats[i].nflushes +=
+				    astats->bstats[i].nflushes;
+			}
+			sstats->bstats[i].nruns += astats->bstats[i].nruns;
+			sstats->bstats[i].reruns += astats->bstats[i].reruns;
+			sstats->bstats[i].curruns += astats->bstats[i].curruns;
 		}
-		sstats->bstats[i].nruns += astats->bstats[i].nruns;
-		sstats->bstats[i].reruns += astats->bstats[i].reruns;
-		sstats->bstats[i].curruns += astats->bstats[i].curruns;
-	}

-	for (i = 0; i < nlclasses; i++) {
-		sstats->lstats[i].nmalloc += astats->lstats[i].nmalloc;
-		sstats->lstats[i].ndalloc += astats->lstats[i].ndalloc;
-		sstats->lstats[i].nrequests += astats->lstats[i].nrequests;
-		sstats->lstats[i].curruns += astats->lstats[i].curruns;
-	}
+		for (i = 0; i < nlclasses; i++) {
+			sstats->lstats[i].nmalloc += astats->lstats[i].nmalloc;
+			sstats->lstats[i].ndalloc += astats->lstats[i].ndalloc;
+			sstats->lstats[i].nrequests +=
+			    astats->lstats[i].nrequests;
+			sstats->lstats[i].curruns += astats->lstats[i].curruns;
+		}

-	for (i = 0; i < nhclasses; i++) {
-		sstats->hstats[i].nmalloc += astats->hstats[i].nmalloc;
-		sstats->hstats[i].ndalloc += astats->hstats[i].ndalloc;
-		sstats->hstats[i].curhchunks += astats->hstats[i].curhchunks;
+		for (i = 0; i < nhclasses; i++) {
+			sstats->hstats[i].nmalloc += astats->hstats[i].nmalloc;
+			sstats->hstats[i].ndalloc += astats->hstats[i].ndalloc;
+			sstats->hstats[i].curhchunks +=
+			    astats->hstats[i].curhchunks;
+		}
 	}
 }

@@ -621,19 +655,9 @@ ctl_arena_refresh(arena_t *arena, unsigned i)
 	ctl_arena_stats_t *sstats = &ctl_stats.arenas[ctl_stats.narenas];

 	ctl_arena_clear(astats);
-
-	sstats->nthreads += astats->nthreads;
-	if (config_stats) {
-		ctl_arena_stats_amerge(astats, arena);
-		/* Merge into sum stats as well. */
-		ctl_arena_stats_smerge(sstats, astats);
-	} else {
-		astats->pactive += arena->nactive;
-		astats->pdirty += arena->ndirty;
-		/* Merge into sum stats as well. */
-		sstats->pactive += arena->nactive;
-		sstats->pdirty += arena->ndirty;
-	}
+	ctl_arena_stats_amerge(astats, arena);
+	/* Merge into sum stats as well. */
+	ctl_arena_stats_smerge(sstats, astats);
 }

 static bool
@@ -679,33 +703,17 @@ ctl_grow(void)
 static void
 ctl_refresh(void)
 {
-	tsd_t *tsd;
 	unsigned i;
-	bool refreshed;
 	VARIABLE_ARRAY(arena_t *, tarenas, ctl_stats.narenas);

 	/*
 	 * Clear sum stats, since they will be merged into by
 	 * ctl_arena_refresh().
 	 */
-	ctl_stats.arenas[ctl_stats.narenas].nthreads = 0;
 	ctl_arena_clear(&ctl_stats.arenas[ctl_stats.narenas]);

-	tsd = tsd_fetch();
-	for (i = 0, refreshed = false; i < ctl_stats.narenas; i++) {
-		tarenas[i] = arena_get(tsd, i, false, false);
-		if (tarenas[i] == NULL && !refreshed) {
-			tarenas[i] = arena_get(tsd, i, false, true);
-			refreshed = true;
-		}
-	}
-
-	for (i = 0; i < ctl_stats.narenas; i++) {
-		if (tarenas[i] != NULL)
-			ctl_stats.arenas[i].nthreads = arena_nbound(i);
-		else
-			ctl_stats.arenas[i].nthreads = 0;
-	}
+	for (i = 0; i < ctl_stats.narenas; i++)
+		tarenas[i] = arena_get(i, false);

 	for (i = 0; i < ctl_stats.narenas; i++) {
 		bool initialized = (tarenas[i] != NULL);
@@ -960,7 +968,7 @@ ctl_bymib(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,
 		assert(node->nchildren > 0);
 		if (ctl_named_node(node->children) != NULL) {
 			/* Children are named. */
-			if (node->nchildren <= mib[i]) {
+			if (node->nchildren <= (unsigned)mib[i]) {
 				ret = ENOENT;
 				goto label_return;
 			}
@@ -1199,17 +1207,17 @@ label_return: \
 	return (ret); \
 }

-#define CTL_RO_BOOL_CONFIG_GEN(n) \
+#define CTL_RO_CONFIG_GEN(n, t) \
 static int \
 n##_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp, \
     void *newp, size_t newlen) \
 { \
 	int ret; \
-	bool oldval; \
+	t oldval; \
 \
 	READONLY(); \
 	oldval = n; \
-	READ(oldval, bool); \
+	READ(oldval, t); \
 \
 	ret = 0; \
 label_return: \
@@ -1241,28 +1249,31 @@ label_return:

 /******************************************************************************/

-CTL_RO_BOOL_CONFIG_GEN(config_cache_oblivious)
-CTL_RO_BOOL_CONFIG_GEN(config_debug)
-CTL_RO_BOOL_CONFIG_GEN(config_fill)
-CTL_RO_BOOL_CONFIG_GEN(config_lazy_lock)
-CTL_RO_BOOL_CONFIG_GEN(config_munmap)
-CTL_RO_BOOL_CONFIG_GEN(config_prof)
-CTL_RO_BOOL_CONFIG_GEN(config_prof_libgcc)
-CTL_RO_BOOL_CONFIG_GEN(config_prof_libunwind)
-CTL_RO_BOOL_CONFIG_GEN(config_stats)
-CTL_RO_BOOL_CONFIG_GEN(config_tcache)
-CTL_RO_BOOL_CONFIG_GEN(config_tls)
-CTL_RO_BOOL_CONFIG_GEN(config_utrace)
-CTL_RO_BOOL_CONFIG_GEN(config_valgrind)
-CTL_RO_BOOL_CONFIG_GEN(config_xmalloc)
+CTL_RO_CONFIG_GEN(config_cache_oblivious, bool)
+CTL_RO_CONFIG_GEN(config_debug, bool)
+CTL_RO_CONFIG_GEN(config_fill, bool)
+CTL_RO_CONFIG_GEN(config_lazy_lock, bool)
+CTL_RO_CONFIG_GEN(config_malloc_conf, const char *)
+CTL_RO_CONFIG_GEN(config_munmap, bool)
+CTL_RO_CONFIG_GEN(config_prof, bool)
+CTL_RO_CONFIG_GEN(config_prof_libgcc, bool)
+CTL_RO_CONFIG_GEN(config_prof_libunwind, bool)
+CTL_RO_CONFIG_GEN(config_stats, bool)
+CTL_RO_CONFIG_GEN(config_tcache, bool)
+CTL_RO_CONFIG_GEN(config_tls, bool)
+CTL_RO_CONFIG_GEN(config_utrace, bool)
+CTL_RO_CONFIG_GEN(config_valgrind, bool)
+CTL_RO_CONFIG_GEN(config_xmalloc, bool)

 /******************************************************************************/

 CTL_RO_NL_GEN(opt_abort, opt_abort, bool)
 CTL_RO_NL_GEN(opt_dss, opt_dss, const char *)
 CTL_RO_NL_GEN(opt_lg_chunk, opt_lg_chunk, size_t)
-CTL_RO_NL_GEN(opt_narenas, opt_narenas, size_t)
+CTL_RO_NL_GEN(opt_narenas, opt_narenas, unsigned)
+CTL_RO_NL_GEN(opt_purge, purge_mode_names[opt_purge], const char *)
 CTL_RO_NL_GEN(opt_lg_dirty_mult, opt_lg_dirty_mult, ssize_t)
+CTL_RO_NL_GEN(opt_decay_time, opt_decay_time, ssize_t)
 CTL_RO_NL_GEN(opt_stats_print, opt_stats_print, bool)
 CTL_RO_NL_CGEN(config_fill, opt_junk, opt_junk, const char *)
 CTL_RO_NL_CGEN(config_fill, opt_quarantine, opt_quarantine, size_t)
@@ -1314,7 +1325,7 @@ thread_arena_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,
 		}

 		/* Initialize arena if necessary. */
-		newarena = arena_get(tsd, newind, true, true);
+		newarena = arena_get(newind, true);
 		if (newarena == NULL) {
 			ret = EAGAIN;
 			goto label_return;
@@ -1536,34 +1547,44 @@ label_return:

 /******************************************************************************/

-/* ctl_mutex must be held during execution of this function. */
 static void
-arena_purge(unsigned arena_ind)
+arena_i_purge(unsigned arena_ind, bool all)
 {
-	tsd_t *tsd;
-	unsigned i;
-	bool refreshed;
-	VARIABLE_ARRAY(arena_t *, tarenas, ctl_stats.narenas);

-	tsd = tsd_fetch();
-	for (i = 0, refreshed = false; i < ctl_stats.narenas; i++) {
-		tarenas[i] = arena_get(tsd, i, false, false);
-		if (tarenas[i] == NULL && !refreshed) {
-			tarenas[i] = arena_get(tsd, i, false, true);
-			refreshed = true;
-		}
-	}
+	malloc_mutex_lock(&ctl_mtx);
+	{
+		unsigned narenas = ctl_stats.narenas;
+
+		if (arena_ind == narenas) {
+			unsigned i;
+			VARIABLE_ARRAY(arena_t *, tarenas, narenas);
+
+			for (i = 0; i < narenas; i++)
+				tarenas[i] = arena_get(i, false);
+
+			/*
+			 * No further need to hold ctl_mtx, since narenas and
+			 * tarenas contain everything needed below.
+			 */
+			malloc_mutex_unlock(&ctl_mtx);
+
+			for (i = 0; i < narenas; i++) {
+				if (tarenas[i] != NULL)
+					arena_purge(tarenas[i], all);
+			}
+		} else {
+			arena_t *tarena;
+
+			assert(arena_ind < narenas);
+
+			tarena = arena_get(arena_ind, false);

-	if (arena_ind == ctl_stats.narenas) {
-		unsigned i;
-		for (i = 0; i < ctl_stats.narenas; i++) {
-			if (tarenas[i] != NULL)
-				arena_purge_all(tarenas[i]);
+			/* No further need to hold ctl_mtx. */
+			malloc_mutex_unlock(&ctl_mtx);
+
+			if (tarena != NULL)
+				arena_purge(tarena, all);
 		}
-	} else {
-		assert(arena_ind < ctl_stats.narenas);
-		if (tarenas[arena_ind] != NULL)
-			arena_purge_all(tarenas[arena_ind]);
 	}
 }

@@ -1575,9 +1596,22 @@ arena_i_purge_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,

 	READONLY();
 	WRITEONLY();
-	malloc_mutex_lock(&ctl_mtx);
-	arena_purge(mib[1]);
-	malloc_mutex_unlock(&ctl_mtx);
+	arena_i_purge((unsigned)mib[1], true);
+
+	ret = 0;
+label_return:
+	return (ret);
+}
+
+static int
+arena_i_decay_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,
+    void *newp, size_t newlen)
+{
+	int ret;
+
+	READONLY();
+	WRITEONLY();
+	arena_i_purge((unsigned)mib[1], false);

 	ret = 0;
 label_return:
@@ -1590,7 +1624,7 @@ arena_i_dss_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,
 {
 	int ret;
 	const char *dss = NULL;
-	unsigned arena_ind = mib[1];
+	unsigned arena_ind = (unsigned)mib[1];
 	dss_prec_t dss_prec_old = dss_prec_limit;
 	dss_prec_t dss_prec = dss_prec_limit;

@@ -1615,7 +1649,7 @@ arena_i_dss_ctl(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp,
 	}

 	if (arena_ind < ctl_stats.narenas) {
-		arena_t *arena = arena_get(tsd_fetch(), arena_ind, false, true);
+		arena_t *arena = arena_get(arena_ind, false);
 		if (arena == NULL || (dss_prec != dss_prec_limit &&
 		    arena_dss_prec_set(arena, dss_prec))) {
 			ret = EFAULT;
@@ -1645,10 +1679,10 @@ arena_i_lg_dirty_mult_ctl(const size_t *mib, size_t miblen, void *oldp,
     size_t *oldlenp, void *newp, size_t newlen)
 {
 	int ret;
-	unsigned arena_ind = mib[1];
+	unsigned arena_ind = (unsigned)mib[1];
 	arena_t *arena;

-	arena = arena_get(tsd_fetch(), arena_ind, false, true);
+	arena = arena_get(arena_ind, false);
 	if (arena == NULL) {
 		ret = EFAULT;
 		goto label_return;
@@ -1675,16 +1709,50 @@ label_return:
 }

 static int
+arena_i_decay_time_ctl(const size_t *mib, size_t miblen, void *oldp,
+    size_t *oldlenp, void *newp, size_t newlen)
+{
+	int ret;
+	unsigned arena_ind = (unsigned)mib[1];
+	arena_t *arena;
+
+	arena = arena_get(arena_ind, false);
+	if (arena == NULL) {
+		ret = EFAULT;
+		goto label_return;
+	}
+
+	if (oldp != NULL && oldlenp != NULL) {
+		size_t oldval = arena_decay_time_get(arena);
+		READ(oldval, ssize_t);
+	}
+	if (newp != NULL) {
+		if (newlen != sizeof(ssize_t)) {
+			ret = EINVAL;
+			goto label_return;
+		}
+		if (arena_decay_time_set(arena, *(ssize_t *)newp)) {
+			ret = EFAULT;
+			goto label_return;
+		}
+	}
+
+	ret = 0;
+label_return:
+	return (ret);
+}
+
+static int
 arena_i_chunk_hooks_ctl(const size_t *mib, size_t miblen, void *oldp,
     size_t *oldlenp, void *newp, size_t newlen)
 {
 	int ret;
-	unsigned arena_ind = mib[1];
+	unsigned arena_ind = (unsigned)mib[1];
 	arena_t *arena;

 	malloc_mutex_lock(&ctl_mtx);
 	if (arena_ind < narenas_total_get() && (arena =
-	    arena_get(tsd_fetch(), arena_ind, false, true)) != NULL) {
+	    arena_get(arena_ind, false)) != NULL) {
 		if (newp != NULL) {
 			chunk_hooks_t old_chunk_hooks, new_chunk_hooks;
 			WRITE(new_chunk_hooks, chunk_hooks_t);
@@ -1758,7 +1826,7 @@ arenas_initialized_ctl(const size_t *mib, size_t miblen, void *oldp,
 	if (*oldlenp != ctl_stats.narenas * sizeof(bool)) {
 		ret = EINVAL;
 		nread = (*oldlenp < ctl_stats.narenas * sizeof(bool))
-		    ? (*oldlenp / sizeof(bool)) : ctl_stats.narenas;
+		    ? (unsigned)(*oldlenp / sizeof(bool)) : ctl_stats.narenas;
 	} else {
 		ret = 0;
 		nread = ctl_stats.narenas;
@@ -1798,6 +1866,32 @@ label_return:
 	return (ret);
 }

+static int
+arenas_decay_time_ctl(const size_t *mib, size_t miblen, void *oldp,
+    size_t *oldlenp, void *newp, size_t newlen)
+{
+	int ret;
+
+	if (oldp != NULL && oldlenp != NULL) {
+		size_t oldval = arena_decay_time_default_get();
+		READ(oldval, ssize_t);
+	}
+	if (newp != NULL) {
+		if (newlen != sizeof(ssize_t)) {
+			ret = EINVAL;
+			goto label_return;
+		}
+		if (arena_decay_time_default_set(*(ssize_t *)newp)) {
+			ret = EFAULT;
+			goto label_return;
+		}
+	}
+
+	ret = 0;
+label_return:
+	return (ret);
+}
+
 CTL_RO_NL_GEN(arenas_quantum, QUANTUM, size_t)
 CTL_RO_NL_GEN(arenas_page, PAGE, size_t)
 CTL_RO_NL_CGEN(config_tcache, arenas_tcache_max, tcache_maxclass, size_t)
@@ -1816,7 +1910,7 @@ arenas_bin_i_index(const size_t *mib, size_t miblen, size_t i)
 }

 CTL_RO_NL_GEN(arenas_nlruns, nlclasses, unsigned)
-CTL_RO_NL_GEN(arenas_lrun_i_size, index2size(NBINS+mib[2]), size_t)
+CTL_RO_NL_GEN(arenas_lrun_i_size, index2size(NBINS+(szind_t)mib[2]), size_t)
 static const ctl_named_node_t *
 arenas_lrun_i_index(const size_t *mib, size_t miblen, size_t i)
 {
@@ -1827,7 +1921,8 @@ arenas_lrun_i_index(const size_t *mib, size_t miblen, size_t i)
 }

 CTL_RO_NL_GEN(arenas_nhchunks, nhclasses, unsigned)
-CTL_RO_NL_GEN(arenas_hchunk_i_size, index2size(NBINS+nlclasses+mib[2]), size_t)
+CTL_RO_NL_GEN(arenas_hchunk_i_size, index2size(NBINS+nlclasses+(szind_t)mib[2]),
+    size_t)
 static const ctl_named_node_t *
 arenas_hchunk_i_index(const size_t *mib, size_t miblen, size_t i)
 {
@@ -1999,6 +2094,8 @@ CTL_RO_CGEN(config_stats, stats_mapped, ctl_stats.mapped, size_t)
 CTL_RO_GEN(stats_arenas_i_dss, ctl_stats.arenas[mib[2]].dss, const char *)
 CTL_RO_GEN(stats_arenas_i_lg_dirty_mult, ctl_stats.arenas[mib[2]].lg_dirty_mult,
     ssize_t)
+CTL_RO_GEN(stats_arenas_i_decay_time, ctl_stats.arenas[mib[2]].decay_time,
+    ssize_t)
 CTL_RO_GEN(stats_arenas_i_nthreads, ctl_stats.arenas[mib[2]].nthreads, unsigned)
 CTL_RO_GEN(stats_arenas_i_pactive, ctl_stats.arenas[mib[2]].pactive, size_t)
 CTL_RO_GEN(stats_arenas_i_pdirty, ctl_stats.arenas[mib[2]].pdirty, size_t)
diff --git a/contrib/jemalloc/src/extent.c b/contrib/jemalloc/src/extent.c
index 13f9441..9f5146e 100644
--- a/contrib/jemalloc/src/extent.c
+++ b/contrib/jemalloc/src/extent.c
@@ -15,7 +15,7 @@ extent_quantize(size_t size)
 }

 JEMALLOC_INLINE_C int
-extent_szad_comp(extent_node_t *a, extent_node_t *b)
+extent_szad_comp(const extent_node_t *a, const extent_node_t *b)
 {
 	int ret;
 	size_t a_qsize = extent_quantize(extent_node_size_get(a));
@@ -41,7 +41,7 @@ rb_gen(, extent_tree_szad_, extent_tree_t, extent_node_t, szad_link,
     extent_szad_comp)

 JEMALLOC_INLINE_C int
-extent_ad_comp(extent_node_t *a, extent_node_t *b)
+extent_ad_comp(const extent_node_t *a, const extent_node_t *b)
 {
 	uintptr_t a_addr = (uintptr_t)extent_node_addr_get(a);
 	uintptr_t b_addr = (uintptr_t)extent_node_addr_get(b);
diff --git a/contrib/jemalloc/src/huge.c b/contrib/jemalloc/src/huge.c
index 1e9a665..5f7ceaf 100644
--- a/contrib/jemalloc/src/huge.c
+++ b/contrib/jemalloc/src/huge.c
@@ -31,35 +31,30 @@ huge_node_unset(const void *ptr, const extent_node_t *node)
 }

 void *
-huge_malloc(tsd_t *tsd, arena_t *arena, size_t size, bool zero,
+huge_malloc(tsd_t *tsd, arena_t *arena, size_t usize, bool zero,
     tcache_t *tcache)
 {
-	size_t usize;

-	usize = s2u(size);
-	if (usize == 0) {
-		/* size_t overflow. */
-		return (NULL);
-	}
+	assert(usize == s2u(usize));

 	return (huge_palloc(tsd, arena, usize, chunksize, zero, tcache));
 }

 void *
-huge_palloc(tsd_t *tsd, arena_t *arena, size_t size, size_t alignment,
+huge_palloc(tsd_t *tsd, arena_t *arena, size_t usize, size_t alignment,
     bool zero, tcache_t *tcache)
 {
 	void *ret;
-	size_t usize;
+	size_t ausize;
 	extent_node_t *node;
 	bool is_zeroed;

 	/* Allocate one or more contiguous chunks for this request. */

-	usize = sa2u(size, alignment);
-	if (unlikely(usize == 0))
+	ausize = sa2u(usize, alignment);
+	if (unlikely(ausize == 0 || ausize > HUGE_MAXCLASS))
 		return (NULL);
-	assert(usize >= chunksize);
+	assert(ausize >= chunksize);

 	/* Allocate an extent node with which to track the chunk. */
 	node = ipallocztm(tsd, CACHELINE_CEILING(sizeof(extent_node_t)),
@@ -74,16 +69,16 @@ huge_palloc(tsd_t *tsd, arena_t *arena, size_t size, size_t alignment,
 	is_zeroed = zero;
 	arena = arena_choose(tsd, arena);
 	if (unlikely(arena == NULL) || (ret = arena_chunk_alloc_huge(arena,
-	    size, alignment, &is_zeroed)) == NULL) {
-		idalloctm(tsd, node, tcache, true);
+	    usize, alignment, &is_zeroed)) == NULL) {
+		idalloctm(tsd, node, tcache, true, true);
 		return (NULL);
 	}

-	extent_node_init(node, arena, ret, size, is_zeroed, true);
+	extent_node_init(node, arena, ret, usize, is_zeroed, true);

 	if (huge_node_set(ret, node)) {
-		arena_chunk_dalloc_huge(arena, ret, size);
-		idalloctm(tsd, node, tcache, true);
+		arena_chunk_dalloc_huge(arena, ret, usize);
+		idalloctm(tsd, node, tcache, true, true);
 		return (NULL);
 	}

@@ -95,10 +90,11 @@ huge_palloc(tsd_t *tsd, arena_t *arena, size_t size, size_t alignment,

 	if (zero || (config_fill && unlikely(opt_zero))) {
 		if (!is_zeroed)
-			memset(ret, 0, size);
+			memset(ret, 0, usize);
 	} else if (config_fill && unlikely(opt_junk_alloc))
-		memset(ret, 0xa5, size);
+		memset(ret, 0xa5, usize);

+	arena_decay_tick(tsd, arena);
 	return (ret);
 }

@@ -280,11 +276,13 @@ huge_ralloc_no_move_expand(void *ptr, size_t oldsize, size_t usize, bool zero) {
 }

 bool
-huge_ralloc_no_move(void *ptr, size_t oldsize, size_t usize_min,
+huge_ralloc_no_move(tsd_t *tsd, void *ptr, size_t oldsize, size_t usize_min,
     size_t usize_max, bool zero)
 {

 	assert(s2u(oldsize) == oldsize);
+	/* The following should have been caught by callers. */
+	assert(usize_min > 0 && usize_max <= HUGE_MAXCLASS);

 	/* Both allocations must be huge to avoid a move. */
 	if (oldsize < chunksize || usize_max < chunksize)
@@ -292,13 +290,18 @@ huge_ralloc_no_move(void *ptr, size_t oldsize, size_t usize_min,

 	if (CHUNK_CEILING(usize_max) > CHUNK_CEILING(oldsize)) {
 		/* Attempt to expand the allocation in-place. */
-		if (!huge_ralloc_no_move_expand(ptr, oldsize, usize_max, zero))
+		if (!huge_ralloc_no_move_expand(ptr, oldsize, usize_max,
+		    zero)) {
+			arena_decay_tick(tsd, huge_aalloc(ptr));
 			return (false);
+		}
 		/* Try again, this time with usize_min. */
 		if (usize_min < usize_max && CHUNK_CEILING(usize_min) >
 		    CHUNK_CEILING(oldsize) && huge_ralloc_no_move_expand(ptr,
-		    oldsize, usize_min, zero))
+		    oldsize, usize_min, zero)) {
+			arena_decay_tick(tsd, huge_aalloc(ptr));
 			return (false);
+		}
 	}

 	/*
@@ -309,12 +312,17 @@ huge_ralloc_no_move(void *ptr, size_t oldsize, size_t usize_min,
 	    && CHUNK_CEILING(oldsize) <= CHUNK_CEILING(usize_max)) {
 		huge_ralloc_no_move_similar(ptr, oldsize, usize_min, usize_max,
 		    zero);
+		arena_decay_tick(tsd, huge_aalloc(ptr));
 		return (false);
 	}

 	/* Attempt to shrink the allocation in-place. */
-	if (CHUNK_CEILING(oldsize) > CHUNK_CEILING(usize_max))
-		return (huge_ralloc_no_move_shrink(ptr, oldsize, usize_max));
+	if (CHUNK_CEILING(oldsize) > CHUNK_CEILING(usize_max)) {
+		if (!huge_ralloc_no_move_shrink(ptr, oldsize, usize_max)) {
+			arena_decay_tick(tsd, huge_aalloc(ptr));
+			return (false);
+		}
+	}
 	return (true);
 }

@@ -335,8 +343,11 @@ huge_ralloc(tsd_t *tsd, arena_t *arena, void *ptr, size_t oldsize, size_t usize,
 	void *ret;
 	size_t copysize;

+	/* The following should have been caught by callers. */
+	assert(usize > 0 && usize <= HUGE_MAXCLASS);
+
 	/* Try to avoid moving the allocation. */
-	if (!huge_ralloc_no_move(ptr, oldsize, usize, usize, zero))
+	if (!huge_ralloc_no_move(tsd, ptr, oldsize, usize, usize, zero))
 		return (ptr);

 	/*
@@ -372,7 +383,9 @@ huge_dalloc(tsd_t *tsd, void *ptr, tcache_t *tcache)
 	    extent_node_size_get(node));
 	arena_chunk_dalloc_huge(extent_node_arena_get(node),
 	    extent_node_addr_get(node), extent_node_size_get(node));
-	idalloctm(tsd, node, tcache, true);
+	idalloctm(tsd, node, tcache, true, true);
+
+	arena_decay_tick(tsd, arena);
 }

 arena_t *
diff --git a/contrib/jemalloc/src/jemalloc.c b/contrib/jemalloc/src/jemalloc.c
index b6cbb79..a34b85c 100644
--- a/contrib/jemalloc/src/jemalloc.c
+++ b/contrib/jemalloc/src/jemalloc.c
@@ -44,14 +44,14 @@ bool opt_redzone = false;
 bool opt_utrace = false;
 bool opt_xmalloc = false;
 bool opt_zero = false;
-size_t opt_narenas = 0;
+unsigned opt_narenas = 0;

 /* Initialized to true if the process is running inside Valgrind. */
 bool in_valgrind;

 unsigned ncpus;

-/* Protects arenas initialization (arenas, narenas_total). */
+/* Protects arenas initialization. */
 static malloc_mutex_t arenas_lock;
 /*
  * Arenas that are used to service external requests. Not all elements of the
@@ -61,8 +61,8 @@ static malloc_mutex_t arenas_lock;
  * arenas. arenas[narenas_auto..narenas_total) are only used if the application
  * takes some action to create them and allocate from them.
  */
-static arena_t **arenas;
-static unsigned narenas_total;
+arena_t **arenas;
+static unsigned narenas_total; /* Use narenas_total_*(). */
 static arena_t *a0; /* arenas[0]; read-only after initialization. */
 static unsigned narenas_auto; /* Read-only after initialization. */

@@ -74,12 +74,29 @@ typedef enum {
 } malloc_init_t;
 static malloc_init_t malloc_init_state = malloc_init_uninitialized;

+/* 0 should be the common case. Set to true to trigger initialization. */
+static bool malloc_slow = true;
+
+/* When malloc_slow != 0, set the corresponding bits for sanity check. */
+enum {
+	flag_opt_junk_alloc = (1U),
+	flag_opt_junk_free = (1U << 1),
+	flag_opt_quarantine = (1U << 2),
+	flag_opt_zero = (1U << 3),
+	flag_opt_utrace = (1U << 4),
+	flag_in_valgrind = (1U << 5),
+	flag_opt_xmalloc = (1U << 6)
+};
+static uint8_t malloc_slow_flags;
+
+/* Last entry for overflow detection only. */
 JEMALLOC_ALIGNED(CACHELINE)
-const size_t index2size_tab[NSIZES] = {
+const size_t index2size_tab[NSIZES+1] = {
 #define SC(index, lg_grp, lg_delta, ndelta, bin, lg_delta_lookup) \
 	((ZU(1)<<lg_grp) + (ZU(ndelta)<<lg_delta)),
 	SIZE_CLASSES
 #undef SC
+	ZU(0)
 };

 JEMALLOC_ALIGNED(CACHELINE)
@@ -298,14 +315,6 @@ malloc_init(void)
  * cannot tolerate TLS variable access.
  */

-arena_t *
-a0get(void)
-{
-
-	assert(a0 != NULL);
-	return (a0);
-}
-
 static void *
 a0ialloc(size_t size, bool zero, bool is_metadata)
 {
@@ -313,14 +322,15 @@ a0ialloc(size_t size, bool zero, bool is_metadata)
 	if (unlikely(malloc_init_a0()))
 		return (NULL);

-	return (iallocztm(NULL, size, zero, false, is_metadata, a0get()));
+	return (iallocztm(NULL, size, size2index(size), zero, false,
+	    is_metadata, arena_get(0, false), true));
 }

 static void
 a0idalloc(void *ptr, bool is_metadata)
 {

-	idalloctm(NULL, ptr, false, is_metadata);
+	idalloctm(NULL, ptr, false, is_metadata, true);
 }

 void *
@@ -377,47 +387,59 @@ bootstrap_free(void *ptr)
 	a0idalloc(ptr, false);
 }

+static void
+arena_set(unsigned ind, arena_t *arena)
+{
+
+	atomic_write_p((void **)&arenas[ind], arena);
+}
+
+static void
+narenas_total_set(unsigned narenas)
+{
+
+	atomic_write_u(&narenas_total, narenas);
+}
+
+static void
+narenas_total_inc(void)
+{
+
+	atomic_add_u(&narenas_total, 1);
+}
+
+unsigned
+narenas_total_get(void)
+{
+
+	return (atomic_read_u(&narenas_total));
+}
+
 /* Create a new arena and insert it into the arenas array at index ind. */
 static arena_t *
 arena_init_locked(unsigned ind)
 {
 	arena_t *arena;

-	/* Expand arenas if necessary. */
-	assert(ind <= narenas_total);
+	assert(ind <= narenas_total_get());
 	if (ind > MALLOCX_ARENA_MAX)
 		return (NULL);
-	if (ind == narenas_total) {
-		unsigned narenas_new = narenas_total + 1;
-		arena_t **arenas_new =
-		    (arena_t **)a0malloc(CACHELINE_CEILING(narenas_new *
-		    sizeof(arena_t *)));
-		if (arenas_new == NULL)
-			return (NULL);
-		memcpy(arenas_new, arenas, narenas_total * sizeof(arena_t *));
-		arenas_new[ind] = NULL;
-		/*
-		 * Deallocate only if arenas came from a0malloc() (not
-		 * base_alloc()).
-		 */
-		if (narenas_total != narenas_auto)
-			a0dalloc(arenas);
-		arenas = arenas_new;
-		narenas_total = narenas_new;
-	}
+	if (ind == narenas_total_get())
+		narenas_total_inc();

 	/*
 	 * Another thread may have already initialized arenas[ind] if it's an
 	 * auto arena.
 	 */
-	arena = arenas[ind];
+	arena = arena_get(ind, false);
 	if (arena != NULL) {
 		assert(ind < narenas_auto);
 		return (arena);
 	}

 	/* Actually initialize the arena. */
-	arena = arenas[ind] = arena_new(ind);
+	arena = arena_new(ind);
+	arena_set(ind, arena);
 	return (arena);
 }

@@ -432,145 +454,114 @@ arena_init(unsigned ind)
 	return (arena);
 }

-unsigned
-narenas_total_get(void)
-{
-	unsigned narenas;
-
-	malloc_mutex_lock(&arenas_lock);
-	narenas = narenas_total;
-	malloc_mutex_unlock(&arenas_lock);
-
-	return (narenas);
-}
-
 static void
-arena_bind_locked(tsd_t *tsd, unsigned ind)
+arena_bind(tsd_t *tsd, unsigned ind)
 {
 	arena_t *arena;

-	arena = arenas[ind];
-	arena->nthreads++;
+	arena = arena_get(ind, false);
+	arena_nthreads_inc(arena);

 	if (tsd_nominal(tsd))
 		tsd_arena_set(tsd, arena);
 }

-static void
-arena_bind(tsd_t *tsd, unsigned ind)
-{
-
-	malloc_mutex_lock(&arenas_lock);
-	arena_bind_locked(tsd, ind);
-	malloc_mutex_unlock(&arenas_lock);
-}
-
 void
 arena_migrate(tsd_t *tsd, unsigned oldind, unsigned newind)
 {
 	arena_t *oldarena, *newarena;

-	malloc_mutex_lock(&arenas_lock);
-	oldarena = arenas[oldind];
-	newarena = arenas[newind];
-	oldarena->nthreads--;
-	newarena->nthreads++;
-
malloc_mutex_unlock(&arenas_lock); + oldarena = arena_get(oldind, false); + newarena = arena_get(newind, false); + arena_nthreads_dec(oldarena); + arena_nthreads_inc(newarena); tsd_arena_set(tsd, newarena); } -unsigned -arena_nbound(unsigned ind) -{ - unsigned nthreads; - - malloc_mutex_lock(&arenas_lock); - nthreads = arenas[ind]->nthreads; - malloc_mutex_unlock(&arenas_lock); - return (nthreads); -} - static void arena_unbind(tsd_t *tsd, unsigned ind) { arena_t *arena; - malloc_mutex_lock(&arenas_lock); - arena = arenas[ind]; - arena->nthreads--; - malloc_mutex_unlock(&arenas_lock); + arena = arena_get(ind, false); + arena_nthreads_dec(arena); tsd_arena_set(tsd, NULL); } -arena_t * -arena_get_hard(tsd_t *tsd, unsigned ind, bool init_if_missing) +arena_tdata_t * +arena_tdata_get_hard(tsd_t *tsd, unsigned ind) { - arena_t *arena; - arena_t **arenas_cache = tsd_arenas_cache_get(tsd); - unsigned narenas_cache = tsd_narenas_cache_get(tsd); + arena_tdata_t *tdata, *arenas_tdata_old; + arena_tdata_t *arenas_tdata = tsd_arenas_tdata_get(tsd); + unsigned narenas_tdata_old, i; + unsigned narenas_tdata = tsd_narenas_tdata_get(tsd); unsigned narenas_actual = narenas_total_get(); - /* Deallocate old cache if it's too small. */ - if (arenas_cache != NULL && narenas_cache < narenas_actual) { - a0dalloc(arenas_cache); - arenas_cache = NULL; - narenas_cache = 0; - tsd_arenas_cache_set(tsd, arenas_cache); - tsd_narenas_cache_set(tsd, narenas_cache); - } - - /* Allocate cache if it's missing. */ - if (arenas_cache == NULL) { - bool *arenas_cache_bypassp = tsd_arenas_cache_bypassp_get(tsd); - assert(ind < narenas_actual || !init_if_missing); - narenas_cache = (ind < narenas_actual) ? 
narenas_actual : ind+1; - - if (tsd_nominal(tsd) && !*arenas_cache_bypassp) { - *arenas_cache_bypassp = true; - arenas_cache = (arena_t **)a0malloc(sizeof(arena_t *) * - narenas_cache); - *arenas_cache_bypassp = false; + /* + * Dissociate old tdata array (and set up for deallocation upon return) + * if it's too small. + */ + if (arenas_tdata != NULL && narenas_tdata < narenas_actual) { + arenas_tdata_old = arenas_tdata; + narenas_tdata_old = narenas_tdata; + arenas_tdata = NULL; + narenas_tdata = 0; + tsd_arenas_tdata_set(tsd, arenas_tdata); + tsd_narenas_tdata_set(tsd, narenas_tdata); + } else { + arenas_tdata_old = NULL; + narenas_tdata_old = 0; + } + + /* Allocate tdata array if it's missing. */ + if (arenas_tdata == NULL) { + bool *arenas_tdata_bypassp = tsd_arenas_tdata_bypassp_get(tsd); + narenas_tdata = (ind < narenas_actual) ? narenas_actual : ind+1; + + if (tsd_nominal(tsd) && !*arenas_tdata_bypassp) { + *arenas_tdata_bypassp = true; + arenas_tdata = (arena_tdata_t *)a0malloc( + sizeof(arena_tdata_t) * narenas_tdata); + *arenas_tdata_bypassp = false; } - if (arenas_cache == NULL) { - /* - * This function must always tell the truth, even if - * it's slow, so don't let OOM, thread cleanup (note - * tsd_nominal check), nor recursive allocation - * avoidance (note arenas_cache_bypass check) get in the - * way. - */ - if (ind >= narenas_actual) - return (NULL); - malloc_mutex_lock(&arenas_lock); - arena = arenas[ind]; - malloc_mutex_unlock(&arenas_lock); - return (arena); + if (arenas_tdata == NULL) { + tdata = NULL; + goto label_return; } - assert(tsd_nominal(tsd) && !*arenas_cache_bypassp); - tsd_arenas_cache_set(tsd, arenas_cache); - tsd_narenas_cache_set(tsd, narenas_cache); + assert(tsd_nominal(tsd) && !*arenas_tdata_bypassp); + tsd_arenas_tdata_set(tsd, arenas_tdata); + tsd_narenas_tdata_set(tsd, narenas_tdata); } /* - * Copy to cache. 
It's possible that the actual number of arenas has - * increased since narenas_total_get() was called above, but that causes - * no correctness issues unless two threads concurrently execute the - * arenas.extend mallctl, which we trust mallctl synchronization to + * Copy to tdata array. It's possible that the actual number of arenas + * has increased since narenas_total_get() was called above, but that + * causes no correctness issues unless two threads concurrently execute + * the arenas.extend mallctl, which we trust mallctl synchronization to * prevent. */ - malloc_mutex_lock(&arenas_lock); - memcpy(arenas_cache, arenas, sizeof(arena_t *) * narenas_actual); - malloc_mutex_unlock(&arenas_lock); - if (narenas_cache > narenas_actual) { - memset(&arenas_cache[narenas_actual], 0, sizeof(arena_t *) * - (narenas_cache - narenas_actual)); + + /* Copy/initialize tickers. */ + for (i = 0; i < narenas_actual; i++) { + if (i < narenas_tdata_old) { + ticker_copy(&arenas_tdata[i].decay_ticker, + &arenas_tdata_old[i].decay_ticker); + } else { + ticker_init(&arenas_tdata[i].decay_ticker, + DECAY_NTICKS_PER_UPDATE); + } + } + if (narenas_tdata > narenas_actual) { + memset(&arenas_tdata[narenas_actual], 0, sizeof(arena_tdata_t) + * (narenas_tdata - narenas_actual)); } - /* Read the refreshed cache, and init the arena if necessary. */ - arena = arenas_cache[ind]; - if (init_if_missing && arena == NULL) - arena = arenas_cache[ind] = arena_init(ind); - return (arena); + /* Read the refreshed tdata array. */ + tdata = &arenas_tdata[ind]; +label_return: + if (arenas_tdata_old != NULL) + a0dalloc(arenas_tdata_old); + return (tdata); } /* Slow path, called only by arena_choose(). 
*/
@@ -585,15 +576,16 @@ arena_choose_hard(tsd_t *tsd)
 		choose = 0;
 		first_null = narenas_auto;
 		malloc_mutex_lock(&arenas_lock);
-		assert(a0get() != NULL);
+		assert(arena_get(0, false) != NULL);
 		for (i = 1; i < narenas_auto; i++) {
-			if (arenas[i] != NULL) {
+			if (arena_get(i, false) != NULL) {
 				/*
 				 * Choose the first arena that has the lowest
 				 * number of threads assigned to it.
 				 */
-				if (arenas[i]->nthreads <
-				    arenas[choose]->nthreads)
+				if (arena_nthreads_get(arena_get(i, false)) <
+				    arena_nthreads_get(arena_get(choose,
+				    false)))
 					choose = i;
 			} else if (first_null == narenas_auto) {
 				/*
@@ -609,13 +601,13 @@ arena_choose_hard(tsd_t *tsd)
 			}
 		}
 
-		if (arenas[choose]->nthreads == 0
+		if (arena_nthreads_get(arena_get(choose, false)) == 0
 		    || first_null == narenas_auto) {
 			/*
 			 * Use an unloaded arena, or the least loaded arena if
 			 * all arenas are already initialized.
 			 */
-			ret = arenas[choose];
+			ret = arena_get(choose, false);
 		} else {
 			/* Initialize a new arena. */
 			choose = first_null;
@@ -625,10 +617,10 @@ arena_choose_hard(tsd_t *tsd)
 				return (NULL);
 			}
 		}
-		arena_bind_locked(tsd, choose);
+		arena_bind(tsd, choose);
 		malloc_mutex_unlock(&arenas_lock);
 	} else {
-		ret = a0get();
+		ret = arena_get(0, false);
 		arena_bind(tsd, 0);
 	}
 
@@ -660,26 +652,29 @@ arena_cleanup(tsd_t *tsd)
 }
 
 void
-arenas_cache_cleanup(tsd_t *tsd)
+arenas_tdata_cleanup(tsd_t *tsd)
 {
-	arena_t **arenas_cache;
+	arena_tdata_t *arenas_tdata;
 
-	arenas_cache = tsd_arenas_cache_get(tsd);
-	if (arenas_cache != NULL) {
-		tsd_arenas_cache_set(tsd, NULL);
-		a0dalloc(arenas_cache);
+	/* Prevent tsd->arenas_tdata from being (re)created. */
+	*tsd_arenas_tdata_bypassp_get(tsd) = true;
+
+	arenas_tdata = tsd_arenas_tdata_get(tsd);
+	if (arenas_tdata != NULL) {
+		tsd_arenas_tdata_set(tsd, NULL);
+		a0dalloc(arenas_tdata);
 	}
 }
 
 void
-narenas_cache_cleanup(tsd_t *tsd)
+narenas_tdata_cleanup(tsd_t *tsd)
 {
 
 	/* Do nothing. */
 }
 
 void
-arenas_cache_bypass_cleanup(tsd_t *tsd)
+arenas_tdata_bypass_cleanup(tsd_t *tsd)
 {
 
 	/* Do nothing.
*/
@@ -700,7 +695,7 @@ stats_print_atexit(void)
 		 * continue to allocate.
 		 */
 		for (i = 0, narenas = narenas_total_get(); i < narenas; i++) {
-			arena_t *arena = arenas[i];
+			arena_t *arena = arena_get(i, false);
 			if (arena != NULL) {
 				tcache_t *tcache;
 
@@ -843,6 +838,26 @@ malloc_conf_error(const char *msg, const char *k, size_t klen, const char *v,
 }
 
 static void
+malloc_slow_flag_init(void)
+{
+	/*
+	 * Combine the runtime options into malloc_slow for fast path. Called
+	 * after processing all the options.
+	 */
+	malloc_slow_flags |= (opt_junk_alloc ? flag_opt_junk_alloc : 0)
+	    | (opt_junk_free ? flag_opt_junk_free : 0)
+	    | (opt_quarantine ? flag_opt_quarantine : 0)
+	    | (opt_zero ? flag_opt_zero : 0)
+	    | (opt_utrace ? flag_opt_utrace : 0)
+	    | (opt_xmalloc ? flag_opt_xmalloc : 0);
+
+	if (config_valgrind)
+		malloc_slow_flags |= (in_valgrind ? flag_in_valgrind : 0);
+
+	malloc_slow = (malloc_slow_flags != 0);
+}
+
+static void
 malloc_conf_init(void)
 {
 	unsigned i;
@@ -868,10 +883,13 @@ malloc_conf_init(void)
 			opt_tcache = false;
 	}
 
-	for (i = 0; i < 3; i++) {
+	for (i = 0; i < 4; i++) {
 		/* Get runtime configuration.
*/ switch (i) { case 0: + opts = config_malloc_conf; + break; + case 1: if (je_malloc_conf != NULL) { /* * Use options that were compiled into the @@ -884,8 +902,8 @@ malloc_conf_init(void) opts = buf; } break; - case 1: { - int linklen = 0; + case 2: { + ssize_t linklen = 0; #ifndef _WIN32 int saved_errno = errno; const char *linkname = @@ -911,7 +929,7 @@ malloc_conf_init(void) buf[linklen] = '\0'; opts = buf; break; - } case 2: { + } case 3: { const char *envname = #ifdef JEMALLOC_PREFIX JEMALLOC_CPREFIX"MALLOC_CONF" @@ -958,7 +976,7 @@ malloc_conf_init(void) if (cont) \ continue; \ } -#define CONF_HANDLE_SIZE_T(o, n, min, max, clip) \ +#define CONF_HANDLE_T_U(t, o, n, min, max, clip) \ if (CONF_MATCH(n)) { \ uintmax_t um; \ char *end; \ @@ -972,11 +990,11 @@ malloc_conf_init(void) k, klen, v, vlen); \ } else if (clip) { \ if ((min) != 0 && um < (min)) \ - o = (min); \ + o = (t)(min); \ else if (um > (max)) \ - o = (max); \ + o = (t)(max); \ else \ - o = um; \ + o = (t)um; \ } else { \ if (((min) != 0 && um < (min)) \ || um > (max)) { \ @@ -985,10 +1003,14 @@ malloc_conf_init(void) "conf value", \ k, klen, v, vlen); \ } else \ - o = um; \ + o = (t)um; \ } \ continue; \ } +#define CONF_HANDLE_UNSIGNED(o, n, min, max, clip) \ + CONF_HANDLE_T_U(unsigned, o, n, min, max, clip) +#define CONF_HANDLE_SIZE_T(o, n, min, max, clip) \ + CONF_HANDLE_T_U(size_t, o, n, min, max, clip) #define CONF_HANDLE_SSIZE_T(o, n, min, max) \ if (CONF_MATCH(n)) { \ long l; \ @@ -1056,10 +1078,29 @@ malloc_conf_init(void) } continue; } - CONF_HANDLE_SIZE_T(opt_narenas, "narenas", 1, - SIZE_T_MAX, false) + CONF_HANDLE_UNSIGNED(opt_narenas, "narenas", 1, + UINT_MAX, false) + if (strncmp("purge", k, klen) == 0) { + int i; + bool match = false; + for (i = 0; i < purge_mode_limit; i++) { + if (strncmp(purge_mode_names[i], v, + vlen) == 0) { + opt_purge = (purge_mode_t)i; + match = true; + break; + } + } + if (!match) { + malloc_conf_error("Invalid conf value", + k, klen, v, vlen); + } + 
continue; + } CONF_HANDLE_SSIZE_T(opt_lg_dirty_mult, "lg_dirty_mult", -1, (sizeof(size_t) << 3) - 1) + CONF_HANDLE_SSIZE_T(opt_decay_time, "decay_time", -1, + NSTIME_SEC_MAX); CONF_HANDLE_BOOL(opt_stats_print, "stats_print", true) if (config_fill) { if (CONF_MATCH("junk")) { @@ -1213,7 +1254,8 @@ malloc_init_hard_a0_locked(void) * Create enough scaffolding to allow recursive allocation in * malloc_ncpus(). */ - narenas_total = narenas_auto = 1; + narenas_auto = 1; + narenas_total_set(narenas_auto); arenas = &a0; memset(arenas, 0, sizeof(arena_t *) * narenas_auto); /* @@ -1242,26 +1284,37 @@ malloc_init_hard_a0(void) * * init_lock must be held. */ -static void +static bool malloc_init_hard_recursible(void) { + bool ret = false; malloc_init_state = malloc_init_recursible; malloc_mutex_unlock(&init_lock); + /* LinuxThreads' pthread_setspecific() allocates. */ + if (malloc_tsd_boot0()) { + ret = true; + goto label_return; + } + ncpus = malloc_ncpus(); #if (!defined(JEMALLOC_MUTEX_INIT_CB) && !defined(JEMALLOC_ZONE) \ && !defined(_WIN32) && !defined(__native_client__)) - /* LinuxThreads's pthread_atfork() allocates. */ + /* LinuxThreads' pthread_atfork() allocates. */ if (pthread_atfork(jemalloc_prefork, jemalloc_postfork_parent, jemalloc_postfork_child) != 0) { + ret = true; malloc_write("<jemalloc>: Error in pthread_atfork()\n"); if (opt_abort) abort(); } #endif + +label_return: malloc_mutex_lock(&init_lock); + return (ret); } /* init_lock must be held. */ @@ -1284,30 +1337,26 @@ malloc_init_hard_finish(void) } narenas_auto = opt_narenas; /* - * Make sure that the arenas array can be allocated. In practice, this - * limit is enough to allow the allocator to function, but the ctl - * machinery will fail to allocate memory at far lower limits. + * Limit the number of arenas to the indexing range of MALLOCX_ARENA(). 
*/ - if (narenas_auto > chunksize / sizeof(arena_t *)) { - narenas_auto = chunksize / sizeof(arena_t *); + if (narenas_auto > MALLOCX_ARENA_MAX) { + narenas_auto = MALLOCX_ARENA_MAX; malloc_printf("<jemalloc>: Reducing narenas to limit (%d)\n", narenas_auto); } - narenas_total = narenas_auto; + narenas_total_set(narenas_auto); /* Allocate and initialize arenas. */ - arenas = (arena_t **)base_alloc(sizeof(arena_t *) * narenas_total); + arenas = (arena_t **)base_alloc(sizeof(arena_t *) * + (MALLOCX_ARENA_MAX+1)); if (arenas == NULL) return (true); - /* - * Zero the array. In practice, this should always be pre-zeroed, - * since it was just mmap()ed, but let's be sure. - */ - memset(arenas, 0, sizeof(arena_t *) * narenas_total); /* Copy the pointer to the one arena that was already initialized. */ - arenas[0] = a0; + arena_set(0, a0); malloc_init_state = malloc_init_initialized; + malloc_slow_flag_init(); + return (false); } @@ -1329,17 +1378,17 @@ malloc_init_hard(void) malloc_mutex_unlock(&init_lock); return (true); } - if (malloc_tsd_boot0()) { + + if (malloc_init_hard_recursible()) { malloc_mutex_unlock(&init_lock); return (true); } + if (config_prof && prof_boot2()) { malloc_mutex_unlock(&init_lock); return (true); } - malloc_init_hard_recursible(); - if (malloc_init_hard_finish()) { malloc_mutex_unlock(&init_lock); return (true); @@ -1359,34 +1408,36 @@ malloc_init_hard(void) */ static void * -imalloc_prof_sample(tsd_t *tsd, size_t usize, prof_tctx_t *tctx) +imalloc_prof_sample(tsd_t *tsd, size_t usize, szind_t ind, + prof_tctx_t *tctx, bool slow_path) { void *p; if (tctx == NULL) return (NULL); if (usize <= SMALL_MAXCLASS) { - p = imalloc(tsd, LARGE_MINCLASS); + szind_t ind_large = size2index(LARGE_MINCLASS); + p = imalloc(tsd, LARGE_MINCLASS, ind_large, slow_path); if (p == NULL) return (NULL); arena_prof_promoted(p, usize); } else - p = imalloc(tsd, usize); + p = imalloc(tsd, usize, ind, slow_path); return (p); } JEMALLOC_ALWAYS_INLINE_C void * 
-imalloc_prof(tsd_t *tsd, size_t usize) +imalloc_prof(tsd_t *tsd, size_t usize, szind_t ind, bool slow_path) { void *p; prof_tctx_t *tctx; tctx = prof_alloc_prep(tsd, usize, prof_active_get_unlocked(), true); if (unlikely((uintptr_t)tctx != (uintptr_t)1U)) - p = imalloc_prof_sample(tsd, usize, tctx); + p = imalloc_prof_sample(tsd, usize, ind, tctx, slow_path); else - p = imalloc(tsd, usize); + p = imalloc(tsd, usize, ind, slow_path); if (unlikely(p == NULL)) { prof_alloc_rollback(tsd, tctx, true); return (NULL); @@ -1397,23 +1448,44 @@ imalloc_prof(tsd_t *tsd, size_t usize) } JEMALLOC_ALWAYS_INLINE_C void * -imalloc_body(size_t size, tsd_t **tsd, size_t *usize) +imalloc_body(size_t size, tsd_t **tsd, size_t *usize, bool slow_path) { + szind_t ind; - if (unlikely(malloc_init())) + if (slow_path && unlikely(malloc_init())) return (NULL); *tsd = tsd_fetch(); + ind = size2index(size); + if (unlikely(ind >= NSIZES)) + return (NULL); - if (config_prof && opt_prof) { - *usize = s2u(size); - if (unlikely(*usize == 0)) - return (NULL); - return (imalloc_prof(*tsd, *usize)); + if (config_stats || (config_prof && opt_prof) || (slow_path && + config_valgrind && unlikely(in_valgrind))) { + *usize = index2size(ind); + assert(*usize > 0 && *usize <= HUGE_MAXCLASS); } - if (config_stats || (config_valgrind && unlikely(in_valgrind))) - *usize = s2u(size); - return (imalloc(*tsd, size)); + if (config_prof && opt_prof) + return (imalloc_prof(*tsd, *usize, ind, slow_path)); + + return (imalloc(*tsd, size, ind, slow_path)); +} + +JEMALLOC_ALWAYS_INLINE_C void +imalloc_post_check(void *ret, tsd_t *tsd, size_t usize, bool slow_path) +{ + if (unlikely(ret == NULL)) { + if (slow_path && config_xmalloc && unlikely(opt_xmalloc)) { + malloc_write("<jemalloc>: Error in malloc(): " + "out of memory\n"); + abort(); + } + set_errno(ENOMEM); + } + if (config_stats && likely(ret != NULL)) { + assert(usize == isalloc(ret, config_prof)); + *tsd_thread_allocatedp_get(tsd) += usize; + } } 
JEMALLOC_EXPORT JEMALLOC_ALLOCATOR JEMALLOC_RESTRICT_RETURN @@ -1428,21 +1500,20 @@ je_malloc(size_t size) if (size == 0) size = 1; - ret = imalloc_body(size, &tsd, &usize); - if (unlikely(ret == NULL)) { - if (config_xmalloc && unlikely(opt_xmalloc)) { - malloc_write("<jemalloc>: Error in malloc(): " - "out of memory\n"); - abort(); - } - set_errno(ENOMEM); - } - if (config_stats && likely(ret != NULL)) { - assert(usize == isalloc(ret, config_prof)); - *tsd_thread_allocatedp_get(tsd) += usize; + if (likely(!malloc_slow)) { + /* + * imalloc_body() is inlined so that fast and slow paths are + * generated separately with statically known slow_path. + */ + ret = imalloc_body(size, &tsd, &usize, false); + imalloc_post_check(ret, tsd, usize, false); + } else { + ret = imalloc_body(size, &tsd, &usize, true); + imalloc_post_check(ret, tsd, usize, true); + UTRACE(0, size, ret); + JEMALLOC_VALGRIND_MALLOC(ret != NULL, ret, usize, false); } - UTRACE(0, size, ret); - JEMALLOC_VALGRIND_MALLOC(ret != NULL, ret, usize, false); + return (ret); } @@ -1519,7 +1590,7 @@ imemalign(void **memptr, size_t alignment, size_t size, size_t min_alignment) } usize = sa2u(size, alignment); - if (unlikely(usize == 0)) { + if (unlikely(usize == 0 || usize > HUGE_MAXCLASS)) { result = NULL; goto label_oom; } @@ -1580,34 +1651,35 @@ je_aligned_alloc(size_t alignment, size_t size) } static void * -icalloc_prof_sample(tsd_t *tsd, size_t usize, prof_tctx_t *tctx) +icalloc_prof_sample(tsd_t *tsd, size_t usize, szind_t ind, prof_tctx_t *tctx) { void *p; if (tctx == NULL) return (NULL); if (usize <= SMALL_MAXCLASS) { - p = icalloc(tsd, LARGE_MINCLASS); + szind_t ind_large = size2index(LARGE_MINCLASS); + p = icalloc(tsd, LARGE_MINCLASS, ind_large); if (p == NULL) return (NULL); arena_prof_promoted(p, usize); } else - p = icalloc(tsd, usize); + p = icalloc(tsd, usize, ind); return (p); } JEMALLOC_ALWAYS_INLINE_C void * -icalloc_prof(tsd_t *tsd, size_t usize) +icalloc_prof(tsd_t *tsd, size_t usize, szind_t 
ind) { void *p; prof_tctx_t *tctx; tctx = prof_alloc_prep(tsd, usize, prof_active_get_unlocked(), true); if (unlikely((uintptr_t)tctx != (uintptr_t)1U)) - p = icalloc_prof_sample(tsd, usize, tctx); + p = icalloc_prof_sample(tsd, usize, ind, tctx); else - p = icalloc(tsd, usize); + p = icalloc(tsd, usize, ind); if (unlikely(p == NULL)) { prof_alloc_rollback(tsd, tctx, true); return (NULL); @@ -1625,6 +1697,7 @@ je_calloc(size_t num, size_t size) void *ret; tsd_t *tsd; size_t num_size; + szind_t ind; size_t usize JEMALLOC_CC_SILENCE_INIT(0); if (unlikely(malloc_init())) { @@ -1654,17 +1727,18 @@ je_calloc(size_t num, size_t size) goto label_return; } + ind = size2index(num_size); + if (unlikely(ind >= NSIZES)) { + ret = NULL; + goto label_return; + } if (config_prof && opt_prof) { - usize = s2u(num_size); - if (unlikely(usize == 0)) { - ret = NULL; - goto label_return; - } - ret = icalloc_prof(tsd, usize); + usize = index2size(ind); + ret = icalloc_prof(tsd, usize, ind); } else { if (config_stats || (config_valgrind && unlikely(in_valgrind))) - usize = s2u(num_size); - ret = icalloc(tsd, num_size); + usize = index2size(ind); + ret = icalloc(tsd, num_size, ind); } label_return: @@ -1729,7 +1803,7 @@ irealloc_prof(tsd_t *tsd, void *old_ptr, size_t old_usize, size_t usize) } JEMALLOC_INLINE_C void -ifree(tsd_t *tsd, void *ptr, tcache_t *tcache) +ifree(tsd_t *tsd, void *ptr, tcache_t *tcache, bool slow_path) { size_t usize; UNUSED size_t rzsize JEMALLOC_CC_SILENCE_INIT(0); @@ -1744,10 +1818,15 @@ ifree(tsd_t *tsd, void *ptr, tcache_t *tcache) usize = isalloc(ptr, config_prof); if (config_stats) *tsd_thread_deallocatedp_get(tsd) += usize; - if (config_valgrind && unlikely(in_valgrind)) - rzsize = p2rz(ptr); - iqalloc(tsd, ptr, tcache); - JEMALLOC_VALGRIND_FREE(ptr, rzsize); + + if (likely(!slow_path)) + iqalloc(tsd, ptr, tcache, false); + else { + if (config_valgrind && unlikely(in_valgrind)) + rzsize = p2rz(ptr); + iqalloc(tsd, ptr, tcache, true); + 
JEMALLOC_VALGRIND_FREE(ptr, rzsize); + } } JEMALLOC_INLINE_C void @@ -1784,7 +1863,7 @@ je_realloc(void *ptr, size_t size) /* realloc(ptr, 0) is equivalent to free(ptr). */ UTRACE(ptr, 0, 0); tsd = tsd_fetch(); - ifree(tsd, ptr, tcache_get(tsd, false)); + ifree(tsd, ptr, tcache_get(tsd, false), true); return (NULL); } size = 1; @@ -1801,8 +1880,8 @@ je_realloc(void *ptr, size_t size) if (config_prof && opt_prof) { usize = s2u(size); - ret = unlikely(usize == 0) ? NULL : irealloc_prof(tsd, - ptr, old_usize, usize); + ret = unlikely(usize == 0 || usize > HUGE_MAXCLASS) ? + NULL : irealloc_prof(tsd, ptr, old_usize, usize); } else { if (config_stats || (config_valgrind && unlikely(in_valgrind))) @@ -1811,7 +1890,10 @@ je_realloc(void *ptr, size_t size) } } else { /* realloc(NULL, size) is equivalent to malloc(size). */ - ret = imalloc_body(size, &tsd, &usize); + if (likely(!malloc_slow)) + ret = imalloc_body(size, &tsd, &usize, false); + else + ret = imalloc_body(size, &tsd, &usize, true); } if (unlikely(ret == NULL)) { @@ -1840,7 +1922,10 @@ je_free(void *ptr) UTRACE(ptr, 0, 0); if (likely(ptr != NULL)) { tsd_t *tsd = tsd_fetch(); - ifree(tsd, ptr, tcache_get(tsd, false)); + if (likely(!malloc_slow)) + ifree(tsd, ptr, tcache_get(tsd, false), false); + else + ifree(tsd, ptr, tcache_get(tsd, false), true); } } @@ -1927,7 +2012,8 @@ imallocx_flags_decode_hard(tsd_t *tsd, size_t size, int flags, size_t *usize, *alignment = MALLOCX_ALIGN_GET_SPECIFIED(flags); *usize = sa2u(size, *alignment); } - assert(*usize != 0); + if (unlikely(*usize == 0 || *usize > HUGE_MAXCLASS)) + return (true); *zero = MALLOCX_ZERO_GET(flags); if ((flags & MALLOCX_TCACHE_MASK) != 0) { if ((flags & MALLOCX_TCACHE_MASK) == MALLOCX_TCACHE_NONE) @@ -1938,7 +2024,7 @@ imallocx_flags_decode_hard(tsd_t *tsd, size_t size, int flags, size_t *usize, *tcache = tcache_get(tsd, true); if ((flags & MALLOCX_ARENA_MASK) != 0) { unsigned arena_ind = MALLOCX_ARENA_GET(flags); - *arena = arena_get(tsd, arena_ind, 
true, true); + *arena = arena_get(arena_ind, true); if (unlikely(*arena == NULL)) return (true); } else @@ -1953,7 +2039,8 @@ imallocx_flags_decode(tsd_t *tsd, size_t size, int flags, size_t *usize, if (likely(flags == 0)) { *usize = s2u(size); - assert(*usize != 0); + if (unlikely(*usize == 0 || *usize > HUGE_MAXCLASS)) + return (true); *alignment = 0; *zero = false; *tcache = tcache_get(tsd, true); @@ -1969,12 +2056,15 @@ JEMALLOC_ALWAYS_INLINE_C void * imallocx_flags(tsd_t *tsd, size_t usize, size_t alignment, bool zero, tcache_t *tcache, arena_t *arena) { + szind_t ind; if (unlikely(alignment != 0)) return (ipalloct(tsd, usize, alignment, zero, tcache, arena)); + ind = size2index(usize); + assert(ind < NSIZES); if (unlikely(zero)) - return (icalloct(tsd, usize, tcache, arena)); - return (imalloct(tsd, usize, tcache, arena)); + return (icalloct(tsd, usize, ind, tcache, arena)); + return (imalloct(tsd, usize, ind, tcache, arena)); } static void * @@ -2038,9 +2128,15 @@ imallocx_no_prof(tsd_t *tsd, size_t size, int flags, size_t *usize) arena_t *arena; if (likely(flags == 0)) { - if (config_stats || (config_valgrind && unlikely(in_valgrind))) - *usize = s2u(size); - return (imalloc(tsd, size)); + szind_t ind = size2index(size); + if (unlikely(ind >= NSIZES)) + return (NULL); + if (config_stats || (config_valgrind && + unlikely(in_valgrind))) { + *usize = index2size(ind); + assert(*usize > 0 && *usize <= HUGE_MAXCLASS); + } + return (imalloc(tsd, size, ind, true)); } if (unlikely(imallocx_flags_decode_hard(tsd, size, flags, usize, @@ -2176,7 +2272,7 @@ je_rallocx(void *ptr, size_t size, int flags) if (unlikely((flags & MALLOCX_ARENA_MASK) != 0)) { unsigned arena_ind = MALLOCX_ARENA_GET(flags); - arena = arena_get(tsd, arena_ind, true, true); + arena = arena_get(arena_ind, true); if (unlikely(arena == NULL)) goto label_oom; } else @@ -2196,7 +2292,8 @@ je_rallocx(void *ptr, size_t size, int flags) if (config_prof && opt_prof) { usize = (alignment == 0) ? 
s2u(size) : sa2u(size, alignment); - assert(usize != 0); + if (unlikely(usize == 0 || usize > HUGE_MAXCLASS)) + goto label_oom; p = irallocx_prof(tsd, ptr, old_usize, size, alignment, &usize, zero, tcache, arena); if (unlikely(p == NULL)) @@ -2229,12 +2326,12 @@ label_oom: } JEMALLOC_ALWAYS_INLINE_C size_t -ixallocx_helper(void *ptr, size_t old_usize, size_t size, size_t extra, - size_t alignment, bool zero) +ixallocx_helper(tsd_t *tsd, void *ptr, size_t old_usize, size_t size, + size_t extra, size_t alignment, bool zero) { size_t usize; - if (ixalloc(ptr, old_usize, size, extra, alignment, zero)) + if (ixalloc(tsd, ptr, old_usize, size, extra, alignment, zero)) return (old_usize); usize = isalloc(ptr, config_prof); @@ -2242,14 +2339,15 @@ ixallocx_helper(void *ptr, size_t old_usize, size_t size, size_t extra, } static size_t -ixallocx_prof_sample(void *ptr, size_t old_usize, size_t size, size_t extra, - size_t alignment, bool zero, prof_tctx_t *tctx) +ixallocx_prof_sample(tsd_t *tsd, void *ptr, size_t old_usize, size_t size, + size_t extra, size_t alignment, bool zero, prof_tctx_t *tctx) { size_t usize; if (tctx == NULL) return (old_usize); - usize = ixallocx_helper(ptr, old_usize, size, extra, alignment, zero); + usize = ixallocx_helper(tsd, ptr, old_usize, size, extra, alignment, + zero); return (usize); } @@ -2270,16 +2368,29 @@ ixallocx_prof(tsd_t *tsd, void *ptr, size_t old_usize, size_t size, * prof_alloc_prep() to decide whether to capture a backtrace. * prof_realloc() will use the actual usize to decide whether to sample. */ - usize_max = (alignment == 0) ? 
s2u(size+extra) : sa2u(size+extra, - alignment); - assert(usize_max != 0); + if (alignment == 0) { + usize_max = s2u(size+extra); + assert(usize_max > 0 && usize_max <= HUGE_MAXCLASS); + } else { + usize_max = sa2u(size+extra, alignment); + if (unlikely(usize_max == 0 || usize_max > HUGE_MAXCLASS)) { + /* + * usize_max is out of range, and chances are that + * allocation will fail, but use the maximum possible + * value and carry on with prof_alloc_prep(), just in + * case allocation succeeds. + */ + usize_max = HUGE_MAXCLASS; + } + } tctx = prof_alloc_prep(tsd, usize_max, prof_active, false); + if (unlikely((uintptr_t)tctx != (uintptr_t)1U)) { - usize = ixallocx_prof_sample(ptr, old_usize, size, extra, + usize = ixallocx_prof_sample(tsd, ptr, old_usize, size, extra, alignment, zero, tctx); } else { - usize = ixallocx_helper(ptr, old_usize, size, extra, alignment, - zero); + usize = ixallocx_helper(tsd, ptr, old_usize, size, extra, + alignment, zero); } if (usize == old_usize) { prof_alloc_rollback(tsd, tctx, false); @@ -2309,15 +2420,21 @@ je_xallocx(void *ptr, size_t size, size_t extra, int flags) old_usize = isalloc(ptr, config_prof); - /* Clamp extra if necessary to avoid (size + extra) overflow. */ - if (unlikely(size + extra > HUGE_MAXCLASS)) { - /* Check for size overflow. */ - if (unlikely(size > HUGE_MAXCLASS)) { - usize = old_usize; - goto label_not_resized; - } - extra = HUGE_MAXCLASS - size; + /* + * The API explicitly absolves itself of protecting against (size + + * extra) numerical overflow, but we may need to clamp extra to avoid + * exceeding HUGE_MAXCLASS. + * + * Ordinarily, size limit checking is handled deeper down, but here we + * have to check as part of (size + extra) clamping, since we need the + * clamped value in the above helper functions. 
+ */ + if (unlikely(size > HUGE_MAXCLASS)) { + usize = old_usize; + goto label_not_resized; } + if (unlikely(HUGE_MAXCLASS - size < extra)) + extra = HUGE_MAXCLASS - size; if (config_valgrind && unlikely(in_valgrind)) old_rzsize = u2rz(old_usize); @@ -2326,8 +2443,8 @@ je_xallocx(void *ptr, size_t size, size_t extra, int flags) usize = ixallocx_prof(tsd, ptr, old_usize, size, extra, alignment, zero); } else { - usize = ixallocx_helper(ptr, old_usize, size, extra, alignment, - zero); + usize = ixallocx_helper(tsd, ptr, old_usize, size, extra, + alignment, zero); } if (unlikely(usize == old_usize)) goto label_not_resized; @@ -2379,7 +2496,7 @@ je_dallocx(void *ptr, int flags) tcache = tcache_get(tsd, false); UTRACE(ptr, 0, 0); - ifree(tsd_fetch(), ptr, tcache); + ifree(tsd_fetch(), ptr, tcache, true); } JEMALLOC_ALWAYS_INLINE_C size_t @@ -2391,7 +2508,6 @@ inallocx(size_t size, int flags) usize = s2u(size); else usize = sa2u(size, MALLOCX_ALIGN_GET_SPECIFIED(flags)); - assert(usize != 0); return (usize); } @@ -2424,13 +2540,18 @@ JEMALLOC_EXPORT size_t JEMALLOC_NOTHROW JEMALLOC_ATTR(pure) je_nallocx(size_t size, int flags) { + size_t usize; assert(size != 0); if (unlikely(malloc_init())) return (0); - return (inallocx(size, flags)); + usize = inallocx(size, flags); + if (unlikely(usize > HUGE_MAXCLASS)) + return (0); + + return (usize); } JEMALLOC_EXPORT int JEMALLOC_NOTHROW @@ -2628,7 +2749,7 @@ JEMALLOC_EXPORT void _malloc_prefork(void) #endif { - unsigned i; + unsigned i, narenas; #ifdef JEMALLOC_MUTEX_INIT_CB if (!malloc_initialized()) @@ -2640,9 +2761,11 @@ _malloc_prefork(void) ctl_prefork(); prof_prefork(); malloc_mutex_prefork(&arenas_lock); - for (i = 0; i < narenas_total; i++) { - if (arenas[i] != NULL) - arena_prefork(arenas[i]); + for (i = 0, narenas = narenas_total_get(); i < narenas; i++) { + arena_t *arena; + + if ((arena = arena_get(i, false)) != NULL) + arena_prefork(arena); } chunk_prefork(); base_prefork(); @@ -2656,7 +2779,7 @@ JEMALLOC_EXPORT 
void _malloc_postfork(void) #endif { - unsigned i; + unsigned i, narenas; #ifdef JEMALLOC_MUTEX_INIT_CB if (!malloc_initialized()) @@ -2667,9 +2790,11 @@ _malloc_postfork(void) /* Release all mutexes, now that fork() has completed. */ base_postfork_parent(); chunk_postfork_parent(); - for (i = 0; i < narenas_total; i++) { - if (arenas[i] != NULL) - arena_postfork_parent(arenas[i]); + for (i = 0, narenas = narenas_total_get(); i < narenas; i++) { + arena_t *arena; + + if ((arena = arena_get(i, false)) != NULL) + arena_postfork_parent(arena); } malloc_mutex_postfork_parent(&arenas_lock); prof_postfork_parent(); @@ -2679,16 +2804,18 @@ _malloc_postfork(void) void jemalloc_postfork_child(void) { - unsigned i; + unsigned i, narenas; assert(malloc_initialized()); /* Release all mutexes, now that fork() has completed. */ base_postfork_child(); chunk_postfork_child(); - for (i = 0; i < narenas_total; i++) { - if (arenas[i] != NULL) - arena_postfork_child(arenas[i]); + for (i = 0, narenas = narenas_total_get(); i < narenas; i++) { + arena_t *arena; + + if ((arena = arena_get(i, false)) != NULL) + arena_postfork_child(arena); } malloc_mutex_postfork_child(&arenas_lock); prof_postfork_child(); diff --git a/contrib/jemalloc/src/nstime.c b/contrib/jemalloc/src/nstime.c new file mode 100644 index 0000000..4cf90b5 --- /dev/null +++ b/contrib/jemalloc/src/nstime.c @@ -0,0 +1,148 @@ +#include "jemalloc/internal/jemalloc_internal.h" + +#define BILLION UINT64_C(1000000000) + +void +nstime_init(nstime_t *time, uint64_t ns) +{ + + time->ns = ns; +} + +void +nstime_init2(nstime_t *time, uint64_t sec, uint64_t nsec) +{ + + time->ns = sec * BILLION + nsec; +} + +uint64_t +nstime_ns(const nstime_t *time) +{ + + return (time->ns); +} + +uint64_t +nstime_sec(const nstime_t *time) +{ + + return (time->ns / BILLION); +} + +uint64_t +nstime_nsec(const nstime_t *time) +{ + + return (time->ns % BILLION); +} + +void +nstime_copy(nstime_t *time, const nstime_t *source) +{ + + *time = *source; +} + 
+int +nstime_compare(const nstime_t *a, const nstime_t *b) +{ + + return ((a->ns > b->ns) - (a->ns < b->ns)); +} + +void +nstime_add(nstime_t *time, const nstime_t *addend) +{ + + assert(UINT64_MAX - time->ns >= addend->ns); + + time->ns += addend->ns; +} + +void +nstime_subtract(nstime_t *time, const nstime_t *subtrahend) +{ + + assert(nstime_compare(time, subtrahend) >= 0); + + time->ns -= subtrahend->ns; +} + +void +nstime_imultiply(nstime_t *time, uint64_t multiplier) +{ + + assert((((time->ns | multiplier) & (UINT64_MAX << (sizeof(uint64_t) << + 2))) == 0) || ((time->ns * multiplier) / multiplier == time->ns)); + + time->ns *= multiplier; +} + +void +nstime_idivide(nstime_t *time, uint64_t divisor) +{ + + assert(divisor != 0); + + time->ns /= divisor; +} + +uint64_t +nstime_divide(const nstime_t *time, const nstime_t *divisor) +{ + + assert(divisor->ns != 0); + + return (time->ns / divisor->ns); +} + +#ifdef JEMALLOC_JET +#undef nstime_update +#define nstime_update JEMALLOC_N(nstime_update_impl) +#endif +bool +nstime_update(nstime_t *time) +{ + nstime_t old_time; + + nstime_copy(&old_time, time); + +#ifdef _WIN32 + { + FILETIME ft; + uint64_t ticks; + GetSystemTimeAsFileTime(&ft); + ticks = (((uint64_t)ft.dwHighDateTime) << 32) | + ft.dwLowDateTime; + time->ns = ticks * 100; + } +#elif JEMALLOC_CLOCK_GETTIME + { + struct timespec ts; + + if (sysconf(_SC_MONOTONIC_CLOCK) > 0) + clock_gettime(CLOCK_MONOTONIC, &ts); + else + clock_gettime(CLOCK_REALTIME, &ts); + time->ns = ts.tv_sec * BILLION + ts.tv_nsec; + } +#else + struct timeval tv; + gettimeofday(&tv, NULL); + time->ns = tv.tv_sec * BILLION + tv.tv_usec * 1000; +#endif + + /* Handle non-monotonic clocks. 
*/ + if (unlikely(nstime_compare(&old_time, time) > 0)) { + nstime_copy(time, &old_time); + return (true); + } + + return (false); +} +#ifdef JEMALLOC_JET +#undef nstime_update +#define nstime_update JEMALLOC_N(nstime_update) +nstime_update_t *nstime_update = JEMALLOC_N(nstime_update_impl); +#endif diff --git a/contrib/jemalloc/src/prng.c b/contrib/jemalloc/src/prng.c new file mode 100644 index 0000000..76646a2 --- /dev/null +++ b/contrib/jemalloc/src/prng.c @@ -0,0 +1,2 @@ +#define JEMALLOC_PRNG_C_ +#include "jemalloc/internal/jemalloc_internal.h" diff --git a/contrib/jemalloc/src/prof.c b/contrib/jemalloc/src/prof.c index 5d2b959..b387227 100644 --- a/contrib/jemalloc/src/prof.c +++ b/contrib/jemalloc/src/prof.c @@ -109,7 +109,7 @@ static char prof_dump_buf[ 1 #endif ]; -static unsigned prof_dump_buf_end; +static size_t prof_dump_buf_end; static int prof_dump_fd; /* Do not dump any profiles until bootstrapping is complete. */ @@ -551,9 +551,9 @@ prof_gctx_create(tsd_t *tsd, prof_bt_t *bt) /* * Create a single allocation that has space for vec of length bt->len. */ - prof_gctx_t *gctx = (prof_gctx_t *)iallocztm(tsd, offsetof(prof_gctx_t, - vec) + (bt->len * sizeof(void *)), false, tcache_get(tsd, true), - true, NULL); + size_t size = offsetof(prof_gctx_t, vec) + (bt->len * sizeof(void *)); + prof_gctx_t *gctx = (prof_gctx_t *)iallocztm(tsd, size, + size2index(size), false, tcache_get(tsd, true), true, NULL, true); if (gctx == NULL) return (NULL); gctx->lock = prof_gctx_mutex_choose(); @@ -594,7 +594,7 @@ prof_gctx_try_destroy(tsd_t *tsd, prof_tdata_t *tdata_self, prof_gctx_t *gctx, prof_leave(tsd, tdata_self); /* Destroy gctx. 
*/ malloc_mutex_unlock(gctx->lock); - idalloctm(tsd, gctx, tcache_get(tsd, false), true); + idalloctm(tsd, gctx, tcache_get(tsd, false), true, true); } else { /* * Compensate for increment in prof_tctx_destroy() or @@ -701,7 +701,7 @@ prof_tctx_destroy(tsd_t *tsd, prof_tctx_t *tctx) prof_tdata_destroy(tsd, tdata, false); if (destroy_tctx) - idalloctm(tsd, tctx, tcache_get(tsd, false), true); + idalloctm(tsd, tctx, tcache_get(tsd, false), true, true); } static bool @@ -730,7 +730,8 @@ prof_lookup_global(tsd_t *tsd, prof_bt_t *bt, prof_tdata_t *tdata, if (ckh_insert(tsd, &bt2gctx, btkey.v, gctx.v)) { /* OOM. */ prof_leave(tsd, tdata); - idalloctm(tsd, gctx.v, tcache_get(tsd, false), true); + idalloctm(tsd, gctx.v, tcache_get(tsd, false), true, + true); return (true); } new_gctx = true; @@ -789,8 +790,9 @@ prof_lookup(tsd_t *tsd, prof_bt_t *bt) /* Link a prof_tctx_t into gctx for this thread. */ tcache = tcache_get(tsd, true); - ret.v = iallocztm(tsd, sizeof(prof_tctx_t), false, tcache, true, - NULL); + ret.v = iallocztm(tsd, sizeof(prof_tctx_t), + size2index(sizeof(prof_tctx_t)), false, tcache, true, NULL, + true); if (ret.p == NULL) { if (new_gctx) prof_gctx_try_destroy(tsd, tdata, gctx, tdata); @@ -810,7 +812,7 @@ prof_lookup(tsd_t *tsd, prof_bt_t *bt) if (error) { if (new_gctx) prof_gctx_try_destroy(tsd, tdata, gctx, tdata); - idalloctm(tsd, ret.v, tcache, true); + idalloctm(tsd, ret.v, tcache, true, true); return (NULL); } malloc_mutex_lock(gctx->lock); @@ -869,8 +871,7 @@ prof_sample_threshold_update(prof_tdata_t *tdata) * pp 500 * (http://luc.devroye.org/rnbookindex.html) */ - prng64(r, 53, tdata->prng_state, UINT64_C(6364136223846793005), - UINT64_C(1442695040888963407)); + r = prng_lg_range(&tdata->prng_state, 53); u = (double)r * (1.0/9007199254740992.0L); tdata->bytes_until_sample = (uint64_t)(log(u) / log(1.0 - (1.0 / (double)((uint64_t)1U << lg_prof_sample)))) @@ -988,7 +989,7 @@ prof_dump_close(bool propagate_err) static bool prof_dump_write(bool 
propagate_err, const char *s) { - unsigned i, slen, n; + size_t i, slen, n; cassert(config_prof); @@ -1211,7 +1212,7 @@ prof_gctx_finish(tsd_t *tsd, prof_gctx_tree_t *gctxs) tctx_tree_remove(&gctx->tctxs, to_destroy); idalloctm(tsd, to_destroy, - tcache_get(tsd, false), true); + tcache_get(tsd, false), true, true); } else next = NULL; } while (next != NULL); @@ -1358,6 +1359,7 @@ label_return: return (ret); } +#ifndef _WIN32 JEMALLOC_FORMAT_PRINTF(1, 2) static int prof_open_maps(const char *format, ...) @@ -1373,6 +1375,18 @@ prof_open_maps(const char *format, ...) return (mfd); } +#endif + +static int +prof_getpid(void) +{ + +#ifdef _WIN32 + return (GetCurrentProcessId()); +#else + return (getpid()); +#endif +} static bool prof_dump_maps(bool propagate_err) @@ -1383,9 +1397,11 @@ prof_dump_maps(bool propagate_err) cassert(config_prof); #ifdef __FreeBSD__ mfd = prof_open_maps("/proc/curproc/map"); +#elif defined(_WIN32) + mfd = -1; // Not implemented #else { - int pid = getpid(); + int pid = prof_getpid(); mfd = prof_open_maps("/proc/%d/task/%d/maps", pid, pid); if (mfd == -1) @@ -1554,12 +1570,12 @@ prof_dump_filename(char *filename, char v, uint64_t vseq) /* "<prefix>.<pid>.<seq>.v<vseq>.heap" */ malloc_snprintf(filename, DUMP_FILENAME_BUFSIZE, "%s.%d.%"FMTu64".%c%"FMTu64".heap", - opt_prof_prefix, (int)getpid(), prof_dump_seq, v, vseq); + opt_prof_prefix, prof_getpid(), prof_dump_seq, v, vseq); } else { /* "<prefix>.<pid>.<seq>.<v>.heap" */ malloc_snprintf(filename, DUMP_FILENAME_BUFSIZE, "%s.%d.%"FMTu64".%c.heap", - opt_prof_prefix, (int)getpid(), prof_dump_seq, v); + opt_prof_prefix, prof_getpid(), prof_dump_seq, v); } prof_dump_seq++; } @@ -1714,8 +1730,8 @@ prof_tdata_init_impl(tsd_t *tsd, uint64_t thr_uid, uint64_t thr_discrim, /* Initialize an empty cache for this thread. 
*/ tcache = tcache_get(tsd, true); - tdata = (prof_tdata_t *)iallocztm(tsd, sizeof(prof_tdata_t), false, - tcache, true, NULL); + tdata = (prof_tdata_t *)iallocztm(tsd, sizeof(prof_tdata_t), + size2index(sizeof(prof_tdata_t)), false, tcache, true, NULL, true); if (tdata == NULL) return (NULL); @@ -1729,7 +1745,7 @@ prof_tdata_init_impl(tsd_t *tsd, uint64_t thr_uid, uint64_t thr_discrim, if (ckh_new(tsd, &tdata->bt2tctx, PROF_CKH_MINITEMS, prof_bt_hash, prof_bt_keycomp)) { - idalloctm(tsd, tdata, tcache, true); + idalloctm(tsd, tdata, tcache, true, true); return (NULL); } @@ -1784,9 +1800,9 @@ prof_tdata_destroy_locked(tsd_t *tsd, prof_tdata_t *tdata, tcache = tcache_get(tsd, false); if (tdata->thread_name != NULL) - idalloctm(tsd, tdata->thread_name, tcache, true); + idalloctm(tsd, tdata->thread_name, tcache, true, true); ckh_delete(tsd, &tdata->bt2tctx); - idalloctm(tsd, tdata, tcache, true); + idalloctm(tsd, tdata, tcache, true, true); } static void @@ -1947,7 +1963,8 @@ prof_thread_name_alloc(tsd_t *tsd, const char *thread_name) if (size == 1) return (""); - ret = iallocztm(tsd, size, false, tcache_get(tsd, true), true, NULL); + ret = iallocztm(tsd, size, size2index(size), false, tcache_get(tsd, + true), true, NULL, true); if (ret == NULL) return (NULL); memcpy(ret, thread_name, size); @@ -1980,7 +1997,7 @@ prof_thread_name_set(tsd_t *tsd, const char *thread_name) if (tdata->thread_name != NULL) { idalloctm(tsd, tdata->thread_name, tcache_get(tsd, false), - true); + true, true); tdata->thread_name = NULL; } if (strlen(s) > 0) diff --git a/contrib/jemalloc/src/quarantine.c b/contrib/jemalloc/src/quarantine.c index 6c43dfc..ff8801c 100644 --- a/contrib/jemalloc/src/quarantine.c +++ b/contrib/jemalloc/src/quarantine.c @@ -23,12 +23,14 @@ static quarantine_t * quarantine_init(tsd_t *tsd, size_t lg_maxobjs) { quarantine_t *quarantine; + size_t size; assert(tsd_nominal(tsd)); - quarantine = (quarantine_t *)iallocztm(tsd, offsetof(quarantine_t, objs) - + ((ZU(1) << 
lg_maxobjs) * sizeof(quarantine_obj_t)), false, - tcache_get(tsd, true), true, NULL); + size = offsetof(quarantine_t, objs) + ((ZU(1) << lg_maxobjs) * + sizeof(quarantine_obj_t)); + quarantine = (quarantine_t *)iallocztm(tsd, size, size2index(size), + false, tcache_get(tsd, true), true, NULL, true); if (quarantine == NULL) return (NULL); quarantine->curbytes = 0; @@ -55,7 +57,7 @@ quarantine_alloc_hook_work(tsd_t *tsd) if (tsd_quarantine_get(tsd) == NULL) tsd_quarantine_set(tsd, quarantine); else - idalloctm(tsd, quarantine, tcache_get(tsd, false), true); + idalloctm(tsd, quarantine, tcache_get(tsd, false), true, true); } static quarantine_t * @@ -87,7 +89,7 @@ quarantine_grow(tsd_t *tsd, quarantine_t *quarantine) memcpy(&ret->objs[ncopy_a], quarantine->objs, ncopy_b * sizeof(quarantine_obj_t)); } - idalloctm(tsd, quarantine, tcache_get(tsd, false), true); + idalloctm(tsd, quarantine, tcache_get(tsd, false), true, true); tsd_quarantine_set(tsd, ret); return (ret); @@ -98,7 +100,7 @@ quarantine_drain_one(tsd_t *tsd, quarantine_t *quarantine) { quarantine_obj_t *obj = &quarantine->objs[quarantine->first]; assert(obj->usize == isalloc(obj->ptr, config_prof)); - idalloctm(tsd, obj->ptr, NULL, false); + idalloctm(tsd, obj->ptr, NULL, false, true); quarantine->curbytes -= obj->usize; quarantine->curobjs--; quarantine->first = (quarantine->first + 1) & ((ZU(1) << @@ -123,7 +125,7 @@ quarantine(tsd_t *tsd, void *ptr) assert(opt_quarantine); if ((quarantine = tsd_quarantine_get(tsd)) == NULL) { - idalloctm(tsd, ptr, NULL, false); + idalloctm(tsd, ptr, NULL, false, true); return; } /* @@ -162,7 +164,7 @@ quarantine(tsd_t *tsd, void *ptr) } } else { assert(quarantine->curbytes == 0); - idalloctm(tsd, ptr, NULL, false); + idalloctm(tsd, ptr, NULL, false, true); } } @@ -177,7 +179,7 @@ quarantine_cleanup(tsd_t *tsd) quarantine = tsd_quarantine_get(tsd); if (quarantine != NULL) { quarantine_drain(tsd, quarantine, 0); - idalloctm(tsd, quarantine, tcache_get(tsd, false), true); + 
idalloctm(tsd, quarantine, tcache_get(tsd, false), true, true); tsd_quarantine_set(tsd, NULL); } } diff --git a/contrib/jemalloc/src/stats.c b/contrib/jemalloc/src/stats.c index 154c3e7..a724947 100644 --- a/contrib/jemalloc/src/stats.c +++ b/contrib/jemalloc/src/stats.c @@ -258,7 +258,7 @@ stats_arena_print(void (*write_cb)(void *, const char *), void *cbopaque, { unsigned nthreads; const char *dss; - ssize_t lg_dirty_mult; + ssize_t lg_dirty_mult, decay_time; size_t page, pactive, pdirty, mapped; size_t metadata_mapped, metadata_allocated; uint64_t npurge, nmadvise, purged; @@ -278,13 +278,23 @@ stats_arena_print(void (*write_cb)(void *, const char *), void *cbopaque, malloc_cprintf(write_cb, cbopaque, "dss allocation precedence: %s\n", dss); CTL_M2_GET("stats.arenas.0.lg_dirty_mult", i, &lg_dirty_mult, ssize_t); - if (lg_dirty_mult >= 0) { - malloc_cprintf(write_cb, cbopaque, - "min active:dirty page ratio: %u:1\n", - (1U << lg_dirty_mult)); - } else { - malloc_cprintf(write_cb, cbopaque, - "min active:dirty page ratio: N/A\n"); + if (opt_purge == purge_mode_ratio) { + if (lg_dirty_mult >= 0) { + malloc_cprintf(write_cb, cbopaque, + "min active:dirty page ratio: %u:1\n", + (1U << lg_dirty_mult)); + } else { + malloc_cprintf(write_cb, cbopaque, + "min active:dirty page ratio: N/A\n"); + } + } + CTL_M2_GET("stats.arenas.0.decay_time", i, &decay_time, ssize_t); + if (opt_purge == purge_mode_decay) { + if (decay_time >= 0) { + malloc_cprintf(write_cb, cbopaque, "decay time: %zd\n", + decay_time); + } else + malloc_cprintf(write_cb, cbopaque, "decay time: N/A\n"); } CTL_M2_GET("stats.arenas.0.pactive", i, &pactive, size_t); CTL_M2_GET("stats.arenas.0.pdirty", i, &pdirty, size_t); @@ -292,9 +302,8 @@ stats_arena_print(void (*write_cb)(void *, const char *), void *cbopaque, CTL_M2_GET("stats.arenas.0.nmadvise", i, &nmadvise, uint64_t); CTL_M2_GET("stats.arenas.0.purged", i, &purged, uint64_t); malloc_cprintf(write_cb, cbopaque, - "dirty pages: %zu:%zu active:dirty, 
%"FMTu64" sweep%s, %"FMTu64 - " madvise%s, %"FMTu64" purged\n", pactive, pdirty, npurge, npurge == - 1 ? "" : "s", nmadvise, nmadvise == 1 ? "" : "s", purged); + "purging: dirty: %zu, sweeps: %"FMTu64", madvises: %"FMTu64", " + "purged: %"FMTu64"\n", pdirty, npurge, nmadvise, purged); malloc_cprintf(write_cb, cbopaque, " allocated nmalloc ndalloc" @@ -426,9 +435,10 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque, bool bv; unsigned uv; ssize_t ssv; - size_t sv, bsz, ssz, sssz, cpsz; + size_t sv, bsz, usz, ssz, sssz, cpsz; bsz = sizeof(bool); + usz = sizeof(unsigned); ssz = sizeof(size_t); sssz = sizeof(ssize_t); cpsz = sizeof(const char *); @@ -438,6 +448,8 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque, CTL_GET("config.debug", &bv, bool); malloc_cprintf(write_cb, cbopaque, "Assertions %s\n", bv ? "enabled" : "disabled"); + malloc_cprintf(write_cb, cbopaque, + "config.malloc_conf: \"%s\"\n", config_malloc_conf); #define OPT_WRITE_BOOL(n) \ if (je_mallctl("opt."#n, &bv, &bsz, NULL, 0) == 0) { \ @@ -453,6 +465,11 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque, : "false", bv2 ? 
"true" : "false"); \ } \ } +#define OPT_WRITE_UNSIGNED(n) \ + if (je_mallctl("opt."#n, &uv, &usz, NULL, 0) == 0) { \ + malloc_cprintf(write_cb, cbopaque, \ + " opt."#n": %zu\n", sv); \ + } #define OPT_WRITE_SIZE_T(n) \ if (je_mallctl("opt."#n, &sv, &ssz, NULL, 0) == 0) { \ malloc_cprintf(write_cb, cbopaque, \ @@ -483,8 +500,14 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque, OPT_WRITE_BOOL(abort) OPT_WRITE_SIZE_T(lg_chunk) OPT_WRITE_CHAR_P(dss) - OPT_WRITE_SIZE_T(narenas) - OPT_WRITE_SSIZE_T_MUTABLE(lg_dirty_mult, arenas.lg_dirty_mult) + OPT_WRITE_UNSIGNED(narenas) + OPT_WRITE_CHAR_P(purge) + if (opt_purge == purge_mode_ratio) { + OPT_WRITE_SSIZE_T_MUTABLE(lg_dirty_mult, + arenas.lg_dirty_mult) + } + if (opt_purge == purge_mode_decay) + OPT_WRITE_SSIZE_T_MUTABLE(decay_time, arenas.decay_time) OPT_WRITE_BOOL(stats_print) OPT_WRITE_CHAR_P(junk) OPT_WRITE_SIZE_T(quarantine) @@ -529,13 +552,22 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque, malloc_cprintf(write_cb, cbopaque, "Page size: %zu\n", sv); CTL_GET("arenas.lg_dirty_mult", &ssv, ssize_t); - if (ssv >= 0) { - malloc_cprintf(write_cb, cbopaque, - "Min active:dirty page ratio per arena: %u:1\n", - (1U << ssv)); - } else { + if (opt_purge == purge_mode_ratio) { + if (ssv >= 0) { + malloc_cprintf(write_cb, cbopaque, + "Min active:dirty page ratio per arena: " + "%u:1\n", (1U << ssv)); + } else { + malloc_cprintf(write_cb, cbopaque, + "Min active:dirty page ratio per arena: " + "N/A\n"); + } + } + CTL_GET("arenas.decay_time", &ssv, ssize_t); + if (opt_purge == purge_mode_decay) { malloc_cprintf(write_cb, cbopaque, - "Min active:dirty page ratio per arena: N/A\n"); + "Unused dirty page decay time: %zd%s\n", + ssv, (ssv < 0) ? 
" (no decay)" : ""); } if (je_mallctl("arenas.tcache_max", &sv, &ssz, NULL, 0) == 0) { malloc_cprintf(write_cb, cbopaque, diff --git a/contrib/jemalloc/src/tcache.c b/contrib/jemalloc/src/tcache.c index fdafd0c..6e32f40 100644 --- a/contrib/jemalloc/src/tcache.c +++ b/contrib/jemalloc/src/tcache.c @@ -10,7 +10,7 @@ ssize_t opt_lg_tcache_max = LG_TCACHE_MAXCLASS_DEFAULT; tcache_bin_info_t *tcache_bin_info; static unsigned stack_nelms; /* Total stack elms per tcache. */ -size_t nhbins; +unsigned nhbins; size_t tcache_maxclass; tcaches_t *tcaches; @@ -67,20 +67,19 @@ tcache_event_hard(tsd_t *tsd, tcache_t *tcache) tcache->next_gc_bin++; if (tcache->next_gc_bin == nhbins) tcache->next_gc_bin = 0; - tcache->ev_cnt = 0; } void * tcache_alloc_small_hard(tsd_t *tsd, arena_t *arena, tcache_t *tcache, - tcache_bin_t *tbin, szind_t binind) + tcache_bin_t *tbin, szind_t binind, bool *tcache_success) { void *ret; - arena_tcache_fill_small(arena, tbin, binind, config_prof ? + arena_tcache_fill_small(tsd, arena, tbin, binind, config_prof ? tcache->prof_accumbytes : 0); if (config_prof) tcache->prof_accumbytes = 0; - ret = tcache_alloc_easy(tbin); + ret = tcache_alloc_easy(tbin, tcache_success); return (ret); } @@ -102,7 +101,7 @@ tcache_bin_flush_small(tsd_t *tsd, tcache_t *tcache, tcache_bin_t *tbin, for (nflush = tbin->ncached - rem; nflush > 0; nflush = ndeferred) { /* Lock the arena bin associated with the first object. 
*/ arena_chunk_t *chunk = (arena_chunk_t *)CHUNK_ADDR2BASE( - tbin->avail[0]); + *(tbin->avail - 1)); arena_t *bin_arena = extent_node_arena_get(&chunk->node); arena_bin_t *bin = &bin_arena->bins[binind]; @@ -122,7 +121,7 @@ tcache_bin_flush_small(tsd_t *tsd, tcache_t *tcache, tcache_bin_t *tbin, } ndeferred = 0; for (i = 0; i < nflush; i++) { - ptr = tbin->avail[i]; + ptr = *(tbin->avail - 1 - i); assert(ptr != NULL); chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); if (extent_node_arena_get(&chunk->node) == bin_arena) { @@ -139,11 +138,12 @@ tcache_bin_flush_small(tsd_t *tsd, tcache_t *tcache, tcache_bin_t *tbin, * locked. Stash the object, so that it can be * handled in a future pass. */ - tbin->avail[ndeferred] = ptr; + *(tbin->avail - 1 - ndeferred) = ptr; ndeferred++; } } malloc_mutex_unlock(&bin->lock); + arena_decay_ticks(tsd, bin_arena, nflush - ndeferred); } if (config_stats && !merged_stats) { /* @@ -158,8 +158,8 @@ tcache_bin_flush_small(tsd_t *tsd, tcache_t *tcache, tcache_bin_t *tbin, malloc_mutex_unlock(&bin->lock); } - memmove(tbin->avail, &tbin->avail[tbin->ncached - rem], - rem * sizeof(void *)); + memmove(tbin->avail - rem, tbin->avail - tbin->ncached, rem * + sizeof(void *)); tbin->ncached = rem; if ((int)tbin->ncached < tbin->low_water) tbin->low_water = tbin->ncached; @@ -182,7 +182,7 @@ tcache_bin_flush_large(tsd_t *tsd, tcache_bin_t *tbin, szind_t binind, for (nflush = tbin->ncached - rem; nflush > 0; nflush = ndeferred) { /* Lock the arena associated with the first object. 
*/ arena_chunk_t *chunk = (arena_chunk_t *)CHUNK_ADDR2BASE( - tbin->avail[0]); + *(tbin->avail - 1)); arena_t *locked_arena = extent_node_arena_get(&chunk->node); UNUSED bool idump; @@ -206,7 +206,7 @@ tcache_bin_flush_large(tsd_t *tsd, tcache_bin_t *tbin, szind_t binind, } ndeferred = 0; for (i = 0; i < nflush; i++) { - ptr = tbin->avail[i]; + ptr = *(tbin->avail - 1 - i); assert(ptr != NULL); chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); if (extent_node_arena_get(&chunk->node) == @@ -220,13 +220,14 @@ tcache_bin_flush_large(tsd_t *tsd, tcache_bin_t *tbin, szind_t binind, * Stash the object, so that it can be handled * in a future pass. */ - tbin->avail[ndeferred] = ptr; + *(tbin->avail - 1 - ndeferred) = ptr; ndeferred++; } } malloc_mutex_unlock(&locked_arena->lock); if (config_prof && idump) prof_idump(); + arena_decay_ticks(tsd, locked_arena, nflush - ndeferred); } if (config_stats && !merged_stats) { /* @@ -241,8 +242,8 @@ tcache_bin_flush_large(tsd_t *tsd, tcache_bin_t *tbin, szind_t binind, malloc_mutex_unlock(&arena->lock); } - memmove(tbin->avail, &tbin->avail[tbin->ncached - rem], - rem * sizeof(void *)); + memmove(tbin->avail - rem, tbin->avail - tbin->ncached, rem * + sizeof(void *)); tbin->ncached = rem; if ((int)tbin->ncached < tbin->low_water) tbin->low_water = tbin->ncached; @@ -324,18 +325,26 @@ tcache_create(tsd_t *tsd, arena_t *arena) /* Avoid false cacheline sharing. */ size = sa2u(size, CACHELINE); - tcache = ipallocztm(tsd, size, CACHELINE, true, false, true, a0get()); + tcache = ipallocztm(tsd, size, CACHELINE, true, false, true, + arena_get(0, false)); if (tcache == NULL) return (NULL); tcache_arena_associate(tcache, arena); + ticker_init(&tcache->gc_ticker, TCACHE_GC_INCR); + assert((TCACHE_NSLOTS_SMALL_MAX & 1U) == 0); for (i = 0; i < nhbins; i++) { tcache->tbins[i].lg_fill_div = 1; + stack_offset += tcache_bin_info[i].ncached_max * sizeof(void *); + /* + * avail points past the available space. 
Allocations will + * access the slots toward higher addresses (for the benefit of + * prefetch). + */ tcache->tbins[i].avail = (void **)((uintptr_t)tcache + (uintptr_t)stack_offset); - stack_offset += tcache_bin_info[i].ncached_max * sizeof(void *); } return (tcache); @@ -379,7 +388,7 @@ tcache_destroy(tsd_t *tsd, tcache_t *tcache) arena_prof_accum(arena, tcache->prof_accumbytes)) prof_idump(); - idalloctm(tsd, tcache, false, true); + idalloctm(tsd, tcache, false, true, true); } void @@ -445,7 +454,7 @@ tcaches_create(tsd_t *tsd, unsigned *r_ind) if (tcaches_avail == NULL && tcaches_past > MALLOCX_TCACHE_MAX) return (true); - tcache = tcache_create(tsd, a0get()); + tcache = tcache_create(tsd, arena_get(0, false)); if (tcache == NULL) return (true); @@ -453,7 +462,7 @@ tcaches_create(tsd_t *tsd, unsigned *r_ind) elm = tcaches_avail; tcaches_avail = tcaches_avail->next; elm->tcache = tcache; - *r_ind = elm - tcaches; + *r_ind = (unsigned)(elm - tcaches); } else { elm = &tcaches[tcaches_past]; elm->tcache = tcache; diff --git a/contrib/jemalloc/src/ticker.c b/contrib/jemalloc/src/ticker.c new file mode 100644 index 0000000..db09024 --- /dev/null +++ b/contrib/jemalloc/src/ticker.c @@ -0,0 +1,2 @@ +#define JEMALLOC_TICKER_C_ +#include "jemalloc/internal/jemalloc_internal.h" diff --git a/contrib/jemalloc/src/tsd.c b/contrib/jemalloc/src/tsd.c index 9ffe9af..34c1573 100644 --- a/contrib/jemalloc/src/tsd.c +++ b/contrib/jemalloc/src/tsd.c @@ -113,7 +113,7 @@ malloc_tsd_boot0(void) ncleanups = 0; if (tsd_boot0()) return (true); - *tsd_arenas_cache_bypassp_get(tsd_fetch()) = true; + *tsd_arenas_tdata_bypassp_get(tsd_fetch()) = true; return (false); } @@ -122,7 +122,7 @@ malloc_tsd_boot1(void) { tsd_boot1(); - *tsd_arenas_cache_bypassp_get(tsd_fetch()) = false; + *tsd_arenas_tdata_bypassp_get(tsd_fetch()) = false; } #ifdef _WIN32 @@ -148,13 +148,15 @@ _tls_callback(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved) #ifdef _MSC_VER # ifdef _M_IX86 # pragma 
comment(linker, "/INCLUDE:__tls_used") +# pragma comment(linker, "/INCLUDE:_tls_callback") # else # pragma comment(linker, "/INCLUDE:_tls_used") +# pragma comment(linker, "/INCLUDE:tls_callback") # endif # pragma section(".CRT$XLY",long,read) #endif JEMALLOC_SECTION(".CRT$XLY") JEMALLOC_ATTR(used) -static BOOL (WINAPI *const tls_callback)(HINSTANCE hinstDLL, +BOOL (WINAPI *const tls_callback)(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved) = _tls_callback; #endif diff --git a/contrib/jemalloc/src/util.c b/contrib/jemalloc/src/util.c index 25b61c2..116e981 100644 --- a/contrib/jemalloc/src/util.c +++ b/contrib/jemalloc/src/util.c @@ -1,3 +1,7 @@ +/* + * Define simple versions of assertion macros that won't recurse in case + * of assertion failures in malloc_*printf(). + */ #define assert(e) do { \ if (config_debug && !(e)) { \ malloc_write("<jemalloc>: Failed assertion\n"); \ @@ -49,10 +53,14 @@ wrtmessage(void *cbopaque, const char *s) * Use syscall(2) rather than write(2) when possible in order to avoid * the possibility of memory allocation within libc. This is necessary * on FreeBSD; most operating systems do not have this problem though. + * + * syscall() returns long or int, depending on platform, so capture the + * unused result in the widest plausible type to avoid compiler + * warnings. 
*/ - UNUSED int result = syscall(SYS_write, STDERR_FILENO, s, strlen(s)); + UNUSED long result = syscall(SYS_write, STDERR_FILENO, s, strlen(s)); #else - UNUSED int result = write(STDERR_FILENO, s, strlen(s)); + UNUSED ssize_t result = write(STDERR_FILENO, s, strlen(s)); #endif } @@ -98,7 +106,7 @@ buferror(int err, char *buf, size_t buflen) #ifdef _WIN32 FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM, NULL, err, 0, - (LPSTR)buf, buflen, NULL); + (LPSTR)buf, (DWORD)buflen, NULL); return (0); #elif defined(__GLIBC__) && defined(_GNU_SOURCE) char *b = strerror_r(err, buf, buflen); @@ -593,7 +601,8 @@ malloc_vsnprintf(char *str, size_t size, const char *format, va_list ap) str[i] = '\0'; else str[size - 1] = '\0'; - ret = i; + assert(i < INT_MAX); + ret = (int)i; #undef APPEND_C #undef APPEND_S @@ -664,3 +673,12 @@ malloc_printf(const char *format, ...) malloc_vcprintf(NULL, NULL, format, ap); va_end(ap); } + +/* + * Restore normal assertion macros, in order to make it possible to compile all + * C files as a single concatenation. 
+ */ +#undef assert +#undef not_reached +#undef not_implemented +#include "jemalloc/internal/assert.h" diff --git a/include/malloc_np.h b/include/malloc_np.h index 24b3148..88919a4 100644 --- a/include/malloc_np.h +++ b/include/malloc_np.h @@ -86,6 +86,13 @@ void __free(void *ptr); int __posix_memalign(void **ptr, size_t alignment, size_t size); void *__aligned_alloc(size_t alignment, size_t size); size_t __malloc_usable_size(const void *ptr); +void __malloc_stats_print(void (*write_cb)(void *, const char *), + void *cbopaque, const char *opts); +int __mallctl(const char *name, void *oldp, size_t *oldlenp, void *newp, + size_t newlen); +int __mallctlnametomib(const char *name, size_t *mibp, size_t *miblenp); +int __mallctlbymib(const size_t *mib, size_t miblen, void *oldp, + size_t *oldlenp, void *newp, size_t newlen); void *__mallocx(size_t size, int flags); void *__rallocx(void *ptr, size_t size, int flags); size_t __xallocx(void *ptr, size_t size, size_t extra, int flags); diff --git a/lib/libc/stdlib/jemalloc/Makefile.inc b/lib/libc/stdlib/jemalloc/Makefile.inc index f322f98..8c4c12a 100644 --- a/lib/libc/stdlib/jemalloc/Makefile.inc +++ b/lib/libc/stdlib/jemalloc/Makefile.inc @@ -4,8 +4,8 @@ JEMALLOCSRCS:= jemalloc.c arena.c atomic.c base.c bitmap.c chunk.c \ chunk_dss.c chunk_mmap.c ckh.c ctl.c extent.c hash.c huge.c mb.c \ - mutex.c pages.c prof.c quarantine.c rtree.c stats.c tcache.c tsd.c \ - util.c + mutex.c nstime.c pages.c prng.c prof.c quarantine.c rtree.c stats.c \ + tcache.c ticker.c tsd.c util.c SYM_MAPS+=${LIBC_SRCTOP}/stdlib/jemalloc/Symbol.map diff --git a/lib/libc/stdlib/jemalloc/Symbol.map b/lib/libc/stdlib/jemalloc/Symbol.map index c073068..087ca53 100644 --- a/lib/libc/stdlib/jemalloc/Symbol.map +++ b/lib/libc/stdlib/jemalloc/Symbol.map @@ -54,6 +54,11 @@ FBSD_1.3 { FBSD_1.4 { sdallocx; __sdallocx; + __aligned_alloc; + __malloc_stats_print; + __mallctl; + __mallctlnametomib; + __mallctlbymib; }; FBSDprivate_1.0 { |
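A note on the `je_xallocx` hunk in `jemalloc.c` above: the old check `size + extra > HUGE_MAXCLASS` could wrap around when `extra` is very large, while the new code first rejects an out-of-range `size` and then compares `HUGE_MAXCLASS - size < extra`, which cannot wrap. The following is a minimal standalone sketch of that pattern, not the jemalloc implementation. `MAX_CLASS` and `clamp_extra` are hypothetical names introduced here for illustration, and the limit value is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for jemalloc's HUGE_MAXCLASS. The real value depends
 * on the build configuration; 1 GiB is an arbitrary choice for this sketch.
 */
#define MAX_CLASS ((size_t)1 << 30)

/*
 * Clamp *extra so that size + *extra never exceeds MAX_CLASS.
 * Returns false when size alone is already out of range, mirroring the
 * early "not resized" exit in the hunk above.
 */
static bool
clamp_extra(size_t size, size_t *extra)
{

	assert(extra != NULL);

	if (size > MAX_CLASS)
		return (false);
	/*
	 * Compare by subtraction: MAX_CLASS - size cannot wrap because
	 * size <= MAX_CLASS was just established, whereas size + *extra
	 * could overflow size_t.
	 */
	if (MAX_CLASS - size < *extra)
		*extra = MAX_CLASS - size;
	return (true);
}
```

With these assumptions, calling `clamp_extra(16, &extra)` with `extra == SIZE_MAX` leaves `extra` at `MAX_CLASS - 16`, where the naive sum `16 + SIZE_MAX` would have wrapped to a small value and slipped past an addition-based check.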