diff options
Diffstat (limited to 'contrib/perl5/pod/perldebguts.pod')
-rw-r--r-- | contrib/perl5/pod/perldebguts.pod | 925 |
1 files changed, 0 insertions, 925 deletions
diff --git a/contrib/perl5/pod/perldebguts.pod b/contrib/perl5/pod/perldebguts.pod deleted file mode 100644 index 20cc546..0000000 --- a/contrib/perl5/pod/perldebguts.pod +++ /dev/null @@ -1,925 +0,0 @@ -=head1 NAME - -perldebguts - Guts of Perl debugging - -=head1 DESCRIPTION - -This is not the perldebug(1) manpage, which tells you how to use -the debugger. This manpage describes low-level details ranging -between difficult and impossible for anyone who isn't incredibly -intimate with Perl's guts to understand. Caveat lector. - -=head1 Debugger Internals - -Perl has special debugging hooks at compile-time and run-time used -to create debugging environments. These hooks are not to be confused -with the I<perl -Dxxx> command described in L<perlrun>, which is -usable only if a special Perl is built per the instructions in the -F<INSTALL> podpage in the Perl source tree. - -For example, whenever you call Perl's built-in C<caller> function -from the package DB, the arguments that the corresponding stack -frame was called with are copied to the @DB::args array. The -general mechanisms is enabled by calling Perl with the B<-d> switch, the -following additional features are enabled (cf. L<perlvar/$^P>): - -=over 4 - -=item * - -Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require -'perl5db.pl'}> if not present) before the first line of your program. - -=item * - -Each array C<@{"_<$filename"}> holds the lines of $filename for a -file compiled by Perl. The same for C<eval>ed strings that contain -subroutines, or which are currently being executed. The $filename -for C<eval>ed strings looks like C<(eval 34)>. Code assertions -in regexes look like C<(re_eval 19)>. - -Values in this array are magical in numeric context: they compare -equal to zero only if the line is not breakable. - -=item * - -Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed -by line number. Individual entries (as opposed to the whole hash) -are settable. Perl only cares about Boolean true here, although -the values used by F<perl5db.pl> have the form -C<"$break_condition\0$action">. - -The same holds for evaluated strings that contain subroutines, or -which are currently being executed. The $filename for C<eval>ed strings -looks like C<(eval 34)> or C<(re_eval 19)>. - -=item * - -Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is -also the case for evaluated strings that contain subroutines, or -which are currently being executed. The $filename for C<eval>ed -strings looks like C<(eval 34)> or C<(re_eval 19)>. - -=item * - -After each C<require>d file is compiled, but before it is executed, -C<DB::postponed(*{"_<$filename"})> is called if the subroutine -C<DB::postponed> exists. Here, the $filename is the expanded name of -the C<require>d file, as found in the values of %INC. - -=item * - -After each subroutine C<subname> is compiled, the existence of -C<$DB::postponed{subname}> is checked. If this key exists, -C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine -also exists. - -=item * - -A hash C<%DB::sub> is maintained, whose keys are subroutine names -and whose values have the form C<filename:startline-endline>. -C<filename> has the form C<(eval 34)> for subroutines defined inside -C<eval>s, or C<(re_eval 19)> for those within regex code assertions. - -=item * - -When the execution of your program reaches a point that can hold a -breakpoint, the C<DB::DB()> subroutine is called any of the variables -$DB::trace, $DB::single, or $DB::signal is true. These variables -are not C<local>izable. This feature is disabled when executing -inside C<DB::DB()>, including functions called from it -unless C<< $^D & (1<<30) >> is true. - -=item * - -When execution of the program reaches a subroutine call, a call to -C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the -name of the called subroutine. This doesn't happen if the subroutine -was compiled in the C<DB> package.) - -=back - -Note that if C<&DB::sub> needs external data for it to work, no -subroutine call is possible until this is done. For the standard -debugger, the C<$DB::deep> variable (how many levels of recursion -deep into the debugger you can go before a mandatory break) gives -an example of such a dependency. - -=head2 Writing Your Own Debugger - -The minimal working debugger consists of one line - - sub DB::DB {} - -which is quite handy as contents of C<PERL5DB> environment -variable: - - $ PERL5DB="sub DB::DB {}" perl -d your-script - -Another brief debugger, slightly more useful, could be created -with only the line: - - sub DB::DB {print ++$i; scalar <STDIN>} - -This debugger would print the sequential number of encountered -statement, and would wait for you to hit a newline before continuing. - -The following debugger is quite functional: - - { - package DB; - sub DB {} - sub sub {print ++$i, " $sub\n"; &$sub} - } - -It prints the sequential number of subroutine call and the name of the -called subroutine. Note that C<&DB::sub> should be compiled into the -package C<DB>. - -At the start, the debugger reads your rc file (F<./.perldb> or -F<~/.perldb> under Unix), which can set important options. This file may -define a subroutine C<&afterinit> to be executed after the debugger is -initialized. - -After the rc file is read, the debugger reads the PERLDB_OPTS -environment variable and parses this as the remainder of a C<O ...> -line as one might enter at the debugger prompt. - -The debugger also maintains magical internal variables, such as -C<@DB::dbline>, C<%DB::dbline>, which are aliases for -C<@{"::_<current_file"}> C<%{"::_<current_file"}>. Here C<current_file> -is the currently selected file, either explicitly chosen with the -debugger's C<f> command, or implicitly by flow of execution. - -Some functions are provided to simplify customization. See -L<perldebug/"Options"> for description of options parsed by -C<DB::parse_options(string)>. The function C<DB::dump_trace(skip[, -count])> skips the specified number of frames and returns a list -containing information about the calling frames (all of them, if -C<count> is missing). Each entry is reference to a hash with -keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine -name, or info about C<eval>), C<args> (C<undef> or a reference to -an array), C<file>, and C<line>. - -The function C<DB::print_trace(FH, skip[, count[, short]])> prints -formatted info about caller frames. The last two functions may be -convenient as arguments to C<< < >>, C<< << >> commands. - -Note that any variables and functions that are not documented in -this manpages (or in L<perldebug>) are considered for internal -use only, and as such are subject to change without notice. - -=head1 Frame Listing Output Examples - -The C<frame> option can be used to control the output of frame -information. For example, contrast this expression trace: - - $ perl -de 42 - Stack dump during die enabled outside of evals. - - Loading DB routines from perl5db.pl patch level 0.94 - Emacs support available. - - Enter h or `h h' for help. - - main::(-e:1): 0 - DB<1> sub foo { 14 } - - DB<2> sub bar { 3 } - - DB<3> t print foo() * bar() - main::((eval 172):3): print foo() + bar(); - main::foo((eval 168):2): - main::bar((eval 170):2): - 42 - -with this one, once the C<O>ption C<frame=2> has been set: - - DB<4> O f=2 - frame = '2' - DB<5> t print foo() * bar() - 3: foo() * bar() - entering main::foo - 2: sub foo { 14 }; - exited main::foo - entering main::bar - 2: sub bar { 3 }; - exited main::bar - 42 - -By way of demonstration, we present below a laborious listing -resulting from setting your C<PERLDB_OPTS> environment variable to -the value C<f=n N>, and running I<perl -d -V> from the command line. -Examples use various values of C<n> are shown to give you a feel -for the difference between settings. Long those it may be, this -is not a complete listing, but only excerpts. - -=over 4 - -=item 1 - - entering main::BEGIN - entering Config::BEGIN - Package lib/Exporter.pm. - Package lib/Carp.pm. - Package lib/Config.pm. - entering Config::TIEHASH - entering Exporter::import - entering Exporter::export - entering Config::myconfig - entering Config::FETCH - entering Config::FETCH - entering Config::FETCH - entering Config::FETCH - -=item 2 - - entering main::BEGIN - entering Config::BEGIN - Package lib/Exporter.pm. - Package lib/Carp.pm. - exited Config::BEGIN - Package lib/Config.pm. - entering Config::TIEHASH - exited Config::TIEHASH - entering Exporter::import - entering Exporter::export - exited Exporter::export - exited Exporter::import - exited main::BEGIN - entering Config::myconfig - entering Config::FETCH - exited Config::FETCH - entering Config::FETCH - exited Config::FETCH - entering Config::FETCH - -=item 4 - - in $=main::BEGIN() from /dev/null:0 - in $=Config::BEGIN() from lib/Config.pm:2 - Package lib/Exporter.pm. - Package lib/Carp.pm. - Package lib/Config.pm. - in $=Config::TIEHASH('Config') from lib/Config.pm:644 - in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li - in @=Config::myconfig() from /dev/null:0 - in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574 - -=item 6 - - in $=main::BEGIN() from /dev/null:0 - in $=Config::BEGIN() from lib/Config.pm:2 - Package lib/Exporter.pm. - Package lib/Carp.pm. - out $=Config::BEGIN() from lib/Config.pm:0 - Package lib/Config.pm. - in $=Config::TIEHASH('Config') from lib/Config.pm:644 - out $=Config::TIEHASH('Config') from lib/Config.pm:644 - in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ - out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ - out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - out $=main::BEGIN() from /dev/null:0 - in @=Config::myconfig() from /dev/null:0 - in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 - out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 - out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 - out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 - in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 - -=item 14 - - in $=main::BEGIN() from /dev/null:0 - in $=Config::BEGIN() from lib/Config.pm:2 - Package lib/Exporter.pm. - Package lib/Carp.pm. - out $=Config::BEGIN() from lib/Config.pm:0 - Package lib/Config.pm. - in $=Config::TIEHASH('Config') from lib/Config.pm:644 - out $=Config::TIEHASH('Config') from lib/Config.pm:644 - in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E - out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E - out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - out $=main::BEGIN() from /dev/null:0 - in @=Config::myconfig() from /dev/null:0 - in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 - out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 - in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 - out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 - -=item 30 - - in $=CODE(0x15eca4)() from /dev/null:0 - in $=CODE(0x182528)() from lib/Config.pm:2 - Package lib/Exporter.pm. - out $=CODE(0x182528)() from lib/Config.pm:0 - scalar context return from CODE(0x182528): undef - Package lib/Config.pm. - in $=Config::TIEHASH('Config') from lib/Config.pm:628 - out $=Config::TIEHASH('Config') from lib/Config.pm:628 - scalar context return from Config::TIEHASH: empty hash - in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 - out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 - scalar context return from Exporter::export: '' - out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 - scalar context return from Exporter::import: '' - -=back - -In all cases shown above, the line indentation shows the call tree. -If bit 2 of C<frame> is set, a line is printed on exit from a -subroutine as well. If bit 4 is set, the arguments are printed -along with the caller info. If bit 8 is set, the arguments are -printed even if they are tied or references. If bit 16 is set, the -return value is printed, too. - -When a package is compiled, a line like this - - Package lib/Carp.pm. - -is printed with proper indentation. - -=head1 Debugging regular expressions - -There are two ways to enable debugging output for regular expressions. - -If your perl is compiled with C<-DDEBUGGING>, you may use the -B<-Dr> flag on the command line. - -Otherwise, one can C<use re 'debug'>, which has effects at -compile time and run time. It is not lexically scoped. - -=head2 Compile-time output - -The debugging output at compile time looks like this: - - compiling RE `[bc]d(ef*g)+h[ij]k$' - size 43 first at 1 - 1: ANYOF(11) - 11: EXACT <d>(13) - 13: CURLYX {1,32767}(27) - 15: OPEN1(17) - 17: EXACT <e>(19) - 19: STAR(22) - 20: EXACT <f>(0) - 22: EXACT <g>(24) - 24: CLOSE1(26) - 26: WHILEM(0) - 27: NOTHING(28) - 28: EXACT <h>(30) - 30: ANYOF(40) - 40: EXACT <k>(42) - 42: EOL(43) - 43: END(0) - anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating) - stclass `ANYOF' minlen 7 - -The first line shows the pre-compiled form of the regex. The second -shows the size of the compiled form (in arbitrary units, usually -4-byte words) and the label I<id> of the first node that does a -match. - -The last line (split into two lines above) contains optimizer -information. In the example shown, the optimizer found that the match -should contain a substring C<de> at offset 1, plus substring C<gh> -at some offset between 3 and infinity. Moreover, when checking for -these substrings (to abandon impossible matches quickly), Perl will check -for the substring C<gh> before checking for the substring C<de>. The -optimizer may also use the knowledge that the match starts (at the -C<first> I<id>) with a character class, and the match cannot be -shorter than 7 chars. - -The fields of interest which may appear in the last line are - -=over 4 - -=item C<anchored> I<STRING> C<at> I<POS> - -=item C<floating> I<STRING> C<at> I<POS1..POS2> - -See above. - -=item C<matching floating/anchored> - -Which substring to check first. - -=item C<minlen> - -The minimal length of the match. - -=item C<stclass> I<TYPE> - -Type of first matching node. - -=item C<noscan> - -Don't scan for the found substrings. - -=item C<isall> - -Means that the optimizer info is all that the regular -expression contains, and thus one does not need to enter the regex engine at -all. - -=item C<GPOS> - -Set if the pattern contains C<\G>. - -=item C<plus> - -Set if the pattern starts with a repeated char (as in C<x+y>). - -=item C<implicit> - -Set if the pattern starts with C<.*>. - -=item C<with eval> - -Set if the pattern contain eval-groups, such as C<(?{ code })> and -C<(??{ code })>. - -=item C<anchored(TYPE)> - -If the pattern may match only at a handful of places, (with C<TYPE> -being C<BOL>, C<MBOL>, or C<GPOS>. See the table below. - -=back - -If a substring is known to match at end-of-line only, it may be -followed by C<$>, as in C<floating `k'$>. - -The optimizer-specific info is used to avoid entering (a slow) regex -engine on strings that will not definitely match. If C<isall> flag -is set, a call to the regex engine may be avoided even when the optimizer -found an appropriate place for the match. - -The rest of the output contains the list of I<nodes> of the compiled -form of the regex. Each line has format - -C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>) - -=head2 Types of nodes - -Here are the possible types, with short descriptions: - - # TYPE arg-description [num-args] [longjump-len] DESCRIPTION - - # Exit points - END no End of program. - SUCCEED no Return from a subroutine, basically. - - # Anchors: - BOL no Match "" at beginning of line. - MBOL no Same, assuming multiline. - SBOL no Same, assuming singleline. - EOS no Match "" at end of string. - EOL no Match "" at end of line. - MEOL no Same, assuming multiline. - SEOL no Same, assuming singleline. - BOUND no Match "" at any word boundary - BOUNDL no Match "" at any word boundary - NBOUND no Match "" at any word non-boundary - NBOUNDL no Match "" at any word non-boundary - GPOS no Matches where last m//g left off. - - # [Special] alternatives - ANY no Match any one character (except newline). - SANY no Match any one character. - ANYOF sv Match character in (or not in) this class. - ALNUM no Match any alphanumeric character - ALNUML no Match any alphanumeric char in locale - NALNUM no Match any non-alphanumeric character - NALNUML no Match any non-alphanumeric char in locale - SPACE no Match any whitespace character - SPACEL no Match any whitespace char in locale - NSPACE no Match any non-whitespace character - NSPACEL no Match any non-whitespace char in locale - DIGIT no Match any numeric character - NDIGIT no Match any non-numeric character - - # BRANCH The set of branches constituting a single choice are hooked - # together with their "next" pointers, since precedence prevents - # anything being concatenated to any individual branch. The - # "next" pointer of the last BRANCH in a choice points to the - # thing following the whole choice. This is also where the - # final "next" pointer of each individual branch points; each - # branch starts with the operand node of a BRANCH node. - # - BRANCH node Match this alternative, or the next... - - # BACK Normal "next" pointers all implicitly point forward; BACK - # exists to make loop structures possible. - # not used - BACK no Match "", "next" ptr points backward. - - # Literals - EXACT sv Match this string (preceded by length). - EXACTF sv Match this string, folded (prec. by length). - EXACTFL sv Match this string, folded in locale (w/len). - - # Do nothing - NOTHING no Match empty string. - # A variant of above which delimits a group, thus stops optimizations - TAIL no Match empty string. Can jump here from outside. - - # STAR,PLUS '?', and complex '*' and '+', are implemented as circular - # BRANCH structures using BACK. Simple cases (one character - # per match) are implemented with STAR and PLUS for speed - # and to minimize recursive plunges. - # - STAR node Match this (simple) thing 0 or more times. - PLUS node Match this (simple) thing 1 or more times. - - CURLY sv 2 Match this simple thing {n,m} times. - CURLYN no 2 Match next-after-this simple thing - # {n,m} times, set parens. - CURLYM no 2 Match this medium-complex thing {n,m} times. - CURLYX sv 2 Match this complex thing {n,m} times. - - # This terminator creates a loop structure for CURLYX - WHILEM no Do curly processing and see if rest matches. - - # OPEN,CLOSE,GROUPP ...are numbered at compile time. - OPEN num 1 Mark this point in input as start of #n. - CLOSE num 1 Analogous to OPEN. - - REF num 1 Match some already matched string - REFF num 1 Match already matched string, folded - REFFL num 1 Match already matched string, folded in loc. - - # grouping assertions - IFMATCH off 1 2 Succeeds if the following matches. - UNLESSM off 1 2 Fails if the following matches. - SUSPEND off 1 1 "Independent" sub-regex. - IFTHEN off 1 1 Switch, should be preceded by switcher . - GROUPP num 1 Whether the group matched. - - # Support for long regex - LONGJMP off 1 1 Jump far away. - BRANCHJ off 1 1 BRANCH with long offset. - - # The heavy worker - EVAL evl 1 Execute some Perl code. - - # Modifiers - MINMOD no Next operator is not greedy. - LOGICAL no Next opcode should set the flag only. - - # This is not used yet - RENUM off 1 1 Group with independently numbered parens. - - # This is not really a node, but an optimized away piece of a "long" node. - # To simplify debugging output, we mark it as if it were a node - OPTIMIZED off Placeholder for dump. - -=head2 Run-time output - -First of all, when doing a match, one may get no run-time output even -if debugging is enabled. This means that the regex engine was never -entered and that all of the job was therefore done by the optimizer. - -If the regex engine was entered, the output may look like this: - - Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__' - Setting an EVAL scope, savestack=3 - 2 <ab> <cdefg__gh_> | 1: ANYOF - 3 <abc> <defg__gh_> | 11: EXACT <d> - 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767} - 4 <abcd> <efg__gh_> | 26: WHILEM - 0 out of 1..32767 cc=effff31c - 4 <abcd> <efg__gh_> | 15: OPEN1 - 4 <abcd> <efg__gh_> | 17: EXACT <e> - 5 <abcde> <fg__gh_> | 19: STAR - EXACT <f> can match 1 times out of 32767... - Setting an EVAL scope, savestack=3 - 6 <bcdef> <g__gh__> | 22: EXACT <g> - 7 <bcdefg> <__gh__> | 24: CLOSE1 - 7 <bcdefg> <__gh__> | 26: WHILEM - 1 out of 1..32767 cc=effff31c - Setting an EVAL scope, savestack=12 - 7 <bcdefg> <__gh__> | 15: OPEN1 - 7 <bcdefg> <__gh__> | 17: EXACT <e> - restoring \1 to 4(4)..7 - failed, try continuation... - 7 <bcdefg> <__gh__> | 27: NOTHING - 7 <bcdefg> <__gh__> | 28: EXACT <h> - failed... - failed... - -The most significant information in the output is about the particular I<node> -of the compiled regex that is currently being tested against the target string. -The format of these lines is - -C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE> - -The I<TYPE> info is indented with respect to the backtracking level. -Other incidental information appears interspersed within. - -=head1 Debugging Perl memory usage - -Perl is a profligate wastrel when it comes to memory use. There -is a saying that to estimate memory usage of Perl, assume a reasonable -algorithm for memory allocation, multiply that estimate by 10, and -while you still may miss the mark, at least you won't be quite so -astonished. This is not absolutely true, but may provide a good -grasp of what happens. - -Assume that an integer cannot take less than 20 bytes of memory, a -float cannot take less than 24 bytes, a string cannot take less -than 32 bytes (all these examples assume 32-bit architectures, the -result are quite a bit worse on 64-bit architectures). If a variable -is accessed in two of three different ways (which require an integer, -a float, or a string), the memory footprint may increase yet another -20 bytes. A sloppy malloc(3) implementation can inflate these -numbers dramatically. - -On the opposite end of the scale, a declaration like - - sub foo; - -may take up to 500 bytes of memory, depending on which release of Perl -you're running. - -Anecdotal estimates of source-to-compiled code bloat suggest an -eightfold increase. This means that the compiled form of reasonable -(normally commented, properly indented etc.) code will take -about eight times more space in memory than the code took -on disk. - -There are two Perl-specific ways to analyze memory usage: -$ENV{PERL_DEBUG_MSTATS} and B<-DL> command-line switch. The first -is available only if Perl is compiled with Perl's malloc(); the -second only if Perl was built with C<-DDEBUGGING>. See the -instructions for how to do this in the F<INSTALL> podpage at -the top level of the Perl source tree. - -=head2 Using C<$ENV{PERL_DEBUG_MSTATS}> - -If your perl is using Perl's malloc() and was compiled with the -necessary switches (this is the default), then it will print memory -usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS} -> 1 >>, and before termination of the program when C<< -$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to -the following example: - - $ PERL_DEBUG_MSTATS=2 perl -e "require Carp" - Memory allocation statistics after compilation: (buckets 4(4)..8188(8192) - 14216 free: 130 117 28 7 9 0 2 2 1 0 0 - 437 61 36 0 5 - 60924 used: 125 137 161 55 7 8 6 16 2 0 1 - 74 109 304 84 20 - Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048. - Memory allocation statistics after execution: (buckets 4(4)..8188(8192) - 30888 free: 245 78 85 13 6 2 1 3 2 0 1 - 315 162 39 42 11 - 175816 used: 265 176 1112 111 26 22 11 27 2 1 1 - 196 178 1066 798 39 - Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144. - -It is possible to ask for such a statistic at arbitrary points in -your execution using the mstat() function out of the standard -Devel::Peek module. - -Here is some explanation of that format: - -=over 4 - -=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)> - -Perl's malloc() uses bucketed allocations. Every request is rounded -up to the closest bucket size available, and a bucket is taken from -the pool of buckets of that size. - -The line above describes the limits of buckets currently in use. -Each bucket has two sizes: memory footprint and the maximal size -of user data that can fit into this bucket. Suppose in the above -example that the smallest bucket were size 4. The biggest bucket -would have usable size 8188, and the memory footprint would be 8192. - -In a Perl built for debugging, some buckets may have negative usable -size. This means that these buckets cannot (and will not) be used. -For larger buckets, the memory footprint may be one page greater -than a power of 2. If so, case the corresponding power of two is -printed in the C<APPROX> field above. - -=item Free/Used - -The 1 or 2 rows of numbers following that correspond to the number -of buckets of each size between C<SMALLEST> and C<GREATEST>. In -the first row, the sizes (memory footprints) of buckets are powers -of two--or possibly one page greater. In the second row, if present, -the memory footprints of the buckets are between the memory footprints -of two buckets "above". - -For example, suppose under the previous example, the memory footprints -were - - free: 8 16 32 64 128 256 512 1024 2048 4096 8192 - 4 12 24 48 80 - -With non-C<DEBUGGING> perl, the buckets starting from C<128> have -a 4-byte overhead, and thus a 8192-long bucket may take up to -8188-byte allocations. - -=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS> - -The first two fields give the total amount of memory perl sbrk(2)ed -(ess-broken? :-) and number of sbrk(2)s used. The third number is -what perl thinks about continuity of returned chunks. So long as -this number is positive, malloc() will assume that it is probable -that sbrk(2) will provide continuous memory. - -Memory allocated by external libraries is not counted. - -=item C<pad: 0> - -The amount of sbrk(2)ed memory needed to keep buckets aligned. - -=item C<heads: 2192> - -Although memory overhead of bigger buckets is kept inside the bucket, for -smaller buckets, it is kept in separate areas. This field gives the -total size of these areas. - -=item C<chain: 0> - -malloc() may want to subdivide a bigger bucket into smaller buckets. -If only a part of the deceased bucket is left unsubdivided, the rest -is kept as an element of a linked list. This field gives the total -size of these chunks. - -=item C<tail: 6144> - -To minimize the number of sbrk(2)s, malloc() asks for more memory. This -field gives the size of the yet unused part, which is sbrk(2)ed, but -never touched. - -=back - -=head2 Example of using B<-DL> switch - -Below we show how to analyse memory usage by - - do 'lib/auto/POSIX/autosplit.ix'; - -The file in question contains a header and 146 lines similar to - - sub getcwd; - -B<WARNING>: The discussion below supposes 32-bit architecture. In -newer releases of Perl, memory usage of the constructs discussed -here is greatly improved, but the story discussed below is a real-life -story. This story is mercilessly terse, and assumes rather more than cursory -knowledge of Perl internals. Type space to continue, `q' to quit. -(Actually, you just want to skip to the next section.) - -Here is the itemized list of Perl allocations performed during parsing -of this file: - - !!! "after" at test.pl line 3. - Id subtot 4 8 12 16 20 24 28 32 36 40 48 56 64 72 80 80+ - 0 02 13752 . . . . 294 . . . . . . . . . . 4 - 0 54 5545 . . 8 124 16 . . . 1 1 . . . . . 3 - 5 05 32 . . . . . . . 1 . . . . . . . . - 6 02 7152 . . . . . . . . . . 149 . . . . . - 7 02 3600 . . . . . 150 . . . . . . . . . . - 7 03 64 . -1 . 1 . . 2 . . . . . . . . . - 7 04 7056 . . . . . . . . . . . . . . . 7 - 7 17 38404 . . . . . . . 1 . . 442 149 . . 147 . - 9 03 2078 17 249 32 . . . . 2 . . . . . . . . - - -To see this list, insert two C<warn('!...')> statements around the call: - - warn('!'); - do 'lib/auto/POSIX/autosplit.ix'; - warn('!!! "after"'); - -and run it with Perl's B<-DL> option. The first warn() will print -memory allocation info before parsing the file and will memorize -the statistics at this point (we ignore what it prints). The second -warn() prints increments with respect to these memorized data. This -is the printout shown above. - -Different I<Id>s on the left correspond to different subsystems of -the perl interpreter. They are just the first argument given to -the perl memory allocation API named New(). To find what C<9 03> -means, just B<grep> the perl source for C<903>. You'll find it in -F<util.c>, function savepvn(). (I know, you wonder why we told you -to B<grep> and then gave away the answer. That's because grepping -the source is good for the soul.) This function is used to store -a copy of an existing chunk of memory. Using a C debugger, one can -see that the function was called either directly from gv_init() or -via sv_magic(), and that gv_init() is called from gv_fetchpv()--which -was itself called from newSUB(). Please stop to catch your breath now. - -B<NOTE>: To reach this point in the debugger and skip the calls to -savepvn() during the compilation of the main program, you should -set a C breakpoint -in Perl_warn(), continue until this point is reached, and I<then> set -a C breakpoint in Perl_savepvn(). Note that you may need to skip a -handful of Perl_savepvn() calls that do not correspond to mass production -of CVs (there are more C<903> allocations than 146 similar lines of -F<lib/auto/POSIX/autosplit.ix>). Note also that C<Perl_> prefixes are -added by macroization code in perl header files to avoid conflicts -with external libraries. - -Anyway, we see that C<903> ids correspond to creation of globs, twice -per glob - for glob name, and glob stringification magic. - -Here are explanations for other I<Id>s above: - -=over 4 - -=item C<717> - -Creates bigger C<XPV*> structures. In the case above, it -creates 3 C<AV>s per subroutine, one for a list of lexical variable -names, one for a scratchpad (which contains lexical variables and -C<targets>), and one for the array of scratchpads needed for -recursion. - -It also creates a C<GV> and a C<CV> per subroutine, all called from -start_subparse(). - -=item C<002> - -Creates a C array corresponding to the C<AV> of scratchpads and the -scratchpad itself. The first fake entry of this scratchpad is -created though the subroutine itself is not defined yet. - -It also creates C arrays to keep data for the stash. This is one HV, -but it grows; thus, there are 4 big allocations: the big chunks are not -freed, but are kept as additional arenas for C<SV> allocations. - -=item C<054> - -Creates a C<HEK> for the name of the glob for the subroutine. This -name is a key in a I<stash>. - -Big allocations with this I<Id> correspond to allocations of new -arenas to keep C<HE>. - -=item C<602> - -Creates a C<GP> for the glob for the subroutine. - -=item C<702> - -Creates the C<MAGIC> for the glob for the subroutine. - -=item C<704> - -Creates I<arenas> which keep SVs. - -=back - -=head2 B<-DL> details - -If Perl is run with B<-DL> option, then warn()s that start with `!' -behave specially. They print a list of I<categories> of memory -allocations, and statistics of allocations of different sizes for -these categories. - -If warn() string starts with - -=over 4 - -=item C<!!!> - -print changed categories only, print the differences in counts of allocations. - -=item C<!!> - -print grown categories only; print the absolute values of counts, and totals. - -=item C<!> - -print nonempty categories, print the absolute values of counts and totals. - -=back - -=head2 Limitations of B<-DL> statistics - -If an extension or external library does not use the Perl API to -allocate memory, such allocations are not counted. - -=head1 SEE ALSO - -L<perldebug>, -L<perlguts>, -L<perlrun> -L<re>, -and -L<Devel::Dprof>. |