summaryrefslogtreecommitdiffstats
path: root/contrib/perl5/pod/perldebguts.pod
diff options
context:
space:
mode:
Diffstat (limited to 'contrib/perl5/pod/perldebguts.pod')
-rw-r--r--contrib/perl5/pod/perldebguts.pod923
1 files changed, 923 insertions, 0 deletions
diff --git a/contrib/perl5/pod/perldebguts.pod b/contrib/perl5/pod/perldebguts.pod
new file mode 100644
index 0000000..b74f3ef
--- /dev/null
+++ b/contrib/perl5/pod/perldebguts.pod
@@ -0,0 +1,923 @@
+=head1 NAME
+
+perldebguts - Guts of Perl debugging
+
+=head1 DESCRIPTION
+
+This is not the perldebug(1) manpage, which tells you how to use
+the debugger. This manpage describes low-level details ranging
+between difficult and impossible for anyone who isn't incredibly
+intimate with Perl's guts to understand. Caveat lector.
+
+=head1 Debugger Internals
+
+Perl has special debugging hooks at compile-time and run-time used
+to create debugging environments. These hooks are not to be confused
+with the I<perl -Dxxx> command described in L<perlrun>, which are
+usable only if a special Perl built per the instructions the
+F<INSTALL> podpage in the Perl source tree.
+
+For example, whenever you call Perl's built-in C<caller> function
+from the package DB, the arguments that the corresponding stack
+frame was called with are copied to the the @DB::args array. The
+general mechanisms is enabled by calling Perl with the B<-d> switch, the
+following additional features are enabled (cf. L<perlvar/$^P>):
+
+=over
+
+=item *
+
+Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
+'perl5db.pl'}> if not present) before the first line of your program.
+
+=item *
+
+The array C<@{"_<$filename"}> holds the lines of $filename for all
+files compiled by Perl. The same for C<eval>ed strings that contain
+subroutines, or which are currently being executed. The $filename
+for C<eval>ed strings looks like C<(eval 34)>. Code assertions
+in regexes look like C<(re_eval 19)>.
+
+=item *
+
+The hash C<%{"_<$filename"}> contains breakpoints and actions keyed
+by line number. Individual entries (as opposed to the whole hash)
+are settable. Perl only cares about Boolean true here, although
+the values used by F<perl5db.pl> have the form
+C<"$break_condition\0$action">. Values in this hash are magical
+in numeric context: they are zeros if the line is not breakable.
+
+The same holds for evaluated strings that contain subroutines, or
+which are currently being executed. The $filename for C<eval>ed strings
+looks like C<(eval 34)> or C<(re_eval 19)>.
+
+=item *
+
+The scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
+also the case for evaluated strings that contain subroutines, or
+which are currently being executed. The $filename for C<eval>ed
+strings looks like C<(eval 34)> or C<(re_eval 19)>.
+
+=item *
+
+After each C<require>d file is compiled, but before it is executed,
+C<DB::postponed(*{"_<$filename"})> is called if the subroutine
+C<DB::postponed> exists. Here, the $filename is the expanded name of
+the C<require>d file, as found in the values of %INC.
+
+=item *
+
+After each subroutine C<subname> is compiled, the existence of
+C<$DB::postponed{subname}> is checked. If this key exists,
+C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
+also exists.
+
+=item *
+
+A hash C<%DB::sub> is maintained, whose keys are subroutine names
+and whose values have the form C<filename:startline-endline>.
+C<filename> has the form C<(eval 34)> for subroutines defined inside
+C<eval>s, or C<(re_eval 19)> for those within regex code assertions.
+
+=item *
+
+When the execution of your program reaches a point that can hold a
+breakpoint, the C<DB::DB()> subroutine is called any of the variables
+$DB::trace, $DB::single, or $DB::signal is true. These variables
+are not C<local>izable. This feature is disabled when executing
+inside C<DB::DB()>, including functions called from it
+unless C<< $^D & (1<<30) >> is true.
+
+=item *
+
+When execution of the program reaches a subroutine call, a call to
+C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
+name of the called subroutine. This doesn't happen if the subroutine
+was compiled in the C<DB> package.)
+
+=back
+
+Note that if C<&DB::sub> needs external data for it to work, no
+subroutine call is possible until this is done. For the standard
+debugger, the C<$DB::deep> variable (how many levels of recursion
+deep into the debugger you can go before a mandatory break) gives
+an example of such a dependency.
+
+=head2 Writing Your Own Debugger
+
+The minimal working debugger consists of one line
+
+ sub DB::DB {}
+
+which is quite handy as contents of C<PERL5DB> environment
+variable:
+
+ $ PERL5DB="sub DB::DB {}" perl -d your-script
+
+Another brief debugger, slightly more useful, could be created
+with only the line:
+
+ sub DB::DB {print ++$i; scalar <STDIN>}
+
+This debugger would print the sequential number of encountered
+statement, and would wait for you to hit a newline before continuing.
+
+The following debugger is quite functional:
+
+ {
+ package DB;
+ sub DB {}
+ sub sub {print ++$i, " $sub\n"; &$sub}
+ }
+
+It prints the sequential number of subroutine call and the name of the
+called subroutine. Note that C<&DB::sub> should be compiled into the
+package C<DB>.
+
+At the start, the debugger reads your rc file (F<./.perldb> or
+F<~/.perldb> under Unix), which can set important options. This file may
+define a subroutine C<&afterinit> to be executed after the debugger is
+initialized.
+
+After the rc file is read, the debugger reads the PERLDB_OPTS
+environment variable and parses this as the remainder of a C<O ...>
+line as one might enter at the debugger prompt.
+
+The debugger also maintains magical internal variables, such as
+C<@DB::dbline>, C<%DB::dbline>, which are aliases for
+C<@{"::_<current_file"}> C<%{"::_<current_file"}>. Here C<current_file>
+is the currently selected file, either explicitly chosen with the
+debugger's C<f> command, or implicitly by flow of execution.
+
+Some functions are provided to simplify customization. See
+L<perldebug/"Options"> for description of options parsed by
+C<DB::parse_options(string)>. The function C<DB::dump_trace(skip[,
+count])> skips the specified number of frames and returns a list
+containing information about the calling frames (all of them, if
+C<count> is missing). Each entry is reference to a a hash with
+keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
+name, or info about C<eval>), C<args> (C<undef> or a reference to
+an array), C<file>, and C<line>.
+
+The function C<DB::print_trace(FH, skip[, count[, short]])> prints
+formatted info about caller frames. The last two functions may be
+convenient as arguments to C<< < >>, C<< << >> commands.
+
+Note that any variables and functions that are not documented in
+this manpages (or in L<perldebug>) are considered for internal
+use only, and as such are subject to change without notice.
+
+=head1 Frame Listing Output Examples
+
+The C<frame> option can be used to control the output of frame
+information. For example, contrast this expression trace:
+
+ $ perl -de 42
+ Stack dump during die enabled outside of evals.
+
+ Loading DB routines from perl5db.pl patch level 0.94
+ Emacs support available.
+
+ Enter h or `h h' for help.
+
+ main::(-e:1): 0
+ DB<1> sub foo { 14 }
+
+ DB<2> sub bar { 3 }
+
+ DB<3> t print foo() * bar()
+ main::((eval 172):3): print foo() + bar();
+ main::foo((eval 168):2):
+ main::bar((eval 170):2):
+ 42
+
+with this one, once the C<O>ption C<frame=2> has been set:
+
+ DB<4> O f=2
+ frame = '2'
+ DB<5> t print foo() * bar()
+ 3: foo() * bar()
+ entering main::foo
+ 2: sub foo { 14 };
+ exited main::foo
+ entering main::bar
+ 2: sub bar { 3 };
+ exited main::bar
+ 42
+
+By way of demonstration, we present below a laborious listing
+resulting from setting your C<PERLDB_OPTS> environment variable to
+the value C<f=n N>, and running I<perl -d -V> from the command line.
+Examples use various values of C<n> are shown to give you a feel
+for the difference between settings. Long those it may be, this
+is not a complete listing, but only excerpts.
+
+=over 4
+
+=item 1
+
+ entering main::BEGIN
+ entering Config::BEGIN
+ Package lib/Exporter.pm.
+ Package lib/Carp.pm.
+ Package lib/Config.pm.
+ entering Config::TIEHASH
+ entering Exporter::import
+ entering Exporter::export
+ entering Config::myconfig
+ entering Config::FETCH
+ entering Config::FETCH
+ entering Config::FETCH
+ entering Config::FETCH
+
+=item 2
+
+ entering main::BEGIN
+ entering Config::BEGIN
+ Package lib/Exporter.pm.
+ Package lib/Carp.pm.
+ exited Config::BEGIN
+ Package lib/Config.pm.
+ entering Config::TIEHASH
+ exited Config::TIEHASH
+ entering Exporter::import
+ entering Exporter::export
+ exited Exporter::export
+ exited Exporter::import
+ exited main::BEGIN
+ entering Config::myconfig
+ entering Config::FETCH
+ exited Config::FETCH
+ entering Config::FETCH
+ exited Config::FETCH
+ entering Config::FETCH
+
+=item 4
+
+ in $=main::BEGIN() from /dev/null:0
+ in $=Config::BEGIN() from lib/Config.pm:2
+ Package lib/Exporter.pm.
+ Package lib/Carp.pm.
+ Package lib/Config.pm.
+ in $=Config::TIEHASH('Config') from lib/Config.pm:644
+ in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
+ in @=Config::myconfig() from /dev/null:0
+ in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
+
+=item 6
+
+ in $=main::BEGIN() from /dev/null:0
+ in $=Config::BEGIN() from lib/Config.pm:2
+ Package lib/Exporter.pm.
+ Package lib/Carp.pm.
+ out $=Config::BEGIN() from lib/Config.pm:0
+ Package lib/Config.pm.
+ in $=Config::TIEHASH('Config') from lib/Config.pm:644
+ out $=Config::TIEHASH('Config') from lib/Config.pm:644
+ in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
+ out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
+ out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ out $=main::BEGIN() from /dev/null:0
+ in @=Config::myconfig() from /dev/null:0
+ in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
+ out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
+ out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
+ out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
+ in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
+
+=item 14
+
+ in $=main::BEGIN() from /dev/null:0
+ in $=Config::BEGIN() from lib/Config.pm:2
+ Package lib/Exporter.pm.
+ Package lib/Carp.pm.
+ out $=Config::BEGIN() from lib/Config.pm:0
+ Package lib/Config.pm.
+ in $=Config::TIEHASH('Config') from lib/Config.pm:644
+ out $=Config::TIEHASH('Config') from lib/Config.pm:644
+ in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
+ out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
+ out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ out $=main::BEGIN() from /dev/null:0
+ in @=Config::myconfig() from /dev/null:0
+ in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
+ out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
+ in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
+ out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
+
+=item 30
+
+ in $=CODE(0x15eca4)() from /dev/null:0
+ in $=CODE(0x182528)() from lib/Config.pm:2
+ Package lib/Exporter.pm.
+ out $=CODE(0x182528)() from lib/Config.pm:0
+ scalar context return from CODE(0x182528): undef
+ Package lib/Config.pm.
+ in $=Config::TIEHASH('Config') from lib/Config.pm:628
+ out $=Config::TIEHASH('Config') from lib/Config.pm:628
+ scalar context return from Config::TIEHASH: empty hash
+ in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
+ out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
+ scalar context return from Exporter::export: ''
+ out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
+ scalar context return from Exporter::import: ''
+
+=back
+
+In all cases shown above, the line indentation shows the call tree.
+If bit 2 of C<frame> is set, a line is printed on exit from a
+subroutine as well. If bit 4 is set, the arguments are printed
+along with the caller info. If bit 8 is set, the arguments are
+printed even if they are tied or references. If bit 16 is set, the
+return value is printed, too.
+
+When a package is compiled, a line like this
+
+ Package lib/Carp.pm.
+
+is printed with proper indentation.
+
+=head1 Debugging regular expressions
+
+There are two ways to enable debugging output for regular expressions.
+
+If your perl is compiled with C<-DDEBUGGING>, you may use the
+B<-Dr> flag on the command line.
+
+Otherwise, one can C<use re 'debug'>, which has effects at
+compile time and run time. It is not lexically scoped.
+
+=head2 Compile-time output
+
+The debugging output at compile time looks like this:
+
+ compiling RE `[bc]d(ef*g)+h[ij]k$'
+ size 43 first at 1
+ 1: ANYOF(11)
+ 11: EXACT <d>(13)
+ 13: CURLYX {1,32767}(27)
+ 15: OPEN1(17)
+ 17: EXACT <e>(19)
+ 19: STAR(22)
+ 20: EXACT <f>(0)
+ 22: EXACT <g>(24)
+ 24: CLOSE1(26)
+ 26: WHILEM(0)
+ 27: NOTHING(28)
+ 28: EXACT <h>(30)
+ 30: ANYOF(40)
+ 40: EXACT <k>(42)
+ 42: EOL(43)
+ 43: END(0)
+ anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
+ stclass `ANYOF' minlen 7
+
+The first line shows the pre-compiled form of the regex. The second
+shows the size of the compiled form (in arbitrary units, usually
+4-byte words) and the label I<id> of the first node that does a
+match.
+
+The last line (split into two lines above) contains optimizer
+information. In the example shown, the optimizer found that the match
+should contain a substring C<de> at offset 1, plus substring C<gh>
+at some offset between 3 and infinity. Moreover, when checking for
+these substrings (to abandon impossible matches quickly), Perl will check
+for the substring C<gh> before checking for the substring C<de>. The
+optimizer may also use the knowledge that the match starts (at the
+C<first> I<id>) with a character class, and the match cannot be
+shorter than 7 chars.
+
+The fields of interest which may appear in the last line are
+
+=over
+
+=item C<anchored> I<STRING> C<at> I<POS>
+
+=item C<floating> I<STRING> C<at> I<POS1..POS2>
+
+See above.
+
+=item C<matching floating/anchored>
+
+Which substring to check first.
+
+=item C<minlen>
+
+The minimal length of the match.
+
+=item C<stclass> I<TYPE>
+
+Type of first matching node.
+
+=item C<noscan>
+
+Don't scan for the found substrings.
+
+=item C<isall>
+
+Means that the optimizer info is all that the regular
+expression contains, and thus one does not need to enter the regex engine at
+all.
+
+=item C<GPOS>
+
+Set if the pattern contains C<\G>.
+
+=item C<plus>
+
+Set if the pattern starts with a repeated char (as in C<x+y>).
+
+=item C<implicit>
+
+Set if the pattern starts with C<.*>.
+
+=item C<with eval>
+
+Set if the pattern contain eval-groups, such as C<(?{ code })> and
+C<(??{ code })>.
+
+=item C<anchored(TYPE)>
+
+If the pattern may match only at a handful of places, (with C<TYPE>
+being C<BOL>, C<MBOL>, or C<GPOS>. See the table below.
+
+=back
+
+If a substring is known to match at end-of-line only, it may be
+followed by C<$>, as in C<floating `k'$>.
+
+The optimizer-specific info is used to avoid entering (a slow) regex
+engine on strings that will not definitely match. If C<isall> flag
+is set, a call to the regex engine may be avoided even when the optimizer
+found an appropriate place for the match.
+
+The rest of the output contains the list of I<nodes> of the compiled
+form of the regex. Each line has format
+
+C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
+
+=head2 Types of nodes
+
+Here are the possible types, with short descriptions:
+
+ # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
+
+ # Exit points
+ END no End of program.
+ SUCCEED no Return from a subroutine, basically.
+
+ # Anchors:
+ BOL no Match "" at beginning of line.
+ MBOL no Same, assuming multiline.
+ SBOL no Same, assuming singleline.
+ EOS no Match "" at end of string.
+ EOL no Match "" at end of line.
+ MEOL no Same, assuming multiline.
+ SEOL no Same, assuming singleline.
+ BOUND no Match "" at any word boundary
+ BOUNDL no Match "" at any word boundary
+ NBOUND no Match "" at any word non-boundary
+ NBOUNDL no Match "" at any word non-boundary
+ GPOS no Matches where last m//g left off.
+
+ # [Special] alternatives
+ ANY no Match any one character (except newline).
+ SANY no Match any one character.
+ ANYOF sv Match character in (or not in) this class.
+ ALNUM no Match any alphanumeric character
+ ALNUML no Match any alphanumeric char in locale
+ NALNUM no Match any non-alphanumeric character
+ NALNUML no Match any non-alphanumeric char in locale
+ SPACE no Match any whitespace character
+ SPACEL no Match any whitespace char in locale
+ NSPACE no Match any non-whitespace character
+ NSPACEL no Match any non-whitespace char in locale
+ DIGIT no Match any numeric character
+ NDIGIT no Match any non-numeric character
+
+ # BRANCH The set of branches constituting a single choice are hooked
+ # together with their "next" pointers, since precedence prevents
+ # anything being concatenated to any individual branch. The
+ # "next" pointer of the last BRANCH in a choice points to the
+ # thing following the whole choice. This is also where the
+ # final "next" pointer of each individual branch points; each
+ # branch starts with the operand node of a BRANCH node.
+ #
+ BRANCH node Match this alternative, or the next...
+
+ # BACK Normal "next" pointers all implicitly point forward; BACK
+ # exists to make loop structures possible.
+ # not used
+ BACK no Match "", "next" ptr points backward.
+
+ # Literals
+ EXACT sv Match this string (preceded by length).
+ EXACTF sv Match this string, folded (prec. by length).
+ EXACTFL sv Match this string, folded in locale (w/len).
+
+ # Do nothing
+ NOTHING no Match empty string.
+ # A variant of above which delimits a group, thus stops optimizations
+ TAIL no Match empty string. Can jump here from outside.
+
+ # STAR,PLUS '?', and complex '*' and '+', are implemented as circular
+ # BRANCH structures using BACK. Simple cases (one character
+ # per match) are implemented with STAR and PLUS for speed
+ # and to minimize recursive plunges.
+ #
+ STAR node Match this (simple) thing 0 or more times.
+ PLUS node Match this (simple) thing 1 or more times.
+
+ CURLY sv 2 Match this simple thing {n,m} times.
+ CURLYN no 2 Match next-after-this simple thing
+ # {n,m} times, set parens.
+ CURLYM no 2 Match this medium-complex thing {n,m} times.
+ CURLYX sv 2 Match this complex thing {n,m} times.
+
+ # This terminator creates a loop structure for CURLYX
+ WHILEM no Do curly processing and see if rest matches.
+
+ # OPEN,CLOSE,GROUPP ...are numbered at compile time.
+ OPEN num 1 Mark this point in input as start of #n.
+ CLOSE num 1 Analogous to OPEN.
+
+ REF num 1 Match some already matched string
+ REFF num 1 Match already matched string, folded
+ REFFL num 1 Match already matched string, folded in loc.
+
+ # grouping assertions
+ IFMATCH off 1 2 Succeeds if the following matches.
+ UNLESSM off 1 2 Fails if the following matches.
+ SUSPEND off 1 1 "Independent" sub-regex.
+ IFTHEN off 1 1 Switch, should be preceded by switcher .
+ GROUPP num 1 Whether the group matched.
+
+ # Support for long regex
+ LONGJMP off 1 1 Jump far away.
+ BRANCHJ off 1 1 BRANCH with long offset.
+
+ # The heavy worker
+ EVAL evl 1 Execute some Perl code.
+
+ # Modifiers
+ MINMOD no Next operator is not greedy.
+ LOGICAL no Next opcode should set the flag only.
+
+ # This is not used yet
+ RENUM off 1 1 Group with independently numbered parens.
+
+ # This is not really a node, but an optimized away piece of a "long" node.
+ # To simplify debugging output, we mark it as if it were a node
+ OPTIMIZED off Placeholder for dump.
+
+=head2 Run-time output
+
+First of all, when doing a match, one may get no run-time output even
+if debugging is enabled. This means that the regex engine was never
+entered and that all of the job was therefore done by the optimizer.
+
+If the regex engine was entered, the output may look like this:
+
+ Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
+ Setting an EVAL scope, savestack=3
+ 2 <ab> <cdefg__gh_> | 1: ANYOF
+ 3 <abc> <defg__gh_> | 11: EXACT <d>
+ 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
+ 4 <abcd> <efg__gh_> | 26: WHILEM
+ 0 out of 1..32767 cc=effff31c
+ 4 <abcd> <efg__gh_> | 15: OPEN1
+ 4 <abcd> <efg__gh_> | 17: EXACT <e>
+ 5 <abcde> <fg__gh_> | 19: STAR
+ EXACT <f> can match 1 times out of 32767...
+ Setting an EVAL scope, savestack=3
+ 6 <bcdef> <g__gh__> | 22: EXACT <g>
+ 7 <bcdefg> <__gh__> | 24: CLOSE1
+ 7 <bcdefg> <__gh__> | 26: WHILEM
+ 1 out of 1..32767 cc=effff31c
+ Setting an EVAL scope, savestack=12
+ 7 <bcdefg> <__gh__> | 15: OPEN1
+ 7 <bcdefg> <__gh__> | 17: EXACT <e>
+ restoring \1 to 4(4)..7
+ failed, try continuation...
+ 7 <bcdefg> <__gh__> | 27: NOTHING
+ 7 <bcdefg> <__gh__> | 28: EXACT <h>
+ failed...
+ failed...
+
+The most significant information in the output is about the particular I<node>
+of the compiled regex that is currently being tested against the target string.
+The format of these lines is
+
+C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
+
+The I<TYPE> info is indented with respect to the backtracking level.
+Other incidental information appears interspersed within.
+
+=head1 Debugging Perl memory usage
+
+Perl is a profligate wastrel when it comes to memory use. There
+is a saying that to estimate memory usage of Perl, assume a reasonable
+algorithm for memory allocation, multiply that estimate by 10, and
+while you still may miss the mark, at least you won't be quite so
+astonished. This is not absolutely true, but may prvide a good
+grasp of what happens.
+
+Assume that an integer cannot take less than 20 bytes of memory, a
+float cannot take less than 24 bytes, a string cannot take less
+than 32 bytes (all these examples assume 32-bit architectures, the
+result are quite a bit worse on 64-bit architectures). If a variable
+is accessed in two of three different ways (which require an integer,
+a float, or a string), the memory footprint may increase yet another
+20 bytes. A sloppy malloc(3) implementation can make inflate these
+numbers dramatically.
+
+On the opposite end of the scale, a declaration like
+
+ sub foo;
+
+may take up to 500 bytes of memory, depending on which release of Perl
+you're running.
+
+Anecdotal estimates of source-to-compiled code bloat suggest an
+eightfold increase. This means that the compiled form of reasonable
+(normally commented, properly indented etc.) code will take
+about eight times more space in memory than the code took
+on disk.
+
+There are two Perl-specific ways to analyze memory usage:
+$ENV{PERL_DEBUG_MSTATS} and B<-DL> command-line switch. The first
+is available only if Perl is compiled with Perl's malloc(); the
+second only if Perl was built with C<-DDEBUGGING>. See the
+instructions for how to do this in the F<INSTALL> podpage at
+the top level of the Perl source tree.
+
+=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
+
+If your perl is using Perl's malloc() and was compiled with the
+necessary switches (this is the default), then it will print memory
+usage statistics after compiling your code hwen C<< $ENV{PERL_DEBUG_MSTATS}
+> 1 >>, and before termination of the program when C<<
+$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to
+the following example:
+
+ $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
+ Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
+ 14216 free: 130 117 28 7 9 0 2 2 1 0 0
+ 437 61 36 0 5
+ 60924 used: 125 137 161 55 7 8 6 16 2 0 1
+ 74 109 304 84 20
+ Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
+ Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
+ 30888 free: 245 78 85 13 6 2 1 3 2 0 1
+ 315 162 39 42 11
+ 175816 used: 265 176 1112 111 26 22 11 27 2 1 1
+ 196 178 1066 798 39
+ Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
+
+It is possible to ask for such a statistic at arbitrary points in
+your execution using the mstats() function out of the standard
+Devel::Peek module.
+
+Here is some explanation of that format:
+
+=over
+
+=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
+
+Perl's malloc() uses bucketed allocations. Every request is rounded
+up to the closest bucket size available, and a bucket is taken from
+the pool of buckets of that size.
+
+The line above describes the limits of buckets currently in use.
+Each bucket has two sizes: memory footprint and the maximal size
+of user data that can fit into this bucket. Suppose in the above
+example that the smallest bucket were size 4. The biggest bucket
+would have usable size 8188, and the memory footprint would be 8192.
+
+In a Perl built for debugging, some buckets may have negative usable
+size. This means that these buckets cannot (and will not) be used.
+For larger buckets, the memory footprint may be one page greater
+than a power of 2. If so, case the corresponding power of two is
+printed in the C<APPROX> field above.
+
+=item Free/Used
+
+The 1 or 2 rows of numbers following that correspond to the number
+of buckets of each size between C<SMALLEST> and C<GREATEST>. In
+the first row, the sizes (memory footprints) of buckets are powers
+of two--or possibly one page greater. In the second row, if present,
+the memory footprints of the buckets are between the memory footprints
+of two buckets "above".
+
+For example, suppose under the pervious example, the memory footprints
+were
+
+ free: 8 16 32 64 128 256 512 1024 2048 4096 8192
+ 4 12 24 48 80
+
+With non-C<DEBUGGING> perl, the buckets starting from C<128> have
+a 4-byte overhead, and thus a 8192-long bucket may take up to
+8188-byte allocations.
+
+=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
+
+The first two fields give the total amount of memory perl sbrk(2)ed
+(ess-broken? :-) and number of sbrk(2)s used. The third number is
+what perl thinks about continuity of returned chunks. So long as
+this number is positive, malloc() will assume that it is probable
+that sbrk(2) will provide continuous memory.
+
+Memory allocated by external libraries is not counted.
+
+=item C<pad: 0>
+
+The amount of sbrk(2)ed memory needed to keep buckets aligned.
+
+=item C<heads: 2192>
+
+Although memory overhead of bigger buckets is kept inside the bucket, for
+smaller buckets, it is kept in separate areas. This field gives the
+total size of these areas.
+
+=item C<chain: 0>
+
+malloc() may want to subdivide a bigger bucket into smaller buckets.
+If only a part of the deceased bucket is left unsubdivided, the rest
+is kept as an element of a linked list. This field gives the total
+size of these chunks.
+
+=item C<tail: 6144>
+
+To minimize the number of sbrk(2)s, malloc() asks for more memory. This
+field gives the size of the yet unused part, which is sbrk(2)ed, but
+never touched.
+
+=back
+
+=head2 Example of using B<-DL> switch
+
+Below we show how to analyse memory usage by
+
+ do 'lib/auto/POSIX/autosplit.ix';
+
+The file in question contains a header and 146 lines similar to
+
+ sub getcwd;
+
+B<WARNING>: The discussion below supposes 32-bit architecture. In
+newer releases of Perl, memory usage of the constructs discussed
+here is greatly improved, but the story discussed below is a real-life
+story. This story is mercilessly terse, and assumes rather more than cursory
+knowledge of Perl internals. Type space to continue, `q' to quit.
+(Actually, you just want to skip to the next section.)
+
+Here is the itemized list of Perl allocations performed during parsing
+of this file:
+
+ !!! "after" at test.pl line 3.
+ Id subtot 4 8 12 16 20 24 28 32 36 40 48 56 64 72 80 80+
+ 0 02 13752 . . . . 294 . . . . . . . . . . 4
+ 0 54 5545 . . 8 124 16 . . . 1 1 . . . . . 3
+ 5 05 32 . . . . . . . 1 . . . . . . . .
+ 6 02 7152 . . . . . . . . . . 149 . . . . .
+ 7 02 3600 . . . . . 150 . . . . . . . . . .
+ 7 03 64 . -1 . 1 . . 2 . . . . . . . . .
+ 7 04 7056 . . . . . . . . . . . . . . . 7
+ 7 17 38404 . . . . . . . 1 . . 442 149 . . 147 .
+ 9 03 2078 17 249 32 . . . . 2 . . . . . . . .
+
+
+To see this list, insert two C<warn('!...')> statements around the call:
+
+ warn('!');
+ do 'lib/auto/POSIX/autosplit.ix';
+ warn('!!! "after"');
+
+and run it with PErl's B<-DL> option. The first warn() will print
+memory allocation info before parsing the file and will memorize
+the statistics at this point (we ignore what it prints). The second
+warn() prints increments with respect to these memorized data. This
+is the printout shown above.
+
+Different I<Id>s on the left correspond to different subsystems of
+the perl interpreter. They are just the first argument given to
+the perl memory allocation API named New(). To find what C<9 03>
+means, just B<grep> the perl source for C<903>. You'll find it in
+F<util.c>, function savepvn(). (I know, you wonder why we told you
+to B<grep> and then gave away the answer. That's because grepping
+the source is good for the soul.) This function is used to store
+a copy of an existing chunk of memory. Using a C debugger, one can
+see that the function was called either directly from gv_init() or
+via sv_magic(), and that gv_init() is called from gv_fetchpv()--which
+was itself called from newSUB(). Please stop to catch your breath now.
+
+B<NOTE>: To reach this point in the debugger and skip the calls to
+savepvn() during the compilation of the main program, you should
+set a C breakpoint
+in Perl_warn(), continue until this point is reached, and I<then> set
+a C breakpoint in Perl_savepvn(). Note that you may need to skip a
+handful of Perl_savepvn() calls that do not correspond to mass production
+of CVs (there are more C<903> allocations than 146 similar lines of
+F<lib/auto/POSIX/autosplit.ix>). Note also that C<Perl_> prefixes are
+added by macroization code in perl header files to avoid conflicts
+with external libraries.
+
+Anyway, we see that C<903> ids correspond to creation of globs, twice
+per glob - for glob name, and glob stringification magic.
+
+Here are explanations for other I<Id>s above:
+
+=over
+
+=item C<717>
+
+CReates bigger C<XPV*> structures. In the case above, it
+creates 3 C<AV>s per subroutine, one for a list of lexical variable
+names, one for a scratchpad (which contains lexical variables and
+C<targets>), and one for the array of scratchpads needed for
+recursion.
+
+It also creates a C<GV> and a C<CV> per subroutine, all called from
+start_subparse().
+
+=item C<002>
+
+Creates a C array corresponding to the C<AV> of scratchpads and the
+scratchpad itself. The first fake entry of this scratchpad is
+created though the subroutine itself is not defined yet.
+
+It also creates C arrays to keep data for the stash. This is one HV,
+but it grows; thus, there are 4 big allocations: the big chunks are not
+freed, but are kept as additional arenas for C<SV> allocations.
+
+=item C<054>
+
+Creates a C<HEK> for the name of the glob for the subroutine. This
+name is a key in a I<stash>.
+
+Big allocations with this I<Id> correspond to allocations of new
+arenas to keep C<HE>.
+
+=item C<602>
+
+Creates a C<GP> for the glob for the subroutine.
+
+=item C<702>
+
+Creates the C<MAGIC> for the glob for the subroutine.
+
+=item C<704>
+
+Creates I<arenas> which keep SVs.
+
+=back
+
+=head2 B<-DL> details
+
+If Perl is run with B<-DL> option, then warn()s that start with `!'
+behave specially. They print a list of I<categories> of memory
+allocations, and statistics of allocations of different sizes for
+these categories.
+
+If warn() string starts with
+
+=over
+
+=item C<!!!>
+
+print changed categories only, print the differences in counts of allocations.
+
+=item C<!!>
+
+print grown categories only; print the absolute values of counts, and totals.
+
+=item C<!>
+
+print nonempty categories, print the absolute values of counts and totals.
+
+=back
+
+=head2 Limitations of B<-DL> statistics
+
+If an extension or external library does not use the Perl API to
+allocate memory, such allocations are not counted.
+
+=head1 SEE ALSO
+
+L<perldebug>,
+L<perlguts>,
+L<perlrun>
+L<re>,
+and
+L<Devel::Dprof>.
OpenPOWER on IntegriCloud