diff options
Diffstat (limited to 'contrib/perl5/lib/Pod/Parser.pm')
-rw-r--r-- | contrib/perl5/lib/Pod/Parser.pm | 1768 |
1 files changed, 0 insertions, 1768 deletions
diff --git a/contrib/perl5/lib/Pod/Parser.pm b/contrib/perl5/lib/Pod/Parser.pm deleted file mode 100644 index 6782519..0000000 --- a/contrib/perl5/lib/Pod/Parser.pm +++ /dev/null @@ -1,1768 +0,0 @@ -############################################################################# -# Pod/Parser.pm -- package which defines a base class for parsing POD docs. -# -# Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved. -# This file is part of "PodParser". PodParser is free software; -# you can redistribute it and/or modify it under the same terms -# as Perl itself. -############################################################################# - -package Pod::Parser; - -use vars qw($VERSION); -$VERSION = 1.13; ## Current version of this package -require 5.005; ## requires this Perl version or later - -############################################################################# - -=head1 NAME - -Pod::Parser - base class for creating POD filters and translators - -=head1 SYNOPSIS - - use Pod::Parser; - - package MyParser; - @ISA = qw(Pod::Parser); - - sub command { - my ($parser, $command, $paragraph, $line_num) = @_; - ## Interpret the command and its text; sample actions might be: - if ($command eq 'head1') { ... } - elsif ($command eq 'head2') { ... } - ## ... other commands and their actions - my $out_fh = $parser->output_handle(); - my $expansion = $parser->interpolate($paragraph, $line_num); - print $out_fh $expansion; - } - - sub verbatim { - my ($parser, $paragraph, $line_num) = @_; - ## Format verbatim paragraph; sample actions might be: - my $out_fh = $parser->output_handle(); - print $out_fh $paragraph; - } - - sub textblock { - my ($parser, $paragraph, $line_num) = @_; - ## Translate/Format this block of text; sample actions might be: - my $out_fh = $parser->output_handle(); - my $expansion = $parser->interpolate($paragraph, $line_num); - print $out_fh $expansion; - } - - sub interior_sequence { - my ($parser, $seq_command, $seq_argument) = @_; - ## Expand an interior sequence; sample actions might be: - return "*$seq_argument*" if ($seq_command eq 'B'); - return "`$seq_argument'" if ($seq_command eq 'C'); - return "_${seq_argument}_'" if ($seq_command eq 'I'); - ## ... other sequence commands and their resulting text - } - - package main; - - ## Create a parser object and have it parse file whose name was - ## given on the command-line (use STDIN if no files were given). - $parser = new MyParser(); - $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0); - for (@ARGV) { $parser->parse_from_file($_); } - -=head1 REQUIRES - -perl5.005, Pod::InputObjects, Exporter, Symbol, Carp - -=head1 EXPORTS - -Nothing. - -=head1 DESCRIPTION - -B<Pod::Parser> is a base class for creating POD filters and translators. -It handles most of the effort involved with parsing the POD sections -from an input stream, leaving subclasses free to be concerned only with -performing the actual translation of text. - -B<Pod::Parser> parses PODs, and makes method calls to handle the various -components of the POD. Subclasses of B<Pod::Parser> override these methods -to translate the POD into whatever output format they desire. - -=head1 QUICK OVERVIEW - -To create a POD filter for translating POD documentation into some other -format, you create a subclass of B<Pod::Parser> which typically overrides -just the base class implementation for the following methods: - -=over 2 - -=item * - -B<command()> - -=item * - -B<verbatim()> - -=item * - -B<textblock()> - -=item * - -B<interior_sequence()> - -=back - -You may also want to override the B<begin_input()> and B<end_input()> -methods for your subclass (to perform any needed per-file and/or -per-document initialization or cleanup). - -If you need to perform any preprocesssing of input before it is parsed -you may want to override one or more of B<preprocess_line()> and/or -B<preprocess_paragraph()>. - -Sometimes it may be necessary to make more than one pass over the input -files. If this is the case you have several options. You can make the -first pass using B<Pod::Parser> and override your methods to store the -intermediate results in memory somewhere for the B<end_pod()> method to -process. You could use B<Pod::Parser> for several passes with an -appropriate state variable to control the operation for each pass. If -your input source can't be reset to start at the beginning, you can -store it in some other structure as a string or an array and have that -structure implement a B<getline()> method (which is all that -B<parse_from_filehandle()> uses to read input). - -Feel free to add any member data fields you need to keep track of things -like current font, indentation, horizontal or vertical position, or -whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA"> -to avoid name collisions. - -For the most part, the B<Pod::Parser> base class should be able to -do most of the input parsing for you and leave you free to worry about -how to intepret the commands and translate the result. - -Note that all we have described here in this quick overview is the -simplest most straightforward use of B<Pod::Parser> to do stream-based -parsing. It is also possible to use the B<Pod::Parser::parse_text> function -to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">. - -=head1 PARSING OPTIONS - -A I<parse-option> is simply a named option of B<Pod::Parser> with a -value that corresponds to a certain specified behavior. These various -behaviors of B<Pod::Parser> may be enabled/disabled by setting or -or unsetting one or more I<parse-options> using the B<parseopts()> method. -The set of currently accepted parse-options is as follows: - -=over 3 - -=item B<-want_nonPODs> (default: unset) - -Normally (by default) B<Pod::Parser> will only provide access to -the POD sections of the input. Input paragraphs that are not part -of the POD-format documentation are not made available to the caller -(not even using B<preprocess_paragraph()>). Setting this option to a -non-empty, non-zero value will allow B<preprocess_paragraph()> to see -non-POD sections of the input as well as POD sections. The B<cutting()> -method can be used to determine if the corresponding paragraph is a POD -paragraph, or some other input paragraph. - -=item B<-process_cut_cmd> (default: unset) - -Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive -by itself and does not pass it on to the caller for processing. Setting -this option to a non-empty, non-zero value will cause B<Pod::Parser> to -pass the C<=cut> directive to the caller just like any other POD command -(and hence it may be processed by the B<command()> method). - -B<Pod::Parser> will still interpret the C<=cut> directive to mean that -"cutting mode" has been (re)entered, but the caller will get a chance -to capture the actual C<=cut> paragraph itself for whatever purpose -it desires. - -=item B<-warnings> (default: unset) - -Normally (by default) B<Pod::Parser> recognizes a bare minimum of -pod syntax errors and warnings and issues diagnostic messages -for errors, but not for warnings. (Use B<Pod::Checker> to do more -thorough checking of POD syntax.) Setting this option to a non-empty, -non-zero value will cause B<Pod::Parser> to issue diagnostics for -the few warnings it recognizes as well as the errors. - -=back - -Please see L<"parseopts()"> for a complete description of the interface -for the setting and unsetting of parse-options. - -=cut - -############################################################################# - -use vars qw(@ISA); -use strict; -#use diagnostics; -use Pod::InputObjects; -use Carp; -use Exporter; -BEGIN { - if ($] < 5.6) { - require Symbol; - import Symbol; - } -} -@ISA = qw(Exporter); - -## These "variables" are used as local "glob aliases" for performance -use vars qw(%myData %myOpts @input_stack); - -############################################################################# - -=head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES - -B<Pod::Parser> provides several methods which most subclasses will probably -want to override. These methods are as follows: - -=cut - -##--------------------------------------------------------------------------- - -=head1 B<command()> - - $parser->command($cmd,$text,$line_num,$pod_para); - -This method should be overridden by subclasses to take the appropriate -action when a POD command paragraph (denoted by a line beginning with -"=") is encountered. When such a POD directive is seen in the input, -this method is called and is passed: - -=over 3 - -=item C<$cmd> - -the name of the command for this POD paragraph - -=item C<$text> - -the paragraph text for the given POD paragraph command. - -=item C<$line_num> - -the line-number of the beginning of the paragraph - -=item C<$pod_para> - -a reference to a C<Pod::Paragraph> object which contains further -information about the paragraph command (see L<Pod::InputObjects> -for details). - -=back - -B<Note> that this method I<is> called for C<=pod> paragraphs. - -The base class implementation of this method simply treats the raw POD -command as normal block of paragraph text (invoking the B<textblock()> -method with the command paragraph). - -=cut - -sub command { - my ($self, $cmd, $text, $line_num, $pod_para) = @_; - ## Just treat this like a textblock - $self->textblock($pod_para->raw_text(), $line_num, $pod_para); -} - -##--------------------------------------------------------------------------- - -=head1 B<verbatim()> - - $parser->verbatim($text,$line_num,$pod_para); - -This method may be overridden by subclasses to take the appropriate -action when a block of verbatim text is encountered. It is passed the -following parameters: - -=over 3 - -=item C<$text> - -the block of text for the verbatim paragraph - -=item C<$line_num> - -the line-number of the beginning of the paragraph - -=item C<$pod_para> - -a reference to a C<Pod::Paragraph> object which contains further -information about the paragraph (see L<Pod::InputObjects> -for details). - -=back - -The base class implementation of this method simply prints the textblock -(unmodified) to the output filehandle. - -=cut - -sub verbatim { - my ($self, $text, $line_num, $pod_para) = @_; - my $out_fh = $self->{_OUTPUT}; - print $out_fh $text; -} - -##--------------------------------------------------------------------------- - -=head1 B<textblock()> - - $parser->textblock($text,$line_num,$pod_para); - -This method may be overridden by subclasses to take the appropriate -action when a normal block of POD text is encountered (although the base -class method will usually do what you want). It is passed the following -parameters: - -=over 3 - -=item C<$text> - -the block of text for the a POD paragraph - -=item C<$line_num> - -the line-number of the beginning of the paragraph - -=item C<$pod_para> - -a reference to a C<Pod::Paragraph> object which contains further -information about the paragraph (see L<Pod::InputObjects> -for details). - -=back - -In order to process interior sequences, subclasses implementations of -this method will probably want to invoke either B<interpolate()> or -B<parse_text()>, passing it the text block C<$text>, and the corresponding -line number in C<$line_num>, and then perform any desired processing upon -the returned result. - -The base class implementation of this method simply prints the text block -as it occurred in the input stream). - -=cut - -sub textblock { - my ($self, $text, $line_num, $pod_para) = @_; - my $out_fh = $self->{_OUTPUT}; - print $out_fh $self->interpolate($text, $line_num); -} - -##--------------------------------------------------------------------------- - -=head1 B<interior_sequence()> - - $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq); - -This method should be overridden by subclasses to take the appropriate -action when an interior sequence is encountered. An interior sequence is -an embedded command within a block of text which appears as a command -name (usually a single uppercase character) followed immediately by a -string of text which is enclosed in angle brackets. This method is -passed the sequence command C<$seq_cmd> and the corresponding text -C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior -sequence that occurs in the string that it is passed. It should return -the desired text string to be used in place of the interior sequence. -The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence> -object which contains further information about the interior sequence. -Please see L<Pod::InputObjects> for details if you need to access this -additional information. - -Subclass implementations of this method may wish to invoke the -B<nested()> method of C<$pod_seq> to see if it is nested inside -some other interior-sequence (and if so, which kind). - -The base class implementation of the B<interior_sequence()> method -simply returns the raw text of the interior sequence (as it occurred -in the input) to the caller. - -=cut - -sub interior_sequence { - my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_; - ## Just return the raw text of the interior sequence - return $pod_seq->raw_text(); -} - -############################################################################# - -=head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES - -B<Pod::Parser> provides several methods which subclasses may want to override -to perform any special pre/post-processing. These methods do I<not> have to -be overridden, but it may be useful for subclasses to take advantage of them. - -=cut - -##--------------------------------------------------------------------------- - -=head1 B<new()> - - my $parser = Pod::Parser->new(); - -This is the constructor for B<Pod::Parser> and its subclasses. You -I<do not> need to override this method! It is capable of constructing -subclass objects as well as base class objects, provided you use -any of the following constructor invocation styles: - - my $parser1 = MyParser->new(); - my $parser2 = new MyParser(); - my $parser3 = $parser2->new(); - -where C<MyParser> is some subclass of B<Pod::Parser>. - -Using the syntax C<MyParser::new()> to invoke the constructor is I<not> -recommended, but if you insist on being able to do this, then the -subclass I<will> need to override the B<new()> constructor method. If -you do override the constructor, you I<must> be sure to invoke the -B<initialize()> method of the newly blessed object. - -Using any of the above invocations, the first argument to the -constructor is always the corresponding package name (or object -reference). No other arguments are required, but if desired, an -associative array (or hash-table) my be passed to the B<new()> -constructor, as in: - - my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 ); - my $parser2 = new MyParser( -myflag => 1 ); - -All arguments passed to the B<new()> constructor will be treated as -key/value pairs in a hash-table. The newly constructed object will be -initialized by copying the contents of the given hash-table (which may -have been empty). The B<new()> constructor for this class and all of its -subclasses returns a blessed reference to the initialized object (hash-table). - -=cut - -sub new { - ## Determine if we were called via an object-ref or a classname - my $this = shift; - my $class = ref($this) || $this; - ## Any remaining arguments are treated as initial values for the - ## hash that is used to represent this object. - my %params = @_; - my $self = { %params }; - ## Bless ourselves into the desired class and perform any initialization - bless $self, $class; - $self->initialize(); - return $self; -} - -##--------------------------------------------------------------------------- - -=head1 B<initialize()> - - $parser->initialize(); - -This method performs any necessary object initialization. It takes no -arguments (other than the object instance of course, which is typically -copied to a local variable named C<$self>). If subclasses override this -method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>. - -=cut - -sub initialize { - #my $self = shift; - #return; -} - -##--------------------------------------------------------------------------- - -=head1 B<begin_pod()> - - $parser->begin_pod(); - -This method is invoked at the beginning of processing for each POD -document that is encountered in the input. Subclasses should override -this method to perform any per-document initialization. - -=cut - -sub begin_pod { - #my $self = shift; - #return; -} - -##--------------------------------------------------------------------------- - -=head1 B<begin_input()> - - $parser->begin_input(); - -This method is invoked by B<parse_from_filehandle()> immediately I<before> -processing input from a filehandle. The base class implementation does -nothing, however, subclasses may override it to perform any per-file -initializations. - -Note that if multiple files are parsed for a single POD document -(perhaps the result of some future C<=include> directive) this method -is invoked for every file that is parsed. If you wish to perform certain -initializations once per document, then you should use B<begin_pod()>. - -=cut - -sub begin_input { - #my $self = shift; - #return; -} - -##--------------------------------------------------------------------------- - -=head1 B<end_input()> - - $parser->end_input(); - -This method is invoked by B<parse_from_filehandle()> immediately I<after> -processing input from a filehandle. The base class implementation does -nothing, however, subclasses may override it to perform any per-file -cleanup actions. - -Please note that if multiple files are parsed for a single POD document -(perhaps the result of some kind of C<=include> directive) this method -is invoked for every file that is parsed. If you wish to perform certain -cleanup actions once per document, then you should use B<end_pod()>. - -=cut - -sub end_input { - #my $self = shift; - #return; -} - -##--------------------------------------------------------------------------- - -=head1 B<end_pod()> - - $parser->end_pod(); - -This method is invoked at the end of processing for each POD document -that is encountered in the input. Subclasses should override this method -to perform any per-document finalization. - -=cut - -sub end_pod { - #my $self = shift; - #return; -} - -##--------------------------------------------------------------------------- - -=head1 B<preprocess_line()> - - $textline = $parser->preprocess_line($text, $line_num); - -This method should be overridden by subclasses that wish to perform -any kind of preprocessing for each I<line> of input (I<before> it has -been determined whether or not it is part of a POD paragraph). The -parameter C<$text> is the input line; and the parameter C<$line_num> is -the line number of the corresponding text line. - -The value returned should correspond to the new text to use in its -place. If the empty string or an undefined value is returned then no -further processing will be performed for this line. - -Please note that the B<preprocess_line()> method is invoked I<before> -the B<preprocess_paragraph()> method. After all (possibly preprocessed) -lines in a paragraph have been assembled together and it has been -determined that the paragraph is part of the POD documentation from one -of the selected sections, then B<preprocess_paragraph()> is invoked. - -The base class implementation of this method returns the given text. - -=cut - -sub preprocess_line { - my ($self, $text, $line_num) = @_; - return $text; -} - -##--------------------------------------------------------------------------- - -=head1 B<preprocess_paragraph()> - - $textblock = $parser->preprocess_paragraph($text, $line_num); - -This method should be overridden by subclasses that wish to perform any -kind of preprocessing for each block (paragraph) of POD documentation -that appears in the input stream. The parameter C<$text> is the POD -paragraph from the input file; and the parameter C<$line_num> is the -line number for the beginning of the corresponding paragraph. - -The value returned should correspond to the new text to use in its -place If the empty string is returned or an undefined value is -returned, then the given C<$text> is ignored (not processed). - -This method is invoked after gathering up all the lines in a paragraph -and after determining the cutting state of the paragraph, -but before trying to further parse or interpret them. After -B<preprocess_paragraph()> returns, the current cutting state (which -is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates -to true then input text (including the given C<$text>) is cut (not -processed) until the next POD directive is encountered. - -Please note that the B<preprocess_line()> method is invoked I<before> -the B<preprocess_paragraph()> method. After all (possibly preprocessed) -lines in a paragraph have been assembled together and either it has been -determined that the paragraph is part of the POD documentation from one -of the selected sections or the C<-want_nonPODs> option is true, -then B<preprocess_paragraph()> is invoked. - -The base class implementation of this method returns the given text. - -=cut - -sub preprocess_paragraph { - my ($self, $text, $line_num) = @_; - return $text; -} - -############################################################################# - -=head1 METHODS FOR PARSING AND PROCESSING - -B<Pod::Parser> provides several methods to process input text. These -methods typically won't need to be overridden (and in some cases they -can't be overridden), but subclasses may want to invoke them to exploit -their functionality. - -=cut - -##--------------------------------------------------------------------------- - -=head1 B<parse_text()> - - $ptree1 = $parser->parse_text($text, $line_num); - $ptree2 = $parser->parse_text({%opts}, $text, $line_num); - $ptree3 = $parser->parse_text(\%opts, $text, $line_num); - -This method is useful if you need to perform your own interpolation -of interior sequences and can't rely upon B<interpolate> to expand -them in simple bottom-up order order. - -The parameter C<$text> is a string or block of text to be parsed -for interior sequences; and the parameter C<$line_num> is the -line number curresponding to the beginning of C<$text>. - -B<parse_text()> will parse the given text into a parse-tree of "nodes." -and interior-sequences. Each "node" in the parse tree is either a -text-string, or a B<Pod::InteriorSequence>. The result returned is a -parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects> -for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>. - -If desired, an optional hash-ref may be specified as the first argument -to customize certain aspects of the parse-tree that is created and -returned. The set of recognized option keywords are: - -=over 3 - -=item B<-expand_seq> =E<gt> I<code-ref>|I<method-name> - -Normally, the parse-tree returned by B<parse_text()> will contain an -unexpanded C<Pod::InteriorSequence> object for each interior-sequence -encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand" -every interior-sequence it sees by invoking the referenced function -(or named method of the parser object) and using the return value as the -expanded result. - -If a subroutine reference was given, it is invoked as: - - &$code_ref( $parser, $sequence ) - -and if a method-name was given, it is invoked as: - - $parser->method_name( $sequence ) - -where C<$parser> is a reference to the parser object, and C<$sequence> -is a reference to the interior-sequence object. -[I<NOTE>: If the B<interior_sequence()> method is specified, then it is -invoked according to the interface specified in L<"interior_sequence()">]. - -=item B<-expand_text> =E<gt> I<code-ref>|I<method-name> - -Normally, the parse-tree returned by B<parse_text()> will contain a -text-string for each contiguous sequence of characters outside of an -interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to -"preprocess" every such text-string it sees by invoking the referenced -function (or named method of the parser object) and using the return value -as the preprocessed (or "expanded") result. [Note that if the result is -an interior-sequence, then it will I<not> be expanded as specified by the -B<-expand_seq> option; Any such recursive expansion needs to be handled by -the specified callback routine.] - -If a subroutine reference was given, it is invoked as: - - &$code_ref( $parser, $text, $ptree_node ) - -and if a method-name was given, it is invoked as: - - $parser->method_name( $text, $ptree_node ) - -where C<$parser> is a reference to the parser object, C<$text> is the -text-string encountered, and C<$ptree_node> is a reference to the current -node in the parse-tree (usually an interior-sequence object or else the -top-level node of the parse-tree). - -=item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name> - -Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an -argument to the referenced subroutine (or named method of the parser -object) and return the result instead of the parse-tree object. - -If a subroutine reference was given, it is invoked as: - - &$code_ref( $parser, $ptree ) - -and if a method-name was given, it is invoked as: - - $parser->method_name( $ptree ) - -where C<$parser> is a reference to the parser object, and C<$ptree> -is a reference to the parse-tree object. - -=back - -=cut - -sub parse_text { - my $self = shift; - local $_ = ''; - - ## Get options and set any defaults - my %opts = (ref $_[0]) ? %{ shift() } : (); - my $expand_seq = $opts{'-expand_seq'} || undef; - my $expand_text = $opts{'-expand_text'} || undef; - my $expand_ptree = $opts{'-expand_ptree'} || undef; - - my $text = shift; - my $line = shift; - my $file = $self->input_file(); - my $cmd = ""; - - ## Convert method calls into closures, for our convenience - my $xseq_sub = $expand_seq; - my $xtext_sub = $expand_text; - my $xptree_sub = $expand_ptree; - if (defined $expand_seq and $expand_seq eq 'interior_sequence') { - ## If 'interior_sequence' is the method to use, we have to pass - ## more than just the sequence object, we also need to pass the - ## sequence name and text. - $xseq_sub = sub { - my ($self, $iseq) = @_; - my $args = join("", $iseq->parse_tree->children); - return $self->interior_sequence($iseq->name, $args, $iseq); - }; - } - ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) }; - ref $xtext_sub or $xtext_sub = sub { shift()->$expand_text(@_) }; - ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) }; - - ## Keep track of the "current" interior sequence, and maintain a stack - ## of "in progress" sequences. - ## - ## NOTE that we push our own "accumulator" at the very beginning of the - ## stack. It's really a parse-tree, not a sequence; but it implements - ## the methods we need so we can use it to gather-up all the sequences - ## and strings we parse. Thus, by the end of our parsing, it should be - ## the only thing left on our stack and all we have to do is return it! - ## - my $seq = Pod::ParseTree->new(); - my @seq_stack = ($seq); - my ($ldelim, $rdelim) = ('', ''); - - ## Iterate over all sequence starts text (NOTE: split with - ## capturing parens keeps the delimiters) - $_ = $text; - my @tokens = split /([A-Z]<(?:<+\s)?)/; - while ( @tokens ) { - $_ = shift @tokens; - ## Look for the beginning of a sequence - if ( /^([A-Z])(<(?:<+\s)?)$/ ) { - ## Push a new sequence onto the stack of those "in-progress" - ($cmd, $ldelim) = ($1, $2); - $seq = Pod::InteriorSequence->new( - -name => $cmd, - -ldelim => $ldelim, -rdelim => '', - -file => $file, -line => $line - ); - $ldelim =~ s/\s+$//, ($rdelim = $ldelim) =~ tr/</>/; - (@seq_stack > 1) and $seq->nested($seq_stack[-1]); - push @seq_stack, $seq; - } - ## Look for sequence ending - elsif ( @seq_stack > 1 ) { - ## Make sure we match the right kind of closing delimiter - my ($seq_end, $post_seq) = ("", ""); - if ( ($ldelim eq '<' and /\A(.*?)(>)/s) - or /\A(.*?)(\s+$rdelim)/s ) - { - ## Found end-of-sequence, capture the interior and the - ## closing the delimiter, and put the rest back on the - ## token-list - $post_seq = substr($_, length($1) + length($2)); - ($_, $seq_end) = ($1, $2); - (length $post_seq) and unshift @tokens, $post_seq; - } - if (length) { - ## In the middle of a sequence, append this text to it, and - ## dont forget to "expand" it if that's what the caller wanted - $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); - $_ .= $seq_end; - } - if (length $seq_end) { - ## End of current sequence, record terminating delimiter - $seq->rdelim($seq_end); - ## Pop it off the stack of "in progress" sequences - pop @seq_stack; - ## Append result to its parent in current parse tree - $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) - : $seq); - ## Remember the current cmd-name and left-delimiter - $cmd = (@seq_stack > 1) ? $seq_stack[-1]->name : ''; - $ldelim = (@seq_stack > 1) ? $seq_stack[-1]->ldelim : ''; - $ldelim =~ s/\s+$//, ($rdelim = $ldelim) =~ tr/</>/; - } - } - elsif (length) { - ## In the middle of a sequence, append this text to it, and - ## dont forget to "expand" it if that's what the caller wanted - $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); - } - ## Keep track of line count - $line += tr/\n//; - ## Remember the "current" sequence - $seq = $seq_stack[-1]; - } - - ## Handle unterminated sequences - my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef; - while (@seq_stack > 1) { - ($cmd, $file, $line) = ($seq->name, $seq->file_line); - $ldelim = $seq->ldelim; - ($rdelim = $ldelim) =~ tr/</>/; - $rdelim =~ s/^(\S+)(\s*)$/$2$1/; - pop @seq_stack; - my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}". - " at line $line in file $file\n"; - (ref $errorsub) and &{$errorsub}($errmsg) - or (defined $errorsub) and $self->$errorsub($errmsg) - or warn($errmsg); - $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq); - $seq = $seq_stack[-1]; - } - - ## Return the resulting parse-tree - my $ptree = (pop @seq_stack)->parse_tree; - return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree; -} - -##--------------------------------------------------------------------------- - -=head1 B<interpolate()> - - $textblock = $parser->interpolate($text, $line_num); - -This method translates all text (including any embedded interior sequences) -in the given text string C<$text> and returns the interpolated result. The -parameter C<$line_num> is the line number corresponding to the beginning -of C<$text>. - -B<interpolate()> merely invokes a private method to recursively expand -nested interior sequences in bottom-up order (innermost sequences are -expanded first). If there is a need to expand nested sequences in -some alternate order, use B<parse_text> instead. - -=cut - -sub interpolate { - my($self, $text, $line_num) = @_; - my %parse_opts = ( -expand_seq => 'interior_sequence' ); - my $ptree = $self->parse_text( \%parse_opts, $text, $line_num ); - return join "", $ptree->children(); -} - -##--------------------------------------------------------------------------- - -=begin __PRIVATE__ - -=head1 B<parse_paragraph()> - - $parser->parse_paragraph($text, $line_num); - -This method takes the text of a POD paragraph to be processed, along -with its corresponding line number, and invokes the appropriate method -(one of B<command()>, B<verbatim()>, or B<textblock()>). - -For performance reasons, this method is invoked directly without any -dynamic lookup; Hence subclasses may I<not> override it! - -=end __PRIVATE__ - -=cut - -sub parse_paragraph { - my ($self, $text, $line_num) = @_; - local *myData = $self; ## alias to avoid deref-ing overhead - local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options - local $_; - - ## See if we want to preprocess nonPOD paragraphs as well as POD ones. - my $wantNonPods = $myOpts{'-want_nonPODs'}; - - ## Update cutting status - $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/; - - ## Perform any desired preprocessing if we wanted it this early - $wantNonPods and $text = $self->preprocess_paragraph($text, $line_num); - - ## Ignore up until next POD directive if we are cutting - return if $myData{_CUTTING}; - - ## Now we know this is block of text in a POD section! - - ##----------------------------------------------------------------- - ## This is a hook (hack ;-) for Pod::Select to do its thing without - ## having to override methods, but also without Pod::Parser assuming - ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS - ## field exists then we assume there is an is_selected() method for - ## us to invoke (calling $self->can('is_selected') could verify this - ## but that is more overhead than I want to incur) - ##----------------------------------------------------------------- - - ## Ignore this block if it isnt in one of the selected sections - if (exists $myData{_SELECTED_SECTIONS}) { - $self->is_selected($text) or return ($myData{_CUTTING} = 1); - } - - ## If we havent already, perform any desired preprocessing and - ## then re-check the "cutting" state - unless ($wantNonPods) { - $text = $self->preprocess_paragraph($text, $line_num); - return 1 unless ((defined $text) and (length $text)); - return 1 if ($myData{_CUTTING}); - } - - ## Look for one of the three types of paragraphs - my ($pfx, $cmd, $arg, $sep) = ('', '', '', ''); - my $pod_para = undef; - if ($text =~ /^(={1,2})(?=\S)/) { - ## Looks like a command paragraph. Capture the command prefix used - ## ("=" or "=="), as well as the command-name, its paragraph text, - ## and whatever sequence of characters was used to separate them - $pfx = $1; - $_ = substr($text, length $pfx); - ($cmd, $sep, $text) = split /(\s+)/, $_, 2; - ## If this is a "cut" directive then we dont need to do anything - ## except return to "cutting" mode. - if ($cmd eq 'cut') { - $myData{_CUTTING} = 1; - return unless $myOpts{'-process_cut_cmd'}; - } - } - ## Save the attributes indicating how the command was specified. - $pod_para = new Pod::Paragraph( - -name => $cmd, - -text => $text, - -prefix => $pfx, - -separator => $sep, - -file => $myData{_INFILE}, - -line => $line_num - ); - # ## Invoke appropriate callbacks - # if (exists $myData{_CALLBACKS}) { - # ## Look through the callback list, invoke callbacks, - # ## then see if we need to do the default actions - # ## (invoke_callbacks will return true if we do). - # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para); - # } - if (length $cmd) { - ## A command paragraph - $self->command($cmd, $text, $line_num, $pod_para); - } - elsif ($text =~ /^\s+/) { - ## Indented text - must be a verbatim paragraph - $self->verbatim($text, $line_num, $pod_para); - } - else { - ## Looks like an ordinary block of text - $self->textblock($text, $line_num, $pod_para); - } - return 1; -} - -##--------------------------------------------------------------------------- - -=head1 B<parse_from_filehandle()> - - $parser->parse_from_filehandle($in_fh,$out_fh); - -This method takes an input filehandle (which is assumed to already be -opened for reading) and reads the entire input stream looking for blocks -(paragraphs) of POD documentation to be processed. If no first argument -is given the default input filehandle C<STDIN> is used. - -The C<$in_fh> parameter may be any object that provides a B<getline()> -method to retrieve a single line of input text (hence, an appropriate -wrapper object could be used to parse PODs from a single string or an -array of strings). - -Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled -into paragraphs or "blocks" (which are separated by lines containing -nothing but whitespace). For each block of POD documentation -encountered it will invoke a method to parse the given paragraph. - -If a second argument is given then it should correspond to a filehandle where -output should be sent (otherwise the default output filehandle is -C<STDOUT> if no output filehandle is currently in use). - -B<NOTE:> For performance reasons, this method caches the input stream at -the top of the stack in a local variable. Any attempts by clients to -change the stack contents during processing when in the midst executing -of this method I<will not affect> the input stream used by the current -invocation of this method. - -This method does I<not> usually need to be overridden by subclasses. - -=cut - -sub parse_from_filehandle { - my $self = shift; - my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); - my ($in_fh, $out_fh) = @_; - $in_fh = \*STDIN unless ($in_fh); - local *myData = $self; ## alias to avoid deref-ing overhead - local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options - local $_; - - ## Put this stream at the top of the stack and do beginning-of-input - ## processing. NOTE that $in_fh might be reset during this process. - my $topstream = $self->_push_input_stream($in_fh, $out_fh); - (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} ); - - ## Initialize line/paragraph - my ($textline, $paragraph) = ('', ''); - my ($nlines, $plines) = (0, 0); - - ## Use <$fh> instead of $fh->getline where possible (for speed) - $_ = ref $in_fh; - my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh); - - ## Read paragraphs line-by-line - while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) { - $textline = $self->preprocess_line($textline, ++$nlines); - next unless ((defined $textline) && (length $textline)); - $_ = $paragraph; ## save previous contents - - if ((! length $paragraph) && ($textline =~ /^==/)) { - ## '==' denotes a one-line command paragraph - $paragraph = $textline; - $plines = 1; - $textline = ''; - } else { - ## Append this line to the current paragraph - $paragraph .= $textline; - ++$plines; - } - - ## See if this line is blank and ends the current paragraph. - ## If it isnt, then keep iterating until it is. - next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/) - && (length $paragraph)); - - ## Issue a warning about any non-empty blank lines - if (length($1) > 0 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) { - my $errorsub = $self->errorsub(); - my $file = $self->input_file(); - my $errmsg = "*** WARNING: line containing nothing but whitespace". - " in paragraph at line $nlines in file $file\n"; - (ref $errorsub) and &{$errorsub}($errmsg) - or (defined $errorsub) and $self->$errorsub($errmsg) - or warn($errmsg); - } - - ## Now process the paragraph - parse_paragraph($self, $paragraph, ($nlines - $plines) + 1); - $paragraph = ''; - $plines = 0; - } - ## Dont forget about the last paragraph in the file - if (length $paragraph) { - parse_paragraph($self, $paragraph, ($nlines - $plines) + 1) - } - - ## Now pop the input stream off the top of the input stack. - $self->_pop_input_stream(); -} - -##--------------------------------------------------------------------------- - -=head1 B<parse_from_file()> - - $parser->parse_from_file($filename,$outfile); - -This method takes a filename and does the following: - -=over 2 - -=item * - -opens the input and output files for reading -(creating the appropriate filehandles) - -=item * - -invokes the B<parse_from_filehandle()> method passing it the -corresponding input and output filehandles. - -=item * - -closes the input and output files. - -=back - -If the special input filename "-" or "<&STDIN" is given then the STDIN -filehandle is used for input (and no open or close is performed). If no -input filename is specified then "-" is implied. - -If a second argument is given then it should be the name of the desired -output file. If the special output filename "-" or ">&STDOUT" is given -then the STDOUT filehandle is used for output (and no open or close is -performed). If the special output filename ">&STDERR" is given then the -STDERR filehandle is used for output (and no open or close is -performed). If no output filehandle is currently in use and no output -filename is specified, then "-" is implied. - -This method does I<not> usually need to be overridden by subclasses. - -=cut - -sub parse_from_file { - my $self = shift; - my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); - my ($infile, $outfile) = @_; - my ($in_fh, $out_fh) = (gensym, gensym) if ($] < 5.6); - my ($close_input, $close_output) = (0, 0); - local *myData = $self; - local $_; - - ## Is $infile a filename or a (possibly implied) filehandle - $infile = '-' unless ((defined $infile) && (length $infile)); - if (($infile eq '-') || ($infile =~ /^<&(STDIN|0)$/i)) { - ## Not a filename, just a string implying STDIN - $myData{_INFILE} = "<standard input>"; - $in_fh = \*STDIN; - } - elsif (ref $infile) { - ## Must be a filehandle-ref (or else assume its a ref to an object - ## that supports the common IO read operations). - $myData{_INFILE} = ${$infile}; - $in_fh = $infile; - } - else { - ## We have a filename, open it for reading - $myData{_INFILE} = $infile; - open($in_fh, "< $infile") or - croak "Can't open $infile for reading: $!\n"; - $close_input = 1; - } - - ## NOTE: we need to be *very* careful when "defaulting" the output - ## file. We only want to use a default if this is the beginning of - ## the entire document (but *not* if this is an included file). We - ## determine this by seeing if the input stream stack has been set-up - ## already - ## - unless ((defined $outfile) && (length $outfile)) { - (defined $myData{_TOP_STREAM}) && ($out_fh = $myData{_OUTPUT}) - || ($outfile = '-'); - } - ## Is $outfile a filename or a (possibly implied) filehandle - if ((defined $outfile) && (length $outfile)) { - if (($outfile eq '-') || ($outfile =~ /^>&?(?:STDOUT|1)$/i)) { - ## Not a filename, just a string implying STDOUT - $myData{_OUTFILE} = "<standard output>"; - $out_fh = \*STDOUT; - } - elsif ($outfile =~ /^>&(STDERR|2)$/i) { - ## Not a filename, just a string implying STDERR - $myData{_OUTFILE} = "<standard error>"; - $out_fh = \*STDERR; - } - elsif (ref $outfile) { - ## Must be a filehandle-ref (or else assume its a ref to an - ## object that supports the common IO write operations). - $myData{_OUTFILE} = ${$outfile}; - $out_fh = $outfile; - } - else { - ## We have a filename, open it for writing - $myData{_OUTFILE} = $outfile; - (-d $outfile) and croak "$outfile is a directory, not POD input!\n"; - open($out_fh, "> $outfile") or - croak "Can't open $outfile for writing: $!\n"; - $close_output = 1; - } - } - - ## Whew! That was a lot of work to set up reasonably/robust behavior - ## in the case of a non-filename for reading and writing. Now we just - ## have to parse the input and close the handles when we're finished. - $self->parse_from_filehandle(\%opts, $in_fh, $out_fh); - - $close_input and - close($in_fh) || croak "Can't close $infile after reading: $!\n"; - $close_output and - close($out_fh) || croak "Can't close $outfile after writing: $!\n"; -} - -############################################################################# - -=head1 ACCESSOR METHODS - -Clients of B<Pod::Parser> should use the following methods to access -instance data fields: - -=cut - -##--------------------------------------------------------------------------- - -=head1 B<errorsub()> - - $parser->errorsub("method_name"); - $parser->errorsub(\&warn_user); - $parser->errorsub(sub { print STDERR, @_ }); - -Specifies the method or subroutine to use when printing error messages -about POD syntax. The supplied method/subroutine I<must> return TRUE upon -successful printing of the message. If C<undef> is given, then the B<warn> -builtin is used to issue error messages (this is the default behavior). - - my $errorsub = $parser->errorsub() - my $errmsg = "This is an error message!\n" - (ref $errorsub) and &{$errorsub}($errmsg) - or (defined $errorsub) and $parser->$errorsub($errmsg) - or warn($errmsg); - -Returns a method name, or else a reference to the user-supplied subroutine -used to print error messages. Returns C<undef> if the B<warn> builtin -is used to issue error messages (this is the default behavior). - -=cut - -sub errorsub { - return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB}; -} - -##--------------------------------------------------------------------------- - -=head1 B<cutting()> - - $boolean = $parser->cutting(); - -Returns the current C<cutting> state: a boolean-valued scalar which -evaluates to true if text from the input file is currently being "cut" -(meaning it is I<not> considered part of the POD document). - - $parser->cutting($boolean); - -Sets the current C<cutting> state to the given value and returns the -result. - -=cut - -sub cutting { - return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING}; -} - -##--------------------------------------------------------------------------- - -##--------------------------------------------------------------------------- - -=head1 B<parseopts()> - -When invoked with no additional arguments, B<parseopts> returns a hashtable -of all the current parsing options. - - ## See if we are parsing non-POD sections as well as POD ones - my %opts = $parser->parseopts(); - $opts{'-want_nonPODs}' and print "-want_nonPODs\n"; - -When invoked using a single string, B<parseopts> treats the string as the -name of a parse-option and returns its corresponding value if it exists -(returns C<undef> if it doesn't). - - ## Did we ask to see '=cut' paragraphs? - my $want_cut = $parser->parseopts('-process_cut_cmd'); - $want_cut and print "-process_cut_cmd\n"; - -When invoked with multiple arguments, B<parseopts> treats them as -key/value pairs and the specified parse-option names are set to the -given values. Any unspecified parse-options are unaffected. - - ## Set them back to the default - $parser->parseopts(-warnings => 0); - -When passed a single hash-ref, B<parseopts> uses that hash to completely -reset the existing parse-options, all previous parse-option values -are lost. - - ## Reset all options to default - $parser->parseopts( { } ); - -See L<"PARSING OPTIONS"> for more information on the name and meaning of each -parse-option currently recognized. - -=cut - -sub parseopts { - local *myData = shift; - local *myOpts = ($myData{_PARSEOPTS} ||= {}); - return %myOpts if (@_ == 0); - if (@_ == 1) { - local $_ = shift; - return ref($_) ? $myData{_PARSEOPTS} = $_ : $myOpts{$_}; - } - my @newOpts = (%myOpts, @_); - $myData{_PARSEOPTS} = { @newOpts }; -} - -##--------------------------------------------------------------------------- - -=head1 B<output_file()> - - $fname = $parser->output_file(); - -Returns the name of the output file being written. - -=cut - -sub output_file { - return $_[0]->{_OUTFILE}; -} - -##--------------------------------------------------------------------------- - -=head1 B<output_handle()> - - $fhandle = $parser->output_handle(); - -Returns the output filehandle object. - -=cut - -sub output_handle { - return $_[0]->{_OUTPUT}; -} - -##--------------------------------------------------------------------------- - -=head1 B<input_file()> - - $fname = $parser->input_file(); - -Returns the name of the input file being read. - -=cut - -sub input_file { - return $_[0]->{_INFILE}; -} - -##--------------------------------------------------------------------------- - -=head1 B<input_handle()> - - $fhandle = $parser->input_handle(); - -Returns the current input filehandle object. - -=cut - -sub input_handle { - return $_[0]->{_INPUT}; -} - -##--------------------------------------------------------------------------- - -=begin __PRIVATE__ - -=head1 B<input_streams()> - - $listref = $parser->input_streams(); - -Returns a reference to an array which corresponds to the stack of all -the input streams that are currently in the middle of being parsed. - -While parsing an input stream, it is possible to invoke -B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input -stream and then return to parsing the previous input stream. Each input -stream to be parsed is pushed onto the end of this input stack -before any of its input is read. The input stream that is currently -being parsed is always at the end (or top) of the input stack. When an -input stream has been exhausted, it is popped off the end of the -input stack. - -Each element on this input stack is a reference to C<Pod::InputSource> -object. Please see L<Pod::InputObjects> for more details. - -This method might be invoked when printing diagnostic messages, for example, -to obtain the name and line number of the all input files that are currently -being processed. - -=end __PRIVATE__ - -=cut - -sub input_streams { - return $_[0]->{_INPUT_STREAMS}; -} - -##--------------------------------------------------------------------------- - -=begin __PRIVATE__ - -=head1 B<top_stream()> - - $hashref = $parser->top_stream(); - -Returns a reference to the hash-table that represents the element -that is currently at the top (end) of the input stream stack -(see L<"input_streams()">). The return value will be the C<undef> -if the input stack is empty. - -This method might be used when printing diagnostic messages, for example, -to obtain the name and line number of the current input file. - -=end __PRIVATE__ - -=cut - -sub top_stream { - return $_[0]->{_TOP_STREAM} || undef; -} - -############################################################################# - -=head1 PRIVATE METHODS AND DATA - -B<Pod::Parser> makes use of several internal methods and data fields -which clients should not need to see or use. For the sake of avoiding -name collisions for client data and methods, these methods and fields -are briefly discussed here. Determined hackers may obtain further -information about them by reading the B<Pod::Parser> source code. - -Private data fields are stored in the hash-object whose reference is -returned by the B<new()> constructor for this class. The names of all -private methods and data-fields used by B<Pod::Parser> begin with a -prefix of "_" and match the regular expression C</^_\w+$/>. - -=cut - -##--------------------------------------------------------------------------- - -=begin _PRIVATE_ - -=head1 B<_push_input_stream()> - - $hashref = $parser->_push_input_stream($in_fh,$out_fh); - -This method will push the given input stream on the input stack and -perform any necessary beginning-of-document or beginning-of-file -processing. The argument C<$in_fh> is the input stream filehandle to -push, and C<$out_fh> is the corresponding output filehandle to use (if -it is not given or is undefined, then the current output stream is used, -which defaults to standard output if it doesnt exist yet). - -The value returned will be reference to the hash-table that represents -the new top of the input stream stack. I<Please Note> that it is -possible for this method to use default values for the input and output -file handles. If this happens, you will need to look at the C<INPUT> -and C<OUTPUT> instance data members to determine their new values. - -=end _PRIVATE_ - -=cut - -sub _push_input_stream { - my ($self, $in_fh, $out_fh) = @_; - local *myData = $self; - - ## Initialize stuff for the entire document if this is *not* - ## an included file. - ## - ## NOTE: we need to be *very* careful when "defaulting" the output - ## filehandle. We only want to use a default value if this is the - ## beginning of the entire document (but *not* if this is an included - ## file). - unless (defined $myData{_TOP_STREAM}) { - $out_fh = \*STDOUT unless (defined $out_fh); - $myData{_CUTTING} = 1; ## current "cutting" state - $myData{_INPUT_STREAMS} = []; ## stack of all input streams - } - - ## Initialize input indicators - $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE}); - $myData{_OUTPUT} = $out_fh if (defined $out_fh); - $in_fh = \*STDIN unless (defined $in_fh); - $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE}); - $myData{_INPUT} = $in_fh; - my $input_top = $myData{_TOP_STREAM} - = new Pod::InputSource( - -name => $myData{_INFILE}, - -handle => $in_fh, - -was_cutting => $myData{_CUTTING} - ); - local *input_stack = $myData{_INPUT_STREAMS}; - push(@input_stack, $input_top); - - ## Perform beginning-of-document and/or beginning-of-input processing - $self->begin_pod() if (@input_stack == 1); - $self->begin_input(); - - return $input_top; -} - -##--------------------------------------------------------------------------- - -=begin _PRIVATE_ - -=head1 B<_pop_input_stream()> - - $hashref = $parser->_pop_input_stream(); - -This takes no arguments. It will perform any necessary end-of-file or -end-of-document processing and then pop the current input stream from -the top of the input stack. - -The value returned will be reference to the hash-table that represents -the new top of the input stream stack. - -=end _PRIVATE_ - -=cut - -sub _pop_input_stream { - my ($self) = @_; - local *myData = $self; - local *input_stack = $myData{_INPUT_STREAMS}; - - ## Perform end-of-input and/or end-of-document processing - $self->end_input() if (@input_stack > 0); - $self->end_pod() if (@input_stack == 1); - - ## Restore cutting state to whatever it was before we started - ## parsing this file. - my $old_top = pop(@input_stack); - $myData{_CUTTING} = $old_top->was_cutting(); - - ## Dont forget to reset the input indicators - my $input_top = undef; - if (@input_stack > 0) { - $input_top = $myData{_TOP_STREAM} = $input_stack[-1]; - $myData{_INFILE} = $input_top->name(); - $myData{_INPUT} = $input_top->handle(); - } else { - delete $myData{_TOP_STREAM}; - delete $myData{_INPUT_STREAMS}; - } - - return $input_top; -} - -############################################################################# - -=head1 TREE-BASED PARSING - -If straightforward stream-based parsing wont meet your needs (as is -likely the case for tasks such as translating PODs into structured -markup languages like HTML and XML) then you may need to take the -tree-based approach. Rather than doing everything in one pass and -calling the B<interpolate()> method to expand sequences into text, it -may be desirable to instead create a parse-tree using the B<parse_text()> -method to return a tree-like structure which may contain an ordered list -list of children (each of which may be a text-string, or a similar -tree-like structure). - -Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and -to the objects described in L<Pod::InputObjects>. The former describes -the gory details and parameters for how to customize and extend the -parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides -several objects that may all be used interchangeably as parse-trees. The -most obvious one is the B<Pod::ParseTree> object. It defines the basic -interface and functionality that all things trying to be a POD parse-tree -should do. A B<Pod::ParseTree> is defined such that each "node" may be a -text-string, or a reference to another parse-tree. Each B<Pod::Paragraph> -object and each B<Pod::InteriorSequence> object also supports the basic -parse-tree interface. - -The B<parse_text()> method takes a given paragraph of text, and -returns a parse-tree that contains one or more children, each of which -may be a text-string, or an InteriorSequence object. There are also -callback-options that may be passed to B<parse_text()> to customize -the way it expands or transforms interior-sequences, as well as the -returned result. These callbacks can be used to create a parse-tree -with custom-made objects (which may or may not support the parse-tree -interface, depending on how you choose to do it). - -If you wish to turn an entire POD document into a parse-tree, that process -is fairly straightforward. The B<parse_text()> method is the key to doing -this successfully. Every paragraph-callback (i.e. the polymorphic methods -for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes -a B<Pod::Paragraph> object as an argument. Each paragraph object has a -B<parse_tree()> method that can be used to get or set a corresponding -parse-tree. So for each of those paragraph-callback methods, simply call -B<parse_text()> with the options you desire, and then use the returned -parse-tree to assign to the given paragraph object. - -That gives you a parse-tree for each paragraph - so now all you need is -an ordered list of paragraphs. You can maintain that yourself as a data -element in the object/hash. The most straightforward way would be simply -to use an array-ref, with the desired set of custom "options" for each -invocation of B<parse_text>. Let's assume the desired option-set is -given by the hash C<%options>. Then we might do something like the -following: - - package MyPodParserTree; - - @ISA = qw( Pod::Parser ); - - ... - - sub begin_pod { - my $self = shift; - $self->{'-paragraphs'} = []; ## initialize paragraph list - } - - sub command { - my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; - my $ptree = $parser->parse_text({%options}, $paragraph, ...); - $pod_para->parse_tree( $ptree ); - push @{ $self->{'-paragraphs'} }, $pod_para; - } - - sub verbatim { - my ($parser, $paragraph, $line_num, $pod_para) = @_; - push @{ $self->{'-paragraphs'} }, $pod_para; - } - - sub textblock { - my ($parser, $paragraph, $line_num, $pod_para) = @_; - my $ptree = $parser->parse_text({%options}, $paragraph, ...); - $pod_para->parse_tree( $ptree ); - push @{ $self->{'-paragraphs'} }, $pod_para; - } - - ... - - package main; - ... - my $parser = new MyPodParserTree(...); - $parser->parse_from_file(...); - my $paragraphs_ref = $parser->{'-paragraphs'}; - -Of course, in this module-author's humble opinion, I'd be more inclined to -use the existing B<Pod::ParseTree> object than a simple array. That way -everything in it, paragraphs and sequences, all respond to the same core -interface for all parse-tree nodes. The result would look something like: - - package MyPodParserTree2; - - ... - - sub begin_pod { - my $self = shift; - $self->{'-ptree'} = new Pod::ParseTree; ## initialize parse-tree - } - - sub parse_tree { - ## convenience method to get/set the parse-tree for the entire POD - (@_ > 1) and $_[0]->{'-ptree'} = $_[1]; - return $_[0]->{'-ptree'}; - } - - sub command { - my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; - my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); - $pod_para->parse_tree( $ptree ); - $parser->parse_tree()->append( $pod_para ); - } - - sub verbatim { - my ($parser, $paragraph, $line_num, $pod_para) = @_; - $parser->parse_tree()->append( $pod_para ); - } - - sub textblock { - my ($parser, $paragraph, $line_num, $pod_para) = @_; - my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); - $pod_para->parse_tree( $ptree ); - $parser->parse_tree()->append( $pod_para ); - } - - ... - - package main; - ... - my $parser = new MyPodParserTree2(...); - $parser->parse_from_file(...); - my $ptree = $parser->parse_tree; - ... - -Now you have the entire POD document as one great big parse-tree. You -can even use the B<-expand_seq> option to B<parse_text> to insert -whole different kinds of objects. Just don't expect B<Pod::Parser> -to know what to do with them after that. That will need to be in your -code. Or, alternatively, you can insert any object you like so long as -it conforms to the B<Pod::ParseTree> interface. - -One could use this to create subclasses of B<Pod::Paragraphs> and -B<Pod::InteriorSequences> for specific commands (or to create your own -custom node-types in the parse-tree) and add some kind of B<emit()> -method to each custom node/subclass object in the tree. Then all you'd -need to do is recursively walk the tree in the desired order, processing -the children (most likely from left to right) by formatting them if -they are text-strings, or by calling their B<emit()> method if they -are objects/references. - -=head1 SEE ALSO - -L<Pod::InputObjects>, L<Pod::Select> - -B<Pod::InputObjects> defines POD input objects corresponding to -command paragraphs, parse-trees, and interior-sequences. - -B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability -to selectively include and/or exclude sections of a POD document from being -translated based upon the current heading, subheading, subsubheading, etc. - -=for __PRIVATE__ -B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users -the ability the employ I<callback functions> instead of, or in addition -to, overriding methods of the base class. - -=for __PRIVATE__ -B<Pod::Select> and B<Pod::Callbacks> do not override any -methods nor do they define any new methods with the same name. Because -of this, they may I<both> be used (in combination) as a base class of -the same subclass in order to combine their functionality without -causing any namespace clashes due to multiple inheritance. - -=head1 AUTHOR - -Brad Appleton E<lt>bradapp@enteract.comE<gt> - -Based on code for B<Pod::Text> written by -Tom Christiansen E<lt>tchrist@mox.perl.comE<gt> - -=cut - -1; |