diff options
Diffstat (limited to 'contrib/perl5/pod/perlsub.pod')
-rw-r--r-- | contrib/perl5/pod/perlsub.pod | 1278 |
1 files changed, 0 insertions, 1278 deletions
diff --git a/contrib/perl5/pod/perlsub.pod b/contrib/perl5/pod/perlsub.pod deleted file mode 100644 index b440cd1..0000000 --- a/contrib/perl5/pod/perlsub.pod +++ /dev/null @@ -1,1278 +0,0 @@ -=head1 NAME - -perlsub - Perl subroutines - -=head1 SYNOPSIS - -To declare subroutines: - - sub NAME; # A "forward" declaration. - sub NAME(PROTO); # ditto, but with prototypes - sub NAME : ATTRS; # with attributes - sub NAME(PROTO) : ATTRS; # with attributes and prototypes - - sub NAME BLOCK # A declaration and a definition. - sub NAME(PROTO) BLOCK # ditto, but with prototypes - sub NAME : ATTRS BLOCK # with attributes - sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes - -To define an anonymous subroutine at runtime: - - $subref = sub BLOCK; # no proto - $subref = sub (PROTO) BLOCK; # with proto - $subref = sub : ATTRS BLOCK; # with attributes - $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes - -To import subroutines: - - use MODULE qw(NAME1 NAME2 NAME3); - -To call subroutines: - - NAME(LIST); # & is optional with parentheses. - NAME LIST; # Parentheses optional if predeclared/imported. - &NAME(LIST); # Circumvent prototypes. - &NAME; # Makes current @_ visible to called subroutine. - -=head1 DESCRIPTION - -Like many languages, Perl provides for user-defined subroutines. -These may be located anywhere in the main program, loaded in from -other files via the C<do>, C<require>, or C<use> keywords, or -generated on the fly using C<eval> or anonymous subroutines. -You can even call a function indirectly using a variable containing -its name or a CODE reference. - -The Perl model for function call and return values is simple: all -functions are passed as parameters one single flat list of scalars, and -all functions likewise return to their caller one single flat list of -scalars. Any arrays or hashes in these call and return lists will -collapse, losing their identities--but you may always use -pass-by-reference instead to avoid this. Both call and return lists may -contain as many or as few scalar elements as you'd like. (Often a -function without an explicit return statement is called a subroutine, but -there's really no difference from Perl's perspective.) - -Any arguments passed in show up in the array C<@_>. Therefore, if -you called a function with two arguments, those would be stored in -C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its -elements are aliases for the actual scalar parameters. In particular, -if an element C<$_[0]> is updated, the corresponding argument is -updated (or an error occurs if it is not updatable). If an argument -is an array or hash element which did not exist when the function -was called, that element is created only when (and if) it is modified -or a reference to it is taken. (Some earlier versions of Perl -created the element whether or not the element was assigned to.) -Assigning to the whole array C<@_> removes that aliasing, and does -not update any arguments. - -The return value of a subroutine is the value of the last expression -evaluated. More explicitly, a C<return> statement may be used to exit the -subroutine, optionally specifying the returned value, which will be -evaluated in the appropriate context (list, scalar, or void) depending -on the context of the subroutine call. If you specify no return value, -the subroutine returns an empty list in list context, the undefined -value in scalar context, or nothing in void context. If you return -one or more aggregates (arrays and hashes), these will be flattened -together into one large indistinguishable list. - -Perl does not have named formal parameters. In practice all you -do is assign to a C<my()> list of these. Variables that aren't -declared to be private are global variables. For gory details -on creating private variables, see L<"Private Variables via my()"> -and L<"Temporary Values via local()">. To create protected -environments for a set of functions in a separate package (and -probably a separate file), see L<perlmod/"Packages">. - -Example: - - sub max { - my $max = shift(@_); - foreach $foo (@_) { - $max = $foo if $max < $foo; - } - return $max; - } - $bestday = max($mon,$tue,$wed,$thu,$fri); - -Example: - - # get a line, combining continuation lines - # that start with whitespace - - sub get_line { - $thisline = $lookahead; # global variables! - LINE: while (defined($lookahead = <STDIN>)) { - if ($lookahead =~ /^[ \t]/) { - $thisline .= $lookahead; - } - else { - last LINE; - } - } - return $thisline; - } - - $lookahead = <STDIN>; # get first line - while (defined($line = get_line())) { - ... - } - -Assigning to a list of private variables to name your arguments: - - sub maybeset { - my($key, $value) = @_; - $Foo{$key} = $value unless $Foo{$key}; - } - -Because the assignment copies the values, this also has the effect -of turning call-by-reference into call-by-value. Otherwise a -function is free to do in-place modifications of C<@_> and change -its caller's values. - - upcase_in($v1, $v2); # this changes $v1 and $v2 - sub upcase_in { - for (@_) { tr/a-z/A-Z/ } - } - -You aren't allowed to modify constants in this way, of course. If an -argument were actually literal and you tried to change it, you'd take a -(presumably fatal) exception. For example, this won't work: - - upcase_in("frederick"); - -It would be much safer if the C<upcase_in()> function -were written to return a copy of its parameters instead -of changing them in place: - - ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2 - sub upcase { - return unless defined wantarray; # void context, do nothing - my @parms = @_; - for (@parms) { tr/a-z/A-Z/ } - return wantarray ? @parms : $parms[0]; - } - -Notice how this (unprototyped) function doesn't care whether it was -passed real scalars or arrays. Perl sees all arguments as one big, -long, flat parameter list in C<@_>. This is one area where -Perl's simple argument-passing style shines. The C<upcase()> -function would work perfectly well without changing the C<upcase()> -definition even if we fed it things like this: - - @newlist = upcase(@list1, @list2); - @newlist = upcase( split /:/, $var ); - -Do not, however, be tempted to do this: - - (@a, @b) = upcase(@list1, @list2); - -Like the flattened incoming parameter list, the return list is also -flattened on return. So all you have managed to do here is stored -everything in C<@a> and made C<@b> an empty list. See -L<Pass by Reference> for alternatives. - -A subroutine may be called using an explicit C<&> prefix. The -C<&> is optional in modern Perl, as are parentheses if the -subroutine has been predeclared. The C<&> is I<not> optional -when just naming the subroutine, such as when it's used as -an argument to defined() or undef(). Nor is it optional when you -want to do an indirect subroutine call with a subroutine name or -reference using the C<&$subref()> or C<&{$subref}()> constructs, -although the C<< $subref->() >> notation solves that problem. -See L<perlref> for more about all that. - -Subroutines may be called recursively. If a subroutine is called -using the C<&> form, the argument list is optional, and if omitted, -no C<@_> array is set up for the subroutine: the C<@_> array at the -time of the call is visible to subroutine instead. This is an -efficiency mechanism that new users may wish to avoid. - - &foo(1,2,3); # pass three arguments - foo(1,2,3); # the same - - foo(); # pass a null list - &foo(); # the same - - &foo; # foo() get current args, like foo(@_) !! - foo; # like foo() IFF sub foo predeclared, else "foo" - -Not only does the C<&> form make the argument list optional, it also -disables any prototype checking on arguments you do provide. This -is partly for historical reasons, and partly for having a convenient way -to cheat if you know what you're doing. See L<Prototypes> below. - -Functions whose names are in all upper case are reserved to the Perl -core, as are modules whose names are in all lower case. A -function in all capitals is a loosely-held convention meaning it -will be called indirectly by the run-time system itself, usually -due to a triggered event. Functions that do special, pre-defined -things include C<BEGIN>, C<CHECK>, C<INIT>, C<END>, C<AUTOLOAD>, and -C<DESTROY>--plus all functions mentioned in L<perltie>. - -=head2 Private Variables via my() - -Synopsis: - - my $foo; # declare $foo lexically local - my (@wid, %get); # declare list of variables local - my $foo = "flurp"; # declare $foo lexical, and init it - my @oof = @bar; # declare @oof lexical, and init it - my $x : Foo = $y; # similar, with an attribute applied - -B<WARNING>: The use of attribute lists on C<my> declarations is -experimental. This feature should not be relied upon. It may -change or disappear in future releases of Perl. See L<attributes>. - -The C<my> operator declares the listed variables to be lexically -confined to the enclosing block, conditional (C<if/unless/elsif/else>), -loop (C<for/foreach/while/until/continue>), subroutine, C<eval>, -or C<do/require/use>'d file. If more than one value is listed, the -list must be placed in parentheses. All listed elements must be -legal lvalues. Only alphanumeric identifiers may be lexically -scoped--magical built-ins like C<$/> must currently be C<local>ize -with C<local> instead. - -Unlike dynamic variables created by the C<local> operator, lexical -variables declared with C<my> are totally hidden from the outside -world, including any called subroutines. This is true if it's the -same subroutine called from itself or elsewhere--every call gets -its own copy. - -This doesn't mean that a C<my> variable declared in a statically -enclosing lexical scope would be invisible. Only dynamic scopes -are cut off. For example, the C<bumpx()> function below has access -to the lexical $x variable because both the C<my> and the C<sub> -occurred at the same scope, presumably file scope. - - my $x = 10; - sub bumpx { $x++ } - -An C<eval()>, however, can see lexical variables of the scope it is -being evaluated in, so long as the names aren't hidden by declarations within -the C<eval()> itself. See L<perlref>. - -The parameter list to my() may be assigned to if desired, which allows you -to initialize your variables. (If no initializer is given for a -particular variable, it is created with the undefined value.) Commonly -this is used to name input parameters to a subroutine. Examples: - - $arg = "fred"; # "global" variable - $n = cube_root(27); - print "$arg thinks the root is $n\n"; - fred thinks the root is 3 - - sub cube_root { - my $arg = shift; # name doesn't matter - $arg **= 1/3; - return $arg; - } - -The C<my> is simply a modifier on something you might assign to. So when -you do assign to variables in its argument list, C<my> doesn't -change whether those variables are viewed as a scalar or an array. So - - my ($foo) = <STDIN>; # WRONG? - my @FOO = <STDIN>; - -both supply a list context to the right-hand side, while - - my $foo = <STDIN>; - -supplies a scalar context. But the following declares only one variable: - - my $foo, $bar = 1; # WRONG - -That has the same effect as - - my $foo; - $bar = 1; - -The declared variable is not introduced (is not visible) until after -the current statement. Thus, - - my $x = $x; - -can be used to initialize a new $x with the value of the old $x, and -the expression - - my $x = 123 and $x == 123 - -is false unless the old $x happened to have the value C<123>. - -Lexical scopes of control structures are not bounded precisely by the -braces that delimit their controlled blocks; control expressions are -part of that scope, too. Thus in the loop - - while (my $line = <>) { - $line = lc $line; - } continue { - print $line; - } - -the scope of $line extends from its declaration throughout the rest of -the loop construct (including the C<continue> clause), but not beyond -it. Similarly, in the conditional - - if ((my $answer = <STDIN>) =~ /^yes$/i) { - user_agrees(); - } elsif ($answer =~ /^no$/i) { - user_disagrees(); - } else { - chomp $answer; - die "'$answer' is neither 'yes' nor 'no'"; - } - -the scope of $answer extends from its declaration through the rest -of that conditional, including any C<elsif> and C<else> clauses, -but not beyond it. - -None of the foregoing text applies to C<if/unless> or C<while/until> -modifiers appended to simple statements. Such modifiers are not -control structures and have no effect on scoping. - -The C<foreach> loop defaults to scoping its index variable dynamically -in the manner of C<local>. However, if the index variable is -prefixed with the keyword C<my>, or if there is already a lexical -by that name in scope, then a new lexical is created instead. Thus -in the loop - - for my $i (1, 2, 3) { - some_function(); - } - -the scope of $i extends to the end of the loop, but not beyond it, -rendering the value of $i inaccessible within C<some_function()>. - -Some users may wish to encourage the use of lexically scoped variables. -As an aid to catching implicit uses to package variables, -which are always global, if you say - - use strict 'vars'; - -then any variable mentioned from there to the end of the enclosing -block must either refer to a lexical variable, be predeclared via -C<our> or C<use vars>, or else must be fully qualified with the package name. -A compilation error results otherwise. An inner block may countermand -this with C<no strict 'vars'>. - -A C<my> has both a compile-time and a run-time effect. At compile -time, the compiler takes notice of it. The principal usefulness -of this is to quiet C<use strict 'vars'>, but it is also essential -for generation of closures as detailed in L<perlref>. Actual -initialization is delayed until run time, though, so it gets executed -at the appropriate time, such as each time through a loop, for -example. - -Variables declared with C<my> are not part of any package and are therefore -never fully qualified with the package name. In particular, you're not -allowed to try to make a package variable (or other global) lexical: - - my $pack::var; # ERROR! Illegal syntax - my $_; # also illegal (currently) - -In fact, a dynamic variable (also known as package or global variables) -are still accessible using the fully qualified C<::> notation even while a -lexical of the same name is also visible: - - package main; - local $x = 10; - my $x = 20; - print "$x and $::x\n"; - -That will print out C<20> and C<10>. - -You may declare C<my> variables at the outermost scope of a file -to hide any such identifiers from the world outside that file. This -is similar in spirit to C's static variables when they are used at -the file level. To do this with a subroutine requires the use of -a closure (an anonymous function that accesses enclosing lexicals). -If you want to create a private subroutine that cannot be called -from outside that block, it can declare a lexical variable containing -an anonymous sub reference: - - my $secret_version = '1.001-beta'; - my $secret_sub = sub { print $secret_version }; - &$secret_sub(); - -As long as the reference is never returned by any function within the -module, no outside module can see the subroutine, because its name is not in -any package's symbol table. Remember that it's not I<REALLY> called -C<$some_pack::secret_version> or anything; it's just $secret_version, -unqualified and unqualifiable. - -This does not work with object methods, however; all object methods -have to be in the symbol table of some package to be found. See -L<perlref/"Function Templates"> for something of a work-around to -this. - -=head2 Persistent Private Variables - -Just because a lexical variable is lexically (also called statically) -scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that -within a function it works like a C static. It normally works more -like a C auto, but with implicit garbage collection. - -Unlike local variables in C or C++, Perl's lexical variables don't -necessarily get recycled just because their scope has exited. -If something more permanent is still aware of the lexical, it will -stick around. So long as something else references a lexical, that -lexical won't be freed--which is as it should be. You wouldn't want -memory being free until you were done using it, or kept around once you -were done. Automatic garbage collection takes care of this for you. - -This means that you can pass back or save away references to lexical -variables, whereas to return a pointer to a C auto is a grave error. -It also gives us a way to simulate C's function statics. Here's a -mechanism for giving a function private variables with both lexical -scoping and a static lifetime. If you do want to create something like -C's static variables, just enclose the whole function in an extra block, -and put the static variable outside the function but in the block. - - { - my $secret_val = 0; - sub gimme_another { - return ++$secret_val; - } - } - # $secret_val now becomes unreachable by the outside - # world, but retains its value between calls to gimme_another - -If this function is being sourced in from a separate file -via C<require> or C<use>, then this is probably just fine. If it's -all in the main program, you'll need to arrange for the C<my> -to be executed early, either by putting the whole block above -your main program, or more likely, placing merely a C<BEGIN> -sub around it to make sure it gets executed before your program -starts to run: - - sub BEGIN { - my $secret_val = 0; - sub gimme_another { - return ++$secret_val; - } - } - -See L<perlmod/"Package Constructors and Destructors"> about the -special triggered functions, C<BEGIN>, C<CHECK>, C<INIT> and C<END>. - -If declared at the outermost scope (the file scope), then lexicals -work somewhat like C's file statics. They are available to all -functions in that same file declared below them, but are inaccessible -from outside that file. This strategy is sometimes used in modules -to create private variables that the whole module can see. - -=head2 Temporary Values via local() - -B<WARNING>: In general, you should be using C<my> instead of C<local>, because -it's faster and safer. Exceptions to this include the global punctuation -variables, filehandles and formats, and direct manipulation of the Perl -symbol table itself. Format variables often use C<local> though, as do -other variables whose current value must be visible to called -subroutines. - -Synopsis: - - local $foo; # declare $foo dynamically local - local (@wid, %get); # declare list of variables local - local $foo = "flurp"; # declare $foo dynamic, and init it - local @oof = @bar; # declare @oof dynamic, and init it - - local *FH; # localize $FH, @FH, %FH, &FH ... - local *merlyn = *randal; # now $merlyn is really $randal, plus - # @merlyn is really @randal, etc - local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal - local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc - -A C<local> modifies its listed variables to be "local" to the -enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine -called from within that block>. A C<local> just gives temporary -values to global (meaning package) variables. It does I<not> create -a local variable. This is known as dynamic scoping. Lexical scoping -is done with C<my>, which works more like C's auto declarations. - -If more than one variable is given to C<local>, they must be placed in -parentheses. All listed elements must be legal lvalues. This operator works -by saving the current values of those variables in its argument list on a -hidden stack and restoring them upon exiting the block, subroutine, or -eval. This means that called subroutines can also reference the local -variable, but not the global one. The argument list may be assigned to if -desired, which allows you to initialize your local variables. (If no -initializer is given for a particular variable, it is created with an -undefined value.) Commonly this is used to name the parameters to a -subroutine. Examples: - - for $i ( 0 .. 9 ) { - $digits{$i} = $i; - } - # assume this function uses global %digits hash - parse_num(); - - # now temporarily add to %digits hash - if ($base12) { - # (NOTE: not claiming this is efficient!) - local %digits = (%digits, 't' => 10, 'e' => 11); - parse_num(); # parse_num gets this new %digits! - } - # old %digits restored here - -Because C<local> is a run-time operator, it gets executed each time -through a loop. In releases of Perl previous to 5.0, this used more stack -storage each time until the loop was exited. Perl now reclaims the space -each time through, but it's still more efficient to declare your variables -outside the loop. - -A C<local> is simply a modifier on an lvalue expression. When you assign to -a C<local>ized variable, the C<local> doesn't change whether its list is viewed -as a scalar or an array. So - - local($foo) = <STDIN>; - local @FOO = <STDIN>; - -both supply a list context to the right-hand side, while - - local $foo = <STDIN>; - -supplies a scalar context. - -A note about C<local()> and composite types is in order. Something -like C<local(%foo)> works by temporarily placing a brand new hash in -the symbol table. The old hash is left alone, but is hidden "behind" -the new one. - -This means the old variable is completely invisible via the symbol -table (i.e. the hash entry in the C<*foo> typeglob) for the duration -of the dynamic scope within which the C<local()> was seen. This -has the effect of allowing one to temporarily occlude any magic on -composite types. For instance, this will briefly alter a tied -hash to some other implementation: - - tie %ahash, 'APackage'; - [...] - { - local %ahash; - tie %ahash, 'BPackage'; - [..called code will see %ahash tied to 'BPackage'..] - { - local %ahash; - [..%ahash is a normal (untied) hash here..] - } - } - [..%ahash back to its initial tied self again..] - -As another example, a custom implementation of C<%ENV> might look -like this: - - { - local %ENV; - tie %ENV, 'MyOwnEnv'; - [..do your own fancy %ENV manipulation here..] - } - [..normal %ENV behavior here..] - -It's also worth taking a moment to explain what happens when you -C<local>ize a member of a composite type (i.e. an array or hash element). -In this case, the element is C<local>ized I<by name>. This means that -when the scope of the C<local()> ends, the saved value will be -restored to the hash element whose key was named in the C<local()>, or -the array element whose index was named in the C<local()>. If that -element was deleted while the C<local()> was in effect (e.g. by a -C<delete()> from a hash or a C<shift()> of an array), it will spring -back into existence, possibly extending an array and filling in the -skipped elements with C<undef>. For instance, if you say - - %hash = ( 'This' => 'is', 'a' => 'test' ); - @ary = ( 0..5 ); - { - local($ary[5]) = 6; - local($hash{'a'}) = 'drill'; - while (my $e = pop(@ary)) { - print "$e . . .\n"; - last unless $e > 3; - } - if (@ary) { - $hash{'only a'} = 'test'; - delete $hash{'a'}; - } - } - print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n"; - print "The array has ",scalar(@ary)," elements: ", - join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n"; - -Perl will print - - 6 . . . - 4 . . . - 3 . . . - This is a test only a test. - The array has 6 elements: 0, 1, 2, undef, undef, 5 - -The behavior of local() on non-existent members of composite -types is subject to change in future. - -=head2 Lvalue subroutines - -B<WARNING>: Lvalue subroutines are still experimental and the implementation -may change in future versions of Perl. - -It is possible to return a modifiable value from a subroutine. -To do this, you have to declare the subroutine to return an lvalue. - - my $val; - sub canmod : lvalue { - $val; - } - sub nomod { - $val; - } - - canmod() = 5; # assigns to $val - nomod() = 5; # ERROR - -The scalar/list context for the subroutine and for the right-hand -side of assignment is determined as if the subroutine call is replaced -by a scalar. For example, consider: - - data(2,3) = get_data(3,4); - -Both subroutines here are called in a scalar context, while in: - - (data(2,3)) = get_data(3,4); - -and in: - - (data(2),data(3)) = get_data(3,4); - -all the subroutines are called in a list context. - -=head2 Passing Symbol Table Entries (typeglobs) - -B<WARNING>: The mechanism described in this section was originally -the only way to simulate pass-by-reference in older versions of -Perl. While it still works fine in modern versions, the new reference -mechanism is generally easier to work with. See below. - -Sometimes you don't want to pass the value of an array to a subroutine -but rather the name of it, so that the subroutine can modify the global -copy of it rather than working with a local copy. In perl you can -refer to all objects of a particular name by prefixing the name -with a star: C<*foo>. This is often known as a "typeglob", because the -star on the front can be thought of as a wildcard match for all the -funny prefix characters on variables and subroutines and such. - -When evaluated, the typeglob produces a scalar value that represents -all the objects of that name, including any filehandle, format, or -subroutine. When assigned to, it causes the name mentioned to refer to -whatever C<*> value was assigned to it. Example: - - sub doubleary { - local(*someary) = @_; - foreach $elem (@someary) { - $elem *= 2; - } - } - doubleary(*foo); - doubleary(*bar); - -Scalars are already passed by reference, so you can modify -scalar arguments without using this mechanism by referring explicitly -to C<$_[0]> etc. You can modify all the elements of an array by passing -all the elements as scalars, but you have to use the C<*> mechanism (or -the equivalent reference mechanism) to C<push>, C<pop>, or change the size of -an array. It will certainly be faster to pass the typeglob (or reference). - -Even if you don't want to modify an array, this mechanism is useful for -passing multiple arrays in a single LIST, because normally the LIST -mechanism will merge all the array values so that you can't extract out -the individual arrays. For more on typeglobs, see -L<perldata/"Typeglobs and Filehandles">. - -=head2 When to Still Use local() - -Despite the existence of C<my>, there are still three places where the -C<local> operator still shines. In fact, in these three places, you -I<must> use C<local> instead of C<my>. - -=over 4 - -=item 1. - -You need to give a global variable a temporary value, especially $_. - -The global variables, like C<@ARGV> or the punctuation variables, must be -C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits -it up into chunks separated by lines of equal signs, which are placed -in C<@Fields>. - - { - local @ARGV = ("/etc/motd"); - local $/ = undef; - local $_ = <>; - @Fields = split /^\s*=+\s*$/; - } - -It particular, it's important to C<local>ize $_ in any routine that assigns -to it. Look out for implicit assignments in C<while> conditionals. - -=item 2. - -You need to create a local file or directory handle or a local function. - -A function that needs a filehandle of its own must use -C<local()> on a complete typeglob. This can be used to create new symbol -table entries: - - sub ioqueue { - local (*READER, *WRITER); # not my! - pipe (READER, WRITER); or die "pipe: $!"; - return (*READER, *WRITER); - } - ($head, $tail) = ioqueue(); - -See the Symbol module for a way to create anonymous symbol table -entries. - -Because assignment of a reference to a typeglob creates an alias, this -can be used to create what is effectively a local function, or at least, -a local alias. - - { - local *grow = \&shrink; # only until this block exists - grow(); # really calls shrink() - move(); # if move() grow()s, it shrink()s too - } - grow(); # get the real grow() again - -See L<perlref/"Function Templates"> for more about manipulating -functions by name in this way. - -=item 3. - -You want to temporarily change just one element of an array or hash. - -You can C<local>ize just one element of an aggregate. Usually this -is done on dynamics: - - { - local $SIG{INT} = 'IGNORE'; - funct(); # uninterruptible - } - # interruptibility automatically restored here - -But it also works on lexically declared aggregates. Prior to 5.005, -this operation could on occasion misbehave. - -=back - -=head2 Pass by Reference - -If you want to pass more than one array or hash into a function--or -return them from it--and have them maintain their integrity, then -you're going to have to use an explicit pass-by-reference. Before you -do that, you need to understand references as detailed in L<perlref>. -This section may not make much sense to you otherwise. - -Here are a few simple examples. First, let's pass in several arrays -to a function and have it C<pop> all of then, returning a new list -of all their former last elements: - - @tailings = popmany ( \@a, \@b, \@c, \@d ); - - sub popmany { - my $aref; - my @retlist = (); - foreach $aref ( @_ ) { - push @retlist, pop @$aref; - } - return @retlist; - } - -Here's how you might write a function that returns a -list of keys occurring in all the hashes passed to it: - - @common = inter( \%foo, \%bar, \%joe ); - sub inter { - my ($k, $href, %seen); # locals - foreach $href (@_) { - while ( $k = each %$href ) { - $seen{$k}++; - } - } - return grep { $seen{$_} == @_ } keys %seen; - } - -So far, we're using just the normal list return mechanism. -What happens if you want to pass or return a hash? Well, -if you're using only one of them, or you don't mind them -concatenating, then the normal calling convention is ok, although -a little expensive. - -Where people get into trouble is here: - - (@a, @b) = func(@c, @d); -or - (%a, %b) = func(%c, %d); - -That syntax simply won't work. It sets just C<@a> or C<%a> and -clears the C<@b> or C<%b>. Plus the function didn't get passed -into two separate arrays or hashes: it got one long list in C<@_>, -as always. - -If you can arrange for everyone to deal with this through references, it's -cleaner code, although not so nice to look at. Here's a function that -takes two array references as arguments, returning the two array elements -in order of how many elements they have in them: - - ($aref, $bref) = func(\@c, \@d); - print "@$aref has more than @$bref\n"; - sub func { - my ($cref, $dref) = @_; - if (@$cref > @$dref) { - return ($cref, $dref); - } else { - return ($dref, $cref); - } - } - -It turns out that you can actually do this also: - - (*a, *b) = func(\@c, \@d); - print "@a has more than @b\n"; - sub func { - local (*c, *d) = @_; - if (@c > @d) { - return (\@c, \@d); - } else { - return (\@d, \@c); - } - } - -Here we're using the typeglobs to do symbol table aliasing. It's -a tad subtle, though, and also won't work if you're using C<my> -variables, because only globals (even in disguise as C<local>s) -are in the symbol table. - -If you're passing around filehandles, you could usually just use the bare -typeglob, like C<*STDOUT>, but typeglobs references work, too. -For example: - - splutter(\*STDOUT); - sub splutter { - my $fh = shift; - print $fh "her um well a hmmm\n"; - } - - $rec = get_rec(\*STDIN); - sub get_rec { - my $fh = shift; - return scalar <$fh>; - } - -If you're planning on generating new filehandles, you could do this. -Notice to pass back just the bare *FH, not its reference. - - sub openit { - my $path = shift; - local *FH; - return open (FH, $path) ? *FH : undef; - } - -=head2 Prototypes - -Perl supports a very limited kind of compile-time argument checking -using function prototyping. If you declare - - sub mypush (\@@) - -then C<mypush()> takes arguments exactly like C<push()> does. The -function declaration must be visible at compile time. The prototype -affects only interpretation of new-style calls to the function, -where new-style is defined as not using the C<&> character. In -other words, if you call it like a built-in function, then it behaves -like a built-in function. If you call it like an old-fashioned -subroutine, then it behaves like an old-fashioned subroutine. It -naturally falls out from this rule that prototypes have no influence -on subroutine references like C<\&foo> or on indirect subroutine -calls like C<&{$subref}> or C<< $subref->() >>. - -Method calls are not influenced by prototypes either, because the -function to be called is indeterminate at compile time, since -the exact code called depends on inheritance. - -Because the intent of this feature is primarily to let you define -subroutines that work like built-in functions, here are prototypes -for some other functions that parse almost exactly like the -corresponding built-in. - - Declared as Called as - - sub mylink ($$) mylink $old, $new - sub myvec ($$$) myvec $var, $offset, 1 - sub myindex ($$;$) myindex &getstring, "substr" - sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off - sub myreverse (@) myreverse $a, $b, $c - sub myjoin ($@) myjoin ":", $a, $b, $c - sub mypop (\@) mypop @array - sub mysplice (\@$$@) mysplice @array, @array, 0, @pushme - sub mykeys (\%) mykeys %{$hashref} - sub myopen (*;$) myopen HANDLE, $name - sub mypipe (**) mypipe READHANDLE, WRITEHANDLE - sub mygrep (&@) mygrep { /foo/ } $a, $b, $c - sub myrand ($) myrand 42 - sub mytime () mytime - -Any backslashed prototype character represents an actual argument -that absolutely must start with that character. The value passed -as part of C<@_> will be a reference to the actual argument given -in the subroutine call, obtained by applying C<\> to that argument. - -Unbackslashed prototype characters have special meanings. Any -unbackslashed C<@> or C<%> eats all remaining arguments, and forces -list context. An argument represented by C<$> forces scalar context. An -C<&> requires an anonymous subroutine, which, if passed as the first -argument, does not require the C<sub> keyword or a subsequent comma. - -A C<*> allows the subroutine to accept a bareword, constant, scalar expression, -typeglob, or a reference to a typeglob in that slot. The value will be -available to the subroutine either as a simple scalar, or (in the latter -two cases) as a reference to the typeglob. If you wish to always convert -such arguments to a typeglob reference, use Symbol::qualify_to_ref() as -follows: - - use Symbol 'qualify_to_ref'; - - sub foo (*) { - my $fh = qualify_to_ref(shift, caller); - ... - } - -A semicolon separates mandatory arguments from optional arguments. -It is redundant before C<@> or C<%>, which gobble up everything else. - -Note how the last three examples in the table above are treated -specially by the parser. C<mygrep()> is parsed as a true list -operator, C<myrand()> is parsed as a true unary operator with unary -precedence the same as C<rand()>, and C<mytime()> is truly without -arguments, just like C<time()>. That is, if you say - - mytime +2; - -you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed -without a prototype. - -The interesting thing about C<&> is that you can generate new syntax with it, -provided it's in the initial position: - - sub try (&@) { - my($try,$catch) = @_; - eval { &$try }; - if ($@) { - local $_ = $@; - &$catch; - } - } - sub catch (&) { $_[0] } - - try { - die "phooey"; - } catch { - /phooey/ and print "unphooey\n"; - }; - -That prints C<"unphooey">. (Yes, there are still unresolved -issues having to do with visibility of C<@_>. I'm ignoring that -question for the moment. (But note that if we make C<@_> lexically -scoped, those anonymous subroutines can act like closures... (Gee, -is this sounding a little Lispish? (Never mind.)))) - -And here's a reimplementation of the Perl C<grep> operator: - - sub mygrep (&@) { - my $code = shift; - my @result; - foreach $_ (@_) { - push(@result, $_) if &$code; - } - @result; - } - -Some folks would prefer full alphanumeric prototypes. Alphanumerics have -been intentionally left out of prototypes for the express purpose of -someday in the future adding named, formal parameters. The current -mechanism's main goal is to let module writers provide better diagnostics -for module users. Larry feels the notation quite understandable to Perl -programmers, and that it will not intrude greatly upon the meat of the -module, nor make it harder to read. The line noise is visually -encapsulated into a small pill that's easy to swallow. - -It's probably best to prototype new functions, not retrofit prototyping -into older ones. That's because you must be especially careful about -silent impositions of differing list versus scalar contexts. For example, -if you decide that a function should take just one parameter, like this: - - sub func ($) { - my $n = shift; - print "you gave me $n\n"; - } - -and someone has been calling it with an array or expression -returning a list: - - func(@foo); - func( split /:/ ); - -Then you've just supplied an automatic C<scalar> in front of their -argument, which can be more than a bit surprising. The old C<@foo> -which used to hold one thing doesn't get passed in. Instead, -C<func()> now gets passed in a C<1>; that is, the number of elements -in C<@foo>. And the C<split> gets called in scalar context so it -starts scribbling on your C<@_> parameter list. Ouch! - -This is all very powerful, of course, and should be used only in moderation -to make the world a better place. - -=head2 Constant Functions - -Functions with a prototype of C<()> are potential candidates for -inlining. If the result after optimization and constant folding -is either a constant or a lexically-scoped scalar which has no other -references, then it will be used in place of function calls made -without C<&>. Calls made using C<&> are never inlined. (See -F<constant.pm> for an easy way to declare most constants.) - -The following functions would all be inlined: - - sub pi () { 3.14159 } # Not exact, but close. - sub PI () { 4 * atan2 1, 1 } # As good as it gets, - # and it's inlined, too! - sub ST_DEV () { 0 } - sub ST_INO () { 1 } - - sub FLAG_FOO () { 1 << 8 } - sub FLAG_BAR () { 1 << 9 } - sub FLAG_MASK () { FLAG_FOO | FLAG_BAR } - - sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) } - sub BAZ_VAL () { - if (OPT_BAZ) { - return 23; - } - else { - return 42; - } - } - - sub N () { int(BAZ_VAL) / 3 } - BEGIN { - my $prod = 1; - for (1..N) { $prod *= $_ } - sub N_FACTORIAL () { $prod } - } - -If you redefine a subroutine that was eligible for inlining, you'll get -a mandatory warning. (You can use this warning to tell whether or not a -particular subroutine is considered constant.) The warning is -considered severe enough not to be optional because previously compiled -invocations of the function will still be using the old value of the -function. If you need to be able to redefine the subroutine, you need to -ensure that it isn't inlined, either by dropping the C<()> prototype -(which changes calling semantics, so beware) or by thwarting the -inlining mechanism in some other way, such as - - sub not_inlined () { - 23 if $]; - } - -=head2 Overriding Built-in Functions - -Many built-in functions may be overridden, though this should be tried -only occasionally and for good reason. Typically this might be -done by a package attempting to emulate missing built-in functionality -on a non-Unix system. - -Overriding may be done only by importing the name from a -module--ordinary predeclaration isn't good enough. However, the -C<use subs> pragma lets you, in effect, predeclare subs -via the import syntax, and these names may then override built-in ones: - - use subs 'chdir', 'chroot', 'chmod', 'chown'; - chdir $somewhere; - sub chdir { ... } - -To unambiguously refer to the built-in form, precede the -built-in name with the special package qualifier C<CORE::>. For example, -saying C<CORE::open()> always refers to the built-in C<open()>, even -if the current package has imported some other subroutine called -C<&open()> from elsewhere. Even though it looks like a regular -function call, it isn't: you can't take a reference to it, such as -the incorrect C<\&CORE::open> might appear to produce. - -Library modules should not in general export built-in names like C<open> -or C<chdir> as part of their default C<@EXPORT> list, because these may -sneak into someone else's namespace and change the semantics unexpectedly. -Instead, if the module adds that name to C<@EXPORT_OK>, then it's -possible for a user to import the name explicitly, but not implicitly. -That is, they could say - - use Module 'open'; - -and it would import the C<open> override. But if they said - - use Module; - -they would get the default imports without overrides. - -The foregoing mechanism for overriding built-in is restricted, quite -deliberately, to the package that requests the import. There is a second -method that is sometimes applicable when you wish to override a built-in -everywhere, without regard to namespace boundaries. This is achieved by -importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an -example that quite brazenly replaces the C<glob> operator with something -that understands regular expressions. - - package REGlob; - require Exporter; - @ISA = 'Exporter'; - @EXPORT_OK = 'glob'; - - sub import { - my $pkg = shift; - return unless @_; - my $sym = shift; - my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0)); - $pkg->export($where, $sym, @_); - } - - sub glob { - my $pat = shift; - my @got; - local *D; - if (opendir D, '.') { - @got = grep /$pat/, readdir D; - closedir D; - } - return @got; - } - 1; - -And here's how it could be (ab)used: - - #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces - package Foo; - use REGlob 'glob'; # override glob() in Foo:: only - print for <^[a-z_]+\.pm\$>; # show all pragmatic modules - -The initial comment shows a contrived, even dangerous example. -By overriding C<glob> globally, you would be forcing the new (and -subversive) behavior for the C<glob> operator for I<every> namespace, -without the complete cognizance or cooperation of the modules that own -those namespaces. Naturally, this should be done with extreme caution--if -it must be done at all. - -The C<REGlob> example above does not implement all the support needed to -cleanly override perl's C<glob> operator. The built-in C<glob> has -different behaviors depending on whether it appears in a scalar or list -context, but our C<REGlob> doesn't. Indeed, many perl built-in have such -context sensitive behaviors, and these must be adequately supported by -a properly written override. For a fully functional example of overriding -C<glob>, study the implementation of C<File::DosGlob> in the standard -library. - -=head2 Autoloading - -If you call a subroutine that is undefined, you would ordinarily -get an immediate, fatal error complaining that the subroutine doesn't -exist. (Likewise for subroutines being used as methods, when the -method doesn't exist in any base class of the class's package.) -However, if an C<AUTOLOAD> subroutine is defined in the package or -packages used to locate the original subroutine, then that -C<AUTOLOAD> subroutine is called with the arguments that would have -been passed to the original subroutine. The fully qualified name -of the original subroutine magically appears in the global $AUTOLOAD -variable of the same package as the C<AUTOLOAD> routine. The name -is not passed as an ordinary argument because, er, well, just -because, that's why... - -Many C<AUTOLOAD> routines load in a definition for the requested -subroutine using eval(), then execute that subroutine using a special -form of goto() that erases the stack frame of the C<AUTOLOAD> routine -without a trace. (See the source to the standard module documented -in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can -also just emulate the routine and never define it. For example, -let's pretend that a function that wasn't defined should just invoke -C<system> with those arguments. All you'd do is: - - sub AUTOLOAD { - my $program = $AUTOLOAD; - $program =~ s/.*:://; - system($program, @_); - } - date(); - who('am', 'i'); - ls('-l'); - -In fact, if you predeclare functions you want to call that way, you don't -even need parentheses: - - use subs qw(date who ls); - date; - who "am", "i"; - ls -l; - -A more complete example of this is the standard Shell module, which -can treat undefined subroutine calls as calls to external programs. - -Mechanisms are available to help modules writers split their modules -into autoloadable files. See the standard AutoLoader module -described in L<AutoLoader> and in L<AutoSplit>, the standard -SelfLoader modules in L<SelfLoader>, and the document on adding C -functions to Perl code in L<perlxs>. - -=head2 Subroutine Attributes - -A subroutine declaration or definition may have a list of attributes -associated with it. If such an attribute list is present, it is -broken up at space or colon boundaries and treated as though a -C<use attributes> had been seen. See L<attributes> for details -about what attributes are currently supported. -Unlike the limitation with the obsolescent C<use attrs>, the -C<sub : ATTRLIST> syntax works to associate the attributes with -a pre-declaration, and not just with a subroutine definition. - -The attributes must be valid as simple identifier names (without any -punctuation other than the '_' character). They may have a parameter -list appended, which is only checked for whether its parentheses ('(',')') -nest properly. - -Examples of valid syntax (even though the attributes are unknown): - - sub fnord (&\%) : switch(10,foo(7,3)) : expensive ; - sub plugh () : Ugly('\(") :Bad ; - sub xyzzy : _5x5 { ... } - -Examples of invalid syntax: - - sub fnord : switch(10,foo() ; # ()-string not balanced - sub snoid : Ugly('(') ; # ()-string not balanced - sub xyzzy : 5x5 ; # "5x5" not a valid identifier - sub plugh : Y2::north ; # "Y2::north" not a simple identifier - sub snurt : foo + bar ; # "+" not a colon or space - -The attribute list is passed as a list of constant strings to the code -which associates them with the subroutine. In particular, the second example -of valid syntax above currently looks like this in terms of how it's -parsed and invoked: - - use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad'; - -For further details on attribute lists and their manipulation, -see L<attributes>. - -=head1 SEE ALSO - -See L<perlref/"Function Templates"> for more about references and closures. -See L<perlxs> if you'd like to learn about calling C subroutines from Perl. -See L<perlembed> if you'd like to learn about calling Perl subroutines from C. -See L<perlmod> to learn about bundling up your functions in separate files. -See L<perlmodlib> to learn what library modules come standard on your system. -See L<perltoot> to learn how to make object method calls. |