summaryrefslogtreecommitdiffstats
path: root/contrib/perl5/pod/perlref.pod
diff options
context:
space:
mode:
authormarkm <markm@FreeBSD.org>1998-09-09 07:00:04 +0000
committermarkm <markm@FreeBSD.org>1998-09-09 07:00:04 +0000
commit4fcbc3669aa997848e15198cc9fb856287a6788c (patch)
tree58b20e81687d6d5931f120b50802ed21225bf440 /contrib/perl5/pod/perlref.pod
downloadFreeBSD-src-4fcbc3669aa997848e15198cc9fb856287a6788c.zip
FreeBSD-src-4fcbc3669aa997848e15198cc9fb856287a6788c.tar.gz
Initial import of Perl5. The king is dead; long live the king!
Diffstat (limited to 'contrib/perl5/pod/perlref.pod')
-rw-r--r--contrib/perl5/pod/perlref.pod646
1 files changed, 646 insertions, 0 deletions
diff --git a/contrib/perl5/pod/perlref.pod b/contrib/perl5/pod/perlref.pod
new file mode 100644
index 0000000..66b1a7d
--- /dev/null
+++ b/contrib/perl5/pod/perlref.pod
@@ -0,0 +1,646 @@
+=head1 NAME
+
+perlref - Perl references and nested data structures
+
+=head1 DESCRIPTION
+
+Before release 5 of Perl it was difficult to represent complex data
+structures, because all references had to be symbolic--and even then
+it was difficult to refer to a variable instead of a symbol table entry.
+Perl now not only makes it easier to use symbolic references to variables,
+but also lets you have "hard" references to any piece of data or code.
+Any scalar may hold a hard reference. Because arrays and hashes contain
+scalars, you can now easily build arrays of arrays, arrays of hashes,
+hashes of arrays, arrays of hashes of functions, and so on.
+
+Hard references are smart--they keep track of reference counts for you,
+automatically freeing the thing referred to when its reference count goes
+to zero. (Note: the reference counts for values in self-referential or
+cyclic data structures may not go to zero without a little help; see
+L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.)
+If that thing happens to be an object, the object is destructed. See
+L<perlobj> for more about objects. (In a sense, everything in Perl is an
+object, but we usually reserve the word for references to objects that
+have been officially "blessed" into a class package.)
+
+Symbolic references are names of variables or other objects, just as a
+symbolic link in a Unix filesystem contains merely the name of a file.
+The C<*glob> notation is a kind of symbolic reference. (Symbolic
+references are sometimes called "soft references", but please don't call
+them that; references are confusing enough without useless synonyms.)
+
+In contrast, hard references are more like hard links in a Unix file
+system: They are used to access an underlying object without concern for
+what its (other) name is. When the word "reference" is used without an
+adjective, as in the following paragraph, it is usually talking about a
+hard reference.
+
+References are easy to use in Perl. There is just one overriding
+principle: Perl does no implicit referencing or dereferencing. When a
+scalar is holding a reference, it always behaves as a simple scalar. It
+doesn't magically start being an array or hash or subroutine; you have to
+tell it explicitly to do so, by dereferencing it.
+
+=head2 Making References
+
+References can be created in several ways.
+
+=over 4
+
+=item 1.
+
+By using the backslash operator on a variable, subroutine, or value.
+(This works much like the & (address-of) operator in C.) Note
+that this typically creates I<ANOTHER> reference to a variable, because
+there's already a reference to the variable in the symbol table. But
+the symbol table reference might go away, and you'll still have the
+reference that the backslash returned. Here are some examples:
+
+ $scalarref = \$foo;
+ $arrayref = \@ARGV;
+ $hashref = \%ENV;
+ $coderef = \&handler;
+ $globref = \*foo;
+
+It isn't possible to create a true reference to an IO handle (filehandle
+or dirhandle) using the backslash operator. The most you can get is a
+reference to a typeglob, which is actually a complete symbol table entry.
+But see the explanation of the C<*foo{THING}> syntax below. However,
+you can still use type globs and globrefs as though they were IO handles.
+
+=item 2.
+
+A reference to an anonymous array can be created using square
+brackets:
+
+ $arrayref = [1, 2, ['a', 'b', 'c']];
+
+Here we've created a reference to an anonymous array of three elements
+whose final element is itself a reference to another anonymous array of three
+elements. (The multidimensional syntax described later can be used to
+access this. For example, after the above, C<$arrayref-E<gt>[2][1]> would have
+the value "b".)
+
+Note that taking a reference to an enumerated list is not the same
+as using square brackets--instead it's the same as creating
+a list of references!
+
+ @list = (\$a, \@b, \%c);
+ @list = \($a, @b, %c); # same thing!
+
+As a special case, C<\(@foo)> returns a list of references to the contents
+of C<@foo>, not a reference to C<@foo> itself. Likewise for C<%foo>.
+
+=item 3.
+
+A reference to an anonymous hash can be created using curly
+brackets:
+
+ $hashref = {
+ 'Adam' => 'Eve',
+ 'Clyde' => 'Bonnie',
+ };
+
+Anonymous hash and array composers like these can be intermixed freely to
+produce as complicated a structure as you want. The multidimensional
+syntax described below works for these too. The values above are
+literals, but variables and expressions would work just as well, because
+assignment operators in Perl (even within local() or my()) are executable
+statements, not compile-time declarations.
+
+Because curly brackets (braces) are used for several other things
+including BLOCKs, you may occasionally have to disambiguate braces at the
+beginning of a statement by putting a C<+> or a C<return> in front so
+that Perl realizes the opening brace isn't starting a BLOCK. The economy and
+mnemonic value of using curlies is deemed worth this occasional extra
+hassle.
+
+For example, if you wanted a function to make a new hash and return a
+reference to it, you have these options:
+
+ sub hashem { { @_ } } # silently wrong
+ sub hashem { +{ @_ } } # ok
+ sub hashem { return { @_ } } # ok
+
+On the other hand, if you want the other meaning, you can do this:
+
+ sub showem { { @_ } } # ambiguous (currently ok, but may change)
+ sub showem { {; @_ } } # ok
+ sub showem { { return @_ } } # ok
+
+Note how the leading C<+{> and C<{;> always serve to disambiguate
+the expression to mean either the HASH reference, or the BLOCK.
+
+=item 4.
+
+A reference to an anonymous subroutine can be created by using
+C<sub> without a subname:
+
+ $coderef = sub { print "Boink!\n" };
+
+Note the presence of the semicolon. Except for the fact that the code
+inside isn't executed immediately, a C<sub {}> is not so much a
+declaration as it is an operator, like C<do{}> or C<eval{}>. (However, no
+matter how many times you execute that particular line (unless you're in an
+C<eval("...")>), C<$coderef> will still have a reference to the I<SAME>
+anonymous subroutine.)
+
+Anonymous subroutines act as closures with respect to my() variables,
+that is, variables visible lexically within the current scope. Closure
+is a notion out of the Lisp world that says if you define an anonymous
+function in a particular lexical context, it pretends to run in that
+context even when it's called outside of the context.
+
+In human terms, it's a funny way of passing arguments to a subroutine when
+you define it as well as when you call it. It's useful for setting up
+little bits of code to run later, such as callbacks. You can even
+do object-oriented stuff with it, though Perl already provides a different
+mechanism to do that--see L<perlobj>.
+
+You can also think of closure as a way to write a subroutine template without
+using eval. (In fact, in version 5.000, eval was the I<only> way to get
+closures. You may wish to use "require 5.001" if you use closures.)
+
+Here's a small example of how closures works:
+
+ sub newprint {
+ my $x = shift;
+ return sub { my $y = shift; print "$x, $y!\n"; };
+ }
+ $h = newprint("Howdy");
+ $g = newprint("Greetings");
+
+ # Time passes...
+
+ &$h("world");
+ &$g("earthlings");
+
+This prints
+
+ Howdy, world!
+ Greetings, earthlings!
+
+Note particularly that $x continues to refer to the value passed into
+newprint() I<despite> the fact that the "my $x" has seemingly gone out of
+scope by the time the anonymous subroutine runs. That's what closure
+is all about.
+
+This applies only to lexical variables, by the way. Dynamic variables
+continue to work as they have always worked. Closure is not something
+that most Perl programmers need trouble themselves about to begin with.
+
+=item 5.
+
+References are often returned by special subroutines called constructors.
+Perl objects are just references to a special kind of object that happens to know
+which package it's associated with. Constructors are just special
+subroutines that know how to create that association. They do so by
+starting with an ordinary reference, and it remains an ordinary reference
+even while it's also being an object. Constructors are often
+named new() and called indirectly:
+
+ $objref = new Doggie (Tail => 'short', Ears => 'long');
+
+But don't have to be:
+
+ $objref = Doggie->new(Tail => 'short', Ears => 'long');
+
+ use Term::Cap;
+ $terminal = Term::Cap->Tgetent( { OSPEED => 9600 });
+
+ use Tk;
+ $main = MainWindow->new();
+ $menubar = $main->Frame(-relief => "raised",
+ -borderwidth => 2)
+
+=item 6.
+
+References of the appropriate type can spring into existence if you
+dereference them in a context that assumes they exist. Because we haven't
+talked about dereferencing yet, we can't show you any examples yet.
+
+=item 7.
+
+A reference can be created by using a special syntax, lovingly known as
+the *foo{THING} syntax. *foo{THING} returns a reference to the THING
+slot in *foo (which is the symbol table entry which holds everything
+known as foo).
+
+ $scalarref = *foo{SCALAR};
+ $arrayref = *ARGV{ARRAY};
+ $hashref = *ENV{HASH};
+ $coderef = *handler{CODE};
+ $ioref = *STDIN{IO};
+ $globref = *foo{GLOB};
+
+All of these are self-explanatory except for *foo{IO}. It returns the
+IO handle, used for file handles (L<perlfunc/open>), sockets
+(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory handles
+(L<perlfunc/opendir>). For compatibility with previous versions of
+Perl, *foo{FILEHANDLE} is a synonym for *foo{IO}.
+
+*foo{THING} returns undef if that particular THING hasn't been used yet,
+except in the case of scalars. *foo{SCALAR} returns a reference to an
+anonymous scalar if $foo hasn't been used yet. This might change in a
+future release.
+
+*foo{IO} is an alternative to the \*HANDLE mechanism given in
+L<perldata/"Typeglobs and Filehandles"> for passing filehandles
+into or out of subroutines, or storing into larger data structures.
+Its disadvantage is that it won't create a new filehandle for you.
+Its advantage is that you have no risk of clobbering more than you want
+to with a typeglob assignment, although if you assign to a scalar instead
+of a typeglob, you're ok.
+
+ splutter(*STDOUT);
+ splutter(*STDOUT{IO});
+
+ sub splutter {
+ my $fh = shift;
+ print $fh "her um well a hmmm\n";
+ }
+
+ $rec = get_rec(*STDIN);
+ $rec = get_rec(*STDIN{IO});
+
+ sub get_rec {
+ my $fh = shift;
+ return scalar <$fh>;
+ }
+
+=back
+
+=head2 Using References
+
+That's it for creating references. By now you're probably dying to
+know how to use references to get back to your long-lost data. There
+are several basic methods.
+
+=over 4
+
+=item 1.
+
+Anywhere you'd put an identifier (or chain of identifiers) as part
+of a variable or subroutine name, you can replace the identifier with
+a simple scalar variable containing a reference of the correct type:
+
+ $bar = $$scalarref;
+ push(@$arrayref, $filename);
+ $$arrayref[0] = "January";
+ $$hashref{"KEY"} = "VALUE";
+ &$coderef(1,2,3);
+ print $globref "output\n";
+
+It's important to understand that we are specifically I<NOT> dereferencing
+C<$arrayref[0]> or C<$hashref{"KEY"}> there. The dereference of the
+scalar variable happens I<BEFORE> it does any key lookups. Anything more
+complicated than a simple scalar variable must use methods 2 or 3 below.
+However, a "simple scalar" includes an identifier that itself uses method
+1 recursively. Therefore, the following prints "howdy".
+
+ $refrefref = \\\"howdy";
+ print $$$$refrefref;
+
+=item 2.
+
+Anywhere you'd put an identifier (or chain of identifiers) as part of a
+variable or subroutine name, you can replace the identifier with a
+BLOCK returning a reference of the correct type. In other words, the
+previous examples could be written like this:
+
+ $bar = ${$scalarref};
+ push(@{$arrayref}, $filename);
+ ${$arrayref}[0] = "January";
+ ${$hashref}{"KEY"} = "VALUE";
+ &{$coderef}(1,2,3);
+ $globref->print("output\n"); # iff IO::Handle is loaded
+
+Admittedly, it's a little silly to use the curlies in this case, but
+the BLOCK can contain any arbitrary expression, in particular,
+subscripted expressions:
+
+ &{ $dispatch{$index} }(1,2,3); # call correct routine
+
+Because of being able to omit the curlies for the simple case of C<$$x>,
+people often make the mistake of viewing the dereferencing symbols as
+proper operators, and wonder about their precedence. If they were,
+though, you could use parentheses instead of braces. That's not the case.
+Consider the difference below; case 0 is a short-hand version of case 1,
+I<NOT> case 2:
+
+ $$hashref{"KEY"} = "VALUE"; # CASE 0
+ ${$hashref}{"KEY"} = "VALUE"; # CASE 1
+ ${$hashref{"KEY"}} = "VALUE"; # CASE 2
+ ${$hashref->{"KEY"}} = "VALUE"; # CASE 3
+
+Case 2 is also deceptive in that you're accessing a variable
+called %hashref, not dereferencing through $hashref to the hash
+it's presumably referencing. That would be case 3.
+
+=item 3.
+
+Subroutine calls and lookups of individual array elements arise often
+enough that it gets cumbersome to use method 2. As a form of
+syntactic sugar, the examples for method 2 may be written:
+
+ $arrayref->[0] = "January"; # Array element
+ $hashref->{"KEY"} = "VALUE"; # Hash element
+ $coderef->(1,2,3); # Subroutine call
+
+The left side of the arrow can be any expression returning a reference,
+including a previous dereference. Note that C<$array[$x]> is I<NOT> the
+same thing as C<$array-E<gt>[$x]> here:
+
+ $array[$x]->{"foo"}->[0] = "January";
+
+This is one of the cases we mentioned earlier in which references could
+spring into existence when in an lvalue context. Before this
+statement, C<$array[$x]> may have been undefined. If so, it's
+automatically defined with a hash reference so that we can look up
+C<{"foo"}> in it. Likewise C<$array[$x]-E<gt>{"foo"}> will automatically get
+defined with an array reference so that we can look up C<[0]> in it.
+This process is called I<autovivification>.
+
+One more thing here. The arrow is optional I<BETWEEN> brackets
+subscripts, so you can shrink the above down to
+
+ $array[$x]{"foo"}[0] = "January";
+
+Which, in the degenerate case of using only ordinary arrays, gives you
+multidimensional arrays just like C's:
+
+ $score[$x][$y][$z] += 42;
+
+Well, okay, not entirely like C's arrays, actually. C doesn't know how
+to grow its arrays on demand. Perl does.
+
+=item 4.
+
+If a reference happens to be a reference to an object, then there are
+probably methods to access the things referred to, and you should probably
+stick to those methods unless you're in the class package that defines the
+object's methods. In other words, be nice, and don't violate the object's
+encapsulation without a very good reason. Perl does not enforce
+encapsulation. We are not totalitarians here. We do expect some basic
+civility though.
+
+=back
+
+The ref() operator may be used to determine what type of thing the
+reference is pointing to. See L<perlfunc>.
+
+The bless() operator may be used to associate the object a reference
+points to with a package functioning as an object class. See L<perlobj>.
+
+A typeglob may be dereferenced the same way a reference can, because
+the dereference syntax always indicates the kind of reference desired.
+So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.
+
+Here's a trick for interpolating a subroutine call into a string:
+
+ print "My sub returned @{[mysub(1,2,3)]} that time.\n";
+
+The way it works is that when the C<@{...}> is seen in the double-quoted
+string, it's evaluated as a block. The block creates a reference to an
+anonymous array containing the results of the call to C<mysub(1,2,3)>. So
+the whole block returns a reference to an array, which is then
+dereferenced by C<@{...}> and stuck into the double-quoted string. This
+chicanery is also useful for arbitrary expressions:
+
+ print "That yields @{[$n + 5]} widgets\n";
+
+=head2 Symbolic references
+
+We said that references spring into existence as necessary if they are
+undefined, but we didn't say what happens if a value used as a
+reference is already defined, but I<ISN'T> a hard reference. If you
+use it as a reference in this case, it'll be treated as a symbolic
+reference. That is, the value of the scalar is taken to be the I<NAME>
+of a variable, rather than a direct link to a (possibly) anonymous
+value.
+
+People frequently expect it to work like this. So it does.
+
+ $name = "foo";
+ $$name = 1; # Sets $foo
+ ${$name} = 2; # Sets $foo
+ ${$name x 2} = 3; # Sets $foofoo
+ $name->[0] = 4; # Sets $foo[0]
+ @$name = (); # Clears @foo
+ &$name(); # Calls &foo() (as in Perl 4)
+ $pack = "THAT";
+ ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval
+
+This is very powerful, and slightly dangerous, in that it's possible
+to intend (with the utmost sincerity) to use a hard reference, and
+accidentally use a symbolic reference instead. To protect against
+that, you can say
+
+ use strict 'refs';
+
+and then only hard references will be allowed for the rest of the enclosing
+block. An inner block may countermand that with
+
+ no strict 'refs';
+
+Only package variables (globals, even if localized) are visible to
+symbolic references. Lexical variables (declared with my()) aren't in
+a symbol table, and thus are invisible to this mechanism. For example:
+
+ local $value = 10;
+ $ref = \$value;
+ {
+ my $value = 20;
+ print $$ref;
+ }
+
+This will still print 10, not 20. Remember that local() affects package
+variables, which are all "global" to the package.
+
+=head2 Not-so-symbolic references
+
+A new feature contributing to readability in perl version 5.001 is that the
+brackets around a symbolic reference behave more like quotes, just as they
+always have within a string. That is,
+
+ $push = "pop on ";
+ print "${push}over";
+
+has always meant to print "pop on over", despite the fact that push is
+a reserved word. This has been generalized to work the same outside
+of quotes, so that
+
+ print ${push} . "over";
+
+and even
+
+ print ${ push } . "over";
+
+will have the same effect. (This would have been a syntax error in
+Perl 5.000, though Perl 4 allowed it in the spaceless form.) Note that this
+construct is I<not> considered to be a symbolic reference when you're
+using strict refs:
+
+ use strict 'refs';
+ ${ bareword }; # Okay, means $bareword.
+ ${ "bareword" }; # Error, symbolic reference.
+
+Similarly, because of all the subscripting that is done using single
+words, we've applied the same rule to any bareword that is used for
+subscripting a hash. So now, instead of writing
+
+ $array{ "aaa" }{ "bbb" }{ "ccc" }
+
+you can write just
+
+ $array{ aaa }{ bbb }{ ccc }
+
+and not worry about whether the subscripts are reserved words. In the
+rare event that you do wish to do something like
+
+ $array{ shift }
+
+you can force interpretation as a reserved word by adding anything that
+makes it more than a bareword:
+
+ $array{ shift() }
+ $array{ +shift }
+ $array{ shift @_ }
+
+The B<-w> switch will warn you if it interprets a reserved word as a string.
+But it will no longer warn you about using lowercase words, because the
+string is effectively quoted.
+
+=head2 Pseudo-hashes: Using an array as a hash
+
+WARNING: This section describes an experimental feature. Details may
+change without notice in future versions.
+
+Beginning with release 5.005 of Perl you can use an array reference
+in some contexts that would normally require a hash reference. This
+allows you to access array elements using symbolic names, as if they
+were fields in a structure.
+
+For this to work, the array must contain extra information. The first
+element of the array has to be a hash reference that maps field names
+to array indices. Here is an example:
+
+ $struct = [{foo => 1, bar => 2}, "FOO", "BAR"];
+
+ $struct->{foo}; # same as $struct->[1], i.e. "FOO"
+ $struct->{bar}; # same as $struct->[2], i.e. "BAR"
+
+ keys %$struct; # will return ("foo", "bar") in some order
+ values %$struct; # will return ("FOO", "BAR") in same some order
+
+ while (my($k,$v) = each %$struct) {
+ print "$k => $v\n";
+ }
+
+Perl will raise an exception if you try to delete keys from a pseudo-hash
+or try to access nonexistent fields. For better performance, Perl can also
+do the translation from field names to array indices at compile time for
+typed object references. See L<fields>.
+
+
+=head2 Function Templates
+
+As explained above, a closure is an anonymous function with access to the
+lexical variables visible when that function was compiled. It retains
+access to those variables even though it doesn't get run until later,
+such as in a signal handler or a Tk callback.
+
+Using a closure as a function template allows us to generate many functions
+that act similarly. Suppopose you wanted functions named after the colors
+that generated HTML font changes for the various colors:
+
+ print "Be ", red("careful"), "with that ", green("light");
+
+The red() and green() functions would be very similar. To create these,
+we'll assign a closure to a typeglob of the name of the function we're
+trying to build.
+
+ @colors = qw(red blue green yellow orange purple violet);
+ for my $name (@colors) {
+ no strict 'refs'; # allow symbol table manipulation
+ *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
+ }
+
+Now all those different functions appear to exist independently. You can
+call red(), RED(), blue(), BLUE(), green(), etc. This technique saves on
+both compile time and memory use, and is less error-prone as well, since
+syntax checks happen at compile time. It's critical that any variables in
+the anonymous subroutine be lexicals in order to create a proper closure.
+That's the reasons for the C<my> on the loop iteration variable.
+
+This is one of the only places where giving a prototype to a closure makes
+much sense. If you wanted to impose scalar context on the arguments of
+these functions (probably not a wise idea for this particular example),
+you could have written it this way instead:
+
+ *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
+
+However, since prototype checking happens at compile time, the assignment
+above happens too late to be of much use. You could address this by
+putting the whole loop of assignments within a BEGIN block, forcing it
+to occur during compilation.
+
+Access to lexicals that change over type--like those in the C<for> loop
+above--only works with closures, not general subroutines. In the general
+case, then, named subroutines do not nest properly, although anonymous
+ones do. If you are accustomed to using nested subroutines in other
+programming languages with their own private variables, you'll have to
+work at it a bit in Perl. The intuitive coding of this kind of thing
+incurs mysterious warnings about ``will not stay shared''. For example,
+this won't work:
+
+ sub outer {
+ my $x = $_[0] + 35;
+ sub inner { return $x * 19 } # WRONG
+ return $x + inner();
+ }
+
+A work-around is the following:
+
+ sub outer {
+ my $x = $_[0] + 35;
+ local *inner = sub { return $x * 19 };
+ return $x + inner();
+ }
+
+Now inner() can only be called from within outer(), because of the
+temporary assignments of the closure (anonymous subroutine). But when
+it does, it has normal access to the lexical variable $x from the scope
+of outer().
+
+This has the interesting effect of creating a function local to another
+function, something not normally supported in Perl.
+
+=head1 WARNING
+
+You may not (usefully) use a reference as the key to a hash. It will be
+converted into a string:
+
+ $x{ \$a } = $a;
+
+If you try to dereference the key, it won't do a hard dereference, and
+you won't accomplish what you're attempting. You might want to do something
+more like
+
+ $r = \@a;
+ $x{ $r } = $r;
+
+And then at least you can use the values(), which will be
+real refs, instead of the keys(), which won't.
+
+The standard Tie::RefHash module provides a convenient workaround to this.
+
+=head1 SEE ALSO
+
+Besides the obvious documents, source code can be instructive.
+Some rather pathological examples of the use of references can be found
+in the F<t/op/ref.t> regression test in the Perl source directory.
+
+See also L<perldsc> and L<perllol> for how to use references to create
+complex data structures, and L<perltoot>, L<perlobj>, and L<perlbot>
+for how to use them to create objects.
OpenPOWER on IntegriCloud