diff options
author | markm <markm@FreeBSD.org> | 1998-09-09 07:00:04 +0000 |
---|---|---|
committer | markm <markm@FreeBSD.org> | 1998-09-09 07:00:04 +0000 |
commit | 4fcbc3669aa997848e15198cc9fb856287a6788c (patch) | |
tree | 58b20e81687d6d5931f120b50802ed21225bf440 /contrib/perl5/pod/perlobj.pod | |
download | FreeBSD-src-4fcbc3669aa997848e15198cc9fb856287a6788c.zip FreeBSD-src-4fcbc3669aa997848e15198cc9fb856287a6788c.tar.gz |
Initial import of Perl5. The king is dead; long live the king!
Diffstat (limited to 'contrib/perl5/pod/perlobj.pod')
-rw-r--r-- | contrib/perl5/pod/perlobj.pod | 541 |
1 files changed, 541 insertions, 0 deletions
diff --git a/contrib/perl5/pod/perlobj.pod b/contrib/perl5/pod/perlobj.pod new file mode 100644 index 0000000..f10fbdf --- /dev/null +++ b/contrib/perl5/pod/perlobj.pod @@ -0,0 +1,541 @@ +=head1 NAME + +perlobj - Perl objects + +=head1 DESCRIPTION + +First of all, you need to understand what references are in Perl. +See L<perlref> for that. Second, if you still find the following +reference work too complicated, a tutorial on object-oriented programming +in Perl can be found in L<perltoot>. + +If you're still with us, then +here are three very simple definitions that you should find reassuring. + +=over 4 + +=item 1. + +An object is simply a reference that happens to know which class it +belongs to. + +=item 2. + +A class is simply a package that happens to provide methods to deal +with object references. + +=item 3. + +A method is simply a subroutine that expects an object reference (or +a package name, for class methods) as the first argument. + +=back + +We'll cover these points now in more depth. + +=head2 An Object is Simply a Reference + +Unlike say C++, Perl doesn't provide any special syntax for +constructors. A constructor is merely a subroutine that returns a +reference to something "blessed" into a class, generally the +class that the subroutine is defined in. Here is a typical +constructor: + + package Critter; + sub new { bless {} } + +That word C<new> isn't special. You could have written +a construct this way, too: + + package Critter; + sub spawn { bless {} } + +In fact, this might even be preferable, because the C++ programmers won't +be tricked into thinking that C<new> works in Perl as it does in C++. +It doesn't. We recommend that you name your constructors whatever +makes sense in the context of the problem you're solving. For example, +constructors in the Tk extension to Perl are named after the widgets +they create. + +One thing that's different about Perl constructors compared with those in +C++ is that in Perl, they have to allocate their own memory. (The other +things is that they don't automatically call overridden base-class +constructors.) The C<{}> allocates an anonymous hash containing no +key/value pairs, and returns it The bless() takes that reference and +tells the object it references that it's now a Critter, and returns +the reference. This is for convenience, because the referenced object +itself knows that it has been blessed, and the reference to it could +have been returned directly, like this: + + sub new { + my $self = {}; + bless $self; + return $self; + } + +In fact, you often see such a thing in more complicated constructors +that wish to call methods in the class as part of the construction: + + sub new { + my $self = {}; + bless $self; + $self->initialize(); + return $self; + } + +If you care about inheritance (and you should; see +L<perlmod/"Modules: Creation, Use, and Abuse">), +then you want to use the two-arg form of bless +so that your constructors may be inherited: + + sub new { + my $class = shift; + my $self = {}; + bless $self, $class; + $self->initialize(); + return $self; + } + +Or if you expect people to call not just C<CLASS-E<gt>new()> but also +C<$obj-E<gt>new()>, then use something like this. The initialize() +method used will be of whatever $class we blessed the +object into: + + sub new { + my $this = shift; + my $class = ref($this) || $this; + my $self = {}; + bless $self, $class; + $self->initialize(); + return $self; + } + +Within the class package, the methods will typically deal with the +reference as an ordinary reference. Outside the class package, +the reference is generally treated as an opaque value that may +be accessed only through the class's methods. + +A constructor may re-bless a referenced object currently belonging to +another class, but then the new class is responsible for all cleanup +later. The previous blessing is forgotten, as an object may belong +to only one class at a time. (Although of course it's free to +inherit methods from many classes.) If you find yourself having to +do this, the parent class is probably misbehaving, though. + +A clarification: Perl objects are blessed. References are not. Objects +know which package they belong to. References do not. The bless() +function uses the reference to find the object. Consider +the following example: + + $a = {}; + $b = $a; + bless $a, BLAH; + print "\$b is a ", ref($b), "\n"; + +This reports $b as being a BLAH, so obviously bless() +operated on the object and not on the reference. + +=head2 A Class is Simply a Package + +Unlike say C++, Perl doesn't provide any special syntax for class +definitions. You use a package as a class by putting method +definitions into the class. + +There is a special array within each package called @ISA, which says +where else to look for a method if you can't find it in the current +package. This is how Perl implements inheritance. Each element of the +@ISA array is just the name of another package that happens to be a +class package. The classes are searched (depth first) for missing +methods in the order that they occur in @ISA. The classes accessible +through @ISA are known as base classes of the current class. + +All classes implicitly inherit from class C<UNIVERSAL> as their +last base class. Several commonly used methods are automatically +supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for +more details. + +If a missing method is found in one of the base classes, it is cached +in the current class for efficiency. Changing @ISA or defining new +subroutines invalidates the cache and causes Perl to do the lookup again. + +If neither the current class, its named base classes, nor the UNIVERSAL +class contains the requested method, these three places are searched +all over again, this time looking for a method named AUTOLOAD(). If an +AUTOLOAD is found, this method is called on behalf of the missing method, +setting the package global $AUTOLOAD to be the fully qualified name of +the method that was intended to be called. + +If none of that works, Perl finally gives up and complains. + +Perl classes do method inheritance only. Data inheritance is left up +to the class itself. By and large, this is not a problem in Perl, +because most classes model the attributes of their object using an +anonymous hash, which serves as its own little namespace to be carved up +by the various classes that might want to do something with the object. +The only problem with this is that you can't sure that you aren't using +a piece of the hash that isn't already used. A reasonable workaround +is to prepend your fieldname in the hash with the package name. + + sub bump { + my $self = shift; + $self->{ __PACKAGE__ . ".count"}++; + } + +=head2 A Method is Simply a Subroutine + +Unlike say C++, Perl doesn't provide any special syntax for method +definition. (It does provide a little syntax for method invocation +though. More on that later.) A method expects its first argument +to be the object (reference) or package (string) it is being invoked on. There are just two +types of methods, which we'll call class and instance. +(Sometimes you'll hear these called static and virtual, in honor of +the two C++ method types they most closely resemble.) + +A class method expects a class name as the first argument. It +provides functionality for the class as a whole, not for any individual +object belonging to the class. Constructors are typically class +methods. Many class methods simply ignore their first argument, because +they already know what package they're in, and don't care what package +they were invoked via. (These aren't necessarily the same, because +class methods follow the inheritance tree just like ordinary instance +methods.) Another typical use for class methods is to look up an +object by name: + + sub find { + my ($class, $name) = @_; + $objtable{$name}; + } + +An instance method expects an object reference as its first argument. +Typically it shifts the first argument into a "self" or "this" variable, +and then uses that as an ordinary reference. + + sub display { + my $self = shift; + my @keys = @_ ? @_ : sort keys %$self; + foreach $key (@keys) { + print "\t$key => $self->{$key}\n"; + } + } + +=head2 Method Invocation + +There are two ways to invoke a method, one of which you're already +familiar with, and the other of which will look familiar. Perl 4 +already had an "indirect object" syntax that you use when you say + + print STDERR "help!!!\n"; + +This same syntax can be used to call either class or instance methods. +We'll use the two methods defined above, the class method to lookup +an object reference and the instance method to print out its attributes. + + $fred = find Critter "Fred"; + display $fred 'Height', 'Weight'; + +These could be combined into one statement by using a BLOCK in the +indirect object slot: + + display {find Critter "Fred"} 'Height', 'Weight'; + +For C++ fans, there's also a syntax using -E<gt> notation that does exactly +the same thing. The parentheses are required if there are any arguments. + + $fred = Critter->find("Fred"); + $fred->display('Height', 'Weight'); + +or in one statement, + + Critter->find("Fred")->display('Height', 'Weight'); + +There are times when one syntax is more readable, and times when the +other syntax is more readable. The indirect object syntax is less +cluttered, but it has the same ambiguity as ordinary list operators. +Indirect object method calls are parsed using the same rule as list +operators: "If it looks like a function, it is a function". (Presuming +for the moment that you think two words in a row can look like a +function name. C++ programmers seem to think so with some regularity, +especially when the first word is "new".) Thus, the parentheses of + + new Critter ('Barney', 1.5, 70) + +are assumed to surround ALL the arguments of the method call, regardless +of what comes after. Saying + + new Critter ('Bam' x 2), 1.4, 45 + +would be equivalent to + + Critter->new('Bam' x 2), 1.4, 45 + +which is unlikely to do what you want. + +There are times when you wish to specify which class's method to use. +In this case, you can call your method as an ordinary subroutine +call, being sure to pass the requisite first argument explicitly: + + $fred = MyCritter::find("Critter", "Fred"); + MyCritter::display($fred, 'Height', 'Weight'); + +Note however, that this does not do any inheritance. If you wish +merely to specify that Perl should I<START> looking for a method in a +particular package, use an ordinary method call, but qualify the method +name with the package like this: + + $fred = Critter->MyCritter::find("Fred"); + $fred->MyCritter::display('Height', 'Weight'); + +If you're trying to control where the method search begins I<and> you're +executing in the class itself, then you may use the SUPER pseudo class, +which says to start looking in your base class's @ISA list without having +to name it explicitly: + + $self->SUPER::display('Height', 'Weight'); + +Please note that the C<SUPER::> construct is meaningful I<only> within the +class. + +Sometimes you want to call a method when you don't know the method name +ahead of time. You can use the arrow form, replacing the method name +with a simple scalar variable containing the method name: + + $method = $fast ? "findfirst" : "findbest"; + $fred->$method(@args); + +=head2 Default UNIVERSAL methods + +The C<UNIVERSAL> package automatically contains the following methods that +are inherited by all other classes: + +=over 4 + +=item isa(CLASS) + +C<isa> returns I<true> if its object is blessed into a subclass of C<CLASS> + +C<isa> is also exportable and can be called as a sub with two arguments. This +allows the ability to check what a reference points to. Example + + use UNIVERSAL qw(isa); + + if(isa($ref, 'ARRAY')) { + #... + } + +=item can(METHOD) + +C<can> checks to see if its object has a method called C<METHOD>, +if it does then a reference to the sub is returned, if it does not then +I<undef> is returned. + +=item VERSION( [NEED] ) + +C<VERSION> returns the version number of the class (package). If the +NEED argument is given then it will check that the current version (as +defined by the $VERSION variable in the given package) not less than +NEED; it will die if this is not the case. This method is normally +called as a class method. This method is called automatically by the +C<VERSION> form of C<use>. + + use A 1.2 qw(some imported subs); + # implies: + A->VERSION(1.2); + +=back + +B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and +C<isa> uses a very similar method and cache-ing strategy. This may cause +strange effects if the Perl code dynamically changes @ISA in any package. + +You may add other methods to the UNIVERSAL class via Perl or XS code. +You do not need to C<use UNIVERSAL> in order to make these methods +available to your program. This is necessary only if you wish to +have C<isa> available as a plain subroutine in the current package. + +=head2 Destructors + +When the last reference to an object goes away, the object is +automatically destroyed. (This may even be after you exit, if you've +stored references in global variables.) If you want to capture control +just before the object is freed, you may define a DESTROY method in +your class. It will automatically be called at the appropriate moment, +and you can do any extra cleanup you need to do. Perl passes a reference +to the object under destruction as the first (and only) argument. Beware +that the reference is a read-only value, and cannot be modified by +manipulating C<$_[0]> within the destructor. The object itself (i.e. +the thingy the reference points to, namely C<${$_[0]}>, C<@{$_[0]}>, +C<%{$_[0]}> etc.) is not similarly constrained. + +If you arrange to re-bless the reference before the destructor returns, +perl will again call the DESTROY method for the re-blessed object after +the current one returns. This can be used for clean delegation of +object destruction, or for ensuring that destructors in the base classes +of your choosing get called. Explicitly calling DESTROY is also possible, +but is usually never needed. + +Do not confuse the foregoing with how objects I<CONTAINED> in the current +one are destroyed. Such objects will be freed and destroyed automatically +when the current object is freed, provided no other references to them exist +elsewhere. + +=head2 WARNING + +While indirect object syntax may well be appealing to English speakers and +to C++ programmers, be not seduced! It suffers from two grave problems. + +The first problem is that an indirect object is limited to a name, +a scalar variable, or a block, because it would have to do too much +lookahead otherwise, just like any other postfix dereference in the +language. (These are the same quirky rules as are used for the filehandle +slot in functions like C<print> and C<printf>.) This can lead to horribly +confusing precedence problems, as in these next two lines: + + move $obj->{FIELD}; # probably wrong! + move $ary[$i]; # probably wrong! + +Those actually parse as the very surprising: + + $obj->move->{FIELD}; # Well, lookee here + $ary->move->[$i]; # Didn't expect this one, eh? + +Rather than what you might have expected: + + $obj->{FIELD}->move(); # You should be so lucky. + $ary[$i]->move; # Yeah, sure. + +The left side of ``-E<gt>'' is not so limited, because it's an infix operator, +not a postfix operator. + +As if that weren't bad enough, think about this: Perl must guess I<at +compile time> whether C<name> and C<move> above are functions or methods. +Usually Perl gets it right, but when it doesn't it, you get a function +call compiled as a method, or vice versa. This can introduce subtle +bugs that are hard to unravel. For example, calling a method C<new> +in indirect notation--as C++ programmers are so wont to do--can +be miscompiled into a subroutine call if there's already a C<new> +function in scope. You'd end up calling the current package's C<new> +as a subroutine, rather than the desired class's method. The compiler +tries to cheat by remembering bareword C<require>s, but the grief if it +messes up just isn't worth the years of debugging it would likely take +you to to track such subtle bugs down. + +The infix arrow notation using ``C<-E<gt>>'' doesn't suffer from either +of these disturbing ambiguities, so we recommend you use it exclusively. + +=head2 Summary + +That's about all there is to it. Now you need just to go off and buy a +book about object-oriented design methodology, and bang your forehead +with it for the next six months or so. + +=head2 Two-Phased Garbage Collection + +For most purposes, Perl uses a fast and simple reference-based +garbage collection system. For this reason, there's an extra +dereference going on at some level, so if you haven't built +your Perl executable using your C compiler's C<-O> flag, performance +will suffer. If you I<have> built Perl with C<cc -O>, then this +probably won't matter. + +A more serious concern is that unreachable memory with a non-zero +reference count will not normally get freed. Therefore, this is a bad +idea: + + { + my $a; + $a = \$a; + } + +Even thought $a I<should> go away, it can't. When building recursive data +structures, you'll have to break the self-reference yourself explicitly +if you don't care to leak. For example, here's a self-referential +node such as one might use in a sophisticated tree structure: + + sub new_node { + my $self = shift; + my $class = ref($self) || $self; + my $node = {}; + $node->{LEFT} = $node->{RIGHT} = $node; + $node->{DATA} = [ @_ ]; + return bless $node => $class; + } + +If you create nodes like that, they (currently) won't go away unless you +break their self reference yourself. (In other words, this is not to be +construed as a feature, and you shouldn't depend on it.) + +Almost. + +When an interpreter thread finally shuts down (usually when your program +exits), then a rather costly but complete mark-and-sweep style of garbage +collection is performed, and everything allocated by that thread gets +destroyed. This is essential to support Perl as an embedded or a +multithreadable language. For example, this program demonstrates Perl's +two-phased garbage collection: + + #!/usr/bin/perl + package Subtle; + + sub new { + my $test; + $test = \$test; + warn "CREATING " . \$test; + return bless \$test; + } + + sub DESTROY { + my $self = shift; + warn "DESTROYING $self"; + } + + package main; + + warn "starting program"; + { + my $a = Subtle->new; + my $b = Subtle->new; + $$a = 0; # break selfref + warn "leaving block"; + } + + warn "just exited block"; + warn "time to die..."; + exit; + +When run as F</tmp/test>, the following output is produced: + + starting program at /tmp/test line 18. + CREATING SCALAR(0x8e5b8) at /tmp/test line 7. + CREATING SCALAR(0x8e57c) at /tmp/test line 7. + leaving block at /tmp/test line 23. + DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13. + just exited block at /tmp/test line 26. + time to die... at /tmp/test line 27. + DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. + +Notice that "global destruction" bit there? That's the thread +garbage collector reaching the unreachable. + +Objects are always destructed, even when regular refs aren't and in fact +are destructed in a separate pass before ordinary refs just to try to +prevent object destructors from using refs that have been themselves +destructed. Plain refs are only garbage-collected if the destruct level +is greater than 0. You can test the higher levels of global destruction +by setting the PERL_DESTRUCT_LEVEL environment variable, presuming +C<-DDEBUGGING> was enabled during perl build time. + +A more complete garbage collection strategy will be implemented +at a future date. + +In the meantime, the best solution is to create a non-recursive container +class that holds a pointer to the self-referential data structure. +Define a DESTROY method for the containing object's class that manually +breaks the circularities in the self-referential structure. + +=head1 SEE ALSO + +A kinder, gentler tutorial on object-oriented programming in Perl can +be found in L<perltoot>. +You should also check out L<perlbot> for other object tricks, traps, and tips, +as well as L<perlmodlib> for some style guides on constructing both modules +and classes. |