diff options
Diffstat (limited to 'contrib/perl5/pod/perlfaq9.pod')
-rw-r--r-- | contrib/perl5/pod/perlfaq9.pod | 602 |
1 files changed, 0 insertions, 602 deletions
diff --git a/contrib/perl5/pod/perlfaq9.pod b/contrib/perl5/pod/perlfaq9.pod deleted file mode 100644 index 9676380..0000000 --- a/contrib/perl5/pod/perlfaq9.pod +++ /dev/null @@ -1,602 +0,0 @@ -=head1 NAME - -perlfaq9 - Networking ($Revision: 1.26 $, $Date: 1999/05/23 16:08:30 $) - -=head1 DESCRIPTION - -This section deals with questions related to networking, the internet, -and a few on the web. - -=head2 My CGI script runs from the command line but not the browser. (500 Server Error) - -If you can demonstrate that you've read the following FAQs and that -your problem isn't something simple that can be easily answered, you'll -probably receive a courteous and useful reply to your question if you -post it on comp.infosystems.www.authoring.cgi (if it's something to do -with HTTP, HTML, or the CGI protocols). Questions that appear to be Perl -questions but are really CGI ones that are posted to comp.lang.perl.misc -may not be so well received. - -The useful FAQs and related documents are: - - CGI FAQ - http://www.webthing.com/tutorials/cgifaq.html - - Web FAQ - http://www.boutell.com/faq/ - - WWW Security FAQ - http://www.w3.org/Security/Faq/ - - HTTP Spec - http://www.w3.org/pub/WWW/Protocols/HTTP/ - - HTML Spec - http://www.w3.org/TR/REC-html40/ - http://www.w3.org/pub/WWW/MarkUp/ - - CGI Spec - http://www.w3.org/CGI/ - - CGI Security FAQ - http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt - -=head2 How can I get better error messages from a CGI program? - -Use the CGI::Carp module. It replaces C<warn> and C<die>, plus the -normal Carp modules C<carp>, C<croak>, and C<confess> functions with -more verbose and safer versions. It still sends them to the normal -server error log. - - use CGI::Carp; - warn "This is a complaint"; - die "But this one is serious"; - -The following use of CGI::Carp also redirects errors to a file of your choice, -placed in a BEGIN block to catch compile-time warnings as well: - - BEGIN { - use CGI::Carp qw(carpout); - open(LOG, ">>/var/local/cgi-logs/mycgi-log") - or die "Unable to append to mycgi-log: $!\n"; - carpout(*LOG); - } - -You can even arrange for fatal errors to go back to the client browser, -which is nice for your own debugging, but might confuse the end user. - - use CGI::Carp qw(fatalsToBrowser); - die "Bad error here"; - -Even if the error happens before you get the HTTP header out, the module -will try to take care of this to avoid the dreaded server 500 errors. -Normal warnings still go out to the server error log (or wherever -you've sent them with C<carpout>) with the application name and date -stamp prepended. - -=head2 How do I remove HTML from a string? - -The most correct way (albeit not the fastest) is to use HTML::Parser -from CPAN. Another mostly correct -way is to use HTML::FormatText which not only removes HTML but also -attempts to do a little simple formatting of the resulting plain text. - -Many folks attempt a simple-minded regular expression approach, like -C<< s/<.*?>//g >>, but that fails in many cases because the tags -may continue over line breaks, they may contain quoted angle-brackets, -or HTML comment may be present. Plus, folks forget to convert -entities--like C<<> for example. - -Here's one "simple-minded" approach, that works for most files: - - #!/usr/bin/perl -p0777 - s/<(?:[^>'"]*|(['"]).*?\1)*>//gs - -If you want a more complete solution, see the 3-stage striphtml -program in -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz -. - -Here are some tricky cases that you should think about when picking -a solution: - - <IMG SRC = "foo.gif" ALT = "A > B"> - - <IMG SRC = "foo.gif" - ALT = "A > B"> - - <!-- <A comment> --> - - <script>if (a<b && a>c)</script> - - <# Just data #> - - <![INCLUDE CDATA [ >>>>>>>>>>>> ]]> - -If HTML comments include other tags, those solutions would also break -on text like this: - - <!-- This section commented out. - <B>You can't see me!</B> - --> - -=head2 How do I extract URLs? - -A quick but imperfect approach is - - #!/usr/bin/perl -n00 - # qxurl - tchrist@perl.com - print "$2\n" while m{ - < \s* - A \s+ HREF \s* = \s* (["']) (.*?) \1 - \s* > - }gsix; - -This version does not adjust relative URLs, understand alternate -bases, deal with HTML comments, deal with HREF and NAME attributes -in the same tag, understand extra qualifiers like TARGET, or accept -URLs themselves as arguments. It also runs about 100x faster than a -more "complete" solution using the LWP suite of modules, such as the -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz program. - -=head2 How do I download a file from the user's machine? How do I open a file on another machine? - -In the context of an HTML form, you can use what's known as -B<multipart/form-data> encoding. The CGI.pm module (available from -CPAN) supports this in the start_multipart_form() method, which isn't -the same as the startform() method. - -=head2 How do I make a pop-up menu in HTML? - -Use the B<< <SELECT> >> and B<< <OPTION> >> tags. The CGI.pm -module (available from CPAN) supports this widget, as well as many -others, including some that it cleverly synthesizes on its own. - -=head2 How do I fetch an HTML file? - -One approach, if you have the lynx text-based HTML browser installed -on your system, is this: - - $html_code = `lynx -source $url`; - $text_data = `lynx -dump $url`; - -The libwww-perl (LWP) modules from CPAN provide a more powerful way -to do this. They don't require lynx, but like lynx, can still work -through proxies: - - # simplest version - use LWP::Simple; - $content = get($URL); - - # or print HTML from a URL - use LWP::Simple; - getprint "http://www.linpro.no/lwp/"; - - # or print ASCII from HTML from a URL - # also need HTML-Tree package from CPAN - use LWP::Simple; - use HTML::Parser; - use HTML::FormatText; - my ($html, $ascii); - $html = get("http://www.perl.com/"); - defined $html - or die "Can't fetch HTML from http://www.perl.com/"; - $ascii = HTML::FormatText->new->format(parse_html($html)); - print $ascii; - -=head2 How do I automate an HTML form submission? - -If you're submitting values using the GET method, create a URL and encode -the form using the C<query_form> method: - - use LWP::Simple; - use URI::URL; - - my $url = url('http://www.perl.com/cgi-bin/cpan_mod'); - $url->query_form(module => 'DB_File', readme => 1); - $content = get($url); - -If you're using the POST method, create your own user agent and encode -the content appropriately. - - use HTTP::Request::Common qw(POST); - use LWP::UserAgent; - - $ua = LWP::UserAgent->new(); - my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod', - [ module => 'DB_File', readme => 1 ]; - $content = $ua->request($req)->as_string; - -=head2 How do I decode or create those %-encodings on the web? - - -If you are writing a CGI script, you should be using the CGI.pm module -that comes with perl, or some other equivalent module. The CGI module -automatically decodes queries for you, and provides an escape() -function to handle encoding. - - -The best source of detailed information on URI encoding is RFC 2396. -Basically, the following substitutions do it: - - s/([^\w()'*~!.-])/sprintf '%%%02x', $1/eg; # encode - - s/%([A-Fa-f\d]{2})/chr hex $1/eg; # decode - -However, you should only apply them to individual URI components, not -the entire URI, otherwise you'll lose information and generally mess -things up. If that didn't explain it, don't worry. Just go read -section 2 of the RFC, it's probably the best explanation there is. - -RFC 2396 also contains a lot of other useful information, including a -regexp for breaking any arbitrary URI into components (Appendix B). - -=head2 How do I redirect to another page? - -According to RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1", the -preferred method is to send a C<Location:> header instead of a -C<Content-Type:> header: - - Location: http://www.domain.com/newpage - -Note that relative URLs in these headers can cause strange effects -because of "optimizations" that servers do. - - $url = "http://www.perl.com/CPAN/"; - print "Location: $url\n\n"; - exit; - -To target a particular frame in a frameset, include the "Window-target:" -in the header. - - print <<EOF; - Location: http://www.domain.com/newpage - Window-target: <FrameName> - - EOF - -To be correct to the spec, each of those virtual newlines should -really be physical C<"\015\012"> sequences by the time your message is -received by the client browser. Except for NPH scripts, though, that -local newline should get translated by your server into standard form, -so you shouldn't have a problem here, even if you are stuck on MacOS. -Everybody else probably won't even notice. - -=head2 How do I put a password on my web pages? - -That depends. You'll need to read the documentation for your web -server, or perhaps check some of the other FAQs referenced above. - -=head2 How do I edit my .htpasswd and .htgroup files with Perl? - -The HTTPD::UserAdmin and HTTPD::GroupAdmin modules provide a -consistent OO interface to these files, regardless of how they're -stored. Databases may be text, dbm, Berkley DB or any database with a -DBI compatible driver. HTTPD::UserAdmin supports files used by the -`Basic' and `Digest' authentication schemes. Here's an example: - - use HTTPD::UserAdmin (); - HTTPD::UserAdmin - ->new(DB => "/foo/.htpasswd") - ->add($username => $password); - -=head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things? - -Read the CGI security FAQ, at -http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html , and the -Perl/CGI FAQ at -http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html . - -In brief: use tainting (see L<perlsec>), which makes sure that data -from outside your script (eg, CGI parameters) are never used in -C<eval> or C<system> calls. In addition to tainting, never use the -single-argument form of system() or exec(). Instead, supply the -command and arguments as a list, which prevents shell globbing. - -=head2 How do I parse a mail header? - -For a quick-and-dirty solution, try this solution derived -from L<perlfunc/split>: - - $/ = ''; - $header = <MSG>; - $header =~ s/\n\s+/ /g; # merge continuation lines - %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header ); - -That solution doesn't do well if, for example, you're trying to -maintain all the Received lines. A more complete approach is to use -the Mail::Header module from CPAN (part of the MailTools package). - -=head2 How do I decode a CGI form? - -You use a standard module, probably CGI.pm. Under no circumstances -should you attempt to do so by hand! - -You'll see a lot of CGI programs that blindly read from STDIN the number -of bytes equal to CONTENT_LENGTH for POSTs, or grab QUERY_STRING for -decoding GETs. These programs are very poorly written. They only work -sometimes. They typically forget to check the return value of the read() -system call, which is a cardinal sin. They don't handle HEAD requests. -They don't handle multipart forms used for file uploads. They don't deal -with GET/POST combinations where query fields are in more than one place. -They don't deal with keywords in the query string. - -In short, they're bad hacks. Resist them at all costs. Please do not be -tempted to reinvent the wheel. Instead, use the CGI.pm or CGI_Lite.pm -(available from CPAN), or if you're trapped in the module-free land -of perl1 .. perl4, you might look into cgi-lib.pl (available from -http://cgi-lib.stanford.edu/cgi-lib/ ). - -Make sure you know whether to use a GET or a POST in your form. -GETs should only be used for something that doesn't update the server. -Otherwise you can get mangled databases and repeated feedback mail -messages. The fancy word for this is ``idempotency''. This simply -means that there should be no difference between making a GET request -for a particular URL once or multiple times. This is because the -HTTP protocol definition says that a GET request may be cached by the -browser, or server, or an intervening proxy. POST requests cannot be -cached, because each request is independent and matters. Typically, -POST requests change or depend on state on the server (query or update -a database, send mail, or purchase a computer). - -=head2 How do I check a valid mail address? - -You can't, at least, not in real time. Bummer, eh? - -Without sending mail to the address and seeing whether there's a human -on the other hand to answer you, you cannot determine whether a mail -address is valid. Even if you apply the mail header standard, you -can have problems, because there are deliverable addresses that aren't -RFC-822 (the mail header standard) compliant, and addresses that aren't -deliverable which are compliant. - -Many are tempted to try to eliminate many frequently-invalid -mail addresses with a simple regex, such as -C</^[\w.-]+\@(?:[\w-]+\.)+\w+$/>. It's a very bad idea. However, -this also throws out many valid ones, and says nothing about -potential deliverability, so it is not suggested. Instead, see -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz, -which actually checks against the full RFC spec (except for nested -comments), looks for addresses you may not wish to accept mail to -(say, Bill Clinton or your postmaster), and then makes sure that the -hostname given can be looked up in the DNS MX records. It's not fast, -but it works for what it tries to do. - -Our best advice for verifying a person's mail address is to have them -enter their address twice, just as you normally do to change a password. -This usually weeds out typos. If both versions match, send -mail to that address with a personal message that looks somewhat like: - - Dear someuser@host.com, - - Please confirm the mail address you gave us Wed May 6 09:38:41 - MDT 1998 by replying to this message. Include the string - "Rumpelstiltskin" in that reply, but spelled in reverse; that is, - start with "Nik...". Once this is done, your confirmed address will - be entered into our records. - -If you get the message back and they've followed your directions, -you can be reasonably assured that it's real. - -A related strategy that's less open to forgery is to give them a PIN -(personal ID number). Record the address and PIN (best that it be a -random one) for later processing. In the mail you send, ask them to -include the PIN in their reply. But if it bounces, or the message is -included via a ``vacation'' script, it'll be there anyway. So it's -best to ask them to mail back a slight alteration of the PIN, such as -with the characters reversed, one added or subtracted to each digit, etc. - -=head2 How do I decode a MIME/BASE64 string? - -The MIME-Base64 package (available from CPAN) handles this as well as -the MIME/QP encoding. Decoding BASE64 becomes as simple as: - - use MIME::Base64; - $decoded = decode_base64($encoded); - -The MIME-Tools package (available from CPAN) supports extraction with -decoding of BASE64 encoded attachments and content directly from email -messages. - -If the string to decode is short (less than 84 bytes long) -a more direct approach is to use the unpack() function's "u" -format after minor transliterations: - - tr#A-Za-z0-9+/##cd; # remove non-base64 chars - tr#A-Za-z0-9+/# -_#; # convert to uuencoded format - $len = pack("c", 32 + 0.75*length); # compute length byte - print unpack("u", $len . $_); # uudecode and print - -=head2 How do I return the user's mail address? - -On systems that support getpwuid, the $< variable, and the -Sys::Hostname module (which is part of the standard perl distribution), -you can probably try using something like this: - - use Sys::Hostname; - $address = sprintf('%s@%s', scalar getpwuid($<), hostname); - -Company policies on mail address can mean that this generates addresses -that the company's mail system will not accept, so you should ask for -users' mail addresses when this matters. Furthermore, not all systems -on which Perl runs are so forthcoming with this information as is Unix. - -The Mail::Util module from CPAN (part of the MailTools package) provides a -mailaddress() function that tries to guess the mail address of the user. -It makes a more intelligent guess than the code above, using information -given when the module was installed, but it could still be incorrect. -Again, the best way is often just to ask the user. - -=head2 How do I send mail? - -Use the C<sendmail> program directly: - - open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq") - or die "Can't fork for sendmail: $!\n"; - print SENDMAIL <<"EOF"; - From: User Originating Mail <me\@host> - To: Final Destination <you\@otherhost> - Subject: A relevant subject line - - Body of the message goes here after the blank line - in as many lines as you like. - EOF - close(SENDMAIL) or warn "sendmail didn't close nicely"; - -The B<-oi> option prevents sendmail from interpreting a line consisting -of a single dot as "end of message". The B<-t> option says to use the -headers to decide who to send the message to, and B<-odq> says to put -the message into the queue. This last option means your message won't -be immediately delivered, so leave it out if you want immediate -delivery. - -Alternate, less convenient approaches include calling mail (sometimes -called mailx) directly or simply opening up port 25 have having an -intimate conversation between just you and the remote SMTP daemon, -probably sendmail. - -Or you might be able use the CPAN module Mail::Mailer: - - use Mail::Mailer; - - $mailer = Mail::Mailer->new(); - $mailer->open({ From => $from_address, - To => $to_address, - Subject => $subject, - }) - or die "Can't open: $!\n"; - print $mailer $body; - $mailer->close(); - -The Mail::Internet module uses Net::SMTP which is less Unix-centric than -Mail::Mailer, but less reliable. Avoid raw SMTP commands. There -are many reasons to use a mail transport agent like sendmail. These -include queueing, MX records, and security. - -=head2 How do I use MIME to make an attachment to a mail message? - -This answer is extracted directly from the MIME::Lite documentation. -Create a multipart message (i.e., one with attachments). - - use MIME::Lite; - - ### Create a new multipart message: - $msg = MIME::Lite->new( - From =>'me@myhost.com', - To =>'you@yourhost.com', - Cc =>'some@other.com, some@more.com', - Subject =>'A message with 2 parts...', - Type =>'multipart/mixed' - ); - - ### Add parts (each "attach" has same arguments as "new"): - $msg->attach(Type =>'TEXT', - Data =>"Here's the GIF file you wanted" - ); - $msg->attach(Type =>'image/gif', - Path =>'aaa000123.gif', - Filename =>'logo.gif' - ); - - $text = $msg->as_string; - -MIME::Lite also includes a method for sending these things. - - $msg->send; - -This defaults to using L<sendmail(1)> but can be customized to use -SMTP via L<Net::SMTP>. - -=head2 How do I read mail? - -While you could use the Mail::Folder module from CPAN (part of the -MailFolder package) or the Mail::Internet module from CPAN (also part -of the MailTools package), often a module is overkill. Here's a -mail sorter. - - #!/usr/bin/perl - # bysub1 - simple sort by subject - my(@msgs, @sub); - my $msgno = -1; - $/ = ''; # paragraph reads - while (<>) { - if (/^From/m) { - /^Subject:\s*(?:Re:\s*)*(.*)/mi; - $sub[++$msgno] = lc($1) || ''; - } - $msgs[$msgno] .= $_; - } - for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) { - print $msgs[$i]; - } - -Or more succinctly, - - #!/usr/bin/perl -n00 - # bysub2 - awkish sort-by-subject - BEGIN { $msgno = -1 } - $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m; - $msg[$msgno] .= $_; - END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] } - -=head2 How do I find out my hostname/domainname/IP address? - -The normal way to find your own hostname is to call the C<`hostname`> -program. While sometimes expedient, this has some problems, such as -not knowing whether you've got the canonical name or not. It's one of -those tradeoffs of convenience versus portability. - -The Sys::Hostname module (part of the standard perl distribution) will -give you the hostname after which you can find out the IP address -(assuming you have working DNS) with a gethostbyname() call. - - use Socket; - use Sys::Hostname; - my $host = hostname(); - my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost')); - -Probably the simplest way to learn your DNS domain name is to grok -it out of /etc/resolv.conf, at least under Unix. Of course, this -assumes several things about your resolv.conf configuration, including -that it exists. - -(We still need a good DNS domain name-learning method for non-Unix -systems.) - -=head2 How do I fetch a news article or the active newsgroups? - -Use the Net::NNTP or News::NNTPClient modules, both available from CPAN. -This can make tasks like fetching the newsgroup list as simple as - - perl -MNews::NNTPClient - -e 'print News::NNTPClient->new->list("newsgroups")' - -=head2 How do I fetch/put an FTP file? - -LWP::Simple (available from CPAN) can fetch but not put. Net::FTP (also -available from CPAN) is more complex but can put as well as fetch. - -=head2 How can I do RPC in Perl? - -A DCE::RPC module is being developed (but is not yet available) and -will be released as part of the DCE-Perl package (available from -CPAN). The rpcgen suite, available from CPAN/authors/id/JAKE/, is -an RPC stub generator and includes an RPC::ONC module. - -=head1 AUTHOR AND COPYRIGHT - -Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. -All rights reserved. - -When included as part of the Standard Version of Perl, or as part of -its complete documentation whether printed or otherwise, this work -may be distributed only under the terms of Perl's Artistic License. -Any distribution of this file or derivatives thereof I<outside> -of that package require that special arrangements be made with -copyright holder. - -Irrespective of its distribution, all code examples in this file -are hereby placed into the public domain. You are permitted and -encouraged to use this code in your own programs for fun -or for profit as you see fit. A simple comment in the code giving -credit would be courteous but is not required. |