path: root/share/doc/psd/04.uprog/p4
author: grog <grog@FreeBSD.org> 2002-05-19 06:11:50 +0000
committer: grog <grog@FreeBSD.org> 2002-05-19 06:11:50 +0000
commit: 2d717c6f9fef0e6fda4e55557f91a024e131d99e
tree: 80e9a5cb42879768b1004d8e85813a58aca2fb9d /share/doc/psd/04.uprog/p4
parent: 559398948817b63678ed666fe48329ec9dd78db7
Initial checkin: 4.4BSD version. These files need to be updated with
current license information and adapted to the FreeBSD build environment
before they will build.

Approved by: David Taylor <davidt@caldera.com>
Diffstat (limited to 'share/doc/psd/04.uprog/p4')
-rw-r--r-- share/doc/psd/04.uprog/p4 | 567
1 files changed, 567 insertions, 0 deletions
diff --git a/share/doc/psd/04.uprog/p4 b/share/doc/psd/04.uprog/p4
new file mode 100644
index 0000000..baddb52
--- /dev/null
+++ b/share/doc/psd/04.uprog/p4
@@ -0,0 +1,567 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)p4 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.NH
+LOW-LEVEL I/O
+.PP
+This section describes the
+bottom level of I/O on the
+.UC UNIX
+system.
+The lowest level of I/O in
+.UC UNIX
+provides no buffering or any other services;
+it is in fact a direct entry into the operating system.
+You are entirely on your own,
+but on the other hand,
+you have the most control over what happens.
+And since the calls and usage are quite simple,
+this isn't as bad as it sounds.
+.NH 2
+File Descriptors
+.PP
+In the
+.UC UNIX
+operating system,
+all input and output is done
+by reading or writing files,
+because all peripheral devices, even the user's terminal,
+are files in the file system.
+This means that a single, homogeneous interface
+handles all communication between a program and peripheral devices.
+.PP
+In the most general case,
+before reading or writing a file,
+it is necessary to inform the system
+of your intent to do so,
+a process called
+``opening'' the file.
+If you are going to write on a file,
+it may also be necessary to create it.
+The system checks your right to do so
+(Does the file exist?
+Do you have permission to access it?),
+and if all is well,
+returns a small non-negative integer
+called a
+.ul
+file descriptor.
+Whenever I/O is to be done on the file,
+the file descriptor is used instead of the name to identify the file.
+(This is roughly analogous to the use of
+.UC READ(5,...)
+and
+.UC WRITE(6,...)
+in Fortran.)
+All
+information about an open file is maintained by the system;
+the user program refers to the file
+only
+by the file descriptor.
+.PP
+The file pointers discussed in section 3
+are similar in spirit to file descriptors,
+but file descriptors are more fundamental.
+A file pointer is a pointer to a structure that contains,
+among other things, the file descriptor for the file in question.
+.PP
+Since input and output involving the user's terminal
+are so common,
+special arrangements exist to make this convenient.
+When the command interpreter (the
+``shell'')
+runs a program,
+it opens
+three files, with file descriptors 0, 1, and 2,
+called the standard input,
+the standard output, and the standard error output.
+All of these are normally connected to the terminal,
+so if a program reads file descriptor 0
+and writes file descriptors 1 and 2,
+it can do terminal I/O
+without worrying about opening the files.
+.PP
+If I/O is redirected
+to and from files with
+.UL <
+and
+.UL > ,
+as in
+.P1
+prog <infile >outfile
+.P2
+the shell changes the default assignments for file descriptors
+0 and 1
+from the terminal to the named files.
+Similar observations hold if the input or output is associated with a pipe.
+Normally file descriptor 2 remains attached to the terminal,
+so error messages can go there.
+In all cases,
+the file assignments are changed by the shell,
+not by the program.
+The program does not need to know where its input
+comes from nor where its output goes,
+so long as it uses file 0 for input and 1 and 2 for output.
+.NH 2
+Read and Write
+.PP
+All input and output is done by
+two functions called
+.UL read
+and
+.UL write .
+For both, the first argument is a file descriptor.
+The second argument is a buffer in your program where the data is to
+come from or go to.
+The third argument is the number of bytes to be transferred.
+The calls are
+.P1
+n_read = read(fd, buf, n);
+
+n_written = write(fd, buf, n);
+.P2
+Each call returns a byte count
+which is the number of bytes actually transferred.
+On reading,
+the number of bytes returned may be less than
+the number asked for,
+because fewer than
+.UL n
+bytes remained to be read.
+(When the file is a terminal,
+.UL read
+normally reads only up to the next newline,
+which is generally less than what was requested.)
+A return value of zero bytes implies end of file,
+and
+.UL -1
+indicates an error of some sort.
+For writing, the returned value is the number of bytes
+actually written;
+it is generally an error if this isn't equal
+to the number supposed to be written.
+.PP
+The number of bytes to be read or written is quite arbitrary.
+The two most common values are
+1,
+which means one character at a time
+(``unbuffered''),
+and
+512,
+which corresponds to a physical blocksize on many peripheral devices.
+This latter size will be most efficient,
+but even character-at-a-time I/O
+is not inordinately expensive.
+.PP
+Putting these facts together,
+we can write a simple program to copy
+its input to its output.
+This program will copy anything to anything,
+since the input and output can be redirected to any file or device.
+.P1
+#define BUFSIZE 512 /* best size for PDP-11 UNIX */
+
+main() /* copy input to output */
+{
+ char buf[BUFSIZE];
+ int n;
+
+ while ((n = read(0, buf, BUFSIZE)) > 0)
+ write(1, buf, n);
+ exit(0);
+}
+.P2
+If the file size is not a multiple of
+.UL BUFSIZE ,
+some
+.UL read
+will return a smaller number of bytes
+to be written by
+.UL write ;
+the next call to
+.UL read
+after that
+will return zero.
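The copy loop above discards the value returned by write. A minimal sketch of a more defensive loop follows, assuming a POSIX system. The name copy_fd is hypothetical and does not appear in the original text. It retries when write transfers fewer bytes than requested and reports read and write errors to the caller.

```c
#include <unistd.h>

#define BUFSIZE 512   /* same size as in the text; any positive size works */

/* Hypothetical helper: copy everything from descriptor in to descriptor out.
   Returns 0 once end of file is reached, -1 on a read or write error. */
int copy_fd(int in, int out)
{
    char buf[BUFSIZE];
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;

        while (done < n) {      /* write may transfer fewer bytes than asked */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w <= 0)
                return -1;
            done += w;
        }
    }
    return n < 0 ? -1 : 0;      /* n == 0 is end of file, n < 0 is an error */
}
```

Calling copy_fd(0, 1) gives the same behavior as the program in the text when no errors occur.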
+.PP
+It is instructive to see how
+.UL read
+and
+.UL write
+can be used to construct
+higher level routines like
+.UL getchar ,
+.UL putchar ,
+etc.
+For example,
+here is a version of
+.UL getchar
+which does unbuffered input.
+.P1
+#define CMASK 0377 /* for making char's > 0 */
+
+getchar() /* unbuffered single character input */
+{
+ char c;
+
+ return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
+}
+.P2
+.UL c
+.ul
+must
+be declared
+.UL char ,
+because
+.UL read
+accepts a character pointer.
+The character being returned must be masked with
+.UL 0377
+to ensure that it is positive;
+otherwise sign extension may make it negative.
+(The constant
+.UL 0377
+is appropriate for the
+.UC PDP -11
+but not necessarily for other machines.)
+.PP
+The second version of
+.UL getchar
+does input in big chunks,
+and hands out the characters one at a time.
+.P1
+#define CMASK 0377 /* for making char's > 0 */
+#define BUFSIZE 512
+
+getchar() /* buffered version */
+{
+ static char buf[BUFSIZE];
+ static char *bufp = buf;
+ static int n = 0;
+
+ if (n == 0) { /* buffer is empty */
+ n = read(0, buf, BUFSIZE);
+ bufp = buf;
+ }
+ return((--n >= 0) ? *bufp++ & CMASK : EOF);
+}
+.P2
+.NH 2
+Open, Creat, Close, Unlink
+.PP
+Other than the default
+standard input, output and error files,
+you must explicitly open files in order to
+read or write them.
+There are two system entry points for this,
+.UL open
+and
+.UL creat
+[sic].
+.PP
+.UL open
+is rather like the
+.UL fopen
+discussed in the previous section,
+except that instead of returning a file pointer,
+it returns a file descriptor,
+which is just an
+.UL int .
+.P1
+int fd;
+
+fd = open(name, rwmode);
+.P2
+As with
+.UL fopen ,
+the
+.UL name
+argument
+is a character string corresponding to the external file name.
+The access mode argument
+is different, however:
+.UL rwmode
+is 0 for read, 1 for write, and 2 for read and write access.
+.UL open
+returns
+.UL -1
+if any error occurs;
+otherwise it returns a valid file descriptor.
+.PP
+It is an error to
+try to
+.UL open
+a file that does not exist.
+The entry point
+.UL creat
+is provided to create new files,
+or to re-write old ones.
+.P1
+fd = creat(name, pmode);
+.P2
+returns a file descriptor
+if it was able to create the file
+called
+.UL name ,
+and
+.UL -1
+if not.
+If the file
+already exists,
+.UL creat
+will truncate it to zero length;
+it is not an error to
+.UL creat
+a file that already exists.
+.PP
+If the file is brand new,
+.UL creat
+creates it with the
+.ul
+protection mode
+specified by
+the
+.UL pmode
+argument.
+In the
+.UC UNIX
+file system,
+there are nine bits of protection information
+associated with a file,
+controlling read, write and execute permission for
+the owner of the file,
+for the owner's group,
+and for all others.
+Thus a three-digit octal number
+is most convenient for specifying the permissions.
+For example,
+0755
+specifies read, write and execute permission for the owner,
+and read and execute permission for the group and everyone else.
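To make the nine-bit layout concrete, here is a small sketch that renders a mode in the familiar rwx notation. The helper name mode_string is an assumption made for illustration and is not part of the original text.

```c
/* Hypothetical helper: render the low nine mode bits as a string such as
   "rwxr-xr-x". out must have room for 10 bytes (nine characters plus
   the terminating null). */
void mode_string(unsigned mode, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    /* Bit 8 is owner read, bit 0 is execute for others. */
    for (i = 0; i < 9; i++)
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    out[9] = '\0';
}
```

With this sketch, 0755 becomes rwxr-xr-x and 0644 becomes rw-r--r--: each octal digit covers one of the three classes (owner, group, others).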
+.PP
+To illustrate,
+here is a simplified version of
+the
+.UC UNIX
+utility
+.IT cp ,
+a program which copies one file to another.
+(The main simplification is that our version
+copies only one file,
+and does not permit the second argument
+to be a directory.)
+.P1
+#define NULL 0
+#define BUFSIZE 512
+#define PMODE 0644 /* RW for owner, R for group, others */
+
+main(argc, argv) /* cp: copy f1 to f2 */
+int argc;
+char *argv[];
+{
+ int f1, f2, n;
+ char buf[BUFSIZE];
+
+ if (argc != 3)
+ error("Usage: cp from to", NULL);
+ if ((f1 = open(argv[1], 0)) == -1)
+ error("cp: can't open %s", argv[1]);
+ if ((f2 = creat(argv[2], PMODE)) == -1)
+ error("cp: can't create %s", argv[2]);
+
+ while ((n = read(f1, buf, BUFSIZE)) > 0)
+ if (write(f2, buf, n) != n)
+ error("cp: write error", NULL);
+ exit(0);
+}
+.P2
+.P1
+error(s1, s2) /* print error message and die */
+char *s1, *s2;
+{
+ printf(s1, s2);
+ printf("\en");
+ exit(1);
+}
+.P2
+.PP
+As we said earlier,
+there is a limit (typically 15-25)
+on the number of files which a program
+may have open simultaneously.
+Accordingly, any program which intends to process
+many files must be prepared to re-use
+file descriptors.
+The routine
+.UL close
+breaks the connection between a file descriptor
+and an open file,
+and frees the
+file descriptor for use with some other file.
+Termination of a program
+via
+.UL exit
+or return from the main program closes all open files.
+.PP
+The function
+.UL unlink(filename)
+removes the file
+.UL filename
+from the file system.
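The calls in this section can be combined as in the following sketch, which assumes a POSIX system. The helper name write_file and the mode 0644 are illustrative choices, not part of the original text.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create name (truncating it if it already exists),
   write len bytes from data, and close the descriptor.
   Returns 0 on success, -1 on any failure. */
int write_file(const char *name, const char *data, size_t len)
{
    int fd = creat(name, 0644);

    if (fd == -1)
        return -1;
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);              /* release the descriptor even on failure */
        return -1;
    }
    return close(fd);           /* close returns 0 on success, -1 on error */
}
```

A caller that no longer needs the file can then remove it with unlink(name), which returns 0 on success and -1 on failure.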
+.NH 2
+Random Access \(em Seek and Lseek
+.PP
+File I/O is normally sequential:
+each
+.UL read
+or
+.UL write
+takes place at a position in the file
+right after the previous one.
+When necessary, however,
+a file can be read or written in any arbitrary order.
+The
+system call
+.UL lseek
+provides a way to move around in
+a file without actually reading
+or writing:
+.P1
+lseek(fd, offset, origin);
+.P2
+forces the current position in the file
+whose descriptor is
+.UL fd
+to move to position
+.UL offset ,
+which is taken relative to the location
+specified by
+.UL origin .
+Subsequent reading or writing will begin at that position.
+.UL offset
+is
+a
+.UL long ;
+.UL fd
+and
+.UL origin
+are
+.UL int 's.
+.UL origin
+can be 0, 1, or 2 to specify that
+.UL offset
+is to be
+measured from
+the beginning, from the current position, or from the
+end of the file respectively.
+For example,
+to append to a file,
+seek to the end before writing:
+.P1
+lseek(fd, 0L, 2);
+.P2
+To get back to the beginning (``rewind''),
+.P1
+lseek(fd, 0L, 0);
+.P2
+Notice the
+.UL 0L
+argument;
+it could also be written as
+.UL (long)\ 0 .
+.PP
+With
+.UL lseek ,
+it is possible to treat files more or less like large arrays,
+at the price of slower access.
+For example, the following simple function reads any number of bytes
+from any arbitrary place in a file.
+.P1
+get(fd, pos, buf, n) /* read n bytes from position pos */
+int fd, n;
+long pos;
+char *buf;
+{
+ lseek(fd, pos, 0); /* get to pos */
+ return(read(fd, buf, n));
+}
+.P2
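The get function above does not check whether lseek succeeded. A hedged sketch of a variant that does, assuming a POSIX system where the origin values 0, 1, and 2 are spelled SEEK_SET, SEEK_CUR, and SEEK_END. The name get_at is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variant of get: read up to n bytes starting at byte pos.
   Returns the byte count from read, or -1 if either call fails. */
ssize_t get_at(int fd, off_t pos, void *buf, size_t n)
{
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)  /* SEEK_SET is origin 0 */
        return -1;
    return read(fd, buf, n);
}
```

The check matters because not every descriptor can be repositioned: lseek fails on a pipe, for example, and an unchecked read would then return data from the wrong place.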
+.PP
+In pre-version 7
+.UC UNIX ,
+the basic entry point to the I/O system
+is called
+.UL seek .
+.UL seek
+is identical to
+.UL lseek ,
+except that its
+.UL offset
+argument is an
+.UL int
+rather than a
+.UL long .
+Accordingly,
+since
+.UC PDP -11
+integers have only 16 bits,
+the
+.UL offset
+specified
+for
+.UL seek
+is limited to 65,535;
+for this reason,
+.UL origin
+values of 3, 4, 5 cause
+.UL seek
+to multiply the given offset by 512
+(the number of bytes in one physical block)
+and then interpret
+.UL origin
+as if it were 0, 1, or 2 respectively.
+Thus to get to an arbitrary place in a large file
+requires two seeks, first one which selects
+the block, then one which
+has
+.UL origin
+equal to 1 and moves to the desired byte within the block.
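The two-seek scheme amounts to splitting a byte position into a block number and a remainder. The sketch below shows only that arithmetic, since seek itself is not available on current systems. The helper name split_pos is an assumption.

```c
#define BLOCK 512L      /* bytes per physical block, as stated in the text */

/* Hypothetical helper: split a byte position into the two offsets the
   text describes: a block count for the first seek (origin 3) and a
   byte remainder for the second seek (origin 1). */
void split_pos(long pos, int *block, int *rest)
{
    *block = (int)(pos / BLOCK);
    *rest  = (int)(pos % BLOCK);
}
```

For position 100000 this yields block 195 and remainder 160, since 195 * 512 = 99840. Following the description above, the pre-version 7 sequence would then be a seek with origin 3 and offset 195, followed by a seek with origin 1 and offset 160.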
+.NH 2
+Error Processing
+.PP
+The routines discussed in this section,
+and in fact all the routines which are direct entries into the system,
+can incur errors.
+Usually they indicate an error by returning a value of \-1.
+Sometimes it is nice to know what sort of error occurred;
+for this purpose all these routines, when appropriate,
+leave an error number in the external cell
+.UL errno .
+The meanings of the various error numbers are
+listed
+in the introduction to Section II
+of the
+.I
+.UC UNIX
+Programmer's Manual,
+.R
+so your program can, for example, determine if
+an attempt to open a file failed because it did not exist
+or because the user lacked permission to read it.
+Perhaps more commonly,
+you may want to print out the
+reason for failure.
+The routine
+.UL perror
+will print a message associated with the value
+of
+.UL errno ;
+more generally,
+.UL sys\_errlist
+is an array of character strings which can be indexed
+by
+.UL errno
+and printed by your program.
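As a sketch of the error handling described above, the following function distinguishes a missing file from a permission failure by inspecting errno. It assumes a POSIX system and uses the standard strerror function to obtain the message text. The name open_or_report is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open name for reading. On failure, print the
   reason on the standard error output and return -1. */
int open_or_report(const char *name)
{
    int fd = open(name, O_RDONLY);

    if (fd == -1) {
        if (errno == ENOENT)
            fprintf(stderr, "%s: does not exist\n", name);
        else if (errno == EACCES)
            fprintf(stderr, "%s: permission denied\n", name);
        else
            fprintf(stderr, "%s: %s\n", name, strerror(errno));
    }
    return fd;
}
```

errno is meaningful only immediately after a call has reported failure, so the sketch examines it before making any other library call.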