diff options
author | dg <dg@FreeBSD.org> | 1995-01-25 08:46:06 +0000 |
---|---|---|
committer | dg <dg@FreeBSD.org> | 1995-01-25 08:46:06 +0000 |
commit | b665918ea6fca14c00c5d1f51ccc41b444bb6f5e (patch) | |
tree | 2d7adc05778d6bb2a1f24e86b4cc48a9c010929a /share | |
parent | 3d46fa5ebf1b6cea582cd88264abe1242c94b559 (diff) | |
download | FreeBSD-src-b665918ea6fca14c00c5d1f51ccc41b444bb6f5e.zip FreeBSD-src-b665918ea6fca14c00c5d1f51ccc41b444bb6f5e.tar.gz |
Added bpf(4) manual page from 1.1.5.
Diffstat (limited to 'share')
-rw-r--r-- | share/man/man4/bpf.4 | 685 |
1 files changed, 685 insertions, 0 deletions
diff --git a/share/man/man4/bpf.4 b/share/man/man4/bpf.4 new file mode 100644 index 0000000..ec94bde --- /dev/null +++ b/share/man/man4/bpf.4 @@ -0,0 +1,685 @@ +.\" Copyright (c) 1990 The Regents of the University of California. +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that: (1) source code distributions +.\" retain the above copyright notice and this paragraph in its entirety, (2) +.\" distributions including binary code include the above copyright notice and +.\" this paragraph in its entirety in the documentation or other materials +.\" provided with the distribution, and (3) all advertising materials mentioning +.\" features or use of this software display the following acknowledgement: +.\" ``This product includes software developed by the University of California, +.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of +.\" the University nor the names of its contributors may be used to endorse +.\" or promote products derived from this software without specific prior +.\" written permission. +.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED +.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. +.\" +.\" This document is derived in part from the enet man page (enet.4) +.\" distributed with 4.3BSD Unix. +.\" +.TH BPF 4 "23 May 1991" +.SH NAME +bpf \- Berkeley Packet Filter +.SH SYNOPSIS +.B "pseudo-device bpfilter 16" +.SH DESCRIPTION +The Berkeley Packet Filter +provides a raw interface to data link layers in a protocol +independent fashion. +All packets on the network, even those destined for other hosts, +are accessible through this mechanism. +.PP +The packet filter appears as a character special device, +.I /dev/bpf0, /dev/bpf1, +etc. +After opening the device, the file descriptor must be bound to a +specific network interface with the BIOSETIF ioctl. +A given interface can be shared be multiple listeners, and the filter +underlying each descriptor will see an identical packet stream. +The total number of open +files is limited to the value given in the kernel configuration; the +example given in the SYNOPSIS above sets the limit to 16. +.PP +A separate device file is required for each minor device. +If a file is in use, the open will fail and +.I errno +will be set to EBUSY. +.PP +Associated with each open instance of a +.I bpf +file is a user-settable packet filter. +Whenever a packet is received by an interface, +all file descriptors listening on that interface apply their filter. +Each descriptor that accepts the packet receives its own copy. +.PP +Reads from these files return the next group of packets +that have matched the filter. +To improve performance, the buffer passed to read must be +the same size as the buffers used internally by +.I bpf. +This size is returned by the BIOCGBLEN ioctl (see below), and under +BSD, can be set with BIOCSBLEN. +Note that an individual packet larger than this size is necessarily +truncated. +.PP +The packet filter will support any link level protocol that has fixed length +headers. Currently, only Ethernet, SLIP and PPP drivers have been +modified to interact with +.I bpf. +.PP +Since packet data is in network byte order, applications should use the +.I byteorder(3n) +macros to extract multi-byte values. +.PP +A packet can be sent out on the network by writing to a +.I bpf +file descriptor. The writes are unbuffered, meaning only one +packet can be processed per write. +Currently, only writes to Ethernets and SLIP links are supported. +.SH IOCTLS +The +.I ioctl +command codes below are defined in <net/bpf.h>. All commands require +these includes: +.ft B +.nf + + #include <sys/types.h> + #include <sys/time.h> + #include <sys/ioctl.h> + #include <net/bpf.h> + +.fi +.ft R +Additionally, BIOCGETIF and BIOCSETIF require \fB<net/if.h>\fR. + +In addition to FIONREAD and SIOCGIFADDR, the following commands +may be applied to any open +.I bpf +file. +The (third) argument to the +.I ioctl +should be a pointer to the type indicated. +.TP 10 +.B BIOCGBLEN (u_int) +Returns the required buffer length for reads on +.I bpf +files. +.TP 10 +.B BIOCSBLEN (u_int) +Sets the buffer length for reads on +.I bpf +files. The buffer must be set before the file is attached to an interface +with BIOCSETIF. +If the requested buffer size cannot be accomodated, the closest +allowable size will be set and returned in the argument. +A read call will result in EIO if it is passed a buffer that is not this size. +.TP 10 +.B BIOCGDLT (u_int) +Returns the type of the data link layer underyling the attached interface. +EINVAL is returned if no interface has been specified. +The device types, prefixed with ``DLT_'', are defined in <net/bpf.h>. +.TP 10 +.B BIOCPROMISC +Forces the interface into promiscuous mode. +All packets, not just those destined for the local host, are processed. +Since more than one file can be listening on a given interface, +a listener that opened its interface non-promiscuously may receive +packets promiscuously. This problem can be remedied with an +appropriate filter. +.IP +The interface remains in promiscuous mode until all files listening +promiscuously are closed. +.TP 10 +.B BIOCFLUSH +Flushes the buffer of incoming packets, +and resets the statistics that are returned by BIOCGSTATS. +.TP 10 +.B BIOCGETIF (struct ifreq) +Returns the name of the hardware interface that the file is listening on. +The name is returned in the if_name field of +.I ifr. +All other fields are undefined. +.TP 10 +.B BIOCSETIF (struct ifreq) +Sets the hardware interface associate with the file. This +command must be performed before any packets can be read. +The device is indicated by name using the +.I if_name +field of the +.I ifreq. +Additionally, performs the actions of BIOCFLUSH. +.TP 10 +.B BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval) +Set or get the read timeout parameter. +The +.I timeval +specifies the length of time to wait before timing +out on a read request. +This parameter is initialized to zero by +.IR open(2), +indicating no timeout. +.TP 10 +.B BIOCGSTATS (struct bpf_stat) +Returns the following structure of packet statistics: +.ft B +.nf + +struct bpf_stat { + u_int bs_recv; + u_int bs_drop; +}; +.fi +.ft R +.IP +The fields are: +.RS +.TP 15 +.I bs_recv +the number of packets received by the descriptor since opened or reset +(including any buffered since the last read call); +and +.TP +.I bs_drop +the number of packets which were accepted by the filter but dropped by the +kernel because of buffer overflows +(i.e., the application's reads aren't keeping up with the packet traffic). +.RE +.TP 10 +.B BIOCIMMEDIATE (u_int) +Enable or disable ``immediate mode'', based on the truth value of the argument. +When immediate mode is enabled, reads return immediately upon packet +reception. Otherwise, a read will block until either the kernel buffer +becomes full or a timeout occurs. +This is useful for programs like +.I rarpd(8c), +which must respond to messages in real time. +The default for a new file is off. +.TP 10 +.B BIOCSETF (struct bpf_program) +Sets the filter program used by the kernel to discard uninteresting +packets. An array of instructions and its length is passed in using +the following structure: +.ft B +.nf + +struct bpf_program { + int bf_len; + struct bpf_insn *bf_insns; +}; +.fi +.ft R +.IP +The filter program is pointed to by the +.I bf_insns +field while its length in units of `struct bpf_insn' is given by the +.I bf_len +field. +Also, the actions of BIOCFLUSH are performed. +.IP +See section \fBFILTER MACHINE\fP for an explanation of the filter language. +.TP 10 +.B BIOCVERSION (struct bpf_version) +Returns the major and minor version numbers of the filter languange currently +recognized by the kernel. Before installing a filter, applications must check +that the current version is compatible with the running kernel. Version +numbers are compatible if the major numbers match and the application minor +is less than or equal to the kernel minor. The kernel version number is +returned in the following structure: +.ft B +.nf + +struct bpf_version { + u_short bv_major; + u_short bv_minor; +}; +.fi +.ft R +.IP +The current version numbers are given by +.B BPF_MAJOR_VERSION +and +.B BPF_MINOR_VERSION +from <net/bpf.h>. +An incompatible filter +may result in undefined behavior (most likely, an error returned by +.I ioctl() +or haphazard packet matching). +.SH BPF HEADER +The following structure is prepended to each packet returned by +.I read(2): +.in 15 +.ft B +.nf + +struct bpf_hdr { + struct timeval bh_tstamp; + u_long bh_caplen; + u_long bh_datalen; + u_short bh_hdrlen; +}; +.fi +.ft R +.in -15 +.PP +The fields, whose values are stored in host order, and are: +.TP 15 +.I bh_tstamp +The time at which the packet was processed by the packet filter. +.TP +.I bh_caplen +The length of the captured portion of the packet. This is the minimum of +the truncation amount specified by the filter and the length of the packet. +.TP +.I bh_datalen +The length of the packet off the wire. +This value is independent of the truncation amount specified by the filter. +.TP +.I bh_hdrlen +The length of the BPF header, which may not be equal to +.I sizeof(struct bpf_hdr). +.RE +.PP +The +.I bh_hdrlen +field exists to account for +padding between the header and the link level protocol. +The purpose here is to guarantee proper alignment of the packet +data structures, which is required on alignment sensitive +architectures and and improves performance on many other architectures. +The packet filter insures that the +.I bpf_hdr +and the \fInetwork layer\fR header will be word aligned. Suitable precautions +must be taken when accessing the link layer protocol fields on alignment +restricted machines. (This isn't a problem on an Ethernet, since +the type field is a short falling on an even offset, +and the addresses are probably accessed in a bytewise fashion). +.PP +Additionally, individual packets are padded so that each starts +on a word boundary. This requires that an application +has some knowledge of how to get from packet to packet. +The macro BPF_WORDALIGN is defined in <net/bpf.h> to facilitate +this process. It rounds up its argument +to the nearest word aligned value (where a word is BPF_ALIGNMENT bytes wide). +.PP +For example, if `p' points to the start of a packet, this expression +will advance it to the next packet: +.sp +.RS +.ce 1 +.nf +p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen) +.fi +.RE +.PP +For the alignment mechanisms to work properly, the +buffer passed to +.I read(2) +must itself be word aligned. +.I malloc(3) +will always return an aligned buffer. +.ft R +.SH FILTER MACHINE +A filter program is an array of instructions, with all branches forwardly +directed, terminated by a \fBreturn\fP instruction. +Each instruction performs some action on the pseudo-machine state, +which consists of an accumulator, index register, scratch memory store, +and implicit program counter. + +The following structure defines the instruction format: +.RS +.ft B +.nf + +struct bpf_insn { + u_short code; + u_char jt; + u_char jf; + long k; +}; +.fi +.ft R +.RE + +The \fIk\fP field is used in differnet ways by different insutructions, +and the \fIjt\fP and \fIjf\fP fields are used as offsets +by the branch intructions. +The opcodes are encoded in a semi-hierarchical fashion. +There are eight classes of intructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX, +BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC. Various other mode and +operator bits are or'd into the class to give the actual instructions. +The classes and modes are defined in <net/bpf.h>. + +Below are the semantics for each defined BPF instruction. +We use the convention that A is the accumulator, X is the index register, +P[] packet data, and M[] scratch memory store. +P[i:n] gives the data at byte offset ``i'' in the packet, +interpreted as a word (n=4), +unsigned halfword (n=2), or unsigned byte (n=1). +M[i] gives the i'th word in the scratch memory store, which is only +addressed in word units. The memory store is indexed from 0 to BPF_MEMWORDS-1. +\fIk\fP, \fIjt\fP, and \fIjf\fP are the corresponding fields in the +instruction definition. ``len'' refers to the length of the packet. + +.TP 10 +.B BPF_LD +These instructions copy a value into the accumulator. The type of the +source operand is specified by an ``addressing mode'' and can be +a constant (\fBBPF_IMM\fP), packet data at a fixed offset (\fBBPF_ABS\fP), +packet data at a variable offset (\fBBPF_IND\fP), the packet length +(\fBBPF_LEN\fP), +or a word in the scratch memory store (\fBBPF_MEM\fP). +For \fBBPF_IND\fP and \fBBPF_ABS\fP, the data size must be specified as a word +(\fBBPF_W\fP), halfword (\fBBPF_H\fP), or byte (\fBBPF_B\fP). +The semantics of all the recognized BPF_LD instructions follow. + +.RS +.TP 30 +.B BPF_LD+BPF_W+BPF_ABS +A <- P[k:4] +.TP +.B BPF_LD+BPF_H+BPF_ABS +A <- P[k:2] +.TP +.B BPF_LD+BPF_B+BPF_ABS +A <- P[k:1] +.TP +.B BPF_LD+BPF_W+BPF_IND +A <- P[X+k:4] +.TP +.B BPF_LD+BPF_H+BPF_IND +A <- P[X+k:2] +.TP +.B BPF_LD+BPF_B+BPF_IND +A <- P[X+k:1] +.TP +.B BPF_LD+BPF_W+BPF_LEN +A <- len +.TP +.B BPF_LD+BPF_IMM +A <- k +.TP +.B BPF_LD+BPF_MEM +A <- M[k] +.RE + +.TP 10 +.B BPF_LDX +These instructions load a value into the index register. Note that +the addressing modes are more retricted than those of the accumulator loads, +but they include +.B BPF_MSH, +a hack for efficiently loading the IP header length. +.RS +.TP 30 +.B BPF_LDX+BPF_W+BPF_IMM +X <- k +.TP +.B BPF_LDX+BPF_W+BPF_MEM +X <- M[k] +.TP +.B BPF_LDX+BPF_W+BPF_LEN +X <- len +.TP +.B BPF_LDX+BPF_B+BPF_MSH +X <- 4*(P[k:1]&0xf) +.RE + +.TP 10 +.B BPF_ST +This instruction stores the accumulator into the scratch memory. +We do not need an addressing mode since there is only one possibility +for the destination. +.RS +.TP 30 +.B BPF_ST +M[k] <- A +.RE + +.TP 10 +.B BPF_STX +This instruction stores the index register in the scratch memory store. +.RS +.TP 30 +.B BPF_STX +M[k] <- X +.RE + +.TP 10 +.B BPF_ALU +The alu instructions perform operations between the accumulator and +index register or constant, and store the result back in the accumulator. +For binary operations, a source mode is required (\fBBPF_K\fP or +\fBBPF_X\fP). +.RS +.TP 30 +.B BPF_ALU+BPF_ADD+BPF_K +A <- A + k +.TP +.B BPF_ALU+BPF_SUB+BPF_K +A <- A - k +.TP +.B BPF_ALU+BPF_MUL+BPF_K +A <- A * k +.TP +.B BPF_ALU+BPF_DIV+BPF_K +A <- A / k +.TP +.B BPF_ALU+BPF_AND+BPF_K +A <- A & k +.TP +.B BPF_ALU+BPF_OR+BPF_K +A <- A | k +.TP +.B BPF_ALU+BPF_LSH+BPF_K +A <- A << k +.TP +.B BPF_ALU+BPF_RSH+BPF_K +A <- A >> k +.TP +.B BPF_ALU+BPF_ADD+BPF_X +A <- A + X +.TP +.B BPF_ALU+BPF_SUB+BPF_X +A <- A - X +.TP +.B BPF_ALU+BPF_MUL+BPF_X +A <- A * X +.TP +.B BPF_ALU+BPF_DIV+BPF_X +A <- A / X +.TP +.B BPF_ALU+BPF_AND+BPF_X +A <- A & X +.TP +.B BPF_ALU+BPF_OR+BPF_X +A <- A | X +.TP +.B BPF_ALU+BPF_LSH+BPF_X +A <- A << X +.TP +.B BPF_ALU+BPF_RSH+BPF_X +A <- A >> X +.TP +.B BPF_ALU+BPF_NEG +A <- -A +.RE + +.TP 10 +.B BPF_JMP +The jump instructions alter flow of control. Conditional jumps +compare the accumulator against a constant (\fBBPF_K\fP) or +the index register (\fBBPF_X\fP). If the result is true (or non-zero), +the true branch is taken, otherwise the false branch is taken. +Jump offsets are encoded in 8 bits so the longest jump is 256 instructions. +However, the jump always (\fBBPF_JA\fP) opcode uses the 32 bit \fIk\fP +field as the offset, allowing arbitrarily distant destinations. +All conditionals use unsigned comparison conventions. +.RS +.TP 30 +.B BPF_JMP+BPF_JA +pc += k +.TP +.B BPF_JMP+BPF_JGT+BPF_K +pc += (A > k) ? jt : jf +.TP +.B BPF_JMP+BPF_JGE+BPF_K +pc += (A >= k) ? jt : jf +.TP +.B BPF_JMP+BPF_JEQ+BPF_K +pc += (A == k) ? jt : jf +.TP +.B BPF_JMP+BPF_JSET+BPF_K +pc += (A & k) ? jt : jf +.TP +.B BPF_JMP+BPF_JGT+BPF_X +pc += (A > X) ? jt : jf +.TP +.B BPF_JMP+BPF_JGE+BPF_X +pc += (A >= X) ? jt : jf +.TP +.B BPF_JMP+BPF_JEQ+BPF_X +pc += (A == X) ? jt : jf +.TP +.B BPF_JMP+BPF_JSET+BPF_X +pc += (A & X) ? jt : jf +.RE +.TP 10 +.B BPF_RET +The return instructions terminate the filter program and specify the amount +of packet to accept (i.e., they return the truncation amount). A return +value of zero indicates that the packet should be ignored. +The return value is either a constant (\fBBPF_K\fP) or the accumulator +(\fBBPF_A\fP). +.RS +.TP 30 +.B BPF_RET+BPF_A +accept A bytes +.TP +.B BPF_RET+BPF_K +accept k bytes +.RE +.TP 10 +.B BPF_MISC +The miscellaneous category was created for anything that doesn't +fit into the above classes, and for any new instructions that might need to +be added. Currently, these are the register transfer intructions +that copy the index register to the accumulator or vice versa. +.RS +.TP 30 +.B BPF_MISC+BPF_TAX +X <- A +.TP +.B BPF_MISC+BPF_TXA +A <- X +.RE +.PP +The BPF interface provides the following macros to facilitate +array initializers: +.RS +\fBBPF_STMT\fI(opcode, operand)\fR +.br +and +.br +\fBBPF_JUMP\fI(opcode, operand, true_offset, false_offset)\fR +.RE +.PP +.SH EXAMPLES +The following filter is taken from the Reverse ARP Daemon. It accepts +only Reverse ARP requests. +.RS +.nf + +struct bpf_insn insns[] = { + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3), + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1), + BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) + + sizeof(struct ether_header)), + BPF_STMT(BPF_RET+BPF_K, 0), +}; +.fi +.RE +.PP +This filter accepts only IP packets between host 128.3.112.15 and +128.3.112.35. +.RS +.nf + +struct bpf_insn insns[] = { + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8), + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 26), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2), + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 30), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3), + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 30), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1), + BPF_STMT(BPF_RET+BPF_K, (u_int)-1), + BPF_STMT(BPF_RET+BPF_K, 0), +}; +.fi +.RE +.PP +Finally, this filter returns only TCP finger packets. We must parse +the IP header to reach the TCP header. The \fBBPF_JSET\fP instruction +checks that the IP fragment offset is 0 so we are sure +that we have a TCP header. +.RS +.nf + +struct bpf_insn insns[] = { + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10), + BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8), + BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20), + BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0), + BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14), + BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0), + BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16), + BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1), + BPF_STMT(BPF_RET+BPF_K, (u_int)-1), + BPF_STMT(BPF_RET+BPF_K, 0), +}; +.fi +.RE +.SH SEE ALSO +tcpdump(8) +.LP +McCanne, S., Jacobson V., +.RI ` "An efficient, extensible, and portable network monitor" ' +.SH FILES +/dev/bpf0, /dev/bpf1, ... +.SH BUGS +The read buffer must be of a fixed size (returned by the BIOCGBLEN ioctl). +.PP +A file that does not request promiscuous mode may receive promiscuously +received packets as a side effect of another file requesting this +mode on the same hardware interface. This could be fixed in the kernel +with additional processing overhead. However, we favor the model where +all files must assume that the interface is promiscuous, and if +so desired, must utilize a filter to reject foreign packets. +.PP +Data link protocols with variable length headers are not currently supported. +.PP +Under SunOS, if a BPF application reads more than 2^31 bytes of +data, read will fail in EINVAL. You can either fix the bug in SunOS, +or lseek to 0 when read fails for this reason. +.SH HISTORY +.PP +The Enet packet filter was created in 1980 by Mike Accetta and +Rick Rashid at Carnegie-Mellon University. Jeffrey Mogul, at +Stanford, ported the code to BSD and continued its development from +1983 on. Since then, it has evolved into the Ultrix Packet Filter +at DEC, a STREAMS NIT module under SunOS 4.1, and BPF. +.SH AUTHORS +.PP +Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in +Summer 1990. Much of the design is due to Van Jacobson. |