Diffstat (limited to 'share/doc/iso/wisc/trans_design.nr')
-rw-r--r-- share/doc/iso/wisc/trans_design.nr 1466
1 files changed, 1466 insertions, 0 deletions
diff --git a/share/doc/iso/wisc/trans_design.nr b/share/doc/iso/wisc/trans_design.nr
new file mode 100644
index 0000000..6aeb54a
--- /dev/null
+++ b/share/doc/iso/wisc/trans_design.nr
@@ -0,0 +1,1466 @@
+.NC "The Design of the ARGO Transport Entity"
+.sh 1 "Protocol Hooks"
+.pp
+The design of the AOS kernel IPC support to some
+extent mandates the
+design of protocols.
+Each protocol must provide the following
+protocol hooks, which are procedures called through a
+protocol switch table
+(an array of type \fIprotosw\fR as described in
+Chapter Five).
+.ip "pr_input()" 5
+Called when data are to be passed up from a lower layer.
+.ip "pr_output()" 5
+Called when data are to be passed down from a higher layer.
+.ip "pr_init()" 5
+Called when the system is brought up.
+.ip "pr_fasttimo()" 5
+Called every 200 milliseconds by the clock functional unit.
+.ip "pr_slowtimo()" 5
+Called every 500 milliseconds by the clock functional unit.
+.ip "pr_drain()" 5
+This is meant to be called when buffer space is low.
+Each protocol is expected to provide this routine to free
+non-critical buffer space.
+This is not yet called anywhere.
+.ip "pr_ctlinput()" 5
+Used for exchanging information between
+protocols, such as notifying a transport protocol of changes
+in routing or configuration information.
+.ip "pr_ctloutput()" 5
+Supports the protocol-dependent
+\fIgetsockopt()\fR
+and
+\fIsetsockopt()\fR
+options.
+.ip "pr_usrreq()" 5
+Called by the socket code to pass along a \*(lquser request\*(rq -
+in other words a service primitive.
+This call is also used for other protocol functions.
+The functions served by the \fIpr_usrreq()\fR routine are:
+.ip " PRU_ATTACH" 10
+Creates a protocol control block and attaches it to a given socket.
+Called as a result of a \fIsocket()\fR system call.
+.ip " PRU_DISCONNECT" 10
+Called as a result of a
+\fIclose()\fR system call.
+Initiates disconnection.
+.ip " PRU_DETACH" 10
+Disassociates a protocol control block from a socket and recycles
+the buffer space used for the protocol control block.
+Called after PRU_DISCONNECT.
+.ip " PRU_SHUTDOWN" 10
+Called as a result of a
+\fIshutdown()\fR system call.
+If the protocol supports the notion of half-open connections,
+this closes the connection in one direction or both directions,
+depending on the arguments passed to
+\fIshutdown\fR.
+.ip " PRU_BIND" 10
+Gives an address to a socket.
+Called as a result of a
+\fIbind()\fR system call, and also
+when a
+socket without a bound address is used.
+In the latter case, an unused transport suffix is located and
+bound to the socket.
+.ip " PRU_LISTEN" 10
+Called as a result of a
+\fIlisten()\fR system call.
+Marks the socket as willing to queue incoming connection
+requests.
+.ip " PRU_CONNECT" 10
+Called as a result of a
+\fIconnect()\fR system call.
+Initiates a connection request.
+.ip " PRU_ACCEPT" 10
+Called as a result of an
+\fIaccept()\fR system call.
+Dequeues a pending connection request, or blocks waiting for
+a connection request to arrive.
+In the latter case, it marks the socket as willing to accept
+connections.
+.ip " PRU_RCVD" 10
+The protocol module is expected to have put incoming data
+into the socket's receive buffer, \fIso_rcv\fR.
+When a receive primitive is used
+(\fIrecv(), recvmsg(), recvfrom(),
+read(), readv(), \fRand
+\fIrecvv()\fR system calls)
+the socket code module copies data from the
+\fIso_rcv\fR to the user's
+address space.
+The protocol module may arrange to be informed each time the socket code
+does this, in which case the socket code calls \fIpr_usrreq\fR(PRU_RCVD)
+after the data were copied to the user.
+.ip " PRU_SEND" 10
+This performs the protocol-dependent part of a send primitive
+(\fIsend(), sendmsg(), sendto(), write(), writev(),
+\fRand \fIsendv()\fR system calls).
+The socket code
+(procedures \fIsendit()\fR and \fIsosend()\fR)
+moves outgoing data from the user's
+address space into a chain of \fImbufs\fR.
+The socket code takes as much data from the user as it
+determines will fit into the outgoing socket buffer, so_snd.
+It passes this much data in the form of an mbuf chain to the protocol
+via \fIpr_usrreq\fR(PRU_SEND).
+If there are more data than
+the so_snd can accommodate,
+the socket code, which is running on behalf of a user process,
+puts the user process to sleep.
+The protocol module is expected to wake up the user process when
+more room appears in so_snd.
+.ip " PRU_ABORT" 10
+Called when a socket is closed and that socket
+is accepting connections and has
+queued pending
+connection requests or
+partially open connections.
+.ip " PRU_CONTROL" 10
+Called as a result of an
+\fIioctl()\fR system call.
+.ip " PRU_SENSE" 10
+Called as a result of an
+\fIfstat()\fR system call.
+.ip " PRU_RCVOOB" 10
+Performs the work of receiving \*(lqout-of-band\*(rq data.
+The socket module has already allocated an mbuf into which
+the protocol module is expected to put the incoming
+\*(lqout-of-band\*(rq data.
+The socket code will then move the data from this mbuf
+to the user's address space.
+.ip " PRU_SENDOOB" 10
+Performs the work of sending \*(lqout-of-band\*(rq data.
+The socket module has already moved the data
+from the user's address space into a chain of mbufs,
+which it now passes to the protocol module.
+.ip " PRU_SOCKADDR" 10
+Supports the system call
+\fIgetsockname()\fR.
+Puts the socket's bound address into an mbuf.
+.ip " PRU_PEERADDR" 10
+Supports the system call
+\fIgetpeername()\fR.
+Puts the peer's address into an mbuf.
+.ip " PRU_CONNECT2" 10
+This is used in the Unix domain to support pipes.
+It is not generally supported by transport protocols.
+.ip " PRU_FASTTIMO, PRU_SLOWTIMO" 10
+These are superfluous.
+None of the transport protocols uses them.
+.ip " PRU_PROTORCV, PRU_PROTOSEND" 10
+None of the transport protocols uses these.
+.ip " PRU_SENDEOT" 10
+This was added to support TP.
+This indicates that the end of the data sent in this
+send primitive should
+be marked by the protocol as the end of the TSDU.
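+.pp
+The way the socket code reaches a protocol only through its switch
+entry can be pictured with a small C sketch.
+It is an illustration only, not the AOS declaration:
+\fIprotosw_sketch\fR, the \fIdemo_\fR routines, \fIcall_usrreq()\fR,
+and the numeric values of the request codes are all hypothetical,
+and only two of the hooks listed above are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request codes; the real PRU_* values are not given here. */
enum pru_req { PRU_ATTACH, PRU_DETACH, PRU_SEND };

/* Cut-down switch entry holding only two of the hooks named above. */
struct protosw_sketch {
    void (*pr_init)(void);
    int  (*pr_usrreq)(void *so, int req, void *data);
};

static int demo_attached;   /* stands in for "a pcb exists" */
static int demo_sent;       /* counts accepted PRU_SEND requests */

static void demo_init(void)
{
    demo_attached = 0;
    demo_sent = 0;
}

/* One entry point serves every request; the code selects the work. */
static int demo_usrreq(void *so, int req, void *data)
{
    (void)so;
    (void)data;
    switch (req) {
    case PRU_ATTACH:
        demo_attached = 1;
        return 0;
    case PRU_DETACH:
        demo_attached = 0;
        return 0;
    case PRU_SEND:
        if (!demo_attached)
            return -1;          /* no pcb: refuse the request */
        demo_sent++;
        return 0;
    default:
        return -1;              /* request not supported */
    }
}

static struct protosw_sketch demo_sw[] = {
    { demo_init, demo_usrreq },
};

/* Protocol-independent code reaches the protocol only via the table. */
int call_usrreq(struct protosw_sketch *pr, void *so, int req, void *data)
{
    assert(pr != NULL && pr->pr_usrreq != NULL);
    return pr->pr_usrreq(so, req, data);
}
```

+In this sketch a PRU_SEND issued before PRU_ATTACH is refused,
+which mirrors the rule that a protocol control block must exist
+before any other request is served.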
+.sh 1 "The Interface Between the Transport Entity and Lower Layers"
+.pp
+The transport layer may run over a network layer such as IP
+or the ISO connectionless network layer,
+or it may run over a multi-purpose layer such as the service
+provided by X.25.
+X.25 is viewed as a network layer when
+TP runs over X.25, and as a
+subnetwork layer
+when IP is running over X.25.
+The software interface between data link and network layers differs
+considerably from the software interface between transport and network
+layers in AOS.
+For this reason some modification of the transport-to-lower-layer
+interface is necessary to support the suite of protocols included in
+ARGO.
+.pp
+In AOS it is assumed that the transport layer will run over one
+and only one network layer, and therefore it may call the
+network layer output procedure directly.
+In order to allow TP to run over a set of lower layers,
+all domain-specific functions have been put into a set of routines
+that are called indirectly through a domain-specific switch table.
+The primary reason for this is that the transport and network
+layers share information, mostly information pertaining to addresses.
+The protocol control blocks for different network layers
+differ, so the transport layer cannot just directly
+access the network layer's pcb.
+Similarly, a network layer may not directly access the transport
+pcb because a multitude of transport protocols can run over each
+of the network protocols.
+.pp
+To permit different network-layer protocol control blocks to coexist
+under one transport layer, all transport-dependent control
+information was put into a transport-specific protocol control block.
+A new field, \fIso_tpcb\fR,
+was added to the \fIsocket\fR structure to hold a pointer to
+the transport-layer protocol control block.
+The existing
+field \fCso_pcb\fR is used for the network layer pcb.
+.pp
+The following structure was added to allow domain-specific
+functions to be called indirectly.
+All these functions operate on a network-layer pcb.
+.pp
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct nl_protosw {
+.T&
+l l l l.
++int+nlp_afamily;+/* address family */
++int+(*nlp_putnetaddr)();+/* puts addrs in pcb */
++int+(*nlp_getnetaddr)();+/* gets addrs from pcb */
++int+(*nlp_putsufx)();+/* transp suffix -> pcb */
++int+(*nlp_getsufx)();+/* gets t-suffix */
++int+(*nlp_recycle_suffix)();+/* zeroes suffix */
++int+(*nlp_mtu)();+/* get maximum
++++transmission unit size */
++int+(*nlp_pcbbind)();+/* bind to pcb */
++int+(*nlp_pcbconn)();+/* connect */
++int+(*nlp_pcbdisc)();+/* disconnect */
++int+(*nlp_pcbdetach)();+/* detach pcb */
++int+(*nlp_pcballoc)();+/* allocate a pcb */
++int+(*nlp_output)();+/* emit packet */
++int+(*nlp_dgoutput)();+/* emit datagram */
++caddr_t+nlp_pcblist;+/* list of pcbs
++++for management
++++of connections */
+};
+.TE
+\fR
+.)b
+.lp
+The switch is based on the address family chosen when the
+\fIsocket()\fR system call is made prior to connection establishment.
+This unfortunately ties the address family to the domain,
+but the only alternative is to add an argument to the \fIsocket()\fR
+system call to let the user specify the desired network layer.
+In the case of a connection oriented environment with no multi-homing,
+it would be possible to determine which network layer is to be
+used
+from routing
+information, but to do this requires unrealistic assumptions
+about the environment.
+For these reasons, linking the address family to the network
+layer protocol is seen as the least of the evils.
+The transport suffixes are kept in the network layer's pcb
+as well as in the transport layer because
+full transport address pairs are used to identify a connection
+in the Internet domain.
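+.pp
+A minimal sketch of switching on the address family follows.
+It assumes a table with one entry per family and shows only the
+\fInlp_afamily\fR and \fInlp_mtu\fR fields.
+The family codes, the MTU values, and the names \fInl_lookup()\fR and
+\fIquery_mtu()\fR are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical family codes for the sketch. */
#define AF_INET_SK 2
#define AF_ISO_SK  7

/* Two fields of the domain-specific switch, nothing more. */
struct nl_protosw_sketch {
    int nlp_afamily;        /* address family */
    int (*nlp_mtu)(void);   /* get maximum transmission unit size */
};

static int inet_mtu(void) { return 1500; }   /* made-up value */
static int iso_mtu(void)  { return 1024; }   /* made-up value */

static struct nl_protosw_sketch nl_sw[] = {
    { AF_INET_SK, inet_mtu },
    { AF_ISO_SK,  iso_mtu  },
};

/* Find the switch entry for the family chosen at socket() time. */
struct nl_protosw_sketch *nl_lookup(int af)
{
    size_t i;

    for (i = 0; i < sizeof nl_sw / sizeof nl_sw[0]; i++)
        if (nl_sw[i].nlp_afamily == af)
            return &nl_sw[i];
    return NULL;
}

/* Transport code asks the lower layer indirectly; -1 if no such family. */
int query_mtu(int af)
{
    struct nl_protosw_sketch *nlp = nl_lookup(af);

    if (nlp == NULL)
        return -1;
    assert(nlp->nlp_mtu != NULL);
    return nlp->nlp_mtu();
}
```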
+.sh 1 "The Architecture of the Transport Protocol Entity"
+.pp
+A set of protocol hooks is required
+by the AOS IPC architecture.
+These hooks are used by the protocol-independent parts of the kernel
+to gain entry to protocol-specific code.
+The protocol code can be entered in one of the following ways:
+.ip "1) " 5
+at boot time, when autoconfiguration
+initializes each protocol through
+the
+\fIpr_init()\fR
+hook,
+.ip "2) " 5
+from above, either
+a user program making a system call, through
+the \fIpr_usrreq()\fR or \fIpr_ctloutput()\fR hooks, or
+from a higher layer protocol using the
+\fIpr_output()\fR hook,
+.ip "3) " 5
+from below, a device interrupt servicing an incoming packet
+through the \fIpr_input()\fR and \fIpr_ctlinput()\fR hooks, and
+.ip "4) " 5
+from a clock interrupt through the \fIpr_slowtimo()\fR
+or the
+\fIpr_fasttimo()\fR hook.
+.\" FIGURE
+.so figs/trans_flow.nr
+.\".so figs/trans_flow.grn
+.pp
+The protocol code can be divided into
+the following modules, which are described in more detail below.
+.CF
+shows the flow of data and control
+among these modules.
+.in +5
+.ip "Timers and References:" 5
+The code executed on behalf of \fIpr_slowtimo()\fR.
+The fast timeout is not used by TP.
+.ip "Driver:" 5
+This is the finite state machine for TP.
+.ip "Input: " 5
+This is the module that decodes incoming packets,
+identifies or creates the pcb for which
+the packet is destined, and creates an "event" to
+pass to the driver.
+.ip "Output:" 5
+This is the module that creates a packet header of a given type
+with fields containing
+values that are appropriate to the connection
+on which the packet is being sent, appends data if necessary,
+and hands a packet
+to the lower layer, according to the transport-to-lower-layer
+interface.
+.ip "Send: " 5
+This module packetizes data from the outbound
+socket buffer, \fIso_snd\fR,
+handles retransmissions of packetized data, and
+drops packetized data from the retransmission queue.
+.ip "Receive:" 5
+This module reorders packets if necessary,
+depacketizes data, passes it to the socket code module,
+and determines when acknowledgments should be sent.
+.in -5
+.sh 1 "Timers and References"
+.pp
+TP identifies sockets by \fIreference numbers\fR, or
+\fIreferences\fR,
+which are \*(lqfrozen\*(rq (may not be reassigned)
+until some locally defined time after
+a connection is broken and its protocol control block
+is discarded.
+An array of \fIreference blocks\fR is maintained by TP.
+The reference number of a reference block is its
+offset in the array.
+When a reference block is in use it contains
+a pointer to the pcb for the socket to which the
+reference applies.
+.pp
+The system clock calls the \fIpr_slowtimo()\fR and
+\fIpr_fasttimo()\fR hooks for each protocol in the protocol switch table
+every 500 and 200 milliseconds, respectively.
+Each protocol handles its own timers its own way.
+The timers in TP take two forms
+- those that typically are cancelled and
+those that usually expire.
+The latter form may have more than one instantiation at any given
+time.
+The former may not.
+The two are implemented slightly
+differently for the sake of performance.
+.pp
+The timers that normally expire
+are kept in a queue, their values all relative
+to the value of the preceding timer.
+Thus all timer values are decremented by a single
+operation on the value of the first timer.
+The timer is represented by the Ecallout structure:
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct Ecallout {
+.T&
+l l l l.
++int+c_time;+/* incremental time */
++int+c_func;+/* function to call */
++u_int+c_arg1;+/* argument to routine */
++u_int+c_arg2;+/* argument to routine */
++int+c_arg3;+/* argument to routine */
++struct Ecallout+*c_next;
+};
+.TE
+\fR
+.)b
+.lp
+When an Ecallout structure migrates to the head
+of the E timer list, and its \fIc_time\fR
+field is decremented to zero,
+the function stored in \fIc_func\fR is
+called, with \fIc_arg1, c_arg2\fR, and \fIc_arg3\fR
+as arguments.
+Setting and cancelling these timers
+are accomplished by a linear search and one
+insertion or deletion from the timer queue.
+This queue is linked to the
+reference block associated with a communication endpoint.
+This form is used for the reference timer
+and for the retransmission timers for data TPDUs.
+.pp
+The second form of timer, the type that
+typically is cancelled, is used for several
+timers - the inactivity timer, the sendack timer,
+and the retransmission
+timer for all types of TPDUs except data TPDUs.
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct Ccallout {
+.T&
+l l l l.
++int+c_time;+/* incremental time */
++int+c_active;+/* this timer is active? */
+};
+.TE
+\fR
+.)b
+.lp
+All of these timers are stored
+directly
+in the reference block.
+These timers are decremented in one linear scan of
+the reference blocks.
+Cancelling, setting, and both
+cancelling and resetting one of these timers is accomplished by a
+single assignment to an array element.
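+.pp
+A sketch of this second form follows, assuming C99 compound literals
+so that setting or cancelling really is one assignment to an array element.
+The value of N_CTIMERS, the structure \fIref_sketch\fR, and the function
+names are hypothetical; a real scan would hand a timeout event to the
+driver where the sketch merely counts the expiration.

```c
#include <assert.h>

#define N_CTIMERS 4     /* hypothetical; the real value is not given here */

struct Ccallout {
    int c_time;         /* ticks remaining */
    int c_active;       /* this timer is active? */
};

/* Only the timer array of a reference block is modelled. */
struct ref_sketch {
    struct Ccallout tpr_callout[N_CTIMERS];
};

/* Setting (or resetting) is a single assignment to an array element. */
void ctimer_set(struct ref_sketch *r, int which, int ticks)
{
    assert(which >= 0 && which < N_CTIMERS && ticks > 0);
    r->tpr_callout[which] = (struct Ccallout){ ticks, 1 };
}

/* So is cancelling. */
void ctimer_cancel(struct ref_sketch *r, int which)
{
    assert(which >= 0 && which < N_CTIMERS);
    r->tpr_callout[which] = (struct Ccallout){ 0, 0 };
}

/* One linear scan over all reference blocks; returns timers expired. */
int ctimer_scan(struct ref_sketch *refs, int nrefs)
{
    int i, j, expired = 0;

    for (i = 0; i < nrefs; i++) {
        for (j = 0; j < N_CTIMERS; j++) {
            struct Ccallout *c = &refs[i].tpr_callout[j];

            if (c->c_active && --c->c_time <= 0) {
                c->c_active = 0;    /* a real scan would notify the driver */
                expired++;
            }
        }
    }
    return expired;
}
```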
+.sh 1 "Driver"
+.pp
+This is the finite state machine for TP.
+A connection is managed by the finite state machine (fsm).
+All events that pertain to a connection cause the
+finite state machine driver to be called.
+The driver takes two arguments - the pcb for the connection
+and an event structure.
+The event structure contains a field that discriminates
+the different types of events, and a union of
+structures that are specific to the event types.
+The driver evaluates a set of predicates based on the current
+state of the finite state machine (which is kept in the pcb) and the event type.
+The result of the predicate evaluation determines
+a set of actions to take and a state transition.
+The driver takes the actions and if they complete
+without errors, the driver makes the state transition.
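+.pp
+The control flow just described can be sketched as a table-driven
+driver in C.
+This is not the code that xebec generates.
+The states, event types, predicate, action, and the 100-octet limit
+are hypothetical and exist only to show that the state changes
+after the action succeeds and not before.

```c
#include <assert.h>
#include <stddef.h>

enum { ST_CLOSED, ST_OPEN };              /* hypothetical states */
enum { EV_CONNECT, EV_DATA, EV_CLOSE };   /* hypothetical event types */

struct event_sketch {
    int ev_type;                          /* discriminates the union */
    union {
        struct { int len; } data;         /* fields for EV_DATA only */
    } ev_u;
};

struct pcb_sketch {
    int state;                            /* fsm state lives in the pcb */
    int octets;
};

struct transition {
    int from, ev_type;
    int (*pred)(const struct pcb_sketch *, const struct event_sketch *);
    int (*action)(struct pcb_sketch *, const struct event_sketch *);
    int to;
};

static int has_data(const struct pcb_sketch *p, const struct event_sketch *e)
{
    (void)p;
    return e->ev_u.data.len > 0;
}

/* Returns nonzero on error; here, when a made-up 100-octet limit is hit. */
static int accept_data(struct pcb_sketch *p, const struct event_sketch *e)
{
    if (p->octets + e->ev_u.data.len > 100)
        return 1;
    p->octets += e->ev_u.data.len;
    return 0;
}

static const struct transition table[] = {
    { ST_CLOSED, EV_CONNECT, NULL,     NULL,        ST_OPEN   },
    { ST_OPEN,   EV_DATA,    has_data, accept_data, ST_OPEN   },
    { ST_OPEN,   EV_CLOSE,   NULL,     NULL,        ST_CLOSED },
};

#define NTRANS (sizeof table / sizeof table[0])

/* Returns 0 on success, the action's error code, or -1 if nothing matches. */
int driver_sketch(struct pcb_sketch *p, const struct event_sketch *e)
{
    size_t i;

    assert(p != NULL && e != NULL);
    for (i = 0; i < NTRANS; i++) {
        const struct transition *t = &table[i];
        int err;

        if (t->from != p->state || t->ev_type != e->ev_type)
            continue;
        if (t->pred != NULL && !t->pred(p, e))
            continue;
        if (t->action != NULL && (err = t->action(p, e)) != 0)
            return err;             /* action failed: state unchanged */
        p->state = t->to;           /* transition only after success */
        return 0;
    }
    return -1;
}
```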
+.pp
+The states, event types, predicates, actions, and state transitions are all
+specified as a \fIxebec transition file\fR.
+\fIXebec\fR is a utility that takes a human-readable description
+of a finite state machine
+and produces a set of tables and C source code for the driver.
+The driver procedure is called \fItp_driver()\fR.
+It is located in a file generated by xebec,
+\fCtp_driver.c\fR.
+For more details about xebec, see the manual page \fIxebec(1)\fR.
+.pp
+The transition file for TP is \fCtp.trans\fR,
+and it is a good place to begin a perusal of the TP
+source code.
+.sh 1 "Input"
+.pp
+This is the module that decodes an incoming packet,
+locates or creates the pcb for which
+the packet is destined, and creates an event to
+pass to the driver.
+The network layer passes a packet up to the appropriate
+transport layer by indirectly calling a transport input
+routine through the protocol switch table for the network
+domain.
+There is one protocol switch entry for TP for each domain in which
+TP will run (Internet, ISO).
+In the Internet domain, the protocol switch field \fIpr_input()\fR
+takes the value \fItpip_input()\fR.
+This procedure accepts a packet from IP, with the IP header
+still intact.
+It extracts the network addresses from the IP header,
+strips the IP header, and calls the domain-independent
+input procedure for TP,
+\fItp_input()\fR.
+\fITp_input()\fR
+decodes a TPDU.
+The multitude of options, the variable-length
+nature of the options, the semantics of the
+options, and the possible combinations of concatenated
+TPDUs make this a
+complex procedure.
+It is sensitive to changes, and from
+the point of view of software maintenance, it is a
+potential hazard.
+Because it is in the
+critical path of TP, however, some compromise
+was made between maintainability and efficiency.
+Multiple copies of sections of code were avoided as much as
+possible,
+not for the sake of saving space, but rather for the sake
+of maintainability.
+Ironically,
+this detracts somewhat from the readability of the code.
+.pp
+Once a TPDU has been decoded and a pcb has been
+identified for the TPDU,
+the appropriate fields of the TPDU
+are extracted and their values are placed in
+an event structure.
+Finally, \fItp_driver()\fR is called with
+the event structure and the pcb as parameters.
+.sh 1 "Output"
+.pp
+This module creates a TPDU header of a given type
+with field values that are appropriate to the connection
+on which the TPDU is being sent, appends data if necessary,
+and hands a TPDU
+to the lower layer according to the transport-to-lower-layer
+interface.
+Whenever a TPDU is to be sent to the peer or prospective peer,
+the function \fItp_emit()\fR
+is called, passing as arguments the pcb, a TPDU type, and several
+other type-specific arguments, possibly including some data.
+The data are in the form of an mbuf chain.
+\fITp_emit()\fR prepends to the data an mbuf containing a TP header,
+fills in the fields of the header according to the parameters
+given, performs the checksum if appropriate, and
+calls a domain-specific output routine.
+For the Internet domain, this output routine is
+\fItpip_output()\fR, which takes
+as arguments the mbuf chain representing the TPDU,
+and a network level pcb.
+Some protocol errors cannot be associated with
+a connection
+but require that TP issue
+an ER TPDU or a DR TPDU.
+When these errors occur the routine
+\fItp_error_emit()\fR is called.
+This procedure creates the appropriate type of TPDU
+and passes it to a domain-dependent routine for transmitting datagrams.
+In the Internet domain,
+\fItpip_output_dg()\fR is called.
+This takes as arguments an mbuf chain representing the TPDU,
+a source network address, and a destination network address.
+.sh 1 "Send"
+.\" FIGURE
+.so figs/mbufsnd.nr
+.\".so figs/mbufsnd.grn
+.pp
+This module packetizes data from the outbound
+socket buffer, \fIso_snd\fR,
+handles retransmissions of packetized data, and
+drops packetized data from the retransmission queue.
+The major routine in this module is \fItp_send()\fR, which
+takes a range of sequence numbers as arguments.
+For each sequence number in the range,
+it packetizes an appropriate amount
+of outbound data, and places the resulting TPDU on
+a retransmission control queue subject to the
+constraints imposed by the rules of expedited data,
+maximum packet sizes, and end-of-TSDU markers.
+.pp
+The most complicating factor is that of managing
+expedited data.
+A normal datum may not be sent (for its first time) before the
+acknowledgment of any expedited datum
+that was received from the user before the
+normal datum was received.
+In order to enforce this rule,
+each TPDU must be marked in some way
+so that it will be known which expedited datum
+must be delivered and acknowledged by the peer before this TPDU may be transmitted
+for the first time.
+Markers are placed in \fIso_snd\fR
+when an
+outgoing expedited datum arrives from the user.
+A marker is an mbuf structure with an \fIm_len\fR
+of zero, but with the data area nevertheless containing
+the sequence number of an expedited data TPDU.
+The \fIm_type\fR of a marker is a new type, MT_XPD.
+.pp
+\fITp_send()\fR stops packetizing data when it encounters a marker
+for an unacknowledged expedited datum.
+If it encounters a marker for an expedited TPDU that has already
+been acknowledged, the marker is jettisoned.
+.CF
+illustrates the structure of the sending socket buffer used
+for normal data.
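+.pp
+The marker rule can be illustrated with a simplified C sketch.
+It models \fIso_snd\fR as a singly linked list of entries instead of
+real mbufs, ignores sequence number wraparound, and uses hypothetical
+names throughout (\fIsbent\fR, \fIsendable_octets()\fR, and the two type codes).
+The argument \fIxpd_una\fR is assumed to be the sequence number of the
+lowest unacknowledged expedited TPDU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the m_type values. */
enum { MT_DATA_SK, MT_XPD_SK };

struct sbent {
    int type;           /* MT_DATA_SK or MT_XPD_SK */
    int len;            /* octets of normal data; 0 for a marker */
    unsigned xpd_seq;   /* marker only: seq # of the expedited TPDU */
    struct sbent *next;
};

/*
 * Walk the send buffer and return how many octets of normal data may
 * be packetized now.  Markers for acknowledged expedited TPDUs are
 * unlinked; the walk stops at the first marker that is still
 * unacknowledged.  Wraparound of sequence numbers is ignored.
 */
int sendable_octets(struct sbent **head, unsigned xpd_una)
{
    struct sbent **pp = head;
    int octets = 0;

    assert(head != NULL);
    while (*pp != NULL) {
        struct sbent *m = *pp;

        if (m->type == MT_XPD_SK) {
            if (m->xpd_seq >= xpd_una)
                break;              /* unacknowledged: stop here */
            *pp = m->next;          /* acknowledged: drop the marker */
            continue;
        }
        octets += m->len;
        pp = &m->next;
    }
    return octets;
}
```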
+.pp
+When \fItp_send()\fR moves data from mbufs on \fIso_snd\fR to the retransmission
+control queue, it needs to know
+how many octets of data can be placed in each TPDU.
+The appropriate amount depends on, among other things,
+the maximum transmission unit of the network layer
+on the route the packet will take.
+To determine the maximum transmission unit,
+TP queries the network layer through
+the domain-dependent switch table's field, \fInlp_mtu\fR.
+In the Internet domain, this resolves to \fItp_inmtu()\fR.
+The header sizes for the network and transport layers
+also affect the amount of data that can go into a packet,
+and these sizes depend on the connection's characteristics.
+.pp
+Once the maximum amount of data per TPDU is determined,
+\fItp_send()\fR can pull this amount off the \fIso_snd\fR queue to form
+a TPDU,
+assign a TPDU sequence number,
+and place the new TPDU on the
+retransmission control queue.
+The retransmission control queue is a list of mbuf chains.
+Each mbuf chain represents one TPDU, preceded by an
+\fIrtc structure\fR:
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct tp_rtc {
+.T&
+l l l l.
++struct tp_rtc+*tprt_next;+/* next rtc struct in list */
++SeqNum+tprt_seq;+/* seq # of this TPDU */
++int+tprt_eot;+/* end of TSDU? */
++int+tprt_octets;+/* # octets in this TPDU */
++struct mbuf+*tprt_data;+/* ptr to the octets of data */
+.\"/* Performance measurment info: */
+.\"int tprt_window; /* in which call to tp_send() was
+.\" * this TPDU formed?
+.\" */
+.\"struct timeval tprt_sess_time; /* time session received the
+.\" * majority of the data for this packet on send;
+.\" * on recv, this is the time it's given to session
+.\" */
+.\"struct timeval tprt_net_time; /* time first copy was given to net layer
+.\" * on send; on receive it's the time received from
+.\" * the network
+.\" */
+};
+.TE
+\fR
+.)b
+.lp
+Once TPDUs are on the retransmission control queue,
+they are retransmitted or dropped by the actions
+of timers.
+The procedure \fItp_sbdrop()\fR
+removes the TPDUs from the retransmission queue.
+It takes a sequence number as an argument and drops
+all TPDUs up to and including the TPDU with that sequence number.
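+.pp
+A sketch of dropping acknowledged TPDUs from such a list follows.
+It is not the \fItp_sbdrop()\fR source: the structure \fIrtc_sketch\fR keeps
+only three fields, memory comes from \fImalloc()\fR rather than mbufs,
+the helper \fIrtc_append()\fR is hypothetical, and sequence numbers are
+compared without regard to wraparound.

```c
#include <assert.h>
#include <stdlib.h>

struct rtc_sketch {
    struct rtc_sketch *next;    /* next entry in the list */
    unsigned seq;               /* seq # of this TPDU */
    int octets;                 /* # octets in this TPDU */
};

/* Append a TPDU record at the tail of the queue. */
void rtc_append(struct rtc_sketch **head, unsigned seq, int octets)
{
    struct rtc_sketch *t = malloc(sizeof *t);

    assert(t != NULL);
    t->next = NULL;
    t->seq = seq;
    t->octets = octets;
    while (*head != NULL)
        head = &(*head)->next;
    *head = t;
}

/*
 * Drop every TPDU up to and including `seq` from the front of the
 * queue.  Returns the number of octets released.  Wraparound of the
 * sequence space is ignored in this sketch.
 */
int rtc_drop(struct rtc_sketch **head, unsigned seq)
{
    int released = 0;

    while (*head != NULL && (*head)->seq <= seq) {
        struct rtc_sketch *t = *head;

        *head = t->next;
        released += t->octets;
        free(t);
    }
    return released;
}
```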
+.pp
+When an AK TPDU arrives, the values from
+its credit and sequence number fields
+are passed to \fItp_goodack()\fR, which
+determines whether or not the AK brought any news with it,
+and therefore whether TP can send more data
+or expedited data.
+If this AK acknowledges something heretofore unacknowledged,
+\fItp_goodack()\fR drops the appropriate TPDU(s) from the retransmission
+control list, computes the smoothed average round trip time
+and standard deviation of the round trip time,
+and updates
+the retransmission timer based on these statistics.
+It sets a flag in the pcb if the TP entity is obliged to
+send the flow control confirmation parameter on its next
+AK TPDU.
+\fITp_goodack()\fR returns true if the AK brought some news with it,
+either with respect to a change in credit or with respect to
+new acknowledgments.
+.pp
+The function \fItp_goodXack()\fR is called when an XAK TPDU
+arrives.
+It takes the XAK sequence number as an argument and
+determines if the XAK acknowledges the last XPD TPDU sent.
+If so, it drops the expedited data from the outgoing
+expedited data buffer.
+By its definition in the TP specification,
+the expedited data stream has a window
+of size 1,
+that is,
+only one expedited datum (packet) can be buffered
+at a time.
+\fITp_goodXack()\fR returns true if the XAK acknowledged
+the last XPD TPDU sent and the data were dropped,
+and it returns false if the acknowledgment caused no action to be taken.
+.\" NEXT FIGURE
+.so figs/mbufrcv.nr
+.\".so figs/mbufrcv.grn
+.sh 1 "Receive"
+.pp
+This module reorders incoming TPDUs if necessary,
+depacketizes data, passes it to the socket code module,
+and determines when acknowledgments should be sent.
+The function
+\fItp_stash()\fR
+takes a DT TPDU as an argument, and if the TPDU is not in
+sequence, it saves the TPDU in a \fItp_rtc\fR structure in
+a list, with the TPDUs
+kept in order.
+When the next expected TPDU arrives, the
+list of out-of-order TPDUs is scanned for
+more TPDUs in sequence, updating
+a field in the pcb, \fItp_rcvnxt\fR, which
+always contains the sequence
+number of
+the next expected TPDU.
+If an acknowledgment is to be generated
+at any time, the value of tp_rcvnxt goes into the
+\fIYR-TU-NR\fR\** field of the acknowledgment TPDU.
+.(f
+\**
+This is the name used in ISO 8073 for the field
+which indicates the sequence number of the next expected DT TPDU.
+.)f
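+.pp
+The reordering step can be sketched in C as follows.
+The sketch tracks sequence numbers only, with no data, no window check,
+and no wraparound, and its names (\fIrcv_sketch\fR, \fIstash_sketch()\fR)
+are hypothetical.
+It returns the number of TPDUs that became deliverable in sequence,
+which is not the return convention of \fItp_stash()\fR itself.

```c
#include <assert.h>
#include <stdlib.h>

struct stash {
    unsigned seq;           /* seq # of a TPDU held out of order */
    struct stash *next;     /* list is kept in increasing seq order */
};

struct rcv_sketch {
    unsigned rcvnxt;        /* seq # of the next expected TPDU */
    struct stash *ooo;      /* out-of-order TPDUs */
};

/* Returns how many TPDUs this arrival made deliverable in sequence. */
int stash_sketch(struct rcv_sketch *r, unsigned seq)
{
    struct stash **pp, *s;
    int delivered;

    if (seq < r->rcvnxt)
        return 0;                       /* duplicate of delivered data */
    if (seq != r->rcvnxt) {             /* out of order: save it */
        for (pp = &r->ooo; *pp != NULL && (*pp)->seq < seq;
             pp = &(*pp)->next)
            ;
        if (*pp != NULL && (*pp)->seq == seq)
            return 0;                   /* duplicate of a stashed TPDU */
        s = malloc(sizeof *s);
        assert(s != NULL);
        s->seq = seq;
        s->next = *pp;
        *pp = s;
        return 0;
    }
    r->rcvnxt++;                        /* the expected TPDU arrived */
    delivered = 1;
    while ((s = r->ooo) != NULL && s->seq == r->rcvnxt) {
        r->ooo = s->next;               /* stashed TPDU is now in sequence */
        free(s);
        r->rcvnxt++;
        delivered++;
    }
    return delivered;
}
```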
+.pp
+\fITp_stash()\fR returns true if an acknowledgment needs to be generated
+immediately, and false otherwise.
+The acknowledgment strategy is therefore implemented in this routine.
+Acknowledgments may be generated for one or more of several reasons,
+listed below.
+\fITp_stash()\fR increments a counter for each of these reasons
+for which an acknowledgment is generated, and a counter for TPDUs
+that are not acknowledged immediately.
+.ip "ACK_STRAT_EACH" 5
+The acknowledgment strategy in use calls for acknowledging each
+data packet with an AK TPDU.
+.ip "ACK_STRAT_FULLWIN" 5
+The acknowledgment strategy in use calls for acknowledging
+upon receiving the DT TPDU that represents the upper window
+edge of the last advertised window.
+.ip "ACK_DUP" 5
+A duplicate data TPDU was received.
+.ip "ACK_REORDER" 5
+A DT TPDU arrived in the window but out of order.
+.ip "ACK_EOT" 5
+A DT TPDU arrived, and it had the end-of-TSDU flag set.
+.pp
+Upon receipt of a DT TPDU that is in order, and upon reordering
+DT TPDUs,
+\fItp_stash()\fR
+places the TSDUs into the socket's receive
+socket buffer, \fIso->so_rcv\fR, in mbuf chains, with
+TSDUs delimited by mbufs of the \fIm_type\fR MT_EOT,
+which is a new type with the ARGO kernel.
+.CF
+illustrates the structure of the receiving socket buffer used
+for normal data.
+.pp
+A separate socket buffer, \fItpcb->tp_Xrcv\fR,
+is used for
+buffering expedited data.
+Only one expedited data packet may reside in this buffer at a time
+because the TP standard limits the size of the window on expedited flow
+to be 1.
+This means the data structures are straightforward;
+there is no need to distinguish between separate TSDUs in this socket buffer.
+.pp
+Credit is determined
+by dividing the total amount of available
+space in the receive buffer
+by the negotiated maximum TPDU size.
+TP can often offer a larger credit than this if it uses
+an average of the measured actual TPDU sizes.
+This strategy was once an option in the ARGO kernel,
+but it was removed because unless the actual TPDU size
+is constant, it leads to reneging of credit,
+retransmissions, and decreased performance.
+It does not work well when there is any fluctuation in the sizes
+of TPDUs and it carries the penalty of lengthening the critical path
+of the TP entity.
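+.pp
+The credit computation amounts to one integer division, sketched here in C.
+The function name and its parameters (buffer limit, octets currently
+buffered, negotiated maximum TPDU size) are assumptions made for the
+illustration and do not correspond to fields named in this document.

```c
#include <assert.h>

/*
 * Credit = available receive-buffer space / negotiated max TPDU size.
 * hiwat:     capacity of the receive buffer, in octets (assumed name)
 * cc:        octets currently held in the buffer (assumed name)
 * tpdu_size: negotiated maximum TPDU size, in octets
 */
unsigned credit_sketch(unsigned hiwat, unsigned cc, unsigned tpdu_size)
{
    assert(tpdu_size > 0);
    if (cc >= hiwat)
        return 0;                   /* buffer full: offer no credit */
    return (hiwat - cc) / tpdu_size;
}
```

+With a 4096-octet buffer and a negotiated size of 1024, the credit is 4.
+Dividing by a measured average of 512 instead would offer 8,
+and a peer that then sent full-sized TPDUs would overrun the buffer,
+which is the reneging problem described above.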
+.sh 1 "Major Data Structures and Types"
+.pp
+In addition to the types commonly used in the kernel,
+such as
+.(b
+\fC
+.TS
+tab(+);
+l l l l.
+ +typedef+unsigned char+u_char;
+ +typedef+unsigned int+u_int;
+ +typedef+unsigned short+u_short;
+.TE
+\fR
+.)b
+TP uses the following types:
+.(b
+\fC
+.TS
+tab(+);
+l l l l.
+ +typedef+unsigned int+SeqNum;
+ +typedef+unsigned short+RefNum;
+ +typedef+int+ProtoHook;
+.TE
+\fR
+.)b
+.pp
+Sequence numbers can be either 7 or 31 bits.
+An unsigned integer is used in all cases, and the proper type
+of arithmetic is performed with bit masks.
+Reference numbers are 16 bits.
+ProtoHook is the type of the procedures that are in switch
+tables, which,
+although they are not functions,
+are declared \fIint\fR rather than \fIvoid\fR
+to be consistent with the rest of the kernel.
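+.pp
+Masked sequence arithmetic of this kind can be sketched in C as follows.
+These are not the ARGO macros.
+The structure \fIseqspace\fR and the function names are hypothetical,
+the comparison rule (forward distance less than half the space) is one
+common convention assumed here, and \fIunsigned int\fR is assumed to
+be at least 32 bits wide.

```c
#include <assert.h>

struct seqspace {
    unsigned seqmask;   /* which bits of the stored value are used */
    unsigned seqbit;    /* lowest bit that is not used */
    unsigned seqhalf;   /* half the size of the sequence space */
};

/* Set the three values once, for 7-bit or 31-bit sequence numbers. */
void seq_init(struct seqspace *s, int bits)
{
    assert(bits == 7 || bits == 31);
    s->seqbit = 1u << bits;
    s->seqmask = s->seqbit - 1;
    s->seqhalf = s->seqbit >> 1;
}

/* Addition and subtraction wrap by masking off the unused bits. */
unsigned seq_add(const struct seqspace *s, unsigned a, unsigned n)
{
    return (a + n) & s->seqmask;
}

unsigned seq_sub(const struct seqspace *s, unsigned a, unsigned b)
{
    return (a - b) & s->seqmask;
}

/* a follows b if the forward distance from b to a is under half. */
int seq_gt(const struct seqspace *s, unsigned a, unsigned b)
{
    unsigned d = seq_sub(s, a, b);

    return d != 0 && d < s->seqhalf;
}
```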
+.pp
+The following structures are fundamental
+types used throughout TP,
+in addition to those already described in the
+section,
+"The Design of the Transport Entity".
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct tp_ref {
+.T&
+l l l l.
++u_char+tpr_state;+/* REF_FROZEN...*/
++struct Ccallout+tpr_callout[N_CTIMERS];+/* C timers */
++struct Ecallout+tpr_calltodo;+/* E timers list */
++struct tp_pcb+*tpr_pcb;+/* --> PCB */
+};
+.TE
+\fR
+.)b
+.lp
+The reference structure is logically a part of the protocol
+control block and it is linked to a pcb, but it may outlive
+a pcb.
+When a connection is dissolved, the pcb may be recycled
+but the reference structure must remain until the reference
+timer goes off.
+The field \fItpr_state\fR takes the values
+REF_FROZEN (a reference timer is ticking),
+REF_OPEN (in use, has timers and an associated pcb),
+REF_OPENING (has a pcb but no timers), and
+REF_FREE (free to reallocate).
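+.pp
+The life cycle of a reference can be sketched in C as follows.
+The numeric values of the states, the array size, and the function
+names are hypothetical, and the timers themselves are left out:
+\fIref_timeout()\fR stands for the moment the reference timer goes off.

```c
#include <assert.h>
#include <stddef.h>

/* State names are from the text; the numeric values are hypothetical. */
enum { REF_FREE, REF_OPENING, REF_OPEN, REF_FROZEN };

#define N_REFS_SK 4     /* hypothetical array size */

struct ref_sk {
    unsigned char tpr_state;
    void *tpr_pcb;      /* pcb using this reference, if any */
};

static struct ref_sk refs_sk[N_REFS_SK];    /* all REF_FREE initially */

/* The reference number is the block's offset in the array; -1 if none. */
int ref_alloc(void *pcb)
{
    int i;

    for (i = 0; i < N_REFS_SK; i++) {
        if (refs_sk[i].tpr_state == REF_FREE) {
            refs_sk[i].tpr_state = REF_OPENING;
            refs_sk[i].tpr_pcb = pcb;
            return i;
        }
    }
    return -1;
}

/* Connection dissolved: the pcb may go, the reference stays frozen. */
void ref_freeze(int ref)
{
    assert(ref >= 0 && ref < N_REFS_SK);
    refs_sk[ref].tpr_state = REF_FROZEN;
    refs_sk[ref].tpr_pcb = NULL;
}

/* Reference timer went off: the number may now be reassigned. */
void ref_timeout(int ref)
{
    assert(ref >= 0 && ref < N_REFS_SK);
    assert(refs_sk[ref].tpr_state == REF_FROZEN);
    refs_sk[ref].tpr_state = REF_FREE;
}
```

+While a reference is frozen, \fIref_alloc()\fR skips it, so a new
+connection cannot be confused with the one that was just broken.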
+.pp
+The TP protocol control block is too large to fit into
+one mbuf structure so it comprises two structures
+linked together, the
+\fItp_pcb\fR structure and the
+\fItp_pcb_aux\fR structure.
+The \fItp_pcb_aux\fR structure contains
+items that are used less frequently than those in
+the former structure, since each access to these
+items requires a second pointer dereference.
+.(b
+\fC
+.TS
+tab(+);
+l s s s.
+struct tp_pcb_aux {
+.T&
+l l l s.
+ +struct sockbuf+tpa_Xsnd;+/* for expedited data */
++struct sockbuf+tpa_Xrcv;+/* for expedited data */
++u_char +tpa_vers;+/* protocol version */
++u_char +tpa_peer_acktime;+/* to compute DT TPDU
++++retrans timer value */
++SeqNum+tpa_Xsndnxt;+/* seq # of
++++next XPD to send */
++SeqNum+tpa_Xuna;+/* seq # of
++++unacked XPD */
++SeqNum+tpa_Xrcvnxt;+/* next XPD seq #
++++expect to recv */
++/* addressing */
++u_short+tpa_domain;+/* domain AF_ISO,...*/
++u_short+tpa_fsuffixlen;+/* foreign suffix */
++u_char+tpa_fsuffix[MAX_TSAP_SEL_LEN];+
++u_short+tpa_lsuffixlen;+/* local suffix */
++u_char+tpa_lsuffix[MAX_TSAP_SEL_LEN];+
+.T&
+l s s s.
+ +/* AK subsequencing */
+.T&
+l l l s.
+ +u_short+tpa_s_subseq;+/* next subseq to send */
++u_short+tpa_r_subseq;+/* highest recv subseq */
+};
+.TE
+\fR
+.)b
+.pp
+The major portion of the protocol control block is in the
+\fItp_pcb\fR structure:
+.(b
+\fC
+.TS
+tab(%);
+l s s s.
+struct tp_pcb {
+.\" ***************************************
+.T&
+l l l l.
+.\" The next line sets the spacing for the table: 1+3 17+3 17+3 13+3
+ % % %
+.\"456789 123456789- 123456789 123456-789 123456789 1234567890
+.\"
+ %struct tp_ref%*tp_refp;%
+.T&
+l l l s.
+%%/* reference structure */%
+.\" ***************************************
+.T&
+l l l l.
+ %struct tp_pcb_aux%*tp_aux;%
+.T&
+l l l s.
+ %%/*rest of tpcb (auxiliary struct)*/%
+.\" ***************************************
+.T&
+l l l l.
+ %caddr_t%tp_npcb;%/* to ll pcb */
+%struct nl_protosw%*tp_nlproto;%
+.T&
+l l l s.
+ % %/* domain-dependent routines */%
+.\" ***************************************
+.T&
+l l l l.
+ %struct socket%*tp_sock;%/* back ptr */
+.\" ***************************************
+.T&
+l s s s.
+
+/* local and foreign reference numbers: */
+.T&
+l l l l.
+ %RefNum%tp_lref;%
+%RefNum%tp_fref;%
+.\" ***************************************
+.T&
+l s s s.
+.\"456789 123456789 123456789 123456789 123456789 1234567890
+
+/* Stuff for sequence space arithmetic:
+ * Maintaining 2 sequence spaces is a pain so we set these
+ * values once at connection establishment time. Sequence
+ * number arithmetic is a set of macros which uses these.
+ * Sequence numbers are stored as 32 bits.
+ * tp_seqmask tells which of the 32 bits is used.
+ * tp_seqbit is the lsb that is not used. When set,
+ * it indicates wraparound has occurred.
+ * tp_seqhalf is the value that is half the sequence space.
+ * (or half plus one).
+ */
+.T&
+l l l l.
+%u_int%tp_seqmask;%/* mask */
+%u_int%tp_seqbit;%/* wraparound */
+%u_int%tp_seqhalf;%/* half space */
+.\" ***************************************
+.T&
+l s s s.
+
+/* flags: values are defined in tp_user.h.
+ * Here we keep such info as which options
+ * are in use: checksum, extended format,
+ * flow control in class 2, etc.
+ * See tp(4p) man page.
+ */
+.\" ***************************************
+.T&
+l l l l.
+ %u_short%tp_state;%/* fsm */
+%short%tp_retrans;%
+.T&
+l l l s.
+ % % /* # times to retransmit */%
+.\" ***************************************
+.T&
+l s s s.
+
+/* credit & sequencing info for SENDING: */
+.T&
+l l l s.
+ %u_short%tp_fcredit;%
+ % %/* remote real window */%
+ %u_short%tp_cong_win;%
+ % %/* remote congestion window */%
+.\" ***************************************
+%SeqNum%tp_snduna;%
+.T&
+l l l s.
+ % %/* seq # of lowest unacked DT */%
+.\" ***************************************
+.T&
+l l l l.
+ %struct tp_rtc %*tp_snduna_rtc;%
+.T&
+l l l s.
+ % %/* ptr to mbufs containing lowest%
+%% * unacked TPDUs sent so far%
+%% */%
+.\" ***************************************
+.T&
+l l l l.
+ %SeqNum%tp_sndhiwat;%
+.T&
+l l l s.
+ % %/* highest DT sent yet */%
+.\" ***************************************
+.T&
+l l l l.
+ %struct tp_rtc%*tp_sndhiwat_rtc;%
+.T&
+l l l s.
+ % %/* ptr to mbufs containing the last%
+%% * DT sent - this is the last item %
+%% * on the list that starts%
+%% * at tp_snduna_rtc%
+%% */%
+.\" ***************************************
+.T&
+l l l l.
+ %int %tp_Nwindow;%/* for perf. measmt */
+.\" ***************************************
+.T&
+l s s s.
+
+/* credit & sequencing info for RECEIVING: */
+.\" ***************************************
+.T&
+l l l s.
+ %SeqNum%tp_sent_lcdt;%
+ %%/* cdt according to last AK sent */%
+ %SeqNum%tp_sent_uwe;%
+ % %/* upper window edge, according to%
+%% * the last AK sent %
+%% */%
+ %SeqNum%tp_sent_rcvnxt;%
+ % %/* rcvnxt, according to%
+%% * the last AK sent%
+%% */%
+.\" ***************************************
+.T&
+l l l l.
+ %short%tp_lcredit;%/* local */
+.\" ***************************************
+.T&
+l l l l.
+ %SeqNum%tp_rcvnxt;%
+.T&
+l l l s.
+ % %/* next DT seq# we expect to recv */%
+.\" ***************************************
+.T&
+l l l l.
+ %struct tp_rtc%*tp_rcvnxt_rtc;%
+.T&
+l l l s.
+ % %/* ptr to mbufs containing unacked %
+%% * DTs received out of order, and %
+%% * which we haven't acknowledged%
+%% */%
+.\" ***************************************
+.TE
+.TS
+tab(%);
+l s s s.
+/* Items kept in the aux structure: */
+
+.\" ***************************************
+.T&
+l s s l.
+#define tp_vers%tp_aux->tpa_vers
+#define tp_peer_acktime%tp_aux->tpa_peer_acktime
+#define tp_Xsnd%tp_aux->tpa_Xsnd
+#define tp_Xrcv%tp_aux->tpa_Xrcv
+#define tp_Xrcvnxt%tp_aux->tpa_Xrcvnxt
+#define tp_Xsndnxt%tp_aux->tpa_Xsndnxt
+#define tp_Xuna%tp_aux->tpa_Xuna
+#define tp_domain%tp_aux->tpa_domain
+#define tp_fsuffixlen%tp_aux->tpa_fsuffixlen
+#define tp_fsuffix%tp_aux->tpa_fsuffix
+#define tp_lsuffixlen%tp_aux->tpa_lsuffixlen
+#define tp_lsuffix%tp_aux->tpa_lsuffix
+#define tp_s_subseq%tp_aux->tpa_s_subseq
+#define tp_r_subseq%tp_aux->tpa_r_subseq
+.\" ***************************************
+.T&
+l s s s.
+ % % %
+/* parameters per-connection controllable by user: */
+.\" ***************************************
+.T&
+l l l l.
+ %struct%tp_conn_param%_tp_param;
+ % % %
+.\" ***************************************
+.T&
+l s s l.
+#define tp_Nretrans%_tp_param.p_Nretrans
+#define tp_dr_ticks%_tp_param.p_dr_ticks
+#define tp_cc_ticks%_tp_param.p_cc_ticks
+#define tp_dt_ticks%_tp_param.p_dt_ticks
+#define tp_xpd_ticks%_tp_param.p_x_ticks
+#define tp_cr_ticks%_tp_param.p_cr_ticks
+#define tp_keepalive_ticks%_tp_param.p_keepalive_ticks
+#define tp_sendack_ticks%_tp_param.p_sendack_ticks
+#define tp_refer_ticks%_tp_param.p_ref_ticks
+#define tp_inact_ticks%_tp_param.p_inact_ticks
+#define tp_xtd_format%_tp_param.p_xtd_format
+#define tp_xpd_service%_tp_param.p_xpd_service
+#define tp_ack_strat%_tp_param.p_ack_strat
+#define tp_rx_strat%_tp_param.p_rx_strat
+#define tp_use_checksum%_tp_param.p_use_checksum
+#define tp_tpdusize%_tp_param.p_tpdusize
+#define tp_class%_tp_param.p_class
+#define tp_winsize%_tp_param.p_winsize
+#define tp_netservice%_tp_param.p_netservice
+#define tp_no_disc_indications%_tp_param.p_no_disc_indications
+#define tp_dont_change_params%_tp_param.p_dont_change_params
+.\" ***************************************
+.TE
+.\" ***************************************
+.\" ***************************************
+.\" ***************************************
+.TS
+tab(%);
+l l l l.
+.\" The next line sets the spacing for the table: 1+3 17+3 17+3 13+3
+.\"456789 123456789- 123456789 123456-789 123456789 1234567890
+.\"
+.T&
+l l l s.
+ %%/* log2(the negotiated max size) */%
+.T&
+l l l l.
+ %int%tp_l_tpdusize;%/* # bytes */
+.\" ***************************************
+ %struct timeval%tp_rtt;%
+.T&
+l l l s.
+ % %/* smoothed avg round-trip time */%
+ %struct timeval%tp_rtv;%
+ % %/* std deviation of round-trip time */%
+%struct timeval%tp_rttemit[ TP_RTT_NUM + 1 ];%
+%%/* times that the last TP_RTT_NUM %
+%% * DT_TPDUs were transmitted %
+%% */%
+.\" ***************************************
+ %unsigned % %
+% tp_sendfcc:1,%/* shall next ack %
+% %include flow control conf. param? */%
+.\" ***************************************
+.T&
+l l l s.
+ % tp_trace:1,%/* is this pcb being traced?%
+%% * (not used yet) %
+%% */%
+.\" ***************************************
+% tp_perf_on:1,%/* statistics being kept? */%
+.\" ***************************************
+% tp_reneged:1,%/* have we reneged on credit%
+%% * since the last AK TPDU was sent? %
+%% */%
+% tp_decbit:4,%/* congestion experienced? */%
+% tp_flags:8,%/* see #defines below */%
+.\" ***************************************
+% tp_unused:16;%%
+.T&
+l s s l.
+#define TPF_XPD_PRESENT%TPFLAG_XPD_PRESENT
+#define TPF_NLQOS_PDN%TPFLAG_NLQOS_PDN
+#define TPF_PEER_ON_SAMENET%TPFLAG_PEER_ON_SAMENET
+%%%
+.\" ***************************************
+.T&
+l l l l.
+ %struct tp_pmeas%*tp_p_meas;%
+.T&
+l l l s.
+ % %/* ptr to mbuf to hold the perf.%
+%% * statistics structure %
+%% */%
+.\" ***************************************
+};
+.TE
+\fR
+.\"
+.\" end of tpcb structure (thank you)
+.\"
+.)b
+.fi
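+.pp
+The \fI#define\fR lines in the listing above let code name an
+auxiliary field as if it were a member of \fItp_pcb\fR,
+at the cost of the second pointer dereference.
+The following is a minimal sketch of that pattern, not the ARGO source:
+the cut-down structures \fIpcb\fR and \fIpcb_aux\fR and the two
+helper functions are hypothetical and carry only enough fields to show
+the macro expansion.

```c
#include <assert.h>

/* Hypothetical, cut-down versions of the two structures. */
struct pcb_aux {
    unsigned char  tpa_vers;      /* protocol version */
    unsigned short tpa_s_subseq;  /* next subseq to send */
};

struct pcb {
    struct pcb_aux *tp_aux;       /* rest of the pcb (auxiliary struct) */
    unsigned short  tp_state;     /* frequently used: kept in the main struct */
};

/* Same pattern as the #defines in the listing above. */
#define tp_vers     tp_aux->tpa_vers
#define tp_s_subseq tp_aux->tpa_s_subseq

/* Reads through the macro: expands to p->tp_aux->tpa_vers. */
static unsigned char pcb_version(const struct pcb *p)
{
    assert(p && p->tp_aux);
    return p->tp_vers;
}

/* Writes through the macro: expands to p->tp_aux->tpa_s_subseq++. */
static void pcb_next_subseq(struct pcb *p)
{
    assert(p && p->tp_aux);
    p->tp_s_subseq++;
}
```

+.pp
+A caller never mentions \fItp_aux\fR,
+so a field can move between the two structures
+by changing only its \fI#define\fR.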
+.sh 1 "Sequence Number Arithmetic"
+.pp
+Sequence numbers in TP can be either 7 bits
+(\*(lqnormal format\*(rq)
+or 31 bits
+(\*(lqextended format\*(rq).
+Sequence numbers are unsigned integers,
+regardless of their format.
+Three fields are kept in the pcb to manage the sequence
+number arithmetic:
+.(b
+\fC
+.TS
+tab(+);
+l l l l.
+ +u_int+tp_seqmask;+/* mask for seq space */
+ +u_int+tp_seqbit;+/* bit for seq # wraparound */
+ +u_int+tp_seqhalf;+/* half the seq space */
+.TE
+\fR
+.)b
+.lp
+\fITp_seqmask\fR
+is a bit mask indicating which bits are legitimate
+for a sequence number of either format.
+It takes the value 0x7f if 7-bit sequence numbers are in use,
+and 0x7fffffff if 31-bit sequence numbers are in use.
+\fITp_seqbit\fR
+is the bit that becomes set when a sequence number wraps around
+while being incremented.
+Its value is 0x80 for normal format, 0x80000000 for extended format.
+\fITp_seqhalf\fR
+takes the value which is in the middle of the sequence space,
+0x40 for normal format,
+and
+0x40000000 for extended format.
+.(b
+.nf
+The macro
+.fi
+\fC
+.TS
+tab(+);
+l l l l.
+ SEQ(tpcb, x)
+.TE
+\fR
+.)b
+.lp
+extracts a sequence number from the location
+in which it is stored.
+.pp
+The macros
+.(b
+\fC
+.TS
+tab(+);
+l l s s l.
+ +SEQ_GT(tpcb, seq, t)+is seq > t?
+ +SEQ_GEQ(tpcb, seq, t)+is seq >= t?
+ +SEQ_LT(tpcb, seq, t)+is seq < t?
+ +SEQ_LEQ(tpcb, seq, t)+is seq <= t?
+ +SEQ_INC(tpcb, seq)+seq\+\+
+ +SEQ_DEC(tpcb, seq)+seq--
+ +SEQ_SUB(tpcb, seq, amt)+seq -= amt
+ +SEQ_ADD(tpcb, seq, amt)+seq \+= amt
+.TE
+\fR
+.)b
+.lp
+perform the indicated comparisons and arithmetic
+on their arguments.
+.pp
+An example of how these macros
+are used is as follows.
+To determine if a sequence
+number \fIseq\fR is in a receive window
+bounded by
+\fIlwe\fR and \fIuwe\fR,
+we define the
+macro
+.(b
+\fC
+.TS
+tab(+);
+l l.
+#define+IN_RWINDOW(tpcb, seq, lwe, uwe)\\
++( SEQ_GEQ(tpcb, seq, lwe) && SEQ_LT(tpcb, seq, uwe) )
+.TE
+\fR
+.)b
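+.pp
+The definitions of the comparison and arithmetic macros are not
+reproduced in this document.
+As an illustration only, the sketch below shows one way such
+operations can be built from the three values described above.
+The structure \fIseq_space\fR and the lower-case function names
+are hypothetical, and the comparison rule
+(a difference of less than half the sequence space means \*(lqahead\*(rq)
+is an assumption rather than the ARGO definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for tp_seqmask, tp_seqbit and tp_seqhalf. */
struct seq_space {
    uint32_t mask;  /* 0x7f or 0x7fffffff */
    uint32_t bit;   /* 0x80 or 0x80000000 */
    uint32_t half;  /* 0x40 or 0x40000000 */
};

/* seq += amt and seq -= amt, reduced to the sequence space. */
static uint32_t seq_add(const struct seq_space *s, uint32_t x, uint32_t amt)
{
    return (x + amt) & s->mask;
}

static uint32_t seq_sub(const struct seq_space *s, uint32_t x, uint32_t amt)
{
    return (x - amt) & s->mask;
}

/* seq++; returns nonzero when the increment wrapped around. */
static int seq_inc(const struct seq_space *s, uint32_t *x)
{
    uint32_t raw;

    assert((*x & ~s->mask) == 0);   /* *x must already be a valid seq # */
    raw = *x + 1;                   /* may carry into the wraparound bit */
    *x = raw & s->mask;
    return (raw & s->bit) != 0;
}

/* a > b if a is ahead of b by less than half the sequence space. */
static int seq_gt(const struct seq_space *s, uint32_t a, uint32_t b)
{
    uint32_t d = (a - b) & s->mask;
    return d != 0 && d < s->half;
}

static int seq_geq(const struct seq_space *s, uint32_t a, uint32_t b)
{
    return ((a - b) & s->mask) < s->half;
}

static int seq_lt(const struct seq_space *s, uint32_t a, uint32_t b)
{
    return seq_gt(s, b, a);
}

/* Function form of the IN_RWINDOW test shown above. */
static int in_rwindow(const struct seq_space *s, uint32_t seq,
                      uint32_t lwe, uint32_t uwe)
{
    return seq_geq(s, seq, lwe) && seq_lt(s, seq, uwe);
}
```

+.pp
+With normal format values (0x7f, 0x80, 0x40),
+a window with \fIlwe\fR 126 and \fIuwe\fR 2 spans the wrap point:
+sequence number 0 falls inside it and sequence number 5 does not.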
+.sh 1 "TP Implementation Options"
+.pp
+The transport protocol specification leaves several
+things to the discretion of the implementor,
+some of which may affect the performance
+of individual connections and
+aggregate performance.
+Wherever different strategies are likely to favor
+the performance of
+individual connections to the detriment of aggregate performance
+or vice versa, the
+various strategies are under the control of options via the
+\fIgetsockopt()\fR and
+\fIsetsockopt()\fR system calls (see the manual pages
+\fIgetsockopt(2)\fR,
+\fIsetsockopt(2)\fR
+and
+\fItp(4p)\fR
+for details).
+In some cases the preferred strategies differ for the different
+subnetworks, so the strategies chosen will be determined
+by the subnetwork in use.
+.sh 2 "TPDU size"
+.pp
+The limitation of the maximum TPDU size to a power of two is
+unfortunate in the LAN environment.
+For example, if the maximum NSDU size is around 1500, as in the case of an
+Ethernet,
+using a maximum TPDU size of 1024 reduces
+the possible throughput by approximately 30%.
+TP negotiates a maximum TPDU size of 2048 and
+generates TPDUs of size around 1500.
+Obviously this works well only when the peer is known to be
+using the same scheme (so that the peer
+doesn't send TPDUs of size 2048 and cause its
+network layer to fragment the TPDUs).
+This is likely to be the case in a LAN where
+all protocol entities are under the same administrative
+control.
+The maximum TPDU size negotiated is under the control of the user,
+so
+it is possible to prevent this scheme from being used
+by default
+when the peer is not on the same LAN, by
+setting the \fItp.tpdusize\fR parameter in the ARGO directory service
+file to
+something less than the network's maximum transmission
+unit.
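+.pp
+The throughput figure quoted above can be checked with a short
+calculation.
+The sketch below is illustrative only:
+the function names are hypothetical, and it ignores TPDU and
+network-layer header overhead,
+comparing only the largest power of two that fits in an NSDU with
+the NSDU size itself.

```c
#include <assert.h>

/* Largest power of two that is <= n (n must be at least 1). */
static unsigned largest_pow2_le(unsigned n)
{
    unsigned p = 1;

    assert(n >= 1);
    while (p <= n / 2)
        p *= 2;
    return p;
}

/*
 * Percentage of each NSDU left unused when the TPDU size is
 * restricted to a power of two (integer division, rounds down).
 */
static unsigned percent_unused(unsigned nsdu)
{
    unsigned tpdu = largest_pow2_le(nsdu);
    return (nsdu - tpdu) * 100 / nsdu;
}
```

+.pp
+For an NSDU size of 1500 the largest usable power of two is 1024,
+and \fIpercent_unused(1500)\fR evaluates to 31,
+in line with the approximate figure given above.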
+.\"***********************************************************
+.sh 2 "Congestion Window Strategy"
+.pp
+The congestion window strategy from the
+DoD Internet
+was adapted for use with TP.
+The strategy is intended to minimize the
+adverse effect
+of transport's retransmission on an
+already congested network.
+.pp
+A TP entity keeps two notions of the peer's window:
+the real window, which is that advertised by the peer
+in AK TPDUs, and the congestion window, which is a locally
+controlled window.
+TP uses the smaller of the two windows when transmitting.
+The congestion window starts small, which keeps a
+new connection from overloading the network with a sudden
+burst of packets
+immediately after connection establishment.
+This is called \fIslow start\fR.
+For each successful acknowledgment received, the congestion
+window grows by one, until eventually the real window
+is the one in use.
+If a retransmission timer expires, the congestion window
+is reset to size one.
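+.pp
+The window rules just described can be sketched as follows.
+This is not the ARGO code:
+the structure \fIsend_win\fR and the function names are hypothetical,
+the two fields stand in for \fItp_fcredit\fR and \fItp_cong_win\fR,
+and the starting size of one is an assumption
+(the text above says only that the window starts small).

```c
#include <assert.h>

/* Hypothetical stand-in for tp_fcredit and tp_cong_win. */
struct send_win {
    unsigned short fcredit;   /* real window, advertised by the peer */
    unsigned short cong_win;  /* congestion window, locally controlled */
};

/* Slow start: begin with a congestion window of one (assumed value). */
static void win_init(struct send_win *w, unsigned short fcredit)
{
    w->fcredit = fcredit;
    w->cong_win = 1;
}

/* The sender uses the smaller of the two windows. */
static unsigned short usable_window(const struct send_win *w)
{
    return w->cong_win < w->fcredit ? w->cong_win : w->fcredit;
}

/* Successful acknowledgment: grow by one until the real window governs. */
static void on_ack(struct send_win *w)
{
    assert(w->cong_win >= 1);
    if (w->cong_win < w->fcredit)
        w->cong_win++;
}

/* Retransmission timer expired: fall back to a window of one. */
static void on_rexmt_timeout(struct send_win *w)
{
    w->cong_win = 1;
}
```

+.pp
+With a real window of four, the usable window grows from one to four
+over three acknowledgments and stays there
+until a retransmission timer expires.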
+.pp
+The congestion window strategy is used for class 4 unless
+the transport user requests that it not be used.
+The slow start strategy is used for traffic over a PDN
+unless
+the transport user requests that it not be used.
+Slow start is not used for traffic over a LAN unless
+its use is requested by the transport user.
+.\"***********************************************************
+.sh 2 "Retransmission strategies"
+.pp
+A retransmission timer is invoked for each set of DT TPDUs
+sent in one send operation (call to \fItp_send()\fR).
+This set of packets is called the \fIsend window\fR for the purpose
+of this discussion.
+.pp
+The number of TPDUs
+in a send window
+depends on the remote credit and the amount of data
+in the local send buffers.
+When a retransmission timer goes off, the lower
+window edge
+is reevaluated but the upper window edge is not reevaluated.
+.pp
+There are several retransmission strategies implemented in
+ARGO TP.
+The choice of strategies is the user's, and is made with the
+\fIsetsockopt()\fR system call.
+The strategies are summarized here:
+.ip "Retransmit LWE TPDU only:" 5
+Only the TPDU representing the new lower window edge
+is retransmitted.
+This is the default retransmission strategy.
+.ip "Retransmit whole send window:" 5
+Retransmission begins with the new lower window edge
+and continues up to the old upper window edge.
+.pp
+The value of the data retransmission timer
+adapts to the average round trip time and the standard deviation of
+the round trip time.
+A round trip time is the time that passes between
+the moment of a packet's first transmission and
+the moment it is first acknowledged.
+The average round trip time
+is kept by the sending side of TP, using
+a formula for
+smoothing the average:
+.(b
+\fC
+.TS
+tab(+);
+l l l l.
+#define+TP_RTT_ALPHA+3
+#define+TP_RTV_ALPHA+2
++++
+#define+SMOOTH(alpha, old, new) \\
++(((new-old) >> alpha ) \+ (old) )
+.TE
+\fR
+.)b
+.lp
+The times included in the average are chosen as follows.
+The time of
+each packet's initial transmission is kept (for the last
+\fIN\fR packets, where \fIN\fR is a defined constant).
+When an AK TPDU arrives, ARGO TP subtracts the initial transmission
+time for the lowest unacknowledged sequence number that was
+acknowledged by this AK TPDU from the current time,
+and applies the resulting time to the average.
+Hence, not all packets are included in this average,
+which is as it should be since
+the purpose of this measurement is
+to find a good value for the retransmission timer.
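+.pp
+A worked example of the smoothing formula follows.
+It is a sketch under stated assumptions, not the ARGO code:
+times are plain integer ticks rather than \fIstruct timeval\fR,
+the deviation sample is taken to be the absolute difference between
+the new measurement and the old average,
+and the structure \fIrtt_est\fR and function \fIrtt_update()\fR
+are hypothetical.
+The macro arguments are parenthesized here for safety.

```c
#include <assert.h>

#define TP_RTT_ALPHA 3
#define TP_RTV_ALPHA 2

/*
 * Moves old toward newval by 1/2^alpha of their difference.
 * Assumes an arithmetic right shift when the difference is negative.
 */
#define SMOOTH(alpha, old, newval) \
    ((((newval) - (old)) >> (alpha)) + (old))

/* Hypothetical stand-in for tp_rtt and tp_rtv, in integer ticks. */
struct rtt_est {
    int rtt;  /* smoothed average round trip time */
    int rtv;  /* smoothed deviation of round trip time */
};

/* Fold one round trip measurement into both averages. */
static void rtt_update(struct rtt_est *e, int sample)
{
    int dev;

    assert(sample >= 0);
    dev = sample - e->rtt;          /* deviation from the old average */
    if (dev < 0)
        dev = -dev;
    e->rtt = SMOOTH(TP_RTT_ALPHA, e->rtt, sample);
    e->rtv = SMOOTH(TP_RTV_ALPHA, e->rtv, dev);
}
```

+.pp
+Starting from an average of 100 ticks and a deviation of 8,
+a measurement of 180 ticks moves the average one eighth of the way
+to 110 and the deviation one quarter of the way
+(from 8 toward 80) to 26.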
+.pp
+Each time part of a window is retransmitted,
+the retransmission timer for that window is increased.
+This does not affect the retransmission timers for other windows.
+.\"***********************************************************
+.sh 2 "Acknowledgment strategies"
+.pp
+The transport protocol specification
+requires acknowledgments to be sent immediately
+upon receipt
+of CC TPDUs (in class 4), XPD TPDUs, and DT TPDUs containing an
+EOT marker, and at other times as required for flow control;
+otherwise acknowledgments may be delayed.
+In addition to the times when an acknowledgment is required,
+ARGO TP transmits an AK TPDU whenever the user receives some data,
+thereby increasing the size of the window.
+For those times when
+immediate acknowledgment is optional,
+ARGO TP offers two acknowledgment strategies:
+.ip " Acknowledge each TPDU" 10
+Upon receipt of a DT TPDU an AK TPDU is sent.
+.ip " Acknowledge full window" 10
+Acknowledgment is issued
+upon receipt of enough data to
+consume the last advertised credit.
+.pp
+The latter strategy
+requires a timer to trigger an acknowledgment
+in case the peer doesn't send the entire window
+quickly.
+This timer is called the
+\fIsendack timer\fR.
+The upper bound on the value of this timer
+is called the \fIlocal acknowledgment time\fR.
+The local acknowledgment time may be "advertised" to the
+peer during connection establishment, and the
+peer may choose to use this value to
+adjust its retransmission timers.
+The ARGO TP entity advertises its local acknowledgment time
+on a CR TPDU, but it is not
+constrained by
+the remote acknowledgment time, should the peer
+advertise it.
+Instead,
+ARGO TP adapts its sendack timer
+to the behavior of the connection.
+.pp
+Under the assumption that the round trip time is
+often
+symmetric,
+and lacking
+a method to measure
+the round trip time in the other direction,
+ARGO TP uses the measured average round trip time
+to adjust the sendack timer.
+.pp
+The choice of strategies is made with the
+\fIsetsockopt()\fR system call.
+The default strategy is
+to
+delay acknowledgments until the most recently advertised window is filled.
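+.pp
+The two acknowledgment strategies can be sketched as a single
+decision made on receipt of each DT TPDU.
+The sketch is illustrative only:
+the names \fIack_strat\fR, \fIack_state\fR and \fIon_dt_received()\fR
+are hypothetical,
+the \fImust_ack\fR argument stands for the cases in which the
+specification requires an immediate acknowledgment,
+and the sendack timer is assumed to be handled by the caller.

```c
#include <assert.h>

/* Hypothetical names for the two strategies. */
enum ack_strat { ACK_EACH, ACK_WINDOW };

struct ack_state {
    enum ack_strat strat;
    unsigned short advertised;  /* credit in the last AK sent */
    unsigned short received;    /* DT TPDUs received since that AK */
};

/*
 * Called for each DT TPDU received.  Returns 1 if an AK TPDU should
 * be sent now; on 0 the caller is assumed to leave the sendack timer
 * running so that an AK is eventually sent anyway.
 */
static int on_dt_received(struct ack_state *a, int must_ack)
{
    assert(a->strat == ACK_EACH || a->strat == ACK_WINDOW);
    a->received++;
    if (must_ack || a->strat == ACK_EACH ||
        a->received >= a->advertised) {
        a->received = 0;        /* a new AK starts a new count */
        return 1;
    }
    return 0;
}
```

+.pp
+With the full-window strategy and an advertised credit of three,
+the first two DT TPDUs produce no acknowledgment and the third does,
+unless one of them requires an immediate acknowledgment.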