path: root/sys/netinet/tcp_subr.c
Commit message | Author | Age | Files | Lines
...
* Switch to using a struct xucred instead of a struct ucred when not actually in the kernel. This structure is a different size than what is currently in -CURRENT, but should hopefully be the last time any application breakage is caused there. As soon as any major inconveniences are removed, the definition of the in-kernel struct ucred should be conditionalized upon defined(_KERNEL).
  This also changes struct export_args to remove dependency on the constantly-changing struct ucred, as well as limiting the bounds of the size fields to the correct size. This means:
  a) mountd and friends won't break all the time,
  b) mountd and friends won't crash the kernel all the time if they don't know what they're doing wrt actual struct export_args layout.
  Reviewed by: bde
  Author: green | Age: 2001-02-18 | Files: 1 | Lines: -5/+16
* Remove unneeded loop increment in src/sys/netinet/in_pcb.c:in_pcbnotify
  Add new PRC_UNREACH_ADMIN_PROHIB in sys/sys/protosw.h
  Remove condition on TCP in src/sys/netinet/ip_icmp.c:icmp_input
  In src/sys/netinet/ip_icmp.c:icmp_input set code = PRC_UNREACH_ADMIN_PROHIB or PRC_UNREACH_HOST for all unreachables except ICMP_UNREACH_NEEDFRAG
  Rename sysctl icmp_admin_prohib_like_rst to icmp_unreach_like_rst to reflect the fact that we also react to ICMP unreachables that are not administratively prohibited. Also update the comments to reflect this.
  In sys/netinet/tcp_subr.c:tcp_ctlinput add code to treat PRC_UNREACH_ADMIN_PROHIB and PRC_UNREACH_HOST differently.
  PR: 23986
  Submitted by: Jesper Skriver <jesper@skriver.dk>
  Author: phk | Age: 2001-02-18 | Files: 1 | Lines: -12/+43
* Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details.
  Created with: sed(1)
  Reviewed by: md5(1)
  Author: phk | Age: 2001-02-04 | Files: 1 | Lines: -4/+4
* Update the "icmp_admin_prohib_like_rst" code to check the tcp-window and to be configurable with respect to acting only in SYN or in all TCP states.
  PR: 23665
  Submitted by: Jesper Skriver <jesper@skriver.dk>
  Author: phk | Age: 2000-12-24 | Files: 1 | Lines: -9/+60
* We currently do not react to ICMP administratively prohibited messages sent by routers when they deny our traffic. This causes a timeout when trying to connect to TCP ports/services on a remote host that is blocked by routers or firewalls.
  rfc1122 (Requirements for Internet Hosts) section 3.2.2.1 actually requires that, for a TCP session, we treat such a message as if we had received a RST.
  quote begin.
    A Destination Unreachable message that is received MUST be reported to the transport layer. The transport layer SHOULD use the information appropriately; for example, see Sections 4.1.3.3, 4.2.3.9, and 4.2.4 below. A transport protocol that has its own mechanism for notifying the sender that a port is unreachable (e.g., TCP, which sends RST segments) MUST nevertheless accept an ICMP Port Unreachable for the same purpose.
  quote end.
  I've written a small extension that implements this; it also creates a sysctl "net.inet.tcp.icmp_admin_prohib_like_rst" to control whether this new behaviour is activated.
  When it's activated (set to 1) we'll treat an ICMP administratively prohibited message (icmp type 3 code 9, 10 and 13) for a TCP session as if we had received a TCP RST, but only if the TCP session is in SYN_SENT state.
  The reason for only reacting when in SYN_SENT state is that this will solve the problem, and at the same time minimize the risk of this being abused.
  I suggest that we enable this new behaviour by default, but it would be a change of current behaviour, so if people prefer to leave it disabled by default, at least for now, this would be ok for me; the attached diff actually has the sysctl set to 0 by default.
  PR: 23086
  Submitted by: Jesper Skriver <jesper@skriver.dk>
  Author: phk | Age: 2000-12-16 | Files: 1 | Lines: -0/+25
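  The decision described in the 2000-12-16 entry can be sketched in user-space C as follows. This is only an illustration of the stated rule (sysctl enabled, ICMP type 3 with code 9, 10 or 13, connection in SYN_SENT); every identifier below is a hypothetical name made up for the sketch, and it is not the code from the commit itself.

```c
/* ICMP type and codes quoted in the commit message (type 3, codes 9, 10, 13). */
#define EX_ICMP_UNREACH           3
#define EX_UNREACH_NET_PROHIB     9
#define EX_UNREACH_HOST_PROHIB    10
#define EX_UNREACH_FILTER_PROHIB  13

/* Hypothetical, reduced set of TCP states; the real state machine has more. */
enum ex_tcp_state { EX_TCPS_SYN_SENT, EX_TCPS_ESTABLISHED };

/*
 * Return 1 if the ICMP message should be handled like a received RST,
 * 0 otherwise.  sysctl_enabled stands in for the value of
 * net.inet.tcp.icmp_admin_prohib_like_rst.
 */
int
ex_treat_like_rst(int sysctl_enabled, int icmp_type, int icmp_code,
    enum ex_tcp_state state)
{
	if (!sysctl_enabled)
		return 0;		/* feature switched off */
	if (state != EX_TCPS_SYN_SENT)
		return 0;		/* only act while connecting */
	if (icmp_type != EX_ICMP_UNREACH)
		return 0;		/* not a destination unreachable */

	switch (icmp_code) {
	case EX_UNREACH_NET_PROHIB:
	case EX_UNREACH_HOST_PROHIB:
	case EX_UNREACH_FILTER_PROHIB:
		return 1;		/* administratively prohibited */
	default:
		return 0;
	}
}
```

  Restricting the reaction to SYN_SENT means a forged ICMP message cannot tear down an already established connection, which is the abuse risk the message mentions.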
* Revert the last commit to the callout interface, and add a flag to callout_init() indicating whether the callout is safe or not. Update the callers of callout_init() to reflect the new interface.
  Okayed by: Jake
  Author: jlemon | Age: 2000-11-25 | Files: 1 | Lines: -5/+5
* Convert all users of fldoff() to offsetof(). fldoff() is bad because it only takes a struct tag which makes it impossible to use unions, typedefs etc.
  Define __offsetof() in <machine/ansi.h>
  Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
  Remove myriad of local offsetof() definitions.
  Remove includes of <stddef.h> in kernel code.
  NB: Kernelcode should *never* include from /usr/include !
  Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
  Deprecate <struct.h> with a warning. The warning turns into an error on 01-12-2000 and the file gets removed entirely on 01-01-2001.
  Partial reviews by: various.
  Significant brucifications by: bde
  Author: phk | Age: 2000-10-27 | Files: 1 | Lines: -1/+0
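  The limitation named in the 2000-10-27 entry can be illustrated with a small user-space sketch. offsetof() takes a type name, so it works on a typedef that has no struct tag at all, which a macro that pastes `struct` in front of a tag cannot express. The type and function names here are hypothetical, and the sketch includes <stddef.h> because it is user-space code (the entry notes that kernel code should not).

```c
#include <stddef.h>

/* Hypothetical record type: a typedef with no struct tag and a union member. */
typedef struct {
	int id;
	union {
		long   l;
		double d;
	} value;
	char kind;
} ex_record_t;

/* Offset of the first member; always 0 for a struct. */
size_t
ex_offset_of_id(void)
{
	return offsetof(ex_record_t, id);
}

/* Offset of the union member; depends on the platform's alignment rules. */
size_t
ex_offset_of_value(void)
{
	return offsetof(ex_record_t, value);
}
```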
* be careful on mbuf overrun on ctlinput. short icmp6 packet may be able to panic the kernel. sync with kame.
  Author: itojun | Age: 2000-10-23 | Files: 1 | Lines: -1/+6
* Use stronger random number generation for TCP_ISSINCR and tcp_iss.
  Reviewed by: peter, jlemon
  Author: kris | Age: 2000-09-29 | Files: 1 | Lines: -1/+1
* Finally make do_tcpdrain sysctl live under correct parent, _net_inet_tcp, as opposed to _debug. Like before, default value remains 1.
  Author: bmilekic | Age: 2000-09-25 | Files: 1 | Lines: -2/+2
* When a connection is being dropped due to a listen queue overflow, delete the cloned route that is associated with the connection. This keeps the routing table memory from being exhausted when the system is under a SYN flood attack. The route entry is not deleted if there is any prior information cached in it.
  Reviewed by: Peter Wemm, asmodai
  Author: jayanth | Age: 2000-07-21 | Files: 1 | Lines: -0/+12
* sync with kame tree as of july00. tons of bug fixes/improvements.
  API changes:
  - additional IPv6 ioctls
  - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)
  Author: itojun | Age: 2000-07-04 | Files: 1 | Lines: -8/+4
* Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
  Pointed out by: bde
  Author: phk | Age: 2000-07-04 | Files: 1 | Lines: -3/+3
* Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
  Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our sources:
    -sysctl_vm_zone SYSCTL_HANDLER_ARGS
    +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
  Author: phk | Age: 2000-07-03 | Files: 1 | Lines: -3/+3
* Initialize th_sum before in6_cksum(), again. Without this fix, every IPv6 TCP RST packet has a wrong cksum value, so an IPv6 connect() attempt to a 5.0 machine won't fail until the TCP connect timeout, when it should fail soon.
  Thanks to haro@tk.kubota.co.jp (Munehiro Matsuda) for his much debugging help and detailed info.
  Author: shin | Age: 2000-04-19 | Files: 1 | Lines: -0/+1
* Add support for offloading IP/TCP/UDP checksums to NIC hardware which supports them.
  Author: jlemon | Age: 2000-03-27 | Files: 1 | Lines: -22/+19
* Limit the maximum permissible TCP window size to 65535 octets if window scaling is disabled.
  PR: kern/16914
  Submitted by: Jayanth Vijayaraghavan <jayanth@yahoo-inc.com>
  Reviewed by: wollman
  Approved by: jkh
  Author: ps | Age: 2000-02-28 | Files: 1 | Lines: -1/+4
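  The rule in the 2000-02-28 entry follows from the TCP header's 16-bit window field: without the window scale option, no value above 65535 can be advertised. A minimal sketch of such a clamp is shown below; the macro and function names are hypothetical and this is not the code from the commit.

```c
/* Largest value that fits in the 16-bit TCP window field. */
#define EX_TCP_MAXWIN_UNSCALED 65535UL

/*
 * Hypothetical helper: return the window that may be advertised.
 * With scaling disabled the value is capped at 65535 octets;
 * with scaling enabled it is passed through unchanged.
 */
unsigned long
ex_clamp_window(unsigned long win, int window_scaling_enabled)
{
	if (!window_scaling_enabled && win > EX_TCP_MAXWIN_UNSCALED)
		return EX_TCP_MAXWIN_UNSCALED;
	return win;
}
```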
* Fix the bug that IPv4 ttl is not initialized when AF_INET6 socket is used for IPv4 communication (IPv4 mapped IPv6 addr). Also removed IPv6 hoplimit initialization because it is always done at tcp_output.
  Confirmed by: Bernd Walter <ticso@cicely5.cicely.de>
  Author: shin | Age: 2000-01-25 | Files: 1 | Lines: -8/+5
* Fixed the problem that IPsec connection hangs when bigger data is sent.
  - opt_ipsec.h was missing on some tcp files (sorry for basic mistake)
  - made buildable as above fix
  - also added some missing IPv4 mapped IPv6 addr consideration into ipsec4_getpolicybysock
  Author: shin | Age: 2000-01-15 | Files: 1 | Lines: -0/+1
* Added missing 'else' for 'if (isipv6)' at IPv6 length setting in tcp_respond(). Because of this bug, IPv6 resets were not sent. (I checked for the same kind of bug elsewhere, but found no others.)
  Author: shin | Age: 2000-01-15 | Files: 1 | Lines: -1/+1
* Removed wrong (unnecessary) & operators for pointer, in ipsec_hdrsiz_tcp(). This must be one of the reasons why connections over IPsec hang for bigger packets (which was reported on freebsd-current@freebsd.org). But there still seems to be another bug and the problem is not yet fixed.
  Author: shin | Age: 2000-01-15 | Files: 1 | Lines: -2/+2
* Clear rt after RTFREE. This might sometimes have caused a kernel panic at rtfree() in an INET6-enabled environment.
  Author: shin | Age: 2000-01-13 | Files: 1 | Lines: -1/+4
* removed incorrect ip6 length setting for IPv6 tcp reset packet.
  Author: shin | Age: 2000-01-13 | Files: 1 | Lines: -1/+0
* tcp updates to support IPv6. also a small patch to sys/nfs/nfs_socket.c, because the max_hdr size changed.
  Reviewed by: freebsd-arch, cvs-committers
  Obtained from: KAME project
  Author: shin | Age: 2000-01-09 | Files: 1 | Lines: -68/+500
* Make tcp_drain() actually do something. When invoked (usually as a desperation measure in low-memory situations), walk the tcpbs and flush the reassembly queues.
  This behaviour is currently controlled by the debug.do_tcpdrain sysctl (defaults to on).
  Submitted by: Bosko Milekic <bmilekic@dsuper.net>
  Reviewed by: wollman
  Author: msmith | Age: 1999-12-28 | Files: 1 | Lines: -0/+29
* udp IPv6 support, IPv6/IPv4 tunneling support in kernel, packet divert at kernel for IPv6/IPv4 translator daemon.
  This includes queue related patch submitted by jburkhol@home.com.
  Submitted by: queue related patch from jburkhol@home.com
  Reviewed by: freebsd-arch, cvs-committers
  Obtained from: KAME project
  Author: shin | Age: 1999-12-07 | Files: 1 | Lines: -2/+2
* KAME netinet6 basic part (no IPsec, no V6 Multicast Forwarding, no UDP/TCP for IPv6 yet).
  With this patch, you can assign an IPv6 addr automatically, and can reply to IPv6 ping.
  Reviewed by: freebsd-arch, cvs-committers
  Obtained from: KAME project
  Author: shin | Age: 1999-11-22 | Files: 1 | Lines: -0/+1
* KAME related header files additions and merges. (only those which don't affect c source files so much)
  Reviewed by: cvs-committers
  Obtained from: KAME project
  Author: shin | Age: 1999-11-05 | Files: 1 | Lines: -0/+6
* Change so_cred's type to a ucred, not a pcred. This makes more sense, actually.
  Make a sonewconn3() which takes an extra argument (proc) so new sockets created with sonewconn() from a user's system call get the correct credentials, not just the parent's credentials.
  Author: green | Age: 1999-09-19 | Files: 1 | Lines: -4/+2
* Restructure TCP timeout handling:
  - eliminate the fast/slow timeout lists for TCP and instead use a callout entry for each timer.
  - increase the TCP timer granularity to HZ
  - implement "bad retransmit" recovery, as presented in "On Estimating End-to-End Network Path Properties", by Allman and Paxson.
  Submitted by: jlemon, wollmann
  Author: jlemon | Age: 1999-08-30 | Files: 1 | Lines: -3/+35
* $Id$ -> $FreeBSD$
  Author: peter | Age: 1999-08-28 | Files: 1 | Lines: -1/+1
* Add readonly OID ``net.inet.tcp.tcbhashsize'' so it is possible to discover the size of the TCB hashtable on a running system.
  Author: jlemon | Age: 1999-08-26 | Files: 1 | Lines: -1/+6
* Two new sysctls: net.inet.tcp.getcred and net.inet.udp.getcred. These take a sockaddr_in[2] (local, then remote) and return a struct ucred. Example code for these is at:
    http://www.FreeBSD.org/~green/inetd_ident.patch
    http://www.FreeBSD.org/~green/freebsd4.c (for pidentd)
  Reviewed by: bde
  Author: green | Age: 1999-07-11 | Files: 1 | Lines: -1/+33
* Use the new tunable macros for the net.inet.tcp.tcbhashsize tunable.
  Author: msmith | Age: 1999-07-05 | Files: 1 | Lines: -3/+2
* Close a race window where a tcp socket is closed while tcp_pcblist is copying out tcp socket info, causing a NULL pointer to be dereferenced.
  Author: tegge | Age: 1999-06-16 | Files: 1 | Lines: -2/+7
* Add sysctl descriptions to many SYSCTL_XXXs
  PR: kern/11197
  Submitted by: Adrian Chadd <adrian@FreeBSD.org>
  Reviewed by: billf (spelling/style/minor nits)
  Looked at by: bde (style)
  Author: billf | Age: 1999-05-03 | Files: 1 | Lines: -11/+11
* This Implements the mumbled about "Jail" feature.
  This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do.
  For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers".
  Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname.
  Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors.
  It generally does what one would expect, but setting up a jail still takes a little knowledge.
  A few notes:
  - I have no scripts for setting up a jail, don't ask me for them.
  - The IP number should be an alias on one of the interfaces.
  - mount a /proc in each jail, it will make ps more usable.
  - /proc/<pid>/status tells the hostname of the prison for jailed processes.
  - Quotas are only sensible if you have a mountpoint per prison.
  - There are no provisions for stopping resource-hogging.
  - Some "#ifdef INET" and similar may be missing (send patches!)
  If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome!
  Tools, comments, patches & documentation most welcome.
  Have fun...
  Sponsored by: http://www.rndassociates.com/
  Run for almost a year by: http://www.servetheweb.com/
  Author: phk | Age: 1999-04-28 | Files: 1 | Lines: -2/+2
* Nuke all the stupid ffs() stuff and use powerof2() instead.
  Submitted by: Bruce Evans <bde@zeta.org.au>
  Author: msmith | Age: 1999-02-04 | Files: 1 | Lines: -2/+2
* Fix power-of-2 check for the TCB hash size.
  Submitted by: Brian Feldman <green@unixhelp.org>
  Author: msmith | Age: 1999-02-04 | Files: 1 | Lines: -2/+2
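  The two 1999-02-04 entries concern validating that the TCB hash size is a power of two, so that a bucket index can be taken with a mask rather than a modulo. A minimal user-space sketch follows. EX_POWEROF2 uses the same bit test as the powerof2() macro in <sys/param.h>; the fallback size and the function names are assumptions made for this illustration and are not taken from the commits.

```c
/* Same bit test as powerof2() in <sys/param.h>; note that it also accepts 0. */
#define EX_POWEROF2(x) ((((x) - 1) & (x)) == 0)

/* Hypothetical fallback used when the requested size is unusable. */
#define EX_DEFAULT_HASHSIZE 512U

/* Return the requested size if it is a non-zero power of two, else the fallback. */
unsigned int
ex_checked_hashsize(unsigned int requested)
{
	if (requested == 0 || !EX_POWEROF2(requested))
		return EX_DEFAULT_HASHSIZE;
	return requested;
}

/* With a power-of-two size, (size - 1) is an all-ones mask for the bucket index. */
unsigned int
ex_hash_bucket(unsigned int hash, unsigned int size)
{
	return hash & (size - 1);
}
```

  The explicit zero check is needed because the bit test alone treats 0 as a power of two, and a zero-sized table would make the mask wrap around.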
* Make TCBHASHSIZE a boot-time tunable as well, taking its value from the variable net.inet.tcp.tcbhashsize.
  Requested by: David Filo <filo@yahoo-inc.com>
  Author: msmith | Age: 1999-02-03 | Files: 1 | Lines: -4/+14
* The below patch helps to reduce the leakage of internal socket information when a TCP "stealth" scan is directed at a *BSD box by ensuring the window is 0 for all RST packets generated through tcp_respond()
  Reviewed by: Don Lewis <Don.Lewis@tsc.tdk.com>
  Obtained from: Bugtraq (from: Darren Reed <avalon@COOMBS.ANU.EDU.AU>)
  Author: guido | Age: 1998-11-15 | Files: 1 | Lines: -2/+3
* RFC 1644 has the status "Experimental Protocol", which means:
    4.1.4. Experimental Protocol
    A system should not implement an experimental protocol unless it is participating in the experiment and has coordinated its use of the protocol with the developer of the protocol.
  Pointed out by: Steinar Haug <sthaug@nethelp.no>
  Author: phk | Age: 1998-09-06 | Files: 1 | Lines: -2/+2
* Re-implement tcp and ip fragment reassembly to not store pointers in the ip header which can't work on alpha since pointers are too big.
  Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
  Author: dfr | Age: 1998-08-24 | Files: 1 | Lines: -15/+12
* Convert socket structures to be type-stable and add a version number.
  Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs.
  Convert PF_LOCAL PCB structures to be type-stable and add a version number.
  Define an external format for information about socket structures and use it in several places.
  Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines.
  It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow).
  Author: wollman | Age: 1998-05-15 | Files: 1 | Lines: -25/+92
* Fixed style bugs (mostly) in previous commit.
  Author: bde | Age: 1998-03-28 | Files: 1 | Lines: -5/+7
* Use the zone allocator to allocate inpcbs and tcpcbs. Each protocol creates its own zone; this is used particularly by TCP which allocates both inpcb and tcpcb in a single allocation. (Some hackery ensures that the tcpcb is reasonably aligned.) Also keep track of the number of pcbs of each type allocated, and keep a generation count (instance version number) for future use.
  Author: wollman | Age: 1998-03-24 | Files: 1 | Lines: -8/+46
* Improved connection establishment performance by doing local port lookups via a hashed port list. In the new scheme, in_pcblookup() goes away and is replaced by a new routine, in_pcblookup_local() for doing the local port check. Note that this implementation is space inefficient in that the PCB struct is now too large to fit into 128 bytes. I might deal with this in the future by using the new zone allocator, but I wanted these changes to be extensively tested in their current form first.
  Also:
  1) Fixed off-by-one errors in the port lookup loops in in_pcbbind().
  2) Got rid of some unneeded rehashing. Adding a new routine, in_pcbinshash() to do the initial hash insertion.
  3) Renamed in_pcblookuphash() to in_pcblookup_hash() for easier readability.
  4) Added a new routine, in_pcbremlists() to remove the PCB from the various hash lists.
  5) Added/deleted comments where appropriate.
  6) Removed unnecessary splnet() locking. In general, the PCB functions should be called at splnet()...there are unfortunately a few exceptions, however.
  7) Reorganized a few structs for better cache line behavior.
  8) Killed my TCP_ACK_HACK kludge. It may come back in a different form in the future, however.
  These changes have been tested on wcarchive for more than a month. In tests done here, connection establishment overhead is reduced by more than 50 times, thus getting rid of one of the major networking scalability problems.
  Still to do: make tcp_fasttimo/tcp_slowtimo scale well for systems with a large number of connections. tcp_fasttimo is easy; tcp_slowtimo is difficult.
  WARNING: Anything that knows about inpcb and tcpcb structs will have to be recompiled; at the very least, this includes netstat(1).
  Author: dg | Age: 1998-01-27 | Files: 1 | Lines: -5/+5
* Make TCP_COMPAT_42 a new style option.
  Author: eivind | Age: 1998-01-25 | Files: 1 | Lines: -1/+2
* Fix an incredibly horrible bug in the ipfw code where if you are using the "reset tcp" firewall command, the kernel would write ethernet headers onto random kernel stack locations.
  Fought to the death by: terry, julian, archie.
  fix valid for 2.2 series as well.
  Author: julian | Age: 1997-12-19 | Files: 1 | Lines: -1/+3