path: root/sys/netinet
Commit message (Author, Age, Files, Lines)
* Delay the assignment of a path for DATA chunks until they hit (tuexen, 2010-09-15, files: 8, lines: -191/+112)
| | | | | | | the sent_queue. Honor a given path when the SCTP_ADDR_OVER flag is set. MFC after: 2 weeks.
* Use TAILQ_EMPTY() for testing if a tail queue is empty. (tuexen, 2010-09-15, files: 1, lines: -4/+5)
| | | | Set whoFrom to NULL after freeing whoFrom.
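The TAILQ_EMPTY() idiom named in this commit can be sketched in userland C as follows. This is a minimal illustration, not the SCTP code: the `chunk` type and the `push_chunk()` and `drain_queue()` helpers are hypothetical names invented for the example; only the TAILQ macros from `<sys/queue.h>` are real.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical element type, for illustration only. */
struct chunk {
	int id;
	TAILQ_ENTRY(chunk) link;
};
TAILQ_HEAD(chunk_queue, chunk);

/* Append a new element; returns 0 on success, -1 if malloc fails. */
int
push_chunk(struct chunk_queue *q, int id)
{
	struct chunk *c = malloc(sizeof(*c));

	if (c == NULL)
		return (-1);
	c->id = id;
	TAILQ_INSERT_TAIL(q, c, link);
	return (0);
}

/* Free every element, testing emptiness with TAILQ_EMPTY(). */
int
drain_queue(struct chunk_queue *q)
{
	int freed = 0;

	while (!TAILQ_EMPTY(q)) {
		struct chunk *c = TAILQ_FIRST(q);

		TAILQ_REMOVE(q, c, link);
		free(c);
		freed++;
	}
	return (freed);
}
```

A caller initializes the head with TAILQ_INIT() before pushing; after drain_queue() returns, TAILQ_EMPTY() on the head is true again.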
* Remove unused variable/assignment. (tuexen, 2010-09-15, files: 1, lines: -2/+1)
| | | | MFC after: 2 weeks.
* Remove assignment without effect. (tuexen, 2010-09-15, files: 1, lines: -2/+0)
| | | | MFC after: 2 weeks.
* * Use !TAILQ_EMPTY() for checking if a tail queue is not empty. (tuexen, 2010-09-15, files: 1, lines: -4/+3)
| | | | | | * Remove assignment without any effect. MFC after: 2 weeks.
* Change the default MSS for IPv4 and IPv6 TCP connections from an (andre, 2010-09-15, files: 1, lines: -19/+27)
| | | | | | | | | | | | | | | | | | artificial power-of-2 rounded number to their real values specified in RFC879 and RFC2460. From the history and existing comments it appears that the rounded numbers were intended to be advantageous for the kernel and mbuf system. However, this hasn't been the case for a long time. The mbuf clusters used in tcp_output() have enough space to hold the larger real value for the default MSS for both IPv4 and IPv6. Note that the default MSS is only used when path MTU discovery is disabled. Update and expand related comments. Reviewed by: lstewart (including some word-smithing) MFC after: 2 weeks
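The "real values" can be derived from the two RFCs by subtracting the fixed header sizes from the default IPv4 datagram size (RFC 879) and the IPv6 minimum MTU (RFC 2460). The sketch below shows only that arithmetic; the constant and function names are made up for illustration and are not the identifiers used in the kernel.

```c
/* Sizes in octets. Names are hypothetical, chosen for this example. */
enum {
	EX_IPV4_DEFAULT_DATAGRAM = 576,	/* RFC 879 default datagram size */
	EX_IPV6_MIN_MTU = 1280,		/* RFC 2460 minimum link MTU */
	EX_IPV4_HDR = 20,		/* IPv4 header without options */
	EX_IPV6_HDR = 40,		/* fixed IPv6 header */
	EX_TCP_HDR = 20			/* TCP header without options */
};

/* Default IPv4 MSS: 576 - 20 - 20. */
int
example_default_mss_v4(void)
{
	return (EX_IPV4_DEFAULT_DATAGRAM - EX_IPV4_HDR - EX_TCP_HDR);
}

/* Default IPv6 MSS: 1280 - 40 - 20. */
int
example_default_mss_v6(void)
{
	return (EX_IPV6_MIN_MTU - EX_IPV6_HDR - EX_TCP_HDR);
}
```

The results are 536 and 1220 octets, neither of which is a power of two.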
* Adding an address on an interface also requires the loopback route to (qingli, 2010-09-12, files: 1, lines: -0/+2)
| | | | | | | | that address be installed. PR: kern/150481 Submitted by: Ingo Flaschberger <if at xip.at> MFC after: 5 days
* * Remove code which has no effect. (tuexen, 2010-09-09, files: 1, lines: -108/+61)
| | | | | | * Clean up the handling in sctp_lower_sosend(). MFC after: 3 weeks.
* Fix CARP in backup mode by properly registering its hooks for INET and INET6 (will, 2010-09-06, files: 1, lines: -0/+15)
| | | | | | | | | using ipproto_{un,}register() and the newly created ip6proto_{un,}register() so that it can again receive IPPROTO_CARP packets allowing its state machine to work. Reviewed by: bz Approved by: ken (mentor)
* Fix static kernel builds with carp(4) by changing its SYSINIT order so that (will, 2010-09-06, files: 1, lines: -1/+1)
| | | | | | | | it is initialized after basic protocol initialization, which allows it to register via pf_proto_register(). Reviewed by: bz Approved by: ken (mentor)
* in_delayed_cksum() requires host byte order. (glebius, 2010-09-06, files: 1, lines: -6/+4)
| | | | | Reported by: Alexander Levin <amindomao googlemail.com> MFC after: 1 week
* Implement correct handling of address parameter and (tuexen, 2010-09-05, files: 2, lines: -122/+78)
| | | | | | sendinfo for SCTP send calls. MFC after: 4 weeks.
* Fix some CLANG warnings. One clang warning is left (rrs, 2010-09-05, files: 5, lines: -17/+35)
| | | | | | due to the fact that it's bogus: nam->sa_family will not change from AF_INET6 to AF_INET (but clang thinks it does ;-D)
* In case of RADIX_MPATH do not leak the IN_IFADDR read lock on (bz, 2010-09-04, files: 1, lines: -2/+3)
| | | | | | early return. MFC after: 3 days
* MFp4 CH=183052 183053 183258: (bz, 2010-09-02, files: 2, lines: -12/+8)
| | | | | | | | | | | | | | | | | | | | | In protosw we define pr_protocol as short, while on the wire it is a uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundaries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling. (*) the functions have been without any in-tree consumer since the initial introduction, so this is considered safe. Implement ip6proto_{un,}register() similarly to their legacy IP counterparts to allow modules to hook up dynamically. Reviewed by: philip, will MFC after: 1 week
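The upfront bounds check described here might look roughly like the following. This is a sketch under assumptions, not the kernel function: `example_proto_check()` and `EXAMPLE_PROTO_MAX` are hypothetical names, and the limit of 256 is assumed from the 8-bit on-wire protocol field.

```c
#include <errno.h>

/* Assumed upper bound: the on-wire protocol field is 8 bits wide. */
#define EXAMPLE_PROTO_MAX 256

/*
 * Validate a protocol number passed as short. Returns 0 if it can be
 * registered, or EPROTONOSUPPORT for 0 and for out-of-bounds values.
 */
int
example_proto_check(short proto)
{
	if (proto <= 0 || proto >= EXAMPLE_PROTO_MAX)
		return (EPROTONOSUPPORT);
	return (0);
}
```

Because every rejected value yields the same error code, a caller can treat EPROTONOSUPPORT as "nothing to register" rather than as a hard failure.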
* Fix a bug which results in peer IPv4 addresses a.b.c.d with 224<=d<=239 (tuexen, 2010-09-01, files: 1, lines: -1/+1)
| | | | | | incorrectly being detected as multicast addresses on little endian systems. MFC after: 2 weeks
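A bug of this shape typically arises when a class-D test is applied to an address still in network byte order: on a little-endian host the last octet d then sits where the first octet is expected. The sketch below assumes that cause, which the commit message does not spell out. The function names are hypothetical; `IN_MULTICAST()` and `ntohl()` are the standard macros.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/*
 * addr_net is an IPv4 address in network byte order.
 * IN_MULTICAST() expects host byte order, so convert first.
 */
int
example_is_multicast(uint32_t addr_net)
{
	return (IN_MULTICAST(ntohl(addr_net)) ? 1 : 0);
}

/*
 * Buggy variant for comparison: skips the conversion. On a
 * little-endian host this tests the last octet instead of the first.
 */
int
example_is_multicast_buggy(uint32_t addr_net)
{
	return (IN_MULTICAST(addr_net) ? 1 : 0);
}
```

With the conversion in place, 10.1.2.230 is not multicast and 224.0.0.1 is, regardless of host endianness.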
* o Some programs could send broadcast/multicast traffic to ipfw (maxim, 2010-08-30, files: 1, lines: -2/+21)
| | | | | | | | | | | | | pseudo-interface. This leads to a panic due to uninitialized if_broadcastaddr address. Initialize it and implement ip_output() method to prevent mbuf leak later. ipfw pseudo-interface should never send anything therefore call panic(9) in if_start() method. PR: kern/149807 Submitted by: Dmitrij Tejblum MFC after: 2 weeks
* Fix the SCTP_WITH_NO_CSUM option when used in combination with (tuexen, 2010-08-29, files: 7, lines: -101/+123)
| | | | | | | interface supporting CRC offload. While at it, make use of the feature that the loopback interface provides CRC offloading. MFC after: 4 weeks
* Bugfix: Do not send a packet drop report in response to a received (tuexen, 2010-08-28, files: 1, lines: -3/+6)
| | | | INIT-ACK with incorrect CRC.
* Fix the switching on/off of CMT using sysctl and socket option. (tuexen, 2010-08-28, files: 11, lines: -174/+159)
| | | | | | | | Fix the switching on/off of PF and NR-SACKs using sysctl. Add minor improvement in handling malloc failures. Improve the address checks when sending. MFC after: 4 weeks
* Simplify the tcp pcblist estimate logic slightly. (jhb, 2010-08-27, files: 1, lines: -5/+3)
| | | | MFC after: 3 days
* Use timestamp modulo comparison macro for automatic receive buffer (andre, 2010-08-27, files: 1, lines: -1/+1)
| | | | | | scaling to correctly handle wrapping of ticks value. MFC after: 1 week
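A modulo comparison of this kind subtracts the two counters and inspects the sign of the difference, so that the result stays correct when the tick counter wraps around. The following is a minimal sketch of the technique; `EXAMPLE_TS_GT` and the two functions are hypothetical names, not the macro used by the commit.

```c
#include <stdint.h>

/*
 * True if timestamp a is later than b, modulo 2^32. Valid as long as
 * the two values are less than 2^31 ticks apart.
 */
#define EXAMPLE_TS_GT(a, b) \
	((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)

/* Wrap-safe comparison. */
int
example_ts_after(uint32_t a, uint32_t b)
{
	return (EXAMPLE_TS_GT(a, b) ? 1 : 0);
}

/* Naive comparison, which breaks once the counter wraps. */
int
example_naive_after(uint32_t a, uint32_t b)
{
	return (a > b ? 1 : 0);
}
```

For b just before the wrap (0xFFFFFFF0) and a just after it (5), the wrap-safe version reports a as later while the naive version does not.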
* MFp4: anchie_soc2009 branch: (anchie, 2010-08-19, files: 1, lines: -0/+1)
| | | | | | | | | | | | | | | | | | | | Add kernel side support for Secure Neighbor Discovery (SeND), RFC 3971. The implementation consists of a kernel module that gets packets from the nd6 code, sends them to user space on a dedicated socket and reinjects them back for further processing. Hooks are used from nd6 code paths to divert relevant packets to the send implementation for processing in user space. The hooks are only triggered if the send module is loaded. In case no user space application is connected to the send socket, processing continues normally as if the module were not loaded. Unloading the module is not possible at this time due to missing nd6 locking. The native SeND socket is similar to a raw IPv6 socket but with its own, internal pseudo-protocol. Approved by: bz (mentor)
* If a TCP connection has been idle for one retransmit timeout or more (andre, 2010-08-18, files: 2, lines: -15/+28)
| | | | | | | | | | | | | | | | | | | it must reset its congestion window back to the initial window. RFC3390 has increased the initial window from 1 segment to up to 4 segments. The initial window increase of RFC3390 wasn't reflected into the restart window which remained at its original defaults of 4 segments for local and 1 segment for all other connections. Both values are controllable through sysctl net.inet.tcp.local_slowstart_flightsize and net.inet.tcp.slowstart_flightsize. The increase helps TCP's slow start algorithm to open up the congestion window much faster. Reviewed by: lstewart MFC after: 1 week
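RFC 3390 gives the upper bound on the initial window as min(4*MSS, max(2*MSS, 4380 bytes)). The sketch below computes that bound and applies it after an idle period, following the wording of this commit message. It is an illustration only: the function names are hypothetical, and the kernel itself uses the sysctl-controlled flight sizes mentioned above rather than this exact code.

```c
#include <stdint.h>

static uint32_t
example_min(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

static uint32_t
example_max(uint32_t a, uint32_t b)
{
	return (a > b ? a : b);
}

/* RFC 3390 upper bound on the initial window, in bytes. */
uint32_t
example_rfc3390_iw(uint32_t mss)
{
	return (example_min(4 * mss, example_max(2 * mss, 4380)));
}

/*
 * Hypothetical helper: if the connection was idle for at least one
 * retransmit timeout, fall back to the initial window.
 */
uint32_t
example_cwnd_after_idle(uint32_t cwnd, uint32_t idle, uint32_t rto,
    uint32_t mss)
{
	if (idle >= rto)
		return (example_rfc3390_iw(mss));
	return (cwnd);
}
```

For an MSS of 1460 bytes the bound is 4380 bytes (three segments); for an MSS of 536 it is 2144 bytes (four segments).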
* Untangle the net.inet.tcp.log_in_vain and net.inet.tcp.log_debug (andre, 2010-08-18, files: 3, lines: -5/+29)
| | | | | | | | | | | | | | | | sysctl's and remove any side effects. Both sysctl's share the same backend infrastructure and due to the way it was implemented enabling net.inet.tcp.log_in_vain would also cause log_debug output to be generated. This was surprising and eventually annoying to the user. The log output backend is kept the same but a little shim is inserted to properly separate log_in_vain and log_debug and to remove any side effects. PR: kern/137317 MFC after: 1 week
* When calculating the expected memory size for userspace, also take the (bz, 2010-08-18, files: 1, lines: -1/+1)
| | | | | | | | number of syncache entries into account for the surplus we add to account for a possible increase of records in the re-entry window. Discussed with: jhb, silby MFC after: 1 week
* Ensure a minimum "slop" of 10 extra pcb structures when providing a (jhb, 2010-08-17, files: 4, lines: -8/+9)
| | | | | | | | | memory size estimate to userland for pcb list sysctls. The previous behavior of a "slop" of n/8 does not work well for small values of n (e.g. no slop at all if you have less than 8 open UDP connections). Reviewed by: bz MFC after: 1 week
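The difference between the two slop rules is easiest to see with small counts. The sketch below is one reading of the commit message (slop of n/8 before, at least 10 after); the function names are hypothetical and the kernel's actual expression may differ.

```c
/* Old rule: pad the count by n/8, which is 0 for n < 8. */
int
example_estimate_old(int n)
{
	return (n + n / 8);
}

/* New rule: pad by n/8, but never by fewer than 10 entries. */
int
example_estimate_new(int n)
{
	int slop = n / 8;

	if (slop < 10)
		slop = 10;
	return (n + slop);
}
```

With 4 open connections the old estimate leaves no room for growth (4), while the new one reserves space for 14; for large counts such as 800 both give 900.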
* Fix the interaction between 'ICMP fragmentation needed' MTU updates, (andre, 2010-08-15, files: 2, lines: -6/+6)
| | | | | | | | | | | | | | | | | | | | | | | path MTU discovery and the tcp_minmss limiter for very small MTU's. When the MTU suggested by the gateway via ICMP, or if there isn't any the next smaller step from ip_next_mtu(), is lower than the floor enforced by net.inet.tcp.minmss (default 216) the value is ignored and the default MSS (512) is used instead. However the DF flag in the IP header is still set in tcp_output() preventing fragmentation by the gateway. Fix this by using tcp_minmss as the MSS and clear the DF flag if the suggested MTU is too low. This turns off path MTU discovery for the remainder of the session and allows fragmentation to be done by the gateway. Only MTU's smaller than 256 are affected. The smallest official MTU specified is for AX.25 packet radio at 256 octets. PR: kern/146628 Tested by: Matthew Luckie <mjl-at-luckie org nz> MFC after: 1 week
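The fixed behavior can be sketched as a small decision function. This is an illustration under assumptions, not the kernel code: the struct, the function name, and the choice to compare the suggested MTU minus a header length against the floor are all hypothetical.

```c
/* Hypothetical result type for this example. */
struct example_mss_decision {
	int mss;	/* MSS to use from now on */
	int df;		/* 1 = keep DF set, 0 = allow fragmentation */
};

/*
 * Decide the MSS after an 'ICMP fragmentation needed' message.
 * hdr_len is the combined IP and TCP header length; minmss is the
 * floor (net.inet.tcp.minmss, default 216 per the commit message).
 */
struct example_mss_decision
example_apply_suggested_mtu(int suggested_mtu, int hdr_len, int minmss)
{
	struct example_mss_decision d;

	d.mss = suggested_mtu - hdr_len;
	d.df = 1;
	if (d.mss < minmss) {
		/* Too small: clamp to the floor and stop path MTU discovery. */
		d.mss = minmss;
		d.df = 0;
	}
	return (d);
}
```

A suggested MTU of 1500 keeps DF set with an MSS of 1460, whereas a suggested MTU of 200 falls back to the floor of 216 and clears DF so the gateway may fragment.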
* Initializing the new error variable to zero in syncache_socket() (andre, 2010-08-15, files: 1, lines: -1/+1)
| | | | | | is not necessary. Noticed by: bz
* Add more logging points for failures in syncache_socket() to (andre, 2010-08-15, files: 1, lines: -5/+24)
| | | | | | | | | report when a new socket couldn't be created because one of in_pcbinshash(), in6_pcbconnect() or in_pcbconnect() failed. Logging is conditional on net.inet.tcp.log_debug being enabled. MFC after: 1 week
* When using TSO and sending more than TCP_MAXWIN sendalot is set (andre, 2010-08-14, files: 1, lines: -2/+5)
| | | | | | | | | | | | | | | and we loop back to 'again'. If the remainder is less or equal to one full segment, the TSO flag was not cleared even though it isn't necessary anymore. Enabling the TSO flag on a segment that doesn't require any offloaded segmentation by the NIC may cause confusion in the driver or hardware. Reset the internal tso flag in tcp_output() on every iteration of sendalot. PR: kern/132832 Submitted by: Renaud Lienhart <renaud-at-vmware com> MFC after: 1 week
* Change the messages of the ICMP bad port bandwidth limiter from (andre, 2010-08-14, files: 1, lines: -1/+2)
| | | | | | | | | | | | | a kernel printf to a log output with the priority of LOG_NOTICE. This way the messages still show up in /var/log/messages but no longer spam the console every other second on busy servers that are port scanned: "Limiting open port RST response from 114 to 100 packets/sec" PR: kern/147352 Submitted by: Eugene Grosbein <eugen-at-eg sd rdtc ru> MFC after: 1 week
* Disable TCP inflight limiter by default. (andre, 2010-08-14, files: 1, lines: -1/+1)
| | | | | | | | | | | | | | | | | It was experimental and interferes with the normal congestion control algorithms by instating a separate, possibly lower, ceiling for the amount of data that is in flight to the remote host. With high speed internet connections the inflight limit frequently has been estimated too low due to the noisy nature of the RTT measurements. This code gives way for the upcoming pluggable congestion control framework. It is the task of the congestion control algorithm to set the congestion window and amount of inflight data without external interference. Reviewed by: lstewart MFC after: 1 week Removal after: 1 month
* Unbreak LINT by moving all carp hooks to net/if.c / netinet/ip_carp.h, with (will, 2010-08-11, files: 3, lines: -24/+23)
| | | | | | | the appropriate ifdefs. Reviewed by: bz Approved by: ken (mentor)
* Allow carp(4) to be loaded as a kernel module. Follow precedent set by (will, 2010-08-11, files: 6, lines: -77/+167)
| | | | | | | | | | | | | | | bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks
* Address an edge condition that we found at work, where the carp(4) (delphij, 2010-08-08, files: 1, lines: -1/+2)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interface issues LINK_UP, then LINK_DOWN, then LINK_UP at cold boot. This behavior is not observed when the carp(4) interface is created slightly later, when the underlying interface is fully up. Before this change, what happens at boot is roughly: - ifconfig creates em0 interface; - ifconfig clones a carp device using em0; (em0's link state is DOWN at this point) - carp state: INIT -> BACKUP [*] - carp state: BACKUP -> MASTER - [Some negotiation between em0 and switch] - em0 kicks up link state change event (em0's link state is now UP at this point) - do_link_state_change() -> carp_carpdev_state() - carp state: MASTER -> INIT (via carp_set_state(sc, INIT)) [+] - carp state: INIT -> BACKUP - carp state: BACKUP -> MASTER At the [*] stage, em0 did not receive any broadcast message from another node and assumes our node is the master, thus carp(4) sets the link state to "UP" after becoming a master. At [+], the master status is forcibly set to "INIT", then an election is held, after which our node would actually become a master. We believe that at the [*] stage, the master status should remain as "INIT" since the underlying parent interface's link state is not up. Obtained from: iXsystems, Inc. Reported by: jpaetzel MFC after: 2 months
* Don't use struct timezone. (ed, 2010-08-08, files: 1, lines: -4/+2)
| | | | | The timezone structure acquired by gettimeofday() is not used at all. Just remove it.
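When the timezone result is not needed, gettimeofday() accepts NULL as its second argument, so no struct timezone has to be declared at all. A minimal userland sketch (the helper name `example_now_usec()` is hypothetical):

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Return the current time in microseconds since the Epoch,
 * or -1 if gettimeofday() fails. No struct timezone is involved.
 */
long long
example_now_usec(void)
{
	struct timeval tv;

	if (gettimeofday(&tv, NULL) != 0)
		return (-1);
	return ((long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec);
}
```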
* Fix a bug where endpoints bound to wildcard addresses were (tuexen, 2010-08-05, files: 1, lines: -0/+36)
| | | | | | | using addresses not announced to the peer due to address scoping. MFC after: 3 weeks
* Clean up code. (tuexen, 2010-08-01, files: 1, lines: -2/+1)
| | | | MFC after: 2 weeks
* Document the mandatory argument to the arptimer() and (bz, 2010-07-31, files: 1, lines: -5/+3)
| | | | | | | | | | | | | | | | nd6_llinfo_timer() functions with a KASSERT(). Note: there is no need to return after panic. In the legacy IP case, only assign the arg after the check, in the IPv6 case, remove the extra checks for the table and interface as they have to be there unless we freed and forgot to cancel the timer. It doesn't matter anyway as we would panic on the NULL pointer deref immediately and the bug is elsewhere. This unifies the code of both address families to some extent. Reviewed by: rwatson MFC after: 6 days
* MFp4 @181628: (bz, 2010-07-31, files: 1, lines: -6/+20)
| | | | | | | | | | Free the rtentry after we disconnected it from the FIB and are counting it as rttrash. There might still be a chance we leak it from a different code path but there is nothing we can do about this here. Sponsored by: ISPsystem (in February) Reviewed by: julian (in February) MFC after: 2 weeks
* Fix a bug in syncache where the initial CWND for new incoming connections (andre, 2010-07-30, files: 1, lines: -1/+2)
| | | | | | | | | | | | | | | | | | was limited to one segment under the faulty assumption of a retransmit. Due to this the opportunity to initialize the increased congestion window according to RFC3390 was missed. Support for RFC3465 introduced in r187289 uncovered the bug as the ACK to SYN/ACK no longer caused snd_cwnd increase by MSS (actually, this increase shouldn't happen as it's explicitly forbidden by RFC3390, but it's another issue). Snd_cwnd remains really small (1*MSS + 1) and this causes really bad interaction with delayed acks on other side. The variable name sc_rxmits is a bit misleading as it counts all transmits, not just retransmits. Submitted by: Maxim Dounin <mdounin-at-mdounin-dot-ru> MFC after: 10 days
* Fix the comment block that has the nice (rrs, 2010-07-29, files: 1, lines: -12/+22)
| | | | | | table to really have the nice table :-) MFC after: 1 month
* PR-SCTP bugs. Basically a full-sized frame of (rrs, 2010-07-29, files: 4, lines: -31/+48)
| | | | | | | | PR-SCTP FWD-TSNs would not be sent, causing a stalled connection. The rwnd calculation was also off on the receiver side for PR-SCTP. MFC after: 1 month
* Fix operation of "netgraph" action in conjunction with the (glebius, 2010-07-27, files: 1, lines: -0/+2)
| | | | | | | | | net.inet.ip.fw.one_pass sysctl. The "ngtee" action is still broken. PR: kern/148885 Submitted by: Nickolay Dudorov <nnd mail.nsk.ru>
* Fix a bug where the length of a FORWARD-TSN chunk was set incorrectly in (tuexen, 2010-07-26, files: 1, lines: -4/+2)
| | | | | | | the chunk. This resulted in malformed frames. Remove a duplicate assignment. MFC after: 2 weeks
* Make sure that we report chunks if a socket (rrs, 2010-07-26, files: 1, lines: -7/+29)
| | | | | | | | still exists that were not sent. In either case carefully remove the data if it does not get taken by the reporting routines. MFC after: 2 weeks
* When counting the number of chunks in the (rrs, 2010-07-26, files: 1, lines: -13/+20)
| | | | | | | retransmission queue to validate the retran count, we need to include the chunks in the control send queue too. Otherwise the count will not match and you will get the invariant warning if invariants are on. MFC after: 2 weeks
* - Move common code from the hook functions that fills in a packet node struct to (lstewart, 2010-07-18, files: 1, lines: -115/+87)
| | | | | | | | | | | | | a separate inline function. This further reduces duplicate code that didn't have a good reason to stay as it was. - Reorder the malloc of a pkt_node struct in the hook functions such that it only occurs if we managed to find a usable tcpcb associated with the packet. - Make the inp_locally_locked variable's type consistent with the prototype of siftr_siftdata(). Sponsored by: FreeBSD Foundation
* machine/cpu.h isn't appropriate for this file, so remove it (imp, 2010-07-16, files: 1, lines: -1/+0)