path: root/sys/netinet
Commit message | Author | Date | Files | Lines
* Add ipfw support for setting/matching DiffServ codepoints (DSCP). | melifaro | 2013-03-20 | 1 | -0/+3
  Setting DSCP support is done via O_SETDSCP which works for both IPv4 and
  IPv6 packets. Fast checksum recalculation (RFC 1624) is done for IPv4.
  DSCP can be specified by name (AFXY, CSX, BE, EF), by value (0..63) or
  via tablearg.

  Matching DSCP is done via another opcode (O_DSCP) which accepts several
  classes at once (af11,af22,be). Classes are stored in a bitmask (2 u32
  words).

  Many people made their variants of this patch, the ones I'm aware of are
  (in alphabetical order):
    Dmitrii Tejblum
    Marcelo Araujo
    Roman Bogorodskiy (novel)
    Sergey Matveichuk (sem)
    Sergey Ryabin

  PR: kern/102471, kern/121122
  MFC after: 2 weeks
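
The two mechanisms named in the entry above can be sketched in userland C. This is a minimal illustration, not the ipfw code: it assumes a 20-byte IPv4 header held as raw bytes, and the helper names (`fold16`, `cksum_full`, `set_dscp_v4`, `dscp_mask_add`, `dscp_mask_match`) and the bit layout inside the two 32-bit words are hypothetical. The checksum patch follows RFC 1624, equation 3: HC' = ~(~HC + ~m + m').

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full one's-complement checksum over len bytes (len even), big-endian words. */
static uint16_t cksum_full(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((p[i] << 8) | p[i + 1]);
    return (uint16_t)~fold16(sum);
}

/* Set the DSCP of an IPv4 header and patch its checksum incrementally. */
static void set_dscp_v4(uint8_t *hdr, uint8_t dscp)
{
    assert(dscp < 64);
    uint16_t old_word = (uint16_t)((hdr[0] << 8) | hdr[1]);
    hdr[1] = (uint8_t)((dscp << 2) | (hdr[1] & 0x03)); /* keep the ECN bits */
    uint16_t new_word = (uint16_t)((hdr[0] << 8) | hdr[1]);
    uint16_t old_ck = (uint16_t)((hdr[10] << 8) | hdr[11]);

    /* RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m') */
    uint32_t sum = (uint16_t)~old_ck;
    sum += (uint16_t)~old_word;
    sum += new_word;
    uint16_t new_ck = (uint16_t)~fold16(sum);
    hdr[10] = (uint8_t)(new_ck >> 8);
    hdr[11] = (uint8_t)(new_ck & 0xff);
}

/* Match side: 64 possible codepoints kept as a bitmask in two 32-bit words.
 * The word/bit layout here is an assumption made for this sketch. */
static void dscp_mask_add(uint32_t mask[2], uint8_t dscp)
{
    assert(dscp < 64);
    mask[dscp / 32] |= (uint32_t)1 << (dscp % 32);
}

static int dscp_mask_match(const uint32_t mask[2], uint8_t dscp)
{
    assert(dscp < 64);
    return (int)((mask[dscp / 32] >> (dscp % 32)) & 1u);
}
```

Only the 16-bit word that holds the TOS byte and the checksum field are touched, so the cost does not depend on the header length. The standard codepoint values are AF11 = 10, AF22 = 20, BE = 0 and EF = 46.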
* In m_megapullup() instead of reserving some space at the end of packet, | glebius | 2013-03-17 | 1 | -10/+6
  m_align() it, reserving space to prepend data.
  Reviewed by: mav
* - Replace compat macros with function calls. | glebius | 2013-03-16 | 3 | -3/+3
* We can, and should use M_WAITOK here. | glebius | 2013-03-15 | 1 | -1/+1
  Sponsored by: Nginx, Inc.
* Use m_get/m_gethdr instead of compat macros. | glebius | 2013-03-15 | 6 | -11/+10
  Sponsored by: Nginx, Inc.
* - Use m_getcl() instead of hand allocating. | glebius | 2013-03-15 | 1 | -12/+8
  Sponsored by: Nginx, Inc.
* Functions m_getm2() and m_get2() have different order of arguments, | glebius | 2013-03-12 | 1 | -1/+1
  and that can drive someone crazy. While m_get2() is young and not
  documented yet, change its order of arguments to match m_getm2().
  Sorry for churn, but better now than later.
* Remove LIBALIAS_LOCK_ASSERT(), including a couple with an uninitialized | glebius | 2013-03-11 | 1 | -6/+1
  argument, in code that isn't compiled in kernel.
  PR: kern/176667
  Sponsored by: Nginx, Inc.
* The hashmask returned by hashinit() is a valid index in the returned hash array. | lstewart | 2013-03-07 | 1 | -1/+1
  Fix a siftr(4) potential memory leak and INVARIANTS triggered kernel
  panic in hashdestroy() by ensuring the last array index in the flow
  counter hash table is flushed of entries.
  MFC after: 3 days
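
The off-by-one described above can be illustrated with a small userland sketch. The `struct bucket` type and `flush_all()` are hypothetical stand-ins, not siftr(4) code; the point is only that a mask of N-1 for an N-bucket table is itself a valid index, so the loop bound must be inclusive.

```c
#include <stddef.h>

/* Hypothetical bucket: just a count of entries chained on it. */
struct bucket {
    int nentries;
};

/* Flush every bucket. mask is the highest valid index, so the loop uses <=;
 * a loop written with < would leave the last bucket populated. */
static int flush_all(struct bucket *tbl, unsigned long mask)
{
    int flushed = 0;

    for (unsigned long i = 0; i <= mask; i++) {
        flushed += tbl[i].nentries;
        tbl[i].nentries = 0;
    }
    return flushed;
}
```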
* - Make callout(9) tickless, relying on eventtimers(4) as backend for | davide | 2013-03-04 | 1 | -7/+11
  precise time event generation. This greatly improves granularity of
  callouts which are no longer constrained to wait for the next tick to
  be scheduled.
  - Extend the callout KPI introducing a set of callout_reset_sbt*
    functions, which take a sbintime_t as timeout argument. The new KPI
    also offers a way for consumers to specify precision tolerance they
    allow, so that callout can coalesce events and reduce number of
    interrupts as well as potentially avoid scheduling a SWI thread.
  - Introduce support for dispatching callouts directly from hardware
    interrupt context, specifying an additional flag. This feature should
    be used carefully, since interrupt context has some limitations
    (e.g. no sleeping locks can be held).
  - Enhance mechanisms to gather information about callwheel, introducing
    a new sysctl to obtain stats.

  This change breaks the KBI. struct callout fields have been changed, in
  particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
  (8 bytes) and another 'sbintime_t' field was added for precision.

  Together with: mav
  Reviewed by: attilio, bde, luigi, phk
  Sponsored by: Google Summer of Code 2012, iXsystems inc.
  Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm),
             markj (amd64), mav, Fabian Keil
* Fix a potential race in returning setting errno when an | tuexen | 2013-02-27 | 1 | -1/+2
  association goes down.
  Reported by Mozilla in
  https://bugzilla.mozilla.org/show_bug.cgi?id=845513
  MFC after: 3 days
* Fix tcp_lro_rx_ipv4() for drivers that do not set CSUM_IP_CHECKED. | gallatin | 2013-02-21 | 1 | -1/+1
  Specifically, in_cksum_hdr() returns 0 (not 0xffff) when the IPv4
  checksum is correct. Without this fix, the tcp_lro code will reject
  good IPv4 traffic from drivers that do not implement IPv4 header
  hardware csum offload.
  Sponsored by: Myricom Inc.
  MFC after: 7 days
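
Why a correct header verifies to 0 can be shown with a userland sketch. `hdr_cksum()` and `ipv4_hdr_ok()` below are hypothetical stand-ins written for this example, not the kernel's in_cksum_hdr(): summing all header words, checksum field included, gives 0xffff in one's-complement arithmetic, and the final complement turns that into 0.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over len bytes (len even), big-endian words.
 * Returns the complemented, folded sum. */
static uint16_t hdr_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((p[i] << 8) | p[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A 20-byte IPv4 header is good when the checksum over the whole header,
 * stored checksum included, comes out as 0 (not 0xffff). */
static int ipv4_hdr_ok(const uint8_t *hdr)
{
    return hdr_cksum(hdr, 20) == 0;
}
```

A caller that compared the result against 0xffff instead would reject every valid header, which matches the symptom described in the entry above.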
* ip_savecontrol() style fixes. No functional changes. | pluknet | 2013-02-20 | 1 | -17/+17
  - fix indentation
  - put the operator at the end of the line for long statements
  - remove spaces between the type and the variable in a cast
  - remove excessive parentheses
  Tested by: md5
* Send the adaptation layer indication only if set by the user. | tuexen | 2013-02-11 | 4 | -19/+26
  MFC after: 3 days
  Discussed with: rrs
* Don't send kernel provided information in the User Initiated | tuexen | 2013-02-11 | 4 | -68/+29
  ABORT cause, since the user can also provide this kind of information.
  So the receiver doesn't know who provided the information.
  While there: Fix a bug where the stack would send a malformed ABORT
  chunk when using a send() call with SCTP_ABORT|SCT_SENDALL flags.
  MFC after: 3 days
* Resolve source address selection in presence of CARP. Add a couple | glebius | 2013-02-11 | 2 | -0/+12
  of helper functions:
  - carp_master()   - boolean function which is true if an address is in
                      the MASTER state.
  - ifa_preferred() - boolean function that compares two addresses, and
                      is aware of CARP.

  Utilize ifa_preferred() in ifa_ifwithnet().

  The previous version of patch also changed source address selection
  logic in jails using carp_master(), but we failed to negotiate this
  part with Bjoern. Maybe we will approach this problem again later.

  Reported & tested by: Anton Yuzhaninov <citrin citrin.ru>
  Sponsored by: Nginx, Inc
* Make sure that received packets for removed addresses are handled | tuexen | 2013-02-10 | 1 | -180/+192
  consistently. While there, make variable names consistent.
  MFC after: 3 days
* Cleanup the handling of address scopes. Announce in the INIT/INIT-ACK | tuexen | 2013-02-09 | 10 | -320/+228
  only the supported address types. While there, do some whitespace
  cleanups.
  MFC after: 1 week
* Fix a bug where HEARTBEATs were still sent in SHUTDOWN_SENT or | tuexen | 2013-02-09 | 6 | -54/+57
  SHUTDOWN_ACK_SENT state. While there, make the corresponding code
  consistent.
  MFC after: 1 week
* Add placeholder constants to reserve a portion of the socket option | jhb | 2013-02-01 | 2 | -0/+5
  name space for use by downstream vendors to add custom options.
  MFC after: 2 weeks
* uma_zone_set_max() directly returns the rounded effective zone | andre | 2013-02-01 | 2 | -6/+6
  limit. Use the return value directly instead of doing a second
  uma_zone_set_max() step.
  MFC after: 1 week
* - Move AUTHORS and ACKNOWLEDGEMENTS to the end of the page. | glebius | 2013-01-31 | 1 | -33/+35
  - Add myself to list of authors.
* Retire struct sockaddr_inarp. | glebius | 2013-01-31 | 2 | -7/+6
  Since ARP and routing are separated, "proxy only" entries don't have
  any meaning, thus we don't need additional field in sockaddr to pass
  SIN_PROXY flag.

  New kernel is binary compatible with old tools, since sizes of
  sockaddr_inarp and sockaddr_in match, and sa_family are filled with
  same value.

  The structure declaration is left for compatibility with third party
  software, but in-tree code no longer uses it.

  Reviewed by: ru, andre, net@
* Utilize m_get2() to get mbuf of appropriate size. | glebius | 2013-01-30 | 1 | -15/+1
* Add checks for SO_NO_OFFLOAD in a couple of places that I missed earlier | np | 2013-01-26 | 1 | -0/+2
  in r245915.
* Teach toe_l2_resolve to resolve IPv6 destinations too. | np | 2013-01-26 | 1 | -1/+66
  Reviewed by: bz@
* Move lle_event to if_llatbl.h | np | 2013-01-25 | 1 | -11/+0
  lle_event replaced arp_update_event after the ARP rewrite and ended up
  in if_ether.h simply because arp_update_event used to be there too.
  IPv6 neighbor discovery is going to grow lle_event support and this is
  a good time to move it to if_llatbl.h.

  The two in-tree consumers of this event - OFED and toecore - are not
  affected.

  Reviewed by: bz@
* There is no need to call into the TOE driver twice in pru_rcvd (tod_rcvd | np | 2013-01-25 | 1 | -0/+1
  and then tod_output right after that).
  Reviewed by: bz@
* Add TCP_OFFLOAD hook in syncache_respond for IPv6 too, just like the one | np | 2013-01-25 | 1 | -0/+9
  that exists for IPv4.
  Reviewed by: bz@
* Teach toe_4tuple_check() to deal with IPv6 4-tuples too. | np | 2013-01-25 | 1 | -5/+9
  Reviewed by: bz@
* Heed SO_NO_OFFLOAD. | np | 2013-01-25 | 1 | -2/+5
  MFC after: 1 week
* Remove redundant test, we know inp_lport is 0. | np | 2013-01-25 | 1 | -2/+1
  MFC after: 1 week
* Use decimal values for UDP and TCP socket options rather than hex to avoid | jhb | 2013-01-22 | 2 | -13/+15
  implying that these constants should be treated as bit masks.
  Reviewed by: net
  MFC after: 1 week
* Simplify and fix a bug in cc_ack_received()'s "are we congestion window limited" | lstewart | 2013-01-22 | 1 | -1/+1
  logic (refer to [1] for associated discussion).

  snd_cwnd and snd_wnd are unsigned long and on 64 bit hosts, min() will
  truncate them to 32 bits and could therefore potentially corrupt the
  result (although under normal operation, neither variable should
  legitimately exceed 32 bits).

  [1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html

  Submitted by: jhb
  MFC after: 1 week
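
The truncation hazard described above can be reproduced in a few lines of portable C. This is an illustrative sketch, not the cc_ack_received() code: `min_u32()` stands in for a min() that takes 32-bit unsigned arguments, the 64-bit values stand in for unsigned long on a 64-bit host, and both predicate functions are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for a min() whose parameters are 32-bit unsigned integers. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Truncating form: both 64-bit values are narrowed before the comparison. */
static int cwnd_limited_truncating(uint64_t cwnd, uint64_t wnd)
{
    return cwnd == min_u32((uint32_t)cwnd, (uint32_t)wnd);
}

/* Direct comparison at full width; no helper and no narrowing involved. */
static int cwnd_limited(uint64_t cwnd, uint64_t wnd)
{
    return cwnd <= wnd;
}
```

With cwnd = 5 and wnd = 2^32 + 1 the truncating form sees wnd as 1 and reports "not limited", while the full-width comparison reports "limited". For values that fit in 32 bits the two forms agree.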
* Don't drop options from the third retransmitted SYN by default. If the | jhb | 2013-01-09 | 1 | -1/+7
  SYNs (or SYN/ACK replies) are dropped due to network congestion, then
  the remote end of the connection may act as if options such as window
  scaling are enabled but the local end will think they are not. This
  can result in very slow data transfers in the case of window scaling
  disagreements.

  The old behavior can be obtained by setting the
  net.inet.tcp.rexmit_drop_options sysctl to a non-zero value.

  Reviewed by: net@
  MFC after: 2 weeks
* Temporarily revert rev 244678. This is causing loopback problems with | peter | 2013-01-03 | 1 | -8/+3
  the lo (loopback) interfaces.
* Some cleanups. | tuexen | 2012-12-27 | 1 | -15/+9
  MFC after: 3 days
* Minor cleanups of debug messages. | tuexen | 2012-12-27 | 1 | -2/+2
  MFC after: 3 days
* Fix a copy and paste error. | tuexen | 2012-12-27 | 1 | -1/+1
  MFC after: 3 days
* Garbage collect carp_cksum(). | glebius | 2012-12-25 | 1 | -10/+4
* Change net.inet.carp.demotion sysctl to add the supplied value | glebius | 2012-12-25 | 1 | -3/+20
  to the current demotion factor instead of assigning it. This allows
  external scripts to control demotion factor together with kernel in a
  raceless manner.
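
The difference between "assign" and "add" semantics can be sketched as follows. This is not the CARP implementation: the `demotion` variable and `demotion_adjust()` are hypothetical, and the use of a C11 atomic is an assumption chosen for the sketch. With additive updates each party contributes a delta and later withdraws the same delta, so nobody has to read the current value and write back a computed total.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the shared demotion factor. */
static atomic_int demotion;

/* Apply a signed delta and return the resulting factor. Because the update
 * is a single atomic add, concurrent adjusters cannot overwrite each other. */
static int demotion_adjust(int delta)
{
    return atomic_fetch_add(&demotion, delta) + delta;
}
```

For example, a script could add 240 while a health check fails and add -240 once it recovers, leaving any contribution made by the kernel in between untouched.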
* Fix sysctl_handle_int() usage. Either arg1 or arg2 should be supplied, | glebius | 2012-12-25 | 2 | -2/+2
  and arg2 doesn't pass size of arg1.
* The SIOCSIFFLAGS ioctl handler runs if_up()/if_down() that notify | glebius | 2012-12-25 | 1 | -3/+8
  all interested parties in case the interface flag IFF_UP has changed.

  However, not only SIOCSIFFLAGS can raise the flag, but SIOCAIFADDR and
  SIOCAIFADDR_IN6 can, too. The actual |= is done not in the protocol
  code, but in code of interface drivers.

  To fix this historical layering violation, we will check whether
  ifp->if_ioctl(SIOCSIFADDR) raised the IFF_UP flag, and if it did, run
  the if_up() handler.

  This fixes configuring an address under CARP control on an interface
  that was initially !IFF_UP.

  P.S. I intentionally omitted handling the IFF_SMART flag. This flag
  was never ever used in any driver since it was introduced, and since
  it means another layering violation, it should be garbage collected
  instead of pretending it is supported.
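
The "check whether the driver raised the flag" step can be sketched in userland C. Every name below (`sk_ifnet`, `SK_IFF_UP`, `sk_driver_setaddr`, `sk_if_up`, `sk_add_address`) is hypothetical and only models the control flow described above; it is not the kernel code.

```c
#define SK_IFF_UP 0x1 /* stand-in for IFF_UP */

/* Minimal model of an interface for this sketch. */
struct sk_ifnet {
    int flags;
    int up_events; /* how many times the "interface came up" hook ran */
};

/* Driver-side handler: setting an address also marks the interface up. */
static void sk_driver_setaddr(struct sk_ifnet *ifp)
{
    ifp->flags |= SK_IFF_UP;
}

/* Hook that notifies interested parties that the interface came up. */
static void sk_if_up(struct sk_ifnet *ifp)
{
    ifp->up_events++;
}

/* Protocol-side path: remember the flag, call into the driver, and run the
 * hook only if the driver call is what raised the flag. */
static void sk_add_address(struct sk_ifnet *ifp)
{
    int was_up = ifp->flags & SK_IFF_UP;

    sk_driver_setaddr(ifp);
    if (!was_up && (ifp->flags & SK_IFF_UP))
        sk_if_up(ifp);
}
```

Adding a second address to an interface that is already up does not fire the hook again, because the flag did not change across the driver call.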
* Minor style(9) changes: | glebius | 2012-12-24 | 1 | -1/+3
  - Remove declaration in initializer.
  - Add empty line between logical blocks.
* Fix !INET6 build after r244365. | glebius | 2012-12-18 | 1 | -2/+11
* Clear correct flag in INET6 case. | glebius | 2012-12-18 | 1 | -1/+1
* Since we use different flags to detect tcp forwarding, and we share the | ae | 2012-12-17 | 1 | -1/+2
  same code for IPv4 and IPv6 in tcp_input, we should check both
  M_IP_NEXTHOP and M_IP6_NEXTHOP flags.
  MFC after: 3 days
* Fix problem in r238990. The LLE_LINKED flag should be tested prior to | glebius | 2012-12-13 | 1 | -4/+12
  entering llentry_free(), and if we lose the race, we should simply
  perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the
  thread performing arptimer(), it will remove two references from the
  lle instead of one.
  Reported by: Ian FREISLICH <ianf clue.co.za>
* Fix a crash in tcp_input(), that happens when mbuf has a fwd_tag on it, | glebius | 2012-12-12 | 1 | -0/+2
  but later after processing and freeing the tag, we need to jump back
  again to the findpcb label. Since the fwd_tag pointer wasn't NULL we
  tried to process and free the tag for a second time.
  Reported & tested by: Pawel Tyll <ptyll nitronet.pl>
  MFC after: 3 days
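
The failure pattern above (a freed pointer that is still non-NULL when control jumps back to an earlier label) can be sketched in plain C. `struct pkt` and `process()` are hypothetical and use malloc/free in place of mbuf tags; only the shape of the control flow is taken from the description.

```c
#include <stdlib.h>

/* Hypothetical packet carrying an optional heap-allocated forwarding tag. */
struct pkt {
    void *fwd_tag;
    int passes; /* how many times the lookup section ran */
};

/* Run the lookup section, optionally jumping back to it once. Returns the
 * number of times the tag was freed, which must never exceed one. */
static int process(struct pkt *p, int retry_once)
{
    int frees = 0;

findpcb:
    p->passes++;
    if (p->fwd_tag != NULL) {
        free(p->fwd_tag);
        p->fwd_tag = NULL; /* without this, the second pass frees it again */
        frees++;
    }
    if (retry_once) {
        retry_once = 0;
        goto findpcb;
    }
    return frees;
}
```

Resetting the pointer right after the free makes the second pass skip the tag handling, which is the behavior the entry above describes as the fix.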
* Get it compiling without INET and INET6 support (mainly userland stack). | tuexen | 2012-12-08 | 1 | -0/+4
  MFC after: 2 weeks