Commit message (Author, Age, Files, Lines)
* fq_codel: Fair Queue Codel AQM (Eric Dumazet, 2012-05-12, 4 files, -0/+690)
| Fair Queue Codel packet scheduler
|
| Principles :
| - Packets are classified (internal classifier or external) on flows.
| - This is a Stochastic model (as we use a hash, several flows might be hashed on same slot)
| - Each flow has a CoDel managed queue.
| - Flows are linked onto two (Round Robin) lists, so that new flows have priority on old ones.
| - For a given flow, packets are not reordered (CoDel uses a FIFO)
| - head drops only.
| - ECN capability is on by default.
| - Very low memory footprint (64 bytes per flow)
|
| tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ] [ target TIME ] [ interval TIME ] [ noecn ] [ quantum BYTES ]
|
| defaults : 1024 flows, 10240 packets limit, quantum : device MTU
|            target : 5ms (CoDel default)
|            interval : 100ms (CoDel default)
|
| Impressive results on load :
|
| class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
|  Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
|  rate 201691Kbit 28595pps backlog 0b 312p requeues 0
|  lended: 33063109 borrowed: 0 giants: 0
|  tokens: -912 ctokens: -912
| class fq_codel 10:1735 parent 10:
|  (dropped 1292, overlimits 0 requeues 0)
|  backlog 15140b 10p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.1ms
| class fq_codel 10:4524 parent 10:
|  (dropped 1291, overlimits 0 requeues 0)
|  backlog 16654b 11p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.1ms
| class fq_codel 10:4e74 parent 10:
|  (dropped 1290, overlimits 0 requeues 0)
|  backlog 6056b 4p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
| class fq_codel 10:628a parent 10:
|  (dropped 1289, overlimits 0 requeues 0)
|  backlog 7570b 5p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
| class fq_codel 10:a4b3 parent 10:
|  (dropped 302, overlimits 0 requeues 0)
|  backlog 16654b 11p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.1ms
| class fq_codel 10:c3c2 parent 10:
|  (dropped 1284, overlimits 0 requeues 0)
|  backlog 13626b 9p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 5.9ms
| class fq_codel 10:d331 parent 10:
|  (dropped 299, overlimits 0 requeues 0)
|  backlog 15140b 10p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.0ms
| class fq_codel 10:d526 parent 10:
|  (dropped 12160, overlimits 0 requeues 0)
|  backlog 35870b 211p requeues 0
|   deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
| class fq_codel 10:e2c6 parent 10:
|  (dropped 1288, overlimits 0 requeues 0)
|  backlog 15140b 10p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.1ms
| class fq_codel 10:eab5 parent 10:
|  (dropped 1285, overlimits 0 requeues 0)
|  backlog 16654b 11p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 5.9ms
| class fq_codel 10:f220 parent 10:
|  (dropped 1289, overlimits 0 requeues 0)
|  backlog 15140b 10p requeues 0
|   deficit 1514 count 1 lastcount 1 ldelay 7.1ms
|
| qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
|  Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
|  rate 201697Kbit 28602pps backlog 0b 260p requeues 71
| qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
|  Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
|  rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
|   maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
|   new_flows_len 0 old_flows_len 11
|
| PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
| 64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
| 64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
| 64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
| 64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
| 64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
| 64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
| 64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
| 64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
| 64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
| 64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms
|
| 10 packets transmitted, 10 received, 0% packet loss, time 8999ms
| rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms
|
| Much better than SFQ because of priority given to new flows, and fast path dirtying less cache lines.
|
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
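The scheduling principles listed in the fq_codel message (hashing packets onto flows, two round-robin lists, priority for new flows) can be sketched roughly as follows. This is a simplified model, not the kernel code: it leaves out the per-flow CoDel dropping entirely, and the names `FqSketch`, `Flow`, `classify` and the CRC32 default hash are assumptions made for illustration.

```python
from collections import deque
import zlib


class Flow:
    """One hash slot: a FIFO of packets plus a byte deficit."""

    def __init__(self):
        self.queue = deque()
        self.deficit = 0
        self.listed = False  # True while the flow sits on new_flows or old_flows


class FqSketch:
    """Hypothetical model of the new/old flow lists; no CoDel logic here."""

    def __init__(self, flows=1024, quantum=1514, classify=None):
        self.flows = [Flow() for _ in range(flows)]
        self.quantum = quantum
        # Assumed default classifier: CRC32 of a bytes key, modulo the slot count.
        self.classify = classify or (lambda key: zlib.crc32(key) % flows)
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, key, size):
        flow = self.flows[self.classify(key)]
        flow.queue.append((key, size))
        if not flow.listed:
            # A flow that was idle starts on the new list with a full quantum.
            flow.listed = True
            flow.deficit = self.quantum
            self.new_flows.append(flow)

    def dequeue(self):
        while True:
            if self.new_flows:
                active = self.new_flows
            elif self.old_flows:
                active = self.old_flows
            else:
                return None
            flow = active[0]
            if flow.deficit <= 0:
                # Quantum used up: refill and move to the tail of the old list.
                flow.deficit += self.quantum
                active.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.queue:
                active.popleft()
                if active is self.new_flows and self.old_flows:
                    # Make an emptied new flow wait behind the old flows once.
                    self.old_flows.append(flow)
                else:
                    flow.listed = False
                continue
            key, size = flow.queue.popleft()
            flow.deficit -= size
            return key, size
```

With a backlog of three packets on one flow and a single packet arriving on another, the second flow is served as soon as the first has spent its quantum, which is the "new flows have priority on old ones" behaviour the message describes.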
* codel: use Newton method instead of sqrt() and divides (Eric Dumazet, 2012-05-12, 1 file, -31/+37)
| As Van pointed out, interval/sqrt(count) can be implemented using multiplies only.
|
| http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Iterative_methods_for_reciprocal_square_roots
|
| This patch implements the Newton method and reciprocal divide.
|
| Total cost is 15 cycles instead of 120 on my Corei5 machine (64bit kernel).
|
| There is a small 'error' for count values < 5, but we don't really care.
|
| I reuse a hole in struct codel_vars :
| - pack the dropping boolean into one bit
| - use 31bit to store the reciprocal value of sqrt(count).
|
| Suggested-by: Van Jacobson <van@pollere.net>
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Cc: Dave Taht <dave.taht@bufferbloat.net>
| Cc: Kathleen Nichols <nichols@pollere.com>
| Cc: Tom Herbert <therbert@google.com>
| Cc: Matt Mathis <mattmathis@google.com>
| Cc: Yuchung Cheng <ycheng@google.com>
| Cc: Nandita Dukkipati <nanditad@google.com>
| Cc: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
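To illustrate how interval/sqrt(count) can be computed with multiplies and shifts only, here is a minimal fixed-point sketch of the Newton iteration y' = y * (3 - count * y^2) / 2 for the reciprocal square root. The 32-bit fractional scaling and the function names are assumptions chosen for readability; the patch itself packs the value into 31 bits of struct codel_vars, which this sketch does not reproduce.

```python
FRAC_BITS = 32          # assumed scaling for this sketch, not the kernel's layout
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def newton_step(count, inv_sqrt):
    """One Newton step toward 1/sqrt(count), using multiplies and shifts only."""
    inv_sqrt2 = (inv_sqrt * inv_sqrt) >> FRAC_BITS   # y^2
    val = 3 * ONE - count * inv_sqrt2                # 3 - count * y^2
    return (val * inv_sqrt) >> (FRAC_BITS + 1)       # y * (...) / 2


def inv_sqrt_after(count):
    """Carry the estimate forward one step per increment of count, from count = 1."""
    inv_sqrt = ONE
    for c in range(2, count + 1):
        inv_sqrt = newton_step(c, inv_sqrt)
    return inv_sqrt


def control_law_delay(interval, inv_sqrt):
    """interval / sqrt(count), computed as a multiply by the reciprocal."""
    return (interval * inv_sqrt) >> FRAC_BITS
```

Because only one step is taken per increment of count, the estimate is coarse for the first few counts (for count = 2 it yields 0.5 where the exact value is about 0.707) and tightens as count grows, which is consistent with the small error the message mentions for low counts.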
* rndis_wlan: cleanup: change oid from __le32 to u32 in various places (Jussi Kivilinna, 2012-05-12, 1 file, -49/+49)
| Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* rndis_host: cleanup: change oid from __le32 to u32 in rndis_query() (Jussi Kivilinna, 2012-05-12, 1 file, -6/+6)
| Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* rndis_wlan: cleanup: byteswap data from device instead of RNDIS_* defines (Jussi Kivilinna, 2012-05-12, 1 file, -13/+13)
| All other values from device provided buffer are byteswapped, so it seems more logical to do same for these.
|
| Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* rndis_host: cleanup: byteswap data from device instead of RNDIS_* defines (Jussi Kivilinna, 2012-05-12, 1 file, -25/+28)
| All other values from device provided buffer are byteswapped, so it seems more logical to do same for these.
|
| Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: move bus message definition (Linus Walleij, 2012-05-12, 1 file, -6/+5)
| This moves the bus message definition to land together with the other message types. This message is not used in the kernel but I'm keeping it anyway.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: fixup a few name prefixes (Linus Walleij, 2012-05-12, 2 files, -57/+54)
| This switches a horde of NDIS_*-prefixed variables to the RNDIS_* prefix. Most of them aren't used much and causes no changes.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: merge command codes (Linus Walleij, 2012-05-12, 3 files, -57/+39)
| Switch the hyperv filter and rndis gadget driver to use the same command enumerators as the other drivers and delete the surplus command codes.
|
| Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: move and namespace PnP defines (Linus Walleij, 2012-05-12, 1 file, -15/+12)
| This moves the PnP OID definitions to the RNDIS_* namespace and puts them in the next falling slot in the list.
|
| Oh, the comment above the PnP defines was referring to some obsolete or out-of-tree driver so removed it, and removed my own comments telling where each header segment came from as well, we have moved everything around by this point anyway.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: delete duplicate packet types (Linus Walleij, 2012-05-12, 1 file, -13/+0)
| The NDIS_*-prefixed packet types have equivalent RNDIS_*-prefixed types, besides nothing in the kernel use these defines.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: merge media type definitions (Linus Walleij, 2012-05-12, 3 files, -45/+29)
| Let's have a unified table of RNDIS media. We used to have a similar table with NDIS_* prefix from the gadget driver, but since we're only using RNDIS in the kernel (IIRC NDIS, non-remote, is for the windows-internal network drivers so what do we care) let's prefix everything with RNDIS.
|
| Some of the definitions were conflicting, in one of the defines 0x0B is bearer "CO WAN" and in two others "BPC". Well I took the majority vote. Two definition of medium 0x09 calls it "wireless WAN" but one vote for "wireless LAN" but in this case I am sticking with the minority, "Wide Area Network" does not make much sense in this case as far as I can tell.
|
| NOTE: latin singular and plural is so screwed up in these defines that it makes my eyes bleed. But I will not attempt to submit a patch converting all use of _MEDIA_ to _MEDIUM_ while I can probably tell from the semantics of the code that RNDIS_MEDIA_STATE_CONNECTED is most probably (erroneously) referring to a singular, unless it can return an array of connected media. I suspect these erroneous plurals are used in documentation and such so I don't want to mess around with things for no functional change.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: group all status codes together (Linus Walleij, 2012-05-12, 1 file, -81/+76)
| Move all RNDIS status codes so they appear in rising order and in one place of the header file.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: delete surplus defines (Linus Walleij, 2012-05-12, 1 file, -4/+0)
| These defines are not used in the kernel, and they have duplicate definitions under the RNDIS_* prefix.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: merge duplicate 802_* OIDs (Linus Walleij, 2012-05-12, 3 files, -174/+188)
| The 802_* network OIDs were duplicated, so let's merge them and use the RNDIS_* prefixed definitions from the hyperV driver.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: eliminate first set of duplicate OIDs (Linus Walleij, 2012-05-12, 4 files, -185/+123)
| The RNDIS protocol contains a vast number of Object ID:s (OIDs). The current definitions had multiple definitions of these ID:s, let's use the nicely RNDIS_*-prefixed defines from the HyperV implementation, rename everywhere they're used, and copy+rename the few that were missing from this list of objects.
|
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: remove ambigous status codes (Linus Walleij, 2012-05-12, 2 files, -30/+5)
| The RNDIS status codes are redefined with much strange ifdeffery and only one of these codes was used in the hyperv driver, and there it is very clearly referring to the RNDIS variant, not some other status. So clarify this by explicitly using the RNDIS_* prefixed status code in the hyperv driver and delete the duplicate defines.
|
| Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: break out <linux/rndis.h> defines (Linus Walleij, 2012-05-12, 6 files, -613/+565)
| As a first step to consolidate the RNDIS implementations, break out a common file with all the #defines and move it to <linux/rndis.h>.
|
| This also deletes the immediate duplicated defines in the <linux/rndis.h> file that yields a lot of compilation warnings.
|
| Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* usb/net: rndis: inline the cpu_to_le32() macro (Linus Walleij, 2012-05-12, 3 files, -173/+177)
| The header file <linux/usb/rndis_host.h> used a number of #defines that included the cpu_to_le32() macro to assure the result will be in LE endianness. Inlining this into the code instead of using it in the code definitions yields consolidation opportunities later on as you will see in the following patches. The individual drivers also used local defines - all are switched over to the pattern of doing the conversion at the call sites instead.
|
| Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
| Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6/af_inet6.c: checkpatch cleanup (Eldad Zack, 2012-05-11, 1 file, -18/+11)
| af_inet6.c:80: ERROR: do not initialise statics to 0 or NULL
| af_inet6.c:259: ERROR: spaces required around that '=' (ctx:VxV)
| af_inet6.c:394: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:412: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:422: ERROR: do not use assignment in if condition
| af_inet6.c:425: ERROR: do not use assignment in if condition
| af_inet6.c:433: ERROR: do not use assignment in if condition
| af_inet6.c:437: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:446: ERROR: spaces required around that '=' (ctx:VxV)
| af_inet6.c:478: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:485: ERROR: that open brace { should be on the previous line
| af_inet6.c:485: ERROR: space required before the open parenthesis '('
| af_inet6.c:513: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:629: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:647: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:687: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:709: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
| af_inet6.c:1073: ERROR: space required before the open parenthesis '('
|
| Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: of/phy: fix build error when phylib is built as a module (Bjørn Mork, 2012-05-11, 1 file, -1/+1)
| CONFIG_OF_MDIO is tristate and will be m if PHYLIB is m. Use IS_ENABLED macro to prevent build error:
|
| ERROR: "of_mdio_find_bus" [drivers/net/phy/mdio-mux.ko] undefined!
|
| Reported-by: Randy Dunlap <rdunlap@xenotime.net>
| Cc: David Daney <david.daney@cavium.com>
| Signed-off-by: Bjørn Mork <bjorn@mork.no>
| Acked-by: David Daney <david.daney@cavium.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge (David S. Miller, 2012-05-11, 18 files, -265/+380)
|\
| | Included changes:
| |
| | * fix a little bug in the DHCP packet snooping introduced so far
| | * minor fixes and cleanups
| | * minor routing protocol API cleanups
| | * add a new contributor name to translation-table.{c,h}
| | * update copyright years in file headers
| | * minor improvement for the routing algorithm
| * batman-adv: add contributor name (Antonio Quartulli, 2012-05-11, 2 files, -2/+2)
| | translation_table.{c,h} have been heavily modified by another contributor and for legal purposes it is better to include his name into the contributor list
| |
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: update copyright years (Antonio Quartulli, 2012-05-11, 2 files, -2/+2)
| | update copyright years in order to include 2012
| |
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: fix checkpatch string complaint (Marek Lindner, 2012-05-11, 1 file, -2/+2)
| | Regression introduced by: f76d019194e0a88c57371df169ecc979690a04c2
| |
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: avoid temporary routing loops by being strict on forwarded OGMs (Marek Lindner, 2012-05-11, 2 files, -29/+32)
| | batman-adv would forward OGMs from non-best hops while replacing the TQ and TTL values with the values from the best hop. In certain corner cases this leads to a temporary routing loop.
| |
| | This patch changes this behavior: Only packets from best next hops are forwarded - TQ and TTL values won't be replaced anymore. However, the protocol needs to rebroadcast OGMs from single hop neighbors regardless of whether or not they are the best hop. To handle this case a new flag is introduced to alert neighboring nodes about the forwarded OGM that is not from my best next hop. It is to be discarded by all nodes except for the one originating the OGM.
| |
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Daniele Furlan <daniele.furlan@gmail.com>
| | Tested-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| * batman-adv: Adding hard_iface specific sysfs wrapper macros for UINT (Linus Luessing, 2012-05-11, 1 file, -0/+43)
| | This allows us to easily add a sysfs parameter for an unsigned int later, which is not for a batman mesh interface (e.g. bat0), but for a common interface instead. It allows reading and writing an atomic_t in hard_iface (instead of bat_priv compared to the mesh variant).
| |
| | Developed by Linus during a 6 months trainee study period in Ascom (Switzerland) AG.
| |
| | Signed-off-by: Linus Luessing <linus.luessing@web.de>
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| * batman-adv: rename sysfs macros to reflect the soft-interface dependency (Marek Lindner, 2012-05-11, 1 file, -28/+29)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: refactoring API: find generalized name for bat_ogm_update_mac callback (Marek Lindner, 2012-05-11, 4 files, -14/+16)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: ignore protocol packets if the interface did not enable this protocol (Marek Lindner, 2012-05-11, 1 file, -0/+7)
| | Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: split neigh_new function into generic and batman iv specific parts (Marek Lindner, 2012-05-11, 3 files, -27/+48)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: replace HZ calculations with jiffies_to_msecs() (Marek Lindner, 2012-05-11, 3 files, -8/+13)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: rename last_valid to last_seen (Marek Lindner, 2012-05-11, 3 files, -16/+16)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: register batman ogm receive function during protocol init (Marek Lindner, 2012-05-11, 5 files, -24/+41)
| | The B.A.T.M.A.N. IV OGM receive function still was hard-coded although it is a routing protocol specific function. This patch takes advantage of the dynamic packet handler registration to remove the hard-coded function calls.
| |
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: introduce packet type handler array for incoming packets (Marek Lindner, 2012-05-11, 3 files, -113/+127)
| | The packet handler array replaces the growing switch statement, thus dealing with incoming packets in a more efficient way. It also adds the possibility to register packet handlers on the fly.
| |
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
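The idea of replacing a switch statement with a handler array that supports registration on the fly can be illustrated with a small dispatch table. This is only a sketch of the general technique: the function names, the 256-entry table and the assumption that the packet type is the first byte of the packet are all hypothetical and are not taken from the batman-adv sources.

```python
def recv_unhandled(packet):
    """Default entry: packet types nobody registered for are dropped."""
    return "drop"


# One slot per possible 8-bit packet type (assumed layout for this sketch).
handlers = [recv_unhandled] * 256


def register_handler(packet_type, handler):
    """Install a handler at run time; refuse to overwrite an existing one."""
    if handlers[packet_type] is not recv_unhandled:
        raise ValueError("packet type %d already has a handler" % packet_type)
    handlers[packet_type] = handler


def unregister_handler(packet_type):
    handlers[packet_type] = recv_unhandled


def dispatch(packet):
    """Index the table by the packet type byte instead of walking a switch."""
    if not packet:
        return "drop"
    return handlers[packet[0]](packet)
```

A routing protocol module could then call `register_handler` during its own initialisation, so the generic receive path never needs to know which protocols exist.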
| * batman-adv: introduce is_single_hop_neigh variable to increase readability (Marek Lindner, 2012-05-11, 1 file, -7/+9)
| | Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
| | Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * batman-adv: fix wrong dhcp option list browsing (Antonio Quartulli, 2012-05-11, 1 file, -3/+3)
|/
| In is_type_dhcprequest(), while parsing a DHCP message, if the entry we found in the option list is neither a padding nor the dhcp-type, we have to ignore it and jump as many bytes as its length + 1. The "+ 1" byte is given by the subtype field itself that has to be jumped too.
|
| Reported-by: Marek Lindner <lindner_marek@yahoo.de>
| Signed-off-by: Antonio Quartulli <ordex@autistici.org>
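As a rough illustration of why the skip distance matters when browsing a DHCP option list, the sketch below walks a standard options field (one code byte, one length byte, then the payload; pad is code 0, end is code 255, message type is code 53 with value 3 for DHCPREQUEST). It is a stand-alone model, not the kernel's is_type_dhcprequest(); here the step is counted from the option's code byte, so it is 2 + length, and how that maps onto the "length + 1" in the message depends on where the kernel's pointer sits, which is not shown in this log.

```python
DHCP_OPT_PAD = 0
DHCP_OPT_MSG_TYPE = 53
DHCP_OPT_END = 255
DHCPREQUEST = 3


def is_dhcp_request(options):
    """Return True if the options field carries message type DHCPREQUEST."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == DHCP_OPT_PAD:
            i += 1                      # padding is a single byte, no length
            continue
        if code == DHCP_OPT_END or i + 1 >= len(options):
            break
        length = options[i + 1]
        if code == DHCP_OPT_MSG_TYPE:
            return (length >= 1 and i + 2 < len(options)
                    and options[i + 2] == DHCPREQUEST)
        # Unrelated option: skip its code byte, its length byte and its payload.
        i += 2 + length
    return False
```

If the skip were one byte short, the walker would land inside the previous option's payload and could misread arbitrary data as an option code, which is the kind of mistake the fix addresses.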
* 6lowpan: IPv6 link local address (alex.bluesman.smirnov@gmail.com, 2012-05-10, 1 file, -1/+13)
| According to the RFC4944 (Transmission of IPv6 Packets over IEEE 802.15.4 Networks), chapter 7:
|
| The IPv6 link-local address [RFC4291] for an IEEE 802.15.4 interface is formed by appending the Interface Identifier, as defined above, to the prefix FE80::/64.
|
|    10 bits            54 bits                  64 bits
| +----------+-----------------------+----------------------------+
| |1111111010|        (zeros)        |    Interface Identifier    |
| +----------+-----------------------+----------------------------+
|
| This patch adds IPv6 address generation support for the 6lowpan interfaces.
|
| Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
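The address layout quoted from RFC 4944 can be checked with a few lines: the FE80::/64 prefix occupies the first 8 bytes (10 prefix bits followed by 54 zero bits) and the 64-bit Interface Identifier fills the rest. The function name is made up for this sketch, and the identifier is taken as given; deriving it from the IEEE 802.15.4 address is outside what the message describes.

```python
import ipaddress


def link_local_from_iid(iid):
    """Append a 64-bit Interface Identifier to the FE80::/64 prefix."""
    if len(iid) != 8:
        raise ValueError("interface identifier must be exactly 8 bytes")
    prefix = bytes([0xFE, 0x80]) + bytes(6)   # 1111111010 followed by 54 zero bits
    return ipaddress.IPv6Address(prefix + iid)


# Example identifier chosen arbitrarily for illustration.
addr = link_local_from_iid(bytes.fromhex("0211223344556677"))
print(addr, addr.is_link_local)
```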
* codel: Controlled Delay AQM (Eric Dumazet, 2012-05-10, 5 files, -0/+645)
| An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.
|
| http://queue.acm.org/detail.cfm?id=2209336
|
| This AQM's main input is no longer queue size in bytes or packets, but the delay packets stay in (FIFO) queue.
|
| As we don't have infinite memory, we still can drop packets in enqueue() in case of massive load, but the point of CoDel is to drop packets in dequeue(), using a control law based on two simple parameters :
|
| target : target sojourn time (default 5ms)
| interval : width of moving time window (default 100ms)
|
| Based on initial work from Dave Taht.
|
| Refactored to help future codel inclusion as a plugin for other linux qdisc (FQ_CODEL, ...), like RED.
|
| include/net/codel.h contains the codel algorithm, as close as possible to Kathleen's reference.
|
| net/sched/sch_codel.c contains the linux qdisc specific glue.
|
| Separate structures permit a memory efficient implementation of fq_codel (to be sent as a separate work) : Each flow has its own struct codel_vars.
|
| timestamps are taken at enqueue() time with 1024 ns precision, allowing a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses usec as base unit.
|
| Selected packets are dropped, unless ECN is enabled and packets can get ECN mark instead.
|
| Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and tg3 drivers (BQL enabled).
|
| Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ] [ interval TIME ] [ ecn ]
|
| qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
|  Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
|  rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
|   count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
|   maxpacket 1514 ecn_mark 84399 drop_overlimit 0
|
| CoDel must be seen as a base module, and should be used keeping in mind there is still a FIFO queue. So a typical setup will probably need a hierarchy of several qdiscs and packet classifiers to be able to meet whatever constraints a user might have.
|
| One possible example would be to use fq_codel, which combines Fair Queueing and CoDel, in replacement of sfq / sfq_red.
|
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
| Cc: Kathleen Nichols <nichols@pollere.com>
| Cc: Van Jacobson <van@pollere.net>
| Cc: Tom Herbert <therbert@google.com>
| Cc: Matt Mathis <mattmathis@google.com>
| Cc: Yuchung Cheng <ycheng@google.com>
| Cc: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
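The "1024 ns precision" and "2199 seconds" figures in the CoDel message fit together if timestamps are nanosecond clocks shifted right by 10 bits, stored in 32 bits and compared with wrap-safe signed arithmetic, which leaves 31 usable bits of range. That storage and comparison scheme is an assumption made here to reproduce the arithmetic, and the helper names are hypothetical.

```python
CODEL_SHIFT = 10          # 2**10 ns = 1024 ns per tick
MASK32 = 0xFFFFFFFF       # assumed 32-bit timestamp storage


def to_codel_time(ns):
    """Convert a nanosecond clock reading to 1024 ns ticks, truncated to 32 bits."""
    return (ns >> CODEL_SHIFT) & MASK32


def time_after(a, b):
    """Wrap-safe 'a is later than b', valid while the two are under 2**31 ticks apart."""
    diff = (a - b) & MASK32
    return 0 < diff < 0x80000000


# Largest unambiguous sojourn time: 2**31 ticks of 1024 ns each.
max_range_s = (1 << 31) * (1 << CODEL_SHIFT) / 1e9
print(round(max_range_s, 2))
```

The product 2^31 x 1024 ns equals 2^41 ns, or roughly 2199.02 seconds, matching the range quoted in the message.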
* net_sched: update bstats in dequeue() (Eric Dumazet, 2012-05-10, 4 files, -6/+6)
| Class bytes/packets stats can be misleading because they are updated in enqueue() while packet might be dropped later.
|
| We already fixed all qdiscs but sch_atm.
|
| This patch makes the final cleanup.
|
| class rate estimators can now match qdisc ones.
|
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net, drivers/net: Convert compare_ether_addr_64bits to ether_addr_equal_64bits (Joe Perches, 2012-05-10, 3 files, -35/+35)
| Use the new bool function ether_addr_equal_64bits to add some clarity and reduce the likelihood for misuse of compare_ether_addr_64bits for sorting.
|
| Done via cocci script:
|
| $ cat compare_ether_addr_64bits.cocci
| @@
| expression a,b;
| @@
| -	!compare_ether_addr_64bits(a, b)
| +	ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	compare_ether_addr_64bits(a, b)
| +	!ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	!ether_addr_equal_64bits(a, b) == 0
| +	ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	!ether_addr_equal_64bits(a, b) != 0
| +	!ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	ether_addr_equal_64bits(a, b) == 0
| +	!ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	ether_addr_equal_64bits(a, b) != 0
| +	ether_addr_equal_64bits(a, b)
|
| @@
| expression a,b;
| @@
| -	!!ether_addr_equal_64bits(a, b)
| +	ether_addr_equal_64bits(a, b)
|
| Signed-off-by: Joe Perches <joe@perches.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* etherdevice.h: Add ether_addr_equal_64bits (Joe Perches, 2012-05-10, 1 file, -0/+20)
| Add an optimized boolean function to check if 2 ethernet addresses are the same.
|
| This is to avoid any confusion about compare_ether_addr_64bits returning an unsigned, and not being able to use the compare_ether_addr_64bits function for sorting ala memcmp.
|
| Signed-off-by: Joe Perches <joe@perches.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
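The sorting concern can be made concrete with a toy model. Below, `compare_ether_addr` is a stand-in that returns zero for equal addresses and some non-negative, non-zero value otherwise (modelled here as an XOR of the two addresses, which is an assumption and not the kernel implementation). Such a result is symmetric in its arguments, so it cannot order two addresses the way memcmp does, whereas a boolean equality helper has only one possible reading.

```python
def compare_ether_addr(a, b):
    """Toy model: 0 when equal, otherwise a positive value with no ordering meaning."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def ether_addr_equal(a, b):
    """Boolean form: True exactly when the two 6-byte addresses match."""
    return compare_ether_addr(a, b) == 0


a = bytes.fromhex("001122334455")
b = bytes.fromhex("001122334456")

# Both directions give the same positive number, so neither address is "less".
print(compare_ether_addr(a, b), compare_ether_addr(b, a))
print(ether_addr_equal(a, b), ether_addr_equal(a, bytes(a)))
```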
* drivers/net: Convert compare_ether_addr to ether_addr_equal (Joe Perches, 2012-05-10, 31 files, -86/+85)
| Use the new bool function ether_addr_equal to add some clarity and reduce the likelihood for misuse of compare_ether_addr for sorting.
|
| Done via cocci script:
|
| $ cat compare_ether_addr.cocci
| @@
| expression a,b;
| @@
| -	!compare_ether_addr(a, b)
| +	ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	compare_ether_addr(a, b)
| +	!ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	!ether_addr_equal(a, b) == 0
| +	ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	!ether_addr_equal(a, b) != 0
| +	!ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	ether_addr_equal(a, b) == 0
| +	!ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	ether_addr_equal(a, b) != 0
| +	ether_addr_equal(a, b)
|
| @@
| expression a,b;
| @@
| -	!!ether_addr_equal(a, b)
| +	ether_addr_equal(a, b)
|
| Signed-off-by: Joe Perches <joe@perches.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* be2net: avoid disabling sriov while VFs are assigned (Sathya Perla, 2012-05-10, 3 files, -89/+134)
| Calling pci_disable_sriov() while VFs are assigned to VMs causes kernel panic. This patch uses PCI_DEV_FLAGS_ASSIGNED bit state of the VF's pci_dev to avoid this. Also, the unconditional function reset cmd issued on a PF probe can delete the VF configuration for the previously enabled VFs. A scratchpad register is now used to issue a function reset only when needed (i.e., in a crash dump scenario.)
|
| Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: fix data packet sequence number handling (James Chapman, 2012-05-10, 1 file, -1/+1)
| If enabled, L2TP data packets have sequence numbers which a receiver can use to drop out of sequence frames or try to reorder them. The first frame has sequence number 0, but the L2TP code currently expects it to be 1. This results in the first data frame being handled as out of sequence.
|
| This one-line patch fixes the problem.
|
| Signed-off-by: James Chapman <jchapman@katalix.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: fix reorder timeout recovery (James Chapman, 2012-05-10, 2 files, -0/+10)
| When L2TP data packet reordering is enabled, packets are held in a queue while waiting for out-of-sequence packets. If a packet gets lost, packets will be held until the reorder timeout expires, when we are supposed to then advance to the sequence number of the next packet but we don't currently do so. As a result, the data channel is stuck because we are waiting for a packet that will never arrive - all packets age out and none are passed.
|
| The fix is to add a flag to the session context, which is set when the reorder timeout expires and tells the receive code to reset the next expected sequence number to that of the next packet in the queue.
|
| Tested in a production L2TP network with Starent and Nortel L2TP gear.
|
| Signed-off-by: James Chapman <jchapman@katalix.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
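The recovery mechanism described in the reorder timeout message (a flag set when a held packet ages out, which makes the receiver resynchronise on the next queued packet) can be modelled in a few lines. This is a toy under stated assumptions: the class and attribute names are invented, time is a plain integer, sequence number wraparound is ignored, and the first expected sequence number is 0 as stated in the preceding l2tp fix.

```python
class ReorderSketch:
    """Hypothetical model of a reorder queue with timeout recovery."""

    def __init__(self, timeout):
        self.nr = 0                 # next expected sequence number, starting at 0
        self.queue = {}             # seq -> (arrival_time, payload)
        self.reorder_skip = False   # set when a held packet aged out
        self.timeout = timeout

    def receive(self, seq, payload, now):
        self.queue[seq] = (now, payload)
        return self._drain(now)

    def _drain(self, now):
        delivered = []
        for seq in sorted(self.queue):
            arrival, payload = self.queue[seq]
            if now - arrival > self.timeout:
                # Held too long: drop it and remember that we must resync.
                del self.queue[seq]
                self.reorder_skip = True
                continue
            if self.reorder_skip:
                # Resync: expect whatever is next in the queue.
                self.reorder_skip = False
                self.nr = seq
            if seq != self.nr:
                break               # still waiting for an earlier packet
            delivered.append(payload)
            del self.queue[seq]
            self.nr = seq + 1
        return delivered
```

Without the `reorder_skip` step, `nr` would stay at the lost sequence number forever and every later packet would simply age out, which is the stuck data channel the message describes.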
* tcp: Out-line tcp_try_rmem_schedule (Pavel Emelyanov, 2012-05-10, 1 file, -1/+1)
| As proposed by Eric, make the tcp_input.o thinner.
|
| add/remove: 1/1 grow/shrink: 1/4 up/down: 868/-1329 (-461)
| function                                     old     new   delta
| tcp_try_rmem_schedule                          -     864    +864
| tcp_ack                                     4811    4815      +4
| tcp_validate_incoming                        817     815      -2
| tcp_collapse                                 860     858      -2
| tcp_send_rcvq                                555     353    -202
| tcp_data_queue                              3435    3033    -402
| tcp_prune_queue                              721       -    -721
|
| Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
| Acked-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Schedule rmem for rcvq repair send (Pavel Emelyanov, 2012-05-10, 1 file, -0/+3)
| As noted by Eric, no checks are performed on the data size we're putting in the read queue during repair. Thus, validate the given data size with the common rmem management routine.
|
| Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
| Acked-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Move rcvq sending to tcp_input.c (Pavel Emelyanov, 2012-05-10, 3 files, -36/+35)
| It actually works on the input queue and will use its read mem routines, thus it's better to have it in the tcp_input.c file.
|
| Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
| Acked-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next (David S. Miller, 2012-05-10, 14 files, -245/+1370)
|\