summaryrefslogtreecommitdiffstats
path: root/Documentation
Commit message (Collapse)AuthorAgeFilesLines
* net: packet: document available fanout policiesDaniel Borkmann2013-08-291-0/+8
| | | | | | | Update documentation to add fanout policies that are available. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: TSO packets automatic sizingEric Dumazet2013-08-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After hearing many people over past years complaining against TSO being bursty or even buggy, we are proud to present automatic sizing of TSO packets. One part of the problem is that tcp_tso_should_defer() uses an heuristic relying on upcoming ACKS instead of a timer, but more generally, having big TSO packets makes little sense for low rates, as it tends to create micro bursts on the network, and general consensus is to reduce the buffering amount. This patch introduces a per socket sk_pacing_rate, that approximates the current sending rate, and allows us to size the TSO packets so that we try to send one packet every ms. This field could be set by other transports. Patch has no impact for high speed flows, where having large TSO packets makes sense to reach line rate. For other flows, this helps better packet scheduling and ACK clocking. This patch increases performance of TCP flows in lossy environments. A new sysctl (tcp_min_tso_segs) is added, to specify the minimal size of a TSO packet (default being 2). A follow-up patch will provide a new packet scheduler (FQ), using sk_pacing_rate as an input to perform optional per flow pacing. This explains why we chose to set sk_pacing_rate to twice the current rate, allowing 'slow start' ramp up. sk_pacing_rate = 2 * cwnd * mss / srtt v2: Neal Cardwell reported a suspect deferring of last two segments on initial write of 10 MSS, I had to change tcp_tso_should_defer() to take into account tp->xmit_size_goal_segs Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Cc: Tom Herbert <therbert@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: drop fragmented ndisc packets by default (RFC 6980)Hannes Frederic Sowa2013-08-291-0/+6
| | | | | | | | | | | This patch implements RFC6980: Drop fragmented ndisc packets by default. If a fragmented ndisc packet is received the user is informed that it is possible to disable the check. Cc: Fernando Gont <fernando@gont.com.ar> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2013-08-271-0/+40
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A number of significant new features and optimizations for net-next/3.12. Highlights are: * "Megaflows", an optimization that allows userspace to specify which flow fields were used to compute the results of the flow lookup. This allows for a major reduction in flow setups (the major performance bottleneck in Open vSwitch) without reducing flexibility. * Converting netlink dump operations to use RCU, allowing for additional parallelism in userspace. * Matching and modifying SCTP protocol fields. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * openvswitch: Mega flow implementationAndy Zhou2013-08-231-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add wildcarded flow support in kernel datapath. Wildcarded flow can improve OVS flow set up performance by avoid sending matching new flows to the user space program. The exact performance boost will largely dependent on wildcarded flow hit rate. In case all new flows hits wildcard flows, the flow set up rate is within 5% of that of linux bridge module. Pravin has made significant contributions to this patch. Including API clean ups and bug fixes. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
* | Documentation/networking/: Update Intel wired LAN driver documentationJeff Kirsher2013-08-278-43/+193
| | | | | | | | | | | | | | | | | | Updates the documentation to the Intel wired LAN drivers. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-08-261-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/wireless/iwlwifi/pcie/trans.c include/linux/inetdevice.h The inetdevice.h conflict involves moving the IPV4_DEVCONF values into a UAPI header, overlapping additions of some new entries. The iwlwifi conflict is a context overlap. Signed-off-by: David S. Miller <davem@davemloft.net>
| * memcg: get rid of swapaccount leftoversMichal Hocko2013-08-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The swapaccount kernel parameter without any values has been removed by commit a2c8990aed5a ("memsw: remove noswapaccount kernel parameter") but it seems that we didn't get rid of all the left overs. Make sure that menuconfig help text and kernel-parameters.txt are clear about value for the paramter and remove the stalled comment which is not very much useful on its own. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Gergely Risko <gergely@risko.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | net/phy: micrel: Add OF configuration support for ksz9021Sean Cross2013-08-211-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | Some boards require custom PHY configuration, for example due to trace length differences. Add the ability to configure these registers in order to get the PHY to function on boards that need it. Because PHYs are auto-detected based on MDIO device IDs, allow PHY configuration to be specified in the parent Ethernet device node if no PHY device node is present. Signed-off-by: Sean Cross <xobs@kosagi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2013-08-201-3/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Conflicts: net/netfilter/nf_conntrack_proto_tcp.c The conflict had to do with overlapping changes dealing with fixing the use of an "s32" to hold the value returned by NAT_OFFSET(). Pablo Neira Ayuso says: ==================== The following batch contains Netfilter/IPVS updates for your net-next tree. More specifically, they are: * Trivial typo fix in xt_addrtype, from Phil Oester. * Remove net_ratelimit in the conntrack logging for consistency with other logging subsystem, from Patrick McHardy. * Remove unneeded includes from the recently added xt_connlabel support, from Florian Westphal. * Allow to update conntracks via nfqueue, don't need NFQA_CFG_F_CONNTRACK for this, from Florian Westphal. * Remove tproxy core, now that we have socket early demux, from Florian Westphal. * A couple of patches to refactor conntrack event reporting to save a good bunch of lines, from Florian Westphal. * Fix missing locking in NAT sequence adjustment, it did not manifested in any known bug so far, from Patrick McHardy. * Change sequence number adjustment variable to 32 bits, to delay the possible early overflow in long standing connections, also from Patrick. * Comestic cleanups for IPVS, from Dragos Foianu. * Fix possible null dereference in IPVS in the SH scheduler, from Daniel Borkmann. * Allow to attach conntrack expectations via nfqueue. Before this patch, you had to use ctnetlink instead, thus, we save the conntrack lookup. * Export xt_rpfilter and xt_HMARK header files, from Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skbFlorian Westphal2013-07-311-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The module was "permanent", due to the special tproxy skb->destructor. Nowadays we have tcp early demux and its sock_edemux destructor in networking core which can be used instead. Thanks to early demux changes the input path now also handles "skb->sk is tw socket" correctly, so this no longer needs the special handling introduced with commit d503b30bd648b3cb4e5f50b65d27e389960cc6d9 (netfilter: tproxy: do not assign timewait sockets to skb->sk). Thus: - move assign_sock function to where its needed - don't prevent timewait sockets from being assigned to the skb - remove nf_tproxy_core. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-08-163-6/+4
|\ \ \ | | |/ | |/|
| * | Merge branch 'i2c/for-current' of ↵Linus Torvalds2013-08-111-1/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some driver bugfixes for the I2C subsystem" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mv64xxx: Document the newly introduced allwinner compatible i2c: Fix Kontron PLD prescaler calculation i2c: i2c-mxs: Use DMA mode even for small transfers
| | * | i2c: mv64xxx: Document the newly introduced allwinner compatibleMaxime Ripard2013-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | Merge branch 'v4l_for_linus' of ↵Linus Torvalds2013-08-091-2/+2
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some driver fixes (em28xx, coda, usbtv, s5p, hdpvr and ml86v7667) and a fix for media DocBook" * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] em28xx: fix assignment of the eeprom data [media] hdpvr: fix iteration over uninitialized lists in hdpvr_probe() [media] usbtv: fix dependency [media] usbtv: Throw corrupted frames away [media] usbtv: Fix deinterlacing [media] v4l2: added missing mutex.h include to v4l2-ctrls.h [media] DocBook: upgrade media_api DocBook version to 4.2 [media] ml86v7667: fix compile warning: 'ret' set but not used [media] s5p-g2d: Fix registration failure [media] media: coda: Fix DT driver data pointer for i.MX27 [media] s5p-mfc: Fix input/output format reporting
| | * | | [media] DocBook: upgrade media_api DocBook version to 4.2Andrzej Hajda2013-07-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the last three errors of media_api DocBook validatation: (...) media_api.xml:414: element imagedata: validity error : Value "SVG" for attribute format of imagedata is not among the enumerated set media_api.xml:432: element imagedata: validity error : Value "SVG" for attribute format of imagedata is not among the enumerated set media_api.xml:452: element imagedata: validity error : Value "SVG" for attribute format of imagedata is not among the enumerated set (...) Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
| * | | | Merge tag 'regulator-v3.11-rc4' of ↵Linus Torvalds2013-08-081-3/+1
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator DT binding fixes from Mark Brown: "A couple of fixes to bring the DT binding documentation for Palmas into sync with the code" * tag 'regulator-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: palmas-pmic: doc: remove ti,tstep regulator: palmas-pmic: doc: fix typo for sleep-mode
| | * | | regulator: palmas-pmic: doc: remove ti,tstepNishanth Menon2013-07-171-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 28d1e8cd671a53d6b4f967abbbc2a55f7bd333f6 (regulator: palma: add ramp delay support through regulator constraints) Removed the regulator's ti,step option from driver without updating the documentation. So, remove from documentation and example as well. Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Mark Brown <broonie@linaro.org>
| | * | | regulator: palmas-pmic: doc: fix typo for sleep-modeNishanth Menon2013-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3c870e3f9d9d98f1ab98614b3b1fd5c79287d361 (regulator: palmas: Change the DT node property names to follow the convention) Missed updating mode-sleep from sleep-mode. Fix the same. Documentation example seems proper for this property. Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Mark Brown <broonie@linaro.org>
* | | | | ipv6: make unsolicited report intervals configurable for mldHannes Frederic Sowa2013-08-131-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit cab70040dfd95ee32144f02fade64f0cb94f31a0 ("net: igmp: Reduce Unsolicited report interval to 1s when using IGMPv3") and 2690048c01f32bf45d1c1e1ab3079bc10ad2aea7 ("net: igmp: Allow user-space configuration of igmp unsolicited report interval") by William Manley made igmp unsolicited report intervals configurable per interface and corrected the interval of unsolicited igmpv3 report messages resendings to 1s. Same needs to be done for IPv6: MLDv1 (RFC2710 7.10.): 10 seconds MLDv2 (RFC3810 9.11.): 1 second Both intervals are configurable via new procfs knobs mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval. (also added .force_mld_version to ipv6_devconf_dflt to bring structs in line without semantic changes) v2: a) Joined documentation update for IPv4 and IPv6 MLD/IGMP unsolicited_report_interval procfs knobs. b) incorporate stylistic feedback from William Manley v3: a) add new DEVCONF_* values to the end of the enum (thanks to David Miller) Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: William Manley <william.manley@youview.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: Add MOXA ART SoCs ethernet driverJonas Jensen2013-08-111-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MOXA UC-711X hardware(s) has an ethernet controller that seem to be developed internally. The IC used is "RTL8201CP". Since there is no public documentation, this driver is mostly the one published by MOXA that has been heavily cleaned up / ported from linux 2.6.9. Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Revert "net: sctp: convert sctp_checksum_disable module param into sctp sysctl"David S. Miller2013-08-091-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit cda5f98e36576596b9230483ec52bff3cc97eb21. As per Vlad's request. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: sctp: convert sctp_checksum_disable module param into sctp sysctlDaniel Borkmann2013-08-091-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the last module parameter for SCTP and make this configurable via sysctl for SCTP like all the rest of SCTP's configuration knobs. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-08-038-38/+79
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | Merge net into net-next to setup some infrastructure Eric Dumazet needs for usbnet changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-08-031-2/+2
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Don't ignore user initiated wireless regulatory settings on cards with custom regulatory domains, from Arik Nemtsov. 2) Fix length check of bluetooth information responses, from Jaganath Kanakkassery. 3) Fix misuse of PTR_ERR in btusb, from Adam Lee. 4) Handle rfkill properly while iwlwifi devices are offline, from Emmanuel Grumbach. 5) Fix r815x devices DMA'ing to stack buffers, from Hayes Wang. 6) Kernel info leak in ATM packet scheduler, from Dan Carpenter. 7) 8139cp doesn't check for DMA mapping errors, from Neil Horman. 8) Fix bridge multicast code to not snoop when no querier exists, otherwise mutlicast traffic is lost. From Linus Lüssing. 9) Avoid soft lockups in fib6_run_gc(), from Michal Kubecek. 10) Fix races in automatic address asignment on ipv6, which can result in incorrect lifetime assignments. From Jiri Benc. 11) Cure build bustage when CONFIG_NET_LL_RX_POLL is not set and rename it CONFIG_NET_RX_BUSY_POLL to eliminate the last reference to the original naming of this feature. From Cong Wang. 12) Fix crash in TIPC when server socket creation fails, from Ying Xue. 13) macvlan_changelink() silently succeeds when it shouldn't, from Michael S Tsirkin. 14) HTB packet scheduler can crash due to sign extension, fix from Stephen Hemminger. 15) With the cable unplugged, r8169 prints out a message every 10 seconds, make it netif_dbg() instead of netif_warn(). From Peter Wu. 16) Fix memory leak in rtm_to_ifaddr(), from Daniel Borkmann. 17) sis900 gets spurious TX queue timeouts due to mismanagement of link carrier state, from Denis Kirjanov. 18) Validate somaxconn sysctl to make sure it fits inside of a u16. From Roman Gushchin. 19) Fix MAC address filtering on qlcnic, from Shahed Shaikh. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits) qlcnic: Fix for flash update failure on 83xx adapter qlcnic: Fix link speed and duplex display for 83xx adapter qlcnic: Fix link speed display for 82xx adapter qlcnic: Fix external loopback test. qlcnic: Removed adapter series name from warning messages. qlcnic: Free up memory in error path. qlcnic: Fix ingress MAC learning qlcnic: Fix MAC address filter issue on 82xx adapter net: ethernet: davinci_emac: drop IRQF_DISABLED netlabel: use domain based selectors when address based selectors are not available net: check net.core.somaxconn sysctl values sis900: Fix the tx queue timeout issue net: rtm_to_ifaddr: free ifa if ifa_cacheinfo processing fails r8169: remove "PHY reset until link up" log spam net: ethernet: cpsw: drop IRQF_DISABLED htb: fix sign extension bug macvlan: handle set_promiscuity failures macvlan: better mode validation tipc: fix oops when creating server socket fails net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL ...
| | * | | | net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLLCong Wang2013-08-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eliezer renames several *ll_poll to *busy_poll, but forgets CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too. Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | printk: move to separate directory for easier modificationJoe Perches2013-07-311-1/+1
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it easier to break up printk into bite-sized chunks. Remove printk path/filename from comment. Signed-off-by: Joe Perches <joe@perches.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ming Lei <ming.lei@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | Merge tag 'fixes-for-linus' of ↵Linus Torvalds2013-07-261-0/+1
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "This is a largeish batch of fixes, mostly because I missed -rc2 due to travel/vacation. So in number these are a bit more than ideal unless you amortize them over two -rcs. Quick breakdown: - Defconfig updates - Making multi_v7_defconfig useful on more hardware to encourage single-image usage - Davinci and nomadik updates due to new code merged this merge window - Fixes for UART on Samsung platforms, both PM and clock-related - A handful of warning fixes from defconfig builds, including for max8925 backlight and pxamci (both with appropriate acks) - Exynos5440 fixes for LPAE configuration, PM - ...plus a bunch of other smaller changes all over the place I expect to switch to regressions-or-severe-bugs-only fixes from here on out" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (37 commits) mfd: max8925: fix dt code for backlight ARM: omap5: Only select errata 798181 if SMP ARM: EXYNOS: Update CONFIG_ARCH_NR_GPIO for Exynos ARM: EXYNOS: Fix low level debug support ARM: SAMSUNG: Save/restore only selected uart's registers ARM: SAMSUNG: Add SAMSUNG_PM config option to select pm ARM: S3C24XX: Add missing clkdev entries for s3c2440 UART ARM: multi_v7_defconfig: Select USB chipidea driver ARM: pxa: propagate errors from regulator_enable() to pxamci ARM: zynq: fix compilation warning ARM: keystone: fix compilation warning ARM: highbank: Only touch common coherency control register fields ARM: footbridge: fix overlapping PCI mappings dmaengine: shdma: fix a build failure on platforms with no DMA support ARM: STi: Set correct ARM ERRATAs. ARM: dts: STi: Fix pinconf setup for STiH416 serial2 ARM: nomadik: configure for NO_HZ and HRTIMERS ARM: nomadik: update defconfig base ARM: nomadik: Update MMC defconfigs ARM: davinci: defconfig: enable EDMA driver ...
| | * \ \ \ Merge tag 'imx-fixes-3.11' of git://git.linaro.org/people/shawnguo/linux-2.6 ↵Olof Johansson2013-07-221-0/+1
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into fixes From Shawn Guo, imx fixes for 3.11: - A few device tree source fixes regarding pinctrl, clock, and pwm backlight. - Fixes imx28 and imx51 audio driver failure caused by sgtl5000 codec driver change by supplying the correct clock for codec. - imx6q emi_sel clock muxing and imx6q-iomuxc-gpr macro fixes * tag 'imx-fixes-3.11' of git://git.linaro.org/people/shawnguo/linux-2.6: ARM: dts: imx51-babbage: Pass a real clock to the codec ARM i.MX53: mba53: Fix PWM backlight DT node ARM: imx: fix vf610 enet module clock selection ARM: mxs: saif0 is the clock provider to sgtl5000 ARM: i.MX6Q: correct emi_sel clock muxing ARM i.MX6Q: Fix IOMUXC GPR1 defines for ENET_CLK_SEL and IPU1/2_MUX ARM: i.MX27: Typo fix ARM: imx27: Fix documentation for SPLL clock ARM i.MX53: Fix UART pad configuration
| | | * | | | ARM: imx27: Fix documentation for SPLL clockMarkus Pargmann2013-07-151-0/+1
| | | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | spll_gate was added with commit b7eed2076183994dbda2c19bc7fba99b65a135e3 "ARM: imx27: add a clock gate to activate SPLL clock". spll_gate is missing in the devicetree clock documentation for imx27. This patch adds it to the list of clocks in the documentation. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
| * | | | | Merge tag 'char-misc-3.11-rc3' of ↵Linus Torvalds2013-07-261-22/+22
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc patches from Greg KH: "Here are some char/misc patches for 3.11-rc3. It's pretty much just: - mei fixes - hyperv fixes - new ja_JP translation update all tiny stuff, but fixes for issues people have reported." * tag 'char-misc-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: HOWTO ja_JP sync mei: me: fix waiting for hw ready mei: don't have to clean the state on power up mei: me: fix reset state machine mei: hbm: fix typo in error message Tools: hv: KVP: Fix a bug in IPV6 subnet enumeration Drivers: hv: balloon: Do not post pressure status if interrupted Drivers: hv: balloon: Fix a bug in the hot-add code Drivers: hv: vmbus: incorrect device name is printed when child device is unregistered
| | * | | | | HOWTO ja_JP syncTsugikazu Shibata2013-07-241-22/+22
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attached is Documentation/ja_JP/HOWTO sync patch for 3.10. This patch was reviewed by Japanese translation community called JF. Signed-off-by: Tsugikazu Shibata <tshibata@ab.jp.nec.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | | Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds2013-07-231-0/+2
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull device tree bug fixes and maintainership updates from Grant Likely: "This branch contains a couple of minor bug fixes and documentation additions, but the bulk of it are several changes to the MAINTAINERS file regarding the subsystems I've been involved with" * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: of/irq: init struct resource to 0 in of_irq_to_resource() of/irq: Avoid calling list_first_entry() for empty list of: add vendor prefixes for hisilicon of: add vendor prefix for Qualcomm Atheros, Inc. MAINTAINERS: Fix incorrect status tag MAINTAINERS: Refactor device tree maintainership MAINTAINERS: Change device tree mailing list MAINTAINERS: Remove Grant Likely
| | * | | | | of: add vendor prefixes for hisiliconZhangfei Gao2013-07-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Grant Likely <grant.likely@linaro.org>
| | * | | | | of: add vendor prefix for Qualcomm Atheros, Inc.Gabor Juhos2013-07-221-0/+1
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This prefix will be used in various compatible properties for the devices from Qualcomm Atheros, Inc. Cc: Luis R. Rodriguez <rodrigue@qca.qualcomm.com> Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Grant Likely <grant.likely@linaro.org>
| * | | | | Merge branch 'for-3.11/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2013-07-223-13/+51
| |\ \ \ \ \ | | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block IO driver bits from Jens Axboe: "As I mentioned in the core block pull request, due to real life circumstances the driver pull request would be late. Now it looks like -rc2 late... On the plus side, apart form the rsxx update, these are all things that I could argue could go in later in the cycle as they are fixes and not features. So even though things are late, it's not ALL bad. The pull request contains: - Updates to bcache, all bug fixes, from Kent. - A pile of drbd bug fixes (no big features this time!). - xen blk front/back fixes. - rsxx driver updates, some of them deferred form 3.10. So should be well cooked by now" * 'for-3.11/drivers' of git://git.kernel.dk/linux-block: (63 commits) bcache: Allocation kthread fixes bcache: Fix GC_SECTORS_USED() calculation bcache: Journal replay fix bcache: Shutdown fix bcache: Fix a sysfs splat on shutdown bcache: Advertise that flushes are supported bcache: check for allocation failures bcache: Fix a dumb race bcache: Use standard utility code bcache: Update email address bcache: Delete fuzz tester bcache: Document shrinker reserve better bcache: FUA fixes drbd: Allow online change of al-stripes and al-stripe-size drbd: Constants should be UPPERCASE drbd: Ignore the exit code of a fence-peer handler if it returns too late drbd: Fix rcu_read_lock balance on error path drbd: fix error return code in drbd_init() drbd: Do not sleep inside rcu bcache: Refresh usage docs ...
| | * | | | Merge branch 'bcache-for-3.11' of git://evilpiepirate.org/~kent/linux-bcache ↵Jens Axboe2013-07-021-18/+29
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | into for-3.11/drivers
| | | * | | | bcache: Refresh usage docsGabriel de Perthuis2013-06-261-13/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mention udev autoregistration, symlinks. Write down some sysfs paths. Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
| | | * | | | doc: Fix typo in documentation/bcache.txtMasanari Iida2013-06-261-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct spelling typo in documentation/bcache.txt Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
| | * | | | | Merge tag 'v3.10-rc7' into for-3.11/driversJens Axboe2013-07-0226-74/+489
| | |\ \ \ \ \ | | | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.10-rc7 Pull this in early to avoid doing it with the bcache merge, since there are a number of changes to bcache between my old base (3.10-rc1) and the new pull request.
| | * | | | | Merge branch 'stable/for-jens-3.10' of ↵Jens Axboe2013-06-282-0/+27
| | |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.11/drivers Konrad writes: It has the 'feature-max-indirect-segments' implemented in both backend and frontend. The current problem with the backend and frontend is that the segment size is limited to 11 pages. It means we can at most squeeze in 44kB per request. The ring can hold 32 (next power of two below 36) requests, meaning we can do 1.4M of outstanding requests. Nowadays that is not enough. The problem in the past was addressed in two ways - but neither one went upstream. The first solution to this proposed by Justin from Spectralogic was to negotiate the segment size. This means that the ‘struct blkif_sring_entry’ is now a variable size. It can expand from 112 bytes (cover 11 pages of data - 44kB) to 1580 bytes (256 pages of data - so 1MB). It is a simple extension by just making the array in the request expand from 11 to a variable size negotiated. But it had limits: this extension still limits the number of segments per request to 255 (as the total number must be specified in the request, which only has an 8-bit field for that purpose). The other solution (from Intel - Ronghui) was to create one extra ring that only has the ‘struct blkif_request_segment’ in them. The ‘struct blkif_request’ would be changed to have an index in said ‘segment ring’. There is only one segment ring. This means that the size of the initial ring is still the same. The requests would point to the segment and enumerate out how many of the indexes it wants to use. The limit is of course the size of the segment. If one assumes a one-page segment this means we can in one request cover ~4MB. Those patches were posted as RFC and the author never followed up on the ideas on changing it to be a bit more flexible. There is yet another mechanism that could be employed  (which these patches implement) - and it borrows from VirtIO protocol. And that is the ‘indirect descriptors’. This very similar to what Intel suggests, but with a twist. The twist is to negotiate how many of these 'segment' pages (aka indirect descriptor pages) we want to support (in reality we negotiate how many entries in the segment we want to cover, and we module the number if it is bigger than the segment size). This means that with the existing 36 slots in the ring (single page) we can cover: 32 slots * each blkif_request_indirect covers: 512 * 4096 ~= 64M. Since we ample space in the blkif_request_indirect to span more than one indirect page, that number (64M) can be also multiplied by eight = 512MB. Roger Pau Monne took the idea and implemented them in these patches. They work great and the corner cases (migration between backends with and without this extension) work nicely. The backend has a limit right now off how many indirect entries it can handle: one indirect page, and at maximum 256 entries (out of 512 - so 50% of the page is used). That comes out to 32 slots * 256 entries in a indirect page * 1 indirect page per request * 4096 = 32MB. This is a conservative number that can change in the future. Right now it strikes a good balance between giving excellent performance, memory usage in the backend, and balancing the needs of many guests. In the patchset there is also the split of the blkback structure to be per-VBD. This means that the spinlock contention we had with many guests trying to do I/O and all the blkback threads hitting the same lock has been eliminated. Also there are bug-fixes to deal with oddly sized sectors, insane amounts on th ring, and also a security fix (posted earlier).
| | | * | | | | xen-blkback/sysfs: Move the parameters for the persistent grant featuresKonrad Rzeszutek Wilk2013-06-042-18/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to a testing subdirectory (as this value should not be baked for the life-time) and also in an appropiate file. Also modified the introduction Linux version from 3.10 to 3.11. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | | * | | | | xen-blkfront: Introduce a 'max' module parameter to alter the amount of ↵Konrad Rzeszutek Wilk2013-06-041-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | indirect segments. The max module parameter (by default 32) is the maximum number of segments that the frontend will negotiate with the backend for indirect descriptors. Higher value means more potential throughput but more memory usage. The backend picks the minimum of the frontend and its default backend value. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | | * | | | | xen-blkback: implement LRU mechanism for persistent grantsRoger Pau Monne2013-04-181-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This mechanism allows blkback to change the number of grants persistently mapped at run time. The algorithm uses a simple LRU mechanism that removes (if needed) the persistent grants that have not been used since the last LRU run, or if all grants have been used it removes the first grants in the list (that are not in use). The algorithm allows the user to change the maximum number of persistent grants, by changing max_persistent_grants in sysfs. Since we are storing the persistent grants used inside the request struct (to be able to mark them as "unused" when unmapping), we no longer need the bitmap (unmap_seg). Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | | * | | | | xen-blkback: use balloon pages for all mappingsRoger Pau Monne2013-04-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using balloon pages for all granted pages allows us to simplify the logic in blkback, especially in the xen_blkbk_map function, since now we can decide if we want to map a grant persistently or not after we have actually mapped it. This could not be done before because persistent grants used ballooned pages, whereas non-persistent grants used pages from the kernel. This patch also introduces several changes, the first one is that the list of free pages is no longer global, now each blkback instance has it's own list of free pages that can be used to map grants. Also, a run time parameter (max_buffer_pages) has been added in order to tune the maximum number of free pages each blkback instance will keep in it's buffer. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: xen-devel@lists.xen.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | | | | | | Documentation: add networking/netdev-FAQ.txtPaul Gortmaker2013-07-312-0/+226
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A collection of expectations and operational details about how networking development takes place in the context of the netdev mailing list. The content is meant to capture specific items that are unique to netdev workflow, and not re-document generic linux expectations that are already captured elsewhere. This was originally proposed[1] as a regular posting mailing list FAQ, but it probably is more universally accessible here in tree. [1] https://lwn.net/Articles/559211/ Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | tcp: add tcp_syncookies mode to allow unconditionally generation of syncookiesHannes Frederic Sowa2013-07-301-0/+4
| |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If you want to test which effects syncookies have to your | network connections you can set this knob to 2 to enable | unconditionally generation of syncookies. Original idea and first implementation by Eric Dumazet. Cc: Florian Westphal <fw@strlen.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | tcp: TCP_NOTSENT_LOWAT socket optionEric Dumazet2013-07-241-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Idea of this patch is to add optional limitation of number of unsent bytes in TCP sockets, to reduce usage of kernel memory. TCP receiver might announce a big window, and TCP sender autotuning might allow a large amount of bytes in write queue, but this has little performance impact if a large part of this buffering is wasted : Write queue needs to be large only to deal with large BDP, not necessarily to cope with scheduling delays (incoming ACKS make room for the application to queue more bytes) For most workloads, using a value of 128 KB or less is OK to give applications enough time to react to POLLOUT events in time (or being awaken in a blocking sendmsg()) This patch adds two ways to set the limit : 1) Per socket option TCP_NOTSENT_LOWAT 2) A sysctl (/proc/sys/net/ipv4/tcp_notsent_lowat) for sockets not using TCP_NOTSENT_LOWAT socket option (or setting a zero value) Default value being UINT_MAX (0xFFFFFFFF), meaning this has no effect. This changes poll()/select()/epoll() to report POLLOUT only if number of unsent bytes is below tp->nosent_lowat Note this might increase number of sendmsg()/sendfile() calls when using non blocking sockets, and increase number of context switches for blocking sockets. Note this is not related to SO_SNDLOWAT (as SO_SNDLOWAT is defined as : Specify the minimum number of bytes in the buffer until the socket layer will pass the data to the protocol) Tested: netperf sessions, and watching /proc/net/protocols "memory" column for TCP With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458) lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols TCPv6 1880 2 45458 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y TCP 1696 508 45458 no 208 yes kernel y y y y y y y y y y y y y n y y y y y lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols TCPv6 1880 2 20567 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y TCP 1696 508 20567 no 208 yes kernel y y y y y y y y y y y y y n y y y y y Using 128KB has no bad effect on the throughput or cpu usage of a single flow, although there is an increase of context switches. A bonus is that we hold socket lock for a shorter amount of time and should improve latencies of ACK processing. lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3 OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf. Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand Size Size Size (sec) Util Util Util Util Demand Demand Units Final Final % Method % Method 1651584 6291456 16384 20.00 17447.90 10^6bits/s 3.13 S -1.00 U 0.353 -1.000 usec/KB Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3': 412,514 context-switches 200.034645535 seconds time elapsed lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3 OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf. Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand Size Size Size (sec) Util Util Util Util Demand Demand Units Final Final % Method % Method 1593240 6291456 16384 20.00 17321.16 10^6bits/s 3.35 S -1.00 U 0.381 -1.000 usec/KB Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3': 2,675,818 context-switches 200.029651391 seconds time elapsed Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-By: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | net: sctp: trivial: update mailing list addressDaniel Borkmann2013-07-241-4/+1
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SCTP mailing list address to send patches or questions to is linux-sctp@vger.kernel.org and not lksctp-developers@lists.sourceforge.net anymore. Therefore, update all occurences. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | kernel: delete __cpuinit usage from all core kernel filesPaul Gortmaker2013-07-141-3/+3
| |_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the uses of the __cpuinit macros from C files in the core kernel directories (kernel, init, lib, mm, and include) that don't really have a specific maintainer. [1] https://lkml.org/lkml/2013/5/20/589 Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
OpenPOWER on IntegriCloud