summaryrefslogtreecommitdiffstats
path: root/include/net
Commit message (Collapse)AuthorAgeFilesLines
* vfs: do bulk POLL* -> EPOLL* replacementLinus Torvalds2018-02-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2018-02-091-0/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2018-02-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Two fixes for BPF sockmap in order to break up circular map references from programs attached to sockmap, and detaching related sockets in case of socket close() event. For the latter we get rid of the smap_state_change() and plug into ULP infrastructure, which will later also be used for additional features anyway such as TX hooks. For the second issue, dependency chain is broken up via map release callback to free parse/verdict programs, all from John. 2) Fix a libbpf relocation issue that was found while implementing XDP support for Suricata project. Issue was that when clang was invoked with default target instead of bpf target, then various other e.g. debugging relevant sections are added to the ELF file that contained relocation entries pointing to non-BPF related sections which libbpf trips over instead of skipping them. Test cases for libbpf are added as well, from Jesper. 3) Various misc fixes for bpftool and one for libbpf: a small addition to libbpf to make sure it recognizes all standard section prefixes. Then, the Makefile in bpftool/Documentation is improved to explicitly check for rst2man being installed on the system as we otherwise risk installing empty man pages; the man page for bpftool-map is corrected and a set of missing bash completions added in order to avoid shipping bpftool where the completions are only partially working, from Quentin. 4) Fix applying the relocation to immediate load instructions in the nfp JIT which were missing a shift, from Jakub. 5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL in this case; and explicitly delete the veth devices in the two tests test_xdp_{meta,redirect}.sh before dismantling the netnses as when selftests are run in batch mode, then workqueue to handle destruction might not have finished yet and thus veth creation in next test under same dev name would fail, from Yonghong. 6) Fix test_kmod.sh to check the test_bpf.ko module path before performing an insmod, and fallback to modprobe. Especially the latter is useful when having a device under test that has the modules installed instead, from Naresh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf: sockmap, add sock close() hook to remove socksJohn Fastabend2018-02-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The selftests test_maps program was leaving dangling BPF sockmap programs around because not all psock elements were removed from the map. The elements in turn hold a reference on the BPF program they are attached to causing BPF programs to stay open even after test_maps has completed. The original intent was that sk_state_change() would be called when TCP socks went through TCP_CLOSE state. However, because socks may be in SOCK_DEAD state or the sock may be a listening socket the event is not always triggered. To resolve this use the ULP infrastructure and register our own proto close() handler. This fixes the above case. Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Reported-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * net: add a UID to use for ULP socket assignmentJohn Fastabend2018-02-061-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a UID field and enum that can be used to assign ULPs to sockets. This saves a set of string comparisons if the ULP id is known. For sockmap, which is added in the next patches, a ULP is used to hook into TCP sockets close state. In this case the ULP being added is done at map insert time and the ULP is known and done on the kernel side. In this case the named lookup is not needed. Because we don't want to expose psock internals to user space socket options a user visible flag is also added. For TLS this is set for BPF it will be cleared. Alos remove pr_notice, user gets an error code back and should check that rather than rely on logs. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2018-02-072-6/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for you net tree, they are: 1) Restore __GFP_NORETRY in xt_table allocations to mitigate effects of large memory allocation requests, from Michal Hocko. 2) Release IPv6 fragment queue in case of error in fragmentation header, this is a follow up to amend patch 83f1999caeb1, from Subash Abhinov Kasiviswanathan. 3) Flowtable infrastructure depends on NETFILTER_INGRESS as it registers a hook for each flowtable, reported by John Crispin. 4) Missing initialization of info->priv in xt_cgroup version 1, from Cong Wang. 5) Give a chance to garbage collector to run after scheduling flowtable cleanup. 6) Releasing flowtable content on nft_flow_offload module removal is not required at all, there is not dependencies between this module and flowtables, remove it. 7) Fix missing xt_rateest_mutex grabbing for hash insertions, also from Cong Wang. 8) Move nf_flow_table_cleanup() routine to flowtable core, this patch is a dependency for the next patch in this list. 9) Flowtable resources are not properly released on removal from the control plane. Fix this resource leak by scheduling removal of all entries and explicit call to the garbage collector. 10) nf_ct_nat_offset() declaration is dead code, this function prototype is not used anywhere, remove it. From Taehee Yoo. 11) Fix another flowtable resource leak on entry insertion failures, this patch also fixes a possible use-after-free. Patch from Felix Fietkau. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netfilter: nf_flow_offload: fix use-after-free and a resource leakFelix Fietkau2018-02-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flow_offload_del frees the flow, so all associated resource must be freed before. Since the ct entry in struct flow_offload_entry was allocated by flow_offload_alloc, it should be freed by flow_offload_free to take care of the error handling path when flow_offload_add fails. While at it, make flow_offload_del static, since it should never be called directly, only from the gc step Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: remove useless prototypeTaehee Yoo2018-02-071-5/+0
| | | | | | | | | | | | | | | | | | prototype nf_ct_nat_offset is not used anymore. Signed-off-by: Taehee Yoo <ap420073@gmail.com>
| * | netfilter: nf_tables: fix flowtable freePablo Neira Ayuso2018-02-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every flow_offload entry is added into the table twice. Because of this, rhashtable_free_and_destroy can't be used, since it would call kfree for each flow_offload object twice. This patch cleans up the flowtable via nf_flow_table_iterate() to schedule removal of entries by setting on the dying bit, then there is an explicitly invocation of the garbage collector to release resources. Based on patch from Felix Fietkau. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: nft_flow_offload: move flowtable cleanup routines to nf_flow_tablePablo Neira Ayuso2018-02-071-0/+3
| | | | | | | | | | | | | | | | | | | | | Move the flowtable cleanup routines to nf_flow_table and expose the nf_flow_table_cleanup() helper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | | net: erspan: fix metadata extractionWilliam Tu2018-02-061-13/+13
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d350a823020e ("net: erspan: create erspan metadata uapi header") moves the erspan 'version' in front of the 'struct erspan_md2' for later extensibility reason. This breaks the existing erspan metadata extraction code because the erspan_md2 then has a 4-byte offset to between the erspan_metadata and erspan_base_hdr. This patch fixes it. Fixes: 1a66a836da63 ("gre: add collect_md mode to ERSPAN tunnel") Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode") Fixes: 1d7e2ed22f8d ("net: erspan: refactor existing erspan code") Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'usercopy-v4.16-rc1' of ↵Linus Torvalds2018-02-032-2/+9
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardened usercopy whitelisting from Kees Cook: "Currently, hardened usercopy performs dynamic bounds checking on slab cache objects. This is good, but still leaves a lot of kernel memory available to be copied to/from userspace in the face of bugs. To further restrict what memory is available for copying, this creates a way to whitelist specific areas of a given slab cache object for copying to/from userspace, allowing much finer granularity of access control. Slab caches that are never exposed to userspace can declare no whitelist for their objects, thereby keeping them unavailable to userspace via dynamic copy operations. (Note, an implicit form of whitelisting is the use of constant sizes in usercopy operations and get_user()/put_user(); these bypass all hardened usercopy checks since these sizes cannot change at runtime.) This new check is WARN-by-default, so any mistakes can be found over the next several releases without breaking anyone's system. The series has roughly the following sections: - remove %p and improve reporting with offset - prepare infrastructure and whitelist kmalloc - update VFS subsystem with whitelists - update SCSI subsystem with whitelists - update network subsystem with whitelists - update process memory with whitelists - update per-architecture thread_struct with whitelists - update KVM with whitelists and fix ioctl bug - mark all other allocations as not whitelisted - update lkdtm for more sensible test overage" * tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits) lkdtm: Update usercopy tests for whitelisting usercopy: Restrict non-usercopy caches to size 0 kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl kvm: whitelist struct kvm_vcpu_arch arm: Implement thread_struct whitelist for hardened usercopy arm64: Implement thread_struct whitelist for hardened usercopy x86: Implement thread_struct whitelist for hardened usercopy fork: Provide usercopy whitelisting for task_struct fork: Define usercopy region in thread_stack slab caches fork: Define usercopy region in mm_struct slab caches net: Restrict unwhitelisted proto caches to size 0 sctp: Copy struct sctp_sock.autoclose to userspace using put_user() sctp: Define usercopy region in SCTP proto slab cache caif: Define usercopy region in caif proto slab cache ip: Define usercopy region in IP proto slab cache net: Define usercopy region in struct proto slab cache scsi: Define usercopy region in scsi_sense_cache slab cache cifs: Define usercopy region in cifs_request slab cache vxfs: Define usercopy region in vxfs_inode slab cache ufs: Define usercopy region in ufs_inode_cache slab cache ...
| * sctp: Define usercopy region in SCTP proto slab cacheDavid Windsor2018-01-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SCTP socket event notification subscription information need to be copied to/from userspace. In support of usercopy hardening, this patch defines a region in the struct proto slab cache in which userspace copy operations are allowed. Additionally moves the usercopy fields to be adjacent for the region to cover both. example usage trace: net/sctp/socket.c: sctp_getsockopt_events(...): ... copy_to_user(..., &sctp_sk(sk)->subscribe, len) sctp_setsockopt_events(...): ... copy_from_user(&sctp_sk(sk)->subscribe, ..., optlen) sctp_getsockopt_initmsg(...): ... copy_to_user(..., &sctp_sk(sk)->initmsg, len) This region is known as the slab cache's usercopy region. Slab caches can now check that each dynamically sized copy operation involving cache-managed memory falls entirely within the slab's usercopy region. This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Signed-off-by: David Windsor <dave@nullcore.net> [kees: split from network patch, move struct members adjacent] [kees: add SCTPv6 struct whitelist, provide usage trace] Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-sctp@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * net: Define usercopy region in struct proto slab cacheDavid Windsor2018-01-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In support of usercopy hardening, this patch defines a region in the struct proto slab cache in which userspace copy operations are allowed. Some protocols need to copy objects to/from userspace, and they can declare the region via their proto structure with the new usersize and useroffset fields. Initially, if no region is specified (usersize == 0), the entire field is marked as whitelisted. This allows protocols to be whitelisted in subsequent patches. Once all protocols have been annotated, the full-whitelist default can be removed. This region is known as the slab cache's usercopy region. Slab caches can now check that each dynamically sized copy operation involving cache-managed memory falls entirely within the slab's usercopy region. This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Signed-off-by: David Windsor <dave@nullcore.net> [kees: adjust commit log, split off per-proto patches] [kees: add logic for by-default full-whitelist] Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2018-01-3158-362/+1459
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Significantly shrink the core networking routing structures. Result of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf 2) Add netdevsim driver for testing various offloads, from Jakub Kicinski. 3) Support cross-chip FDB operations in DSA, from Vivien Didelot. 4) Add a 2nd listener hash table for TCP, similar to what was done for UDP. From Martin KaFai Lau. 5) Add eBPF based queue selection to tun, from Jason Wang. 6) Lockless qdisc support, from John Fastabend. 7) SCTP stream interleave support, from Xin Long. 8) Smoother TCP receive autotuning, from Eric Dumazet. 9) Lots of erspan tunneling enhancements, from William Tu. 10) Add true function call support to BPF, from Alexei Starovoitov. 11) Add explicit support for GRO HW offloading, from Michael Chan. 12) Support extack generation in more netlink subsystems. From Alexander Aring, Quentin Monnet, and Jakub Kicinski. 13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From Russell King. 14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso. 15) Many improvements and simplifications to the NFP driver bpf JIT, from Jakub Kicinski. 16) Support for ipv6 non-equal cost multipath routing, from Ido Schimmel. 17) Add resource abstration to devlink, from Arkadi Sharshevsky. 18) Packet scheduler classifier shared filter block support, from Jiri Pirko. 19) Avoid locking in act_csum, from Davide Caratti. 20) devinet_ioctl() simplifications from Al viro. 21) More TCP bpf improvements from Lawrence Brakmo. 22) Add support for onlink ipv6 route flag, similar to ipv4, from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits) tls: Add support for encryption using async offload accelerator ip6mr: fix stale iterator net/sched: kconfig: Remove blank help texts openvswitch: meter: Use 64-bit arithmetic instead of 32-bit tcp_nv: fix potential integer overflow in tcpnv_acked r8169: fix RTL8168EP take too long to complete driver initialization. qmi_wwan: Add support for Quectel EP06 rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK ipmr: Fix ptrdiff_t print formatting ibmvnic: Wait for device response when changing MAC qlcnic: fix deadlock bug tcp: release sk_frag.page in tcp_disconnect ipv4: Get the address of interface correctly. net_sched: gen_estimator: fix lockdep splat net: macb: Handle HRESP error net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring ipv6: addrconf: break critical section in addrconf_verify_rtnl() ipv6: change route cache aging logic i40e/i40evf: Update DESC_NEEDED value to reflect larger value bnxt_en: cleanup DIM work on device shutdown ...
| * | tls: Add support for encryption using async offload acceleratorVakul Garg2018-01-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Async crypto accelerators (e.g. drivers/crypto/caam) support offloading GCM operation. If they are enabled, crypto_aead_encrypt() return error code -EINPROGRESS. In this case tls_do_encryption() needs to wait on a completion till the time the response for crypto offload request is received. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net_sched: plug in qdisc ops change_tx_queue_lenCong Wang2018-01-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new qdisc ops ->change_tx_queue_len() so that each qdisc could decide how to implement this if it wants. Previously we simply read dev->tx_queue_len, after pfifo_fast switches to skb array, we need this API to resize the skb array when we change dev->tx_queue_len. To avoid handling race conditions with TX BH, we need to deactivate all TX queues before change the value and bring them back after we are done, this also makes implementation easier. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-01-292-0/+18
| |\ \ | | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-01-281-6/+36
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-26 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) A number of extensions to tcp-bpf, from Lawrence. - direct R or R/W access to many tcp_sock fields via bpf_sock_ops - passing up to 3 arguments to bpf_sock_ops functions - tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks - optionally calling bpf_sock_ops program when RTO fires - optionally calling bpf_sock_ops program when packet is retransmitted - optionally calling bpf_sock_ops program when TCP state changes - access to tclass and sk_txhash - new selftest 2) div/mod exception handling, from Daniel. One of the ugly leftovers from the early eBPF days is that div/mod operations based on registers have a hard-coded src_reg == 0 test in the interpreter as well as in JIT code generators that would return from the BPF program with exit code 0. This was basically adopted from cBPF interpreter for historical reasons. There are multiple reasons why this is very suboptimal and prone to bugs. To name one: the return code mapping for such abnormal program exit of 0 does not always match with a suitable program type's exit code mapping. For example, '0' in tc means action 'ok' where the packet gets passed further up the stack, which is just undesirable for such cases (e.g. when implementing policy) and also does not match with other program types. After considering _four_ different ways to address the problem, we adapt the same behavior as on some major archs like ARMv8: X div 0 results in 0, and X mod 0 results in X. aarch64 and aarch32 ISA do not generate any traps or otherwise aborts of program execution for unsigned divides. Given the options, it seems the most suitable from all of them, also since major archs have similar schemes in place. Given this is all in the realm of undefined behavior, we still have the option to adapt if deemed necessary. 3) sockmap sample refactoring, from John. 4) lpm map get_next_key fixes, from Yonghong. 5) test cleanups, from Alexei and Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | bpf: Support passing args to sock_ops bpf functionLawrence Brakmo2018-01-251-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support for passing up to 4 arguments to sock_ops bpf functions. It reusues the reply union, so the bpf_sock_ops structures are not increased in size. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | bpf: Add write access to tcp_sock and sock fieldsLawrence Brakmo2018-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a macro, SOCK_OPS_SET_FIELD, for writing to struct tcp_sock or struct sock fields. This required adding a new field "temp" to struct bpf_sock_ops_kern for temporary storage that is used by sock_ops_convert_ctx_access. It is used to store and recover the contents of a register, so the register can be used to store the address of the sk. Since we cannot overwrite the dst_reg because it contains the pointer to ctx, nor the src_reg since it contains the value we want to store, we need an extra register to contain the address of the sk. Also adds the macro SOCK_OPS_GET_OR_SET_FIELD that calls one of the GET or SET macros depending on the value of the TYPE field. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * | | | Merge branch 'master' of ↵David S. Miller2018-01-261-0/+12
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-01-26 One last patch for this development cycle: 1) Add ESN support for IPSec HW offload. From Yossef Efraim. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | xfrm: Add ESN support for IPSec HW offloadYossef Efraim2018-01-181-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds ESN support to IPsec device offload. Adding new xfrm device operation to synchronize device ESN. Signed-off-by: Yossef Efraim <yossefe@mellanox.com> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | | | net: erspan: create erspan metadata uapi headerWilliam Tu2018-01-251-30/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch adds a new uapi header file, erspan.h, and moves the 'struct erspan_metadata' from internal erspan.h to it. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: erspan: use bitfield instead of mask and offsetWilliam Tu2018-01-251-33/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally the erspan fields are defined as a group into a __be16 field, and use mask and offset to access each field. This is more costly due to calling ntohs/htons. The patch changes it to use bitfields. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | pkt_cls: add new tc cls helper to check offload flag and chain indexJakub Kicinski2018-01-251-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Very few (mlxsw) upstream drivers seem to allow offload of chains other than 0. Save driver developers typing and add a helper for checking both if ethtool's TC offload flag is on and if chain is 0. This helper will set the extack appropriately in both error cases. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | Merge branch 'rebased-net-ioctl' of ↵David S. Miller2018-01-242-3/+3
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | lift handling of SIOCIW... out of dev_ioctl()Al Viro2018-01-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | | ip_rt_ioctl(): take copyin to callerAl Viro2018-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-01-242-1/+2
| |\ \ \ \ \ \ | | |/ / / / / | |/| | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: erspan: fix use-after-freeWilliam Tu2018-01-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building the erspan header for either v1 or v2, the eth_hdr() does not point to the right inner packet's eth_hdr, causing kasan report use-after-free and slab-out-of-bouds read. The patch fixes the following syzkaller issues: [1] BUG: KASAN: slab-out-of-bounds in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735 [2] BUG: KASAN: slab-out-of-bounds in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698 [3] BUG: KASAN: use-after-free in erspan_xmit+0x22d4/0x2430 net/ipv4/ip_gre.c:735 [4] BUG: KASAN: use-after-free in erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698 [2] CPU: 0 PID: 3654 Comm: syzkaller377964 Not tainted 4.15.0-rc9+ #185 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:440 erspan_build_header+0x3bf/0x3d0 net/ipv4/ip_gre.c:698 erspan_xmit+0x3b8/0x13b0 net/ipv4/ip_gre.c:740 __netdev_start_xmit include/linux/netdevice.h:4042 [inline] netdev_start_xmit include/linux/netdevice.h:4051 [inline] packet_direct_xmit+0x315/0x6b0 net/packet/af_packet.c:266 packet_snd net/packet/af_packet.c:2943 [inline] packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2968 sock_sendmsg_nosec net/socket.c:638 [inline] sock_sendmsg+0xca/0x110 net/socket.c:648 SYSC_sendto+0x361/0x5c0 net/socket.c:1729 SyS_sendto+0x40/0x50 net/socket.c:1697 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129 RIP: 0023:0xf7fcfc79 RSP: 002b:00000000ffc6976c EFLAGS: 00000286 ORIG_RAX: 0000000000000171 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000020011000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020008000 RBP: 000000000000001c R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Fixes: f551c91de262 ("net: erspan: introduce erspan v2 for ip_gre") Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN") Reported-by: syzbot+9723f2d288e49b492cf0@syzkaller.appspotmail.com Reported-by: syzbot+f0ddeb2b032a8e1d9098@syzkaller.appspotmail.com Reported-by: syzbot+f14b3703cd8d7670203f@syzkaller.appspotmail.com Reported-by: syzbot+eefa384efad8d7997f20@syzkaller.appspotmail.com Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: remove tc_cls_common_offload_init_deprecated()Jakub Kicinski2018-01-241-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All users are now converted to tc_cls_common_offload_init(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | cls_bpf: remove gen_flags from bpf_offloadJakub Kicinski2018-01-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cls_bpf now guarantees that only device-bound programs are allowed with skip_sw. The drivers no longer pay attention to flags on filter load, therefore the bpf_offload member can be removed. If flags are needed again they should probably be added to struct tc_cls_common_offload instead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: prepare for reimplementation of tc_cls_common_offload_init()Jakub Kicinski2018-01-241-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the tc_cls_common_offload_init() helper function to tc_cls_common_offload_init_deprecated() and add a new implementation which also takes flags argument. We will only set extack if flags indicate that offload is forced (skip_sw) otherwise driver errors should be ignored, as they don't influence the overall filter installation. Note that we need the tc_skip_hw() helper for new version, therefore it is added later in the file. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: propagate extack to cls->destroy callbacksJakub Kicinski2018-01-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Propagate extack to cls->destroy callbacks when called from non-error paths. On error paths pass NULL to avoid overwriting the failure message. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net/sched: act_csum: don't use spinlock in the fast pathDavide Caratti2018-01-231-2/+14
| | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | use RCU instead of spin_{,unlock}_bh() to protect concurrent read/write on act_csum configuration, to reduce the effects of contention in the data path when multiple readers are present. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: sched: create tc_can_offload_extack() wrapperQuentin Monnet2018-01-221-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a wrapper around tc_can_offload() that takes an additional extack pointer argument in order to output an error message if TC offload is disabled on the device. In this way, the error message is handled by the core and can be the same for all drivers. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: sched: add extack support for offload via tc_cls_common_offloadQuentin Monnet2018-01-221-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add extack support for hardware offload of classifiers. In order to achieve this, a pointer to a struct netlink_ext_ack is added to the struct tc_cls_common_offload that is passed to the callback for setting up the classifier. Function tc_cls_common_offload_init() is updated to support initialization of this new attribute. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2018-01-212-41/+15
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for your net-next tree. Basically, a new extension for ip6tables, simplification work of nf_tables that saves us 500 LoC, allow raw table registration before defragmentation, conversion of the SNMP helper to use the ASN.1 code generator, unique 64-bit handle for all nf_tables objects and fixes to address fallout from previous nf-next batch. More specifically, they are: 1) Seven patches to remove family abstraction layer (struct nft_af_info) in nf_tables, this simplifies our codebase and it saves us 64 bytes per net namespace. 2) Add IPv6 segment routing header matching for ip6tables, from Ahmed Abdelsalam. 3) Allow to register iptable_raw table before defragmentation, some people do not want to waste cycles on defragmenting traffic that is going to be dropped, hence add a new module parameter to enable this behaviour in iptables and ip6tables. From Subash Abhinov Kasiviswanathan. This patch needed a couple of follow up patches to get things tidy from Arnd Bergmann. 4) SNMP helper uses the ASN.1 code generator, from Taehee Yoo. Several patches for this helper to prepare this change are also part of this patch series. 5) Add 64-bit handles to uniquely objects in nf_tables, from Harsha Sharma. 6) Remove log message that several netfilter subsystems print at boot/load time. 7) Restore x_tables module autoloading, that got broken in a previous patch to allow singleton NAT hook callback registration per hook spot, from Florian Westphal. Moreover, return EBUSY to report that the singleton NAT hook slot is already in instead. 8) Several fixes for the new nf_tables flowtable representation, including incorrect error check after nf_tables_flowtable_lookup(), missing Kconfig dependencies that lead to build breakage and missing initialization of priority and hooknum in flowtable object. 9) Missing NETFILTER_FAMILY_ARP dependency in Kconfig for the clusterip target. This is due to recent updates in the core to shrink the hook array size and compile it out if no specific family is enabled via .config file. Patch from Florian Westphal. 10) Remove duplicated include header files, from Wei Yongjun. 11) Sparse warning fix for the NFPROTO_INET handling from the core due to missing static function definition, also from Wei Yongjun. 12) Restore ICMPv6 Parameter Problem error reporting when defragmentation fails, from Subash Abhinov Kasiviswanathan. 13) Remove obsolete owner field initialization from struct file_operations, patch from Alexey Dobriyan. 14) Use boolean datatype where needed in the Netfilter codebase, from Gustavo A. R. Silva. 15) Remove double semicolon in dynset nf_tables expression, from Luis de Bethencourt. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | netfilter: nf_tables: allocate handle and delete objects via handleHarsha Sharma2018-01-191-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows deletion of objects via unique handle which can be listed via '-a' option. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | netfilter: nf_tables: get rid of struct nft_af_info abstractionPablo Neira Ayuso2018-01-101-21/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the infrastructure to register/unregister nft_af_info structure, this structure stores no useful information anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | netfilter: nf_tables: get rid of pernet familiesPablo Neira Ayuso2018-01-102-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a single table list for each netns, we can get rid of one pointer per family and the global afinfo list, thus, shrinking struct netns for nftables that now becomes 64 bytes smaller. And call __nft_release_afinfo() from __net_exit path accordingly to release netnamespace objects on removal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | netfilter: nf_tables: add single table list for all familiesPablo Neira Ayuso2018-01-102-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Place all existing user defined tables in struct net *, instead of having one list per family. This saves us from one level of indentation in netlink dump functions. Place pointer to struct nft_af_info in struct nft_table temporarily, as we still need this to put back reference module reference counter on table removal. This patch comes in preparation for the removal of struct nft_af_info. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | netfilter: nf_tables: remove flag field from struct nft_af_infoPablo Neira Ayuso2018-01-101-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace it by a direct check for the netdev protocol family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | netfilter: nf_tables: remove nhooks field from struct nft_af_infoPablo Neira Ayuso2018-01-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already validate the hook through bitmask, so this check is superfluous. When removing this, this patch is also fixing a bug in the new flowtable codebase, since ctx->afi points to the table family instead of the netdev family which is where the flowtable is really hooked in. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | net: sched: cls: add extack support for tcf_change_indevAlexander Aring2018-01-191-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds extack handling for the tcf_change_indev function which is common used by TC classifier implementations. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: cls: add extack support for delete callbackAlexander Aring2018-01-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds extack support for classifier delete callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: cls: add extack support for tcf_exts_validateAlexander Aring2018-01-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tcf_exts_validate function calls the act api change callback. For preparing extack support for act api, this patch adds the extack as parameter for this function which is common used in cls implementations. Furthermore the tcf_exts_validate will call action init callback which prepares the TC action subsystem for extack support. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: cls: add extack support for change callbackAlexander Aring2018-01-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds extack support for classifier change callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: sched: cls: fix code style issuesAlexander Aring2018-01-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes some code style issues pointed out by checkpatch inside the TC cls subsystem. Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | tcp: avoid min RTT bloat by skipping RTT from delayed-ACK in BBRYuchung Cheng2018-01-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A persistent connection may send tiny amount of data (e.g. health-check) for a long period of time. BBR's windowed min RTT filter may only see RTT samples from delayed ACKs causing BBR to grossly over-estimate the path delay depending how much the ACK was delayed at the receiver. This patch skips RTT samples that are likely coming from delayed ACKs. Note that it is possible the sender never obtains a valid measure to set the min RTT. In this case BBR will continue to set cwnd to initial window which seems fine because the connection is thin stream. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud