summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'sh_eth-sw-reset'David S. Miller2016-05-091-9/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sergei Shtylyov says: ==================== sh_eth: couple of software reset bit cleanups Here's a set of 2 patches against DaveM's 'net-next.git' repo. We can save on the repetitive chip reset code... [1/2] sh_eth: call sh_eth_tsu_write() from sh_eth_chip_reset_giga() [2/2] sh_eth: reuse sh_eth_chip_reset() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * sh_eth: reuse sh_eth_chip_reset()Sergei Shtylyov2016-05-091-9/+2
| | | | | | | | | | | | | | | | All the chip_reset() methods repeat the code writing to the ARSTR register and delaying for 1 ms, so that we can reuse sh_eth_chip_reset() twice. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sh_eth: call sh_eth_tsu_write() from sh_eth_chip_reset_giga()Sergei Shtylyov2016-05-091-2/+3
|/ | | | | | | | | sh_eth_chip_reset_giga() doesn't really need to use direct iowrite32() when writing to the ARSTR register, it can use sh_eth_tsu_write() as all other chip_reset() methods. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pxa168_eth: mdiobus_scan() doesn't return NULL anymoreSergei Shtylyov2016-05-091-2/+0
| | | | | | | | Now that mdiobus_scan() doesn't return NULL on failure anymore, this driver no longer needs to check for it... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-upstream' of ↵David S. Miller2016-05-098-53/+180
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2016-05-07 Here are a few more Bluetooth patches for the 4.7 kernel: - NULL pointer fix in hci_intel driver - New Intel Bluetooth controller id in btusb driver - Added device tree binding documentation for Marvel's bt-sd8xxx - Platform specific wakeup interrupt support for btmrvl driver Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]Tedd Ho-Jeong An2016-05-061-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for Intel Bluetooth device 8265 also known as Windstorm Peak (WsP). T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 6 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=8087 ProdID=0a2b Rev= 0.10 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * Bluetooth: hci_intel: Fix null gpio desc pointer dereferenceLoic Poulain2016-05-021-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | gpiod_get_optional can return either ERR_PTR or NULL pointer. NULL case is not tested and then dereferenced later in desc_to_gpio. Fix this by using non optional version which returns ERR_PTR in any error case (this is not an optional gpio). Use the same non optional version for the host-wake gpio. Fixes: 765ea3abd116 ("Bluetooth: hci_intel: Retrieve host-wake IRQ") Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * btmrvl: add platform specific wakeup interrupt supportXinming Hu2016-05-024-15/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | On some arm-based platforms, we need to configure platform specific parameters by device tree node and also define our node as a child node of parent SDIO host controller. This patch parses these parameters from device tree. It includes calibration data download to firmware, wakeup pin configured to firmware, and soc specific wake up gpio, which will be set as wakeup interrupt pin. Signed-off-by: Xinming Hu <huxm@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * dt: bindings: add MARVELL's bt-sd8xxx wireless deviceXinming Hu2016-05-022-29/+56
| | | | | | | | | | | | | | | | | | | | Add device tree binding documentation for MARVELL's bluetooth sdio (sd8897 and sd8997) chip. Signed-off-by: Xinming Hu <huxm@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* | tcp: refactor struct tcp_skb_cbLawrence Brakmo2016-05-091-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor tcp_skb_cb to create two overlaping areas to store state for incoming or outgoing skbs based on comments by Neal Cardwell to tcp_nv patch: AFAICT this patch would not require an increase in the size of sk_buff cb[] if it were to take advantage of the fact that the tcp_skb_cb header.h4 and header.h6 fields are only used in the packet reception code path, and this in_flight field is only used on the transmit side. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ifb: support more featuresEric Dumazet2016-05-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | When using ifb+netem on ingress on SIT/IPIP/GRE traffic, GRO packets are not properly processed. Segmentation should not be forced, since ifb is already adding quite a performance hit. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: make sch_handle_ingress() drop monitor readyEric Dumazet2016-05-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TC_ACT_STOLEN is used when ingress traffic is mirred/redirected to say ifb. Packet is not dropped, but consumed. Only TC_ACT_SHOT is a clear indication something went wrong. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ISDN: eicon: replace custom hex_asc_lo() / hex_pack_byte()Andy Shevchenko2016-05-081-14/+7
| | | | | | | | | | | | | | | | Instead of custom approach re-use generic helpers to convert byte to hex format. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | fq_codel: add memory limitation per queueEric Dumazet2016-05-082-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On small embedded routers, one wants to control maximal amount of memory used by fq_codel, instead of controlling number of packets or bytes, since GRO/TSO make these not practical. Assuming skb->truesize is accurate, we have to keep track of skb->truesize sum for skbs in queue. This patch adds a new TCA_FQ_CODEL_MEMORY_LIMIT attribute. I chose a default value of 32 MBytes, which looks reasonable even for heavy duty usages. (Prior fq_codel users should not be hurt when they upgrade their kernels) Two fields are added to tc_fq_codel_qd_stats to report : - Current memory usage - Number of drops caused by memory limits # tc qd replace dev eth1 root est 1sec 4sec fq_codel memory_limit 4M .. # tc -s -d qd sh dev eth1 qdisc fq_codel 8008: root refcnt 257 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn Sent 2083566791363 bytes 1376214889 pkt (dropped 4994406, overlimits 0 requeues 21705223) rate 9841Mbit 812549pps backlog 3906120b 376p requeues 21705223 maxpacket 68130 drop_overlimit 4994406 new_flow_count 28855414 ecn_mark 0 memory_used 4190048 drop_overmemory 4994406 new_flows_len 1 old_flows_len 177 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Dave Täht <dave.taht@gmail.com> Cc: Sebastian Möller <moeller0@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Add Qualcomm IPC routerCourtney Cavin2016-05-089-1/+1198
| | | | | | | | | | | | | | | | | | | | | | Add an implementation of Qualcomm's IPC router protocol, used to communicate with service providing remote processors. Signed-off-by: Courtney Cavin <courtney.cavin@sonymobile.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com> [bjorn: Cope with 0 being a valid node id and implement RTM_NEWADDR] Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | soc: qcom: smd: Introduce compile stubsBjorn Andersson2016-05-081-1/+27
| | | | | | | | | | | | | | | | | | Introduce compile stubs for the SMD API, allowing consumers to be compile tested. Acked-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4: Reset dcb state machine and tx queue prio only if dcb is enabledHariprasad Shenai2016-05-071-18/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | When cxgb4 is enabled with CONFIG_CHELSIO_T4_DCB set, VI enable command gets called with DCB enabled. But when we have a back to back setup with DCB enabled on one side and non-DCB on the Peer side. Firmware doesn't send any DCB_L2_CFG, and DCB priority is never set for Tx queue. But driver resets the queue priority and state machine whenever there is a link down, this patch fixes it by adding a check to reset only if cxgb4_dcb_enabled() returns true. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mlxsw: spectrum: Fix ordering in mlxsw_sp_finiJiri Pirko2016-05-061-1/+1
| | | | | | | | | | | | Fixes: 0f433fa0ec ("mlxsw: spectrum_buffers: Implement shared buffer configuration") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | macvtap: add namespace support to the sysfs device classMarc Angel2016-05-061-10/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating macvtaps that are expected to have the same ifindex in different network namespaces, only the first one will succeed. The others will fail with a sysfs_warn_dup warning due to them trying to create the following sysfs link (with 'NN' the ifindex of macvtapX): /sys/class/macvtap/tapNN -> /sys/devices/virtual/net/macvtapX/tapNN This is reproducible by running the following commands: ip netns add ns1 ip netns add ns2 ip link add veth0 type veth peer name veth1 ip link set veth0 netns ns1 ip link set veth1 netns ns2 ip netns exec ns1 ip l add link veth0 macvtap0 type macvtap ip netns exec ns2 ip l add link veth1 macvtap1 type macvtap The last command will fail with "RTNETLINK answers: File exists" (along with the kernel warning) but retrying it will work because the ifindex was incremented. The 'net' device class is isolated between network namespaces so each one has its own hierarchy of net devices. This isn't the case for the 'macvtap' device class. The problem occurs half-way through the netdev registration, when `macvtap_device_event` is called-back to create the 'tapNN' macvtap class device under the 'macvtapX' net class device. This patch adds namespace support to the 'macvtap' device class so that /sys/class/macvtap is no longer shared between net namespaces. However, making the macvtap sysfs class namespace-aware has the side effect of changing /sys/devices/virtual/net/macvtapX/tapNN into /sys/devices/virtual/net/macvtapX/macvtap/tapNN. This is due to Commit 24b1442 ("Driver-core: Always create class directories for classses that support namespaces") and the fact that class devices supporting namespaces are really not supposed to be placed directly under other class devices. To avoid breaking userland, a tapNN symlink pointing to macvtap/tapNN is created inside the macvtapX directory. Signed-off-by: Marc Angel <marc@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: tcp: ip_send_unicast_reply() is not BH safeEric Dumazet2016-05-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | I forgot that ip_send_unicast_reply() is not BH safe (yet). Disabling preemption before calling it was not a good move. Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andres Lagar-Cavilla <andreslc@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'bpf-direct-pkt-access'David S. Miller2016-05-0614-82/+1004
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== bpf: introduce direct packet access This set of patches introduce 'direct packet access' from cls_bpf and act_bpf programs (which are root only). Current bpf programs use LD_ABS, LD_INS instructions which have to do 'if (off < skb_headlen)' for every packet access. It's ok for socket filters, but too slow for XDP, since single LD_ABS insn consumes 3% of cpu. Therefore we have to amortize the cost of length check over multiple packet accesses via direct access to skb->data, data_end pointers. The existing packet parser typically look like: if (load_half(skb, offsetof(struct ethhdr, h_proto)) != ETH_P_IP) return 0; if (load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)) != IPPROTO_UDP || load_byte(skb, ETH_HLEN) != 0x45) return 0; ... with 'direct packet access' the bpf program becomes: void *data = (void *)(long)skb->data; void *data_end = (void *)(long)skb->data_end; struct eth_hdr *eth = data; struct iphdr *iph = data + sizeof(*eth); if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end) return 0; if (eth->h_proto != htons(ETH_P_IP)) return 0; if (iph->protocol != IPPROTO_UDP || iph->ihl != 5) return 0; ... which is more natural to write and significantly faster. See patch 6 for performance tests: 21Mpps(old) vs 24Mpps(new) with just 5 loads. For more complex parsers the performance gain is higher. The other approach implemented in [1] was adding two new instructions to interpreter and JITs and was too hard to use from llvm side. The approach presented here doesn't need any instruction changes, but the verifier has to work harder to check safety of the packet access. Patch 1 prepares the code and Patch 2 adds new checks for direct packet access and all of them are gated with 'env->allow_ptr_leaks' which is true for root only. Patch 3 improves search pruning for large programs. Patch 4 wires in verifier's changes with net/core/filter side. Patch 5 updates docs Patches 6 and 7 add tests. [1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | samples/bpf: add verifier testsAlexei Starovoitov2016-05-061-0/+80
| | | | | | | | | | | | | | | | | | | | | add few tests for "pointer to packet" logic of the verifier Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | samples/bpf: add 'pointer to packet' testsAlexei Starovoitov2016-05-065-0/+281
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | parse_simple.c - packet parser exapmle with single length check that filters out udp packets for port 9 parse_varlen.c - variable length parser that understand multiple vlan headers, ipip, ipip6 and ip options to filter out udp or tcp packets on port 9. The packet is parsed layer by layer with multitple length checks. parse_ldabs.c - classic style of packet parsing using LD_ABS instruction. Same functionality as parse_simple. simple = 24.1Mpps per core varlen = 22.7Mpps ldabs = 21.4Mpps Parser with LD_ABS instructions is slower than full direct access parser which does more packet accesses and checks. These examples demonstrate the choice bpf program authors can make between flexibility of the parser vs speed. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: add documentation for 'direct packet access'Alexei Starovoitov2016-05-061-2/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | explain how verifier checks safety of packet access and update email addresses. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: wire in data and data_end for cls_act_bpfAlexei Starovoitov2016-05-064-6/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | allow cls_bpf and act_bpf programs access skb->data and skb->data_end pointers. The bpf helpers that change skb->data need to update data_end pointer as well. The verifier checks that programs always reload data, data_end pointers after calls to such bpf helpers. We cannot add 'data_end' pointer to struct qdisc_skb_cb directly, since it's embedded as-is by infiniband ipoib, so wrapper struct is needed. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: improve verifier state equivalenceAlexei Starovoitov2016-05-061-20/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | since UNKNOWN_VALUE type is weaker than CONST_IMM we can un-teach verifier its recognition of constants in conditional branches without affecting safety. Ex: if (reg == 123) { .. here verifier was marking reg->type as CONST_IMM instead keep reg as UNKNOWN_VALUE } Two verifier states with UNKNOWN_VALUE are equivalent, whereas CONST_IMM_X != CONST_IMM_Y, since CONST_IMM is used for stack range verification and other cases. So help search pruning by marking registers as UNKNOWN_VALUE where possible instead of CONST_IMM. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: direct packet accessAlexei Starovoitov2016-05-063-8/+440
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extended BPF carried over two instructions from classic to access packet data: LD_ABS and LD_IND. They're highly optimized in JITs, but due to their design they have to do length check for every access. When BPF is processing 20M packets per second single LD_ABS after JIT is consuming 3% cpu. Hence the need to optimize it further by amortizing the cost of 'off < skb_headlen' over multiple packet accesses. One option is to introduce two new eBPF instructions LD_ABS_DW and LD_IND_DW with similar usage as skb_header_pointer(). The kernel part for interpreter and x64 JIT was implemented in [1], but such new insns behave like old ld_abs and abort the program with 'return 0' if access is beyond linear data. Such hidden control flow is hard to workaround plus changing JITs and rolling out new llvm is incovenient. Therefore allow cls_bpf/act_bpf program access skb->data directly: int bpf_prog(struct __sk_buff *skb) { struct iphdr *ip; if (skb->data + sizeof(struct iphdr) + ETH_HLEN > skb->data_end) /* packet too small */ return 0; ip = skb->data + ETH_HLEN; /* access IP header fields with direct loads */ if (ip->version != 4 || ip->saddr == 0x7f000001) return 1; [...] } This solution avoids introduction of new instructions. llvm stays the same and all JITs stay the same, but verifier has to work extra hard to prove safety of the above program. For XDP the direct store instructions can be allowed as well. The skb->data is NET_IP_ALIGNED, so for common cases the verifier can check the alignment. The complex packet parsers where packet pointer is adjusted incrementally cannot be tracked for alignment, so allow byte access in such cases and misaligned access on architectures that define efficient_unaligned_access [1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: cleanup verifier codeAlexei Starovoitov2016-05-061-47/+53
|/ / | | | | | | | | | | | | | | cleanup verifier code and prepare it for addition of "pointer to packet" logic Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch '40GbE' of ↵David S. Miller2016-05-0615-1262/+1062
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2016-05-05 This series contains updates to i40e and i40evf. The theme behind this series is code reduction, yeah! Jesse provides most of the changes starting with a refactor of the interpretation of a tunnel which lets us start using the hardware's parsing. Removed the packet split receive routine and ancillary code in preparation for the Rx-refactor. The refactor of the receive routine, aligns the receive routine with the one in ixgbe which was highly optimized. The hardware supports a 16 byte descriptor for receive, but the driver was never using it in production. There was no performance benefit to the real driver of 16 byte descriptors, so drop a whole lot of complexity while getting rid of the code. Fixed a bug where while changing the number of descriptors using ethtool, the driver did not test the limits of the system memory before permanently assuming it would be able to get receive buffer memory. Mitch fixes a memory leak of one page each time the driver is opened by allocating the correct number of receive buffers and do not fiddle with next_to_use in the VF driver. Arnd Bergmann fixed a indentation issue by adding the appropriate curly braces in i40e_vc_config_promiscuous_mode_msg(). Julia Lawall fixed an issue found by Coccinelle, where i40e_client_ops structure can be const since it is never modified. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | i40e: constify i40e_client_ops structureJulia Lawall2016-05-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The i40e_client_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e: fix misleading indentationArnd Bergmann2016-05-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Newly added code in i40e_vc_config_promiscuous_mode_msg() is indented in a way that gcc rightly complains about: drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c: In function 'i40e_vc_config_promiscuous_mode_msg': drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1543:4: error: this 'if' clause does not guard... [-Werror=misleading-indentation] if (f->vlan >= 0 && f->vlan <= I40E_MAX_VLANID) ^~ drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1550:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if' aq_err = pf->hw.aq.asq_last_status; From the context, it looks like the aq_err assignment was meant to be inside of the conditional expression, so I'm adding the appropriate curly braces now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 5676a8b9cd9a ("i40e: Add VF promiscuous mode driver support") Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e: Test memory before ethtool alloc succeedsJesse Brandeburg2016-05-051-3/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When testing on systems with very limited amounts of RAM, a bug was found where, while changing the number of descriptors using ethtool, the driver didn't test the limits of system memory before permanently assuming it would be able to get receive buffer memory. Work around this issue by pre-allocation of the receive buffer memory, in the "ghost" ring, which is then used during reinit using the new ring length. Change-Id: I92d7a5fb59a6c884b2efdd1ec652845f101c3359 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40evf: Allocate Rx buffers properlyMitch Williams2016-05-051-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate the correct number of RX buffers, and don't fiddle with next_to_use. The common RX code handles all of this. This fixes a memory leak of one page each time the driver is opened. Change-Id: Id06eca353086e084921f047acad28c14745684ee Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e/i40evf: Remove unused hardware receive descriptor codeJesse Brandeburg2016-05-055-62/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hardware supports a 16 byte descriptor for receive, but the driver was never using it in production. There was no performance benefit to the real driver of 16 byte descriptors, so drop a whole lot of complexity while getting rid of the code. Also since the previous patch made us use no-split mode all the time, drop any support in the driver for any other value in dtype and assume it is always zero (aka no-split). Hooray for code removal! Change-ID: I2257e902e4dad84a07b94db6d2e6f4ce69b27bc0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40evf: refactor receive routineJesse Brandeburg2016-05-055-513/+481
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is part 2 of the Rx refactor series, just including changes to i40evf. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but *does* use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove VF ethtool control over packet split - remove priv_flags interface since it is unused Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40evf: Drop packet split receive routineJesse Brandeburg2016-05-057-75/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Some of the split related context set up code stays in i40e_virtchnl_pf.c in case an older VF driver tries to load and still wants to use packet split. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e: Refactor receive routineJesse Brandeburg2016-05-056-303/+531
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is part 1 of the Rx refactor series, just including changes to i40e. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but *does* use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove ethtool control over packet split Change-ID: I5ca100721de65992aa0114f8b4bac844b84758e0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e/i40evf: Remove reference to ring->dtypeJesse Brandeburg2016-05-053-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the rx-refactor, the dtype variable in the i40e_ring struct is no longer used, so remove it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e: Drop packet split receive routineJesse Brandeburg2016-05-056-317/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | i40e/i40evf: Refactor tunnel interpretationJesse Brandeburg2016-05-052-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the interpretation of a tunnel. This removes some code and lets us start using the hardware's parsing. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | | net: vrf: Create FIB tables on link createDavid Ahern2016-05-063-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | Tables have to exist for VRFs to function. Ensure they exist when VRF device is created. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | cnic: call cp->stop_hw() in cnic_start_hw() on allocation failureJon Maxwell2016-05-061-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We recently had a system crash in the cnic module. Vmcore analysis confirmed that "ip link up" was executed which failed due to an allocation failure because of memory fragmentation. Futher analysis revealed that the cnic irq vector was still allocated after the "ip link up" that failed. When "ip link down" was executed it called free_msi_irqs() which crashed the system because the cnic irq was still inuse. PANIC: "kernel BUG at drivers/pci/msi.c:411!" The code execution was: cnic_netdev_event() if (event == NETDEV_UP) { . . ▹ if (!cnic_start_hw(dev)) cnic_start_hw() calls cnic_cm_open() which failed with -ENOMEM cnic_start_hw() then took the err1 path: err1:↩ cp->free_resc(dev);↩ <---- frees resources but not irq vector pci_dev_put(dev->pcidev);↩ return err;↩ }↩ This returns control back to cnic_netdev_event() but now the cnic irq vector is still allocated even although cnic_cm_open() failed. The next "ip link down" while trigger the crash. The cnic_start_hw() routine is not handling the allocation failure correctly. Fix this by checking whether CNIC_DRV_STATE_HANDLES_IRQ flag is set indicating that the hardware has been started in cnic_start_hw(). If it has then call cp->stop_hw() which frees the cnic irq vector and cnic resources. Otherwise just maintain the previous behaviour and free cnic resources. I reproduced this by injecting an ENOMEM error into cnic_cm_alloc_mem()s return code. # ip link set dev enpX down # ip link set dev enpX up <--- hit's allocation failure # ip link set dev enpX down <--- crashes here With this patch I confirmed there was no crash in the reproducer. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net/mlx4: Avoid wrong virtual mappingsHaggai Abramovsky2016-05-059-125/+68
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dma_alloc_coherent() function returns a virtual address which can be used for coherent access to the underlying memory. On some architectures, like arm64, undefined behavior results if this memory is also accessed via virtual mappings that are not coherent. Because of their undefined nature, operations like virt_to_page() return garbage when passed virtual addresses obtained from dma_alloc_coherent(). Any subsequent mappings via vmap() of the garbage page values are unusable and result in bad things like bus errors (synchronous aborts in ARM64 speak). The mlx4 driver contains code that does the equivalent of: vmap(virt_to_page(dma_alloc_coherent)), this results in an OOPs when the device is opened. Prevent Ethernet driver to run this problematic code by forcing it to allocate contiguous memory. As for the Infiniband driver, at first we are trying to allocate contiguous memory, but in case of failure roll back to work with fragmented memory. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reported-by: David Daney <david.daney@cavium.com> Tested-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | MAINTAINERS: Cleanup Intel Wired LAN maintainers listJeff Kirsher2016-05-041-7/+0
| | | | | | | | | | | | | | | | | | | | With the recent "retirements" and other changes, make the maintainers list a lot less confusing and a bit more straight forward. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Shannon Nelson <sln@onemain.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp: two more missing bh disableEric Dumazet2016-05-042-0/+4
| | | | | | | | | | | | | | | | | | | | | | percpu_counter only have protection against preemption. TCP stack uses them possibly from BH, so we need BH protection in contexts that could be run in process context Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch '10GbE' of ↵David S. Miller2016-05-0412-197/+640
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-05-04 This series contains updates to ixgbe, ixgbevf and traffic class helpers. Sridhar adds helper functions to the tc_mirred header to access tcf_mirred information and then implements them for ixgbe to enable redirection to a SRIOV VF or an offloaded MACVLAN device queue via tc 'mirred' action. Amritha adds support to set filters with multiple header fields (L3,L4) to match on. KY Srinivasan from Microsoft add Hyper-V support into ixgbevf. Emil adds 82599 sub-device IDs that were missing from the list of parts that support WoL. Then simplified the logic we use to determine WoL support by reading the EEPROM bits for MACs X540 and newer. Preethi cleaned up duplicate and unused device IDs. Fixed our ethtool stat reporting where we were ignoring higher 32 bits of stats registers, so fill out 64 bit stat values into two 32 bit words. Babu Moger from Oracle improves VF performance issues on SPARC. Alex Duyck cleans up some of the Hyper-V implementation from KY so that we can just use function pointers instead of having to identify if a given VF is running on a Linux or Windows PF. Usha makes sure that DCB and FCoE is disabled for X550EM_x/a MACs and cleans up the DCB initialization in the process. Tony cleans up the API for ixgbevf_update_xcast_mode() so we do not have to pass in the netdev parameter, since it was never used in the function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ixgbevf: Remove unused parameterTony Nguyen2016-05-043-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | ixgbevf_update_xcast_mode() is not using the netdev parameter; removing it since it's unnecessary. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: Disable DCB and FCoE for X550EM_x and x550em_aUsha Ketineni2016-05-043-42/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds IXGBE_FLAG_DCB_CAPABLE flag that is set for all MACs other than X550EM_x and x550em_a. DCB and FCoE is disabled for these MACS. DCB initialization code is moved to a separate function. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbevf: Use mac_ops instead of trying to identify NIC typeAlexander Duyck2016-05-044-27/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change makes it so that we can just use function pointers instead of having to identify if a given VF is running on a Linux or Windows PF. By doing this we can avoid having to pull too much information out of the lower layers and can instead just make use of the mac_ops pointers since they should differ between the two types of VFs anyway. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbevf: Change the relaxed order settings in VF driver for sparcBabu Moger2016-05-041-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We noticed performance issues with VF interface on sparc compared to PF. Setting the RX to IXGBE_DCA_RXCTRL_DATA_WRO_EN brings it on far with PF. Also this matches to the default sparc setting in PF driver. Signed-off-by: Babu Moger <babu.moger@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
OpenPOWER on IntegriCloud