summaryrefslogtreecommitdiffstats
path: root/net/core/rtnetlink.c
Commit message (Collapse)AuthorAgeFilesLines
* rtnetlink: reject non-IFLA_VF_PORT attributes inside IFLA_VF_PORTSDaniel Borkmann2015-07-151-4/+7
| | | | | | | | | | | | | | | | | | | Similarly as in commit 4f7d2cdfdde7 ("rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver"), we have a double nesting of netlink attributes, i.e. IFLA_VF_PORTS only contains IFLA_VF_PORT that is nested itself. While IFLA_VF_PORTS is a verified attribute from ifla_policy[], we only check if the IFLA_VF_PORTS container has IFLA_VF_PORT attributes and then pass the attribute's content itself via nla_parse_nested(). It would be more correct to reject inner types other than IFLA_VF_PORT instead of continuing parsing and also similarly as in commit 4f7d2cdfdde7, to check for a minimum of NLA_HDRLEN. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Scott Feldman <sfeldma@gmail.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: verify IFLA_VF_INFO attributes before passing them to driverDaniel Borkmann2015-07-081-91/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes anymore with respect to their policy, that is, ifla_vfinfo_policy[]. Before, they were part of ifla_policy[], but they have been nested since placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO, which is another nested attribute for the actual VF attributes such as IFLA_VF_MAC, IFLA_VF_VLAN, etc. Despite the policy being split out from ifla_policy[] in this commit, it's never applied anywhere. nla_for_each_nested() only does basic nla_ok() testing for struct nlattr, but it doesn't know about the data context and their requirements. Fix, on top of Jason's initial work, does 1) parsing of the attributes with the right policy, and 2) using the resulting parsed attribute table from 1) instead of the nla_for_each_nested() loop (just like we used to do when still part of ifla_policy[]). Reference: http://thread.gmane.org/gmane.linux.network/368913 Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Rony Efraim <ronye@mellanox.com> Cc: Vlad Zolotarov <vladz@cloudius-systems.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* switchdev; add VLAN support for port's bridge_getlinkScott Feldman2015-06-231-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One more missing piece of the puzzle. Add vlan dump support to switchdev port's bridge_getlink. iproute2 "bridge vlan show" cmd already knows how to show the vlans installed on the bridge and the device , but (until now) no one implemented the port vlan part of the netlink PF_BRIDGE:RTM_GETLINK msg. Before this patch, "bridge vlan show": $ bridge -c vlan show port vlan ids sw1p1 30-34 << bridge side vlans 57 sw1p1 << device side vlans (missing) sw1p2 57 sw1p2 sw1p3 sw1p4 br0 None (When the port is bridged, the output repeats the vlan list for the vlans on the bridge side of the port and the vlans on the device side of the port. The listing above show no vlans for the device side even though they are installed). After this patch: $ bridge -c vlan show port vlan ids sw1p1 30-34 << bridge side vlan 57 sw1p1 30-34 << device side vlans 57 3840 PVID sw1p2 57 sw1p2 57 3840 PVID sw1p3 3842 PVID sw1p4 3843 PVID br0 None I re-used ndo_dflt_bridge_getlink to add vlan fill call-back func. switchdev support adds an obj dump for VLAN objects, using the same call-back scheme as FDB dump. Support included for both compressed and un-compressed vlan dumps. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/core: Add reading VF statistics through the PF netdeviceEran Ben Elisha2015-06-151-2/+49
| | | | | | | | | | | Add ndo_get_vf_stats where the PF retrieves and fills the VFs traffic statistics. We encode the VF stats in a nested manner to allow for future extensions. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-05-231-0/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/cadence/macb.c drivers/net/phy/phy.c include/linux/skbuff.h net/ipv4/tcp.c net/switchdev/switchdev.c Switchdev was a case of RTNH_H_{EXTERNAL --> OFFLOAD} renaming overlapping with net-next changes of various sorts. phy.c was a case of two changes, one adding a local variable to a function whilst the second was removing one. tcp.c overlapped a deadlock fix with the addition of new tcp_info statistic values. macb.c involved the addition of two zyncq device entries. skbuff.h involved adding back ipv4_daddr to nf_bridge_info whilst net-next changes put two other existing members of that struct into a union. Signed-off-by: David S. Miller <davem@davemloft.net>
| * rtnl/bond: don't send rtnl msg for unregistered ifaceNicolas Dichtel2015-05-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the patch, the command 'ip link add bond2 type bond mode 802.3ad' causes the kernel to send a rtnl message for the bond2 interface, with an ifindex 0. 'ip monitor' shows: 0: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 state DOWN group default link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff 9: bond2@NONE: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default link/ether ea:3e:1f:53:92:7b brd ff:ff:ff:ff:ff:ff [snip] The patch fixes the spotted bug by checking in bond driver if the interface is registered before calling the notifier chain. It also adds a check in rtmsg_ifinfo() to prevent this kind of bug in the future. Fixes: d4261e565000 ("bonding: create netlink event when bonding option is changed") CC: Jiri Pirko <jiri@resnulli.us> Reported-by: Julien Meunier <julien.meunier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | switchdev: don't use anonymous union on switchdev attr/obj structsScott Feldman2015-05-131-1/+2
| | | | | | | | | | | | | | | | | | | | Older gcc versions (e.g. gcc version 4.4.6) don't like anonymous unions which was causing build issues on the newly added switchdev attr/obj structs. Fix this by using named union on structs. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | switchdev: convert parent_id_get to switchdev attr getScott Feldman2015-05-121-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch ID is just a gettable port attribute. Convert switchdev op switchdev_parent_id_get to a switchdev attr. Note: for sysfs and netlink interfaces, SWITCHDEV_ATTR_PORT_PARENT_ID is called with SWITCHDEV_F_NO_RECUSE to limit switch ID user-visiblity to only port netdevs. So when a port is stacked under bond/bridge, the user can only query switch id via the switch ports, but not via the upper devices Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* | switchdev: s/netdev_switch_/switchdev_/ and s/NETDEV_SWITCH_/SWITCHDEV_/Jiri Pirko2015-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | Turned out that "switchdev" sticks. So just unify all related terms to use this prefix. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netns: rename peernet2id() to peernet2id_alloc()Nicolas Dichtel2015-05-091-1/+1
|/ | | | | | | | | | In a following commit, a new function will be introduced to only lookup for a nsid (no allocation if the nsid doesn't exist). To avoid confusion, the existing function is renamed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge/nl: remove wrong use of NLM_F_MULTINicolas Dichtel2015-04-291-5/+7
| | | | | | | | | | | | | | | | | | | | | | NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: e5a55a898720 ("net: create generic bridge ops") Fixes: 815cccbf10b2 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf") CC: John Fastabend <john.r.fastabend@intel.com> CC: Sathya Perla <sathya.perla@emulex.com> CC: Subbu Seetharaman <subbu.seetharaman@emulex.com> CC: Ajit Khaparde <ajit.khaparde@emulex.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: intel-wired-lan@lists.osuosl.org CC: Jiri Pirko <jiri@resnulli.us> CC: Scott Feldman <sfeldma@gmail.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* if_link: Add an additional parameter to ifla_vf_info for RSS queryingVlad Zolotarov2015-04-101-6/+26
| | | | | | | | | | | | | Add configuration setting for drivers to allow/block an RSS Redirection Table and a Hash Key querying for discrete VFs. On some devices VF share the mentioned above information with PF and querying it may adduce a theoretical security risk. We want to let a system administrator to decide if he/she wants to take this risk or not. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* rtnetlink: Mark name argument of rtnl_create_link() constThomas Graf2015-04-101-1/+1
| | | | | Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Pass VLAN ID to rtnl_fdb_notify.Hubert Sokolowski2015-04-091-10/+10
| | | | | | | | | | | | When an FDB entry is added or deleted the information about VLAN is not passed to listening applications like 'bridge monitor fdb'. With this patch VLAN ID is passed if it was set in the original netlink message. Also remove an unused bdev variable. Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-04-021-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/usb/asix_common.c drivers/net/usb/sr9800.c drivers/net/usb/usbnet.c include/linux/usb/usbnet.h net/ipv4/tcp_ipv4.c net/ipv6/tcp_ipv6.c The TCP conflicts were overlapping changes. In 'net' we added a READ_ONCE() to the socket cached RX route read, whilst in 'net-next' Eric Dumazet touched the surrounding code dealing with how mini sockets are handled. With USB, it's a case of the same bug fix first going into net-next and then I cherry picked it back into net. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: use for_each_netdev_safe() in rtnl_group_changelink()WANG Cong2015-03-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case we move the whole dev group to another netns, we should call for_each_netdev_safe(), otherwise we get a soft lockup: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ip:798] irq event stamp: 255424 hardirqs last enabled at (255423): [<ffffffff81a2aa95>] restore_args+0x0/0x30 hardirqs last disabled at (255424): [<ffffffff81a2ad5a>] apic_timer_interrupt+0x6a/0x80 softirqs last enabled at (255422): [<ffffffff81079ebc>] __do_softirq+0x2c1/0x3a9 softirqs last disabled at (255417): [<ffffffff8107a190>] irq_exit+0x41/0x95 CPU: 0 PID: 798 Comm: ip Not tainted 4.0.0-rc4+ #881 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800d1b88000 ti: ffff880119530000 task.ti: ffff880119530000 RIP: 0010:[<ffffffff810cad11>] [<ffffffff810cad11>] debug_lockdep_rcu_enabled+0x28/0x30 RSP: 0018:ffff880119533778 EFLAGS: 00000246 RAX: ffff8800d1b88000 RBX: 0000000000000002 RCX: 0000000000000038 RDX: 0000000000000000 RSI: ffff8800d1b888c8 RDI: ffff8800d1b888c8 RBP: ffff880119533778 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000b5c2 R12: 0000000000000246 R13: ffff880119533708 R14: 00000000001d5a40 R15: ffff88011a7d5a40 FS: 00007fc01315f740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f367a120988 CR3: 000000011849c000 CR4: 00000000000007f0 Stack: ffff880119533798 ffffffff811ac868 ffffffff811ac831 ffffffff811ac828 ffff8801195337c8 ffffffff811ac8c9 ffff8801195339b0 ffff8801197633e0 0000000000000000 ffff8801195339b0 ffff8801195337d8 ffffffff811ad2d7 Call Trace: [<ffffffff811ac868>] rcu_read_lock+0x37/0x6e [<ffffffff811ac831>] ? rcu_read_unlock+0x5f/0x5f [<ffffffff811ac828>] ? rcu_read_unlock+0x56/0x5f [<ffffffff811ac8c9>] __fget+0x2a/0x7a [<ffffffff811ad2d7>] fget+0x13/0x15 [<ffffffff811be732>] proc_ns_fget+0xe/0x38 [<ffffffff817c7714>] get_net_ns_by_fd+0x11/0x59 [<ffffffff817df359>] rtnl_link_get_net+0x33/0x3e [<ffffffff817df3d7>] do_setlink+0x73/0x87b [<ffffffff810b28ce>] ? trace_hardirqs_off+0xd/0xf [<ffffffff81a2aa95>] ? retint_restore_args+0xe/0xe [<ffffffff817e0301>] rtnl_newlink+0x40c/0x699 [<ffffffff817dffe0>] ? rtnl_newlink+0xeb/0x699 [<ffffffff81a29246>] ? _raw_spin_unlock+0x28/0x33 [<ffffffff8143ed1e>] ? security_capable+0x18/0x1a [<ffffffff8107da51>] ? ns_capable+0x4d/0x65 [<ffffffff817de5ce>] rtnetlink_rcv_msg+0x181/0x194 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de44d>] ? __rtnl_unlock+0x17/0x17 [<ffffffff818327c6>] netlink_rcv_skb+0x4d/0x93 [<ffffffff817de42f>] rtnetlink_rcv+0x26/0x2d [<ffffffff81830f18>] netlink_unicast+0xcb/0x150 [<ffffffff8183198e>] netlink_sendmsg+0x501/0x523 [<ffffffff8115cba9>] ? might_fault+0x59/0xa9 [<ffffffff817b5398>] ? copy_from_user+0x2a/0x2c [<ffffffff817b7b74>] sock_sendmsg+0x34/0x3c [<ffffffff817b7f6d>] ___sys_sendmsg+0x1b8/0x255 [<ffffffff8115c5eb>] ? handle_pte_fault+0xbd5/0xd4a [<ffffffff8100a2b0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109e94b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109eb9c>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cadbf>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811ac1d8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac946>] ? __fget_light+0x2d/0x52 [<ffffffff817b8adc>] __sys_sendmsg+0x42/0x60 [<ffffffff817b8b0c>] SyS_sendmsg+0x12/0x1c [<ffffffff81a29e32>] system_call_fastpath+0x12/0x17 Fixes: e7ed828f10bd8 ("netlink: support setting devgroup parameters") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | dev: introduce dev_get_iflink()Nicolas Dichtel2015-04-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | The goal of this patch is to prepare the removal of the iflink field. It introduces a new ndo function, which will be implemented by virtual interfaces. There is no functional change into this patch. All readers of iflink field now call dev_get_iflink(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: allow to delete a whole device groupWANG Cong2015-03-241-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | With dev group, we can change a batch of net devices, so we should allow to delete them together too. Group 0 is not allowed to be deleted since it is the default group. Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-03-201-13/+13
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Handle unregister properly when netdev namespace change fails.David S. Miller2015-03-101-13/+13
| | | | | | | | | | | | | | | | | | If rtnl_newlink() fails on it's call to dev_change_net_namespace(), we have to make use of the ->dellink() method, if present, just like we do when rtnl_configure_link() fails. Fixes: 317f4810e45e ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add support for phys_port_nameDavid Ahern2015-03-181-0/+21
|/ | | | | | | | | | | | | | | | | | Similar to port id allow netdevices to specify port names and export the name via sysfs. Drivers can implement the netdevice operation to assist udev in having sane default names for the devices using the rule: $ cat /etc/udev/rules.d/80-net-setup-link.rules SUBSYSTEM=="net", ACTION=="add", ATTR{phys_port_name}!="", NAME="$attr{phys_port_name}" Use of phys_name versus phys_id was suggested-by Jiri Pirko. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: do not use rcu in rtnl_dump_ifinfo()Eric Dumazet2015-03-011-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We did a failed attempt in the past to only use rcu in rtnl dump operations (commit e67f88dd12f6 "net: dont hold rtnl mutex during netlink dump callbacks") Now that dumps are holding RTNL anyway, there is no need to also use rcu locking, as it forbids any scheduling ability, like GFP_KERNEL allocations that controlling path should use instead of GFP_ATOMIC whenever possible. This should fix following splat Cong Wang reported : [ INFO: suspicious RCU usage. ] 3.19.0+ #805 Tainted: G W include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/771: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c #1: (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e stack backtrace: CPU: 3 PID: 771 Comm: ip Tainted: G W 3.19.0+ #805 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6 ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd 00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758 Call Trace: [<ffffffff81a27457>] dump_stack+0x4c/0x65 [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47 [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb [<ffffffff8109e67d>] __might_sleep+0x78/0x80 [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101 [<ffffffff817c1383>] alloc_netid+0x61/0x69 [<ffffffff817c14c3>] __peernet2id+0x79/0x8d [<ffffffff817c1ab7>] peernet2id+0x13/0x1f [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20 [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52 [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213 [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5 [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59 [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51 [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9 [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac556>] ? __fget_light+0x2d/0x52 [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60 [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 0c7aecd4bde4b7302 ("netns: add rtnl cmd to add and get peer netns ids") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Verify permission to link_net in newlinkEric W. Biederman2015-02-281-0/+3
| | | | | | | | | | | | | When applicable verify that the caller has permisson to the underlying network namespace for a newly created network device. Similary checks exist for the network namespace a network device will be created in. Fixes: 317f4810e45e ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Verify permission to dest_net in newlinkEric W. Biederman2015-02-281-0/+4
| | | | | | | | | | | | | | | When applicable verify that the caller has permision to create a network device in another network namespace. This check is already present when moving a network device between network namespaces in setlink so all that is needed is to duplicate that check in newlink. This change almost backports cleanly, but there are context conflicts as the code that follows was added in v4.0-rc1 Fixes: b51642f6d77b net: Enable a userns root rtnl calls that are safe for unprivilged users Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: avoid 0 sized arraysSasha Levin2015-02-241-2/+2
| | | | | | | | | | Arrays (when not in a struct) "shall have a value greater than zero". GCC complains when it's not the case here. Fixes: ba7d49b1f0 ("rtnetlink: provide api for getting and setting slave info") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: call ->dellink on failure when ->newlink existsWANG Cong2015-02-151-1/+8
| | | | | | | | | | | | | | | | | | | | Ignacy reported that when eth0 is down and add a vlan device on top of it like: ip link add link eth0 name eth0.1 up type vlan id 1 We will get a refcount leak: unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2 The problem is when rtnl_configure_link() fails in rtnl_newlink(), we simply call unregister_device(), but for stacked device like vlan, we almost do nothing when we unregister the upper device, more work is done when we unregister the lower device, so call its ->dellink(). Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-02-091-12/+6
|\ | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARYDaniel Borkmann2015-02-071-12/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ifla_vf_policy[] is wrong in advertising its individual member types as NLA_BINARY since .type = NLA_BINARY in combination with .len declares the len member as *max* attribute length [0, len]. The issue is that when do_setvfinfo() is being called to set up a VF through ndo handler, we could set corrupted data if the attribute length is less than the size of the related structure itself. The intent is exactly the opposite, namely to make sure to pass at least data of minimum size of len. Fixes: ebc08a6f47ee ("rtnetlink: Add VF config code to rtnetlink") Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-02-051-1/+5
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: dont send notification when skb->len == 0 in rtnl_bridge_notifyRoopa Prabhu2015-01-281-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081 This patch avoids calling rtnl_notify if the device ndo_bridge_getlink handler does not return any bytes in the skb. Alternately, the skb->len check can be moved inside rtnl_notify. For the bridge vlan case described in 92081, there is also a fix needed in bridge driver to generate a proper notification. Will fix that in subsequent patch. v2: rebase patch on net tree Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/core: Add event for a change in slave stateMoni Shoua2015-02-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | Add event which provides an indication on a change in the state of a bonding slave. The event handler should cast the pointer to the appropriate type (struct netdev_bonding_info) in order to get the full info about the slave. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellinkRoopa Prabhu2015-02-011-4/+6
| | | | | | | | | | | | | | | | | | | | bridge flags are needed inside ndo_bridge_setlink/dellink handlers to avoid another call to parse IFLA_AF_SPEC inside these handlers This is used later in this series Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnetlink: pass link_net to the newlink handlerNicolas Dichtel2015-01-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When IFLA_LINK_NETNSID is used, the netdevice should be built in this link netns and moved at the end to another netns (pointed by the socket netns or IFLA_NET_NS_[PID|FD]). Existing user of the newlink handler will use the netns argument (src_net) to find a link netdevice or to check some other information into the link netns. For example, to find a netdevice, two information are required: an ifindex (usually from IFLA_LINK) and a netns (this link netns). Note: when using IFLA_LINK_NETNSID and IFLA_NET_NS_[PID|FD], a user may create a netdevice that stands in netnsX and with its link part in netnsY, by sending a rtnl message from netnsZ. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnl: fix error path when adding an iface with a link netNicolas Dichtel2015-01-231-1/+4
| | | | | | | | | | | | | | | | | | If an error occurs when the netdevice is moved to the link netns, a full cleanup must be done. Fixes: 317f4810e45e ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnl: allow to create device with IFLA_LINK_NETNSID setNicolas Dichtel2015-01-191-3/+22
| | | | | | | | | | | | | | | | | | This patch adds the ability to create a netdevice in a specified netns and then move it into the final netns. In fact, it allows to have a symetry between get and set rtnl messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnl: add link netns id to interface messagesNicolas Dichtel2015-01-191-0/+13
| | | | | | | | | | | | | | | | | | | | | | This patch adds a new attribute (IFLA_LINK_NETNSID) which contains the 'link' netns id when this netns is different from the netns where the interface stands (for example for x-net interfaces like ip tunnels). With this attribute, it's possible to interpret correctly all advertised information (like IFLA_LINK, etc.). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: remove oflags from setlink/dellink.Rosen, Rami2015-01-191-6/+2
| | | | | | | | | | | | | | | | | | Commit 02dba4388d16 ("bridge: fix setlink/dellink notifications") removed usage of oflags in both rtnl_bridge_setlink() and rtnl_bridge_dellink() methods. This patch removes this variable as it is no longer needed. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: Fix bugs in nlmsg_end() conversions.David S. Miller2015-01-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit 053c095a82cf ("netlink: make nlmsg_end() and genlmsg_end() void") didn't catch all of the cases where callers were breaking out on the return value being equal to zero, which they no longer should when zero means success. Fix all such cases. Reported-by: Marcel Holtmann <marcel@holtmann.org> Reported-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: make nlmsg_end() and genlmsg_end() voidJohannes Berg2015-01-181-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Contrary to common expectations for an "int" return, these functions return only a positive value -- if used correctly they cannot even return 0 because the message header will necessarily be in the skb. This makes the very common pattern of if (genlmsg_end(...) < 0) { ... } be a whole bunch of dead code. Many places also simply do return nlmsg_end(...); and the caller is expected to deal with it. This also commonly (at least for me) causes errors, because it is very common to write if (my_function(...)) /* error condition */ and if my_function() does "return nlmsg_end()" this is of course wrong. Additionally, there's not a single place in the kernel that actually needs the message length returned, and if anyone needs it later then it'll be very easy to just use skb->len there. Remove this, and make the functions void. This removes a bunch of dead code as described above. The patch adds lines because I did - return nlmsg_end(...); + nlmsg_end(...); + return 0; I could have preserved all the function's return values by returning skb->len, but instead I've audited all the places calling the affected functions and found that none cared. A few places actually compared the return value with <= 0 in dump functionality, but that could just be changed to < 0 with no change in behaviour, so I opted for the more efficient version. One instance of the error I've made numerous times now is also present in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't check for <0 or <=0 and thus broke out of the loop every single time. I've preserved this since it will (I think) have caused the messages to userspace to be formatted differently with just a single message for every SKB returned to userspace. It's possible that this isn't needed for the tools that actually use this, but I don't even know what they are so couldn't test that changing this behaviour would be acceptable. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: fix setlink/dellink notificationsRoopa Prabhu2015-01-171-24/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | problems with bridge getlink/setlink notifications today: - bridge setlink generates two notifications to userspace - one from the bridge driver - one from rtnetlink.c (rtnl_bridge_notify) - dellink generates one notification from rtnetlink.c. Which means bridge setlink and dellink notifications are not consistent - Looking at the code it appears, If both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF were set, the size calculation in rtnl_bridge_notify can be wrong. Example: if you set both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF in a setlink request to rocker dev, rtnl_bridge_notify will allocate skb for one set of bridge attributes, but, both the bridge driver and rocker dev will try to add attributes resulting in twice the number of attributes being added to the skb. (rocker dev calls ndo_dflt_bridge_getlink) There are multiple options: 1) Generate one notification including all attributes from master and self: But, I don't think it will work, because both master and self may use the same attributes/policy. Cannot pack the same set of attributes in a single notification from both master and slave (duplicate attributes). 2) Generate one notification from master and the other notification from self (This seems to be ideal): For master: the master driver will send notification (bridge in this example) For self: the self driver will send notification (rocker in the above example. It can use helpers from rtnetlink.c to do so. Like the ndo_dflt_bridge_getlink api). This patch implements 2) (leaving the 'rtnl_bridge_notify' around to be used with 'self'). v1->v2 : - rtnl_bridge_notify is now called only for self, so, remove 'BRIDGE_FLAGS_SELF' check and cleanup a few things - rtnl_bridge_dellink used to always send a RTM_NEWLINK msg earlier. So, I have changed the notification from br_dellink to go as RTM_NEWLINK Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: tcp: add RTAX_CC_ALGO fib handlingDaniel Borkmann2015-01-051-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the minimum necessary for the RTAX_CC_ALGO congestion control metric to be set up and dumped back to user space. While the internal representation of RTAX_CC_ALGO is handled as a u32 key, we avoided to expose this implementation detail to user space, thus instead, we chose the netlink attribute that is being exchanged between user space to be the actual congestion control algorithm name, similarly as in the setsockopt(2) API in order to allow for maximum flexibility, even for 3rd party modules. It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as it should have been stored in RTAX_FEATURES instead, we first thought about reusing it for the congestion control key, but it brings more complications and/or confusion than worth it. Joint work with Florian Westphal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is definedHubert Sokolowski2015-01-051-2/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add checking whether the call to ndo_dflt_fdb_dump is needed. It is not expected to call ndo_dflt_fdb_dump unconditionally by some drivers (i.e. qlcnic or macvlan) that defines own ndo_fdb_dump. Other drivers define own ndo_fdb_dump and don't want ndo_dflt_fdb_dump to be called at all. At the same time it is desirable to call the default dump function on a bridge device. Fix attributes that are passed to dev->netdev_ops->ndo_fdb_dump. Add extra checking in br_fdb_dump to avoid duplicate entries as now filter_dev can be NULL. Following tests for filtering have been performed before the change and after the patch was applied to make sure they are the same and it doesn't break the filtering algorithm. [root@localhost ~]# cd /root/iproute2-3.18.0/bridge [root@localhost bridge]# modprobe dummy [root@localhost bridge]# ./bridge fdb add f1:f2:f3:f4:f5:f6 dev dummy0 [root@localhost bridge]# brctl addbr br0 [root@localhost bridge]# brctl addif br0 dummy0 [root@localhost bridge]# ip link set dev br0 address 02:00:00:12:01:04 [root@localhost bridge]# # show all [root@localhost bridge]# ./bridge fdb show 33:33:00:00:00:01 dev p2p1 self permanent 01:00:5e:00:00:01 dev p2p1 self permanent 33:33:ff:ac:ce:32 dev p2p1 self permanent 33:33:00:00:02:02 dev p2p1 self permanent 01:00:5e:00:00:fb dev p2p1 self permanent 33:33:00:00:00:01 dev p7p1 self permanent 01:00:5e:00:00:01 dev p7p1 self permanent 33:33:ff:79:50:53 dev p7p1 self permanent 33:33:00:00:02:02 dev p7p1 self permanent 01:00:5e:00:00:fb dev p7p1 self permanent f2:46:50:85:6d:d9 dev dummy0 master br0 permanent f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent 33:33:00:00:00:01 dev dummy0 self permanent f1:f2:f3:f4:f5:f6 dev dummy0 self permanent 33:33:00:00:00:01 dev br0 self permanent 02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent 02:00:00:12:01:04 dev br0 master br0 permanent [root@localhost bridge]# # filter by bridge [root@localhost bridge]# ./bridge fdb show br br0 f2:46:50:85:6d:d9 dev dummy0 master br0 permanent f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent 33:33:00:00:00:01 dev dummy0 self permanent f1:f2:f3:f4:f5:f6 dev dummy0 self permanent 33:33:00:00:00:01 dev br0 self permanent 02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent 02:00:00:12:01:04 dev br0 master br0 permanent [root@localhost bridge]# # filter by port [root@localhost bridge]# ./bridge fdb show brport dummy0 f2:46:50:85:6d:d9 master br0 permanent f2:46:50:85:6d:d9 vlan 1 master br0 permanent 33:33:00:00:00:01 self permanent f1:f2:f3:f4:f5:f6 self permanent [root@localhost bridge]# # filter by port + bridge [root@localhost bridge]# ./bridge fdb show br br0 brport dummy0 f2:46:50:85:6d:d9 master br0 permanent f2:46:50:85:6d:d9 vlan 1 master br0 permanent 33:33:00:00:00:01 self permanent f1:f2:f3:f4:f5:f6 self permanent [root@localhost bridge]# Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Disallow providing non zero VLAN ID for NIC drivers FDB add flowOr Gerlitz2014-12-161-0/+5
| | | | | | | | | | | The current implementations all use dev_uc_add_excl() and such whose API doesn't support vlans, so we can't make it with NICs HW for now. Fixes: f6f6424ba773 ('net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2014-12-111-17/+137
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-12-101-0/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | rocker: remove swdev modeRoopa Prabhu2014-12-091-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove use of 'swdev' mode in rocker. rocker dev offloads can use the BRIDGE_FLAGS_SELF to indicate offload to hardware. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()Mahesh Bandewar2014-12-091-4/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 56bfa7ee7c ("unregister_netdevice : move RTM_DELLINK to until after ndo_uninit") tried to do this ealier but while doing so it created a problem. Unfortunately the delayed rtmsg_ifinfo() also delayed call to fill_info(). So this translated into asking driver to remove private state and then query it's private state. This could have catastropic consequences. This change breaks the rtmsg_ifinfo() into two parts - one takes the precise snapshot of the device by called fill_info() before calling the ndo_uninit() and the second part sends the notification using collected snapshot. It was brought to notice when last link is deleted from an ipvlan device when it has free-ed the port and the subsequent .fill_info() call is trying to get the info from the port. kernel: [ 255.139429] ------------[ cut here ]------------ kernel: [ 255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110() kernel: [ 255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c kernel: [ 255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167 kernel: [ 255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012 kernel: [ 255.139515] 0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0 kernel: [ 255.139516] 0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000 kernel: [ 255.139518] 00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011 kernel: [ 255.139520] Call Trace: kernel: [ 255.139527] [<ffffffff815d87f4>] dump_stack+0x46/0x58 kernel: [ 255.139531] [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0 kernel: [ 255.139540] [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20 kernel: [ 255.139544] [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110 kernel: [ 255.139547] [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0 kernel: [ 255.139549] [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0 kernel: [ 255.139551] [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110 kernel: [ 255.139553] [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240 kernel: [ 255.139557] [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80 kernel: [ 255.139558] [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20 kernel: [ 255.139562] [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0 kernel: [ 255.139563] [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40 kernel: [ 255.139565] [<ffffffff8152c398>] netlink_unicast+0x178/0x230 kernel: [ 255.139567] [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420 kernel: [ 255.139571] [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0 kernel: [ 255.139575] [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130 kernel: [ 255.139577] [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0 kernel: [ 255.139578] [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310 kernel: [ 255.139581] [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0 kernel: [ 255.139585] [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70 kernel: [ 255.139589] [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500 kernel: [ 255.139597] [<ffffffff811e8336>] ? dput+0xb6/0x190 kernel: [ 255.139606] [<ffffffff811f05f6>] ? mntput+0x26/0x40 kernel: [ 255.139611] [<ffffffff811d2b94>] ? __fput+0x174/0x1e0 kernel: [ 255.139613] [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90 kernel: [ 255.139615] [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20 kernel: [ 255.139617] [<ffffffff815df092>] system_call_fastpath+0x12/0x17 kernel: [ 255.139619] ---[ end trace 5e6703e87d984f6b ]--- Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Cc: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bridge: add brport flags to dflt bridge_getlinkScott Feldman2014-12-021-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To allow brport device to return current brport flags set on port. Add returned flags to nested IFLA_PROTINFO netlink msg built in dflt getlink. With this change, netlink msg returned for bridge_getlink contains the port's offloaded flag settings (the port's SELF settings). Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | rtnl: expose physical switch id for particular deviceJiri Pirko2014-12-021-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The netdevice represents a port in a switch, it will expose IFLA_PHYS_SWITCH_ID value via rtnl. Two netdevices with the same value belong to one physical switch. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: rename netdev_phys_port_id to more generic nameJiri Pirko2014-12-021-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So this can be reused for identification of other "items" as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud