summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* unix_diag: Dumping all sockets corePavel Emelyanov2011-12-161-1/+75
| | | | | | | | Walk the unix sockets table and fill the core response structure, which includes type, state and inode. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* unix_diag: Basic module skeletonPavel Emelyanov2011-12-162-0/+81
| | | | | | | | Includes basic module_init/_exit functionality, dump/get_exact stubs and declares the basic API structures for request and response. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* af_unix: Export stuff required for diag modulePavel Emelyanov2011-12-162-3/+9
| | | | | Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sock_diag: Generalize requests cookies managementsPavel Emelyanov2011-12-165-21/+27
| | | | | | | | | The sk address is used as a cookie between dump/get_exact calls. It will be required for unix socket sdumping, so move it from inet_diag to sock_diag. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sock_diag: Fix module netlink aliasesPavel Emelyanov2011-12-165-9/+10
| | | | | | | | | | | | | | I've made a mistake when fixing the sock_/inet_diag aliases :( 1. The sock_diag layer should request the family-based alias, not just the IPPROTO_IP one; 2. The inet_diag layer should request for AF_INET+protocol alias, not just the protocol one. Thus fix this. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sock_diag: Move the SOCK_DIAG_BY_FAMILY cmd declarationPavel Emelyanov2011-12-162-1/+3
| | | | | | | It should belong to sock_diag, not inet_diag. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-12-1616-75/+102
|\ | | | | | | | | | | | | Conflicts: drivers/net/ethernet/freescale/fsl_pq_mdio.c net/batman-adv/translation-table.c net/ipv6/route.c
| * ipv6: Check dest prefix length on original route not copied one in ↵David S. Miller2011-12-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rt6_alloc_cow(). After commit 8e2ec639173f325977818c45011ee176ef2b11f6 ("ipv6: don't use inetpeer to store metrics for routes.") the test in rt6_alloc_cow() for setting the ANYCAST flag is now wrong. 'rt' will always now have a plen of 128, because it is set explicitly to 128 by ip6_rt_copy. So to restore the semantics of the test, check the destination prefix length of 'ort'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * sch_gred: should not use GFP_KERNEL while holding a spinlockEric Dumazet2011-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | gred_change_vq() is called under sch_tree_lock(sch). This means a spinlock is held, and we are not allowed to sleep in this context. We might pre-allocate memory using GFP_KERNEL before taking spinlock, but this is not suitable for stable material. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipip, sit: copy parms.name after register_netdeviceTed Feng2011-12-122-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Same fix as 731abb9cb2 for ipip and sit tunnel. Commit 1c5cae815d removed an explicit call to dev_alloc_name in ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice will now create a valid name, however the tunnel keeps a copy of the name in the private parms structure. Fix this by copying the name back after register_netdevice has successfully returned. This shows up if you do a simple tunnel add, followed by a tunnel show: $ sudo ip tunnel add mode ipip remote 10.2.20.211 $ ip tunnel tunl0: ip/ip remote any local any ttl inherit nopmtudisc tunl%d: ip/ip remote 10.2.20.211 local any ttl inherit $ sudo ip tunnel add mode sit remote 10.2.20.212 $ ip tunnel sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16 sit%d: ioctl 89f8 failed: No such device sit%d: ipv6/ip remote 10.2.20.212 local any ttl inherit Cc: stable@vger.kernel.org Signed-off-by: Ted Feng <artisdom@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: Fix for adding multicast route for loopback device automatically.Li Wei2011-12-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no obvious reason to add a default multicast route for loopback devices, otherwise there would be a route entry whose dst.error set to -ENETUNREACH that would blocking all multicast packets. ==================== [ more detailed explanation ] The problem is that the resulting routing table depends on the sequence of interface's initialization and in some situation, that would block all muticast packets. Suppose there are two interfaces on my computer (lo and eth0), if we initailize 'lo' before 'eth0', the resuting routing table(for multicast) would be # ip -6 route show | grep ff00:: unreachable ff00::/8 dev lo metric 256 error -101 ff00::/8 dev eth0 metric 256 When sending multicasting packets, routing subsystem will return the first route entry which with a error set to -101(ENETUNREACH). I know the kernel will set the default ipv6 address for 'lo' when it is up and won't set the default multicast route for it, but there is no reason to stop 'init' program from setting address for 'lo', and that is exactly what systemd did. I am sure there is something wrong with kernel or systemd, currently I preferred kernel caused this problem. ==================== Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵John W. Linville2011-12-0913-67/+76
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
| | * ssb: fix init regression with SoCsHauke Mehrtens2011-12-081-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a Data bus error on some SoCs. The first fix for this problem did not solve it on all devices. commit 6ae8ec27868bfdbb815287bee8146acbefaee867 Author: Rafał Miłecki <zajec5@gmail.com> Date: Tue Jul 5 17:25:32 2011 +0200 ssb: fix init regression of hostmode PCI core In ssb_pcicore_fix_sprom_core_index() the sprom on the PCI core is accessed, but the sprom only exists when the ssb bus is connected over a PCI bus to the rest of the system and not when the SSB Bus is the main system bus. SoCs sometimes have a PCI host controller and there this code will not be executed, but there are some old SoCs with an PCI controller in client mode around and ssb_pcicore_fix_sprom_core_index() should not be called on these devices too. The PCI controller on these devices are unused, but without this fix it results in an Data bus error when it gets initialized. Cc: Michael Buesch <m@bues.ch> Cc: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * rtl8192{ce,cu,de,se}: avoid problems because of possible ERFOFF -> ERFSLEEP ↵Philipp Dreimann2011-12-074-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | transition In drivers rtl8192ce, rtl8192cu, rtl8192se, and rtl8192de, break statements would allow ppsc->rfpwr_state to be changed to ERFSLEEP even though the device is actually in ERFOFF. Signed-off-by: Philipp Dreimann <philipp@dreimann.net> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> Cc: Chaoming Li <chaoming_li@realsil.com.cn> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: fix another race in aggregation startJohannes Berg2011-12-071-45/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emmanuel noticed that when mac80211 stops the queues for aggregation that can leave a packet pending. This packet will be given to the driver after the AMPDU callback, but as a non-aggregated packet which messes up the sequence number etc. I also noticed by looking at the code that if packets are being processed while we clear the WANT_START bit, they might see it cleared already and queue up on tid_tx->pending. If the driver then rejects the new aggregation session we leak the packet. Fix both of these issues by changing this code to not stop the queues at all. Instead, let packets queue up on the tid_tx->pending queue instead of letting them get to the driver, and add code to recover properly in case the driver rejects the session. (The patch looks large because it has to move two functions to before their new use.) Cc: stable@vger.kernel.org Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * ath9k: fix check for antenna diversity supportFelix Fietkau2011-12-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes a regression on single-stream chips introduced in commit 43c3528430bd29f5e52438cad7cf7c0c62bf4583 "ath9k: implement .get_antenna and .set_antenna" Signed-off-by: Felix Fietkau <nbd@openwrt.org> Reported-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * Merge branch 'master' of ↵John W. Linville2011-12-066-15/+24
| | |\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth
| * | \ Merge branch 'batman-adv/maint' of git://git.open-mesh.org/linux-mergeDavid S. Miller2011-12-071-5/+22
| |\ \ \
| | * | | batman-adv: delete global entry in case of roamingAntonio Quartulli2011-12-071-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When receiving a DEL change for a client due to a roaming event (change is marked with TT_CLIENT_ROAM), each node has to check if the client roamed to itself or somewhere else. In the latter case the global entry is kept to avoid having no route at all otherwise we can safely delete the global entry Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| | * | | batman-adv: in case of roaming mark the client with TT_CLIENT_ROAMAntonio Quartulli2011-12-071-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of a client roaming from node A to node B, the latter have to mark the corresponding global entry with TT_CLIENT_ROAM (instead of TT_CLIENT_PENDING). Marking a global entry with TT_CLIENT_PENDING will end up in keeping such entry forever (because this flag is only meant to be used with local entries and it is never checked on global ones). In the worst case (all the clients roaming to the same node A) the local and the global table will contain exactly the same clients. Batman-adv will continue to work, but the memory usage is duplicated. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
| * | | | fsl_pq_mdio: Clean up tbi address configurationAndy Fleming2011-12-071-45/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code for setting the address of the internal TBI PHY was convoluted enough without a maze of ifdefs. Clean it up a bit so we allow the logic to fail down to -ENODEV at the end of the if/else ladder, rather than using ifdefs to repeat the same failure code over and over. Also, remove the support for the auto-configuration. I'm not aware of anyone using it, and it ends up using the bus mutex before it's been initialized. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ppp: fix pptp double release_sock in pptp_bind()Djalal Harouni2011-12-071-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/fec: fix the use of pdev->idShawn Guo2011-12-071-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pdev->id is used in several places for different purpose. All these uses assume it's always the id of fec device which is >= 0. However this is only true for non-DT case. When DT plays, pdev->id is always -1, which will break these pdev->id users. Instead of fixing all these users one by one, this patch introduces a new member 'dev_id' to 'struct fec_enet_private' for holding the correct fec device id, and replaces all the existing uses of pdev->id with this dev_id. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tg3: Break out RSS indir table init and assignmentMatt Carlson2011-12-152-23/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch creates a new device member to hold the RSS indirection table and separates out the code that initializes the table from the code that programs the table into device registers. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tg3: Use mii_advertise_flowctrlMatt Carlson2011-12-151-19/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces tg3's internal tg3_advert_flowctrl_1000T function with mii_advertise_flowctrl provided by the kernel headers. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tg3: Add 57766 ASIC rev supportMatt Carlson2011-12-152-27/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the 57766 ASIC revision. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tg3: Make the TX BD DMA limit configurableMatt Carlson2011-12-152-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 57766 ASIC rev will impose a new TX BD DMA limit on the driver. This patch prepares for 57766 support by making the tx BD DMA limit tunable. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tg3: Enable EEE support for capable 10/100 devsMatt Carlson2011-12-151-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some devices in the 57765 ASIC rev that are EEE capable. Unfortunately the EEE setup code only gets executed if the device is gigabit capable. This patch fixes the problem. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp_memcontrol: fix reversed if conditionDan Carpenter2011-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should only dereference the pointer if it's valid, not the other way round. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Move limit definitions outside CONFIG_INETGlauber Costa2011-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They need to be available for other protocols as well, since they are used in sock.c openly Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | inet: remove rcu protection on tw_netEric Dumazet2011-12-141-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b099ce2602d806 (net: Batch inet_twsk_purge) added rcu protection on tw_net for no obvious reason. struct net are refcounted anyway since timewait sockets escape from rcu protected sections. tw_net stay valid for the whole timwait lifetime. This also removes a lot of sparse errors. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: ping: remove some sparse errorsEric Dumazet2011-12-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table' was not declared. Should it be static? net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2 (different signedness) net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *<noident> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | cls_flow: remove one dynamic arrayEric Dumazet2011-12-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Its better to use a predefined size for this small automatic variable. Removes a sparse error as well : net/sched/cls_flow.c:288:13: error: bad constant expression Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | bnx2x: handle vpd data longer than 128 bytesBarak Witkowski2011-12-141-7/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Barak Witkowski <barak@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | vlan: static functionsEric Dumazet2011-12-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6d4cdf47d2 (vlan: add 802.1q netpoll support) forgot to declare as static some private functions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | vlan: add rtnl_dereference() annotationsDan Carpenter2011-12-141-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original code generates a Sparse warning: net/8021q/vlan_core.c:336:9: error: incompatible types in comparison expression (different address spaces) It's ok to dereference __rcu pointers here because we are holding the RTNL lock. I've added some calls to rtnl_dereference() to silence the warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | rtnetlink: rtnl_link_register() sanity testEric Dumazet2011-12-141-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before adding a struct rtnl_link_ops into link_ops list, check it doesnt clash with a prior one. Based on a previous patch from Alexander Smirnov Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv6: If neigh lookup fails during icmp6 dst allocation, propagate error.David S. Miller2011-12-131-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't just succeed with a route that has a NULL neighbour attached. This follows the behavior of addrconf_dst_alloc(). Allowing this kind of route to end up with a NULL neigh attached will result in packet drops on output until the route is somehow invalidated, since nothing will meanwhile try to lookup the neigh again. A statistic is bumped for the case where we see a neigh-less route on output, but the resulting packet drop is otherwise silent in nature, and frankly it's a hard error for this to happen and ipv6 should do what ipv4 does which is say something in the kernel logs. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: Remove unused neighbour layer ops.David S. Miller2011-12-133-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's simpler to just keep these things out until there is a real user of them, so we can see what the needs actually are, rather than keep these things around as useless overhead. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_en: updated driver version to 2.0Yevgeny Petrilin2011-12-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: updated driver version to 1.1Yevgeny Petrilin2011-12-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: Modify driver initialization flow to accommodate SRIOV for EthernetJack Morgenstein2011-12-137-157/+934
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Added module parameters sr_iov and probe_vf for controlling enablement of SRIOV mode. 2. Increased default max num-qps, num-mpts and log_num_macs to accomodate SRIOV mode 3. Added port_type_array as a module parameter to allow driver startup with ports configured as desired. In SRIOV mode, only ETH is supported, and this array is ignored; otherwise, for the case where the FW supports both port types (ETH and IB), the port_type_array parameter is used. By default, the port_type_array is set to configure both ports as IB. 4. When running in sriov mode, the master needs to initialize the ICM eq table to hold the eq's for itself and also for all the slaves. 5. mlx4_set_port_mask() now invoked from mlx4_init_hca, instead of in mlx4_dev_cap. 6. Introduced sriov VF (slave) device startup/teardown logic (mainly procedures mlx4_init_slave, mlx4_slave_exit, mlx4_slave_cap, mlx4_slave_exit and flow modifications in __mlx4_init_one, mlx4_init_hca, and mlx4_setup_hca). VFs obtain their startup information from the PF (master) device via the comm channel. 7. In SRIOV mode (both PF and VF), MSI_X must be enabled, or the driver aborts loading the device. 8. Do not allow setting port type via sysfs when running in SRIOV mode. 9. mlx4_get_ownership: Currently, only one PF is supported by the driver. If the HCA is burned with FW which enables more than one PF, only one of the PFs is allowed to run. The first one up grabs a FW ownership semaphone -- all other PFs will find that semaphore taken, and the driver will not allow them to run. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Liran Liss <liranl@mellanox.co.il> Signed-off-by: Marcel Apfelbaum <marcela@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: adjust catas operation for SRIOV modeJack Morgenstein2011-12-132-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running in SRIOV mode, driver should not automatically start/stop the mlx4_core upon sensing an HCA internal error -- doing this disables/enables sriov, which will cause the hypervisor to hang if there are running VMs with attached VFs. In addition, on VMs the catas process should not run at all, since the HCA error buffer is not available to VMs in the BARs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: mtts resources units changed to offsetMarcel Apfelbaum2011-12-137-99/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous implementation mtts are managed by: 1. order - log(mtt segments), 'mtt segment' groups several mtts together. 2. first_seg - segment location relative to mtt table. In the current implementation: 1. order - log(mtts) rather than segments 2. offset - mtt index in mtt table Note: The actual mtt allocation is made in segments but it is transparent to callers. Rational: The mtt resource holders are not interested on how the allocation of mtt is done, but rather on how they will use it. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_en: Allow communication between functions on same hostEugenia Emantayev2011-12-132-11/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To enable internal loopback, always fill DMAC in control segment when transmitting the packet, once this is done, the packet is subject for loopback for if the DMAC mathces one of the multicast/unicast addresses registered on the physical port. In receive path if source MAC is our own MAC and we are not in selftest, or not in force LB mode - drop this packet. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4: Ethernet port management modificationsEugenia Emantayev2011-12-139-261/+630
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The physical port is now common to the PF and VFs. The port resources and configuration is managed by the PF, VFs can only influence the MTU of the port, it is set as max among all functions, Each function allocates RX buffers of required size to meet it's MTU enforcement. Port management code was moved to mlx4_core, as the mlx4_en module is virtualization unaware Move handling qp functionality to mlx4_get_eth_qp/mlx4_put_eth_qp including reserve/release range and add/release unicast steering. Let mlx4_register/unregister_mac deal only with MAC (un)registration. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4: Traffic steering management support for SRIOVEugenia Emantayev2011-12-135-69/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let multicast/unicast attaching flow go through resource tracker. The PF is the one responsible for managing all the steering entries. Define and use module parameter that determines the number of qps per multicast group. Minor changes in function calls according to changed prototype. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_ib: disable SRIOV mode for IB ports (not yet supported)Jack Morgenstein2011-12-131-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: resource tracking for HCA resources used by guestsEli Cohen2011-12-139-120/+3628
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The resource tracker is used to track usage of HCA resources by the different guests. Virtual functions (VFs) are attached to guest operating systems but resources are allocated from the same pool and are assigned to VFs. It is essential that hostile/buggy guests not be able to affect the operation of other VFs, possibly attached to other guest OSs since ConnectX firmware is not tolerant to misuse of resources. The resource tracker module associates each resource with a VF and maintains state information for the allocated object. It also defines allowed state transitions and enforces them. Relationships between resources are also referred to. For example, CQs are pointed to by QPs, so it is forbidden to destroy a CQ if a QP refers to it. ICM memory is always accessible through the primary function and hence it is allocated by the owner of the primary function. When a guest dies, an FLR is generated for all the VFs it owns and all the resources it used are freed. The tracked resource types are: QPs, CQs, SRQs, MPTs, MTTs, MACs, RES_EQs, and XRCDNs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlx4_core: Add wrapper functions and comm channel and slave event support to EQsJack Morgenstein2011-12-131-25/+350
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Passing async events to slaves: In SRIOV mode, each slave creates its own async EQ, but only the master can register directly with the FW to receive async events. Async events which should be passed to slaves (such as a WQ_ACCESS_ERROR for a QP owned by a slave) are generated at the slave by the master using the GEN_EQE FW command. Wrapper functions: mlx4_MAP_EQ_wrapper Only the master can map an EQ. The slave commands to map their EQs arrive at the master via the comm channel. The master then invokes the wrapper function to do the work (and enter the resource in the tracking database). New events: COMM_CHANNEL and FLR The COMM_CHANNEL event arrives only at the master, and signals that a slave has posted a command on the comm channel. The FLR event is generated by the FW when a guest operating a VF unexpectedly goes down. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud