op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	net: release dst entry in dev_hard_start_xmit()	Eric Dumazet	2009-05-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One point of contention in high network loads is the dst_release() performed when a transmited skb is freed. This is because NIC tx completion calls dev_kree_skb() long after original call to dev_queue_xmit(skb). CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is quite visible if one CPU is 100% handling softirqs for a network device, since dst_clone() is done by other cpus, involving cache line ping pongs. It seems right place to release dst is in dev_hard_start_xmit(), for most devices but ones that are virtual, and some exceptions. David Miller suggested to define a new device flag, set in alloc_netdev_mq() (so that most devices set it at init time), and carefuly unset in devices which dont want a NULL skb->dst in their ndo_start_xmit(). List of devices that must clear this flag is : - loopback device, because it calls netif_rx() and quoting Patrick : "ip_route_input() doesn't accept loopback addresses, so loopback packets already need to have a dst_entry attached." - appletalk/ipddp.c : needs skb->dst in its xmit function - And all devices that call again dev_queue_xmit() from their xmit function (as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdev: add more functions to netdevice ops	Stephen Hemminger	2008-11-20	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ifb: convert to net_device_ops	Stephen Hemminger	2008-11-19	1	-3/+8
\| \| \| \| \| \| \|	Convert to new network device ops interface. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdev: Fix lockdep warnings in multiqueue configurations.	David S. Miller	2008-07-31	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When support for multiple TX queues were added, the netif_tx_lock() routines we converted to iterate over all TX queues and grab each queue's spinlock. This causes heartburn for lockdep and it's not a healthy thing to do with lots of TX queues anyways. So modify this to use a top-level lock and a "frozen" state for the individual TX queues. Signed-off-by: David S. Miller <davem@davemloft.net>
*	pkt_sched: Kill netdev_queue lock.	David S. Miller	2008-07-17	1	-20/+0
\| \| \| \| \| \| \|	We can simply use the qdisc->q.lock for all of the qdisc tree synchronization. Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdev: Allocate multiple queues for TX.	David S. Miller	2008-07-17	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	alloc_netdev_mq() now allocates an array of netdev_queue structures for TX, based upon the queue_count argument. Furthermore, all accesses to the TX queues are now vectored through the netdev_get_tx_queue() and netdev_for_each_tx_queue() interfaces. This makes it easy to grep the tree for all things that want to get to a TX queue of a net device. Problem spots which are not really multiqueue aware yet, and only work with one queue, can easily be spotted by grepping for all netdev_get_tx_queue() calls that pass in a zero index. Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdev: The ingress_lock member is no longer needed.	David S. Miller	2008-07-08	1	-4/+4
\| \| \| \| \| \| \| \|	Every qdisc is assosciated with a queue, and in the case of ingress qdiscs that will now be netdev->rx_queue so using that queue's lock is the thing to do. Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdev: Move queue_lock into struct netdev_queue.	David S. Miller	2008-07-08	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	The lock is now an attribute of the device queue. One thing to notice is that "suspicious" places emerge which will need specific training about multiple queue handling. They are so marked with explicit "netdev->rx_queue" and "netdev->tx_queue" references. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET] ifb: set separate lockdep classes for queue locks	Jarek Poplawski	2008-03-20	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ 10.536424] ======================================================= [ 10.536424] [ INFO: possible circular locking dependency detected ] [ 10.536424] 2.6.25-rc3-devel #3 [ 10.536424] ------------------------------------------------------- [ 10.536424] swapper/0 is trying to acquire lock: [ 10.536424] (&dev->queue_lock){-+..}, at: [<c0299b4a>] dev_queue_xmit+0x175/0x2f3 [ 10.536424] [ 10.536424] but task is already holding lock: [ 10.536424] (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178 [act_mirred] [ 10.536424] [ 10.536424] which lock already depends on the new lock. lockdep warns of locking order while using ifb with sch_ingress and act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock is at the beginning). This patch is only to tell lockdep that ifb is a different device (e.g. from eth) and has its own pair of queue locks. (This warning is a false-positive in common scenario of using ifb; yet there are possible situations, when this order could be dangerous; lockdep should warn in such a case.) (With suggestions by David S. Miller) Reported-and-tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET] drivers/net: statistics cleanup #1 -- save memory and shrink code	Jeff Garzik	2007-10-10	1	-18/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We now have struct net_device_stats embedded in struct net_device, and the default ->get_stats() hook does the obvious thing for us. Run through drivers/net/* and remove the driver-local storage of statistics, and driver-local ->get_stats() hook where applicable. This was just the low-hanging fruit in drivers/net; plenty more drivers remain to be updated. [ Resolved conflicts with napi_struct changes and fix sunqe build regression... -DaveM ] Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Nuke SET_MODULE_OWNER macro.	Ralf Baechle	2007-10-10	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \|	It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Make the device list and device lookups per namespace.	Eric W. Biederman	2007-10-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[RTNETLINK]: rtnl_link: allow specifying initial device address	Patrick McHardy	2007-07-11	1	-0/+12
\| \| \| \| \| \| \| \|	Drivers need to validate the initial addresses in their netlink attribute validation function or manually reject them if they can't support this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[RTNETLINK]: rtnl_link API simplification	Patrick McHardy	2007-07-11	1	-52/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	All drivers need to unregister their devices in the module unload function. While doing so they must hold the rtnl and atomically unregister the rtnl_link ops as well. This makes the rtnl_link_unregister function that takes the rtnl itself completely useless. Provide default newlink/dellink functions, make __rtnl_link_unregister and rtnl_link_unregister unregister all devices with matching rtnl_link_ops and change the existing users to take advantage of that. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IFB]: Use rtnl_link API	Patrick McHardy	2007-07-10	1	-20/+60
\| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IFB]: Keep ifb devices on list	Patrick McHardy	2007-07-10	1	-15/+21
\| \| \| \| \| \| \|	Use a list instead of an array to allow creating new devices. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IFB]: Fix crash on input device removal	Patrick McHardy	2007-03-29	1	-22/+13
\| \| \| \| \| \| \| \| \| \| \| \|	The input_device pointer is not refcounted, which means the device may disappear while packets are queued, causing a crash when ifb passes packets with a stale skb->dev pointer to netif_rx(). Fix by storing the interface index instead and do a lookup where neccessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Revert "net: ifb error path loop fix"	Linus Torvalds	2007-01-30	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 0c0b3ae68ec93b1db5c637d294647d1cca0df763. Quoth David: "Jeff, please revert It's wrong. We had a lengthy analysis of this piece of code several months ago, and it is correct. Consider, if we run the loop and we get an error the following happens: 1) attempt of ifb_init_one(i) fails, therefore we should not try to "ifb_free_one()" on "i" since it failed 2) the loop iteration first increments "i", then it check for error Therefore we must decrement "i" twice before the first free during the cleanup. One to "undo" the for() loop increment, and one to "skip" the ifb_init_one() case which failed." Reported-by: David Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@pobox.com> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	net: ifb error path loop fix	Mariusz Kozlowski	2007-01-30	1	-2/+1
\| \| \| \| \| \| \| \| \|	On error we should start freeing resources at [i-1] not [i-2]. Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
*	[NET]: ifb double-counts packets	dean gaudet	2007-01-03	1	-2/+2
\| \| \| \| \|	Signed-off-by: dean gaudet <dean@arctic.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PATCH] pr_debug: ifb: replace missing comma to separate pr_debug arguments	Zach Brown	2006-10-03	1	-2/+2
\| \| \| \| \| \| \| \|	ifb: replace missing comma to separate pr_debug arguments Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	drivers/net: Trim trailing whitespace	Jeff Garzik	2006-09-13	1	-21/+21
\| \| \| \|	Signed-off-by: Jeff Garzik <jeff@garzik.org>
*	[IFB] After ifb_init_one() failed, i is increased. Decrease	Nicolas Dichtel	2006-07-21	1	-0/+1
\| \| \| \| \| \| \| \|	It before entering in the loop for freeing the other ifb devices. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Remove obsolete #include <linux/config.h>	Jörn Engel	2006-06-30	1	-1/+0
\| \| \| \| \|	Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
*	[NET]: Add netif_tx_lock	Herbert Xu	2006-06-17	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Various drivers use xmit_lock internally to synchronise with their transmission routines. They do so without setting xmit_lock_owner. This is fine as long as netpoll is not in use. With netpoll it is possible for deadlocks to occur if xmit_lock_owner isn't set. This is because if a printk occurs while xmit_lock is held and xmit_lock_owner is not set can cause netpoll to attempt to take xmit_lock recursively. While it is possible to resolve this by getting netpoll to use trylock, it is suboptimal because netpoll's sole objective is to maximise the chance of getting the printk out on the wire. So delaying or dropping the message is to be avoided as much as possible. So the only alternative is to always set xmit_lock_owner. The following patch does this by introducing the netif_tx_lock family of functions that take care of setting/unsetting xmit_lock_owner. I renamed xmit_lock to _xmit_lock to indicate that it should not be used directly. I didn't provide irq versions of the netif_tx_lock functions since xmit_lock is meant to be a BH-disabling lock. This is pretty much a straight text substitution except for a small bug fix in winbond. It currently uses netif_stop_queue/spin_unlock_wait to stop transmission. This is unsafe as an IRQ can potentially wake up the queue. So it is safer to use netif_tx_disable. The hamradio bits used spin_lock_irq but it is unnecessary as xmit_lock must never be taken in an IRQ handler. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Increase default IFB device count.	Richard Lucassen	2006-02-23	1	-1/+1
\| \| \| \| \| \| \| \|	The most usable number of ifb devices is 2. Change the default to 2. Signed-off-by: Richard Lucassen <spamtrap@lucassen.org> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Add IFB (Intermediate Functional Block) network device.	Jamal Hadi Salim	2006-01-09	1	-0/+294
	A new device to do intermidiate functional block in a system shared manner. To use the new functionality, you need to turn on qos/classifier actions. The new functionality can be grouped as: 1) qdiscs/policies that are per device as opposed to system wide. ifb allows for a device which can be redirected to thus providing an impression of sharing. 2) Allows for queueing incoming traffic for shaping instead of dropping. Packets are redirected to this device using tc/action mirred redirect construct. If they are sent to it by plain routing instead then they will merely be dropped and the stats would indicate that. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>