summaryrefslogtreecommitdiffstats
path: root/net/core
Commit message (Collapse)AuthorAgeFilesLines
* [ETHTOOL]: Internal cleanup of ethtool_value-related handlersJeff Garzik2007-10-101-160/+39
| | | | | | | | | Several get/set functions can be handled by a passing the ethtool_op function pointer directly to a generic function. This permits deletion of a fair bit of redundant code. Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ETHTOOL]: Introduce ->{get,set}_priv_flags, ETHTOOL_[GS]PFLAGSJeff Garzik2007-10-101-0/+38
| | | | | Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ETHTOOL]: Introduce get_sset_count. Obsolete get_stats_count, self_test_countJeff Garzik2007-10-101-25/+70
| | | | | Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ETHTOOL]: Add ETHTOOL_[GS]FLAGS sub-ioctlsJeff Garzik2007-10-101-0/+61
| | | | | Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] netconsole: Support dynamic reconfiguration using configfsSatyam Sharma2007-10-101-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based upon initial work by Keiichi Kii <k-keiichi@bx.jp.nec.com>. This patch introduces support for dynamic reconfiguration (adding, removing and/or modifying parameters of netconsole targets at runtime) using a userspace interface exported via configfs. Documentation is also updated accordingly. Issues and brief design overview: (1) Kernel-initiated creation / destruction of kernel objects is not possible with configfs -- the lifetimes of the "config items" is managed exclusively from userspace. But netconsole must support boot/module params too, and these are parsed in kernel and hence netpolls must be setup from the kernel. Joel Becker suggested to separately manage the lifetimes of the two kinds of netconsole_target objects -- those created via configfs mkdir(2) from userspace and those specified from the boot/module option string. This adds complexity and some redundancy here and also means that boot/module param-created targets are not exposed through the configfs namespace (and hence cannot be updated / destroyed dynamically). However, this saves us from locking / refcounting complexities that would need to be introduced in configfs to support kernel-initiated item creation / destroy there. (2) In configfs, item creation takes place in the call chain of the mkdir(2) syscall in the driver subsystem. If we used an ioctl(2) to create / destroy objects from userspace, the special userspace program is able to fill out the structure to be passed into the ioctl and hence specify attributes such as local interface that are required at the time we set up the netpoll. For configfs, this information is not available at the time of mkdir(2). So, we keep all newly-created targets (via configfs) disabled by default. The user is expected to set various attributes appropriately (including the local network interface if required) and then write(2) "1" to the "enabled" attribute. Thus, netpoll_setup() is then called on the set parameters in the context of _this_ write(2) on the "enabled" attribute itself. This design enables the user to reconfigure existing netconsole targets at runtime to be attached to newly-come-up interfaces that may not have existed when netconsole was loaded or when the targets were actually created. All this effectively enables us to get rid of custom ioctls. (3) Ultra-paranoid configfs attribute show() and store() operations, with sanity and input range checking, using only safe string primitives, and compliant with the recommendations in Documentation/filesystems/sysfs.txt. (4) A new function netpoll_print_options() is created in the netpoll API, that just prints out the configured parameters for a netpoll structure. netpoll_parse_options() is modified to use that and it is also exported to be used from netconsole. Signed-off-by: Satyam Sharma <satyam@infradead.org> Acked-by: Keiichi Kii <k-keiichi@bx.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NEIGH]: Netlink notificationsThomas Graf2007-10-101-20/+13
| | | | | | | | | | | | | Currently neighbour event notifications are limited to update notifications and only sent if the ARP daemon is enabled. This patch extends the existing notification code by also reporting neighbours being removed due to gc or administratively and removes the dependency on the ARP daemon. This allows to keep track of neighbour states without periodically fetching the complete neighbour table. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NEIGH]: Combine neighbour cleanup and releaseThomas Graf2007-10-101-14/+13
| | | | | | | | | Introduces neigh_cleanup_and_release() to be used after a neighbour has been removed from its neighbour table. Serves as preparation to add event notifications. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [RTNETLINK]: Introduce generic rtnl_create_link().Pavel Emelianov2007-10-101-30/+53
| | | | | | | | | | | | | | This routine gets the parsed rtnl attributes and creates a new link with generic info (IFLA_LINKINFO policy). Its intention is to help the drivers, that need to create several links at once (like VETH). This is nothing but a copy-paste-ed part of rtnl_newlink() function that is responsible for creation of new device. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make NAPI polling independent of struct net_device objects.Stephen Hemminger2007-10-104-109/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device *dev, int *budget) to int foo_poll(struct napi_struct *napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, *budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKTGEN]: srcmac fixAdit Ranadive2007-09-161-0/+10
| | | | | | | From: Adit Ranadive <adit.262@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix two issues wrt. SO_BINDTODEVICE.David S. Miller2007-09-141-48/+58
| | | | | | | | | | | | | | | | | | | | | | | | 1) Comments suggest that setting optlen to zero will unbind the socket from whatever device it might be attached to. This hasn't been the case since at least 2.2.x because the first thing this function does is return -EINVAL if 'optlen' is less than sizeof(int). This check also means that passing in a two byte string doesn't work so well. It's almost as if this code was testing with "eth?" patterned strings and nothing else :-) Fix this by breaking the logic of this facility out into a seperate function which validates optlen more appropriately. The optlen==0 and small string cases now work properly. 2) We should reset the cached route of the socket after we have made the device binding changes, not before. Reported by Ben Greear. Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Do not dereference iov if length is zeroHerbert Xu2007-09-111-0/+3
| | | | | | | | | | | | When msg_iovlen is zero we shouldn't try to dereference msg_iov. Right now the only thing that tries to do so is skb_copy_and_csum_datagram_iovec. Since the total length should also be zero if msg_iovlen is zero, it's sufficient to check the total length there and simply return if it's zero. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKTGEN]: Remove write-only variable.Pavel Emelyanov2007-08-301-3/+0
| | | | | | | | | | | The pktgen_thread.pid is set to current->pid and is never used after this. So remove this at all. Found during isolating the explicit pid/tgid usage. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKTGEN]: Fix multiqueue oops.Robert Olsson2007-08-281-2/+3
| | | | | | | | | Initially pkt_dev can be NULL this causes netif_subqueue_stopped to oops. The patch below should cure it. But maybe the pktgen TX logic should be reworked to better support the new multiqueue support. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix crash in dev_mc_sync()/dev_mc_unsync()Benjamin Thery2007-08-261-4/+10
| | | | | | | | | | | This patch fixes a crash that may occur when the routine dev_mc_sync() deletes an address from the list it is currently going through. It saves the pointer to the next element before deleting the current one. The problem may also exist in dev_mc_unsync(). Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: is_power_of_2 in net/core/neighbour.cvignesh babu2007-08-261-1/+2
| | | | | | | Replacing n & (n - 1) for power of 2 check by is_power_of_2(n) Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Unexport dev_ethtoolAdrian Bunk2007-08-141-1/+0
| | | | | | | | This patch removes the no longer used EXPORT_SYMBOL(dev_ethtool). Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Share correct feature code between bridging and bondingHerbert Xu2007-08-131-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | http://bugzilla.kernel.org/show_bug.cgi?id=8797 shows that the bonding driver may produce bogus combinations of the checksum flags and SG/TSO. For example, if you bond devices with NETIF_F_HW_CSUM and NETIF_F_IP_CSUM you'll end up with a bonding device that has neither flag set. If both have TSO then this produces an illegal combination. The bridge device on the other hand has the correct code to deal with this. In fact, the same code can be used for both. So this patch moves that logic into net/core/dev.c and uses it for both bonding and bridging. In the process I've made small adjustments such as only setting GSO_ROBUST if at least one constituent device supports it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] net/core/utils: fix sparse warningJohannes Berg2007-08-071-0/+1
| | | | | | | | net_msg_warn is not defined because it is in net/sock.h which isn't included. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [RTNETLINK]: Fix warning for !CONFIG_KMODThomas Graf2007-07-311-0/+2
| | | | | | | replay label is unused otherwise. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: ethtool_perm_addr only has one implementationMatthew Wilcox2007-07-311-35/+8
| | | | | | | | | All drivers implement ethtool get_perm_addr the same way -- by calling the generic function. So we can inline the generic function into the caller and avoid going through the drivers. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: ethtool ops are the only wayMatthew Wilcox2007-07-311-14/+7
| | | | | | | | | | | | | | | During the transition to the ethtool_ops way of doing things, we supported calling the device's ->do_ioctl method to allow unconverted drivers to continue working. Those days are long behind us, all in-tree drivers use the ethtool_ops way, and so we no longer need to support this. The bonding driver is the biggest beneficiary of this; it no longer needs to call ioctl() as a fallback if ethtool_ops aren't supported. Also put a proper copyright statement on ethtool.c. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKTGEN]: make get_ipsec_sa() static and non-inlineAdrian Bunk2007-07-311-2/+1
| | | | | | | | | Non-static inline code usually doesn't makes sense. In this case making is static and non-inline is the correct solution. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Allow netdev REGISTER/CHANGENAME events to failHerbert Xu2007-07-311-10/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds code to allow errors to be passed up from event handlers of NETDEV_REGISTER and NETDEV_CHANGENAME. It also adds the notifier_from_errno/notifier_to_errnor helpers to pass the errno value up to the notifier caller. If an error is detected when a device is registered, it causes that operation to fail. A NETDEV_UNREGISTER will be sent to all event handlers. Similarly if NETDEV_CHANGENAME fails the original name is restored and a new NETDEV_CHANGENAME event is sent. As such all event handlers must be idempotent with respect to these events. When an event handler is registered NETDEV_REGISTER events are sent for all devices currently registered. Should any of them fail, we will send NETDEV_GOING_DOWN/NETDEV_DOWN/NETDEV_UNREGISTER events to that handler for the devices which have already been registered with it. The handler registration itself will fail. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Take dev_base_lock when moving device name hash list entryHerbert Xu2007-07-311-0/+4
| | | | | | | | | | | | | | When we added name-based hashing the dev_base_lock was designated as the lock to take when changing the name hash list. Unfortunately, because it was a preexisting lock that just happened to be taken in the right spots we neglected to take it in dev_change_name. The race can affect calles of __dev_get_by_name that do so without taking the RTNL. They may end up walking down the wrong hash chain and end up missing the device that they're looking for. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Call uninit if necessary in register_netdeviceHerbert Xu2007-07-311-3/+8
| | | | | | | | | | This patch makes register_netdevice call dev->uninit if the regsitration fails after dev->init has completed successfully. Very few drivers use the init/uninit calls but at least one (drivers/net/wan/sealevel.c) may leak without this change. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKTGEN]: Add missing KERN_* tags to printk()s.David S. Miller2007-07-311-46/+57
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: kernel-doc fixesRandy Dunlap2007-07-311-5/+11
| | | | | | | | | | | | Fix kernel-doc omissions in net/: Warning(linux-2.6.23-rc1//net/core/dev.c:2728): No description found for parameter 'addr' Warning(linux-2.6.23-rc1//net/core/dev.c:2752): No description found for parameter 'addr' Warning(linux-2.6.23-rc1//net/core/dev.c:3839): No description found for parameter 'net_dma' Warning(linux-2.6.23-rc1//net/core/dev.c:3877): No description found for parameter 'state' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add missing entries to family name tablesDavid Howells2007-07-211-1/+2
| | | | | | | Add missing entries to af_family_clock_key_strings[]. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix loopback crashes when multiqueue is enabled.Patrick McHardy2007-07-201-2/+2
| | | | | | | From: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-204-7/+7
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Merge branch 'master' of ↵Linus Torvalds2007-07-193-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits) [TG3]: Fix msi issue with kexec/kdump. [NET] XFRM: Fix whitespace errors. [NET] TIPC: Fix whitespace errors. [NET] SUNRPC: Fix whitespace errors. [NET] SCTP: Fix whitespace errors. [NET] RXRPC: Fix whitespace errors. [NET] ROSE: Fix whitespace errors. [NET] RFKILL: Fix whitespace errors. [NET] PACKET: Fix whitespace errors. [NET] NETROM: Fix whitespace errors. [NET] NETFILTER: Fix whitespace errors. [NET] IPV4: Fix whitespace errors. [NET] DCCP: Fix whitespace errors. [NET] CORE: Fix whitespace errors. [NET] BLUETOOTH: Fix whitespace errors. [NET] AX25: Fix whitespace errors. [PATCH] mac80211: remove rtnl locking in ieee80211_sta.c [PATCH] mac80211: fix GCC warning on 64bit platforms [GENETLINK]: Dynamic multicast groups. [NETLIKN]: Allow removing multicast groups. ...
| * [NET] CORE: Fix whitespace errors.YOSHIFUJI Hideaki2007-07-193-3/+3
| | | | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* | lockdep: fixup sk_callback_lock annotationPeter Zijlstra2007-07-191-4/+19
|/ | | | | | | | | | the two init sites resulted in inconsistend names for the lock class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [NET]: move __dev_addr_discard adjacent to dev_addr_discard for readabilityDenis Cheng2007-07-181-14/+14
| | | | | Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: merge dev_unicast_discard and dev_mc_discard into oneDenis Cheng2007-07-181-12/+4
| | | | | | | this two functions could share the dev->_xmit_lock acquired context. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: move dev_mc_discard from dev_mcast.c to dev.cDenis Cheng2007-07-182-13/+13
| | | | | | | | | | | | | | | | | | | | Because this function is only called by unregister_netdevice, this moving could make this non-global function static, and also remove its declaration in netdevice.h; Any further, function __dev_addr_discard is also just called by dev_mc_discard and dev_unicast_discard, keeping this two functions both in one c file could make __dev_addr_discard also static and remove its declaration in netdevice.h; Futhermore, the sequential call to dev_unicast_discard and then dev_mc_discard in unregister_netdevice have a similar mechanism that: (netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh), they should merged into one to eliminate duplicates in acquiring and releasing the dev->_xmit_lock, this would be done in my following patch. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: gen_estimator deadlock fixRanko Zivojnovic2007-07-181-32/+49
| | | | | | | | | | | | | | | | | | | | | | | -Fixes ABBA deadlock noted by Patrick McHardy <kaber@trash.net>: > There is at least one ABBA deadlock, est_timer() does: > read_lock(&est_lock) > spin_lock(e->stats_lock) (which is dev->queue_lock) > > and qdisc_destroy calls htb_destroy under dev->queue_lock, which > calls htb_destroy_class, then gen_kill_estimator and this > write_locks est_lock. To fix the ABBA deadlock the rate estimators are now kept on an rcu list. -The est_lock changes the use from protecting the list to protecting the update to the 'bstat' pointer in order to avoid NULL dereferencing. -The 'interval' member of the gen_estimator structure removed as it is not needed. Signed-off-by: Ranko Zivojnovic <ranko@spidernet.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Freezer: make kernel threads nonfreezable by defaultRafael J. Wysocki2007-07-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the freezer treats all tasks as freezable, except for the kernel threads that explicitly set the PF_NOFREEZE flag for themselves. This approach is problematic, since it requires every kernel thread to either set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't care for the freezing of tasks at all. It seems better to only require the kernel threads that want to or need to be frozen to use some freezer-related code and to remove any freezer-related code from the other (nonfreezable) kernel threads, which is done in this patch. The patch causes all kernel threads to be nonfreezable by default (ie. to have PF_NOFREEZE set by default) and introduces the set_freezable() function that should be called by the freezable kernel threads in order to unset PF_NOFREEZE. It also makes all of the currently freezable kernel threads call set_freezable(), so it shouldn't cause any (intentional) change of behaviour to appear. Additionally, it updates documentation to describe the freezing of tasks more accurately. [akpm@linux-foundation.org: build fixes] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Revert "[NET]: Fix races in net_rx_action vs netpoll."Linus Torvalds2007-07-161-8/+0
| | | | | | | | | | | | | | This reverts commit 29578624e354f56143d92510fff33a8b2aaa2c03. Ingo Molnar reports complete breakage with his e1000 card (no networking, card reports transmit timeouts), and bisected it down to this commit. Let's figure out what went wrong, but not keep breaking machines until we do. Cc: Ingo Molnar <mingo@elte.hu> Cc: Olaf Kirch <olaf.kirch@oracle.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* O_CLOEXEC for SCM_RIGHTSUlrich Drepper2007-07-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part two in the O_CLOEXEC saga: adding support for file descriptors received through Unix domain sockets. The patch is once again pretty minimal, it introduces a new flag for recvmsg and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit is not used otherwise but the networking people will know better. This new flag is not recognized by recvfrom and recv. These functions cannot be used for that purpose and the asymmetry this introduces is not worse than the already existing MSG_CMSG_COMPAT situations. The patch must be applied on the patch which introduced O_CLOEXEC. It has to remove static from the new get_unused_fd_flags function but since scm.c cannot live in a module the function still hasn't to be exported. Here's a test program to make sure the code works. It's so much longer than the actual patch... #include <errno.h> #include <error.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <sys/un.h> #ifndef O_CLOEXEC # define O_CLOEXEC 02000000 #endif #ifndef MSG_CMSG_CLOEXEC # define MSG_CMSG_CLOEXEC 0x40000000 #endif int main (int argc, char *argv[]) { if (argc > 1) { int fd = atol (argv[1]); printf ("child: fd = %d\n", fd); if (fcntl (fd, F_GETFD) == 0 || errno != EBADF) { puts ("file descriptor valid in child"); return 1; } return 0; } struct sockaddr_un sun; strcpy (sun.sun_path, "./testsocket"); sun.sun_family = AF_UNIX; char databuf[] = "hello"; struct iovec iov[1]; iov[0].iov_base = databuf; iov[0].iov_len = sizeof (databuf); union { struct cmsghdr hdr; char bytes[CMSG_SPACE (sizeof (int))]; } buf; struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1, .msg_control = buf.bytes, .msg_controllen = sizeof (buf) }; struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg); cmsg->cmsg_level = SOL_SOCKET; cmsg->cmsg_type = SCM_RIGHTS; cmsg->cmsg_len = CMSG_LEN (sizeof (int)); msg.msg_controllen = cmsg->cmsg_len; pid_t child = fork (); if (child == -1) error (1, errno, "fork"); if (child == 0) { int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "bind"); if (listen (sock, SOMAXCONN) < 0) error (1, errno, "listen"); int conn = accept (sock, NULL, NULL); if (conn == -1) error (1, errno, "accept"); *(int *) CMSG_DATA (cmsg) = sock; if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0) error (1, errno, "sendmsg"); return 0; } /* For a test suite this should be more robust like a barrier in shared memory. */ sleep (1); int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "connect"); unlink (sun.sun_path); *(int *) CMSG_DATA (cmsg) = -1; if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0) error (1, errno, "recvmsg"); int fd = *(int *) CMSG_DATA (cmsg); if (fd == -1) error (1, 0, "no descriptor received"); char fdname[20]; snprintf (fdname, sizeof (fdname), "%d", fd); execl ("/proc/self/exe", argv[0], fdname, NULL); puts ("execl failed"); return 1; } [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch] [akpm@linux-foundation.org: build fix] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Michael Buesch <mb@bu3sch.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.Michael Chan2007-07-141-0/+12
| | | | | | | | | Add ethtool utility function to set or clear IPV6_CSUM feature flag. Modify tg3.c and bnx2.c to use this function when doing ethtool -K to change tx checksum. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add macvlan driverPatrick McHardy2007-07-141-0/+26
| | | | | | | | Add macvlan driver, which allows to create virtual ethernet devices based on MAC address. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: dev_mcast: add multicast list synchronization helpersPatrick McHardy2007-07-141-0/+75
| | | | | | | | | | | | | | | | | | | | The method drivers currently use to synchronize multicast lists is not very pretty: - walk the multicast list - search each entry on a copy of the previous list - if new add to lower device - walk the copy of the previous list - search each entry on the current list - if removed delete from lower device - copy entire list This patch adds a new field to struct dev_addr_list to store the synchronization state and adds two helper functions for synchronization and cleanup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add net_device change_rx_mode callbackPatrick McHardy2007-07-141-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the set_multicast_list (and set_rx_mode) callbacks are responsible for configuring the device according to the IFF_PROMISC, IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in case of set_rx_mode). These callbacks can be invoked from BH context without the rtnl_mutex by dev_mc_add/dev_mc_delete, which makes reading the device flags and promiscous/allmulti count racy. For real hardware drivers that just commit all changes to the hardware this is not a real problem since the stack guarantees to call them for every change, so at least the final call will not race and commit the correct configuration to the hardware. For software devices that want to synchronize promiscous and multicast state to an underlying device however this can cause corruption of the underlying device's flags or promisc/allmulti counts. When the software device is concurrently put in promiscous or allmulti mode while set_multicast_list is invoked from bottem half context, the device might synchronize the change to the underlying device without holding the rtnl_mutex, which races with concurrent changes to the underlying device. Add a dev->change_rx_flags hook that is invoked when any of the flags that affect rx filtering change (under the rtnl_mutex), which allows drivers to perform synchronization immediately and only synchronize the address lists in set_multicast_list/set_rx_mode. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ioat-md-accel-for-linus' of ↵Linus Torvalds2007-07-131-34/+78
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://lost.foo-projects.org/~dwillia2/git/iop * 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits) ioatdma: add the unisys "i/oat" pci vendor/device id ARM: Add drivers/dma to arch/arm/Kconfig iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver iop13xx: surface the iop13xx adma units to the iop-adma driver dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines md: remove raid5 compute_block and compute_parity5 md: handle_stripe5 - request io processing in raid5_run_ops md: handle_stripe5 - add request/completion logic for async expand ops md: handle_stripe5 - add request/completion logic for async read ops md: handle_stripe5 - add request/completion logic for async check ops md: handle_stripe5 - add request/completion logic for async compute ops md: handle_stripe5 - add request/completion logic for async write ops md: common infrastructure for running operations with raid5_run_ops md: raid5_run_ops - run stripe operations outside sh->lock raid5: replace custom debug PRINTKs with standard pr_debug raid5: refactor handle_stripe5 and handle_stripe6 (v3) async_tx: add the async_tx api xor: make 'xor_blocks' a library routine for use with async_tx dmaengine: make clients responsible for managing channels dmaengine: refactor dmaengine around dma_async_tx_descriptor ...
| * dmaengine: make clients responsible for managing channelsDan Williams2007-07-131-34/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation assumes that a channel will only be used by one client at a time. In order to enable channel sharing the dmaengine core is changed to a model where clients subscribe to channel-available-events. Instead of tracking how many channels a client wants and how many it has received the core just broadcasts the available channels and lets the clients optionally take a reference. The core learns about the clients' needs at dma_event_callback time. In support of multiple operation types, clients can specify a capability mask to only be notified of channels that satisfy a certain set of capabilities. Changelog: * removed DMA_TX_ARRAY_INIT, no longer needed * dma_client_chan_free -> dma_chan_release: switch to global reference counting only at device unregistration time, before it was also happening at client unregistration time * clients now return dma_state_client to dmaengine (ack, dup, nak) * checkpatch.pl fixes * fixup merge with git-ioat Cc: Chris Leech <christopher.leech@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: David S. Miller <davem@davemloft.net>
* | [RTNETLINK]: rtnl_link: allow specifying initial device addressPatrick McHardy2007-07-111-2/+7
| | | | | | | | | | | | | | | | Drivers need to validate the initial addresses in their netlink attribute validation function or manually reject them if they can't support this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [RTNETLINK]: rtnl_link API simplificationPatrick McHardy2007-07-111-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | All drivers need to unregister their devices in the module unload function. While doing so they must hold the rtnl and atomically unregister the rtnl_link ops as well. This makes the rtnl_link_unregister function that takes the rtnl itself completely useless. Provide default newlink/dellink functions, make __rtnl_link_unregister and rtnl_link_unregister unregister all devices with matching rtnl_link_ops and change the existing users to take advantage of that. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [NET]: Fix races in net_rx_action vs netpoll.Olaf Kirch2007-07-111-0/+8
| | | | | | | | | | | | | | | | Keep netpoll/poll_napi from messing with the poll_list. Only net_rx_action is allowed to manipulate the list. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud