summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAITEric Dumazet2011-10-276-17/+16
| | | | | | | | | | | | | | | commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT) fixed IPv4 only. This part is for the IPv6 side, adding a tclass param to ip6_xmit() We alias tw_tclass and tw_tos, if socket family is INET6. [ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill TOS field, TCLASS is not taken into account ] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2011-10-262-2/+4
|\ | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: caif: Fix BUG() with network namespaces net: make bonding slaves honour master's skb->priority net: Unlock sock before calling sk_free()
| * caif: Fix BUG() with network namespacesDavid Woodhouse2011-10-251-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The caif code will register its own pernet_operations, and then register a netdevice_notifier. Each time the netdevice_notifier is triggered, it'll do some stuff... including a lookup of its own pernet stuff with net_generic(). If the net_generic() call ever returns NULL, the caif code will BUG(). That doesn't seem *so* unreasonable, I suppose — it does seem like it should never happen. However, it *does* happen. When we clone a network namespace, setup_net() runs through all the pernet_operations one at a time. It gets to loopback before it gets to caif. And loopback_net_init() registers a netdevice... while caif hasn't been initialised. So the caif netdevice notifier triggers, and immediately goes BUG(). We could imagine a complex and overengineered solution to this generic class of problems, but this patch takes the simple approach. It just makes caif_device_notify() *not* go looking for its pernet data structures if the device it's being notified about isn't a caif device in the first place. Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Unlock sock before calling sk_free()Thomas Gleixner2011-10-251-0/+1
| | | | | | | | | | Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'tty-next' of ↵Linus Torvalds2011-10-261-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (79 commits) TTY: serial_core: Fix crash if DCD drop during suspend tty/serial: atmel_serial: bootconsole removed from auto-enumerates Revert "TTY: call tty_driver_lookup_tty unconditionally" tty/serial: atmel_serial: add device tree support tty/serial: atmel_serial: auto-enumerate ports tty/serial: atmel_serial: whitespace and braces modifications tty/serial: atmel_serial: change platform_data variable name tty/serial: RS485 bindings for device tree TTY: call tty_driver_lookup_tty unconditionally TTY: pty, release tty in all ptmx_open fail paths TTY: make tty_add_file non-failing TTY: drop driver reference in tty_open fail path 8250_pci: Fix kernel panic when pch_uart is disabled h8300: drivers/serial/Kconfig was moved parport_pc: release IO region properly if unsupported ITE887x card is found tty: Support compat_ioctl get/set termios_locked hvc_console: display printk messages on console. TTY: snyclinkmp: forever loop in tx_load_dma_buffer() tty/n_gsm: avoid fifo overflow in gsm_dlci_data_output tty/n_gsm: fix a bug in gsm_dlci_data_output (adaption = 2 case) ... Fix up Conflicts in: - drivers/tty/serial/8250_pci.c Trivial conflict with removed duplicate device ID - drivers/tty/serial/atmel_serial.c Annoying silly conflict between "specify the port num via platform_data" and other changes to atmel_console_init
| * | TTY: use tty_wait_until_sent_from_close in other driversJiri Slaby2011-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's use the newly added helper to avoid stalls in drivers which are not yet ported to tty_port helpers. Those which are broken (call tty_wait_until_sent with irqs disabled) are left untouched. They are in a deeper trouble than we are trying to solve here. This includes amiserial, 68328serial, 68360serial and crisv10. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | Merge branch 'for-linus' of git://github.com/ericvh/linuxLinus Torvalds2011-10-266-411/+554
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://github.com/ericvh/linux: 9p: fix 9p.txt to advertise msize instead of maxdata net/9p: Convert net/9p protocol dumps to tracepoints fs/9p: change an int to unsigned int fs/9p: Cleanup option parsing in 9p 9p: move dereference after NULL check fs/9p: inode file operation is properly initialized init_special_inode fs/9p: Update zero-copy implementation in 9p
| * | net/9p: Convert net/9p protocol dumps to tracepointsAneesh Kumar K.V2011-10-243-79/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This helps in more control over debugging. root@qemu-img-64:~# ls /pass/123 ls: cannot access /pass/123: No such file or directory root@qemu-img-64:~# cat /sys/kernel/debug/tracing/trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | ls-1536 [001] 70.928584: 9p_protocol_dump: clnt 18446612132784021504 P9_TWALK(tag = 1) 000: 16 00 00 00 6e 01 00 01 00 00 00 02 00 00 00 01 010: 00 03 00 31 32 33 00 00 00 ff ff ff ff 00 00 00 ls-1536 [001] 70.928587: <stack trace> => trace_9p_protocol_dump => p9pdu_finalize => p9_client_rpc => p9_client_walk => v9fs_vfs_lookup => d_alloc_and_lookup => walk_component => path_lookupat ls-1536 [000] 70.929696: 9p_protocol_dump: clnt 18446612132784021504 P9_RLERROR(tag = 1) 000: 0b 00 00 00 07 01 00 02 00 00 00 4e 03 00 02 00 010: 00 00 00 00 03 00 02 00 00 00 00 00 ff 43 00 00 ls-1536 [000] 70.929697: <stack trace> => trace_9p_protocol_dump => p9_client_rpc => p9_client_walk => v9fs_vfs_lookup => d_alloc_and_lookup => walk_component => path_lookupat => do_path_lookup Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | fs/9p: change an int to unsigned intDan Carpenter2011-10-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Without this msize=4294967295 will result in a crash Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | fs/9p: Cleanup option parsing in 9pAneesh Kumar K.V2011-10-241-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | Instead of saying all integer argument option should be listed in the beginning move integer parsing to each option type. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | 9p: move dereference after NULL checkDan Carpenter2011-10-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | We dereferenced "req->tc" and "req->rc" before checking for NULL. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | fs/9p: Update zero-copy implementation in 9pAneesh Kumar K.V2011-10-246-331/+500
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * remove lot of update to different data structure * add a seperate callback for zero copy request. * above makes non zero copy code path simpler * remove conditionalizing TREAD/TREADDIR/TWRITE in the zero copy path * Fix the dotu p9_check_errors with zero copy. Add sufficient doc around * Add support for both in and output buffers in zero copy callback * pin and unpin pages in the same context * use helpers instead of defining page offset and rest of page ourself * Fix mem leak in p9_check_errors * Remove 'E' and 'F' in p9pdu_vwritef Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* | | Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2011-10-255-29/+31
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits) Check validity of cl_rpcclient in nfs_server_list_show NFS: Get rid of the nfs_rdata_mempool NFS: Don't rely on PageError in nfs_readpage_release_partial NFS: Get rid of unnecessary calls to ClearPageError() in read code NFS: Get rid of nfs_restart_rpc() NFS: Get rid of the unused nfs_write_data->flags field NFS: Get rid of the unused nfs_read_data->flags field NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup() NFS: Use the inode->i_version to cache NFSv4 change attribute information SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr SUNRPC: Fix rpc_sockaddr2uaddr nfs/super.c: local functions should be static pnfsblock: fix writeback deadlock pnfsblock: fix NULL pointer dereference pnfs: recoalesce when ld read pagelist fails pnfs: recoalesce when ld write pagelist fails pnfs: make _set_lo_fail generic pnfsblock: add missing rpc_put_mount and path_put SUNRPC/NFS: make rpc pipe upcall generic ...
| * | | NFS: Get rid of nfs_restart_rpc()Trond Myklebust2011-10-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | It can trivially be replaced with rpc_restart_call_prepare. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddrTrond Myklebust2011-10-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | It is only used internally by the RPC code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | SUNRPC: Fix rpc_sockaddr2uaddrTrond Myklebust2011-10-182-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where it is used in a non-blockable context in at least one case. Add non-blocking capability by adding a gfp_t argument Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | SUNRPC/NFS: make rpc pipe upcall genericPeng Tao2011-10-182-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The same function is used by idmap, gss and blocklayout code. Make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2011-10-254-26/+46
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits) nfs41: implement DESTROY_CLIENTID operation nfsd4: typo logical vs bitwise negate for want_mask nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED nfsd4: seq->status_flags may be used unitialized nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid nfsd4: implement new 4.1 open reclaim types nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround nfsd4: warn on open failure after create nfsd4: preallocate open stateid in process_open1() nfsd4: do idr preallocation with stateid allocation nfsd4: preallocate nfs4_file in process_open1() nfsd4: clean up open owners on OPEN failure nfsd4: simplify process_open1 logic nfsd4: make is_open_owner boolean nfsd4: centralize renew_client() calls nfsd4: typo logical vs bitwise negate nfs: fix bug about IPv6 address scope checking nfsd4: more robust ignoring of WANT bits in OPEN nfsd4: move name-length checks to xdr nfsd4: move access/deny validity checks to xdr code ...
| * | | | sunrpc: add MODULE_ALIAS to match the filesystem nameMichal Schmidt2011-10-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sunrpc implements the rpc_pipefs filesystem type. Add the alias to have the module requested automatically by the kernel when the filesystem is mounted. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | SUNRPC: Replace svc_addr_u by sockaddr_storageMi Jinlong2011-09-142-17/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For IPv6 local address, lockd can not callback to client for missing scope id when binding address at inet6_bind: 324 if (addr_type & IPV6_ADDR_LINKLOCAL) { 325 if (addr_len >= sizeof(struct sockaddr_in6) && 326 addr->sin6_scope_id) { 327 /* Override any existing binding, if another one 328 * is supplied by user. 329 */ 330 sk->sk_bound_dev_if = addr->sin6_scope_id; 331 } 332 333 /* Binding to link-local address requires an interface */ 334 if (!sk->sk_bound_dev_if) { 335 err = -EINVAL; 336 goto out_unlock; 337 } Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info besides address. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | sunrpc: use better NUMA affinitiesEric Dumazet2011-08-191-9/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use NUMA aware allocations to reduce latencies and increase throughput. sunrpc kthreads can use kthread_create_on_node() if pool_mode is "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can also take into account NUMA node affinity for memory allocations. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: "J. Bruce Fields" <bfields@fieldses.org> CC: Neil Brown <neilb@suse.de> CC: David Miller <davem@davemloft.net> Reviewed-by: Greg Banks <gnb@fastmail.fm> [bfields@redhat.com: fix up caller nfs41_callback_up] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | | Merge branch 'pm-for-linus' of ↵Linus Torvalds2011-10-253-3/+3
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (63 commits) PM / Clocks: Remove redundant NULL checks before kfree() PM / Documentation: Update docs about suspend and CPU hotplug ACPI / PM: Add Sony VGN-FW21E to nonvs blacklist. ARM: mach-shmobile: sh7372 A4R support (v4) ARM: mach-shmobile: sh7372 A3SP support (v4) PM / Sleep: Mark devices involved in wakeup signaling during suspend PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image PM / Hibernate: Do not initialize static and extern variables to 0 PM / Freezer: Make fake_signal_wake_up() wake TASK_KILLABLE tasks too PM / Hibernate: Add resumedelay kernel param in addition to resumewait MAINTAINERS: Update linux-pm list address PM / ACPI: Blacklist Vaio VGN-FW520F machine known to require acpi_sleep=nonvs PM / ACPI: Blacklist Sony Vaio known to require acpi_sleep=nonvs PM / Hibernate: Add resumewait param to support MMC-like devices as resume file PM / Hibernate: Fix typo in a kerneldoc comment PM / Hibernate: Freeze kernel threads after preallocating memory PM: Update the policy on default wakeup settings PM / VT: Cleanup #if defined uglyness and fix compile error PM / Suspend: Off by one in pm_suspend() PM / Hibernate: Include storage keys in hibernation image on s390 ...
| * \ \ \ \ Merge branch 'pm-qos' into pm-for-linusRafael J. Wysocki2011-10-073-3/+3
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-qos: PM / QoS: Update Documentation for the pm_qos and dev_pm_qos frameworks PM / QoS: Add function dev_pm_qos_read_value() (v3) PM QoS: Add global notification mechanism for device constraints PM QoS: Implement per-device PM QoS constraints PM QoS: Generalize and export constraints management code PM QoS: Reorganize data structs PM QoS: Code reorganization PM QoS: Minor clean-ups PM QoS: Move and rename the implementation files
| | * | | | | PM QoS: Move and rename the implementation filesJean Pihet2011-08-253-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PM QoS implementation files are better named kernel/power/qos.c and include/linux/pm_qos.h. The PM QoS support is compiled under the CONFIG_PM option. Signed-off-by: Jean Pihet <j-pihet@ti.com> Acked-by: markgross <markgross@thegnar.org> Reviewed-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2011-10-25285-5540/+14144
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits) dp83640: free packet queues on remove dp83640: use proper function to free transmit time stamping packets ipv6: Do not use routes from locally generated RAs |PATCH net-next] tg3: add tx_dropped counter be2net: don't create multiple RX/TX rings in multi channel mode be2net: don't create multiple TXQs in BE2 be2net: refactor VF setup/teardown code into be_vf_setup/clear() be2net: add vlan/rx-mode/flow-control config to be_setup() net_sched: cls_flow: use skb_header_pointer() ipv4: avoid useless call of the function check_peer_pmtu TCP: remove TCP_DEBUG net: Fix driver name for mdio-gpio.c ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces ipv4: fix ipsec forward performance regression jme: fix irq storm after suspend/resume route: fix ICMP redirect validation net: hold sock reference while processing tx timestamps tcp: md5: add more const attributes Add ethtool -g support to virtio_net ... Fix up conflicts in: - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead. - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
| * | | | | | | ipv6: Do not use routes from locally generated RAsAndreas Hofmeister2011-10-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When hybrid mode is enabled (accept_ra == 2), the kernel also sees RAs generated locally. This is useful since it allows the kernel to auto-configure its own interface addresses. However, if 'accept_ra_defrtr' and/or 'accept_ra_rtr_pref' are set and the locally generated RAs announce the default route and/or other route information, the kernel happily inserts bogus routes with its own address as gateway. With this patch, adding routes from an RA will be skiped when the RAs source address matches any local address, just as if 'accept_ra_defrtr' and 'accept_ra_rtr_pref' were set to 0. Signed-off-by: Andreas Hofmeister <andi@collax.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net_sched: cls_flow: use skb_header_pointer()Eric Dumazet2011-10-241-92/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dan Siemon would like to add tunnelling support to cls_flow This preliminary patch introduces use of skb_header_pointer() to help this task, while avoiding skb head reallocation because of deep packet inspection. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | ipv4: avoid useless call of the function check_peer_pmtuGao feng2011-10-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In func ipv4_dst_check,check_peer_pmtu should be called only when peer is updated. So,if the peer is not updated in ip_rt_frag_needed,we can not inc __rt_peer_genid. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | Merge branch 'master' of ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-10-2425-95/+268
| |\ \ \ \ \ \ \
| | * | | | | | | rtnetlink: Add missing manual netlink notification in dev_change_net_namespacesEric W. Biederman2011-10-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Renato Westphal noticed that since commit a2835763e130c343ace5320c20d33c281e7097b7 "rtnetlink: handle rtnl_link netlink notifications manually" was merged we no longer send a netlink message when a networking device is moved from one network namespace to another. Fix this by adding the missing manual notification in dev_change_net_namespaces. Since all network devices that are processed by dev_change_net_namspaces are in the initialized state the complicated tests that guard the manual rtmsg_ifinfo calls in rollback_registered and register_netdevice are unnecessary and we can just perform a plain notification. Cc: stable@kernel.org Tested-by: Renato Westphal <renatowestphal@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | ipv4: fix ipsec forward performance regressionYan, Zheng2011-10-241-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is bug in commit 5e2b61f(ipv4: Remove flowi from struct rtable). It makes xfrm4_fill_dst() modify wrong data structure. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Reported-by: Kim Phillips <kim.phillips@freescale.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | route: fix ICMP redirect validationFlavio Leitner2011-10-241-5/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit f39925dbde7788cfb96419c0f092b086aa325c0f (ipv4: Cache learned redirect information in inetpeer.) removed some ICMP packet validations which are required by RFC 1122, section 3.2.2.2: ... A Redirect message SHOULD be silently discarded if the new gateway address it specifies is not on the same connected (sub-) net through which the Redirect arrived [INTRO:2, Appendix A], or if the source of the Redirect is not the current first-hop gateway for the specified destination (see Section 3.3.1). Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | net: hold sock reference while processing tx timestampsRichard Cochran2011-10-241-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pair of functions, * skb_clone_tx_timestamp() * skb_complete_tx_timestamp() were designed to allow timestamping in PHY devices. The first function, called during the MAC driver's hard_xmit method, identifies PTP protocol packets, clones them, and gives them to the PHY device driver. The PHY driver may hold onto the packet and deliver it at a later time using the second function, which adds the packet to the socket's error queue. As pointed out by Johannes, nothing prevents the socket from disappearing while the cloned packet is sitting in the PHY driver awaiting a timestamp. This patch fixes the issue by taking a reference on the socket for each such packet. In addition, the comments regarding the usage of these function are expanded to highlight the rule that PHY drivers must use skb_complete_tx_timestamp() to release the packet, in order to release the socket reference, too. These functions first appeared in v2.6.36. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Cc: <stable@vger.kernel.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | Merge branch 'batman-adv/maint' of git://git.open-mesh.org/linux-mergeDavid S. Miller2011-10-201-1/+6
| | |\ \ \ \ \ \ \
| | | * | | | | | | batman-adv: correctly set the data field in the TT_REPONSE packetAntonio Quartulli2011-10-181-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the TT_RESPONSE packet, the number of carried entries is not correctly set. This leads to a wrong interpretation of the packet payload on the receiver side causing random entries to be added to the global translation table. Therefore the latter gets always corrupted, triggering a table recovery all the time. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | | * | | | | | | batman-adv: fix tt_local_reset_flags() functionAntonio Quartulli2011-10-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the counter of tt_local_entry structures (tt_local_num) is incremented each time the tt_local_reset_flags() is invoked causing the node to send wrong TT_REPONSE packets containing a copy of non-initialised memory thus corrupting other nodes global translation table and making higher level communication impossible. Reported-by: Junkeun Song <jun361@gmail.com> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Acked-by: Junkeun Song <jun361@gmail.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
| | * | | | | | | | ip_gre: dont increase dev->needed_headroom on a live deviceEric Dumazet2011-10-201-2/+0
| | | |_|_|_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems ip_gre is able to change dev->needed_headroom on the fly. Its is not legal unfortunately and triggers a BUG in raw_sendmsg() skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev) < another cpu change dev->needed_headromm (making it bigger) ... skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev)); We end with LL_RESERVED_SPACE() being bigger than LL_ALLOCATED_SPACE() -> we crash later because skb head is exhausted. Bug introduced in commit 243aad83 in 2.6.34 (ip_gre: include route header_len in max_headroom calculation) Reported-by: Elmar Vonlanthen <evonlanthen@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Timo Teräs <timo.teras@iki.fi> CC: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | TCP: remove TCP_DEBUGFlavio Leitner2011-10-242-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was enabled by default and the messages guarded by the define are useful. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAITEric Dumazet2011-10-243-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a long standing bug in linux tcp stack, about ACK messages sent on behalf of TIME_WAIT sockets. In the IP header of the ACK message, we choose to reflect TOS field of incoming message, and this might break some setups. Example of things that were broken : - Routing using TOS as a selector - Firewalls - Trafic classification / shaping We now remember in timewait structure the inet tos field and use it in ACK generation, and route lookup. Notes : - We still reflect incoming TOS in RST messages. - We could extend MuraliRaja Muniraju patch to report TOS value in netlink messages for TIME_WAIT sockets. - A patch is needed for IPv6 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: md5: add more const attributesEric Dumazet2011-10-242-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now tcp_md5_hash_header() has a const tcphdr argument, we can add more const attributes to callers. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: md5: dont write skb head in tcp_md5_hash_header()Eric Dumazet2011-10-241-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tcp_md5_hash_header() writes into skb header a temporary zero value, this might confuse other users of this area. Since tcphdr is small (20 bytes), copy it in a temporary variable and make the change in the copy. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: use INET_ECN_MASK instead of hardcoded 3Maciej Żenczykowski2011-10-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: add const qualifiers where possibleEric Dumazet2011-10-219-131/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding const qualifiers to pointers can ease code review, and spot some bugs. It might allow compiler to optimize code further. For example, is it legal to temporary write a null cksum into tcphdr in tcp_md5_hash_header() ? I am afraid a sniffer could catch the temporary null value... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | dev: use name hash for dev_seq_opsMihai Maruseac2011-10-211-15/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using the dev->next chain and trying to resync at each call to dev_seq_start, use the name hash, keeping the bucket and the offset in seq->private field. Tests revealed the following results for ifconfig > /dev/null * 1000 interfaces: * 0.114s without patch * 0.089s with patch * 3000 interfaces: * 0.489s without patch * 0.110s with patch * 5000 interfaces: * 1.363s without patch * 0.250s with patch * 128000 interfaces (other setup): * ~100s without patch * ~30s with patch Signed-off-by: Mihai Maruseac <mmaruseac@ixiacom.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: add opaque struct around skb frag pageIan Campbell2011-10-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've split this bit out of the skb frag destructor patch since it helps enforce the use of the fragment API. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: allow CAP_NET_RAW to set socket options IP{,V6}_TRANSPARENTMaciej Żenczykowski2011-10-202-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up till now the IP{,V6}_TRANSPARENT socket options (which actually set the same bit in the socket struct) have required CAP_NET_ADMIN privileges to set or clear the option. - we make clearing the bit not require any privileges. - we allow CAP_NET_ADMIN to set the bit (as before this change) - we allow CAP_NET_RAW to set this bit, because raw sockets already pretty much effectively allow you to emulate socket transparency. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: remove unused tcp_fin() parametersEric Dumazet2011-10-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tcp_fin() only needs socket pointer, we can remove skb and th params. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | pktgen: remove ndelay() callEric Dumazet2011-10-201-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Turull reported inaccuracies in pktgen when using low packet rates, because we call ndelay(val) with values bigger than 20000. Instead of calling ndelay() for delays < 100us, we can instead loop calling ktime_now() only. Reported-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: use TCP_DEFAULT_INIT_RCVWND in tcp_fixup_rcvbuf()Eric Dumazet2011-10-201-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 356f039822b (TCP: increase default initial receive window.), we allow sender to send 10 (TCP_DEFAULT_INIT_RCVWND) segments. Change tcp_fixup_rcvbuf() to reflect this change, even if no real change is expected, since sysctl_tcp_rmem[1] = 87380 and this value is bigger than tcp_fixup_rcvbuf() computed rcvmem (~23720) Note: Since commit 356f039822b limited default window to maximum of 10*1460 and 2*MSS, we use same heuristic in this patch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: do not take an additional reference in skb_frag_set_pageIan Campbell2011-10-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I audited all of the callers in the tree and only one of them (pktgen) expects it to do so. Taking this reference is pretty obviously confusing and error prone. In particular I looked at the following commits which switched callers of (__)skb_frag_set_page to the skb paged fragment api: 6a930b9f163d7e6d9ef692e05616c4ede65038ec cxgb3: convert to SKB paged frag API. 5dc3e196ea21e833128d51eb5b788a070fea1f28 myri10ge: convert to SKB paged frag API. 0e0634d20dd670a89af19af2a686a6cce943ac14 vmxnet3: convert to SKB paged frag API. 86ee8130a46769f73f8f423f99dbf782a09f9233 virtionet: convert to SKB paged frag API. 4a22c4c919c201c2a7f4ee09e672435a3072d875 sfc: convert to SKB paged frag API. 18324d690d6a5028e3c174fc1921447aedead2b8 cassini: convert to SKB paged frag API. b061b39e3ae18ad75466258cf2116e18fa5bbd80 benet: convert to SKB paged frag API. b7b6a688d217936459ff5cf1087b2361db952509 bnx2: convert to SKB paged frag API. 804cf14ea5ceca46554d5801e2817bba8116b7e5 net: xfrm: convert to SKB frag APIs ea2ab69379a941c6f8884e290fdd28c93936a778 net: convert core to skb paged frag APIs Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud