op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	net: In __netif_schedule() use WARN_ON instead of BUG_ON	Linus Torvalds	2008-07-21	1	-1/+2
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Improve simple_tx_hash().	David S. Miller	2008-07-21	1	-13/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based upon feedback from Eric Dumazet and Andi Kleen. Cure several deficiencies in simple_tx_hash() by using jhash + reciprocol multiply. 1) Eliminates expensive modulus operation. 2) Makes hash less attackable by using random seed. 3) Eliminates endianness hash distribution issues. Signed-off-by: David S. Miller <davem@davemloft.net>
*	pkt_sched: Remove unused variable skb in dev_deactivate_queue function.	Daniel Lezcano	2008-07-21	1	-3/+0
\| \| \| \| \| \| \|	Removed unused variable 'skb' in the dev_deactivate_queue function Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sunhme: Remove stop/wake TX queue calls in set-multicast-list handler.	David S. Miller	2008-07-21	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \|	Based upon a bug report by Alexander Beregalov and commentary from Ben Hutchings. These are totally unnecessary, in particular because this driver's ->hard_start_xmit() handler takes the same driver spinlock that the set-multicast-list handler uses. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ucc_geth: do not touch net queue in adjust_link phylib callback	Anton Vorontsov	2008-07-21	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the net queue has not been started, we'll get this nice oops and non-working ethernet: ------------[ cut here ]------------ Kernel BUG at c01f4648 [verbose debug info unavailable] Oops: Exception in kernel mode, sig: 5 [#1] MPC836x RDK Modules linked in: NIP: c01f4648 LR: c01c0a10 CTR: c01c08e4 REGS: cf839e40 TRAP: 0700 Not tainted (2.6.26-05254-gc7b9969) MSR: 00021032 <ME,IR,DR> CR: 22042044 XER: 00000000 TASK = cf828c30[4] 'events/0' THREAD: cf838000 GPR00: c01c0a10 cf839ef0 cf828c30 c035ceb0 cf8469a0 00000064 00000000 00000000 GPR08: c035ceb0 00000001 00000001 cf99c280 22044044 7ca81020 0fffc000 00000000 GPR16: 0fff2544 0fff63c0 00000000 0fff78e0 0ffa5580 00000004 00000000 00000000 GPR24: 02082000 cf9d0000 d1068000 00009032 cf846800 cf846b80 00000001 00000014 NIP [c01f4648] __netif_schedule+0x28/0x8c LR [c01c0a10] adjust_link+0x12c/0x1e4 Call Trace: [cf839ef0] [c0380f50] 0xc0380f50 (unreliable) [cf839f10] [c01c0a10] adjust_link+0x12c/0x1e4 [cf839f40] [c01c2628] phy_state_machine+0x2e0/0x448 [cf839f60] [c00425e8] run_workqueue+0xc8/0x168 [cf839f90] [c0042c6c] worker_thread+0x70/0xd0 [cf839fd0] [c0046954] kthread+0x48/0x84 [cf839ff0] [c0012488] kernel_thread+0x44/0x60 Instruction dump: 7c0803a6 4e800020 3d20c036 9421ffe0 7c0802a6 7c681b78 3929ceb0 7c694a78 7d290034 90010024 bfa10014 5529d97e <0f090000> 39600002 38030024 7d200028 ---[ end trace a57d367843bd2904 ]--- Since the driver is using phylib (which is doing netif_carrier_on/off()), we should simply remove netif_tx_schedule_all() from adjust_link(). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	gianfar: do not touch net queue in adjust_link phylib callback	Anton Vorontsov	2008-07-21	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the net queue has not been started, we'll get this nice oops and non-working ethernet: PHY: 0:01 - Link is Up - 1000/Full ------------[ cut here ]------------ kernel BUG at net/core/dev.c:1328! Oops: Exception in kernel mode, sig: 5 [#1] MPC837x RDB Modules linked in: NIP: c02544a0 LR: c01a17d0 CTR: c01a16ac REGS: cf837e40 TRAP: 0700 Not tainted (2.6.26-05253-g14b395e) MSR: 00021032 <ME,IR,DR> CR: 22042044 XER: 00000000 TASK = cf819400[5] 'events/0' THREAD: cf836000 GPR00: c01a17d0 cf837ef0 cf819400 c03d8d08 cf8469a0 00000064 00000000 00000000 GPR08: c03d8d08 00000001 00000001 cf899ba0 22044044 00000000 0fffd000 00000000 GPR16: 0fff3028 0fff6cf0 00000000 0fff8390 0ff494a0 00000004 00000000 00000000 GPR24: c0361a00 00001058 cf9f6600 00009032 cf846800 cf846b80 00000001 00000014 NIP [c02544a0] __netif_schedule+0x28/0x8c LR [c01a17d0] adjust_link+0x124/0x1cc Call Trace: [cf837ef0] [c03fb3a0] 0xc03fb3a0 (unreliable) [cf837f10] [c01a17d0] adjust_link+0x124/0x1cc [cf837f40] [c01a8e28] phy_state_machine+0x2e0/0x448 [cf837f60] [c0040254] run_workqueue+0xc8/0x168 [cf837f90] [c00408d8] worker_thread+0x70/0xd0 [cf837fd0] [c0044630] kthread+0x48/0x84 [cf837ff0] [c0012610] kernel_thread+0x44/0x60 Instruction dump: 7c0803a6 4e800020 3d20c03e 9421ffe0 7c0802a6 7c681b78 39298d08 7c694a78 7d290034 90010024 bfa10014 5529d97e <0f090000> 39600002 38030024 7d200028 ---[ end trace 13dfd73ee42d0c30 ]--- Since the driver is using phylib (which is doing netif_carrier_on/off()), we should simply remove netif_tx_schedule_all() from adjust_link(). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	atl1: Do not wake queue before queue has been started.	David S. Miller	2008-07-21	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based upon a bug report by Alexey Dobriyan, the patch is also tested by him and confirmed to fix the problem. Packet flow during link state events should not be done by waking and stopping the TX queue anyways, that is handled transparently by netif_carrier_{on,off}(). So, remove the netif_{wake,stop}_queue() calls in the link check code, and add the necessary netif_start_queue() call to atl1_up(). Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	2008-07-20	36	-966/+1042
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...
\| *	nfsd: nfs4xdr.c do-while is not a compound statement	Harvey Harrison	2008-07-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The WRITEMEM macro produces sparse warnings of the form: fs/nfsd/nfs4xdr.c:2668:2: warning: do-while statement is not a compound statement Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c	J. Bruce Fields	2008-07-18	1	-74/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks to problem report and original patch from Harvey Harrison. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benny Halevy <bhalevy@panasas.com>
\| *	lockd: Pass "struct sockaddr *" to new failover-by-IP function	Chuck Lever	2008-07-15	3	-15/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to allow for future support of IPv6. Also provide additional sanity checking in failover_unlock_ip() when constructing the server's IP address. As an added bonus, provide clean kerneldoc comments on related NLM interfaces which were recently added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	lockd: get host reference in nlmsvc_create_block() instead of callers	J. Bruce Fields	2008-07-15	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It may not be obvious (till you look at the definition of nlm_alloc_call()) that a function like nlmsvc_create_block() should consume a reference on success or failure, so I find it clearer if it takes the reference it needs itself. And both callers already do this immediately before the call anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	lockd: minor svclock.c style fixes	J. Bruce Fields	2008-07-15	1	-5/+5
\| \| \| \| \| \| \| \|	Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock	Jeff Layton	2008-07-15	4	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this function, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_lock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. Since nlmsvc_testlock() now just uses the caller's reference, we no longer need to get or release it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock	Jeff Layton	2008-07-15	4	-12/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this functions, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_testlock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. We take a reference to host in the place where nlmsvc_testlock() previous did a new lookup, so the reference counting is unchanged from before. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	lockd: nlm_release_host() checks for NULL, caller needn't	Jeff Layton	2008-07-15	2	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	No need to check for a NULL argument twice. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	file lock: reorder struct file_lock to save space on 64 bit builds	Richard Kennedy	2008-07-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduce sizeof struct file_lock by 8 on 64 bit builds allowing +1 objects per slab in the file_lock_cache Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	nfsd: take file and mnt write in nfs4_upgrade_open	Benny Halevy	2008-07-07	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	testing with newpynfs revealed this warning: Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write() Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------ Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e() Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc Jul 3 07:32:50 buml kernel: Call Trace: Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793 Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd] Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd] Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd] Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd] Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc] Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100 Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd] Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd] Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c Jul 3 07:32:50 buml kernel: Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]--- Bruce Fields suggested this (Thanks!): maybe we need to be doing a mnt_want_write on open_upgrade and mnt_put_write on downgrade? This patch adds a call to mnt_want_write and file_take_write (which is doing the actual work). The counter-calls mnt_drop_write a file_release_write are now being properly called by drop_file_write_access in the exact path printed by the warning above. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	nfsd: document open share bit tracking	J. Bruce Fields	2008-07-07	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not immediately obvious from the code why we're doing this. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Benny Halevy <bhalevy@panasas.com>
\| *	nfsd: tabulate nfs4 xdr encoding functions	Benny Halevy	2008-07-04	1	-114/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for minorversion 1 All encoders now return an nfserr status (typically their nfserr argument). Unsupported ops go through nfsd4_encode_operation too, so use nfsd4_encode_noop to encode nothing for their reply body. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| *	Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 ↵	J. Bruce Fields	2008-07-03	1742	-18919/+31340
\| \|\ \| \| \| \| \| \| \| \| \|	into for-2.6.27
\| \| *	svcrdma: Change WR context get/put to use the kmem cache	Tom Tucker	2008-07-02	2	-115/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the WR context pool to be shared across mount points. This reduces the RDMA transport memory footprint significantly since idle mounts don't consume WR context memory. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Create a kmem cache for the WR contexts	Tom Tucker	2008-07-02	1	-4/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a kmem cache to hold WR contexts. Next we will convert the WR context get and put services to use this kmem cache. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Add flush_scheduled_work to module exit function	Tom Tucker	2008-07-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make certain all transports pending free are flushed from the wq before unloading the module. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Limit ORD based on client's advertised IRD	Tom Tucker	2008-07-02	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When adapters have differing IRD limits, the RDMA transport will fail to connect properly. The RDMA transport should use the client's advertised inbound read limit when computing its outbound read limit. For iWARP transports, there is currently no standard for exchanging IRD/ORD during connection establishment so the 'responder_resources' field in the connect event is the local device's limit. The RDMA transport can be configured to use a smaller ORD by writing the desired number to the /proc/sys/sunrpc/svc_rdma/max_outbound_read_requests file. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Remove unused wait q from svcrdma_xprt structure	Tom Tucker	2008-07-02	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sc_read_wait queue head is no longer used. Remove it. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Remove unneeded spin locks from __svc_rdma_free	Tom Tucker	2008-07-02	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the time __svc_rdma_free is called, we are guaranteed that all references to this transport are gone. There is, therefore, no need to protect the resource lists with a spin lock. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Add dma map count and WARN_ON	Tom Tucker	2008-07-02	4	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a dma map count in order to verify that all DMA mapping resources have been freed when the transport is closed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Move the DMA unmap logic to the CQ handler	Tom Tucker	2008-07-02	1	-6/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Separate DMA unmap from context destruction and perform DMA unmapping in the SQ/RQ CQ reap functions. This is necessary to support software based RDMA implementations that actually copy the data in their ib_dma_unmap callback functions and architectures that don't have cache coherent I/O busses. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Use reply and chunk map for RDMA_READ processing	Tom Tucker	2008-07-02	2	-45/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modify the RDMA_READ processing to use the reply and chunk list mapping data types. Also add a special purpose 'hdr_count' field in in the context to hold the header page count instead of overloading the SGE length field and corrupting the DMA map length. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Use RPC reply map for RDMA_WRITE processing	Tom Tucker	2008-07-02	2	-88/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the new svc_rdma_req_map data type for mapping the client side memory to the server side memory. Move the DMA mapping to the context pointed to by each WR individually so that it is unmapped after the WR completes. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| \| *	svcrdma: Add a type for keeping NFS RPC mapping	Tom Tucker	2008-07-02	3	-0/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a new data structure to hold the remote client address space to local server address space mapping. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
\| * \|	nfsd: dprint operation names	Benny Halevy	2008-07-02	1	-2/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: nfs4 minorversion decoder vectors	Benny Halevy	2008-07-02	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Have separate vectors of operation decoders for each minorversion. Obsolete ops in newer minorversions have default implementation returning nfserr_opnotsupp. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: unsupported nfs4 ops should fail with nfserr_opnotsupp	Benny Halevy	2008-07-02	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nfserr_opnotsupp should be returned for unsupported nfs4 ops rather than nfserr_op_illegal. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: tabulate nfs4 xdr decoding functions	Benny Halevy	2008-07-02	1	-105/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for minorversion 1 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: return nfserr_minor_vers_mismatch when compound minorversion != 0	Benny Halevy	2008-07-02	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check minorversion once before decoding any operation and reject with nfserr_minor_vers_mismatch if != 0 (this still happens in nfsd4_proc_compound). In this case return a zero length resultdata array as required by RFC3530. minorversion 1 processing will have its own vector of decoders. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: clean up mnt_want_write calls	Miklos Szeredi	2008-07-01	1	-14/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiple mnt_want_write() calls in the switch statement looks really ugly. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: treat all shutdown signals as equivalent	Jeff Layton	2008-06-30	1	-25/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	knfsd currently uses 2 signal masks when processing requests. A "loose" mask (SHUTDOWN_SIGS) that it uses when receiving network requests, and then a more "strict" mask (ALLOWED_SIGS, which is just SIGKILL) that it allows when doing the actual operation on the local storage. This is apparently unnecessarily complicated. The underlying filesystem should be able to sanely handle a signal in the middle of an operation. This patch removes the signal mask handling from knfsd altogether. When knfsd is started as a kthread, all signals are ignored. It then allows all of the signals in SHUTDOWN_SIGS. There's no need to set the mask as well. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfs: rewrap NFS/RDMA documentation to 80 lines	J. Bruce Fields	2008-06-30	1	-19/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wrap long lines. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	update NFS/RDMA documentation	James Lentini	2008-06-30	1	-31/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the NFS/RDMA documentation to clarify how to run mount.nfs. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: fix spurious EACCESS in reconnect_path()	Neil Brown	2008-06-30	1	-3/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks to Frank Van Maarseveen for the original problem report: "A privileged process on an NFS client which drops privileges after using them to change the current working directory, will experience incorrect EACCES after an NFS server reboot. This problem can also occur after memory pressure on the server, particularly when the client side is quiet for some time." This occurs because the filehandle points to a directory whose parents are no longer in the dentry cache, and we're attempting to reconnect the directory to its parents without adequate permissions to perform lookups in the parent directories. We can therefore fix the problem by acquiring the necessary capabilities before attempting the reconnection. We do this only in the no_subtree_check case, since the documented behavior of the subtree_check export option requires the server to check that the user has lookup permissions on all parents. The subtree_check case still has a problem, since reconnect_path() unnecessarily requires both read and lookup permissions on all parent directories. However, a fix in that case would be more delicate, and use of subtree_check is already discouraged for other reasons. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	gss_krb5: Use random value to initialize confounder	Kevin Coffman	2008-06-23	1	-4/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initialize the value used for the confounder to a random value rather than starting from zero. Allow for confounders of length 8 or 16 (which will be needed for AES). Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	gss_krb5: move gss_krb5_crypto into the krb5 module	Kevin Coffman	2008-06-23	2	-12/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The gss_krb5_crypto.o object belongs in the rpcsec_gss_krb5 module. Also, there is no need to export symbols from gss_krb5_crypto.c Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	gss_krb5: create a define for token header size and clean up ptr location	Kevin Coffman	2008-06-23	4	-46/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cleanup: Document token header size with a #define instead of open-coding it. Don't needlessly increment "ptr" past the beginning of the header which makes the values passed to functions more understandable and eliminates the need for extra "krb5_hdr" pointer. Clean up some intersecting white-space issues flagged by checkpatch.pl. This leaves the checksum length hard-coded at 8 for DES. A later patch cleans that up. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: rename MAY_ flags	Miklos Szeredi	2008-06-23	10	-97/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it clear, that these are not used outside nfsd, and to avoid name and number space conflicts with the VFS. [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	knfsd: nfsd: Handle ERESTARTSYS from syscalls.	NeilBrown	2008-06-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OCFS2 can return -ERESTARTSYS from write requests (and possibly elsewhere) if there is a signal pending. If nfsd is shutdown (by sending a signal to each thread) while there is still an IO load from the client, each thread could handle one last request with a signal pending. This can result in -ERESTARTSYS which is not understood by nfserrno() and so is reflected back to the client as nfserr_io aka -EIO. This is wrong. Instead, interpret ERESTARTSYS to mean "try again later" by returning nfserr_jukebox. The client will resend and - if the server is restarted - the write will (hopefully) be successful and everyone will be happy. The symptom that I narrowed down to this was: copy a large file via NFS to an OCFS2 filesystem, and restart the nfs server during the copy. The 'cp' might get an -EIO, and the file will be corrupted - presumably holes in the middle where writes appeared to fail. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	nfsd: fix race in nfsd_nrthreads()	Neil Brown	2008-06-23	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need the nfsd_mutex before accessing nfsd_serv->sv_nrthreads or we can't even guarantee nfsd_serv will still be there. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	lockd: close potential race with rapid lockd_up/lockd_down cycle	Jeff Layton	2008-06-23	1	-20/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If lockd_down is called very rapidly after lockd_up returns, then there is a slim chance that lockd() will never be called. kthread() will return before calling the function, so we'll end up never actually calling the cleanup functions for the thread. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
\| * \|	sunrpc: remove sv_kill_signal field from svc_serv struct	Jeff Layton	2008-06-23	3	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we no longer make any distinction between shutdown signals with nfsd, then it becomes easier to just standardize on a particular signal to use to bring it down (SIGINT, in this case). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>