summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* nfsd41: sunrpc: Added rpc server-side backchannel handlingRahul Iyer2009-09-113-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the call direction is a reply, copy the xid and call direction into the req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns rpc_garbage. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [get rid of CONFIG_NFSD_V4_1] [sunrpc: refactoring of svc_tcp_recvfrom] [nfsd41: sunrpc: create common send routine for the fore and the back channels] [nfsd41: sunrpc: Use free_page() to free server backchannel pages] [nfsd41: sunrpc: Document server backchannel locking] [nfsd41: sunrpc: remove bc_connect_worker()] [nfsd41: sunrpc: Define xprt_server_backchannel()[ [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions] [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()] [nfsd41: sunrpc: Don't auto close the server backchannel connection] [nfsd41: sunrpc: Remove unused functions] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: change bc_sock to bc_xprt] [nfsd41: sunrpc: move struct rpc_buffer def into a common header file] [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex] [removed cosmetic changes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: add new xprt class for nfsv4.1 backchannel] [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [reverted more cosmetic leftovers] [got rid of xprt_server_backchannel] [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@netapp.com> [sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <trond.myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: replace page based DRC with buffer based DRCAndy Adamson2009-09-012-20/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd pages in cache. Connectathon testing has shown that 1024 bytes for encoded compound operation responses past the sequence operation is sufficient, 512 bytes is a little too small. Set NFSD_SLOT_CACHE_SIZE to 1024. Allocate memory for the session DRC in the CREATE_SESSION operation to guarantee that the memory resource is available for caching responses. Allocate each slot individually in preparation for slot table size negotiation. Remove struct nfsd4_cache_entry and helper functions for the old page-based DRC. The iov_len calculation in nfs4svc_encode_compoundres is now always correct. Replay is now done in nfsd4_sequence under the state lock, so the session ref count is only bumped on non-replay. Clean up the nfs4svc_encode_compoundres session logic. The nfsd4_compound_state statp pointer is also not used. Remove nfsd4_set_statp(). Move useful nfsd4_cache_entry fields into nfsd4_slot. Signed-off-by: Andy Adamson <andros@netapp.com Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: bound forechannel drc size by memory usageAndy Adamson2009-09-011-2/+6
| | | | | | | | | | | | By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound a forechannel drc request size, clients can tailor a session to usage. For example, an I/O session (READ/WRITE only) can have a much smaller ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger ca_maxresponses to service a large in-flight data window. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: expand solo sequence checkAndy Adamson2009-08-281-1/+1
| | | | | | | | | | | | | | | Compounds consisting of only a sequence operation don't need any additional caching beyond the sequence information we store in the slot entry. Fix nfsd4_is_solo_sequence to identify this case correctly. The additional check for a failed sequence in nfsd4_store_cache_entry() is redundant, since the nfsd4_is_solo_sequence call lower down catches this case. The final ce_cachethis set in nfsd4_sequence is also redundant. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: populate sin6_scope_id on callback address with scopeid from rq_addr ↵Jeff Layton2009-08-211-0/+15
| | | | | | | | | | | | | | | | | on SETCLIENTID call When a SETCLIENTID call comes in, one of the args given is the svc_rqst. This struct contains an rq_addr field which holds the address that sent the call. If this is an IPv6 address, then we can use the sin6_scope_id field in this address to populate the sin6_scope_id field in the callback address. AFAICT, the rq_addr.sin6_scope_id is non-zero if and only if the client mounted the server's link-local address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: convert nfs4_cb_conn struct to hold address in sockaddr_storageJeff Layton2009-08-211-2/+2
| | | | | | | | | | | ...rather than as a separate address and port fields. This will be necessary for implementing callbacks over IPv6. Also, convert gen_callback to use the standard rpcuaddr2sockaddr routine rather than its own private one. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: make nfs4_client->cl_addr a struct sockaddr_storageJeff Layton2009-08-211-1/+1
| | | | | | | | It's currently a __be32, which isn't big enough to hold an IPv6 address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* sunrpc: add common routine for copying address portion of a sockaddrJeff Layton2009-08-211-0/+50
| | | | | | Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* sunrpc: add routine for comparing addressesJeff Layton2009-08-212-43/+48
| | | | | | | | | | | lockd needs these sort of routines, as does the NFSv4 callback code. Move lockd's routines into common code and rename them so that they can be used by others. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Merge branch 'nfs-for-2.6.32' of ↵J. Bruce Fields2009-08-21119-593/+1248
|\ | | | | | | | | | | | | git://git.linux-nfs.org/projects/trondmy/nfs-2.6 into for-2.6.32-incoming Conflicts: net/sunrpc/cache.c
| * nfs: fix compile error in rpc_pipefs.hJ. Bruce Fields2009-08-201-0/+2
| | | | | | | | | | | | | | This include is needed for the definition of delayed_work. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * Merge branch 'nfsv4_xdr_cleanups-for-2.6.32' into nfs-for-2.6.32Trond Myklebust2009-08-197-48/+224
| |\ | | | | | | | | | | | | Conflicts: fs/nfs/nfs4xdr.c
| | * sunrpc: ntoh -> be*_to_cpuBenny Halevy2009-08-141-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | ntohl is already defined as be32_to_cpu. be64_to_cpu has architecture specific optimized implementations. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * sunrpc: hton -> cpu_to_be*Benny Halevy2009-08-141-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | htonl is already defined as cpu_to_be32. cpu_to_be64 has architecture specific optimized implementations. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds2009-08-131-11/+38
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Report the cloning task as parent on perf_counter_fork() perf_counter: Fix an ipi-deadlock perf: Rework/fix the whole read vs group stuff perf_counter: Fix swcounter context invariance perf report: Don't show unresolved DSOs and symbols when -S/-d is used perf tools: Add a general option to enable raw sample records perf tools: Add a per tracepoint counter attribute to get raw sample perf_counter: Provide hw_perf_counter_setup_online() APIs perf list: Fix large list output by using the pager perf_counter, x86: Fix/improve apic fallback perf record: Add missing -C option support for specifying profile cpu perf tools: Fix dso__new handle() to handle deleted DSOs perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available perf report: Show the tid too in -D perf record: Fix .tid and .pid fill-in when synthesizing events perf_counter, x86: Fix generic cache events on P6-mobile CPUs perf_counter, x86: Fix lapic printk message
| | | * perf: Rework/fix the whole read vs group stuffPeter Zijlstra2009-08-131-11/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace PERF_SAMPLE_GROUP with PERF_SAMPLE_READ and introduce PERF_FORMAT_GROUP to deal with group reads in a more generic way. This allows you to get group reads out of read() as well. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey J Ashford <cjashfor@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> LKML-Reference: <20090813103655.117411814@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * perf_counter: Provide hw_perf_counter_setup_online() APIsIngo Molnar2009-08-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide weak aliases for hw_perf_counter_setup_online(). This is used by the BTS patches (for v2.6.32), but it interacts with fixes so propagate this upstream. (it has no effect as of yet) Also export perf_counter_output() to architecture code. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2009-08-131-1/+8
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: Fix handling of bad requeue syscall pairing futex: Fix compat_futex to be same as futex for REQUEUE_PI locking, sched: Give waitqueue spinlocks their own lockdep classes futex: Update futex_q lock_ptr on requeue proxy lock
| | | * | locking, sched: Give waitqueue spinlocks their own lockdep classesPeter Zijlstra2009-08-101-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Give waitqueue spinlocks their own lockdep classes when they are initialised from init_waitqueue_head(). This means that struct wait_queue::func functions can operate other waitqueues. This is used by CacheFiles to catch the page from a backing fs being unlocked and to wake up another thread to take a copy of it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Takashi Iwai <tiwai@suse.de> Cc: linux-cachefs@redhat.com Cc: torvalds@osdl.org Cc: akpm@linux-foundation.org LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | NFS: Fix an O_DIRECT Oops...Trond Myklebust2009-08-121-3/+2
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | * | Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds2009-08-102-7/+20
| | |\ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits) perf_counter: Zero dead bytes from ftrace raw samples size alignment perf_counter: Subtract the buffer size field from the event record size perf_counter: Require CAP_SYS_ADMIN for raw tracepoint data perf_counter: Correct PERF_SAMPLE_RAW output perf tools: callchain: Fix bad rounding of minimum rate perf_counter tools: Fix libbfd detection for systems with libz dependency perf: "Longum est iter per praecepta, breve et efficax per exempla" perf_counter: Fix a race on perf_counter_ctx perf_counter: Fix tracepoint sampling to be part of generic sampling perf_counter: Work around gcc warning by initializing tracepoint record unconditionally perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode perf tools: callchain: Fix 'perf report' display to be callchain by default perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty callchains perf record: Fix the -A UI for empty or non-existent perf.data perf util: Fix do_read() to fail on EOF instead of busy-looping perf list: Fix the output to not include tracepoints without an id perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support perf stat: Fix tool option consistency: rename -S/--scale to -c/--scale perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc) perf report: Fix per task mult-counter stat reporting ...
| | | * perf_counter: Zero dead bytes from ftrace raw samples size alignmentFrederic Weisbecker2009-08-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After aligning the ftrace raw samples, there are dead bytes storing random data from the stack. We don't want to leak these to userspace, then zero these out. Before: 0x2de88 [0x50]: event: 9 . . ... raw event: size 80 bytes . 0000: 09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff ......P........ . 0010: 68 01 00 00 68 01 00 00 2c 00 00 00 00 00 00 00 h...h...,...... . 0020: 2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00 ,...+...h...h.. . 0030: 6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00 kondemand/0.... . 0040: 68 01 00 00 40 7f 46 81 ff ff ff ff 00 10 1b 7f h...@.F........ ^ ^ ^ ^ Leak After: 0x2d318 [0x50]: event: 9 . . ... raw event: size 80 bytes . 0000: 09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff ......P........ . 0010: 68 01 00 00 68 01 00 00 68 14 00 00 00 00 00 00 h...h...h...... . 0020: 2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00 ,...+...h...h.. . 0030: 6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00 kondemand/0.... . 0040: 68 01 00 00 a0 80 46 81 ff ff ff ff 00 00 00 00 h.....F........ ^ ^ ^ ^ Fixed Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249915116-5210-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de>
| | | * perf_counter: Subtract the buffer size field from the event record sizeFrederic Weisbecker2009-08-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We compute the perf raw sample size by aligning the raw ftrace event size plus the buffer size field itself. We do that instead of aligning only the perf raw sample size, so that we might economize some in some cases. But this buffer size field is not stored in the perf raw sample, we must then substract its size from the buffer once we computed the alignment unless we may get a useless u32 field in the buffer. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090810141129.GA5124@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * perf_counter: Correct PERF_SAMPLE_RAW outputPeter Zijlstra2009-08-102-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PERF_SAMPLE_* output switches should unconditionally output the correct format, as they are the only way to unambiguously parse the PERF_EVENT_SAMPLE data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249896447.17467.74.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * perf_counter: Fix tracepoint sampling to be part of generic samplingFrederic Weisbecker2009-08-091-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on Peter's comments, make tracepoint sampling generic just like all the other sampling bits are. This is a rename with no code changes: - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW - struct perf_tracepoint_record to perf_raw_record We want the system in place that transport tracepoints raw samples events into the perf ring buffer to be generalized and usable by any type of counter. Reported-by; Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2009-08-091-0/+1
| | |\ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Avoid redelivery of edge interrupt before next edge KVM: MMU: limit rmap chain length KVM: ia64: fix build failures due to ia64/unsigned long mismatches KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc KVM: fix ack not being delivered when msi present KVM: s390: fix wait_queue handling KVM: VMX: Fix locking imbalance on emulation failure KVM: VMX: Fix locking order in handle_invalid_guest_state KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages KVM: SVM: force new asid on vcpu migration KVM: x86: verify MTRR/PAT validity KVM: PIT: fix kpit_elapsed division by zero KVM: Fix KVM_GET_MSR_INDEX_LIST
| | | * KVM: fix ack not being delivered when msi presentMichael S. Tsirkin2009-08-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_notify_acked_irq does not check irq type, so that it sometimes interprets msi vector as irq. As a result, ack notifiers are not called, which typially hangs the guest. The fix is to track and check irq type. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| | * | perf_counter: Fix/complete ftrace event records samplingFrederic Weisbecker2009-08-093-36/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the kernel side support for ftrace event record sampling. A new counter sampling attribute is added: PERF_SAMPLE_TP_RECORD which requests ftrace events record sampling. In this case if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint fires, we emit the tracepoint binary record to the perfcounter event buffer, as a sample. Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf record: perf record -f -F 1 -a -e workqueue:workqueue_execution perf report -D 0x21e18 [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........ . 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!...... . 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve . 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1........... . 0040: e0 b1 31 81 ff ff ff ff ....... . 0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33 The raw ftrace binary record starts at offset 0020. Translation: struct trace_entry { type = 0x2b = 43; flags = 1; preempt_count = 2; pid = 0xa = 10; tgid = 0xa = 10; } thread_comm = "events/1" thread_pid = 0xa = 10; func = 0xffffffff8131b1e0 = flush_to_ldisc() What will come next? - Userspace support ('perf trace'), 'flight data recorder' mode for perf trace, etc. - The unconditional copy from the profiling callback brings some costs however if someone wants no such sampling to occur, and needs to be fixed in the future. For that we need to have an instant access to the perf counter attribute. This is a matter of a flag to add in the struct ftrace_event. - Take care of the events recursivity! Don't ever try to record a lock event for example, it seems some locking is used in the profiling fast path and lead to a tracing recursivity. That will be fixed using raw spinlock or recursivity protection. - [...] - Profit! :-) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | perf_counter, ftrace: Fix perf_counter integrationPeter Zijlstra2009-08-091-25/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds possible second part to the assign argument of TP_EVENT(). TP_perf_assign( __perf_count(foo); __perf_addr(bar); ) Which, when specified make the swcounter increment with @foo instead of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address associated with the event) when this triggers a counter overflow. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32Trond Myklebust2009-08-103-13/+50
| |\ \ \
| | * | | SUNRPC: Add an rpc_pipefs front end for the sunrpc cache codeTrond Myklebust2009-08-092-0/+21
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Move procfs-specific stuff out of the generic sunrpc cache codeTrond Myklebust2009-08-091-2/+9
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Allow the cache_detail to specify alternative upcall mechanismsTrond Myklebust2009-08-091-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For events that are rare, such as referral DNS lookups, it makes limited sense to have a daemon constantly listening for upcalls on a channel. An alternative in those cases might simply be to run the app that fills the cache using call_usermodehelper_exec() and friends. The following patch allows the cache_detail to specify alternative upcall mechanisms for these particular cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | NFSD: Clean up the idmapper warning...Trond Myklebust2009-08-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What part of 'internal use' is so hard to understand? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: clean up rpc_setup_pipedir()Trond Myklebust2009-08-092-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is still a little wart or two there: Since we've already got a vfsmount, we might as well pass that in to rpc_create_client_dir. Another point is that if we open code __rpc_lookup_path() here, then we can avoid looking up the entire parent directory path over and over again: it doesn't change. Also get rid of rpc_clnt->cl_pathname, since it has no users... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Replace rpc_client->cl_dentry and cl_mnt, with a cl_pathTrond Myklebust2009-08-091-2/+2
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Rename rpc_mkdir to rpc_create_client_dir()Trond Myklebust2009-08-091-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reflects the fact that rpc_mkdir() as it stands today, can only create a RPC client type directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Constify rpc_pipe_ops...Trond Myklebust2009-08-091-2/+3
| | |/ / | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32Trond Myklebust2009-08-103-3/+54
| |\ \ \
| | * | | SUNRPC: Kill RPC_DISPLAY_ALLChuck Lever2009-08-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At some point, I recall that rpc_pipe_fs used RPC_DISPLAY_ALL. Currently there are no uses of RPC_DISPLAY_ALL outside the transport modules themselves, so we can safely get rid of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Remove duplicate universal address generationChuck Lever2009-08-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RPC universal address generation is currently done in several places: rpcb_clnt.c, nfs4proc.c xprtsock.c, and xprtrdma.c. Remove the redundant cases that convert a socket address to a universal address. The nfs4proc.c case takes a pre-formatted presentation address string, not a socket address, so we'll leave that one. Because the new uaddr constructor uses the recently introduced rpc_ntop(), it now supports proper "::" shorthanding for IPv6 addresses. This allows the kernel to register properly formed universal addresses with the local rpcbind service, in _all_ cases. The kernel can now also send properly formed universal addresses in RPCB_GETADDR requests, and support link-local properly when encoding and decoding IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Provide functions for managing universal addressesChuck Lever2009-08-091-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | SUNRPC: Clean up RPCBIND_MAXUADDRLEN definitionsChuck Lever2009-08-091-1/+16
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up: Replace the single-integer definition of RPCBIND_MAXUADDRLEN with a definition that is based on previously defined address string sizes, and document the way this maximum is calculated. Also provide a separate macro for the size of the port number extension. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4: Add 'server capability' flags for NFSv4 recommended attributesTrond Myklebust2009-08-091-0/+9
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code needs to know that, so that it don't keep trying to poll for it. However, by the same count, if the NFSv4 server does support that attribute, then we should ensure that the inode metadata is appropriately labelled as being untrusted. For instance, if we don't know the correct value of the file's uid, we should certainly not be caching ACLs or ACCESS results. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/hch/xfs-icache-racesLinus Torvalds2009-08-071-1/+2
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/hch/xfs-icache-races: xfs: fix freeing of inodes not yet added to the inode cache vfs: add __destroy_inode vfs: fix inode_init_always calling convention
| | * | vfs: add __destroy_inodeChristoph Hellwig2009-08-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we want to tear down an inode that lost the add to the cache race in XFS we must not call into ->destroy_inode because that would delete the inode that won the race from the inode cache radix tree. This patch provides the __destroy_inode helper needed to fix this, the actual fix will be in th next patch. As XFS was the only reason destroy_inode was exported we shift the export to the new __destroy_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
| | * | vfs: fix inode_init_always calling conventionChristoph Hellwig2009-08-071-1/+1
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently inode_init_always calls into ->destroy_inode if the additional initialization fails. That's not only counter-intuitive because inode_init_always did not allocate the inode structure, but in case of XFS it's actively harmful as ->destroy_inode might delete the inode from a radix-tree that has never been added. This in turn might end up deleting the inode for the same inum that has been instanciated by another process and cause lots of cause subtile problems. Also in the case of re-initializing a reclaimable inode in XFS it would free an inode we still want to keep alive. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
| * | Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds2009-08-071-5/+3
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Fix double list iteration in per task precise stats perf: Auto-detect libelf perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it ftrace: Fix perf-tracepoint OOPS perf report: Add missing command line options to man page perf: Auto-detect libbfd perf report: Make --sort comm,dso,symbol the default
| | * | ftrace: Fix perf-tracepoint OOPSPeter Zijlstra2009-08-061-5/+3
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not all tracepoints are created equal, in specific the ftrace tracepoints are created with TRACE_EVENT_FORMAT() which does not generate the needed bits to tie them into perf counters. For those events, don't create the 'id' file and fail ->profile_enable when their ID is specified through other means. Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1249497664.5890.4.camel@laptop> [ v2: fix build error in the !CONFIG_EVENT_PROFILE case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge git://git.infradead.org/mtd-2.6Linus Torvalds2009-08-072-1/+3
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.infradead.org/mtd-2.6: jffs2: Fix return value from jffs2_do_readpage_nolock() mtd: mtdblock: introduce mtdblks_lock mtd: remove 'SBC8240 Wind River' Device Driver Code mtd: OneNAND: OMAP2/3: free GPMC CS on module removal mtd: OneNAND: fix incorrect bufferram offset mtd: blkdevs: do not forget to get MTD devices mtd: fix the conversion from dev to mtd_info mtd: let include/linux/mtd/partitions.h stand on its own
OpenPOWER on IntegriCloud