summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* lguest: Add support for kvm_hypercall4()Matias Zabaljauregui2009-06-122-10/+25
| | | | | | | Add support for kvm_hypercall4(); PAE wants it. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGDMatias Zabaljauregui2009-06-125-6/+6
| | | | | | | | | replace LHCALL_SET_PMD with LHCALL_SET_PGD hypercall name (That's really what it is, and the confusion gets worse with PAE support) Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
* lguest: use native_set_* macros, which properly handle 64-bit entries when ↵Matias Zabaljauregui2009-06-122-21/+22
| | | | | | | | | PAE is activated Some cleanups and replace direct assignment with native_set_* macros which properly handle 64-bit entries when PAE is activated Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: map switcher with executable page table entriesMatias Zabaljauregui2009-06-122-2/+2
| | | | | | | | Map switcher with executable page table entries. (This bug didn't matter before PAE and hence NX support -- RR) Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: fix writev returning short on console outputRusty Russell2009-06-121-1/+6
| | | | | | | I've never seen it here, but I can't find anywhere that says writev will write everything. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: clean up length-used value in example launcherRusty Russell2009-06-121-7/+4
| | | | | | | | | | | The "len" field in the used ring for virtio indicates the number of bytes *written* to the buffer. This means the guest doesn't have to zero the buffers in advance as it always knows the used length. Erroneously, the console and network example code puts the length *read* into that field. The guest ignores it, but it's wrong. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Segment selectors are 16-bit long. Fix lg_cpu.ss1 definition.Matias Zabaljauregui2009-06-121-1/+1
| | | | | | | If GDT_ENTRIES were every > 256, this could become a problem. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: beyond ARRAY_SIZE of cpu->arch.gdtRoel Kluin2009-06-121-1/+1
| | | | | | | Do not go beyond ARRAY_SIZE of cpu->arch.gdt Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: clean up example launcher compile flags.Rusty Russell2009-06-121-2/+1
| | | | | | | | | | | | 18 months ago 5bbf89fc260830f3f58b331d946a16b39ad1ca2d changed to loading bzImages directly, and no longer manually ungzipping them, so we no longer need libz. Also, -m32 is useful for those on 64-bit platforms (and harmless on 32-bit). Reported-by: Ron Minnich <rminnich@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: optimize by coding restore_flags and irq_enable in assembler.Rusty Russell2009-06-123-30/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The downside of the last patch which made restore_flags and irq_enable check interrupts is that they are now too big to be patched directly into the callsites, so the C versions are always used. But the C versions go via PV_CALLEE_SAVE_REGS_THUNK which saves all the registers. In fact, we don't need any registers in the fast path, so we can do better than this if we actually code them in assembler. The results are in the noise, but since it's about the same amount of code, it's worth applying. 1GB Guest->Host: input(suppressed),output(suppressed) Before: Seconds: 0:16.53 Packets: 377268,753673 Interrupts: 22461,24297 Notifications: 1(5245),21303(732370) Net IRQs triggered: 377023(245),42578(711095) After: Seconds: 0:16.48 Packets: 377289,753673 Interrupts: 22281,24465 Notifications: 1(5245),21296(732377) Net IRQs triggered: 377060(229),42564(711109) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: improve interrupt handling, speed up stream networkingRusty Russell2009-06-128-16/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | lguest never checked for pending interrupts when enabling interrupts, and things still worked. However, it makes a significant difference to TCP performance, so it's time we fixed it by introducing a pending_irq flag and checking it on irq_restore and irq_enable. These two routines are now too big to patch into the 8/10 bytes patch space, so we drop that code. Note: The high latency on interrupt delivery had a very curious effect: once everything else was optimized, networking without GSO was faster than networking with GSO, since more interrupts were sent and hence a greater chance of one getting through to the Guest! Note2: (Almost) Closing the same loophole for iret doesn't have any measurable effect, so I'm leaving that patch for the moment. Before: 1GB tcpblast Guest->Host: 30.7 seconds 1GB tcpblast Guest->Host (no GSO): 76.0 seconds After: 1GB tcpblast Guest->Host: 6.8 seconds 1GB tcpblast Guest->Host (no GSO): 27.8 seconds Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: fix race in halt codeRusty Russell2009-06-123-12/+31
| | | | | | | | | | | | | When the Guest does the LHCALL_HALT hypercall, we go to sleep, expecting that a timer or the Waker will wake_up_process() us. But we do it in a stupid way, leaving a classic missing wakeup race. So split maybe_do_interrupt() into interrupt_pending() and try_deliver_interrupt(), and check maybe_do_interrupt() and the "break_out" flag before calling schedule. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: remove invalid interrupt forcing logic.Rusty Russell2009-06-121-7/+1
| | | | | | | | | | | | | | | | | | 20887611523e749d99cc7d64ff6c97d27529fbae (lguest: notify on empty) introduced lguest support for the VIRTIO_F_NOTIFY_ON_EMPTY flag, but in fact it turned on interrupts all the time. Because we always process one buffer at a time, the inflight count is always 0 when call trigger_irq and so we always ignore VRING_AVAIL_F_NO_INTERRUPT from the Guest. It should be looking to see if there are more buffers in the Guest's queue: if it's empty, then we force an interrupt. This makes little difference, since we usually have an empty queue; but that's the subject of another patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: fix lguest wake on guest clock tick, or fd activityRusty Russell2009-06-122-5/+5
| | | | | | | | | The Launcher could be inside the Guest on another CPU; wake_up_process will do nothing because it is "running". kick_process will knock it back into our kernel in this case, otherwise we'll miss it until the next guest exit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* sched: export kick_processRusty Russell2009-06-121-0/+1
| | | | | | | | | | lguest needs kick_process: wake_up_process() does nothing if a process is running, which isn't sufficient (we need it in the kernel). And lguest support is usually modular. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>
* lguest: get more serious about wmb() in example Launcher codeRusty Russell2009-06-121-3/+4
| | | | | | | | | | Since the Launcher process runs the Guest, it doesn't have to be very serious about its barriers: the Guest isn't running while we are (Guest is UP). Before we change to use threads to service devices, we need to fix this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: clean up lguest_init_IRQRusty Russell2009-06-121-5/+4
| | | | | | | Copy from arch/x86/kernel/irqinit_32.c: we don't use the vectors beyond LGUEST_IRQS (if any), but we might as well set them all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: cleanup passing of /dev/lguest fd around example launcher.Rusty Russell2009-06-121-55/+47
| | | | | | | We hand the /dev/lguest fd everywhere; it's far neater to just make it a global (it already is, in fact, hidden in the waker_fds struct). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: be paranoid about guest playing with device descriptors.Rusty Russell2009-06-121-10/+17
| | | | | | | | | | | We can't trust the values in the device descriptor table once the guest has booted, so keep local copies. They could set them to strange values then cause us to segv (they're 8 bit values, so they can't make our pointers go too wild). This becomes more important with the following patches which read them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* block: fix kernel-doc in recent block/ changesRandy Dunlap2009-06-112-13/+14
| | | | | | | Fix kernel-doc warnings in recently changed block/ source code. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-06-11147-1834/+1707
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (87 commits) nilfs2: get rid of bd_mount_sem use from nilfs nilfs2: correct exclusion control in nilfs_remount function nilfs2: simplify remaining sget() use nilfs2: get rid of sget use for checking if current mount is present nilfs2: get rid of sget use for acquiring nilfs object nilfs2: remove meaningless EBUSY case from nilfs_get_sb function remove the call to ->write_super in __sync_filesystem nilfs2: call nilfs2_write_super from nilfs2_sync_fs jffs2: call jffs2_write_super from jffs2_sync_fs ufs: add ->sync_fs sysv: add ->sync_fs hfsplus: add ->sync_fs hfs: add ->sync_fs fat: add ->sync_fs ext2: add ->sync_fs exofs: add ->sync_fs bfs: add ->sync_fs affs: add ->sync_fs sanitize ->fsync() for affs repair bfs_write_inode(), switch bfs to simple_fsync() ...
| * nilfs2: get rid of bd_mount_sem use from nilfsRyusuke Konishi2009-06-114-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | This will remove every bd_mount_sem use in nilfs. The intended exclusion control was replaced by the previous patch ("nilfs2: correct exclusion control in nilfs_remount function") for nilfs_remount(), and this patch will replace remains with a new mutex that this inserts in nilfs object. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: correct exclusion control in nilfs_remount functionRyusuke Konishi2009-06-113-31/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nilfs_remount() changes mount state of a superblock instance. Even though nilfs accesses other superblock instances during mount or remount, the mount state was not properly protected in nilfs_remount(). Moreover, nilfs_remount() has a lock order reversal problem; nilfs_get_sb() holds: 1. bdev->bd_mount_sem 2. sb->s_umount (sget acquires) and nilfs_remount() holds: 1. sb->s_umount (locked by the caller in vfs) 2. bdev->bd_mount_sem To avoid these problems, this patch divides a semaphore protecting super block instances from nilfs->ns_sem, and applies it to the mount state protection in nilfs_remount(). With this change, bd_mount_sem use is removed from nilfs_remount() and the lock order reversal will be resolved. And the new rw-semaphore, nilfs->ns_super_sem will properly protect the mount state except the modification from nilfs_error function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: simplify remaining sget() useRyusuke Konishi2009-06-114-25/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This simplifies the test function passed on the remaining sget() callsite in nilfs. Instead of checking mount type (i.e. ro-mount/rw-mount/snapshot mount) in the test function passed to sget(), this patch first looks up the nilfs_sb_info struct which the given mount type matches, and then acquires the super block instance holding the nilfs_sb_info. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: get rid of sget use for checking if current mount is presentRyusuke Konishi2009-06-112-60/+35
| | | | | | | | | | | | | | | | | | | | This stops using sget() for checking if an r/w-mount or an r/o-mount exists on the device. This elimination uses a back pointer to the current mount added to nilfs object. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: get rid of sget use for acquiring nilfs objectRyusuke Konishi2009-06-113-65/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will change the way to obtain nilfs object in nilfs_get_sb() function. Previously, a preliminary sget() call was performed, and the nilfs object was acquired from a super block instance found by the sget() call. This patch, instead, instroduces a new dedicated function find_or_create_nilfs(); as the name implies, the function finds an existent nilfs object from a global list or creates a new one if no object is found on the device. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: remove meaningless EBUSY case from nilfs_get_sb functionRyusuke Konishi2009-06-111-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following EBUSY case in nilfs_get_sb() is meaningless. Indeed, this error code is never returned to the caller. if (!s->s_root) { ... } else if (!(s->s_flags & MS_RDONLY)) { err = -EBUSY; } This simply removes the else case. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * remove the call to ->write_super in __sync_filesystemChristoph Hellwig2009-06-111-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | Now that all filesystems provide ->sync_fs methods we can change __sync_filesystem to only call ->sync_fs. This gives us a clear separation between periodic writeouts which are driven by ->write_super and data integrity syncs that go through ->sync_fs. (modulo file_fsync which is also going away) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nilfs2: call nilfs2_write_super from nilfs2_sync_fsChristoph Hellwig2009-06-111-0/+2
| | | | | | | | | | | | | | | | The call to ->write_super from __sync_filesystem will go away, so make sure nilfs2 performs the same actions from inside ->sync_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * jffs2: call jffs2_write_super from jffs2_sync_fsChristoph Hellwig2009-06-111-0/+2
| | | | | | | | | | | | | | | | The call to ->write_super from __sync_filesystem will go away, so make sure jffs2 performs the same actions from inside ->sync_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * ufs: add ->sync_fsChristoph Hellwig2009-06-111-10/+22
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * sysv: add ->sync_fsChristoph Hellwig2009-06-111-6/+13
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * hfsplus: add ->sync_fsChristoph Hellwig2009-06-111-5/+11
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * hfs: add ->sync_fsChristoph Hellwig2009-06-111-0/+11
| | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * fat: add ->sync_fsChristoph Hellwig2009-06-111-0/+11
| | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * ext2: add ->sync_fsChristoph Hellwig2009-06-111-14/+27
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * exofs: add ->sync_fsChristoph Hellwig2009-06-111-3/+13
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * bfs: add ->sync_fsChristoph Hellwig2009-06-111-11/+21
| | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super ontop of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * affs: add ->sync_fsChristoph Hellwig2009-06-111-13/+27
| | | | | | | | | | | | | | | | | | Add a ->sync_fs method for data integrity syncs. Factor out common code between affs_put_super, affs_write_super and the new affs_sync_fs into a helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * sanitize ->fsync() for affsAl Viro2009-06-113-2/+15
| | | | | | | | | | | | | | | | | | unfortunately, for affs (especially for affs directories) we have no real way to keep track of metadata ownership. So we have to do more or less what file_fsync() does, but we do *not* need to call write_super() there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * repair bfs_write_inode(), switch bfs to simple_fsync()Al Viro2009-06-112-7/+13
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Fix adfs GET_FRAG_ID() on big-endianAl Viro2009-06-111-1/+1
| | | | | | | | | | | | Missing conversion to host-endian before doing shifts Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * repair adfs ->write_inode(), switch to simple_fsync()Al Viro2009-06-116-6/+48
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch omfs to simple_fsync()Al Viro2009-06-111-16/+1
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch udf to simple_fsync()Al Viro2009-06-115-58/+3
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch ufs to simple_fsync()Al Viro2009-06-113-24/+2
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * repair sysv_write_inode(), switch sysv to simple_fsync()Al Viro2009-06-114-47/+18
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch minix to simple_fsync()Al Viro2009-06-114-45/+12
| | | | | | | | | | | | | | * get minix_write_inode() to honour the second argument * now we can use simple_fsync() for minixfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch ext2 to simple_fsync()Al Viro2009-06-116-66/+6
| | | | | | | | | | | | | | kill ext2_sync_file() (along with ext2/fsync.c), get rid of ext2_update_inode() - it's an alias of ext2_write_inode(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Sanitize ->fsync() for FATAl Viro2009-06-117-19/+49
| | | | | | | | | | | | | | | | * mark directory data blocks as assoc. metadata * add new inode to deal with FAT, mark FAT blocks as assoc. metadata of that * now ->fsync() is trivial both for files and directories Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud