summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* CRED: Use creds in file structsDavid Howells2008-11-141-2/+2
| | | | | | | | | | | | Attach creds to file structs and discard f_uid/f_gid. file_operations::open() methods (such as hppfs_open()) should use file->f_cred rather than current_cred(). At the moment file->f_cred will be current_cred() at this point. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Make execve() take advantage of copy-on-write credentialsDavid Howells2008-11-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make execve() take advantage of copy-on-write credentials, allowing it to set up the credentials in advance, and then commit the whole lot after the point of no return. This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). The credential bits from struct linux_binprm are, for the most part, replaced with a single credentials pointer (bprm->cred). This means that all the creds can be calculated in advance and then applied at the point of no return with no possibility of failure. I would like to replace bprm->cap_effective with: cap_isclear(bprm->cap_effective) but this seems impossible due to special behaviour for processes of pid 1 (they always retain their parent's capability masks where normally they'd be changed - see cap_bprm_set_creds()). The following sequence of events now happens: (a) At the start of do_execve, the current task's cred_exec_mutex is locked to prevent PTRACE_ATTACH from obsoleting the calculation of creds that we make. (a) prepare_exec_creds() is then called to make a copy of the current task's credentials and prepare it. This copy is then assigned to bprm->cred. This renders security_bprm_alloc() and security_bprm_free() unnecessary, and so they've been removed. (b) The determination of unsafe execution is now performed immediately after (a) rather than later on in the code. The result is stored in bprm->unsafe for future reference. (c) prepare_binprm() is called, possibly multiple times. (i) This applies the result of set[ug]id binaries to the new creds attached to bprm->cred. Personality bit clearance is recorded, but now deferred on the basis that the exec procedure may yet fail. (ii) This then calls the new security_bprm_set_creds(). This should calculate the new LSM and capability credentials into *bprm->cred. This folds together security_bprm_set() and parts of security_bprm_apply_creds() (these two have been removed). Anything that might fail must be done at this point. (iii) bprm->cred_prepared is set to 1. bprm->cred_prepared is 0 on the first pass of the security calculations, and 1 on all subsequent passes. This allows SELinux in (ii) to base its calculations only on the initial script and not on the interpreter. (d) flush_old_exec() is called to commit the task to execution. This performs the following steps with regard to credentials: (i) Clear pdeath_signal and set dumpable on certain circumstances that may not be covered by commit_creds(). (ii) Clear any bits in current->personality that were deferred from (c.i). (e) install_exec_creds() [compute_creds() as was] is called to install the new credentials. This performs the following steps with regard to credentials: (i) Calls security_bprm_committing_creds() to apply any security requirements, such as flushing unauthorised files in SELinux, that must be done before the credentials are changed. This is made up of bits of security_bprm_apply_creds() and security_bprm_post_apply_creds(), both of which have been removed. This function is not allowed to fail; anything that might fail must have been done in (c.ii). (ii) Calls commit_creds() to apply the new credentials in a single assignment (more or less). Possibly pdeath_signal and dumpable should be part of struct creds. (iii) Unlocks the task's cred_replace_mutex, thus allowing PTRACE_ATTACH to take place. (iv) Clears The bprm->cred pointer as the credentials it was holding are now immutable. (v) Calls security_bprm_committed_creds() to apply any security alterations that must be done after the creds have been changed. SELinux uses this to flush signals and signal handlers. (f) If an error occurs before (d.i), bprm_free() will call abort_creds() to destroy the proposed new credentials and will then unlock cred_replace_mutex. No changes to the credentials will have been made. (2) LSM interface. A number of functions have been changed, added or removed: (*) security_bprm_alloc(), ->bprm_alloc_security() (*) security_bprm_free(), ->bprm_free_security() Removed in favour of preparing new credentials and modifying those. (*) security_bprm_apply_creds(), ->bprm_apply_creds() (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds() Removed; split between security_bprm_set_creds(), security_bprm_committing_creds() and security_bprm_committed_creds(). (*) security_bprm_set(), ->bprm_set_security() Removed; folded into security_bprm_set_creds(). (*) security_bprm_set_creds(), ->bprm_set_creds() New. The new credentials in bprm->creds should be checked and set up as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the second and subsequent calls. (*) security_bprm_committing_creds(), ->bprm_committing_creds() (*) security_bprm_committed_creds(), ->bprm_committed_creds() New. Apply the security effects of the new credentials. This includes closing unauthorised files in SELinux. This function may not fail. When the former is called, the creds haven't yet been applied to the process; when the latter is called, they have. The former may access bprm->cred, the latter may not. (3) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) The bprm_security_struct struct has been removed in favour of using the credentials-under-construction approach. (c) flush_unauthorized_files() now takes a cred pointer and passes it on to inode_has_perm(), file_has_perm() and dentry_open(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Pass credentials through dentry_open()David Howells2008-11-142-3/+4
| | | | | | | | | | | | | Pass credentials through dentry_open() so that the COW creds patch can have SELinux's flush_unauthorized_files() pass the appropriate creds back to itself when it opens its null chardev. The security_dentry_open() call also now takes a creds pointer, as does the dentry_open hook in struct security_operations. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Use RCU to access another task's creds and to release a task's own credsDavid Howells2008-11-141-12/+20
| | | | | | | | | | | | Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap current->cred and a few other accessorsDavid Howells2008-11-141-4/+3
| | | | | | | | | | Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Separate task security context from task_structDavid Howells2008-11-145-28/+33
| | | | | | | | | | | | | | | | Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the x86 archDavid Howells2008-11-141-1/+1
| | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the S390 archDavid Howells2008-11-141-2/+2
| | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the PowerPC archDavid Howells2008-11-142-3/+3
| | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the PA-RISC archDavid Howells2008-11-141-1/+1
| | | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: linux-parisc@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the MIPS archDavid Howells2008-11-141-2/+3
| | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Signed-off-by: James Morris <jmorris@namei.org>
* CRED: Wrap task credential accesses in the IA64 archDavid Howells2008-11-143-13/+16
| | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-ia64@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* Merge branch 'io-mappings-for-linus-2' of ↵Linus Torvalds2008-11-037-11/+70
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: io mapping: clean up #ifdefs io mapping: improve documentation i915: use io-mapping interfaces instead of a variety of mapping kludges resources: add io-mapping functions to dynamically map large device apertures x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
| * io mapping: clean up #ifdefsKeith Packard2008-11-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup clean up ifdefs: change #ifdef CONFIG_X86_32/64 to CONFIG_HAVE_ATOMIC_IOMAP. flip around the #ifdef sections to clean up the structure. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmapsKeith Packard2008-10-316-11/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | Impact: introduce new APIs, separate kmap code from CONFIG_HIGHMEM This takes the code used for CONFIG_HIGHMEM memory mappings except that it's designed for dynamic IO resource mapping. These fixmaps are available even with CONFIG_HIGHMEM turned off. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2008-11-023-5/+13
|\ \ | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix PCI resource mapping on sparc64 sparc64: Kill annoying warning when building compat_binfmt_elf.o sparc32: kernel/trace/trace.c wants DIE_OOPS sparc64: Fix __copy_{to,from}_user_inatomic defines.
| * | sparc64: Fix PCI resource mapping on sparc64Max Dmitrichenko2008-11-021-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a problem discovered in recent versions of ATI Mach64 driver in X.org on sparc64 architecture. In short, the driver fails to mmap MMIO aperture (PCI resource #2). I've found that kernel's __pci_mmap_make_offset() returns EINVAL. It checks whether user attempts to mmap more than the resource length, which is 0x1000 bytes in our case. But PAGE_SIZE on SPARC64 is 0x2000 and this is what actually is being mmaped. So __pci_mmap_make_offset() failed for this PCI resource. Signed-off-by: Max Dmitrichenko <dmitrmax@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Kill annoying warning when building compat_binfmt_elf.oDavid S. Miller2008-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC warns because some tests against 32-bit values never evaluate to true due to how TASK_SIZE is defined. I always wanted to mimick powerpc's definition of TASK_SIZE, which is simply TASK_SIZE_OF(current) and that also fixes the warning. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc32: kernel/trace/trace.c wants DIE_OOPSAl Viro2008-11-011-0/+1
| | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Fix __copy_{to,from}_user_inatomic defines.Hugh Dickins2008-11-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexander Beregalov reports oops in __bzero() called from copy_from_user_fixup() called from iov_iter_copy_from_user_atomic(), when running dbench on tmpfs on sparc64: its __copy_from_user_inatomic and __copy_to_user_inatomic should be avoiding, not calling, the fixups. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-11-021-0/+26
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) af_unix: netns: fix problem of return value IRDA: remove double inclusion of module.h udp: multicast packets need to check namespace net: add documentation for skb recycling key: fix setkey(8) policy set breakage bpa10x: free sk_buff with kfree_skb xfrm: do not leak ESRCH to user space net: Really remove all of LOOPBACK_TSO code. netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys() netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys net: delete excess kernel-doc notation pppoe: Fix socket leak. gianfar: Don't reset TBI<->SerDes link if it's already up gianfar: Fix race in TBI/SerDes configuration at91_ether: request/free GPIO for PHY interrupt amd8111e: fix dma_free_coherent context atl1: fix vlan tag regression SMC91x: delete unused local variable "lp" myri10ge: fix stop/go mmio ordering bonding: fix panic when taking bond interface down before removing module ...
| * \ \ Merge branch 'davem-fixes' of ↵David S. Miller2008-10-301-0/+26
| |\ \ \ | | |_|/ | |/| | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| | * | gianfar: Fix race in TBI/SerDes configurationTrent Piepho2008-10-311-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The init_phy() function attaches to the PHY, then configures the SerDes<->TBI link (in SGMII mode). The TBI is on the MDIO bus with the PHY (sort of) and is accessed via the gianfar's MDIO registers, using the functions gfar_local_mdio_read/write(), which don't do any locking. The previously attached PHY will start a work-queue on a timer, and probably an irq handler as well, which will talk to the PHY and thus use the MDIO bus. This uses phy_read/write(), which have locking, but not against the gfar_local_mdio versions. The result is that PHY code will try to use the MDIO bus at the same time as the SerDes setup code, corrupting the transfers. Setting up the SerDes before attaching to the PHY will insure that there is no race between the SerDes code and *our* PHY, but doesn't fix everything. Typically the PHYs for all gianfar devices are on the same MDIO bus, which is associated with the first gianfar device. This means that the first gianfar's SerDes code could corrupt the MDIO transfers for a different gianfar's PHY. The lock used by phy_read/write() is contained in the mii_bus structure, which is pointed to by the PHY. This is difficult to access from the gianfar drivers, as there is no link between a gianfar device and the mii_bus which shares the same MDIO registers. As far as the device layer and drivers are concerned they are two unrelated devices (which happen to share registers). Generally all gianfar devices' PHYs will be on the bus associated with the first gianfar. But this might not be the case, so simply locking the gianfar's PHY's mii bus might not lock the mii bus that the SerDes setup code is going to use. We solve this by having the code that creates the gianfar platform device look in the device tree for an mdio device that shares the gianfar's registers. If one is found the ID of its platform device is saved in the gianfar's platform data. A new function in the gianfar mii code, gfar_get_miibus(), can use the bus ID to search through the platform devices for a gianfar_mdio device with the right ID. The platform device's driver data is the mii_bus structure, which the SerDes setup code can use to lock the current bus. Signed-off-by: Trent Piepho <tpiepho@freescale.com> CC: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | | | sparc32: kernel/trace/trace.c wants DIE_OOPSAl Viro2008-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2008-11-012-1/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature x86: build fix
| * | | | x86: fix AMDC1E and XTOPOLOGY conflict in cpufeatureVenki Pallipadi2008-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix xsave slowdown regression Fix two features from conflicting in feature bits. Fixes this performance regression: Subject: cpu2000(both float and int) 13% regression with 2.6.28-rc1 http://lkml.org/lkml/2008/10/28/36 Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: build fixIngo Molnar2008-10-311-0/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: build fix on certain UP configs fix: arch/x86/kernel/cpu/common.c: In function 'cpu_init': arch/x86/kernel/cpu/common.c:1141: error: 'boot_cpu_id' undeclared (first use in this function) arch/x86/kernel/cpu/common.c:1141: error: (Each undeclared identifier is reported only once arch/x86/kernel/cpu/common.c:1141: error: for each function it appears in.) Pull in asm/smp.h on UP, so that we get the definition of boot_cpu_id. Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | x86: Clean up late e820 resource allocationLinus Torvalds2008-11-011-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the late e820 resources use 'insert_resource_expand_to_fit()' instead of doing a 'reserve_region_with_split()', and also avoids marking them as IORESOURCE_BUSY. This results in us being perfectly happy to use pre-existing PCI resources even if they were marked as being in a reserved region, while still avoiding any _new_ allocations in the reserved regions. It also makes for a simpler and more accurate resource tree. Example resource allocation from Jonathan Corbet, who has firmware that has an e820 reserved entry that covered a big range (e0000000-fed003ff), and that had various PCI resources in it set up by firmware. With old kernels, the reserved range would force us to re-allocate all pre-existing PCI resources, and his reserved range would end up looking like this: e0000000-fed003ff : reserved fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 where only the pre-allocated special regions (IOAPIC and HPET) were kept around. With 2.6.28-rc2, which uses 'reserve_region_with_split()', Jonathan's resource tree looked like this: e0000000-fe7fffff : reserved fe800000-fe8fffff : PCI Bus 0000:01 fe800000-fe8fffff : reserved fe900000-fe9d9aff : reserved fe9d9b00-fe9d9bff : 0000:00:1f.3 fe9d9b00-fe9d9bff : reserved fe9d9c00-fe9d9fff : 0000:00:1a.7 fe9d9c00-fe9d9fff : reserved fe9da000-fe9dafff : 0000:00:03.3 fe9da000-fe9dafff : reserved fe9db000-fe9dbfff : 0000:00:19.0 fe9db000-fe9dbfff : reserved fe9dc000-fe9dffff : 0000:00:1b.0 fe9dc000-fe9dffff : reserved fe9e0000-fe9fffff : 0000:00:19.0 fe9e0000-fe9fffff : reserved fea00000-fea7ffff : 0000:00:02.0 fea00000-fea7ffff : reserved fea80000-feafffff : 0000:00:02.1 fea80000-feafffff : reserved feb00000-febfffff : 0000:00:02.0 feb00000-febfffff : reserved fec00000-fed003ff : reserved fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 and because the reserved entry had been split and moved into the individual resources, and because it used the IORESOURCE_BUSY flag, the drivers that actually wanted to _use_ those resources couldn't actually attach to them: e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff] HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff] with this patch, the resource tree instead becomes e0000000-fed003ff : reserved fe800000-fe8fffff : PCI Bus 0000:01 fe9d9b00-fe9d9bff : 0000:00:1f.3 fe9d9c00-fe9d9fff : 0000:00:1a.7 fe9d9c00-fe9d9fff : ehci_hcd fe9da000-fe9dafff : 0000:00:03.3 fe9db000-fe9dbfff : 0000:00:19.0 fe9db000-fe9dbfff : e1000e fe9dc000-fe9dffff : 0000:00:1b.0 fe9dc000-fe9dffff : ICH HD audio fe9e0000-fe9fffff : 0000:00:19.0 fe9e0000-fe9fffff : e1000e fea00000-fea7ffff : 0000:00:02.0 fea80000-feafffff : 0000:00:02.1 feb00000-febfffff : 0000:00:02.0 fec00000-fec00fff : IOAPIC 0 fed00000-fed003ff : HPET 0 ie the one reserved region now ends up surrounding all the PCI resources that were allocated inside of it by firmware, and because it is not marked BUSY, drivers have no problem attaching to the pre-allocated resources. Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Robert Hancock <hancockr@shaw.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'link_removal' of git://www.jni.nu/crisLinus Torvalds2008-11-017-246/+115
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | * 'link_removal' of git://www.jni.nu/cris: [CRIS] Remove links from CRIS build [CRIS] Merge asm-offsets.c for both arches into one file.
| * | | | [CRIS] Remove links from CRIS buildJesper Nilsson2008-10-315-178/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the links to architecture and machine dependent directories (boot, lib, drivers, arch, mach) The links were created and used mostly from the arch/cris/Makefile, so why not dispense with them altogether? Changed $(ARCH) to "cris" in Makefile, it is easier to read this way. The CRISv32 head.S common files for the kernel and compressed images needed to be modified to use ifdefs instead of using the now removed mach link. Since there are only two versions, this is not a huge loss in readability. The link to vmlinux.lds.S is also replaced with a merged version which uses ifdefs to select the correct layout. System.map before and after are identical. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
| * | | | [CRIS] Merge asm-offsets.c for both arches into one file.Jesper Nilsson2008-10-313-69/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eliminates the link to arch specific asm-offsets.c from CRIS architecture build system. Resulting asm-offsets.s are identical before and after change for both arch-v10 and arch-v32. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
* | | | | Merge branch 'cris_move' of git://www.jni.nu/crisLinus Torvalds2008-11-01353-95/+62520
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | | | | | | * 'cris_move' of git://www.jni.nu/cris: [CRIS] Move header files from include to arch/cris/include. [CRISv32] Remove warning in io.h
| * | | | [CRIS] Move header files from include to arch/cris/include.Jesper Nilsson2008-10-29353-95/+62520
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | Change all users of header files to correct path. Remove some unneeded headers for arch-v32. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
* | | | saner FASYNC handling on file closeAl Viro2008-11-011-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that *does* forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'merge' of ↵Linus Torvalds2008-10-3156-864/+1434
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (23 commits) Revert "powerpc: Sync RPA note in zImage with kernel's RPA note" powerpc: Fix compile errors with CONFIG_BUG=n powerpc: Fix format string warning in arch/powerpc/boot/main.c powerpc: Fix bug in kernel copy of libfdt's fdt_subnode_offset_namelen() powerpc: Remove duplicate DMA entry from mpc8313erdb device tree powerpc/cell/OProfile: Fix on-stack array size in activate spu profiling function powerpc/mpic: Fix regression caused by change of default IRQ affinity powerpc: Update remaining dma_mapping_ops to use map/unmap_page powerpc/pci: Fix unmapping of IO space on 64-bit powerpc/pci: Properly allocate bus resources for hotplug PHBs OF-device: Don't overwrite numa_node in device registration powerpc: Fix swapcontext system for VSX + old ucontext size powerpc: Fix compiler warning for the relocatable kernel powerpc: Work around ld bug in older binutils powerpc/ppc64/kdump: Better flag for running relocatable powerpc: Use is_kdump_kernel() powerpc: Kexec exit should not use magic numbers powerpc/44x: Update 44x defconfigs powerpc/40x: Update 40x defconfigs powerpc: enable heap randomization for linkstations ...
| * | | | Revert "powerpc: Sync RPA note in zImage with kernel's RPA note"Paul Mackerras2008-10-314-145/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 91a00302959545a9ae423e99732b1e46eb19e877, plus commit 0dcd440120ef12879ff34fc78d7e4abf171c79e4 ("powerpc: Revert CHRP boot wrapper to real-base = 12MB on 32-bit") which depended on it. Commit 91a00302 was causing NVRAM corruption on some pSeries machines, for as-yet unknown reasons, so this reverts it until the cause is identified. Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | Merge branch 'merge' of ↵Paul Mackerras2008-10-311-39/+0
| |\ \ \ \ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc into merge
| | * | | | powerpc: Remove duplicate DMA entry from mpc8313erdb device treeMike Dyer2008-10-311-39/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 574366128db29e7da609ec1f9c01bf9d80adec87 added a duplicate DMA controller node. Signed-off-by: Mike Dyer <mike.dyer@provision-comm.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| * | | | | powerpc: Fix compile errors with CONFIG_BUG=nPaul Mackerras2008-10-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes sure we don't try to call find_bug or is_warning_bug when CONFIG_BUG=n and CONFIG_XMON=y. Otherwise we get these errors: arch/powerpc/xmon/xmon.c: In function ‘print_bug_trap’: arch/powerpc/xmon/xmon.c:1364: error: implicit declaration of function ‘find_bug’ arch/powerpc/xmon/xmon.c:1364: warning: assignment makes pointer from integer without a cast arch/powerpc/xmon/xmon.c:1367: error: implicit declaration of function ‘is_warning_bug’ arch/powerpc/xmon/xmon.c:1374: error: dereferencing pointer to incomplete type make[2]: *** [arch/powerpc/xmon/xmon.o] Error 1 make[1]: *** [arch/powerpc/xmon] Error 2 make: *** [sub-make] Error 2 Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Fix format string warning in arch/powerpc/boot/main.cJon Smirl2008-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix format string warning in arch/powerpc/boot/main.c. Also correct a typo ("uncomressed") on the same line. BOOTCC arch/powerpc/boot/main.o arch/powerpc/boot/main.c: In function 'prep_kernel': arch/powerpc/boot/main.c:65: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int' Signed-off-by: Jon Smirl <jonsmirl@gmail.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Fix bug in kernel copy of libfdt's fdt_subnode_offset_namelen()David Gibson2008-10-311-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's currently an off-by-one bug in fdt_subnode_offset_namelen() which causes it to keep searching after it's finished the subnodes of the given parent, and into the subnodes of siblings of the original node which come after it in the tree. This bug was introduced in commit ed95d7450dcbfeb45ffc9d39b1747aee82b49a51 ("powerpc: Update in-kernel dtc and libfdt to version 1.2.0"). A patch has already been submitted to dtc/libfdt mainline. We don't really want to pull in a new upstream version during the 2.6.28 cycle, but we should still fix this bug, hence this standalone version of the fix for the in-kernel libfdt. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc/cell/OProfile: Fix on-stack array size in activate spu profiling ↵Carl Love2008-10-311-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | function The size of the pm_signal_local array should be equal to the number of SPUs being configured in the array. Currently, the array is of size 4 (NR_PHYS_CTRS) but being indexed by a for loop from 0 to 7 (NUM_SPUS_PER_NODE). This could potentially cause an oops or random memory corruption since the pm_signal_local array is on the stack. This fixes it. Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc/mpic: Fix regression caused by change of default IRQ affinityKumar Gala2008-10-314-6/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Freescale implementation of MPIC only allows a single CPU destination for non-IPI interrupts. We add a flag to the mpic_init to distinquish these variants of MPIC. We pull in the irq_choose_cpu from sparc64 to select a single CPU as the destination of the interrupt. This is to deal with the fact that the default smp affinity was changed by commit 18404756765c713a0be4eb1082920c04822ce588 ("genirq: Expose default irq affinity mask (take 3)") to be all CPUs. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Update remaining dma_mapping_ops to use map/unmap_pageMark Nelson2008-10-318-99/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the merge of the 32 and 64bit DMA code, dma_direct_ops lost their map/unmap_single() functions but gained map/unmap_page(). This caused a problem for Cell because Cell's dma_iommu_fixed_ops called the dma_direct_ops if the fixed linear mapping was to be used or the iommu ops if the dynamic window was to be used. So in order to fix this problem we need to update the 64bit DMA code to use map/unmap_page. First, we update the generic IOMMU code so that iommu_map_single() becomes iommu_map_page() and iommu_unmap_single() becomes iommu_unmap_page(). Then we propagate these changes up through all the callers of these two functions and in the process update all the dma_mapping_ops so that they have map/unmap_page rahter than map/unmap_single. We can do this because on 64bit there is no HIGHMEM memory so map/unmap_page ends up performing exactly the same function as map/unmap_single, just taking different arguments. This has no affect on drivers because the dma_map_single_attrs() just ends up calling the map_page() function of the appropriate dma_mapping_ops and similarly the dma_unmap_single_attrs() calls unmap_page(). This fixes an oops on Cell blades, which oops on boot without this because they call dma_direct_ops.map_single, which is NULL. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc/pci: Fix unmapping of IO space on 64-bitBenjamin Herrenschmidt2008-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A typo/thinko made us pass the wrong argument to __flush_hash_table_range when unplugging bridges, thus not flushing all the translations for the IO space on unplug. The third parameter to __flush_hash_table_range is `end', not `size'. This causes the hypervisor to refuse unplugging slots. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc/pci: Properly allocate bus resources for hotplug PHBsNathan Fontenot2008-10-313-55/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resources for PHB's that are dynamically added to a system are not properly allocated in the resource tree. Not having these resources allocated causes an oops when removing the PHB when we try to release them. The diff appears a bit messy, this is mainly due to moving everything one tab to the left in the pcibios_allocate_bus_resources routine. The functionality change in this routine is only that the list_for_each_entry() loop is pulled out and moved to the necessary calling routine. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | OF-device: Don't overwrite numa_node in device registrationJeremy Kerr2008-10-311-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the numa_node of OF-devices will be overwritten during device_register, which simply sets the node to -1. On cell machines, this means that devices can't find their IOMMU, which is referenced through the device's numa node. Set the numa node for OF devices with no parent, and use the lower-level device_initialize and device_add functions, so that the node is preserved. We can remove the call to set_dev_node in of_device_alloc, as it will be overwritten during register. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Fix swapcontext system for VSX + old ucontext sizeMichael Neuling2008-10-312-39/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since VSX support was added, we now have two sizes of ucontext_t; the older, smaller size without the extra VSX state, and the new larger size with the extra VSX state. A program using the sys_swapcontext system call and supplying smaller ucontext_t structures will currently get an EINVAL error if the task has used VSX (e.g. because of calling library code that uses VSX) and the old_ctx argument is non-NULL (i.e. the program is asking for its current context to be saved). Thus the program will start getting EINVAL errors on calls that previously worked. This commit changes this behaviour so that we don't send an EINVAL in this case. It will now return the smaller context but the VSX MSR bit will always be cleared to indicate that the ucontext_t doesn't include the extra VSX state, even if the task has executed VSX instructions. Both 32 and 64 bit cases are updated. [paulus@samba.org - also fix some access_ok() and get_user() calls] Thanks to Ben Herrenschmidt for noticing this problem. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Fix compiler warning for the relocatable kernelMichael Neuling2008-10-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes this warning: arch/powerpc/kernel/setup_64.c:447:5: warning: "kernstart_addr" is not defined which arises because PHYSICAL_START is no longer a constant when CONFIG_RELOCATABLE=y. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | powerpc: Work around ld bug in older binutilsPaul Mackerras2008-10-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 549e8152de8039506f69c677a4546e5427aa6ae7 ("powerpc: Make the 64-bit kernel as a position-independent executable") added lines to vmlinux.lds.S to add the extra sections needed to implement a relocatable kernel. However, those lines seem to trigger a bug in older versions of GNU ld (such as 2.16.1) when building a non-relocatable kernel. Since ld 2.16.1 is still a popular choice for cross-toolchains, this adds an #ifdef to vmlinux.lds.S so the added lines are only included when building a relocatable kernel. Signed-off-by: Paul Mackerras <paulus@samba.org>
OpenPOWER on IntegriCloud