summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* xfs: don't vmap inode cluster buffers during freeDave Chinner2012-11-081-1/+2
| | | | | | | | | | | | Inode buffers do not need to be mapped as inodes are read or written directly from/to the pages underlying the buffer. This fixes a regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the default behaviour"). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: invalidate allocbt blocks moved to the free listDave Chinner2012-11-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we free a block from the alloc btree tree, we move it to the freelist held in the AGFL and mark it busy in the busy extent tree. This typically happens when we merge btree blocks. Once the transaction is committed and checkpointed, the block can remain on the free list for an indefinite amount of time. Now, this isn't the end of the world at this point - if the free list is shortened, the buffer is invalidated in the transaction that moves it back to free space. If the buffer is allocated as metadata from the free list, then all the modifications getted logged, and we have no issues, either. And if it gets allocated as userdata direct from the freelist, it gets invalidated and so will never get written. However, during the time it sits on the free list, pressure on the log can cause the AIL to be pushed and the buffer that covers the block gets pushed for write. IOWs, we end up writing a freed metadata block to disk. Again, this isn't the end of the world because we know from the above we are only writing to free space. The problem, however, is for validation callbacks. If the block was on old btree root block, then the level of the block is going to be higher than the current tree root, and so will fail validation. There may be other inconsistencies in the block as well, and currently we don't care because the block is in free space. Shutting down the filesystem because a freed block doesn't pass write validation, OTOH, is rather unfriendly. So, make sure we always invalidate buffers as they move from the free space trees to the free list so that we guarantee they never get written to disk while on the free list. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phil White <pwhite@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: silence uninitialised f.file warning.Dave Chinner2012-11-081-1/+1
| | | | | | | | | | | | Uninitialised variable build warning introduced by 2903ff0 ("switch simple cases of fget_light to fdget"), gcc is not smart enough to work out that the variable is not used uninitialised, and the commit removed the initialisation at declaration that the old variable had. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: growfs: don't read garbage for new secondary superblocksDave Chinner2012-11-081-2/+19
| | | | | | | | | | | | | When updating new secondary superblocks in a growfs operation, the superblock buffer is read from the newly grown region of the underlying device. This is not guaranteed to be zero, so violates the underlying assumption that the unused parts of superblocks are zero filled. Get a new buffer for these secondary superblocks to ensure that the unused regions are zero filled correctly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: move allocation stack switch up to xfs_bmapi_allocateDave Chinner2012-11-084-56/+54
| | | | | | | | | | | | | | | | | | | | | | | | Switching stacks are xfs_alloc_vextent can cause deadlocks when we run out of worker threads on the allocation workqueue. This can occur because xfs_bmap_btalloc can make multiple calls to xfs_alloc_vextent() and even if xfs_alloc_vextent() fails it can return with the AGF locked in the current allocation transaction. If we then need to make another allocation, and all the allocation worker contexts are exhausted because the are blocked waiting for the AGF lock, holder of the AGF cannot get it's xfs-alloc_vextent work completed to release the AGF. Hence allocation effectively deadlocks. To avoid this, move the stack switch one layer up to xfs_bmapi_allocate() so that all of the allocation attempts in a single switched stack transaction occur in a single worker context. This avoids the problem of an allocation being blocked waiting for a worker thread whilst holding the AGF. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: introduce XFS_BMAPI_STACK_SWITCHDave Chinner2012-11-085-3/+13
| | | | | | | | | | | | | | | | | | | | | Certain allocation paths through xfs_bmapi_write() are in situations where we have limited stack available. These are almost always in the buffered IO writeback path when convertion delayed allocation extents to real extents. The current stack switch occurs for userdata allocations, which means we also do stack switches for preallocation, direct IO and unwritten extent conversion, even those these call chains have never been implicated in a stack overrun. Hence, let's target just the single stack overun offended for stack switches. To do that, introduce a XFS_BMAPI_STACK_SWITCH flag that the caller can pass xfs_bmapi_write() to indicate it should switch stacks if it needs to do allocation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: zero allocation_args on the kernel stackMark Tinguely2012-11-083-0/+5
| | | | | | | | Zero the kernel stack space that makes up the xfs_alloc_arg structures. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: only update the last_sync_lsn when a transaction completesDave Chinner2012-11-081-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The log write code stamps each iclog with the current tail LSN in the iclog header so that recovery knows where to find the tail of thelog once it has found the head. Normally this is taken from the first item on the AIL - the log item that corresponds to the oldest active item in the log. The problem is that when the AIL is empty, the tail lsn is dervied from the the l_last_sync_lsn, which is the LSN of the last iclog to be written to the log. In most cases this doesn't happen, because the AIL is rarely empty on an active filesystem. However, when it does, it opens up an interesting case when the transaction being committed to the iclog spans multiple iclogs. That is, the first iclog is stamped with the l_last_sync_lsn, and IO is issued. Then the next iclog is setup, the changes copied into the iclog (takes some time), and then the l_last_sync_lsn is stamped into the header and IO is issued. This is still the same transaction, so the tail lsn of both iclogs must be the same for log recovery to find the entire transaction to be able to replay it. The problem arises in that the iclog buffer IO completion updates the l_last_sync_lsn with it's own LSN. Therefore, If the first iclog completes it's IO before the second iclog is filled and has the tail lsn stamped in it, it will stamp the LSN of the first iclog into it's tail lsn field. If the system fails at this point, log recovery will not see a complete transaction, so the transaction will no be replayed. The fix is simple - the l_last_sync_lsn is updated when a iclog buffer IO completes, and this is incorrect. The l_last_sync_lsn shoul dbe updated when a transaction is completed by a iclog buffer IO. That is, only iclog buffers that have transaction commit callbacks attached to them should update the l_last_sync_lsn. This means that the last_sync_lsn will only move forward when a commit record it written, not in the middle of a large transaction that is rolling through multiple iclog buffers. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* Linux 3.7-rc1v3.7-rc1Linus Torvalds2012-10-141-2/+2
|
* Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds2012-10-14107-4493/+2895
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull MIPS update from Ralf Baechle: "Cleanups and fixes for breakage that occured earlier during this merge phase. Also a few patches that didn't make the first pull request. Of those is the Alchemy work that merges code for many of the SOCs and evaluation boards thus among other code shrinkage, reduces the number of MIPS defconfigs by 5." * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (22 commits) MIPS: SNI: Switch RM400 serial to SCCNXP driver MIPS: Remove unused empty_bad_pmd_table[] declaration. MIPS: MT: Remove kspd. MIPS: Malta: Fix section mismatch. MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets. MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code. MIPS: Alchemy: merge PB1550 support into DB1550 code MIPS: Alchemy: Single kernel for DB1200/1300/1550 MIPS: Optimize TLB refill for RI/XI configurations. MIPS: proc: Cleanup printing of ASEs. MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required. MIPS: Add detection of DSP ASE Revision 2. MIPS: Optimize pgd_init and pmd_init MIPS: perf: Add perf functionality for BMIPS5000 MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP MIPS: perf: Remove unnecessary #ifdef MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt) MIPS: perf: Change the "mips_perf_event" table unsupported indicator. MIPS: Align swapper_pg_dir to 64K for better TLB Refill code. vmlinux.lds.h: Allow architectures to add sections to the front of .bss ...
| * Merge tag 'disintegrate-mips-20121009' of ↵Ralf Baechle2012-10-1146-1706/+1846
| |\ | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/users/dhowells/linux-headers into mips-for-linux-next UAPI Disintegration 2012-10-09 Patchwork: https://patchwork.linux-mips.org/patch/4414/
| | * UAPI: (Scripted) Disintegrate arch/mips/include/asmDavid Howells2012-10-0946-1706/+1846
| | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
| * | MIPS: SNI: Switch RM400 serial to SCCNXP driverThomas Bogendoerfer2012-10-111-24/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new SCCNXP driver supports the SC2681 chips used in RM400 machines. We now use the new driver instead of the old SC26xx driver. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4417/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Remove unused empty_bad_pmd_table[] declaration.Ralf Baechle2012-10-111-1/+0
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: MT: Remove kspd.Ralf Baechle2012-10-115-462/+0
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Malta: Fix section mismatch.Ralf Baechle2012-10-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LD arch/mips/pci/built-in.o WARNING: arch/mips/pci/built-in.o(.devinit.text+0x2a0): Section mismatch in reference from the function malta_piix_func0_fixup() to the variable .init.data:pci_irq The function __devinit malta_piix_func0_fixup() references a variable __initdata pci_irq. If pci_irq is only used by malta_piix_func0_fixup then annotate pci_irq with a matching annotation. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets.Ralf Baechle2012-10-111-10/+0
| | | | | | | | | | | | | | | | | | | | | Originally added in 05b541489c48e7fbeec19a92acf8683230750d0a [Merge with Linux 2.5.5.] over 10 years ago but never been used. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code.Manuel Lauss2012-10-118-678/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PB1100/1500 are similar to their DB-cousins but with a few more devices on the bus. This patch adds PB1100/1500 support to the existing DB1100/1500 code. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: lnux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4338/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Alchemy: merge PB1550 support into DB1550 codeManuel Lauss2012-10-116-437/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PB1550 is more or less a DB1550 without the PCI IDE controller, a more complicated (read: configurable) Flash setup and some other minor changes. Like the DB1550 it can be automatically detected by reading the CPLD ID register bits. This patch adds PB1550 detection and setup to the DB1550 code. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4337/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Alchemy: Single kernel for DB1200/1300/1550Manuel Lauss2012-10-1113-941/+548
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Combine support for the DB1200/PB1200, DB1300 and DB1550 boards into a single kernel image. defconfig-generated image verified on DB1200, DB1300 and DB1550. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4335/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Optimize TLB refill for RI/XI configurations.David Daney2012-10-111-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't have to do a separate shift to eliminate the software bits, just rotate them into the fill and they will be ignored. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4294/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: proc: Cleanup printing of ASEs.Ralf Baechle2012-10-111-9/+11
| | | | | | | | | | | | | | | | | | The number of %s was just getting ridiculous. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required.Ralf Baechle2012-10-1119-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | Most supported systems currently hardwire cpu_has_dsp to 0, so we also can disable support for cpu_has_dsp2 resulting in a slightly smaller kernel. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Add detection of DSP ASE Revision 2.Steven J. Hill2012-10-115-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ralf@linux-mips.org: This patch really only detects the ASE and passes its existence on to userland via /proc/cpuinfo. The DSP ASE Rev 2. adds new resources but no resources that would need management by the kernel.] Signed-off-by: Steven J. Hill <sjhill@mips.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4165/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Optimize pgd_init and pmd_initDavid Daney2012-10-111-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a dual issue processor GCC generates code that saves a couple of clock cycles per loop if we rearrange things slightly. Checking for p != end saves a SLTU per loop, moving the increment to the middle can let it dual issue on multi-issue processors. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4249/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: perf: Add perf functionality for BMIPS5000Al Cooper2012-10-111-1/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add hardware performance counter support to kernel "perf" code for BMIPS5000. The BMIPS5000 performance counters are similar to MIPS MTI cores, so the changes were mostly made in perf_event_mipsxx.c which is typically for MTI cores. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4109/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMPAl Cooper2012-10-112-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the Kconfig option CONFIG_MIPS_MT_SMP into CONFIG_MIPS_MT_SMP and CONFIG_MIPS_PERF_SHARED_TC_COUNTERS so some of the code used for performance counters that are shared between threads can be used for MIPS cores that are not MT_SMP. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4108/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: perf: Remove unnecessary #ifdefAl Cooper2012-10-111-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The #ifdef for CONFIG_HW_PERF_EVENTS is not needed because the Makefile will only compile the module if this config option is set. This means that the code under #else would never be compiled. This may have been done to leave the original broken code around for reference, but the FIXME comment above the code already shows the broken code. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4107/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt)Al Cooper2012-10-115-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCI (Program Counter Interrupt) bit in the "cause" register is mandatory for MIPS32R2 cores, but has also been added to some R1 cores (BMIPS5000). This change adds a cpu feature bit to make it easier to check for and use this feature. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4106/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: perf: Change the "mips_perf_event" table unsupported indicator.Al Cooper2012-10-111-150/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the indicator from 0xffffffff in the "event_id" member to zero in the "cntr_mask" member. This removes the need to initialize entries that are unsupported. This also solves a problem where the number of entries in the table was increased based on a globel enum used for all platforms, but the new unsupported entries were not added for mips. This was leaving new table entries of all zeros that we not marked UNSUPPORTED. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4110/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Align swapper_pg_dir to 64K for better TLB Refill code.David Daney2012-10-112-10/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can save an instruction in the TLB Refill path for kernel mappings by aligning swapper_pg_dir on a 64K boundary. The address of swapper_pg_dir can be generated with a single LUI instead of LUI/{D}ADDUI. The alignment of __init_end is bumped up to 64K so there are no holes between it and swapper_pg_dir, which is placed at the very beginning of .bss. The alignment of invalid_pmd_table and invalid_pte_table can be relaxed to PAGE_SIZE. We do this by using __page_aligned_bss, which has the added benefit of eliminating alignment holes in .bss. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-arch@vger.kernel.org, Cc: linux-kernel@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Patchwork: https://patchwork.linux-mips.org/patch/4220/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | vmlinux.lds.h: Allow architectures to add sections to the front of .bssDavid Daney2012-10-111-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow-on MIPS patch will put an object here that needs 64K alignment to minimize padding. For those architectures that don't define BSS_FIRST_SECTIONS, there is no change. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-arch@vger.kernel.org, Cc: linux-kernel@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Patchwork: https://patchwork.linux-mips.org/patch/4221/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | Improve atomic.h robustnessJoshua Kinard2012-10-111-35/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've maintained this patch, originally from Thiemo Seufer in 2004, for a really long time, but I think it's time for it to get a look at for possible inclusion. I have had no problems with it across various SGI systems over the years. To quote the post here: http://www.linux-mips.org/archives/linux-mips/2004-12/msg00000.html "the atomic functions use so far memory references for the inline assembler to access the semaphore. This can lead to additional instructions in the ll/sc loop, because newer compilers don't expand the memory reference any more but leave it to the assembler. The appended patch uses registers instead, and makes the ll/sc arguments more explicit. In some cases it will lead also to better register scheduling because the register isn't bound to an output any more." Signed-off-by: Joshua Kinard <kumba@gentoo.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4029/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | | Merge branch 'modules-next' of ↵Linus Torvalds2012-10-14128-594/+6799
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...
| * | | X.509: Fix indefinite length element skip error handlingDavid Howells2012-10-101-9/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asn1_find_indefinite_length() returns an error indicator of -1, which the caller asn1_ber_decoder() places in a size_t (which is usually unsigned) and then checks to see whether it is less than 0 (which it can't be). This can lead to the following warning: lib/asn1_decoder.c:320 asn1_ber_decoder() warn: unsigned 'len' is never less than zero. Instead, asn1_find_indefinite_length() update the caller's idea of the data cursor and length separately from returning the error code. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | X.509: Convert some printk calls to pr_develDavid Howells2012-10-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some debugging printk() calls should've been converted to pr_devel() calls. Do that now. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | asymmetric keys: fix printk format warningRandy Dunlap2012-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix printk format warning in x509_cert_parser.c: crypto/asymmetric_keys/x509_cert_parser.c: In function 'x509_note_OID': crypto/asymmetric_keys/x509_cert_parser.c:113:3: warning: format '%zu' expects type 'size_t', but argument 2 has type 'long unsigned int' Builds cleanly on i386 and x86_64. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: David Howells <dhowells@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: linux-crypto@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checkingDavid Howells2012-10-103-20/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current choice of lifetime for the autogenerated X.509 of 100 years, putting the validTo date in 2112, causes problems on 32-bit systems where a 32-bit time_t wraps in 2106. 64-bit x86_64 systems seem to be unaffected. This can result in something like: Loading module verification certificates X.509: Cert 6e03943da0f3b015ba6ed7f5e0cac4fe48680994 has expired MODSIGN: Problem loading in-kernel X.509 certificate (-127) Or: X.509: Cert 6e03943da0f3b015ba6ed7f5e0cac4fe48680994 is not yet valid MODSIGN: Problem loading in-kernel X.509 certificate (-129) Instead of turning the dates into time_t values and comparing, turn the system clock and the ASN.1 dates into tm structs and compare those piecemeal instead. Reported-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Make mrproper should remove generated files.Rusty Russell2012-10-102-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It doesn't, because the clean targets don't include kernel/Makefile, and because two files were missing from the list. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certsDavid Howells2012-10-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Place an indication that the certificate should use utf8 strings into the x509.genkey template generated by kernel/Makefile. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Use the same digest for the autogen key sig as for the module sigDavid Howells2012-10-101-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the same digest type for the autogenerated key signature as for the module signature so that the hash algorithm is guaranteed to be present in the kernel. Without this, the X.509 certificate loader may reject the X.509 certificate so generated because it was self-signed and the signature will be checked against itself - but this won't work if the digest algorithm must be loaded as a module. The symptom is that the key fails to load with the following message emitted into the kernel log: MODSIGN: Problem loading in-kernel X.509 certificate (-65) the error in brackets being -ENOPKG. What you should see is something like: MODSIGN: Loaded cert 'Magarathea: Glacier signing key: 9588321144239a119d3406d4c4cf1fbae1836fa0' Note that this doesn't apply to certificates that are not self-signed as we don't check those currently as they require the parent CA certificate to be available. Reported-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Sign modules during the build processDavid Howells2012-10-102-1/+191
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If CONFIG_MODULE_SIG is set, then this patch will cause all modules files to to have signatures added. The following steps will occur: (1) The module will be linked to foo.ko.unsigned instead of foo.ko (2) The module will be stripped using both "strip -x -g" and "eu-strip" to ensure minimal size for inclusion in an initramfs. (3) The signature will be generated on the stripped module. (4) The signature will be appended to the module, along with some information about the signature and a magic string that indicates the presence of the signature. Step (3) requires private and public keys to be available. By default these are expected to be found in files: signing_key.priv signing_key.x509 in the base directory of the build. The first is the private key in PEM form and the second is the X.509 certificate in DER form as can be generated from openssl: openssl req \ -new -x509 -outform PEM -out signing_key.x509 \ -keyout signing_key.priv -nodes \ -subj "/CN=H2G2/O=Magrathea/CN=Slartibartfast" If the secret key is not found then signing will be skipped and the unsigned module from (1) will just be copied to foo.ko. If signing occurs, lines like the following will be seen: LD [M] fs/foo/foo.ko.unsigned STRIP [M] fs/foo/foo.ko.stripped SIGN [M] fs/foo/foo.ko will appear in the build log. If the signature step will be skipped and the following will be seen: LD [M] fs/foo/foo.ko.unsigned STRIP [M] fs/foo/foo.ko.stripped NO SIGN [M] fs/foo/foo.ko NOTE! After the signature step, the signed module _must_not_ be passed through strip. The unstripped, unsigned module is still available at the name on the LD [M] line. This restriction may affect packaging tools (such as rpmbuild) and initramfs composition tools. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Provide a script for generating a key ID from an X.509 certDavid Howells2012-10-101-0/+268
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide a script to parse an X.509 certificate and certain pieces of information from it in order to generate a key identifier to be included within a module signature. The script takes the Subject Name and extracts (if present) the organizationName (O), the commonName (CN) and the emailAddress and fabricates the signer's name from them: (1) If both O and CN exist, then the name will be "O: CN", unless: (a) CN is prefixed by O, in which case only CN is used. (b) CN and O share at least the first 7 characters, in which case only CN is used. (2) Otherwise, CN is used if present. (3) Otherwise, O is used if present. (4) Otherwise the emailAddress is used, if present. (5) Otherwise a blank name is used. The script emits a binary encoded identifier in the following form: - 2 BE bytes indicating the length of the signer's name. - 2 BE bytes indicating the length of the subject key identifier. - The characters of the signer's name. - The bytes of the subject key identifier. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Implement module signature checkingDavid Howells2012-10-102-1/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check the signature on the module against the keys compiled into the kernel or available in a hardware key store. Currently, only RSA keys are supported - though that's easy enough to change, and the signature is expected to contain raw components (so not a PGP or PKCS#7 formatted blob). The signature blob is expected to consist of the following pieces in order: (1) The binary identifier for the key. This is expected to match the SubjectKeyIdentifier from an X.509 certificate. Only X.509 type identifiers are currently supported. (2) The signature data, consisting of a series of MPIs in which each is in the format of a 2-byte BE word sizes followed by the content data. (3) A 12 byte information block of the form: struct module_signature { enum pkey_algo algo : 8; enum pkey_hash_algo hash : 8; enum pkey_id_type id_type : 8; u8 __pad; __be32 id_length; __be32 sig_length; }; The three enums are defined in crypto/public_key.h. 'algo' contains the public-key algorithm identifier (0->DSA, 1->RSA). 'hash' contains the digest algorithm identifier (0->MD4, 1->MD5, 2->SHA1, etc.). 'id_type' contains the public-key identifier type (0->PGP, 1->X.509). '__pad' should be 0. 'id_length' should contain in the binary identifier length in BE form. 'sig_length' should contain in the signature data length in BE form. The lengths are in BE order rather than CPU order to make dealing with cross-compilation easier. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor Kconfig fix)
| * | | MODSIGN: Provide module signing public keys to the kernelDavid Howells2012-10-103-2/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Include a PGP keyring containing the public keys required to perform module verification in the kernel image during build and create a special keyring during boot which is then populated with keys of crypto type holding the public keys found in the PGP keyring. These can be seen by root: [root@andromeda ~]# cat /proc/keys 07ad4ee0 I----- 1 perm 3f010000 0 0 crypto modsign.0: RSA 87b9b3bd [] 15c7f8c3 I----- 1 perm 1f030000 0 0 keyring .module_sign: 1/4 ... It is probably worth permitting root to invalidate these keys, resulting in their removal and preventing further modules from being loaded with that key. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Automatically generate module signing keys if missingDavid Howells2012-10-101-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Automatically generate keys for module signing if they're absent so that allyesconfig doesn't break. The builder should consider generating their own key and certificate, however, so that the keys are appropriately named. The private key for the module signer should be placed in signing_key.priv (unencrypted!) and the public key in an X.509 certificate as signing_key.x509. If a transient key is desired for signing the modules, a config file for 'openssl req' can be placed in x509.genkey, looking something like the following: [ req ] default_bits = 4096 distinguished_name = req_distinguished_name prompt = no x509_extensions = myexts [ req_distinguished_name ] O = Magarathea CN = Glacier signing key emailAddress = slartibartfast@magrathea.h2g2 [ myexts ] basicConstraints=critical,CA:FALSE keyUsage=digitalSignature subjectKeyIdentifier=hash authorityKeyIdentifier=hash The build process will use this to configure: openssl req -new -nodes -utf8 -sha1 -days 36500 -batch \ -x509 -config x509.genkey \ -outform DER -out signing_key.x509 \ -keyout signing_key.priv to generate the key. Note that it is required that the X.509 certificate have a subjectKeyIdentifier and an authorityKeyIdentifier. Without those, the certificate will be rejected. These can be used to check the validity of a certificate. Note that 'make distclean' will remove signing_key.{priv,x509} and x509.genkey, whether or not they were generated automatically. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Provide Kconfig optionsDavid Howells2012-10-101-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide kernel configuration options for module signing. The following configuration options are added: CONFIG_MODULE_SIG_SHA1 CONFIG_MODULE_SIG_SHA224 CONFIG_MODULE_SIG_SHA256 CONFIG_MODULE_SIG_SHA384 CONFIG_MODULE_SIG_SHA512 These select the cryptographic hash used to digest the data prior to signing. Additionally, the crypto module selected will be built into the kernel as it won't be possible to load it as a module without incurring a circular dependency when the kernel tries to check its signature. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Provide gitignore and make clean rules for extra filesDavid Howells2012-10-102-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide gitignore and make clean rules for extra files to hide and clean up the extra files produced by module signing stuff once it is added. Also add a clean up rule for the module content extractor program used to extract the data to be signed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | MODSIGN: Add FIPS policyDavid Howells2012-10-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're in FIPS mode, we should panic if we fail to verify the signature on a module or we're asked to load an unsigned module in signature enforcing mode. Possibly FIPS mode should automatically enable enforcing mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | module: signature checking hookRusty Russell2012-10-107-1/+157
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do a very simple search for a particular string appended to the module (which is cache-hot and about to be SHA'd anyway). There's both a config option and a boot parameter which control whether we accept or fail with unsigned modules and modules that are signed with an unknown key. If module signing is enabled, the kernel will be tainted if a module is loaded that is unsigned or has a signature for which we don't have the key. (Useful feedback and tweaks by David Howells <dhowells@redhat.com>) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
OpenPOWER on IntegriCloud