summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* [NETROM]: Implement G8PZT Circuit reset for NET/ROMRalf Baechle2005-09-122-2/+28
| | | | | | | | | | | | | | | | | NET/ROM is lacking a connection reset like TCP's RST flag which at times may result in a connecting having to slowly timing out instead of just being reset. An earlier attempt to reset the connection by sending a NR_CONNACK | NR_CHOKE_FLAG transport was inacceptable as it did result in crashes of BPQ systems. An alternative approach of introducing a new transport type 7 (NR_RESET) has be implemented several years ago in Paula Jayne Dowie G8PZT's Xrouter. Implement NR_RESET for Linux's NET/ROM but like any messing with the state engine consider this experimental for now and thus control it by a sysctl (net.netrom.reset) which for the time being defaults to off. Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [AX.25]: Add descriptions to constantsRalf Baechle2005-09-121-5/+5
| | | | | | | Comment the names used for the AX.25 state machine. Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [AX.25]: Add more PIDsRalf Baechle2005-09-121-5/+14
| | | | | | | | Add a few more PID definitions. AX.25 PIDs are the equivalent to IP protocol numbers. Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [AX.25]: Rename ax25_encapsulate to ax25_hard_headerRalf Baechle2005-09-121-1/+1
| | | | | | | | Rename ax25_encapsulate to ax25_hard_header which these days more accurately describes what the function is supposed to do. Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add netlink connector.Evgeniy Polyakov2005-09-112-0/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kernel connector - new userspace <-> kernel space easy to use communication module which implements easy to use bidirectional message bus using netlink as it's backend. Connector was created to eliminate complex skb handling both in send and receive message bus direction. Connector driver adds possibility to connect various agents using as one of it's backends netlink based network. One must register callback and identifier. When driver receives special netlink message with appropriate identifier, appropriate callback will be called. From the userspace point of view it's quite straightforward: socket(); bind(); send(); recv(); But if kernelspace want to use full power of such connections, driver writer must create special sockets, must know about struct sk_buff handling... Connector allows any kernelspace agents to use netlink based networking for inter-process communication in a significantly easier way: int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask); struct cb_id { __u32 idx; __u32 val; }; idx and val are unique identifiers which must be registered in connector.h for in-kernel usage. void (*callback) (void *) - is a callback function which will be called when message with above idx.val will be received by connector core. Using connector completely hides low-level transport layer from it's users. Connector uses new netlink ability to have many groups in one socket. [ Incorporating many cleanups and fixes by myself and Andrew Morton -DaveM ] Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Pull sn-features into release branchTony Luck2005-09-112-9/+84
|\
| * [IA64-SGI] Add new vendor-specific SAL calls for:Jack Steiner2005-08-312-9/+84
| | | | | | | | | | | | | | | | | | | | | | | | - notifying the PROM of specific features that are supported by the OS. This is used to enable PROM feature if and only if the corresponding feature is implemented in the OS - fetch feature sets that are supported by the current PROM. This allows the OS to selectively enable features when the PROM support is available. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | [IA64] MCA/INIT: remove obsolete unwind codeKeith Owens2005-09-111-7/+0
| | | | | | | | | | | | | | | | Delete the special case unwind code that was only used by the old MCA/INIT handler. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | [PATCH] MCA/INIT: use per cpu stacksKeith Owens2005-09-112-136/+91
| | | | | | | | | | | | | | | | | | | | | | | | The bulk of the change. Use per cpu MCA/INIT stacks. Change the SAL to OS state (sos) to be per process. Do all the assembler work on the MCA/INIT stacks, leaving the original stack alone. Pass per cpu state data to the C handlers for MCA and INIT, which also means changing the mca_drv interfaces slightly. Lots of verification on whether the original stack is usable before converting it to a sleeping process. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | [IA64] MCA/INIT: add an extra thread_info flagKeith Owens2005-09-112-1/+3
| | | | | | | | | | | | | | | | Add an extra thread_info flag to indicate the special MCA/INIT stacks. Mainly for debuggers. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | [PATCH] MCA/INIT: scheduler hooksKeith Owens2005-09-111-0/+2
| | | | | | | | | | | | | | Scheduler hooks to see/change which process is deemed to be on a cpu. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband Linus Torvalds2005-09-115-5/+141
|\ \
| * | [PATCH] IB CM: support CM redirJohn Kingman2005-09-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Changes to CM to support CM and port redirection (REJ reason 24). Signed-off-by: John Kingman <kingman <at> storagegear.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Make sure that userspace does not retrieve stale asynchronous orRoland Dreier2005-09-091-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | completion events after destroying a CQ, QP or SRQ. We do this by sweeping the event lists before returning from a destroy calls, and then return the number of events already reported before the destroy call. This allows userspace wait until it has processed all events for an object returned from the kernel before it frees its context for the object. The ABI of the destroy CQ, destroy QP and destroy SRQ commands has to change to return the event count, so bump the ABI version from 1 to 2. The userspace libibverbs library has already been updated to handle both the old and new ABI versions. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | [PATCH] IB: Add struct for ClassPortInfoRoland Dreier2005-09-091-0/+21
| | | | | | | | | | | | | | | | | | | | | Add structure definition for ClassPortInfo format. This is needed for (at least) handling CM redirects. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | [PATCH] IB: Move SA attributes to ib_sa.hHal Rosenstock2005-09-091-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | SA: Move SA attributes to ib_sa.h so are accessible to more than sa_query.c. Also, remove deprecated attributes and add one missing one. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | [PATCH] IB: Define more SA methodsHal Rosenstock2005-09-091-1/+5
| | | | | | | | | | | | | | | | | | | | | ib_sa.h: Define more SA methods (initially for madeye decode) Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | [PATCH] IB: Add user-supplied context to userspace CM ABISean Hefty2005-09-071-3/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Add user specified context to all uCM events. Users will not retrieve any events associated with the context after destroying the corresponding cm_id. - Provide the ib_cm_init_qp_attr() call to userspace clients of the CM. This call may be used to set QP attributes properly before modifying the QP. - Fixes some error handling synchonization and cleanup issues. - Performs some minor code cleanup. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | | Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Linus Torvalds2005-09-102-21/+37
|\ \ \
| * \ \ Merge davem@outer-richmond.davemloft.net:src/GIT/net-2.6/ David S. Miller2005-09-102-21/+37
| |\ \ \
| | * | | [IPV6]: Bring Type 0 routing header in-line with rfc3542.Brian Haley2005-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6-git-rfc3542 David S. Miller2005-09-101-20/+36
| | |\ \ \
| | | * | | [IPV6]: Note values allocated for ip6_tables.YOSHIFUJI Hideaki2005-09-101-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid future conflicts, add a note values allocated for ip6_tables. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| | | * | | [IPV6]: rearrange constants for new advanced API to solve conflicts.YOSHIFUJI Hideaki2005-09-101-20/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 64, 65 are already used in ip6_tables. Pointed out by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* | | | | | [PATCH] uml spinlock breakageAl Viro2005-09-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mingo missed that one... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input Linus Torvalds2005-09-101-0/+8
|\ \ \ \ \ \
| * \ \ \ \ \ Manual merge with LinusDmitry Torokhov2005-09-09640-13075/+11813
| |\ \ \ \ \ \ | | | |/ / / / | | |/| | | |
| * | | | | | Input: HID - add more consumer usagesVojtech Pavlik2005-09-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend mapping of the consumer usage page in hid-input.c to handle more cases appearing on new USB keyboards. Signed-off-by: Vojtech Pavlik <vojtech@suse.cz> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* | | | | | | [PATCH] uml: inline mk_pte and various friendsPaolo 'Blaisorblade' Giarrusso2005-09-102-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turns out that, for UML, a *lot* of VM-related trivial functions are not inlined but rather normal functions. In other sections of UML code, this is justified by having files which interact with the host and cannot therefore include kernel headers, but in this case there's no such justification. I've had to turn many of them to macros because of missing declarations. While doing this, I've decided to reuse some already existing macros. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | | [PATCH] i386 / uml: add dwarf sections to static link scriptPaolo 'Blaisorblade' Giarrusso2005-09-101-0/+38
| |_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inside the linker script, insert the code for DWARF debug info sections. This may help GDB'ing a Uml binary. Actually, it seems that ld is able to guess what I added correctly, but normal linker scripts include this section so it should be correct anyway adding it. On request by Sam Ravnborg <sam@ravnborg.org>, I've added it to asm-generic/vmlinux.lds.s. I've also moved there the stabs debug section, used the new macro in i386 linker script and added DWARF debug section to that. In the truth, I've not been able to verify the difference in GDB behaviour after this change (I've seen large improvements with another patch). This may depend on my binutils version, older one may have worse defaults. However, this section is present in normal linker script, so add it at least for the sake of cleanness. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | Merge master.kernel.org:/home/rmk/linux-2.6-arm Linus Torvalds2005-09-101-0/+3
|\ \ \ \ \ \
| * | | | | | [ARM] Add memory type based allocation syscallsRussell King2005-09-091-0/+3
| | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add syscall numbers and syscall table entries for mbind, set_mempolicy and get_mempolicy. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* | | | | | [PATCH] __user annotations (scsi/ch)viro@ZenIV.linux.org.uk2005-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] ppc32: support hotplug cpu on powermacsPaul Mackerras2005-09-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows cpus to be off-lined on 32-bit SMP powermacs. When a cpu is off-lined, it is put into sleep mode with interrupts disabled. It can be on-lined again by asserting its soft-reset pin, which is connected to a GPIO pin. With this I can off-line the second cpu in my dual G4 powermac, which means that I can then suspend the machine (the suspend/resume code refuses to suspend if more than one cpu is online, and making it cope with multiple cpus is surprisingly messy). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] ppc32: Kill init on unhandled synchronous signalsPaul Mackerras2005-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a patch that I have had in my tree for ages. If init causes an exception that raises a signal, such as a SIGSEGV, SIGILL or SIGFPE, and it hasn't registered a handler for it, we don't deliver the signal, since init doesn't get any signals that it doesn't have a handler for. But that means that we just return to userland and generate the same exception again immediately. With this patch we print a message and kill init in this situation. This is very useful when you have a bug in the kernel that means that init doesn't get as far as executing its first instruction. :) Without this patch the system hangs when it gets to starting the userland init; with it you at least get a message giving you a clue about what has gone wrong. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] time.h: remove ifdefsAndrew Morton2005-09-101-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove these ifdefs - there's no need to have more than one definition of these multipliers anywhere. Cc: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] include: update jiffies/{m,u}secs conversion functionsNishanth Aravamudan2005-09-102-20/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clarify the human-time units to jiffies conversion functions by using the constants in time.h. This makes many of the subsequent patches direct copies of the current code. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] add schedule_timeout_{,un}interruptible() interfacesNishanth Aravamudan2005-09-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add schedule_timeout_{,un}interruptible() interfaces so that schedule_timeout() callers don't have to worry about forgetting to add the set_current_state() call beforehand. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] i386/x86_64: make get_cpu_vendor() staticAdrian Bunk2005-09-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_cpu_vendor() no longer has any users in other files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] include/linux/bio.h: "extern inline" -> "static inline"Adrian Bunk2005-09-101-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] "extern inline" -> "static inline"Adrian Bunk2005-09-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] include/linux/blkdev.h: "extern inline" -> "static inline"Adrian Bunk2005-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] include/asm-i386/: "extern inline" -> "static inline"Adrian Bunk2005-09-102-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] include/asm-arm26/hardirq.h: remove #define irq_enter()Adrian Bunk2005-09-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes a #define for irq_enter that is superfluous due to a similar one in include/linux/hardirq.h. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] dmapool: Fix "nocast type" warningsVictor Fusco2005-09-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the sparse warning "implicit cast to nocast type" Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] lib/radix-tree: Fix "nocast type" warningsVictor Fusco2005-09-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the sparse warning "implicit cast to nocast type" Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] mm/slab: fix sparse warningsVictor Fusco2005-09-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the sparse warning "implicit cast to nocast type" Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] mm/filemap.c: make two functions staticAdrian Bunk2005-09-102-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With Nick Piggin <npiggin@suse.de> Give some things static scope. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] sched: TASK_NONINTERACTIVEIngo Molnar2005-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements a task state bit (TASK_NONINTERACTIVE), which can be used by blocking points to mark the task's wait as "non-interactive". This does not mean the task will be considered a CPU-hog - the wait will simply not have an effect on the waiting task's priority - positive or negative alike. Right now only pipe_wait() will make use of it, because it's a common source of not-so-interactive waits (kernel compilation jobs, etc.). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] cpuset semaphore depth check deadlock fixPaul Jackson2005-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cpusets-formalize-intermediate-gfp_kernel-containment patch has a deadlock problem. This patch was part of a set of four patches to make more extensive use of the cpuset 'mem_exclusive' attribute to manage kernel GFP_KERNEL memory allocations and to constrain the out-of-memory (oom) killer. A task that is changing cpusets in particular ways on a system when it is very short of free memory could double trip over the global cpuset_sem semaphore (get the lock and then deadlock trying to get it again). The second attempt to get cpuset_sem would be in the routine cpuset_zone_allowed(). This was discovered by code inspection. I can not reproduce the problem except with an artifically hacked kernel and a specialized stress test. In real life you cannot hit this unless you are manipulating cpusets, and are very unlikely to hit it unless you are rapidly modifying cpusets on a memory tight system. Even then it would be a rare occurence. If you did hit it, the task double tripping over cpuset_sem would deadlock in the kernel, and any other task also trying to manipulate cpusets would deadlock there too, on cpuset_sem. Your batch manager would be wedged solid (if it was cpuset savvy), but classic Unix shells and utilities would work well enough to reboot the system. The unusual condition that led to this bug is that unlike most semaphores, cpuset_sem _can_ be acquired while in the page allocation code, when __alloc_pages() calls cpuset_zone_allowed. So it easy to mistakenly perform the following sequence: 1) task makes system call to alter a cpuset 2) take cpuset_sem 3) try to allocate memory 4) memory allocator, via cpuset_zone_allowed, trys to take cpuset_sem 5) deadlock The reason that this is not a serious bug for most users is that almost all calls to allocate memory don't require taking cpuset_sem. Only some code paths off the beaten track require taking cpuset_sem -- which is good. Taking a global semaphore on the main code path for allocating memory would not scale well. This patch fixes this deadlock by wrapping the up() and down() calls on cpuset_sem in kernel/cpuset.c with code that tracks the nesting depth of the current task on that semaphore, and only does the real down() if the task doesn't hold the lock already, and only does the real up() if the nesting depth (number of unmatched downs) is exactly one. The previous required use of refresh_mems(), anytime that the cpuset_sem semaphore was acquired and the code executed while holding that semaphore might try to allocate memory, is no longer required. Two refresh_mems() calls were removed thanks to this. This is a good change, as failing to get all the necessary refresh_mems() calls placed was a primary source of bugs in this cpuset code. The only remaining call to refresh_mems() is made while doing a memory allocation, if certain task memory placement data needs to be updated from its cpuset, due to the cpuset having been changed behind the tasks back. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
OpenPOWER on IntegriCloud