summaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] mark f_ops const in the inodeArjan van de Ven2006-03-2811-22/+22
| | | | | | | | | | | Mark the f_ops members of inodes as const, as well as fix the ripple-through this causes by places that copy this f_ops and then "do stuff" with it. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] for_each_possible_cpu: fixes for generic partKAMEZAWA Hiroyuki2006-03-282-3/+3
| | | | | | | | replaces for_each_cpu with for_each_possible_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] for_each_possible_cpu: defines for_each_possible_cpuKAMEZAWA Hiroyuki2006-03-281-2/+3
| | | | | | | | | | | | | for_each_cpu() is a for-loop over cpu_possible_map. for_each_online_cpu is for-loop cpu over cpu_online_map. .....for_each_cpu() is not sufficiently explicit and can lead to mistakes. This patch adds for_each_possible_cpu() in preparation for the removal of for_each_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Optimize select/poll by putting small data sets on the stackAndi Kleen2006-03-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optimize select and poll by a using stack space for small fd sets This brings back an old optimization from Linux 2.0. Using the stack is faster than kmalloc. On a Intel P4 system it speeds up a select of a single pty fd by about 13% (~4000 cycles -> ~3500) It also saves memory because a daemon hanging in select or poll will usually save one or two less pages. This can add up - e.g. if you have 10 daemons blocking in poll/select you save 40KB of memory. I did a patch for this long ago, but it was never applied. This version is a reimplementation of the old patch that tries to be less intrusive. I only did the minimal changes needed for the stack allocation. The cut off point before external memory is allocated is currently at 832bytes. The system calls always allocate this much memory on the stack. These 832 bytes are divided into 256 bytes frontend data (for the select bitmaps of the pollfds) and the rest of the space for the wait queues used by the low level drivers. There are some extreme cases where this won't work out for select and it falls back to allocating memory too early - especially with very sparse large select bitmaps - but the majority of processes who only have a small number of file descriptors should be ok. [TBD: 832/256 might not be the best split for select or poll] I suspect more optimizations might be possible, but they would be more complicated. One way would be to cache the select/poll context over multiple system calls because typically the input values should be similar. Problem is when to flush the file descriptors out though. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Small fixes backported to old IDE SiS driverAlan Cox2006-03-281-0/+1
| | | | | | | | | | | | | | | | | Some quick backport bits from the libata PATA work to fix things found in the sis driver. The piix driver needs some fixes too but those are way to large and need someone working on old IDE with time to do them. This patch fixes the case where random bits get loaded into SIS timing registers according to the description of the correct behaviour from Vojtech Pavlik. It also adds the SiS5517 ATA16 chipset which is not currently supported by the driver. Thanks to Conrad Harriss for loaning me the machine with the 5517 chipset. Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] remove relayfs_fs.hAndrew Morton2006-03-281-287/+0
| | | | | | | | | | This is obsolete. Cc: Tom Zanussi <zanussi@us.ibm.com> Cc: Jens Axboe <axboe@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/fat/: proper prototypes for two functionsAdrian Bunk2006-03-281-0/+3
| | | | | | | | | | Add proper prototypes for fat_cache_init() and fat_cache_destroy() in msdos_fs.h. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Add oprofile_add_ext_sampleBrian Rogan2006-03-281-0/+10
| | | | | | | | | | | | | | On ppc64 we look at a profiling register to work out the sample address and if it was in userspace or kernel. The backtrace interface oprofile_add_sample does not allow this. Create oprofile_add_ext_sample and make oprofile_add_sample use it too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Philippe Elie <phil.el@wanadoo.fr> Cc: John Levon <levon@movementarian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] synclink_gt add gpio featurePaul Fulghum2006-03-281-1/+10
| | | | | | | | | Add driver support for general purpose I/O feature of the Synclink GT adapters. Signed-off-by: Paul Fulghum <paulkf@micrgate.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds2006-03-271-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Don't make debugfs depend on DEBUG_KERNEL [PATCH] Fix blktrace compile with sysfs not defined [PATCH] unused label in drivers/block/cciss. [BLOCK] increase size of disk stat counters [PATCH] blk_execute_rq_nowait-speedup [PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion [PATCH] kzalloc() conversion in drivers/block [PATCH] update max_sectors documentation
| * [BLOCK] increase size of disk stat countersBen Woodard2006-03-271-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel's representation of the disk statistics uses the type unsigned which is 32b on both 32b and 64b platforms. Unfortunately, most system tools that work with these numbers that are exported in /proc/diskstats including iostat read these numbers into unsigned longs. This works fine on 32b platforms and when the number of IO transactions are small on 64b platforms. However, when the numbers wrap on 64b platforms & you read the numbers into unsigned longs, and compare the numbers to previous readings, then you get an unsigned representation of a negative number. This looks like a very large 64b number & gives you bizarre readouts in iostat: ilc4: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util ilc4: sda 5.50 0.00 143.96 0.00 307496983987862656.00 0.00 153748491993931328.00 0.00 2136028725038430.00 7.94 55.12 5.59 80.42 Though fixing iostat in user space is possible, and a quick survey indicates that several other similar tools also use unsigned longs when processing /proc/diskstats. Therefore, it seems like a better approach would be to extend the length of the disk_stats structure on 64b architectures to 64b. The following patch does that. It should not affect the operation on 32b platforms. Signed-off-by: Ben Woodard <woodard@redhat.com> Cc: Rick Lindsley <ricklind@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jens Axboe <axboe@suse.de>
* | [PATCH] md: Convert reconfig_sem to reconfig_mutexNeilBrown2006-03-271-1/+1
| | | | | | | | | | | | | | | | ... being careful that mutex_trylock is inverted wrt down_trylock Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Support suspending of IO to regions of an md arrayNeilBrown2006-03-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows user-space to access data safely. This is needed for raid5 reshape as user-space needs to take a backup of the first few stripes before allowing reshape to commence. It will also be useful in cluster-aware raid1 configurations so that all cluster members can leave a section of the array untouched while a resync/recovery happens. A 'start' and 'end' of the suspended range are written to 2 sysfs attributes. Note that only one range can be suspended at a time. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Split reshape handler in check_reshape and start_reshapeNeilBrown2006-03-271-1/+2
| | | | | | | | | | | | | | | | | | | | check_reshape checks validity and does things that can be done instantly - like adding devices to raid1. start_reshape initiates a restriping process to convert the whole array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Only checkpoint expansion progress occasionallyNeilBrown2006-03-271-0/+3
| | | | | | | | | | | | | | | | | | | | Instead of checkpointing at each stripe, only checkpoint when a new write would overwrite uncheckpointed data. Block any write to the uncheckpointed area. Arbitrarily checkpoint at least every 3Meg. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Checkpoint and allow restart of raid5 reshapeNeilBrown2006-03-274-3/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We allow the superblock to record an 'old' and a 'new' geometry, and a position where any conversion is up to. The geometry allows for changing chunksize, layout and level as well as number of devices. When using verion-0.90 superblock, we convert the version to 0.91 while the conversion is happening so that an old kernel will refuse the assemble the array. For version-1, we use a feature bit for the same effect. When starting an array we check for an incomplete reshape and restart the reshape process if needed. If the reshape stopped at an awkward time (like when updating the first stripe) we refuse to assemble the array, and let user-space worry about it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Final stages of raid5 expand codeNeilBrown2006-03-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds raid5_reshape and end_reshape which will start and finish the reshape processes. raid5_reshape is only enabled in CONFIG_MD_RAID5_RESHAPE is set, to discourage accidental use. Read the 'help' for the CONFIG_MD_RAID5_RESHAPE entry. and Make sure that you have backups, just in case. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Core of raid5 resize processNeilBrown2006-03-272-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides the core of the resize/expand process. sync_request notices if a 'reshape' is happening and acts accordingly. It allocated new stripe_heads for the next chunk-wide-stripe in the target geometry, marking them STRIPE_EXPANDING. Then it finds which stripe heads in the old geometry can provide data needed by these and marks them STRIPE_EXPAND_SOURCE. This causes stripe_handle to read all blocks on those stripes. Once all blocks on a STRIPE_EXPAND_SOURCE stripe_head are read, any that are needed are copied into the corresponding STRIPE_EXPANDING stripe_head. Once a STRIPE_EXPANDING stripe_head is full, it is marks STRIPE_EXPAND_READY and then is written out and released. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Infrastructure to allow normal IO to continue while array is ↵NeilBrown2006-03-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | expanding We need to allow that different stripes are of different effective sizes, and use the appropriate size. Also, when a stripe is being expanded, we must block any IO attempts until the stripe is stable again. Key elements in this change are: - each stripe_head gets a 'disk' field which is part of the key, thus there can sometimes be two stripe heads of the same area of the array, but covering different numbers of devices. One of these will be marked STRIPE_EXPANDING and so won't accept new requests. - conf->expand_progress tracks how the expansion is progressing and is used to determine whether the target part of the array has been expanded yet or not. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Allow stripes to be expanded in preparation for expanding an arrayNeilBrown2006-03-271-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before a RAID-5 can be expanded, we need to be able to expand the stripe-cache data structure. This requires allocating new stripes in a new kmem_cache. If this succeeds, we copy cache pages over and release the old stripes and kmem_cache. We then allocate new pages. If that fails, we leave the stripe cache at it's new size. It isn't worth the effort to shrink it back again. Unfortuanately this means we need two kmem_cache names as we, for a short period of time, we have two kmem_caches. So they are raid5/%s and raid5/%s-alt Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] md: Split disks array out of raid5 conf structure so it is easier to ↵NeilBrown2006-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | grow The remainder of this batch implements raid5 reshaping. Currently the only shape change that is supported is added a device, but it is envisioned that changing the chunksize and layout will also be supported, as well as changing the level (e.g. 1->5, 5->6). The reshape process naturally has to move all of the data in the array, and so should be used with caution. It is believed to work, and some testing does support this, but wider testing would be great for increasing my confidence. You will need a version of mdadm newer than 2.3.1 to make use of raid5 growth. This is because mdadm need to take a copy of a 'critical section' at the start of the array incase there is a crash at an awkward moment. On restart, mdadm will restore the critical section and allow reshape to continue. I hope to release a 2.4-pre by early next week - it still needs a little more polishing. This patch: Previously the array of disk information was included in the raid5 'conf' structure which was allocated to an appropriate size. This makes it awkward to change the size of that array. So we split it off into a separate kmalloced array which will require a little extra indexing, but is much easier to grow. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] dm/md dependency tree in sysfs: bd_claim_by_kobjectJun'ichi Nomura2006-03-271-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | Adding bd_claim_by_kobject() function which takes kobject as additional signature of holder device and creates sysfs symlinks between holder device and claimed device. bd_release_from_kobject() is a counterpart of bd_claim_by_kobject. Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] dm-md-dependency-tree-in-sysfs-holders-slaves-subdirectory-tidyAndrew Morton2006-03-271-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove all the CONFIG_SYSFS stuff. That's supposed to all be implemented up in header files. Yes, the CONFIG_SYSFS=n data structures will be a little larger than necessary, but that's a tradeoff we can decide to make. Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] dm/md dependency tree in sysfs: holders/slaves subdirectoryJun'ichi Nomura2006-03-271-0/+7
| | | | | | | | | | | | | | | | | | | | Creating "slaves" and "holders" directories in /sys/block/<disk> and creating "holders" directory under /sys/block/<disk>/<partition> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] dm store geometryDarrick J. Wong2006-03-272-2/+17
| | | | | | | | | | | | | | | | | | | | | | Allow drive geometry to be stored with a new DM_DEV_SET_GEOMETRY ioctl. Device-mapper will now respond to HDIO_GETGEO. If the geometry information is not available, zero will be returned for all of the parameters. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] dm: make sure QUEUE_FLAG_CLUSTER is set properlyNeilBrown2006-03-271-0/+1
| | | | | | | | | | | | | | | | | | | | This flag should be set for a virtual device iff it is set for all underlying devices. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Add ID for Quadro NVS280Pavel Roskin2006-03-271-0/+1
| | | | | | | | | | | | | | | | | | | | Quadro NVS280 is a dual-head PCIe card with PCI ID 10de:00fd and subsystem ID 10de:0215. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] RTC subsystem: M48T86 driverAlessandro Zummo2006-03-271-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | Add a driver for the ST M48T86 / Dallas DS12887 RTC. This is a platform driver. The platform device must provide I/O routines to access the RTC. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] RTC subsystem: I2C driver idsAlessandro Zummo2006-03-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | This patch adds the I2C driver ids to i2c-id.h in preparation of the I2C direct probing method. This is kept separate so that it can be integrated to Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] RTC subsystem: I2C cleanupAlessandro Zummo2006-03-271-31/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch, completely optional, removes from drivers/i2c/chips all the drivers that are implemented in the new RTC subsystem. It should be noted that none of the current driver is actually integrated, i.e. usable without further patches. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] RTC subsystem: classAlessandro Zummo2006-03-271-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the basic RTC subsystem infrastructure to the kernel. rtc/class.c - registration facilities for RTC drivers rtc/interface.c - kernel/rtc interface functions rtc/hctosys.c - snippet of code that copies hw clock to sw clock at bootup, if configured to do so. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] RTC Subsystem: library functionsAlessandro Zummo2006-03-271-0/+5
| | | | | | | | | | | | | | | | RTC and date/time related functions. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Notifier chain update: API changesAlan Stern2006-03-275-18/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel's implementation of notifier chains is unsafe. There is no protection against entries being added to or removed from a chain while the chain is in use. The issues were discussed in this thread: http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 We noticed that notifier chains in the kernel fall into two basic usage classes: "Blocking" chains are always called from a process context and the callout routines are allowed to sleep; "Atomic" chains can be called from an atomic context and the callout routines are not allowed to sleep. We decided to codify this distinction and make it part of the API. Therefore this set of patches introduces three new, parallel APIs: one for blocking notifiers, one for atomic notifiers, and one for "raw" notifiers (which is really just the old API under a new name). New kinds of data structures are used for the heads of the chains, and new routines are defined for registration, unregistration, and calling a chain. The three APIs are explained in include/linux/notifier.h and their implementation is in kernel/sys.c. With atomic and blocking chains, the implementation guarantees that the chain links will not be corrupted and that chain callers will not get messed up by entries being added or removed. For raw chains the implementation provides no guarantees at all; users of this API must provide their own protections. (The idea was that situations may come up where the assumptions of the atomic and blocking APIs are not appropriate, so it should be possible for users to handle these things in their own way.) There are some limitations, which should not be too hard to live with. For atomic/blocking chains, registration and unregistration must always be done in a process context since the chain is protected by a mutex/rwsem. Also, a callout routine for a non-raw chain must not try to register or unregister entries on its own chain. (This did happen in a couple of places and the code had to be changed to avoid it.) Since atomic chains may be called from within an NMI handler, they cannot use spinlocks for synchronization. Instead we use RCU. The overhead falls almost entirely in the unregister routine, which is okay since unregistration is much less frequent that calling a chain. Here is the list of chains that we adjusted and their classifications. None of them use the raw API, so for the moment it is only a placeholder. ATOMIC CHAINS ------------- arch/i386/kernel/traps.c: i386die_chain arch/ia64/kernel/traps.c: ia64die_chain arch/powerpc/kernel/traps.c: powerpc_die_chain arch/sparc64/kernel/traps.c: sparc64die_chain arch/x86_64/kernel/traps.c: die_chain drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list kernel/panic.c: panic_notifier_list kernel/profile.c: task_free_notifier net/bluetooth/hci_core.c: hci_notifier net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain net/ipv6/addrconf.c: inet6addr_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain net/netlink/af_netlink.c: netlink_chain BLOCKING CHAINS --------------- arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain arch/s390/kernel/process.c: idle_chain arch/x86_64/kernel/process.c idle_notifier drivers/base/memory.c: memory_chain drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list drivers/macintosh/adb.c: adb_client_list drivers/macintosh/via-pmu.c sleep_notifier_list drivers/macintosh/via-pmu68k.c sleep_notifier_list drivers/macintosh/windfarm_core.c wf_client_list drivers/usb/core/notify.c usb_notifier_list drivers/video/fbmem.c fb_notifier_list kernel/cpu.c cpu_chain kernel/module.c module_notify_list kernel/profile.c munmap_notifier kernel/profile.c task_exit_notifier kernel/sys.c reboot_notifier_list net/core/dev.c netdev_chain net/decnet/dn_dev.c: dnaddr_chain net/ipv4/devinet.c: inetaddr_chain It's possible that some of these classifications are wrong. If they are, please let us know or submit a patch to fix them. Note that any chain that gets called very frequently should be atomic, because the rwsem read-locking used for blocking chains is very likely to incur cache misses on SMP systems. (However, if the chain's callout routines may sleep then the chain cannot be atomic.) The patch set was written by Alan Stern and Chandra Seetharaman, incorporating material written by Keith Owens and suggestions from Paul McKenney and Andrew Morton. [jes@sgi.com: restructure the notifier chain initialization macros] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] lightweight robust futexes updates 2Ingo Molnar2006-03-271-10/+4
| | | | | | | | | | | | | | | | | | | | | | futex.h updates: - get rid of FUTEX_OWNER_PENDING - it's not used - reduce ROBUST_LIST_LIMIT to a saner value Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] lightweight robust futexes updatesIngo Molnar2006-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - fix: initialize the robust list(s) to NULL in copy_process. - doc update - cleanup: rename _inuser to _inatomic - __user cleanups and other small cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] lightweight robust futexes: compatIngo Molnar2006-03-272-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | 32-bit syscall compatibility support. (This patch also moves all futex related compat functionality into kernel/futex_compat.c.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] lightweight robust futexes: coreIngo Molnar2006-03-273-1/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | Add the core infrastructure for robust futexes: structure definitions, the new syscalls and the do_exit() based cleanup mechanism. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Acked-by: Ulrich Drepper <drepper@redhat.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] unify PFN_* macrosDave Hansen2006-03-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just about every architecture defines some macros to do operations on pfns. They're all virtually identical. This patch consolidates all of them. One minor glitch is that at least i386 uses them in a very skeletal header file. To keep away from #include dependency hell, I stuck the new definitions in a new, isolated header. Of all of the implementations, sh64 is the only one that varied by a bit. It used some masks to ensure that any sign-extension got ripped away before the arithmetic is done. This has been posted to that sh64 maintainers and the development list. Compiles on x86, x86_64, ia64 and ppc64. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] uninline zone helpersKAMEZAWA Hiroyuki2006-03-271-35/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Helper functions for for_each_online_pgdat/for_each_zone look too big to be inlined. Speed of these helper macro itself is not very important. (inner loops are tend to do more work than this) This patch make helper function to be out-of-lined. inline out-of-line .text 005c0680 005bf6a0 005c0680 - 005bf6a0 = FE0 = 4Kbytes. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] for_each_online_pgdat: remove pgdat_listKAMEZAWA Hiroyuki2006-03-271-3/+0
| | | | | | | | | | | | | | | | | | By using for_each_online_pgdat(), pgdat_list is not necessary now. This patch removes it. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] for_each_online_pgdat: for_each_bootmemKAMEZAWA Hiroyuki2006-03-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a list_head to bootmem_data_t and make bootmems use it. bootmem list is sorted by node_boot_start. Only nodes against which init_bootmem() is called are linked to the list. (i386 allocates bootmem only from one node(0) not from all online nodes.) A summary: 1. for_each_online_pgdat() traverses all *online* nodes. 2. alloc_bootmem() allocates memory only from initialized-for-bootmem nodes. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] define for_each_online_pgdatKAMEZAWA Hiroyuki2006-03-272-51/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch defines for_each_online_pgdat() as a replacement of for_each_pgdat() Now, online nodes are managed by node_online_map. But for_each_pgdat() uses pgdat_link to iterate over all nodes(pgdat). This means management structure for online pgdat is duplicated. I think using node_online_map for for_each_pgdat() is simple and sane rather ather than pgdat_link. New macro is named as for_each_online_pgdat(). Following patch will fix callers of for_each_pgdat(). The bootmem allocater uses for_each_pgdat() before pgdat initialization. I don't think it's sane. Following patch will fix it. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] remove zone_mem_mapKAMEZAWA Hiroyuki2006-03-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes zone_mem_map. pfn_to_page uses pgdat, page_to_pfn uses zone. page_to_pfn can use pgdat instead of zone, which is only one user of zone_mem_map. By modifing it, we can remove zone_mem_map. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] unify pfn_to_page: generic functionsKAMEZAWA Hiroyuki2006-03-271-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are 3 memory models, FLATMEM, DISCONTIGMEM, SPARSEMEM. Each arch has its own page_to_pfn(), pfn_to_page() for each models. But most of them can use the same arithmetic. This patch adds asm-generic/memory_model.h, which includes generic page_to_pfn(), pfn_to_page() definitions for each memory model. When CONFIG_OUT_OF_LINE_PFN_TO_PAGE=y, out-of-line functions are used instead of macro. This is enabled by some archs and reduces text size. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] sched: new sched domain for representing multi-coreSiddha, Suresh B2006-03-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new sched domain for representing multi-core with shared caches between cores. Consider a dual package system, each package containing two cores and with last level cache shared between cores with in a package. If there are two runnable processes, with this appended patch those two processes will be scheduled on different packages. On such systems, with this patch we have observed 8% perf improvement with specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2 users). This new domain will come into play only on multi-core systems with shared caches. On other systems, this sched domain will be removed by domain degeneration code. This new domain can be also used for implementing power savings policy (see OLS 2005 CMP kernel scheduler paper for more details.. I will post another patch for power savings policy soon) Most of the arch/* file changes are for cpu_coregroup_map() implementation. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fs/nfsd/export.c,net/sunrpc/cache.c: make needlessly global code staticAdrian Bunk2006-03-272-5/+1
| | | | | | | | | | | | | | | | | | | | We can now make some code static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] knfsd: Convert sunrpc_cache to use krefsNeilBrown2006-03-272-10/+7
| | | | | | | | | | | | | | | | .. it makes some of the code nicer. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] knfsd: Unexport cache_fresh and fix a small raceNeilBrown2006-03-271-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | Cache_fresh is now only used in cache.c, so unexport it. Part of cache_fresh (setting CACHE_VALID) should really be done under the lock, while part (calling cache_revisit_request etc) must be done outside the lock. So we split it up appropriately. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] knfsd: Remove DefineCacheLookupNeilBrown2006-03-271-113/+0
| | | | | | | | | | | | | | | | This has been replaced by more traditional code. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] knfsd: Create cache_lookup function instead of using a macro to ↵NeilBrown2006-03-271-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | declare one The C++-like 'template' approach proves to be too ugly and hard to work with. The old 'template' won't go away until all users are updated. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
OpenPOWER on IntegriCloud