Diffstat (limited to 'Documentation')
92 files changed, 2631 insertions, 439 deletions
diff --git a/Documentation/ABI/stable/sysfs-bus-xen-backend b/Documentation/ABI/stable/sysfs-bus-xen-backend new file mode 100644 index 0000000..3d5951c --- /dev/null +++ b/Documentation/ABI/stable/sysfs-bus-xen-backend @@ -0,0 +1,75 @@ +What: /sys/bus/xen-backend/devices/*/devtype +Date: Feb 2009 +KernelVersion: 2.6.38 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The type of the device. e.g., one of: 'vbd' (block), + 'vif' (network), or 'vfb' (framebuffer). + +What: /sys/bus/xen-backend/devices/*/nodename +Date: Feb 2009 +KernelVersion: 2.6.38 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + XenStore node (under /local/domain/NNN/) for this + backend device. + +What: /sys/bus/xen-backend/devices/vbd-*/physical_device +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The major:minor number (in hexadecimal) of the + physical device providing the storage for this backend + block device. + +What: /sys/bus/xen-backend/devices/vbd-*/mode +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Whether the block device is read-only ('r') or + read-write ('w'). + +What: /sys/bus/xen-backend/devices/vbd-*/statistics/f_req +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of flush requests from the frontend. + +What: /sys/bus/xen-backend/devices/vbd-*/statistics/oo_req +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of requests delayed because the backend was too + busy processing previous requests. + +What: /sys/bus/xen-backend/devices/vbd-*/statistics/rd_req +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of read requests from the frontend. 
+ +What: /sys/bus/xen-backend/devices/vbd-*/statistics/rd_sect +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of sectors read by the frontend. + +What: /sys/bus/xen-backend/devices/vbd-*/statistics/wr_req +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of write requests from the frontend. + +What: /sys/bus/xen-backend/devices/vbd-*/statistics/wr_sect +Date: April 2011 +KernelVersion: 3.0 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Number of sectors written by the frontend. diff --git a/Documentation/ABI/stable/sysfs-devices-system-xen_memory b/Documentation/ABI/stable/sysfs-devices-system-xen_memory new file mode 100644 index 0000000..caa311d --- /dev/null +++ b/Documentation/ABI/stable/sysfs-devices-system-xen_memory @@ -0,0 +1,77 @@ +What: /sys/devices/system/xen_memory/xen_memory0/max_retry_count +Date: May 2011 +KernelVersion: 2.6.39 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The maximum number of times the balloon driver will + attempt to increase the balloon before giving up. See + also 'retry_count' below. + A value of zero means retry forever and is the default one. + +What: /sys/devices/system/xen_memory/xen_memory0/max_schedule_delay +Date: May 2011 +KernelVersion: 2.6.39 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The limit that 'schedule_delay' (see below) will be + increased to. The default value is 32 seconds. + +What: /sys/devices/system/xen_memory/xen_memory0/retry_count +Date: May 2011 +KernelVersion: 2.6.39 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The current number of times that the balloon driver + has attempted to increase the size of the balloon. + The default value is one. 
With max_retry_count being + zero (unlimited), this means that the driver will attempt + to retry with a 'schedule_delay' delay. + +What: /sys/devices/system/xen_memory/xen_memory0/schedule_delay +Date: May 2011 +KernelVersion: 2.6.39 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The time (in seconds) to wait between attempts to + increase the balloon. Each time the balloon cannot be + increased, 'schedule_delay' is increased (until + 'max_schedule_delay' is reached at which point it + will use the max value). + +What: /sys/devices/system/xen_memory/xen_memory0/target +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + The target number of pages to adjust this domain's + memory reservation to. + +What: /sys/devices/system/xen_memory/xen_memory0/target_kb +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + As target above, except the value is in KiB. + +What: /sys/devices/system/xen_memory/xen_memory0/info/current_kb +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Current size (in KiB) of this domain's memory + reservation. + +What: /sys/devices/system/xen_memory/xen_memory0/info/high_kb +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Amount (in KiB) of high memory in the balloon. + +What: /sys/devices/system/xen_memory/xen_memory0/info/low_kb +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> +Description: + Amount (in KiB) of low (or normal) memory in the + balloon. 
diff --git a/Documentation/ABI/testing/sysfs-bus-rbd b/Documentation/ABI/testing/sysfs-bus-rbd index fa72ccb..dbedafb 100644 --- a/Documentation/ABI/testing/sysfs-bus-rbd +++ b/Documentation/ABI/testing/sysfs-bus-rbd @@ -57,13 +57,6 @@ create_snap $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create -rollback_snap - - Rolls back data to the specified snapshot. This goes over the entire - list of rados blocks and sends a rollback command to each. - - $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_rollback - snap_* A directory per each snapshot diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb index e647378..b4f5487 100644 --- a/Documentation/ABI/testing/sysfs-bus-usb +++ b/Documentation/ABI/testing/sysfs-bus-usb @@ -119,6 +119,31 @@ Description: Write a 1 to force the device to disconnect (equivalent to unplugging a wired USB device). +What: /sys/bus/usb/drivers/.../new_id +Date: October 2011 +Contact: linux-usb@vger.kernel.org +Description: + Writing a device ID to this file will attempt to + dynamically add a new device ID to a USB device driver. + This may allow the driver to support more hardware than + was included in the driver's static device ID support + table at compile time. The format for the device ID is: + idVendor idProduct bInterfaceClass. + The vendor ID and device ID fields are required, the + interface class is optional. + Upon successfully adding an ID, the driver will probe + for the device and attempt to bind to it. For example: + # echo "8086 10f5" > /sys/bus/usb/drivers/foo/new_id + +What: /sys/bus/usb-serial/drivers/.../new_id +Date: October 2011 +Contact: linux-usb@vger.kernel.org +Description: + For serial USB drivers, this attribute appears under the + extra bus folder "usb-serial" in sysfs; apart from that + difference, all descriptions from the entry + "/sys/bus/usb/drivers/.../new_id" apply. 
+ What: /sys/bus/usb/drivers/.../remove_id Date: November 2009 Contact: CHENG Renquan <rqcheng@smu.edu.sg> diff --git a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff index 9aec8ef..167d903 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff +++ b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff @@ -1,7 +1,7 @@ What: /sys/module/hid_logitech/drivers/hid:logitech/<dev>/range. Date: July 2011 KernelVersion: 3.2 -Contact: Michal Malý <madcatxster@gmail.com> +Contact: Michal Malý <madcatxster@gmail.com> Description: Display minimum, maximum and current range of the steering wheel. Writing a value within min and max boundaries sets the range of the wheel. diff --git a/Documentation/ABI/testing/sysfs-driver-hid-multitouch b/Documentation/ABI/testing/sysfs-driver-hid-multitouch new file mode 100644 index 0000000..f79839d --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-hid-multitouch @@ -0,0 +1,9 @@ +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/quirks +Date: November 2011 +Contact: Benjamin Tissoires <benjamin.tissoires@gmail.com> +Description: The integer value of this attribute corresponds to the + quirks actually in place to handle the device's protocol. + When read, this attribute returns the current settings (see + MT_QUIRKS_* in hid-multitouch.c). + When written, this attribute changes the quirks on the fly, + and with them the protocol used to handle the device. 
diff --git a/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku b/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku new file mode 100644 index 0000000..189dc43 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku @@ -0,0 +1,135 @@ +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/actual_profile +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The integer value of this attribute ranges from 0-4. + When read, this attribute returns the number of the actual + profile. This value is persistent, so it's equivalent to the + profile that's active when the device is powered on next time. + When written, this file sets the number of the startup profile + and the device activates this profile immediately. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/info +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When read, this file returns general data like firmware version. + The data is 6 bytes long. + This file is readonly. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/key_mask +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one deactivate certain keys like + windows and application keys, to prevent accidental presses. + The profile number to which these settings apply is included in + the written data. The data has to be 6 bytes long. + Before reading this file, control has to be written to select + which profile to read. 
+Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_capslock +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the + capslock key for a specific profile. Profile number is included + in written data. The data has to be 6 bytes long. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_easyzone +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the + easyzone keys for a specific profile. Profile number is included + in written data. The data has to be 65 bytes long. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_function +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the + function keys for a specific profile. Profile number is included + in written data. The data has to be 41 bytes long. + Before reading this file, control has to be written to select + which profile to read. 
+Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_macro +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the macro + keys for a specific profile. Profile number is included in + written data. The data has to be 35 bytes long. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_media +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the media + keys for a specific profile. Profile number is included in + written data. The data has to be 29 bytes long. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/keys_thumbster +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the function of the + thumbster keys for a specific profile. Profile number is included + in written data. The data has to be 23 bytes long. + Before reading this file, control has to be written to select + which profile to read. 
+Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/last_set +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the time in secs since + epoch in which the last configuration took place. + The data has to be 20 bytes long. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/light +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one set the backlight intensity for + a specific profile. Profile number is included in written data. + The data has to be 10 bytes long. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/macro +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one store macros with max 500 + keystrokes for a specific button for a specific profile. + Button and profile numbers are included in written data. + The data has to be 2083 bytes long. + Before reading this file, control has to be written to select + which profile and key to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/control +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one select which data from which + profile will be read next. The data has to be 3 bytes long. + This file is writeonly. 
+Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/talk +Date: June 2011 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one trigger easyshift functionality + from the host. + The data has to be 16 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net diff --git a/Documentation/ABI/testing/sysfs-driver-hid-wiimote b/Documentation/ABI/testing/sysfs-driver-hid-wiimote index 5d5a16e..3d98009 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-wiimote +++ b/Documentation/ABI/testing/sysfs-driver-hid-wiimote @@ -8,3 +8,15 @@ Contact: David Herrmann <dh.herrmann@googlemail.com> Description: Make it possible to set/get current led state. Reading from it returns 0 if led is off and 1 if it is on. Writing 0 to it disables the led, writing 1 enables it. + +What: /sys/bus/hid/drivers/wiimote/<dev>/extension +Date: August 2011 +KernelVersion: 3.2 +Contact: David Herrmann <dh.herrmann@googlemail.com> +Description: This file contains the currently connected and initialized + extensions. It can be one of: none, motionp, nunchuck, classic, + motionp+nunchuck, motionp+classic + motionp is the official Nintendo Motion+ extension, nunchuck is + the official Nintendo Nunchuck extension and classic is the + Nintendo Classic Controller extension. The motionp extension can + be combined with the other two. 
diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl index 08ff908..24979f6 100644 --- a/Documentation/DocBook/debugobjects.tmpl +++ b/Documentation/DocBook/debugobjects.tmpl @@ -96,6 +96,7 @@ <listitem><para>debug_object_deactivate</para></listitem> <listitem><para>debug_object_destroy</para></listitem> <listitem><para>debug_object_free</para></listitem> + <listitem><para>debug_object_assert_init</para></listitem> </itemizedlist> Each of these functions takes the address of the real object and a pointer to the object type specific debug description @@ -273,6 +274,26 @@ debug checks. </para> </sect1> + + <sect1 id="debug_object_assert_init"> + <title>debug_object_assert_init</title> + <para> + This function is called to assert that an object has been + initialized. + </para> + <para> + When the real object is not tracked by debugobjects, it calls + fixup_assert_init of the object type description structure + provided by the caller, with the hardcoded object state + ODEBUG_STATE_NOTAVAILABLE. The fixup function can correct the problem + by calling debug_object_init and other specific initializing + functions. + </para> + <para> + When the real object is already tracked by debugobjects it is + ignored. + </para> + </sect1> </chapter> <chapter id="fixupfunctions"> <title>Fixup functions</title> @@ -381,6 +402,35 @@ statistics. </para> </sect1> + <sect1 id="fixup_assert_init"> + <title>fixup_assert_init</title> + <para> + This function is called from the debug code whenever a problem + in debug_object_assert_init is detected. + </para> + <para> + Called from debug_object_assert_init() with a hardcoded state + ODEBUG_STATE_NOTAVAILABLE when the object is not found in the + debug bucket. + </para> + <para> + The function returns 1 when the fixup was successful, + otherwise 0. The return value is used to update the + statistics. + </para> + <para> + Note, this function should make sure debug_object_init() is + called before returning. 
+ </para> + <para> + The handling of statically initialized objects is a special + case. The fixup function should check if this is a legitimate + case of a statically initialized object or not. In this case only + debug_object_init() should be called to make the object known to + the tracker. Then the function should return 0 because this is not + a real fixup. + </para> + </sect1> </chapter> <chapter id="bugs"> <title>Known Bugs And Assumptions</title> diff --git a/Documentation/HOWTO b/Documentation/HOWTO index 81bc1a9..f7ade3b 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -275,8 +275,8 @@ versions. If no 2.6.x.y kernel is available, then the highest numbered 2.6.x kernel is the current stable kernel. -2.6.x.y are maintained by the "stable" team <stable@kernel.org>, and are -released as needs dictate. The normal release period is approximately +2.6.x.y are maintained by the "stable" team <stable@vger.kernel.org>, and +are released as needs dictate. The normal release period is approximately two weeks, but it can be longer if there are no pressing problems. A security-related problem, instead, can cause a release to happen almost instantly. diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 0c134f8..bff2d8b 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -328,6 +328,12 @@ over a rather long period of time, but improvements are always welcome! RCU rather than SRCU, because RCU is almost always faster and easier to use than is SRCU. + If you need to enter your read-side critical section in a + hardirq or exception handler, and then exit that same read-side + critical section in the task that was interrupted, then you need + to use srcu_read_lock_raw() and srcu_read_unlock_raw(), which avoid + the lockdep checking that would otherwise make this practice illegal. 
+ Also unlike other forms of RCU, explicit initialization and cleanup is required via init_srcu_struct() and cleanup_srcu_struct(). These are passed a "struct srcu_struct" diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index 3185270..bf77833 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt @@ -38,11 +38,11 @@ o How can the updater tell when a grace period has completed Preemptible variants of RCU (CONFIG_TREE_PREEMPT_RCU) get the same effect, but require that the readers manipulate CPU-local - counters. These counters allow limited types of blocking - within RCU read-side critical sections. SRCU also uses - CPU-local counters, and permits general blocking within - RCU read-side critical sections. These two variants of - RCU detect grace periods by sampling these counters. + counters. These counters allow limited types of blocking within + RCU read-side critical sections. SRCU also uses CPU-local + counters, and permits general blocking within RCU read-side + critical sections. These variants of RCU detect grace periods + by sampling these counters. o If I am running on a uniprocessor kernel, which can only do one thing at a time, why should I wait for a grace period? diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 4e95920..083d88c 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -101,6 +101,11 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning messages. +o A hardware or software issue shuts off the scheduler-clock + interrupt on a CPU that is not in dyntick-idle mode. This + problem really has happened, and seems to be most likely to + result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels. + o A bug in the RCU implementation. o A hardware failure. This is quite unlikely, but has occurred @@ -109,12 +114,11 @@ o A hardware failure. 
This is quite unlikely, but has occurred This resulted in a series of RCU CPU stall warnings, eventually leading to the realization that the CPU had failed. -The RCU, RCU-sched, and RCU-bh implementations have CPU stall -warning. SRCU does not have its own CPU stall warnings, but its -calls to synchronize_sched() will result in RCU-sched detecting -RCU-sched-related CPU stalls. Please note that RCU only detects -CPU stalls when there is a grace period in progress. No grace period, -no CPU stall warnings. +The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. +SRCU does not have its own CPU stall warnings, but its calls to +synchronize_sched() will result in RCU-sched detecting RCU-sched-related +CPU stalls. Please note that RCU only detects CPU stalls when there is +a grace period in progress. No grace period, no CPU stall warnings. To diagnose the cause of the stall, inspect the stack traces. The offending function will usually be near the top of the stack. diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 783d6c1..d67068d 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -61,11 +61,24 @@ nreaders This is the number of RCU reading threads supported. To properly exercise RCU implementations with preemptible read-side critical sections. +onoff_interval + The number of seconds between each attempt to execute a + randomly selected CPU-hotplug operation. Defaults to + zero, which disables CPU hotplugging. In HOTPLUG_CPU=n + kernels, rcutorture will silently refuse to do any + CPU-hotplug operations regardless of what value is + specified for onoff_interval. + shuffle_interval The number of seconds to keep the test threads affinitied to a particular subset of the CPUs, defaults to 3 seconds. Used in conjunction with test_no_idle_hz. +shutdown_secs The number of seconds to run the test before terminating + the test and powering off the system. 
The default is + zero, which disables test termination and system shutdown. + This capability is useful for automated testing. + stat_interval The number of seconds between output of torture statistics (via printk()). Regardless of the interval, statistics are printed when the module is unloaded. diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index aaf65f6..49587ab 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -105,14 +105,10 @@ o "dt" is the current value of the dyntick counter that is incremented or one greater than the interrupt-nesting depth otherwise. The number after the second "/" is the NMI nesting depth. - This field is displayed only for CONFIG_NO_HZ kernels. - o "df" is the number of times that some other CPU has forced a quiescent state on behalf of this CPU due to this CPU being in dynticks-idle state. - This field is displayed only for CONFIG_NO_HZ kernels. - o "of" is the number of times that some other CPU has forced a quiescent state on behalf of this CPU due to this CPU being offline. In a perfect world, this might never happen, but it diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 6ef6926..6bbe8dc 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -4,6 +4,7 @@ to start learning about RCU: 1. What is RCU, Fundamentally? http://lwn.net/Articles/262464/ 2. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/ 3. RCU part 3: the RCU API http://lwn.net/Articles/264090/ +4. The RCU API, 2010 Edition http://lwn.net/Articles/418853/ What is RCU? @@ -834,6 +835,8 @@ SRCU: Critical sections Grace period Barrier srcu_read_lock synchronize_srcu N/A srcu_read_unlock synchronize_srcu_expedited + srcu_read_lock_raw + srcu_read_unlock_raw srcu_dereference SRCU: Initialization/cleanup @@ -855,27 +858,33 @@ list can be helpful: a. Will readers need to block? If so, you need SRCU. -b. What about the -rt patchset? 
If readers would need to block +b. Is it necessary to start a read-side critical section in a + hardirq handler or exception handler, and then to complete + this read-side critical section in the task that was + interrupted? If so, you need SRCU's srcu_read_lock_raw() and + srcu_read_unlock_raw() primitives. + +c. What about the -rt patchset? If readers would need to block in an non-rt kernel, you need SRCU. If readers would block in a -rt kernel, but not in a non-rt kernel, SRCU is not necessary. -c. Do you need to treat NMI handlers, hardirq handlers, +d. Do you need to treat NMI handlers, hardirq handlers, and code segments with preemption disabled (whether via preempt_disable(), local_irq_save(), local_bh_disable(), or some other mechanism) as if they were explicit RCU readers? If so, you need RCU-sched. -d. Do you need RCU grace periods to complete even in the face +e. Do you need RCU grace periods to complete even in the face of softirq monopolization of one or more of the CPUs? For example, is your code subject to network-based denial-of-service attacks? If so, you need RCU-bh. -e. Is your workload too update-intensive for normal use of +f. Is your workload too update-intensive for normal use of RCU, but inappropriate for other synchronization mechanisms? If so, consider SLAB_DESTROY_BY_RCU. But please be careful! -f. Otherwise, use RCU. +g. Otherwise, use RCU. Of course, this all assumes that you have determined that RCU is in fact the right tool for your job. diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt index 771d48d..208a2d4 100644 --- a/Documentation/arm/memory.txt +++ b/Documentation/arm/memory.txt @@ -51,15 +51,14 @@ ffc00000 ffefffff DMA memory mapping region. Memory returned ff000000 ffbfffff Reserved for future expansion of DMA mapping region. -VMALLOC_END feffffff Free for platform use, recommended. - VMALLOC_END must be aligned to a 2MB - boundary. - VMALLOC_START VMALLOC_END-1 vmalloc() / ioremap() space. 
Memory returned by vmalloc/ioremap will be dynamically placed in this region. - VMALLOC_START may be based upon the value - of the high_memory variable. + Machine specific static mappings are also + located here through iotable_init(). + VMALLOC_START is based upon the value + of the high_memory variable, and VMALLOC_END + is equal to 0xff000000. PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region. This maps the platforms RAM, and typically diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt index 3bd585b..27f2b21 100644 --- a/Documentation/atomic_ops.txt +++ b/Documentation/atomic_ops.txt @@ -84,6 +84,93 @@ compiler optimizes the section accessing atomic_t variables. *** YOU HAVE BEEN WARNED! *** +Properly aligned pointers, longs, ints, and chars (and unsigned +equivalents) may be atomically loaded from and stored to in the same +sense as described for atomic_read() and atomic_set(). The ACCESS_ONCE() +macro should be used to prevent the compiler from using optimizations +that might otherwise optimize accesses out of existence on the one hand, +or that might create unsolicited accesses on the other. + +For example consider the following code: + + while (a > 0) + do_something(); + +If the compiler can prove that do_something() does not store to the +variable a, then the compiler is within its rights transforming this to +the following: + + tmp = a; + if (a > 0) + for (;;) + do_something(); + +If you don't want the compiler to do this (and you probably don't), then +you should use something like the following: + + while (ACCESS_ONCE(a) > 0) + do_something(); + +Alternatively, you could place a barrier() call in the loop. 
+ +For another example, consider the following code: + + tmp_a = a; + do_something_with(tmp_a); + do_something_else_with(tmp_a); + +If the compiler can prove that do_something_with() does not store to the +variable a, then the compiler is within its rights to manufacture an +additional load as follows: + + tmp_a = a; + do_something_with(tmp_a); + tmp_a = a; + do_something_else_with(tmp_a); + +This could fatally confuse your code if it expected the same value +to be passed to do_something_with() and do_something_else_with(). + +The compiler would be likely to manufacture this additional load if +do_something_with() was an inline function that made very heavy use +of registers: reloading from variable a could save a flush to the +stack and later reload. To prevent the compiler from attacking your +code in this manner, write the following: + + tmp_a = ACCESS_ONCE(a); + do_something_with(tmp_a); + do_something_else_with(tmp_a); + +For a final example, consider the following code, assuming that the +variable a is set at boot time before the second CPU is brought online +and never changed later, so that memory barriers are not needed: + + if (a) + b = 9; + else + b = 42; + +The compiler is within its rights to manufacture an additional store +by transforming the above code into the following: + + b = 42; + if (a) + b = 9; + +This could come as a fatal surprise to other code running concurrently +that expected b to never have the value 42 if a was zero. To prevent +the compiler from doing this, write something like: + + if (a) + ACCESS_ONCE(b) = 9; + else + ACCESS_ONCE(b) = 42; + +Don't even -think- about doing this without proper use of memory barriers, +locks, or atomic operations if variable a can change at runtime! + +*** WARNING: ACCESS_ONCE() DOES NOT IMPLY A BARRIER! *** + Now, we move onto the atomic operation interfaces typically implemented with the help of assembly code. 
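Editor's note on the atomic_ops.txt hunk above: the ACCESS_ONCE() patterns it adds can be tried outside the kernel with a small user-space sketch. The macro below is a stand-in written with a volatile cast and the GCC/Clang `__typeof__` extension; it is an assumption made for illustration, not a copy of the kernel header. The names `shared_a`, `shared_b`, `sum_of_two_uses()` and `set_b_from_a()` are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/*
 * Assumption: a user-space stand-in for the kernel macro. The volatile
 * cast asks the compiler to perform each marked access exactly as
 * written. It is not a memory barrier.
 */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int shared_a;	/* hypothetical variable shared with other code */
static int shared_b;	/* hypothetical variable shared with other code */

/* Take one snapshot of shared_a and reuse it, as in the tmp_a example. */
static int sum_of_two_uses(void)
{
	int tmp_a = ACCESS_ONCE(shared_a);

	return tmp_a + tmp_a;	/* both uses see the same snapshot */
}

/* Store either 9 or 42 to shared_b, as in the final example. */
static void set_b_from_a(void)
{
	if (ACCESS_ONCE(shared_a))
		ACCESS_ONCE(shared_b) = 9;
	else
		ACCESS_ONCE(shared_b) = 42;
}
```

As the patch text itself warns, this only constrains the compiler for the marked accesses. If shared_a can change at runtime, memory barriers, locks, or atomic operations are still needed.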
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt index 9c452ef..a7c96ae 100644 --- a/Documentation/cgroups/cgroups.txt +++ b/Documentation/cgroups/cgroups.txt @@ -594,53 +594,44 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be called multiple times against a cgroup. int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp, - struct task_struct *task) + struct cgroup_taskset *tset) (cgroup_mutex held by caller) -Called prior to moving a task into a cgroup; if the subsystem -returns an error, this will abort the attach operation. If a NULL -task is passed, then a successful result indicates that *any* -unspecified task can be moved into the cgroup. Note that this isn't -called on a fork. If this method returns 0 (success) then this should -remain valid while the caller holds cgroup_mutex and it is ensured that either +Called prior to moving one or more tasks into a cgroup; if the +subsystem returns an error, this will abort the attach operation. +@tset contains the tasks to be attached and is guaranteed to have at +least one task in it. + +If there are multiple tasks in the taskset, then: + - it's guaranteed that all are from the same thread group + - @tset contains all tasks from the thread group whether or not + they're switching cgroups + - the first task is the leader + +Each @tset entry also contains the task's old cgroup and tasks which +aren't switching cgroup can be skipped easily using the +cgroup_taskset_for_each() iterator. Note that this isn't called on a +fork. If this method returns 0 (success) then this should remain valid +while the caller holds cgroup_mutex and it is ensured that either attach() or cancel_attach() will be called in future. -int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk); -(cgroup_mutex held by caller) - -As can_attach, but for operations that must be run once per task to be -attached (possibly many when using cgroup_attach_proc). Called after -can_attach. 
- void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp, - struct task_struct *task, bool threadgroup) + struct cgroup_taskset *tset) (cgroup_mutex held by caller) Called when a task attach operation has failed after can_attach() has succeeded. A subsystem whose can_attach() has some side-effects should provide this function, so that the subsystem can implement a rollback. If not, not necessary. This will be called only about subsystems whose can_attach() operation have -succeeded. - -void pre_attach(struct cgroup *cgrp); -(cgroup_mutex held by caller) - -For any non-per-thread attachment work that needs to happen before -attach_task. Needed by cpuset. +succeeded. The parameters are identical to can_attach(). void attach(struct cgroup_subsys *ss, struct cgroup *cgrp, - struct cgroup *old_cgrp, struct task_struct *task) + struct cgroup_taskset *tset) (cgroup_mutex held by caller) Called after the task has been attached to the cgroup, to allow any post-attachment activity that requires memory allocations or blocking. - -void attach_task(struct cgroup *cgrp, struct task_struct *tsk); -(cgroup_mutex held by caller) - -As attach, but for operations that must be run once per task to be attached, -like can_attach_task. Called before attach. Currently does not support any -subsystem that might need the old_cgrp for every thread in the group. +The parameters are identical to can_attach(). void fork(struct cgroup_subsy *ss, struct task_struct *task) diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index cc0ebc5..4d8774f 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -44,8 +44,8 @@ Features: - oom-killer disable knob and oom-notifier - Root cgroup has no limit controls. - Kernel memory and Hugepages are not under control yet. We just manage - pages on LRU. To add more controls, we have to take care of performance. 
+ Kernel memory support is a work in progress, and the current version provides + basic functionality. (See Section 2.7) Brief summary of control files. @@ -72,6 +72,9 @@ Brief summary of control files. memory.oom_control # set/show oom controls. memory.numa_stat # show the number of memory usage per numa node + memory.kmem.tcp.limit_in_bytes # set/show hard limit for tcp buf memory + memory.kmem.tcp.usage_in_bytes # show current tcp buf memory allocation + 1. History The memory controller has a long history. A request for comments for the memory @@ -255,6 +258,27 @@ When oom event notifier is registered, event will be delivered. per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by zone->lru_lock, it has no lock of its own. +2.7 Kernel Memory Extension (CONFIG_CGROUP_MEM_RES_CTLR_KMEM) + +With the Kernel memory extension, the Memory Controller is able to limit +the amount of kernel memory used by the system. Kernel memory is fundamentally +different from user memory, since it can't be swapped out, which makes it +possible to DoS the system by consuming too much of this precious resource. + +Kernel memory limits are not imposed for the root cgroup. Usage for the root +cgroup may or may not be accounted. + +Currently no soft limit is implemented for kernel memory. It is future work +to trigger slab reclaim when those limits are reached. + +2.7.1 Current Kernel Memory resources accounted + +* sockets memory pressure: some sockets protocols have memory pressure +thresholds. The Memory Controller allows them to be controlled individually +per cgroup, instead of globally. + +* tcp memory pressure: sockets memory pressure for the tcp protocol. + 3. User Interface 0.
Configuration diff --git a/Documentation/cgroups/net_prio.txt b/Documentation/cgroups/net_prio.txt new file mode 100644 index 0000000..01b3226 --- /dev/null +++ b/Documentation/cgroups/net_prio.txt @@ -0,0 +1,53 @@ +Network priority cgroup +------------------------- + +The Network priority cgroup provides an interface to allow an administrator to +dynamically set the priority of network traffic generated by various +applications + +Nominally, an application would set the priority of its traffic via the +SO_PRIORITY socket option. This however, is not always possible because: + +1) The application may not have been coded to set this value +2) The priority of application traffic is often a site-specific administrative + decision rather than an application defined one. + +This cgroup allows an administrator to assign a process to a group which defines +the priority of egress traffic on a given interface. Network priority groups can +be created by first mounting the cgroup filesystem. + +# mount -t cgroup -onet_prio none /sys/fs/cgroup/net_prio + +With the above step, the initial group acting as the parent accounting group +becomes visible at '/sys/fs/cgroup/net_prio'. This group includes all tasks in +the system. '/sys/fs/cgroup/net_prio/tasks' lists the tasks in this cgroup. + +Each net_prio cgroup contains two files that are subsystem specific + +net_prio.prioidx +This file is read-only, and is simply informative. It contains a unique integer +value that the kernel uses as an internal representation of this cgroup. + +net_prio.ifpriomap +This file contains a map of the priorities assigned to traffic originating from +processes in this group and egressing the system on various interfaces. It +contains a list of tuples in the form <ifname priority>. Contents of this file +can be modified by echoing a string into the file using the same tuple format. 
+For example: + +echo "eth0 5" > /sys/fs/cgroup/net_prio/iscsi/net_prio.ifpriomap + +This command would force any traffic originating from processes belonging to the +iscsi net_prio cgroup and egressing on interface eth0 to have the priority of +said traffic set to the value 5. The parent accounting group also has a +writeable 'net_prio.ifpriomap' file that can be used to set a system default +priority. + +Priorities are set immediately prior to queueing a frame to the device +queueing discipline (qdisc) so priorities will be assigned prior to the hardware +queue selection being made. + +One usage for the net_prio cgroup is with mqprio qdisc allowing application +traffic to be steered to hardware/driver based traffic classes. These mappings +can then be managed by administrators or other networking protocols such as +DCBX. diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index d221781..c7a2eb8 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt @@ -127,7 +127,7 @@ in the bash (as said, 1000 is default), do: echo $(($(cat cpuinfo_transition_latency) * 750 / 1000)) \ >ondemand/sampling_rate -show_sampling_rate_min: +sampling_rate_min: The sampling rate is limited by the HW transition latency: transition_latency * 100 Or by kernel restrictions: @@ -140,8 +140,6 @@ HZ=100: min=200000us (200ms) The highest value of kernel and HW latency restrictions is shown and used as the minimum sampling rate. -show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT. - up_threshold: defines what the average CPU usage between the samplings of 'sampling_rate' needs to be for the kernel to make a decision on whether it should increase the frequency. For example when it is set
For example when it is set diff --git a/Documentation/development-process/5.Posting b/Documentation/development-process/5.Posting index 903a254..8a48c9b 100644 --- a/Documentation/development-process/5.Posting +++ b/Documentation/development-process/5.Posting @@ -271,10 +271,10 @@ copies should go to: the linux-kernel list. - If you are fixing a bug, think about whether the fix should go into the - next stable update. If so, stable@kernel.org should get a copy of the - patch. Also add a "Cc: stable@kernel.org" to the tags within the patch - itself; that will cause the stable team to get a notification when your - fix goes into the mainline. + next stable update. If so, stable@vger.kernel.org should get a copy of + the patch. Also add a "Cc: stable@vger.kernel.org" to the tags within + the patch itself; that will cause the stable team to get a notification + when your fix goes into the mainline. When selecting recipients for a patch, it is good to have an idea of who you think will eventually accept the patch and get it merged. While it diff --git a/Documentation/devices.txt b/Documentation/devices.txt index eccffe7..cec8864 100644 --- a/Documentation/devices.txt +++ b/Documentation/devices.txt @@ -379,7 +379,7 @@ Your cooperation is appreciated. 
162 = /dev/smbus System Management Bus 163 = /dev/lik Logitech Internet Keyboard 164 = /dev/ipmo Intel Intelligent Platform Management - 165 = /dev/vmmon VMWare virtual machine monitor + 165 = /dev/vmmon VMware virtual machine monitor 166 = /dev/i2o/ctl I2O configuration manager 167 = /dev/specialix_sxctl Specialix serial control 168 = /dev/tcldrv Technology Concepts serial control diff --git a/Documentation/devicetree/bindings/arm/fsl.txt b/Documentation/devicetree/bindings/arm/fsl.txt index c9848ad..54bddda 100644 --- a/Documentation/devicetree/bindings/arm/fsl.txt +++ b/Documentation/devicetree/bindings/arm/fsl.txt @@ -21,6 +21,10 @@ i.MX53 Smart Mobile Reference Design Board Required root node properties: - compatible = "fsl,imx53-smd", "fsl,imx53"; -i.MX6 Quad SABRE Automotive Board +i.MX6 Quad Armadillo2 Board Required root node properties: - - compatible = "fsl,imx6q-sabreauto", "fsl,imx6q"; + - compatible = "fsl,imx6q-arm2", "fsl,imx6q"; + +i.MX6 Quad SABRE Lite Board +Required root node properties: + - compatible = "fsl,imx6q-sabrelite", "fsl,imx6q"; diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt index 52916b4..9b4b82a 100644 --- a/Documentation/devicetree/bindings/arm/gic.txt +++ b/Documentation/devicetree/bindings/arm/gic.txt @@ -42,6 +42,10 @@ Optional - interrupts : Interrupt source of the parent interrupt controller. Only present on secondary GICs. +- cpu-offset : per-cpu offset within the distributor and cpu interface + regions, used when the GIC doesn't have banked registers. The offset is + cpu-offset * cpu-nr. 
+ Example: intc: interrupt-controller@fff11000 { diff --git a/Documentation/devicetree/bindings/arm/insignal-boards.txt b/Documentation/devicetree/bindings/arm/insignal-boards.txt new file mode 100644 index 0000000..524c3dc --- /dev/null +++ b/Documentation/devicetree/bindings/arm/insignal-boards.txt @@ -0,0 +1,8 @@ +* Insignal's Exynos4210 based Origen evaluation board + +Origen low-cost evaluation board is based on Samsung's Exynos4210 SoC. + +Required root node properties: + - compatible = should be one or more of the following. + (a) "samsung,smdkv310" - for Samsung's SMDKV310 eval board. + (b) "samsung,exynos4210" - for boards based on Exynos4210 SoC. diff --git a/Documentation/devicetree/bindings/arm/samsung-boards.txt b/Documentation/devicetree/bindings/arm/samsung-boards.txt new file mode 100644 index 0000000..0bf68be --- /dev/null +++ b/Documentation/devicetree/bindings/arm/samsung-boards.txt @@ -0,0 +1,8 @@ +* Samsung's Exynos4210 based SMDKV310 evaluation board + +SMDKV310 evaluation board is based on Samsung's Exynos4210 SoC. + +Required root node properties: + - compatible = should be one or more of the following. + (a) "samsung,smdkv310" - for Samsung's SMDKV310 eval board. + (b) "samsung,exynos4210" - for boards based on Exynos4210 SoC. 
diff --git a/Documentation/devicetree/bindings/arm/tegra.txt b/Documentation/devicetree/bindings/arm/tegra.txt new file mode 100644 index 0000000..6e69d2e --- /dev/null +++ b/Documentation/devicetree/bindings/arm/tegra.txt @@ -0,0 +1,14 @@ +NVIDIA Tegra device tree bindings +------------------------------------------- + +Boards with the tegra20 SoC shall have the following properties: + +Required root node property: + +compatible = "nvidia,tegra20"; + +Boards with the tegra30 SoC shall have the following properties: + +Required root node property: + +compatible = "nvidia,tegra30"; diff --git a/Documentation/devicetree/bindings/arm/vic.txt b/Documentation/devicetree/bindings/arm/vic.txt new file mode 100644 index 0000000..266716b --- /dev/null +++ b/Documentation/devicetree/bindings/arm/vic.txt @@ -0,0 +1,29 @@ +* ARM Vectored Interrupt Controller + +One or more Vectored Interrupt Controllers (VIC's) can be connected in an ARM +system for interrupt routing. For multiple controllers they can either be +nested or have the outputs wire-OR'd together. + +Required properties: + +- compatible : should be one of + "arm,pl190-vic" + "arm,pl192-vic" +- interrupt-controller : Identifies the node as an interrupt controller +- #interrupt-cells : The number of cells to define the interrupts. Must be 1 as + the VIC has no configuration options for interrupt sources. The cell is a u32 + and defines the interrupt number. +- reg : The register bank for the VIC. + +Optional properties: + +- interrupts : Interrupt source for parent controllers if the VIC is nested. 
+ +Example: + + vic0: interrupt-controller@60000 { + compatible = "arm,pl192-vic"; + interrupt-controller; + #interrupt-cells = <1>; + reg = <0x60000 0x1000>; + }; diff --git a/Documentation/devicetree/bindings/dma/arm-pl330.txt b/Documentation/devicetree/bindings/dma/arm-pl330.txt new file mode 100644 index 0000000..a4cd273 --- /dev/null +++ b/Documentation/devicetree/bindings/dma/arm-pl330.txt @@ -0,0 +1,30 @@ +* ARM PrimeCell PL330 DMA Controller + +The ARM PrimeCell PL330 DMA controller can move blocks of memory contents +between memory and peripherals or memory to memory. + +Required properties: + - compatible: should include both "arm,pl330" and "arm,primecell". + - reg: physical base address of the controller and length of memory mapped + region. + - interrupts: interrupt number to the cpu. + +Example: + + pdma0: pdma@12680000 { + compatible = "arm,pl330", "arm,primecell"; + reg = <0x12680000 0x1000>; + interrupts = <99>; + }; + +Client drivers (device nodes requiring dma transfers from dev-to-mem or +mem-to-dev) should specify the DMA channel numbers using a two-value pair +as shown below. + + [property name] = <[phandle of the dma controller] [dma request id]>; + + where 'dma request id' is the dma request number which is connected + to the client controller. The 'property name' is recommended to be + of the form <name>-dma-channel. + + Example: tx-dma-channel = <&pdma0 12>; diff --git a/Documentation/devicetree/bindings/gpio/gpio-samsung.txt b/Documentation/devicetree/bindings/gpio/gpio-samsung.txt new file mode 100644 index 0000000..8f50fe5 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio-samsung.txt @@ -0,0 +1,40 @@ +Samsung Exynos4 GPIO Controller + +Required properties: +- compatible: Compatible property value should be "samsung,exynos4-gpio". + +- reg: Physical base address of the controller and length of memory mapped + region. + +- #gpio-cells: Should be 4.
The syntax of the gpio specifier used by client nodes + should be the following with values derived from the SoC user manual. + <[phandle of the gpio controller node] + [pin number within the gpio controller] + [mux function] + [pull up/down] + [drive strength]> + + Values for gpio specifier: + - Pin number: is a value between 0 to 7. + - Pull Up/Down: 0 - Pull Up/Down Disabled. + 1 - Pull Down Enabled. + 3 - Pull Up Enabled. + - Drive Strength: 0 - 1x, + 1 - 3x, + 2 - 2x, + 3 - 4x + +- gpio-controller: Specifies that the node is a gpio controller. +- #address-cells: should be 1. +- #size-cells: should be 1. + +Example: + + gpa0: gpio-controller@11400000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "samsung,exynos4-gpio"; + reg = <0x11400000 0x20>; + #gpio-cells = <4>; + gpio-controller; + }; diff --git a/Documentation/devicetree/bindings/i2c/i2c-designware.txt b/Documentation/devicetree/bindings/i2c/i2c-designware.txt new file mode 100644 index 0000000..e42a2ee --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/i2c-designware.txt @@ -0,0 +1,22 @@ +* Synopsys DesignWare I2C + +Required properties : + + - compatible : should be "snps,designware-i2c" + - reg : Offset and length of the register set for the device + - interrupts : <IRQ> where IRQ is the interrupt number. + +Recommended properties : + + - clock-frequency : desired I2C bus clock frequency in Hz. 
+ +Example : + + i2c@f0000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xf0000 0x1000>; + interrupts = <11>; + clock-frequency = <400000>; + }; diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt b/Documentation/devicetree/bindings/i2c/trivial-devices.txt new file mode 100644 index 0000000..1a85f98 --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt @@ -0,0 +1,58 @@ +This is a list of trivial i2c devices that have simple device tree +bindings, consisting only of a compatible field, an address and +possibly an interrupt line. + +If a device needs more specific bindings, such as properties to +describe some aspect of it, there needs to be a specific binding +document for it just like any other devices. + + +Compatible Vendor / Chip +========== ============= +ad,ad7414 SMBus/I2C Digital Temperature Sensor in 6-Pin SOT with SMBus Alert and Over Temperature Pin +ad,adm9240 ADM9240: Complete System Hardware Monitor for uProcessor-Based Systems +adi,adt7461 +/-1C TDM Extended Temp Range I.C +adt7461 +/-1C TDM Extended Temp Range I.C +at,24c08 i2c serial eeprom (24cxx) +atmel,24c02 i2c serial eeprom (24cxx) +catalyst,24c32 i2c serial eeprom +dallas,ds1307 64 x 8, Serial, I2C Real-Time Clock +dallas,ds1338 I2C RTC with 56-Byte NV RAM +dallas,ds1339 I2C Serial Real-Time Clock +dallas,ds1340 I2C RTC with Trickle Charger +dallas,ds1374 I2C, 32-Bit Binary Counter Watchdog RTC with Trickle Charger and Reset Input/Output +dallas,ds1631 High-Precision Digital Thermometer +dallas,ds1682 Total-Elapsed-Time Recorder with Alarm +dallas,ds1775 Tiny Digital Thermometer and Thermostat +dallas,ds3232 Extremely Accurate I²C RTC with Integrated Crystal and SRAM +dallas,ds4510 CPU Supervisor with Nonvolatile Memory and Programmable I/O +dallas,ds75 Digital Thermometer and Thermostat +dialog,da9053 DA9053: flexible system level PMIC with multicore support +epson,rx8025 High-Stability. 
I2C-Bus INTERFACE REAL TIME CLOCK MODULE +epson,rx8581 I2C-BUS INTERFACE REAL TIME CLOCK MODULE +fsl,mag3110 MAG3110: Xtrinsic High Accuracy, 3D Magnetometer +fsl,mc13892 MC13892: Power Management Integrated Circuit (PMIC) for i.MX35/51 +fsl,mma8450 MMA8450Q: Xtrinsic Low-power, 3-axis Xtrinsic Accelerometer +fsl,mpr121 MPR121: Proximity Capacitive Touch Sensor Controller +fsl,sgtl5000 SGTL5000: Ultra Low-Power Audio Codec +maxim,ds1050 5 Bit Programmable, Pulse-Width Modulator +maxim,max1237 Low-Power, 4-/12-Channel, 2-Wire Serial, 12-Bit ADCs +maxim,max6625 9-Bit/12-Bit Temperature Sensors with I²C-Compatible Serial Interface +mc,rv3029c2 Real Time Clock Module with I2C-Bus +national,lm75 I2C TEMP SENSOR +national,lm80 Serial Interface ACPI-Compatible Microprocessor System Hardware Monitor +national,lm92 ±0.33°C Accurate, 12-Bit + Sign Temperature Sensor and Thermal Window Comparator with Two-Wire Interface +nxp,pca9556 Octal SMBus and I2C registered interface +nxp,pca9557 8-bit I2C-bus and SMBus I/O port with reset +nxp,pcf8563 Real-time clock/calendar +ovti,ov5642 OV5642: Color CMOS QSXGA (5-megapixel) Image Sensor with OmniBSI and Embedded TrueFocus +pericom,pt7c4338 Real-time Clock Module +plx,pex8648 48-Lane, 12-Port PCI Express Gen 2 (5.0 GT/s) Switch +ramtron,24c64 i2c serial eeprom (24cxx) +ricoh,rs5c372a I2C bus SERIAL INTERFACE REAL-TIME CLOCK IC +samsung,24ad0xd1 S524AD0XF1 (128K/256K-bit Serial EEPROM for Low Power) +st-micro,24c256 i2c serial eeprom (24cxx) +stm,m41t00 Serial Access TIMEKEEPER +stm,m41t62 Serial real-time clock (RTC) with alarm +stm,m41t80 M41T80 - SERIAL ACCESS RTC WITH ALARMS +ti,tsc2003 I2C Touch-Screen Controller diff --git a/Documentation/devicetree/bindings/input/samsung-keypad.txt b/Documentation/devicetree/bindings/input/samsung-keypad.txt new file mode 100644 index 0000000..ce3e394 --- /dev/null +++ b/Documentation/devicetree/bindings/input/samsung-keypad.txt @@ -0,0 +1,88 @@ +* Samsung's Keypad Controller device tree 
bindings + +Samsung's Keypad controller is used to interface a SoC with a matrix-type +keypad device. The keypad controller supports multiple row and column lines. +A key can be placed at each intersection of a unique row and a unique column. +The keypad controller can sense a key-press and key-release and report the +event using an interrupt to the cpu. + +Required SoC Specific Properties: +- compatible: should be one of the following + - "samsung,s3c6410-keypad": For controllers compatible with s3c6410 keypad + controller. + - "samsung,s5pv210-keypad": For controllers compatible with s5pv210 keypad + controller. + +- reg: physical base address of the controller and length of memory mapped + region. + +- interrupts: The interrupt number to the cpu. + +Required Board Specific Properties: +- samsung,keypad-num-rows: Number of row lines connected to the keypad + controller. + +- samsung,keypad-num-columns: Number of column lines connected to the + keypad controller. + +- row-gpios: List of gpios used as row lines. The gpio specifier for + this property depends on the gpio controller to which these row lines + are connected. + +- col-gpios: List of gpios used as column lines. The gpio specifier for + this property depends on the gpio controller to which these column + lines are connected. + +- Keys represented as child nodes: Each key connected to the keypad + controller is represented as a child node to the keypad controller + device node and should include the following properties. + - keypad,row: the row number to which the key is connected. + - keypad,column: the column number to which the key is connected. + - linux,code: the key-code to be reported when the key is pressed + and released. + +Optional Properties specific to linux: +- linux,keypad-no-autorepeat: do not enable autorepeat feature. +- linux,keypad-wakeup: use any event on keypad as wakeup event.
+ + +Example: + keypad@100A0000 { + compatible = "samsung,s5pv210-keypad"; + reg = <0x100A0000 0x100>; + interrupts = <173>; + samsung,keypad-num-rows = <2>; + samsung,keypad-num-columns = <8>; + linux,input-no-autorepeat; + linux,input-wakeup; + + row-gpios = <&gpx2 0 3 3 0 + &gpx2 1 3 3 0>; + + col-gpios = <&gpx1 0 3 0 0 + &gpx1 1 3 0 0 + &gpx1 2 3 0 0 + &gpx1 3 3 0 0 + &gpx1 4 3 0 0 + &gpx1 5 3 0 0 + &gpx1 6 3 0 0 + &gpx1 7 3 0 0>; + + key_1 { + keypad,row = <0>; + keypad,column = <3>; + linux,code = <2>; + }; + + key_2 { + keypad,row = <0>; + keypad,column = <4>; + linux,code = <3>; + }; + + key_3 { + keypad,row = <0>; + keypad,column = <5>; + linux,code = <4>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/calxeda-xgmac.txt b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt new file mode 100644 index 0000000..411727a --- /dev/null +++ b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt @@ -0,0 +1,15 @@ +* Calxeda Highbank 10Gb XGMAC Ethernet + +Required properties: +- compatible : Should be "calxeda,hb-xgmac" +- reg : Address and length of the register set for the device +- interrupts : Should contain 3 xgmac interrupts. The 1st is main interrupt. + The 2nd is pwr mgt interrupt. The 3rd is low power state interrupt. + +Example: + +ethernet@fff50000 { + compatible = "calxeda,hb-xgmac"; + reg = <0xfff50000 0x1000>; + interrupts = <0 77 4 0 78 4 0 79 4>; +}; diff --git a/Documentation/devicetree/bindings/net/can/cc770.txt b/Documentation/devicetree/bindings/net/can/cc770.txt new file mode 100644 index 0000000..77027bf --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/cc770.txt @@ -0,0 +1,53 @@ +Memory mapped Bosch CC770 and Intel AN82527 CAN controller + +Note: The CC770 is a CAN controller from Bosch, which is 100% +compatible with the old AN82527 from Intel, but with "bugs" being fixed. + +Required properties: + +- compatible : should be "bosch,cc770" for the CC770 and "intc,82527" + for the AN82527. 
+ +- reg : should specify the chip select, address offset and size required + to map the registers of the controller. The size is usually 0x80. + +- interrupts : property with a value describing the interrupt source + (number and sensitivity) required for the controller. + +Optional properties: + +- bosch,external-clock-frequency : frequency of the external oscillator + clock in Hz. Note that the internal clock frequency used by the + controller is half of that value. If not specified, a default + value of 16000000 (16 MHz) is used. + +- bosch,clock-out-frequency : clock frequency in Hz on the CLKOUT pin. + If not specified or if the specified value is 0, the CLKOUT pin + will be disabled. + +- bosch,slew-rate : slew rate of the CLKOUT signal. If not specified, + a reasonable value will be calculated. + +- bosch,disconnect-rx0-input : see data sheet. + +- bosch,disconnect-rx1-input : see data sheet. + +- bosch,disconnect-tx1-output : see data sheet. + +- bosch,polarity-dominant : see data sheet. + +- bosch,divide-memory-clock : see data sheet. + +- bosch,iso-low-speed-mux : see data sheet. + +For further information, please refer to the CC770 or AN82527 data sheet. + +Examples: + +can@3,100 { + compatible = "bosch,cc770"; + reg = <3 0x100 0x80>; + interrupts = <2 0>; + interrupt-parent = <&mpic>; + bosch,external-clock-frequency = <16000000>; +}; diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt new file mode 100644 index 0000000..44afa0e --- /dev/null +++ b/Documentation/devicetree/bindings/net/macb.txt @@ -0,0 +1,25 @@ +* Cadence MACB/GEM Ethernet controller + +Required properties: +- compatible: Should be "cdns,[<chip>-]{macb|gem}" + Use "cdns,at91sam9260-macb" for Atmel at91sam9260 and at91sam9263 SoCs. + Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: "cdns,macb".
+ Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based on + the Cadence GEM, or the generic form: "cdns,gem". +- reg: Address and length of the register set for the device +- interrupts: Should contain macb interrupt +- phy-mode: String, operation mode of the PHY interface. + Supported values are: "mii", "rmii", "gmii", "rgmii". + +Optional properties: +- local-mac-address: 6 bytes, mac address + +Examples: + + macb0: ethernet@fffc4000 { + compatible = "cdns,at32ap7000-macb"; + reg = <0xfffc4000 0x4000>; + interrupts = <21>; + phy-mode = "rmii"; + local-mac-address = [3a 0e 03 04 05 06]; + }; diff --git a/Documentation/devicetree/bindings/nvec/nvec_nvidia.txt b/Documentation/devicetree/bindings/nvec/nvec_nvidia.txt new file mode 100644 index 0000000..5aeee53 --- /dev/null +++ b/Documentation/devicetree/bindings/nvec/nvec_nvidia.txt @@ -0,0 +1,9 @@ +NVIDIA compliant embedded controller + +Required properties: +- compatible : should be "nvidia,nvec". +- reg : the iomem of the i2c slave controller +- interrupts : the interrupt line of the i2c slave controller +- clock-frequency : the frequency of the i2c bus +- gpios : the gpio used for ec request +- slave-addr: the i2c address of the slave controller diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt new file mode 100644 index 0000000..b9a8a2b --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt @@ -0,0 +1,163 @@ +Message unit node: + +For SRIO controllers that implement the message unit as part of the controller +this node is required. For devices with RMAN this node should NOT exist. The +node is composed of three types of sub-nodes ("fsl-srio-msg-unit", +"fsl-srio-dbell-unit" and "fsl-srio-port-write-unit"). + +See srio.txt for more details about the generic SRIO controller.
+ + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,srio-rmu-vX.Y", "fsl,srio-rmu". + + The version X.Y should match the general SRIO controller's IP Block + revision register's Major(X) and Minor (Y) value. + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the physical address and + length of the SRIO configuration registers for message units + and doorbell units. + + - fsl,liodn + Usage: optional-but-recommended (for devices with PAMU) + Value type: <prop-encoded-array> + Definition: The logical I/O device number for the PAMU (IOMMU) to be + correctly configured for SRIO accesses. The property should + not exist on devices that do not support PAMU. + + The LIODN value is associated with all RMU transactions + (msg-unit, doorbell, port-write). + +Sub-Nodes for RMU: The RMU node is composed of multiple sub-nodes that +correspond to the actual sub-controllers in the RMU. The manual for a given +SoC will detail which and how many of these sub-controllers are implemented. + +Message Unit: + + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,srio-msg-unit-vX.Y", "fsl,srio-msg-unit". + + The version X.Y should match the general SRIO controller's IP Block + revision register's Major(X) and Minor (Y) value. + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the physical address and + length of the SRIO configuration registers for message units + and doorbell units. + + - interrupts + Usage: required + Value type: <prop_encoded-array> + Definition: Specifies the interrupts generated by this device. The + value of the interrupts property consists of one interrupt + specifier. The format of the specifier is defined by the + binding document describing the node's interrupt parent. + + A pair of IRQs are specified in this property. 
The first + element is associated with the transmit (TX) interrupt and the + second element is associated with the receive (RX) interrupt. + +Doorbell Unit: + + - compatible + Usage: required + Value type: <string> + Definition: Must include: + "fsl,srio-dbell-unit-vX.Y", "fsl,srio-dbell-unit" + + The version X.Y should match the general SRIO controller's IP Block + revision register's Major(X) and Minor (Y) value. + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the physical address and + length of the SRIO configuration registers for message units + and doorbell units. + + - interrupts + Usage: required + Value type: <prop_encoded-array> + Definition: Specifies the interrupts generated by this device. The + value of the interrupts property consists of one interrupt + specifier. The format of the specifier is defined by the + binding document describing the node's interrupt parent. + + A pair of IRQs are specified in this property. The first + element is associated with the transmit (TX) interrupt and the + second element is associated with the receive (RX) interrupt. + +Port-Write Unit: + + - compatible + Usage: required + Value type: <string> + Definition: Must include: + "fsl,srio-port-write-unit-vX.Y", "fsl,srio-port-write-unit" + + The version X.Y should match the general SRIO controller's IP Block + revision register's Major(X) and Minor (Y) value. + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the physical address and + length of the SRIO configuration registers for message units + and doorbell units. + + - interrupts + Usage: required + Value type: <prop_encoded-array> + Definition: Specifies the interrupts generated by this device. The + value of the interrupts property consists of one interrupt + specifier. The format of the specifier is defined by the + binding document describing the node's interrupt parent. 
+ + A single IRQ that handles port-write conditions is + specified by this property. (Typically shared with error). + + Note: All other standard properties (see the ePAPR) are allowed + but are optional. + +Example: + rmu: rmu@d3000 { + compatible = "fsl,srio-rmu"; + reg = <0xd3000 0x400>; + ranges = <0x0 0xd3000 0x400>; + fsl,liodn = <0xc8>; + + message-unit@0 { + compatible = "fsl,srio-msg-unit"; + reg = <0x0 0x100>; + interrupts = < + 60 2 0 0 /* msg1_tx_irq */ + 61 2 0 0>;/* msg1_rx_irq */ + }; + message-unit@100 { + compatible = "fsl,srio-msg-unit"; + reg = <0x100 0x100>; + interrupts = < + 62 2 0 0 /* msg2_tx_irq */ + 63 2 0 0>;/* msg2_rx_irq */ + }; + doorbell-unit@400 { + compatible = "fsl,srio-dbell-unit"; + reg = <0x400 0x80>; + interrupts = < + 56 2 0 0 /* bell_outb_irq */ + 57 2 0 0>;/* bell_inb_irq */ + }; + port-write-unit@4e0 { + compatible = "fsl,srio-port-write-unit"; + reg = <0x4e0 0x20>; + interrupts = <16 2 1 11>; + }; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt new file mode 100644 index 0000000..b039bcb --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt @@ -0,0 +1,103 @@ +* Freescale Serial RapidIO (SRIO) Controller + +RapidIO port node: +Properties: + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,srio" for IP blocks with IP Block + Revision Register (SRIO IPBRR1) Major ID equal to 0x01c0. + + Optionally, a compatible string of "fsl,srio-vX.Y" where X is Major + version in IP Block Revision Register and Y is Minor version. If this + compatible string is provided it should be ordered before "fsl,srio". + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the physical address and + length of the SRIO configuration registers. The size should + be set to 0x11000. 
+ + - interrupts + Usage: required + Value type: <prop_encoded-array> + Definition: Specifies the interrupts generated by this device. The + value of the interrupts property consists of one interrupt + specifier. The format of the specifier is defined by the + binding document describing the node's interrupt parent. + + A single IRQ that handles error conditions is specified by this + property. (Typically shared with port-write). + + - fsl,srio-rmu-handle: + Usage: required if rmu node is defined + Value type: <phandle> + Definition: A single <phandle> value that points to the RMU. + (See srio-rmu.txt for more details on RMU node binding) + +Port Child Nodes: There should be a port child node for each port that exists in +the controller. The ports are numbered starting at one (1) and should have +the following properties: + + - cell-index + Usage: required + Value type: <u32> + Definition: A standard property. Matches the port id. + + - ranges + Usage: required if local access windows preset + Value type: <prop-encoded-array> + Definition: A standard property. Utilized to describe the memory mapped + IO space utilized by the controller. This corresponds to the + setting of the local access windows that are targeted to this + SRIO port. + + - fsl,liodn + Usage: optional-but-recommended (for devices with PAMU) + Value type: <prop-encoded-array> + Definition: The logical I/O device number for the PAMU (IOMMU) to be + correctly configured for SRIO accesses. The property should + not exist on devices that do not support PAMU. + + For HW (ie, the P4080) that only supports a LIODN for both + memory and maintenance transactions then a single LIODN is + represented in the property for both transactions. + + For HW (ie, the P304x/P5020, etc) that supports an LIODN for + memory transactions and a unique LIODN for maintenance + transactions then a pair of LIODNs are represented in the + property. 
Within the pair, the first element represents the + LIODN associated with memory transactions and the second element + represents the LIODN associated with maintenance transactions + for the port. + +Note: All other standard properties (see ePAPR) are allowed but are optional. + +Example: + + rapidio: rapidio@ffe0c0000 { + #address-cells = <2>; + #size-cells = <2>; + reg = <0xf 0xfe0c0000 0 0x11000>; + compatible = "fsl,srio"; + interrupts = <16 2 1 11>; /* err_irq */ + fsl,srio-rmu-handle = <&rmu>; + ranges; + + port1 { + cell-index = <1>; + #address-cells = <2>; + #size-cells = <2>; + fsl,liodn = <34>; + ranges = <0 0 0xc 0x20000000 0 0x10000000>; + }; + + port2 { + cell-index = <2>; + #address-cells = <2>; + #size-cells = <2>; + fsl,liodn = <48>; + ranges = <0 0 0xc 0x30000000 0 0x10000000>; + }; + }; diff --git a/Documentation/devicetree/bindings/regulator/fixed-regulator.txt b/Documentation/devicetree/bindings/regulator/fixed-regulator.txt new file mode 100644 index 0000000..9cf57fd --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/fixed-regulator.txt @@ -0,0 +1,29 @@ +Fixed Voltage regulators + +Required properties: +- compatible: Must be "regulator-fixed"; + +Optional properties: +- gpio: gpio to use for enable control +- startup-delay-us: startup time in microseconds +- enable-active-high: Polarity of GPIO is Active high +If this property is missing, the default assumed is Active low. + +Any property defined as part of the core regulator +binding, defined in regulator.txt, can also be used. +However a fixed voltage regulator is expected to have the +regulator-min-microvolt and regulator-max-microvolt +to be the same. 
+ +Example: + + abc: fixedregulator@0 { + compatible = "regulator-fixed"; + regulator-name = "fixed-supply"; + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + gpio = <&gpio1 16 0>; + startup-delay-us = <70000>; + enable-active-high; + regulator-boot-on; + }; diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt new file mode 100644 index 0000000..5b7a408 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/regulator.txt @@ -0,0 +1,54 @@ +Voltage/Current Regulators + +Optional properties: +- regulator-name: A string used as a descriptive name for regulator outputs +- regulator-min-microvolt: smallest voltage consumers may set +- regulator-max-microvolt: largest voltage consumers may set +- regulator-microvolt-offset: Offset applied to voltages to compensate for voltage drops +- regulator-min-microamp: smallest current consumers may set +- regulator-max-microamp: largest current consumers may set +- regulator-always-on: boolean, regulator should never be disabled +- regulator-boot-on: bootloader/firmware enabled regulator +- <name>-supply: phandle to the parent supply/regulator node + +Example: + + xyzreg: regulator@0 { + regulator-min-microvolt = <1000000>; + regulator-max-microvolt = <2500000>; + regulator-always-on; + vin-supply = <&vin>; + }; + +Regulator Consumers: +Consumer nodes can reference one or more of their supplies/ +regulators using the below bindings. + +- <name>-supply: phandle to the regulator node + +These are the same bindings that a regulator in the above +example used to reference its own supply, in which case +it's just seen as a special case of a regulator being a +consumer itself. + +Example of a consumer device node (mmc) referencing two +regulators (twl_reg1 and twl_reg2), + + twl_reg1: regulator@0 { + ... + ... + ... + }; + + twl_reg2: regulator@1 { + ... + ... + ... + }; + + mmc: mmc@0x0 { + ... + ... 
+ vmmc-supply = <&twl_reg1>; + vmmcaux-supply = <&twl_reg2>; + }; diff --git a/Documentation/devicetree/bindings/rtc/s3c-rtc.txt b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt new file mode 100644 index 0000000..90ec45f --- /dev/null +++ b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt @@ -0,0 +1,20 @@ +* Samsung's S3C Real Time Clock controller + +Required properties: +- compatible: should be one of the following. + * "samsung,s3c2410-rtc" - for controllers compatible with s3c2410 rtc. + * "samsung,s3c6410-rtc" - for controllers compatible with s3c6410 rtc. +- reg: physical base address of the controller and length of memory mapped + region. +- interrupts: Two interrupt numbers to the cpu should be specified. First + interrupt number is the rtc alarm interrupt and second interrupt number + is the rtc tick interrupt. The number of cells representing an interrupt + depends on the parent interrupt controller. + +Example: + + rtc@10070000 { + compatible = "samsung,s3c6410-rtc"; + reg = <0x10070000 0x100>; + interrupts = <44 0 45 0>; + }; diff --git a/Documentation/devicetree/bindings/serial/omap_serial.txt b/Documentation/devicetree/bindings/serial/omap_serial.txt new file mode 100644 index 0000000..342eedd --- /dev/null +++ b/Documentation/devicetree/bindings/serial/omap_serial.txt @@ -0,0 +1,10 @@ +OMAP UART controller + +Required properties: +- compatible : should be "ti,omap2-uart" for OMAP2 controllers +- compatible : should be "ti,omap3-uart" for OMAP3 controllers +- compatible : should be "ti,omap4-uart" for OMAP4 controllers +- ti,hwmods : Must be "uart<n>", n being the instance number (1-based) + +Optional properties: +- clock-frequency : frequency of the clock input to the UART diff --git a/Documentation/devicetree/bindings/serial/samsung_uart.txt b/Documentation/devicetree/bindings/serial/samsung_uart.txt new file mode 100644 index 0000000..2c8a17c --- /dev/null +++ b/Documentation/devicetree/bindings/serial/samsung_uart.txt @@ -0,0 +1,14 @@ +* 
Samsung's UART Controller + +Samsung's UART controller is used to interface the SoC with serial communication +devices. + +Required properties: +- compatible: should be + - "samsung,exynos4210-uart", for UARTs compatible with Exynos4210 uart ports. + +- reg: base physical address of the controller and length of memory mapped + region. + +- interrupts: interrupt number to the cpu. The interrupt specifier format depends + on the interrupt controller parent. diff --git a/Documentation/devicetree/bindings/usb/tegra-usb.txt b/Documentation/devicetree/bindings/usb/tegra-usb.txt new file mode 100644 index 0000000..035d63d --- /dev/null +++ b/Documentation/devicetree/bindings/usb/tegra-usb.txt @@ -0,0 +1,13 @@ +Tegra SOC USB controllers + +The device node for a USB controller that is part of a Tegra +SOC is as described in the document "Open Firmware Recommended +Practice : Universal Serial Bus" with the following modifications +and additions : + +Required properties : + - compatible : Should be "nvidia,tegra20-ehci" for USB controllers + used in host mode. + - phy_type : Should be one of "ulpi" or "utmi". + - nvidia,vbus-gpio : If present, specifies a gpio that needs to be + activated for the bus to be powered. diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index e855278..1862696 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -8,7 +8,9 @@ amcc Applied Micro Circuits Corporation (APM, formally AMCC) apm Applied Micro Circuits Corporation (APM) arm ARM Ltd. atmel Atmel Corporation +cavium Cavium, Inc. chrp Common Hardware Reference Platform +cortina Cortina Systems, Inc. dallas Maxim Integrated Products (formerly Dallas Semiconductor) denx Denx Software Engineering epson Seiko Epson Corp. @@ -33,8 +35,10 @@ qcom Qualcomm, Inc. 
ramtron Ramtron International samsung Samsung Semiconductor schindler Schindler +sil Silicon Image simtek sirf SiRF Technology, Inc. +st STMicroelectronics stericsson ST-Ericsson ti Texas Instruments xlnx Xilinx diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt new file mode 100644 index 0000000..510eab3 --- /dev/null +++ b/Documentation/dma-buf-sharing.txt @@ -0,0 +1,224 @@ + DMA Buffer Sharing API Guide + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + Sumit Semwal + <sumit dot semwal at linaro dot org> + <sumit dot semwal at ti dot com> + +This document serves as a guide to device-driver writers on what is the dma-buf +buffer sharing API, how to use it for exporting and using shared buffers. + +Any device driver which wishes to be a part of DMA buffer sharing, can do so as +either the 'exporter' of buffers, or the 'user' of buffers. + +Say a driver A wants to use buffers created by driver B, then we call B as the +exporter, and A as buffer-user. + +The exporter +- implements and manages operations[1] for the buffer +- allows other users to share the buffer by using dma_buf sharing APIs, +- manages the details of buffer allocation, +- decides about the actual backing storage where this allocation happens, +- takes care of any migration of scatterlist - for all (shared) users of this + buffer, + +The buffer-user +- is one of (many) sharing users of the buffer. +- doesn't need to worry about how the buffer is allocated, or where. +- needs a mechanism to get access to the scatterlist that makes up this buffer + in memory, mapped into its own address space, so it can access the same area + of memory. + +*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details] +For this first version, A buffer shared using the dma_buf sharing API: +- *may* be exported to user space using "mmap" *ONLY* by exporter, outside of + this framework. +- may be used *ONLY* by importers that do not need CPU access to the buffer. 
+ +The dma_buf buffer sharing API usage contains the following steps: + +1. Exporter announces that it wishes to export a buffer +2. Userspace gets the file descriptor associated with the exported buffer, and + passes it around to potential buffer-users based on use case +3. Each buffer-user 'connects' itself to the buffer +4. When needed, buffer-user requests access to the buffer from exporter +5. When finished with its use, the buffer-user notifies end-of-DMA to exporter +6. when buffer-user is done using this buffer completely, it 'disconnects' + itself from the buffer. + + +1. Exporter's announcement of buffer export + + The buffer exporter announces its wish to export a buffer. In this, it + connects its own private buffer data, provides implementation for operations + that can be performed on the exported dma_buf, and flags for the file + associated with this buffer. + + Interface: + struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, + size_t size, int flags) + + If this succeeds, dma_buf_export allocates a dma_buf structure, and returns a + pointer to the same. It also associates an anonymous file with this buffer, + so it can be exported. On failure to allocate the dma_buf object, it returns + NULL. + +2. Userspace gets a handle to pass around to potential buffer-users + + Userspace entity requests for a file-descriptor (fd) which is a handle to the + anonymous file associated with the buffer. It can then share the fd with other + drivers and/or processes. + + Interface: + int dma_buf_fd(struct dma_buf *dmabuf) + + This API installs an fd for the anonymous file associated with this buffer; + returns either 'fd', or error. + +3. Each buffer-user 'connects' itself to the buffer + + Each buffer-user now gets a reference to the buffer, using the fd passed to + it. + + Interface: + struct dma_buf *dma_buf_get(int fd) + + This API will return a reference to the dma_buf, and increment refcount for + it. 
+ + After this, the buffer-user needs to attach its device with the buffer, which + helps the exporter to know of device buffer constraints. + + Interface: + struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, + struct device *dev) + + This API returns reference to an attachment structure, which is then used + for scatterlist operations. It will optionally call the 'attach' dma_buf + operation, if provided by the exporter. + + The dma-buf sharing framework does the bookkeeping bits related to managing + the list of all attachments to a buffer. + +Until this stage, the buffer-exporter has the option to choose not to actually +allocate the backing storage for this buffer, but wait for the first buffer-user +to request use of buffer for allocation. + + +4. When needed, buffer-user requests access to the buffer + + Whenever a buffer-user wants to use the buffer for any DMA, it asks for + access to the buffer using dma_buf_map_attachment API. At least one attach to + the buffer must have happened before map_dma_buf can be called. + + Interface: + struct sg_table * dma_buf_map_attachment(struct dma_buf_attachment *, + enum dma_data_direction); + + This is a wrapper to dma_buf->ops->map_dma_buf operation, which hides the + "dma_buf->ops->" indirection from the users of this interface. + + In struct dma_buf_ops, map_dma_buf is defined as + struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *, + enum dma_data_direction); + + It is one of the buffer operations that must be implemented by the exporter. + It should return the sg_table containing scatterlist for this buffer, mapped + into caller's address space. + + If this is being called for the first time, the exporter can now choose to + scan through the list of attachments for this buffer, collate the requirements + of the attached devices, and choose an appropriate backing storage for the + buffer. 
+ + Based on enum dma_data_direction, it might be possible to have multiple users + accessing at the same time (for reading, maybe), or any other kind of sharing + that the exporter might wish to make available to buffer-users. + + map_dma_buf() operation can return -EINTR if it is interrupted by a signal. + + +5. When finished, the buffer-user notifies end-of-DMA to exporter + + Once the DMA for the current buffer-user is over, it signals 'end-of-DMA' to + the exporter using the dma_buf_unmap_attachment API. + + Interface: + void dma_buf_unmap_attachment(struct dma_buf_attachment *, + struct sg_table *); + + This is a wrapper to dma_buf->ops->unmap_dma_buf() operation, which hides the + "dma_buf->ops->" indirection from the users of this interface. + + In struct dma_buf_ops, unmap_dma_buf is defined as + void (*unmap_dma_buf)(struct dma_buf_attachment *, struct sg_table *); + + unmap_dma_buf signifies the end-of-DMA for the attachment provided. Like + map_dma_buf, this API also must be implemented by the exporter. + + +6. when buffer-user is done using this buffer, it 'disconnects' itself from the + buffer. + + After the buffer-user has no more interest in using this buffer, it should + disconnect itself from the buffer: + + - it first detaches itself from the buffer. + + Interface: + void dma_buf_detach(struct dma_buf *dmabuf, + struct dma_buf_attachment *dmabuf_attach); + + This API removes the attachment from the list in dmabuf, and optionally calls + dma_buf->ops->detach(), if provided by exporter, for any housekeeping bits. + + - Then, the buffer-user returns the buffer reference to exporter. + + Interface: + void dma_buf_put(struct dma_buf *dmabuf); + + This API then reduces the refcount for this buffer. + + If, as a result of this call, the refcount becomes 0, the 'release' file + operation related to this fd is called. It calls the dmabuf->ops->release() + operation in turn, and frees the memory allocated for dmabuf when exported. 
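Editorial illustration (not part of the patch): the get/put reference counting described in steps 3 and 6 can be pictured with a small userspace toy model. This is only a sketch under stated assumptions. Every name below (toy_buf, toy_export, toy_get, toy_put, toy_release) is hypothetical and is not the kernel dma-buf API, and the model assumes the exported buffer starts with a single reference.

```c
/*
 * Toy userspace model of the reference counting described in the text.
 * All names here are hypothetical; this is NOT the kernel dma-buf API.
 */
struct toy_buf {
	int refcount;                       /* references currently held */
	int released;                       /* set once the release callback ran */
	void (*release)(struct toy_buf *);  /* stands in for dmabuf->ops->release() */
};

/* Release callback: only records that it was called. */
void toy_release(struct toy_buf *buf)
{
	buf->released = 1;
}

/* Models export: assumption of this model, the buffer starts with one reference. */
void toy_export(struct toy_buf *buf)
{
	buf->refcount = 1;
	buf->released = 0;
	buf->release = toy_release;
}

/* Models dma_buf_get(): a buffer-user takes a reference. */
struct toy_buf *toy_get(struct toy_buf *buf)
{
	buf->refcount++;
	return buf;
}

/* Models dma_buf_put(): drop a reference; the last put triggers release. */
void toy_put(struct toy_buf *buf)
{
	if (--buf->refcount == 0)
		buf->release(buf);
}
```

In this model a buffer-user that calls toy_get() must eventually call toy_put(), and the release callback runs only when the last holder drops its reference, which mirrors the ordering the text gives for dma_buf_put().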
+ +NOTES: +- Importance of attach-detach and {map,unmap}_dma_buf operation pairs + The attach-detach calls allow the exporter to figure out backing-storage + constraints for the currently-interested devices. This allows preferential + allocation, and/or migration of pages across different types of storage + available, if possible. + + Bracketing of DMA access with {map,unmap}_dma_buf operations is essential + to allow just-in-time backing of storage, and migration mid-way through a + use-case. + +- Migration of backing storage if needed + If after + - at least one map_dma_buf has happened, + - and the backing storage has been allocated for this buffer, + another new buffer-user intends to attach itself to this buffer, it might + be allowed, if possible for the exporter. + + In case it is allowed by the exporter: + if the new buffer-user has stricter 'backing-storage constraints', and the + exporter can handle these constraints, the exporter can just stall on the + map_dma_buf until all outstanding access is completed (as signalled by + unmap_dma_buf). + Once all users have finished accessing and have unmapped this buffer, the + exporter could potentially move the buffer to the stricter backing-storage, + and then allow further {map,unmap}_dma_buf operations from any buffer-user + from the migrated backing-storage. + + If the exporter cannot fulfil the backing-storage constraints of the new + buffer-user device as requested, dma_buf_attach() would return an error to + denote non-compatibility of the new buffer-sharing request with the current + buffer. + + If the exporter chooses not to allow an attach() operation once a + map_dma_buf() API has been called, it simply returns an error. 
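Editorial illustration (not part of the patch): the attach and migration policy in the notes above can be sketched as a userspace toy model. This is a sketch only. The struct, its fields, the integer "constraint" scale (larger means stricter) and the specific error codes are all assumptions made for illustration; the text only says that an error is returned.

```c
#include <errno.h>

/*
 * Hypothetical toy model of an exporter's attach policy.
 * Not the kernel API; a larger "constraint" value means a stricter device.
 */
struct toy_exporter {
	int mapped_once;         /* nonzero once a map has happened */
	int outstanding_maps;    /* map calls not yet matched by unmap */
	int max_constraint;      /* strictest constraint the exporter can satisfy */
	int current_constraint;  /* constraint the current backing storage satisfies */
	int allow_late_attach;   /* policy: permit attach after the first map? */
};

/* Returns 0 if the attach is accepted, a negative errno value otherwise. */
int toy_attach(const struct toy_exporter *e, int constraint)
{
	if (constraint > e->max_constraint)
		return -EINVAL;  /* assumed code: constraints cannot be fulfilled */
	if (e->mapped_once && !e->allow_late_attach)
		return -EBUSY;   /* assumed code: no attach once map has been called */
	return 0;
}

/*
 * Returns 1 when a migration to stricter storage is needed and may start now,
 * 0 when none is needed or the exporter must stall until all users unmap.
 */
int toy_can_migrate(const struct toy_exporter *e, int constraint)
{
	return constraint > e->current_constraint && e->outstanding_maps == 0;
}
```

The point of the model is the ordering: a stricter attach may be accepted while maps are outstanding, but the move to stricter storage waits until every outstanding map has been matched by an unmap.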
+ +References: +[1] struct dma_buf_ops in include/linux/dma-buf.h +[2] All interfaces mentioned above defined in include/linux/dma-buf.h diff --git a/Documentation/dontdiff b/Documentation/dontdiff index dfa6fc6..0c083c5 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -66,7 +66,6 @@ GRTAGS GSYMS GTAGS Image -Kerntypes Module.markers Module.symvers PENDING diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index d79aead..10c64c8 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -262,6 +262,7 @@ IOMAP devm_ioremap() devm_ioremap_nocache() devm_iounmap() + devm_request_and_ioremap() : checks resource, requests region, ioremaps pcim_iomap() pcim_iounmap() pcim_iomap_table() : array of mapped addresses indexed by BAR diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 3d84912..5575759 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -85,17 +85,6 @@ Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com> --------------------------- -What: Deprecated snapshot ioctls -When: 2.6.36 - -Why: The ioctls in kernel/power/user.c were marked as deprecated long time - ago. Now they notify users about that so that they need to replace - their userspace. After some more time, remove them completely. - -Who: Jiri Slaby <jirislaby@gmail.com> - ---------------------------- - What: The ieee80211_regdom module parameter When: March 2010 / desktop catchup @@ -263,8 +252,7 @@ Who: Ravikiran Thirumalai <kiran@scalex86.org> What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS (in net/core/net-sysfs.c) -When: After the only user (hal) has seen a release with the patches - for enough time, probably some time in 2010. 
+When: 3.5 Why: Over 1K .text/.data size reduction, data is available in other ways (ioctls) Who: Johannes Berg <johannes@sipsolutions.net> @@ -362,15 +350,6 @@ Who: anybody or Florian Mickler <florian@mickler.org> ---------------------------- -What: KVM paravirt mmu host support -When: January 2011 -Why: The paravirt mmu host support is slower than non-paravirt mmu, both - on newer and older hardware. It is already not exposed to the guest, - and kept only for live migration purposes. -Who: Avi Kivity <avi@redhat.com> - ----------------------------- - What: iwlwifi 50XX module parameters When: 3.0 Why: The "..50" modules parameters were used to configure 5000 series and @@ -535,6 +514,20 @@ Why: In 3.0, we can now autodetect internal 3G device and already have information log when acer-wmi initial. Who: Lee, Chun-Yi <jlee@novell.com> +--------------------------- + +What: /sys/devices/platform/_UDC_/udc/_UDC_/is_dualspeed file and + is_dualspeed line in /sys/devices/platform/ci13xxx_*/udc/device file. +When: 3.8 +Why: The is_dualspeed file is superseded by maximum_speed in the same + directory and is_dualspeed line in device file is superseded by + max_speed line in the same file. + + The maximum_speed/max_speed specifies maximum speed supported by UDC. + To check if dualspeed is supported, check if the value is >= 3. + Various possible speeds are defined in <linux/usb/ch9.h>. 
+Who: Michal Nazarewicz <mina86@mina86.com> + ---------------------------- What: The XFS nodelaylog mount option diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index d819ba1..4fca82e 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -37,15 +37,15 @@ d_manage: no no yes (ref-walk) maybe --------------------------- inode_operations --------------------------- prototypes: - int (*create) (struct inode *,struct dentry *,int, struct nameidata *); + int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *); struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid ata *); int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct inode *,struct dentry *,const char *); - int (*mkdir) (struct inode *,struct dentry *,int); + int (*mkdir) (struct inode *,struct dentry *,umode_t); int (*rmdir) (struct inode *,struct dentry *); - int (*mknod) (struct inode *,struct dentry *,int,dev_t); + int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t); int (*rename) (struct inode *, struct dentry *, struct inode *, struct dentry *); int (*readlink) (struct dentry *, char __user *,int); @@ -117,7 +117,7 @@ prototypes: int (*statfs) (struct dentry *, struct kstatfs *); int (*remount_fs) (struct super_block *, int *, char *); void (*umount_begin) (struct super_block *); - int (*show_options)(struct seq_file *, struct vfsmount *); + int (*show_options)(struct seq_file *, struct dentry *); ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t); diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt index 64087c3..7671352 100644 --- a/Documentation/filesystems/btrfs.txt +++ 
b/Documentation/filesystems/btrfs.txt @@ -63,8 +63,8 @@ IRC network. Userspace tools for creating and manipulating Btrfs file systems are available from the git repository at the following location: - http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git - git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git + http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git + git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git These include the following tools: diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt index dd57bb6..b40fec9 100644 --- a/Documentation/filesystems/configfs/configfs.txt +++ b/Documentation/filesystems/configfs/configfs.txt @@ -192,7 +192,7 @@ attribute value uses the store_attribute() method. struct configfs_attribute { char *ca_name; struct module *ca_owner; - mode_t ca_mode; + umode_t ca_mode; }; When a config_item wants an attribute to appear as a file in the item's diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt index 742cc06..6872c91 100644 --- a/Documentation/filesystems/debugfs.txt +++ b/Documentation/filesystems/debugfs.txt @@ -35,7 +35,7 @@ described below will work. The most general way to create a file within a debugfs directory is with: - struct dentry *debugfs_create_file(const char *name, mode_t mode, + struct dentry *debugfs_create_file(const char *name, umode_t mode, struct dentry *parent, void *data, const struct file_operations *fops); @@ -53,13 +53,13 @@ actually necessary; the debugfs code provides a number of helper functions for simple situations. 
Files containing a single integer value can be created with any of: - struct dentry *debugfs_create_u8(const char *name, mode_t mode, + struct dentry *debugfs_create_u8(const char *name, umode_t mode, struct dentry *parent, u8 *value); - struct dentry *debugfs_create_u16(const char *name, mode_t mode, + struct dentry *debugfs_create_u16(const char *name, umode_t mode, struct dentry *parent, u16 *value); - struct dentry *debugfs_create_u32(const char *name, mode_t mode, + struct dentry *debugfs_create_u32(const char *name, umode_t mode, struct dentry *parent, u32 *value); - struct dentry *debugfs_create_u64(const char *name, mode_t mode, + struct dentry *debugfs_create_u64(const char *name, umode_t mode, struct dentry *parent, u64 *value); These files support both reading and writing the given value; if a specific @@ -67,13 +67,13 @@ file should not be written to, simply set the mode bits accordingly. The values in these files are in decimal; if hexadecimal is more appropriate, the following functions can be used instead: - struct dentry *debugfs_create_x8(const char *name, mode_t mode, + struct dentry *debugfs_create_x8(const char *name, umode_t mode, struct dentry *parent, u8 *value); - struct dentry *debugfs_create_x16(const char *name, mode_t mode, + struct dentry *debugfs_create_x16(const char *name, umode_t mode, struct dentry *parent, u16 *value); - struct dentry *debugfs_create_x32(const char *name, mode_t mode, + struct dentry *debugfs_create_x32(const char *name, umode_t mode, struct dentry *parent, u32 *value); - struct dentry *debugfs_create_x64(const char *name, mode_t mode, + struct dentry *debugfs_create_x64(const char *name, umode_t mode, struct dentry *parent, u64 *value); These functions are useful as long as the developer knows the size of the @@ -81,7 +81,7 @@ value to be exported. Some types can have different widths on different architectures, though, complicating the situation somewhat. 
There is a function meant to help out in one special case: - struct dentry *debugfs_create_size_t(const char *name, mode_t mode, + struct dentry *debugfs_create_size_t(const char *name, umode_t mode, struct dentry *parent, size_t *value); @@ -90,21 +90,22 @@ a variable of type size_t. Boolean values can be placed in debugfs with: - struct dentry *debugfs_create_bool(const char *name, mode_t mode, + struct dentry *debugfs_create_bool(const char *name, umode_t mode, struct dentry *parent, u32 *value); A read on the resulting file will yield either Y (for non-zero values) or N, followed by a newline. If written to, it will accept either upper- or lower-case values, or 1 or 0. Any other input will be silently ignored. -Finally, a block of arbitrary binary data can be exported with: +Another option is exporting a block of arbitrary binary data, with +this structure and function: struct debugfs_blob_wrapper { void *data; unsigned long size; }; - struct dentry *debugfs_create_blob(const char *name, mode_t mode, + struct dentry *debugfs_create_blob(const char *name, umode_t mode, struct dentry *parent, struct debugfs_blob_wrapper *blob); @@ -115,6 +116,35 @@ can be used to export binary information, but there does not appear to be any code which does so in the mainline. Note that all files created with debugfs_create_blob() are read-only. +If you want to dump a block of registers (something that happens quite +often during development, even if little such code reaches mainline), +debugfs offers two functions: one to make a registers-only file, and +another to insert a register block in the middle of another sequential +file. 
+ + struct debugfs_reg32 { + char *name; + unsigned long offset; + }; + + struct debugfs_regset32 { + struct debugfs_reg32 *regs; + int nregs; + void __iomem *base; + }; + + struct dentry *debugfs_create_regset32(const char *name, mode_t mode, + struct dentry *parent, + struct debugfs_regset32 *regset); + + int debugfs_print_regs32(struct seq_file *s, struct debugfs_reg32 *regs, + int nregs, void __iomem *base, char *prefix); + +The "base" argument may be 0, but you may want to build the reg32 array +using __stringify, and a number of register names (macros) are actually +byte offsets over a base for the register block. + + There are a couple of other directory-oriented helper functions: struct dentry *debugfs_rename(struct dentry *old_dir, diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index 07235ca..a6619b7 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -70,7 +70,7 @@ An attribute definition is simply: struct attribute { char * name; struct module *owner; - mode_t mode; + umode_t mode; }; diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 43cbd08..3d9393b 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -225,7 +225,7 @@ struct super_operations { void (*clear_inode) (struct inode *); void (*umount_begin) (struct super_block *); - int (*show_options)(struct seq_file *, struct vfsmount *); + int (*show_options)(struct seq_file *, struct dentry *); ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); @@ -341,14 +341,14 @@ This describes how the VFS can manipulate an inode in your filesystem. 
As of kernel 2.6.22, the following members are defined: struct inode_operations { - int (*create) (struct inode *,struct dentry *,int, struct nameidata *); + int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *); struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *); int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct inode *,struct dentry *,const char *); - int (*mkdir) (struct inode *,struct dentry *,int); + int (*mkdir) (struct inode *,struct dentry *,umode_t); int (*rmdir) (struct inode *,struct dentry *); - int (*mknod) (struct inode *,struct dentry *,int,dev_t); + int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t); int (*rename) (struct inode *, struct dentry *, struct inode *, struct dentry *); int (*readlink) (struct dentry *, char __user *,int); diff --git a/Documentation/hwmon/pmbus b/Documentation/hwmon/pmbus index 15ac911..d28b591 100644 --- a/Documentation/hwmon/pmbus +++ b/Documentation/hwmon/pmbus @@ -2,9 +2,8 @@ Kernel driver pmbus ==================== Supported chips: - * Ericsson BMR45X series - DC/DC Converter - Prefixes: 'bmr450', 'bmr451', 'bmr453', 'bmr454' + * Ericsson BMR453, BMR454 + Prefixes: 'bmr453', 'bmr454' Addresses scanned: - Datasheet: http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146395 diff --git a/Documentation/hwmon/zl6100 b/Documentation/hwmon/zl6100 index 7617798..51f76a1 100644 --- a/Documentation/hwmon/zl6100 +++ b/Documentation/hwmon/zl6100 @@ -6,6 +6,10 @@ Supported chips: Prefix: 'zl2004' Addresses scanned: - Datasheet: http://www.intersil.com/data/fn/fn6847.pdf + * Intersil / Zilker Labs ZL2005 + Prefix: 'zl2005' + Addresses scanned: - + Datasheet: http://www.intersil.com/data/fn/fn6848.pdf * Intersil / Zilker Labs ZL2006 Prefix: 'zl2006' Addresses scanned: - @@ -30,6 +34,17 @@ Supported chips: Prefix: 'zl6105' Addresses scanned: - Datasheet: 
http://www.intersil.com/data/fn/fn6906.pdf + * Ericsson BMR450, BMR451 + Prefix: 'bmr450', 'bmr451' + Addresses scanned: - + Datasheet: +http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146401 + * Ericsson BMR462, BMR463, BMR464 + Prefixes: 'bmr462', 'bmr463', 'bmr464' + Addresses scanned: - + Datasheet: +http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146256 + Author: Guenter Roeck <guenter.roeck@ericsson.com> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt index 7a9e0b4..506c739 100644 --- a/Documentation/kdump/kdump.txt +++ b/Documentation/kdump/kdump.txt @@ -17,8 +17,8 @@ You can use common commands, such as cp and scp, to copy the memory image to a dump file on the local disk, or across the network to a remote system. -Kdump and kexec are currently supported on the x86, x86_64, ppc64 and ia64 -architectures. +Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64, +and s390x architectures. When the system kernel boots, it reserves a small section of memory for the dump-capture kernel. This ensures that ongoing Direct Memory Access @@ -34,11 +34,18 @@ Similarly on PPC64 machines first 32KB of physical memory is needed for booting regardless of where the kernel is loaded and to support 64K page size kexec backs up the first 64KB memory. +For s390x, when kdump is triggered, the crashkernel region is exchanged +with the region [0, crashkernel region size] and then the kdump kernel +runs in [0, crashkernel region size]. Therefore no relocatable kernel is +needed for s390x. + All of the necessary information about the system kernel's core image is encoded in the ELF format, and stored in a reserved area of memory before a crash. The physical address of the start of the ELF header is passed to the dump-capture kernel through the elfcorehdr= boot -parameter. +parameter. 
Optionally the size of the ELF header can also be passed +when using the elfcorehdr=[size[KMG]@]offset[KMG] syntax. + With the dump-capture kernel, you can access the memory image, or "old memory," in two ways: @@ -291,6 +298,10 @@ Boot into System Kernel The region may be automatically placed on ia64, see the dump-capture kernel config option notes above. + On s390x, typically use "crashkernel=xxM". The value of xx is dependent + on the memory consumption of the kdump system. In general this is not + dependent on the memory size of the production system. + Load the Dump-capture Kernel ============================ @@ -308,6 +319,8 @@ For ppc64: - Use vmlinux For ia64: - Use vmlinux or vmlinuz.gz +For s390x: + - Use image or bzImage If you are using an uncompressed vmlinux image then use following command @@ -337,6 +350,8 @@ For i386, x86_64 and ia64: For ppc64: "1 maxcpus=1 noirqdistrib reset_devices" +For s390x: + "1 maxcpus=1 cgroup_disable=memory" Notes on loading the dump-capture kernel: @@ -362,6 +377,20 @@ Notes on loading the dump-capture kernel: dump. Hence generally it is useful either to build a UP dump-capture kernel or specify maxcpus=1 option while loading dump-capture kernel. +* For s390x there are two kdump modes: If an ELF header is specified with + the elfcorehdr= kernel parameter, it is used by the kdump kernel as it + is done on all other architectures. If no elfcorehdr= kernel parameter is + specified, the s390x kdump kernel dynamically creates the header. The + second mode has the advantage that for CPU and memory hotplug, kdump does + not have to be reloaded with kexec_load(). + +* For s390x systems with many attached devices the "cio_ignore" kernel + parameter should be used for the kdump kernel in order to prevent allocation + of kernel memory for devices that are not relevant for kdump. The same + applies to systems that use SCSI/FCP devices.
In that case the + "allow_lun_scan" zfcp module parameter should be set to zero before + setting FCP devices online. + Kernel Panic ============ diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index a0c5c5f..e69a461 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -315,12 +315,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. CPU-intensive style benchmark, and it can vary highly in a microbenchmark depending on workload and compiler. - 1: only for 32-bit processes - 2: only for 64-bit processes + 32: only for 32-bit processes + 64: only for 64-bit processes on: enable for both 32- and 64-bit processes off: disable for both 32- and 64-bit processes - amd_iommu= [HW,X86-84] + amd_iommu= [HW,X86-64] Pass parameters to the AMD IOMMU driver in the system. Possible values are: fullflush - enable flushing of IO/TLB entries when @@ -1178,9 +1178,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs. Default is 0 (don't ignore, but inject #GP) - kvm.oos_shadow= [KVM] Disable out-of-sync shadow paging. - Default is 1 (enabled) - kvm.mmu_audit= [KVM] This is a R/W parameter which allows audit KVM MMU at runtime. Default is 0 (off) @@ -1885,6 +1882,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. arch_perfmon: [X86] Force use of architectural perfmon on Intel CPUs instead of the CPU specific event set. + timer: [X86] Force use of architectural NMI + timer mode (see also oprofile.timer + for generic hr timer mode) + [s390] Force legacy basic mode sampling + (report cpu_type "timer") oops=panic Always panic on oopses. Default is to just kill the process, but there is a small probability of @@ -2632,6 +2634,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. [USB] Start with the old device initialization scheme (default 0 = off). 
+ usbcore.usbfs_memory_mb= + [USB] Memory limit (in MB) for buffers allocated by + usbfs (default = 16, 0 = max = 2047). + usbcore.use_both_schemes= [USB] Try the other device initialization scheme if the first one fails (default 1 = enabled). @@ -2750,11 +2756,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. functions are at fixed addresses, they make nice targets for exploits that can control RIP. - emulate Vsyscalls turn into traps and are emulated - reasonably safely. + emulate [default] Vsyscalls turn into traps and are + emulated reasonably safely. - native [default] Vsyscalls are native syscall - instructions. + native Vsyscalls are native syscall instructions. This is a little bit faster than trapping and makes a few dynamic recompilers work better than they would in emulation mode. diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt index abf768c..5dbc99c 100644 --- a/Documentation/lockdep-design.txt +++ b/Documentation/lockdep-design.txt @@ -221,3 +221,66 @@ when the chain is validated for the first time, is then put into a hash table, which hash-table can be checked in a lockfree manner. If the locking chain occurs again later on, the hash table tells us that we dont have to validate the chain again. + +Troubleshooting: +---------------- + +The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes. +Exceeding this number will trigger the following lockdep warning: + + (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS)) + +By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical +desktop systems have less than 1,000 lock classes, so this warning +normally results from lock-class leakage or failure to properly +initialize locks. These two problems are illustrated below: + +1. Repeated module loading and unloading while running the validator + will result in lock-class leakage. 
The issue here is that each + load of the module will create a new set of lock classes for + that module's locks, but module unloading does not remove old + classes (see below discussion of reuse of lock classes for why). + Therefore, if that module is loaded and unloaded repeatedly, + the number of lock classes will eventually reach the maximum. + +2. Using structures such as arrays that have large numbers of + locks that are not explicitly initialized. For example, + a hash table with 8192 buckets where each bucket has its own + spinlock_t will consume 8192 lock classes -unless- each spinlock + is explicitly initialized at runtime, for example, using the + run-time spin_lock_init() as opposed to compile-time initializers + such as __SPIN_LOCK_UNLOCKED(). Failure to properly initialize + the per-bucket spinlocks would guarantee lock-class overflow. + In contrast, a loop that called spin_lock_init() on each lock + would place all 8192 locks into a single lock class. + + The moral of this story is that you should always explicitly + initialize your locks. + +One might argue that the validator should be modified to allow +lock classes to be reused. However, if you are tempted to make this +argument, first review the code and think through the changes that would +be required, keeping in mind that the lock classes to be removed are +likely to be linked into the lock-dependency graph. This turns out to +be harder to do than to say. + +Of course, if you do run out of lock classes, the next thing to do is +to find the offending lock classes. First, the following command gives +you the number of lock classes currently in use along with the maximum: + + grep "lock-classes" /proc/lockdep_stats + +This command produces the following output on a modest system: + + lock-classes: 748 [max: 8191] + +If the number allocated (748 above) increases continually over time, +then there is likely a leak. 
The following command can be used to +identify the leaking lock classes: + + grep "BD" /proc/lockdep + +Run the command and save the output, then compare against the output from +a later run of this command to identify the leakers. This same output +can also help you find situations where runtime lock initialization has +been omitted. diff --git a/Documentation/md.txt b/Documentation/md.txt index fc94770..993fba3 100644 --- a/Documentation/md.txt +++ b/Documentation/md.txt @@ -357,14 +357,14 @@ Each directory contains: written to, that device. state - A file recording the current state of the device in the array + A file recording the current state of the device in the array which can be a comma separated list of faulty - device has been kicked from active use due to - a detected fault or it has unacknowledged bad - blocks + a detected fault, or it has unacknowledged bad + blocks in_sync - device is a fully in-sync member of the array writemostly - device will only be subject to read - requests if there are no other options. + requests if there are no other options. This applies only to raid1 arrays. blocked - device has failed, and the failure hasn't been acknowledged yet by the metadata handler. @@ -374,6 +374,13 @@ Each directory contains: This includes spares that are in the process of being recovered to write_error - device has ever seen a write error. + want_replacement - device is (mostly) working but probably + should be replaced, either due to errors or + due to user request. + replacement - device is a replacement for another active + device with same raid_disk. + + This list may grow in future. This can be written to. Writing "faulty" simulates a failure on the device. @@ -386,6 +393,13 @@ Each directory contains: Writing "in_sync" sets the in_sync flag. Writing "write_error" sets writeerrorseen flag. Writing "-write_error" clears writeerrorseen flag. + Writing "want_replacement" is allowed at any time except to a + replacement device or a spare. 
It sets the flag. + Writing "-want_replacement" is allowed at any time. It clears + the flag. + Writing "replacement" or "-replacement" is only allowed before + starting the array. It sets or clears the flag. + This file responds to select/poll. Any change to 'faulty' or 'blocked' causes an event. diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index bbce121..9ad9dde 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -144,6 +144,8 @@ nfc.txt - The Linux Near Field Communication (NFC) subsystem. olympic.txt - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info. +openvswitch.txt + - Open vSwitch developer documentation. operstates.txt - Overview of network interface operational states. packet_mmap.txt diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt index c86d03f..221ad0c 100644 --- a/Documentation/networking/batman-adv.txt +++ b/Documentation/networking/batman-adv.txt @@ -200,15 +200,16 @@ abled during run time. Following log_levels are defined: 0 - All debug output disabled 1 - Enable messages related to routing / flooding / broadcasting -2 - Enable route or tt entry added / changed / deleted -3 - Enable all messages +2 - Enable messages related to route added / changed / deleted +4 - Enable messages related to translation table operations +7 - Enable all messages The debug output can be changed at runtime using the file /sys/class/net/bat0/mesh/log_level. e.g. # echo 2 > /sys/class/net/bat0/mesh/log_level -will enable debug messages for when routes or TTs change. +will enable debug messages for when routes change. BATCTL diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 91df678..080ad26 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -196,6 +196,23 @@ or, for backwards compatibility, the option value.
E.g., The parameters are as follows: +active_slave + + Specifies the new active slave for modes that support it + (active-backup, balance-alb and balance-tlb). Possible values + are the name of any currently enslaved interface, or an empty + string. If a name is given, the slave and its link must be up in order + to be selected as the new active slave. If an empty string is + specified, the current active slave is cleared, and a new active + slave is selected automatically. + + Note that this is only available through the sysfs interface. No module + parameter by this name exists. + + The normal value of this option is the name of the currently + active slave, or the empty string if there is no active slave or + the current mode does not use an active slave. + ad_select Specifies the 802.3ad aggregation selection logic to use. The diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt index f41ea24..1dc1c24 100644 --- a/Documentation/networking/ieee802154.txt +++ b/Documentation/networking/ieee802154.txt @@ -78,3 +78,30 @@ in software. This is currently WIP. See header include/net/mac802154.h and several drivers in drivers/ieee802154/. +6LoWPAN Linux implementation +============================ + +The IEEE 802.15.4 standard specifies an MTU of 128 bytes, yielding about 80 +octets of actual MAC payload once security is turned on, on a wireless link +with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format +[RFC4944] was specified to carry IPv6 datagrams over such constrained links, +taking into account limited bandwidth, memory, or energy resources that are +expected in applications such as wireless Sensor Networks. 
[RFC4944] defines +a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header +to support the IPv6 minimum MTU requirement [RFC2460], and stateless header +compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the +relatively large IPv6 and UDP headers down to (in the best case) several bytes. + +In September 2011 an update to the standard, [RFC6282], was published. +It deprecates HC1 and HC2 compression and defines the IPHC encoding format, which is +used in this Linux implementation. + +All the code related to 6lowpan can be found in the files net/ieee802154/6lowpan.* + +To set up a 6lowpan interface you need (busybox release > 1.17.0): +1. Add an IEEE802.15.4 interface and initialize PANid; +2. Add a 6lowpan interface with a command like: + # ip link add link wpan0 name lowpan0 type lowpan +3. Set the MAC address (if needed): + # ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be +4. Bring up the 'lowpan0' interface diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c index 65968fb..ac5debb 100644 --- a/Documentation/networking/ifenslave.c +++ b/Documentation/networking/ifenslave.c @@ -539,12 +539,14 @@ static int if_getconfig(char *ifname) metric = 0; } else metric = ifr.ifr_metric; + printf("The result of SIOCGIFMETRIC is %d\n", metric); strcpy(ifr.ifr_name, ifname); if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0) mtu = 0; else mtu = ifr.ifr_mtu; + printf("The result of SIOCGIFMTU is %d\n", mtu); strcpy(ifr.ifr_name, ifname); if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) { diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index f049a1c..ad3e80e 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -31,6 +31,16 @@ neigh/default/gc_thresh3 - INTEGER when using large numbers of interfaces and when communicating with large numbers of directly-connected peers.
+neigh/default/unres_qlen_bytes - INTEGER + The maximum number of bytes which may be used by packets + queued for each unresolved address by other network layers. + (added in linux 3.3) + +neigh/default/unres_qlen - INTEGER + The maximum number of packets which may be queued for each + unresolved address by other network layers. + (deprecated in linux 3.3) : use unres_qlen_bytes instead. + mtu_expires - INTEGER Time, in seconds, that cached PMTU information is kept. @@ -165,6 +175,9 @@ tcp_congestion_control - STRING connections. The algorithm "reno" is always available, but additional choices may be available based on kernel configuration. Default is set as part of kernel configuration. + For passive connections, the listener congestion control choice + is inherited. + [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ] tcp_cookie_size - INTEGER Default size of TCP Cookie Transactions (TCPCT) option, that may be @@ -282,11 +295,11 @@ tcp_max_ssthresh - INTEGER Default: 0 (off) tcp_max_syn_backlog - INTEGER - Maximal number of remembered connection requests, which are - still did not receive an acknowledgment from connecting client. - Default value is 1024 for systems with more than 128Mb of memory, - and 128 for low memory machines. If server suffers of overload, - try to increase this number. + Maximal number of remembered connection requests, which have not + received an acknowledgment from connecting client. + The minimal value is 128 for low memory machines, and it will + increase in proportion to the memory of machine. + If server suffers from overload, try increasing this number. tcp_max_tw_buckets - INTEGER Maximal number of timewait sockets held by system simultaneously. 
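As a rough illustration of the tcp_congestion_control note in the hunk above (the listener's congestion control choice being inherited by passive connections), the following userspace sketch sets and reads back the algorithm on a socket with the TCP_CONGESTION socket option. The helper names set_congestion_control() and get_congestion_control() are made up for this example, and IPPROTO_TCP is used in place of SOL_TCP, which has the same value on Linux.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: select the congestion control algorithm "name"
 * (for example "reno") on the TCP socket fd.
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 */
int set_congestion_control(int fd, const char *name)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, strlen(name));
}

/*
 * Hypothetical helper: read the algorithm currently selected on fd into
 * buf, which holds len bytes. buf is zeroed first so the result is
 * NUL-terminated whenever the name is shorter than the buffer.
 * Returns 0 on success, -1 on error.
 */
int get_congestion_control(int fd, char *buf, socklen_t len)
{
	memset(buf, 0, len);
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
}
```

Calling set_congestion_control(listenfd, "reno") on a listening socket before accepting connections would, per the text above, make accepted connections inherit that choice.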
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt new file mode 100644 index 0000000..b8a048b --- /dev/null +++ b/Documentation/networking/openvswitch.txt @@ -0,0 +1,195 @@ +Open vSwitch datapath developer documentation +============================================= + +The Open vSwitch kernel module allows flexible userspace control over +flow-level packet processing on selected network devices. It can be +used to implement a plain Ethernet switch, network device bonding, +VLAN processing, network access control, flow-based network control, +and so on. + +The kernel module implements multiple "datapaths" (analogous to +bridges), each of which can have multiple "vports" (analogous to ports +within a bridge). Each datapath also has associated with it a "flow +table" that userspace populates with "flows" that map from keys based +on packet headers and metadata to sets of actions. The most common +action forwards the packet to another vport; other actions are also +implemented. + +When a packet arrives on a vport, the kernel module processes it by +extracting its flow key and looking it up in the flow table. If there +is a matching flow, it executes the associated actions. If there is +no match, it queues the packet to userspace for processing (as part of +its processing, userspace will likely set up a flow to handle further +packets of the same type entirely in-kernel). + + +Flow key compatibility +---------------------- + +Network protocols evolve over time. New protocols become important +and existing protocols lose their prominence. For the Open vSwitch +kernel module to remain relevant, it must be possible for newer +versions to parse additional protocols as part of the flow key. It +might even be desirable, someday, to drop support for parsing +protocols that have become obsolete. 
Therefore, the Netlink interface +to Open vSwitch is designed to allow carefully written userspace +applications to work with any version of the flow key, past or future. + +To support this forward and backward compatibility, whenever the +kernel module passes a packet to userspace, it also passes along the +flow key that it parsed from the packet. Userspace then extracts its +own notion of a flow key from the packet and compares it against the +kernel-provided version: + + - If userspace's notion of the flow key for the packet matches the + kernel's, then nothing special is necessary. + + - If the kernel's flow key includes more fields than the userspace + version of the flow key, for example if the kernel decoded IPv6 + headers but userspace stopped at the Ethernet type (because it + does not understand IPv6), then again nothing special is + necessary. Userspace can still set up a flow in the usual way, + as long as it uses the kernel-provided flow key to do it. + + - If the userspace flow key includes more fields than the + kernel's, for example if userspace decoded an IPv6 header but + the kernel stopped at the Ethernet type, then userspace can + forward the packet manually, without setting up a flow in the + kernel. This case is bad for performance because every packet + that the kernel considers part of the flow must go to userspace, + but the forwarding behavior is correct. (If userspace can + determine that the values of the extra fields would not affect + forwarding behavior, then it could set up a flow anyway.) + +How flow keys evolve over time is important to making this work, so +the following sections go into detail. + + +Flow key format +--------------- + +A flow key is passed over a Netlink socket as a sequence of Netlink +attributes. Some attributes represent packet metadata, defined as any +information about a packet that cannot be extracted from the packet +itself, e.g. the vport on which the packet was received. 
Most +attributes, however, are extracted from headers within the packet, +e.g. source and destination addresses from Ethernet, IP, or TCP +headers. + +The <linux/openvswitch.h> header file defines the exact format of the +flow key attributes. For informal explanatory purposes here, we write +them as comma-separated strings, with parentheses indicating arguments +and nesting. For example, the following could represent a flow key +corresponding to a TCP packet that arrived on vport 1: + + in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4), + eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=6, tos=0, + frag=no), tcp(src=49163, dst=80) + +Often we ellipsize arguments not important to the discussion, e.g.: + + in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...) + + +Basic rule for evolving flow keys +--------------------------------- + +Some care is needed to really maintain forward and backward +compatibility for applications that follow the rules listed under +"Flow key compatibility" above. + +The basic rule is obvious: + + ------------------------------------------------------------------ + New network protocol support must only supplement existing flow + key attributes. It must not change the meaning of already defined + flow key attributes. + ------------------------------------------------------------------ + +This rule does have less-obvious consequences so it is worth working +through a few examples. Suppose, for example, that the kernel module +did not already implement VLAN parsing. Instead, it just interpreted +the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the +packet.
The flow key for any packet with an 802.1Q header would look +essentially like this, ignoring metadata: + + eth(...), eth_type(0x8100) + +Naively, to add VLAN support, it makes sense to add a new "vlan" flow +key attribute to contain the VLAN tag, then continue to decode the +encapsulated headers beyond the VLAN tag using the existing field +definitions. With this change, a TCP packet in VLAN 10 would have a +flow key much like this: + + eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...) + +But this change would negatively affect a userspace application that +has not been updated to understand the new "vlan" flow key attribute. +The application could, following the flow compatibility rules above, +ignore the "vlan" attribute that it does not understand and therefore +assume that the flow contained IP packets. This is a bad assumption +(the flow only contains IP packets if one parses and skips over the +802.1Q header) and it could cause the application's behavior to change +across kernel versions even though it follows the compatibility rules. + +The solution is to use a set of nested attributes. This is, for +example, why 802.1Q support uses nested attributes. A TCP packet in +VLAN 10 is actually expressed as: + + eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800), + ip(proto=6, ...), tcp(...)) + +Notice how the "eth_type", "ip", and "tcp" flow key attributes are +nested inside the "encap" attribute. Thus, an application that does +not understand the "vlan" key will not see any of those attributes +and therefore will not misinterpret them. (Also, the outer eth_type +is still 0x8100, not changed to 0x0800.) + +Handling malformed packets +-------------------------- + +Don't drop packets in the kernel for malformed protocol headers, bad +checksums, etc. This would prevent userspace from implementing a +simple Ethernet switch that forwards every packet.
+ +Instead, in such a case, include an attribute with "empty" content. +It doesn't matter if the empty content could be valid protocol values, +as long as those values are rarely seen in practice, because userspace +can always forward all packets with those values to userspace and +handle them individually. + +For example, consider a packet that contains an IP header that +indicates protocol 6 for TCP, but which is truncated just after the IP +header, so that the TCP header is missing. The flow key for this +packet would include a tcp attribute with all-zero src and dst, like +this: + + eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0) + +As another example, consider a packet with an Ethernet type of 0x8100, +indicating that a VLAN TCI should follow, but which is truncated just +after the Ethernet type. The flow key for this packet would include +an all-zero-bits vlan and an empty encap attribute, like this: + + eth(...), eth_type(0x8100), vlan(0), encap() + +Unlike a TCP packet with source and destination ports 0, an +all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka +VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan +attribute expressly to allow this situation to be distinguished. +Thus, the flow key in this second example unambiguously indicates a +missing or malformed VLAN TCI. + +Other rules +----------- + +The other rules for flow keys are much less subtle: + + - Duplicate attributes are not allowed at a given nesting level. + + - Ordering of attributes is not significant. + + - When the kernel sends a given flow key to userspace, it always + composes it the same way. This allows userspace to hash and + compare entire flow keys that it may not be able to fully + interpret. 
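The three cases listed under "Flow key compatibility" above can be sketched in userspace C as follows. This is a simplified illustration under stated assumptions, not the Open vSwitch implementation: flow keys are reduced to bitmasks of parsed attribute types (ignoring nesting), and the names KEY_*, enum flow_decision and decide_flow() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical attribute bits; the real attributes are defined in
 * <linux/openvswitch.h> and are Netlink attributes, not bit flags. */
#define KEY_ETHERNET  (1u << 0)
#define KEY_ETHERTYPE (1u << 1)
#define KEY_IPV6      (1u << 2)

enum flow_decision {
	INSTALL_FLOW,         /* set up a kernel flow using the kernel-provided key */
	FORWARD_IN_USERSPACE, /* forward each packet manually, no kernel flow */
};

/*
 * Compare the set of attributes the kernel parsed with the set userspace
 * parsed from the same packet.
 *
 * - Same sets, or the kernel parsed more: a flow can be installed as usual,
 *   as long as the kernel-provided key is used.
 * - Userspace parsed something the kernel did not: the kernel cannot tell
 *   these packets apart, so userspace forwards them itself.
 */
enum flow_decision decide_flow(uint32_t kernel_attrs, uint32_t user_attrs)
{
	if (user_attrs & ~kernel_attrs)
		return FORWARD_IN_USERSPACE;
	return INSTALL_FLOW;
}
```

The sketch leaves out the optimization mentioned in the text, where userspace may still install a flow if it can determine that the extra fields would not affect forwarding.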
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index 4acea66..1c08a4b 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -155,7 +155,7 @@ As capture, each frame contains two parts: /* fill sockaddr_ll struct to prepare binding */ my_addr.sll_family = AF_PACKET; - my_addr.sll_protocol = ETH_P_ALL; + my_addr.sll_protocol = htons(ETH_P_ALL); my_addr.sll_ifindex = s_ifr.ifr_ifindex; /* bind socket to eth0 */ diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt index a177de2..579994a 100644 --- a/Documentation/networking/scaling.txt +++ b/Documentation/networking/scaling.txt @@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current CPU's backlog when a packet in this flow was last enqueued. Each backlog queue has a head counter that is incremented on dequeue. A tail counter is computed as head counter + queue length. In other words, the counter -in rps_dev_flow_table[i] records the last element in flow i that has +in rps_dev_flow[i] records the last element in flow i that has been enqueued onto the currently designated CPU for flow i (of course, entry i is actually selected by hash and multiple flows may hash to the same entry i). @@ -224,7 +224,7 @@ following is true: - The current CPU's queue head counter >= the recorded tail counter value in rps_dev_flow[i] -- The current CPU is unset (equal to NR_CPUS) +- The current CPU is unset (equal to RPS_NO_CPU) - The current CPU is offline After this check, the packet is sent to the (possibly updated) current @@ -235,7 +235,7 @@ CPU. ==== RFS Configuration -RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on +RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on by default for SMP). The functionality remains disabled until explicitly configured. 
The number of entries in the global flow table is set through: @@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue would normally be configured to the same value as rps_sock_flow_entries. For a multi-queue device, the rps_flow_cnt for each queue might be configured as rps_sock_flow_entries / N, where N is the number of -queues. So for instance, if rps_flow_entries is set to 32768 and there +queues. So for instance, if rps_sock_flow_entries is set to 32768 and there are 16 configured receive queues, rps_flow_cnt for each queue might be configured as 2048. diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt index 8d67980..d0aeead 100644 --- a/Documentation/networking/stmmac.txt +++ b/Documentation/networking/stmmac.txt @@ -4,14 +4,16 @@ Copyright (C) 2007-2010 STMicroelectronics Ltd Author: Giuseppe Cavallaro <peppe.cavallaro@st.com> This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers -(Synopsys IP blocks); it has been fully tested on STLinux platforms. +(Synopsys IP blocks). Currently this network device driver is for all STM embedded MAC/GMAC -(i.e. 7xxx/5xxx SoCs) and it's known working on other platforms i.e. ARM SPEAr. +(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XILINX XC2V3000 +FF1152AMT0221 D1215994A VIRTEX FPGA board. -DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100 -Universal version 4.0 have been used for developing the first code -implementation. +DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether MAC 10/100 +Universal version 4.0 have been used for developing this driver. + +This driver supports both the platform bus and PCI. Please, for more information also visit: www.stlinux.com @@ -277,5 +279,5 @@ In fact, these can generate a huge amount of debug messages. 6) TODO: o XGMAC is not supported.
- o Review the timer optimisation code to use an embedded device that will be - available in new chip generations. + o Add the EEE - Energy Efficient Ethernet + o Add the PTP - precision time protocol diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt new file mode 100644 index 0000000..5a01368 --- /dev/null +++ b/Documentation/networking/team.txt @@ -0,0 +1,2 @@ +Team devices are driven from userspace via libteam library which is here: + https://github.com/jpirko/libteam diff --git a/Documentation/pinctrl.txt b/Documentation/pinctrl.txt index b04cb7d..6727b92 100644 --- a/Documentation/pinctrl.txt +++ b/Documentation/pinctrl.txt @@ -7,12 +7,9 @@ This subsystem deals with: - Multiplexing of pins, pads, fingers (etc) see below for details -The intention is to also deal with: - -- Software-controlled biasing and driving mode specific pins, such as - pull-up/down, open drain etc, load capacitance configuration when controlled - by software, etc. - +- Configuration of pins, pads, fingers (etc), such as software-controlled + biasing and driving mode specific pins, such as pull-up/down, open drain, + load capacitance etc. Top-level interface =================== @@ -32,7 +29,7 @@ Definition of PIN: be sparse - i.e. there may be gaps in the space with numbers where no pin exists. -When a PIN CONTROLLER is instatiated, it will register a descriptor to the +When a PIN CONTROLLER is instantiated, it will register a descriptor to the pin control framework, and this descriptor contains an array of pin descriptors describing the pins handled by this specific pin controller. @@ -61,14 +58,14 @@ this in our driver: #include <linux/pinctrl/pinctrl.h> -const struct pinctrl_pin_desc __refdata foo_pins[] = { - PINCTRL_PIN(0, "A1"), - PINCTRL_PIN(1, "A2"), - PINCTRL_PIN(2, "A3"), +const struct pinctrl_pin_desc foo_pins[] = { + PINCTRL_PIN(0, "A8"), + PINCTRL_PIN(1, "B8"), + PINCTRL_PIN(2, "C8"), ... 
- PINCTRL_PIN(61, "H6"), - PINCTRL_PIN(62, "H7"), - PINCTRL_PIN(63, "H8"), + PINCTRL_PIN(61, "F1"), + PINCTRL_PIN(62, "G1"), + PINCTRL_PIN(63, "H1"), }; static struct pinctrl_desc foo_desc = { @@ -88,11 +85,16 @@ int __init foo_probe(void) pr_err("could not register foo pin driver\n"); } +To enable the pinctrl subsystem and the subgroups for PINMUX and PINCONF and +selected drivers, you need to select them from your machine's Kconfig entry, +since these are so tightly integrated with the machines they are used on. +See for example arch/arm/mach-u300/Kconfig for an example. + Pins usually have fancier names than this. You can find these in the dataheet for your chip. Notice that the core pinctrl.h file provides a fancy macro called PINCTRL_PIN() to create the struct entries. As you can see I enumerated -the pins from 0 in the upper left corner to 63 in the lower right corner, -this enumeration was arbitrarily chosen, in practice you need to think +the pins from 0 in the upper left corner to 63 in the lower right corner. +This enumeration was arbitrarily chosen, in practice you need to think through your numbering system so that it matches the layout of registers and such things in your driver, or the code may become complicated. You must also consider matching of offsets to the GPIO ranges that may be handled by @@ -133,8 +135,8 @@ struct foo_group { const unsigned num_pins; }; -static unsigned int spi0_pins[] = { 0, 8, 16, 24 }; -static unsigned int i2c0_pins[] = { 24, 25 }; +static const unsigned int spi0_pins[] = { 0, 8, 16, 24 }; +static const unsigned int i2c0_pins[] = { 24, 25 }; static const struct foo_group foo_groups[] = { { @@ -193,6 +195,88 @@ structure, for example specific register ranges associated with each group and so on. +Pin configuration +================= + +Pins can sometimes be software-configured in an various ways, mostly related +to their electronic properties when used as inputs or outputs. 
For example you +may be able to make an output pin high impedance, or "tristate" meaning it is +effectively disconnected. You may be able to connect an input pin to VDD or GND +using a certain resistor value - pull up and pull down - so that the pin has a +stable value when nothing is driving the rail it is connected to, or when it's +unconnected. + +For example, a platform may do this: + +ret = pin_config_set("foo-dev", "FOO_GPIO_PIN", PLATFORM_X_PULL_UP); + +To pull up a pin to VDD. The pin configuration driver implements callbacks for +changing pin configuration in the pin controller ops like this: + +#include <linux/pinctrl/pinctrl.h> +#include <linux/pinctrl/pinconf.h> +#include "platform_x_pindefs.h" + +static int foo_pin_config_get(struct pinctrl_dev *pctldev, + unsigned offset, + unsigned long *config) +{ + struct my_conftype conf; + + ... Find setting for pin @ offset ... + + *config = (unsigned long) conf; +} + +static int foo_pin_config_set(struct pinctrl_dev *pctldev, + unsigned offset, + unsigned long config) +{ + struct my_conftype *conf = (struct my_conftype *) config; + + switch (conf) { + case PLATFORM_X_PULL_UP: + ... + } + } +} + +static int foo_pin_config_group_get (struct pinctrl_dev *pctldev, + unsigned selector, + unsigned long *config) +{ + ... +} + +static int foo_pin_config_group_set (struct pinctrl_dev *pctldev, + unsigned selector, + unsigned long config) +{ + ... +} + +static struct pinconf_ops foo_pconf_ops = { + .pin_config_get = foo_pin_config_get, + .pin_config_set = foo_pin_config_set, + .pin_config_group_get = foo_pin_config_group_get, + .pin_config_group_set = foo_pin_config_group_set, +}; + +/* Pin config operations are handled by some pin controller */ +static struct pinctrl_desc foo_desc = { + ... + .confops = &foo_pconf_ops, +}; + +Since some controllers have special logic for handling entire groups of pins +they can exploit the special whole-group pin control function. 
The +pin_config_group_set() callback is allowed to return the error code -EAGAIN, +for groups it does not want to handle, or if it just wants to do some +group-level handling and then fall through to iterate over all pins, in which +case each individual pin will be treated by separate pin_config_set() calls as +well. + + Interaction with the GPIO subsystem =================================== @@ -214,19 +298,20 @@ static struct pinctrl_gpio_range gpio_range_a = { .name = "chip a", .id = 0, .base = 32, + .pin_base = 32, .npins = 16, .gc = &chip_a; }; -static struct pinctrl_gpio_range gpio_range_a = { +static struct pinctrl_gpio_range gpio_range_b = { .name = "chip b", .id = 0, .base = 48, + .pin_base = 64, .npins = 8, .gc = &chip_b; }; - { struct pinctrl_dev *pctl; ... @@ -235,42 +320,39 @@ static struct pinctrl_gpio_range gpio_range_a = { } So this complex system has one pin controller handling two different -GPIO chips. Chip a has 16 pins and chip b has 8 pins. They are mapped in -the global GPIO pin space at: +GPIO chips. "chip a" has 16 pins and "chip b" has 8 pins. The "chip a" and +"chip b" have different .pin_base, which means a start pin number of the +GPIO range. + +The GPIO range of "chip a" starts from the GPIO base of 32 and actual +pin range also starts from 32. However "chip b" has different starting +offset for the GPIO range and pin range. The GPIO range of "chip b" starts +from GPIO number 48, while the pin range of "chip b" starts from 64. + +We can convert a gpio number to actual pin number using this "pin_base". +They are mapped in the global GPIO pin space at: -chip a: [32 .. 47] -chip b: [48 .. 55] +chip a: + - GPIO range : [32 .. 47] + - pin range : [32 .. 47] +chip b: + - GPIO range : [48 .. 55] + - pin range : [64 .. 
71] When GPIO-specific functions in the pin control subsystem are called, these -ranges will be used to look up the apropriate pin controller by inspecting +ranges will be used to look up the appropriate pin controller by inspecting and matching the pin to the pin ranges across all controllers. When a pin controller handling the matching range is found, GPIO-specific functions will be called on that specific pin controller. For all functionalities dealing with pin biasing, pin muxing etc, the pin controller subsystem will subtract the range's .base offset from the passed -in gpio pin number, and pass that on to the pin control driver, so the driver -will get an offset into its handled number range. Further it is also passed +in gpio number, and add the ranges's .pin_base offset to retrive a pin number. +After that, the subsystem passes it on to the pin control driver, so the driver +will get an pin number into its handled number range. Further it is also passed the range ID value, so that the pin controller knows which range it should deal with. -For example: if a user issues pinctrl_gpio_set_foo(50), the pin control -subsystem will find that the second range on this pin controller matches, -subtract the base 48 and call the -pinctrl_driver_gpio_set_foo(pinctrl, range, 2) where the latter function has -this signature: - -int pinctrl_driver_gpio_set_foo(struct pinctrl_dev *pctldev, - struct pinctrl_gpio_range *rangeid, - unsigned offset); - -Now the driver knows that we want to do some GPIO-specific operation on the -second GPIO range handled by "chip b", at offset 2 in that specific range. - -(If the GPIO subsystem is ever refactored to use a local per-GPIO controller -pin space, this mapping will need to be augmented accordingly.) - - PINMUX interfaces ================= @@ -438,7 +520,7 @@ you. Define enumerators only for the pins you can control if that makes sense. 
Assumptions: -We assume that the number possible function maps to pin groups is limited by +We assume that the number of possible function maps to pin groups is limited by the hardware. I.e. we assume that there is no system where any function can be mapped to any pin, like in a phone exchange. So the available pins groups for a certain function will be limited to a few choices (say up to eight or so), @@ -585,7 +667,7 @@ int foo_list_funcs(struct pinctrl_dev *pctldev, unsigned selector) const char *foo_get_fname(struct pinctrl_dev *pctldev, unsigned selector) { - return myfuncs[selector].name; + return foo_functions[selector].name; } static int foo_get_groups(struct pinctrl_dev *pctldev, unsigned selector, @@ -600,16 +682,16 @@ static int foo_get_groups(struct pinctrl_dev *pctldev, unsigned selector, int foo_enable(struct pinctrl_dev *pctldev, unsigned selector, unsigned group) { - u8 regbit = (1 << group); + u8 regbit = (1 << selector + group); writeb((readb(MUX)|regbit), MUX) return 0; } -int foo_disable(struct pinctrl_dev *pctldev, unsigned selector, +void foo_disable(struct pinctrl_dev *pctldev, unsigned selector, unsigned group) { - u8 regbit = (1 << group); + u8 regbit = (1 << selector + group); writeb((readb(MUX) & ~(regbit)), MUX) return 0; @@ -647,6 +729,17 @@ All the above functions are mandatory to implement for a pinmux driver. Pinmux interaction with the GPIO subsystem ========================================== +The public pinmux API contains two functions named pinmux_request_gpio() +and pinmux_free_gpio(). These two functions shall *ONLY* be called from +gpiolib-based drivers as part of their gpio_request() and +gpio_free() semantics. Likewise the pinmux_gpio_direction_[input|output] +shall only be called from within respective gpio_direction_[input|output] +gpiolib implementation. + +NOTE that platforms and individual drivers shall *NOT* request GPIO pins to be +muxed in. 
Instead, implement a proper gpiolib driver and have that driver +request proper muxing for its pins. + The function list could become long, especially if you can convert every individual pin into a GPIO pin independent of any other pins, and then try the approach to define every pin as a function. @@ -654,19 +747,24 @@ the approach to define every pin as a function. In this case, the function array would become 64 entries for each GPIO setting and then the device functions. -For this reason there is an additional function a pinmux driver can implement -to enable only GPIO on an individual pin: .gpio_request_enable(). The same -.free() function as for other functions is assumed to be usable also for -GPIO pins. +For this reason there are two functions a pinmux driver can implement +to enable only GPIO on an individual pin: .gpio_request_enable() and +.gpio_disable_free(). This function will pass in the affected GPIO range identified by the pin controller core, so you know which GPIO pins are being affected by the request operation. -Alternatively it is fully allowed to use named functions for each GPIO -pin, the pinmux_request_gpio() will attempt to obtain the function "gpioN" -where "N" is the global GPIO pin number if no special GPIO-handler is -registered. +If your driver needs to have an indication from the framework of whether the +GPIO pin shall be used for input or output you can implement the +.gpio_set_direction() function. As described this shall be called from the +gpiolib driver and the affected GPIO range, pin offset and desired direction +will be passed along to this function. + +Alternatively to using these special functions, it is fully allowed to use +named functions for each GPIO pin, the pinmux_request_gpio() will attempt to +obtain the function "gpioN" where "N" is the global GPIO pin number if no +special GPIO-handler is registered. 
Pinmux board/machine configuration @@ -683,19 +781,19 @@ spi on the second function mapping: #include <linux/pinctrl/machine.h> -static struct pinmux_map pmx_mapping[] = { +static const struct pinmux_map __initdata pmx_mapping[] = { { - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "spi0", .dev_name = "foo-spi.0", }, { - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "i2c0", .dev_name = "foo-i2c.0", }, { - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", .dev_name = "foo-mmc.0", }, @@ -714,14 +812,14 @@ for example if they are not yet instantiated or cumbersome to obtain. You register this pinmux mapping to the pinmux subsystem by simply: - ret = pinmux_register_mappings(&pmx_mapping, ARRAY_SIZE(pmx_mapping)); + ret = pinmux_register_mappings(pmx_mapping, ARRAY_SIZE(pmx_mapping)); Since the above construct is pretty common there is a helper macro to make -it even more compact which assumes you want to use pinctrl.0 and position +it even more compact which assumes you want to use pinctrl-foo and position 0 for mapping, for example: -static struct pinmux_map pmx_mapping[] = { - PINMUX_MAP_PRIMARY("I2CMAP", "i2c0", "foo-i2c.0"), +static struct pinmux_map __initdata pmx_mapping[] = { + PINMUX_MAP("I2CMAP", "pinctrl-foo", "i2c0", "foo-i2c.0"), }; @@ -734,14 +832,14 @@ As it is possible to map a function to different groups of pins an optional ... { .name = "spi0-pos-A", - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "spi0", .group = "spi0_0_grp", .dev_name = "foo-spi.0", }, { .name = "spi0-pos-B", - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "spi0", .group = "spi0_1_grp", .dev_name = "foo-spi.0", @@ -760,44 +858,44 @@ case), we define a mapping like this: ... 
{ .name "2bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_0_grp", + .group = "mmc0_1_grp", .dev_name = "foo-mmc.0", }, { .name "4bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_0_grp", + .group = "mmc0_1_grp", .dev_name = "foo-mmc.0", }, { .name "4bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_1_grp", + .group = "mmc0_2_grp", .dev_name = "foo-mmc.0", }, { .name "8bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_0_grp", + .group = "mmc0_1_grp", .dev_name = "foo-mmc.0", }, { .name "8bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_1_grp", + .group = "mmc0_2_grp", .dev_name = "foo-mmc.0", }, { .name "8bit" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "mmc0", - .group = "mmc0_2_grp", + .group = "mmc0_3_grp", .dev_name = "foo-mmc.0", }, ... @@ -898,7 +996,7 @@ like this: { .name "POWERMAP" - .ctrl_dev_name = "pinctrl.0", + .ctrl_dev_name = "pinctrl-foo", .function = "power_func", .hog_on_boot = true, }, diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt index 646a89e..20af7de 100644 --- a/Documentation/power/devices.txt +++ b/Documentation/power/devices.txt @@ -123,9 +123,12 @@ please refer directly to the source code for more information about it. Subsystem-Level Methods ----------------------- The core methods to suspend and resume devices reside in struct dev_pm_ops -pointed to by the pm member of struct bus_type, struct device_type and -struct class. They are mostly of interest to the people writing infrastructure -for buses, like PCI or USB, or device type and device class drivers. +pointed to by the ops member of struct dev_pm_domain, or by the pm member of +struct bus_type, struct device_type and struct class. 
They are mostly of +interest to the people writing infrastructure for platforms and buses, like PCI +or USB, or device type and device class drivers. They also are relevant to the +writers of device drivers whose subsystems (PM domains, device types, device +classes and bus types) don't provide all power management methods. Bus drivers implement these methods as appropriate for the hardware and the drivers using it; PCI works differently from USB, and so on. Not many people @@ -139,41 +142,57 @@ sequencing in the driver model tree. /sys/devices/.../power/wakeup files ----------------------------------- -All devices in the driver model have two flags to control handling of wakeup -events (hardware signals that can force the device and/or system out of a low -power state). These flags are initialized by bus or device driver code using +All device objects in the driver model contain fields that control the handling +of system wakeup events (hardware signals that can force the system out of a +sleep state). These fields are initialized by bus or device driver code using device_set_wakeup_capable() and device_set_wakeup_enable(), defined in include/linux/pm_wakeup.h. -The "can_wakeup" flag just records whether the device (and its driver) can +The "power.can_wakeup" flag just records whether the device (and its driver) can physically support wakeup events. The device_set_wakeup_capable() routine -affects this flag. The "should_wakeup" flag controls whether the device should -try to use its wakeup mechanism. device_set_wakeup_enable() affects this flag; -for the most part drivers should not change its value. The initial value of -should_wakeup is supposed to be false for the majority of devices; the major -exceptions are power buttons, keyboards, and Ethernet adapters whose WoL -(wake-on-LAN) feature has been set up with ethtool. 
It should also default -to true for devices that don't generate wakeup requests on their own but merely -forward wakeup requests from one bus to another (like PCI bridges). +affects this flag. The "power.wakeup" field is a pointer to an object of type +struct wakeup_source used for controlling whether or not the device should use +its system wakeup mechanism and for notifying the PM core of system wakeup +events signaled by the device. This object is only present for wakeup-capable +devices (i.e. devices whose "can_wakeup" flags are set) and is created (or +removed) by device_set_wakeup_capable(). Whether or not a device is capable of issuing wakeup events is a hardware matter, and the kernel is responsible for keeping track of it. By contrast, whether or not a wakeup-capable device should issue wakeup events is a policy decision, and it is managed by user space through a sysfs attribute: the -power/wakeup file. User space can write the strings "enabled" or "disabled" to -set or clear the "should_wakeup" flag, respectively. This file is only present -for wakeup-capable devices (i.e. devices whose "can_wakeup" flags are set) -and is created (or removed) by device_set_wakeup_capable(). Reads from the -file will return the corresponding string. - -The device_may_wakeup() routine returns true only if both flags are set. +"power/wakeup" file. User space can write the strings "enabled" or "disabled" +to it to indicate whether or not, respectively, the device is supposed to signal +system wakeup. This file is only present if the "power.wakeup" object exists +for the given device and is created (or removed) along with that object, by +device_set_wakeup_capable(). Reads from the file will return the corresponding +string. + +The "power/wakeup" file is supposed to contain the "disabled" string initially +for the majority of devices; the major exceptions are power buttons, keyboards, +and Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with +ethtool. 
It should also default to "enabled" for devices that don't generate +wakeup requests on their own but merely forward wakeup requests from one bus to +another (like PCI Express ports). + +The device_may_wakeup() routine returns true only if the "power.wakeup" object +exists and the corresponding "power/wakeup" file contains the string "enabled". This information is used by subsystems, like the PCI bus type code, to see whether or not to enable the devices' wakeup mechanisms. If device wakeup mechanisms are enabled or disabled directly by drivers, they also should use device_may_wakeup() to decide what to do during a system sleep transition. -However for runtime power management, wakeup events should be enabled whenever -the device and driver both support them, regardless of the should_wakeup flag. - +Device drivers, however, are not supposed to call device_set_wakeup_enable() +directly in any case. + +It ought to be noted that system wakeup is conceptually different from "remote +wakeup" used by runtime power management, although it may be supported by the +same physical mechanism. Remote wakeup is a feature allowing devices in +low-power states to trigger specific interrupts to signal conditions in which +they should be put into the full-power state. Those interrupts may or may not +be used to signal system wakeup events, depending on the hardware design. On +some systems it is impossible to trigger them from system sleep states. In any +case, remote wakeup should always be enabled for runtime power management for +all devices and drivers that support it. /sys/devices/.../power/control files ------------------------------------ @@ -249,23 +268,37 @@ for every device before the next phase begins. Not all busses or classes support all these callbacks and not all drivers use all the callbacks. The various phases always run after tasks have been frozen and before they are unfrozen. 
Furthermore, the *_noirq phases run at a time when IRQ handlers have -been disabled (except for those marked with the IRQ_WAKEUP flag). +been disabled (except for those marked with the IRQF_NO_SUSPEND flag). + +All phases use PM domain, bus, type, class or driver callbacks (that is, methods +defined in dev->pm_domain->ops, dev->bus->pm, dev->type->pm, dev->class->pm or +dev->driver->pm). These callbacks are regarded by the PM core as mutually +exclusive. Moreover, PM domain callbacks always take precedence over all of the +other callbacks and, for example, type callbacks take precedence over bus, class +and driver callbacks. To be precise, the following rules are used to determine +which callback to execute in the given phase: + + 1. If dev->pm_domain is present, the PM core will choose the callback + included in dev->pm_domain->ops for execution + + 2. Otherwise, if both dev->type and dev->type->pm are present, the callback + included in dev->type->pm will be chosen for execution. + + 3. Otherwise, if both dev->class and dev->class->pm are present, the + callback included in dev->class->pm will be chosen for execution. + + 4. Otherwise, if both dev->bus and dev->bus->pm are present, the callback + included in dev->bus->pm will be chosen for execution. + +This allows PM domains and device types to override callbacks provided by bus +types or device classes if necessary. -All phases use bus, type, or class callbacks (that is, methods defined in -dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually -exclusive, so if the device type provides a struct dev_pm_ops object pointed to -by its pm field (i.e. both dev->type and dev->type->pm are defined), the -callbacks included in that object (i.e. dev->type->pm) will be used. Otherwise, -if the class provides a struct dev_pm_ops object pointed to by its pm field -(i.e. both dev->class and dev->class->pm are defined), the PM core will use the -callbacks from that object (i.e. dev->class->pm). 
Finally, if the pm fields of -both the device type and class objects are NULL (or those objects do not exist), -the callbacks provided by the bus (that is, the callbacks from dev->bus->pm) -will be used (this allows device types to override callbacks provided by bus -types or classes if necessary). +The PM domain, type, class and bus callbacks may in turn invoke device- or +driver-specific methods stored in dev->driver->pm, but they don't have to do +that. -These callbacks may in turn invoke device- or driver-specific methods stored in -dev->driver->pm, but they don't have to. +If the subsystem callback chosen for execution is not present, the PM core will +execute the corresponding method from dev->driver->pm instead if there is one. Entering System Suspend @@ -283,9 +316,8 @@ When the system goes into the standby or memory sleep state, the phases are: After the prepare callback method returns, no new children may be registered below the device. The method may also prepare the device or - driver in some way for the upcoming system power transition (for - example, by allocating additional memory required for this purpose), but - it should not put the device into a low-power state. + driver in some way for the upcoming system power transition, but it + should not put the device into a low-power state. 2. The suspend methods should quiesce the device to stop it from performing I/O. They also may save the device registers and put it into the diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt index 316c2ba..6ccb68f 100644 --- a/Documentation/power/freezing-of-tasks.txt +++ b/Documentation/power/freezing-of-tasks.txt @@ -21,7 +21,7 @@ freeze_processes() (defined in kernel/power/process.c) is called. It executes try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and either wakes them up, if they are kernel threads, or sends fake signals to them, if they are user space processes. 
A task that has TIF_FREEZE set, should react -to it by calling the function called refrigerator() (defined in +to it by calling the function called __refrigerator() (defined in kernel/freezer.c), which sets the task's PF_FROZEN flag, changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it. Then, we say that the task is 'frozen' and therefore the set of functions @@ -29,10 +29,10 @@ handling this mechanism is referred to as 'the freezer' (these functions are defined in kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h). User space processes are generally frozen before kernel threads. -It is not recommended to call refrigerator() directly. Instead, it is -recommended to use the try_to_freeze() function (defined in -include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the -task enter refrigerator() if the flag is set. +__refrigerator() must not be called directly. Instead, use the +try_to_freeze() function (defined in include/linux/freezer.h), that checks +the task's TIF_FREEZE flag and makes the task enter __refrigerator() if the +flag is set. For user space processes try_to_freeze() is called automatically from the signal-handling code, but the freezable kernel threads need to call it @@ -61,13 +61,13 @@ wait_event_freezable() and wait_event_freezable_timeout() macros. After the system memory state has been restored from a hibernation image and devices have been reinitialized, the function thaw_processes() is called in order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that -have been frozen leave refrigerator() and continue running. +have been frozen leave __refrigerator() and continue running. III. Which kernel threads are freezable? Kernel threads are not freezable by default. However, a kernel thread may clear PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE -directly is strongly discouraged). 
From this point it is regarded as freezable +directly is not allowed). From this point it is regarded as freezable and must call try_to_freeze() in a suitable place. IV. Why do we do that? @@ -176,3 +176,28 @@ tasks, since it generally exists anyway. A driver must have all firmwares it may need in RAM before suspend() is called. If keeping them is not practical, for example due to their size, they must be requested early enough using the suspend notifier API described in notifiers.txt. + +VI. Are there any precautions to be taken to prevent freezing failures? + +Yes, there are. + +First of all, grabbing the 'pm_mutex' lock to mutually exclude a piece of code +from system-wide sleep such as suspend/hibernation is not encouraged. +If possible, that piece of code must instead hook onto the suspend/hibernation +notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code +(kernel/cpu.c) for an example. + +However, if that is not feasible, and grabbing 'pm_mutex' is deemed necessary, +it is strongly discouraged to directly call mutex_[un]lock(&pm_mutex) since +that could lead to freezing failures, because if the suspend/hibernate code +successfully acquired the 'pm_mutex' lock, and hence that other entity failed +to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE +state. As a consequence, the freezer would not be able to freeze that task, +leading to freezing failure. + +However, the [un]lock_system_sleep() APIs are safe to use in this scenario, +since they ask the freezer to skip freezing this task, since it is anyway +"frozen enough" as it is blocked on 'pm_mutex', which will be released +only after the entire suspend/hibernation sequence is complete. +So, to summarize, use [un]lock_system_sleep() instead of directly using +mutex_[un]lock(&pm_mutex). That would prevent freezing failures. 
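Reviewer note on the freezing-of-tasks.txt hunk above: the new section VI recommends [un]lock_system_sleep() over a bare mutex_[un]lock(&pm_mutex) when the suspend/hibernation notifiers cannot be used. A minimal sketch of what a caller might look like follows. It assumes the helpers are declared in <linux/suspend.h>. The names foo_dev and foo_update_state() are hypothetical, and the userspace stand-ins exist only so the fragment compiles outside a kernel tree; they do not reproduce the real freezer interaction.

```c
/*
 * Sketch only: foo_dev and foo_update_state() are hypothetical names.
 * In a kernel build, lock_system_sleep()/unlock_system_sleep() are
 * assumed to come from <linux/suspend.h>. The stand-ins in the #else
 * branch are NOT the kernel implementation; they only count nesting so
 * the fragment can be compiled and exercised in userspace.
 */
#ifdef __KERNEL__
#include <linux/suspend.h>
#else
#include <assert.h>
static int sleep_lock_depth;	/* stand-in for "pm_mutex is held" */
static void lock_system_sleep(void)   { sleep_lock_depth++; }
static void unlock_system_sleep(void) { sleep_lock_depth--; }
#endif

struct foo_dev {
	int state;
};

/* Update device state while excluding system-wide sleep transitions. */
static int foo_update_state(struct foo_dev *foo, int new_state)
{
	/*
	 * Deliberately not mutex_lock(&pm_mutex): per the text above, a
	 * task blocked on pm_mutex in TASK_UNINTERRUPTIBLE state cannot
	 * be frozen, which leads to a freezing failure.
	 */
	lock_system_sleep();
	foo->state = new_state;
	unlock_system_sleep();

	return 0;
}
```

The critical section is kept short on purpose: while the lock is held, suspend and hibernation cannot proceed. Where the notifier approach is feasible, the documentation still prefers it over taking the lock at all.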
diff --git a/Documentation/power/regulator/regulator.txt b/Documentation/power/regulator/regulator.txt index 3f8b528..e272d99 100644 --- a/Documentation/power/regulator/regulator.txt +++ b/Documentation/power/regulator/regulator.txt @@ -12,7 +12,7 @@ Drivers can register a regulator by calling :- struct regulator_dev *regulator_register(struct regulator_desc *regulator_desc, struct device *dev, struct regulator_init_data *init_data, - void *driver_data); + void *driver_data, struct device_node *of_node); This will register the regulators capabilities and operations to the regulator core. diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt index 5336149..4abe83e 100644 --- a/Documentation/power/runtime_pm.txt +++ b/Documentation/power/runtime_pm.txt @@ -44,98 +44,112 @@ struct dev_pm_ops { }; The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks -are executed by the PM core for either the power domain, or the device type -(if the device power domain's struct dev_pm_ops does not exist), or the class -(if the device power domain's and type's struct dev_pm_ops object does not -exist), or the bus type (if the device power domain's, type's and class' -struct dev_pm_ops objects do not exist) of the given device, so the priority -order of callbacks from high to low is that power domain callbacks, device -type callbacks, class callbacks and bus type callbacks, and the high priority -one will take precedence over low priority one. The bus type, device type and -class callbacks are referred to as subsystem-level callbacks in what follows, -and generally speaking, the power domain callbacks are used for representing -power domains within a SoC. +are executed by the PM core for the device's subsystem that may be either of +the following: + + 1. PM domain of the device, if the device's PM domain object, dev->pm_domain, + is present. + + 2. Device type of the device, if both dev->type and dev->type->pm are present. + + 3. 
Device class of the device, if both dev->class and dev->class->pm are + present. + + 4. Bus type of the device, if both dev->bus and dev->bus->pm are present. + +If the subsystem chosen by applying the above rules doesn't provide the relevant +callback, the PM core will invoke the corresponding driver callback stored in +dev->driver->pm directly (if present). + +The PM core always checks which callback to use in the order given above, so the +priority order of callbacks from high to low is: PM domain, device type, class +and bus type. Moreover, the high-priority one will always take precedence over +a low-priority one. The PM domain, bus type, device type and class callbacks +are referred to as subsystem-level callbacks in what follows. By default, the callbacks are always invoked in process context with interrupts -enabled. However, subsystems can use the pm_runtime_irq_safe() helper function -to tell the PM core that a device's ->runtime_suspend() and ->runtime_resume() -callbacks should be invoked in atomic context with interrupts disabled. -This implies that these callback routines must not block or sleep, but it also -means that the synchronous helper functions listed at the end of Section 4 can -be used within an interrupt handler or in an atomic context. - -The subsystem-level suspend callback is _entirely_ _responsible_ for handling -the suspend of the device as appropriate, which may, but need not include -executing the device driver's own ->runtime_suspend() callback (from the +enabled. However, the pm_runtime_irq_safe() helper function can be used to tell +the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume() +and ->runtime_idle() callbacks for the given device in atomic context with +interrupts disabled. 
This implies that the callback routines in question must +not block or sleep, but it also means that the synchronous helper functions +listed at the end of Section 4 may be used for that device within an interrupt +handler or generally in an atomic context. + +The subsystem-level suspend callback, if present, is _entirely_ _responsible_ +for handling the suspend of the device as appropriate, which may, but need not +include executing the device driver's own ->runtime_suspend() callback (from the PM core's point of view it is not necessary to implement a ->runtime_suspend() callback in a device driver as long as the subsystem-level suspend callback knows what to do to handle the device). - * Once the subsystem-level suspend callback has completed successfully - for given device, the PM core regards the device as suspended, which need - not mean that the device has been put into a low power state. It is - supposed to mean, however, that the device will not process data and will - not communicate with the CPU(s) and RAM until the subsystem-level resume - callback is executed for it. The runtime PM status of a device after - successful execution of the subsystem-level suspend callback is 'suspended'. - - * If the subsystem-level suspend callback returns -EBUSY or -EAGAIN, - the device's runtime PM status is 'active', which means that the device - _must_ be fully operational afterwards. - - * If the subsystem-level suspend callback returns an error code different - from -EBUSY or -EAGAIN, the PM core regards this as a fatal error and will - refuse to run the helper functions described in Section 4 for the device, - until the status of it is directly set either to 'active', or to 'suspended' - (the PM core provides special helper functions for this purpose). - -In particular, if the driver requires remote wake-up capability (i.e. 
hardware + * Once the subsystem-level suspend callback (or the driver suspend callback, + if invoked directly) has completed successfully for the given device, the PM + core regards the device as suspended, which need not mean that it has been + put into a low power state. It is supposed to mean, however, that the + device will not process data and will not communicate with the CPU(s) and + RAM until the appropriate resume callback is executed for it. The runtime + PM status of a device after successful execution of the suspend callback is + 'suspended'. + + * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM + status remains 'active', which means that the device _must_ be fully + operational afterwards. + + * If the suspend callback returns an error code different from -EBUSY and + -EAGAIN, the PM core regards this as a fatal error and will refuse to run + the helper functions described in Section 4 for the device until its status + is directly set to either'active', or 'suspended' (the PM core provides + special helper functions for this purpose). + +In particular, if the driver requires remote wakeup capability (i.e. hardware mechanism allowing the device to request a change of its power state, such as PCI PME) for proper functioning and device_run_wake() returns 'false' for the device, then ->runtime_suspend() should return -EBUSY. On the other hand, if -device_run_wake() returns 'true' for the device and the device is put into a low -power state during the execution of the subsystem-level suspend callback, it is -expected that remote wake-up will be enabled for the device. Generally, remote -wake-up should be enabled for all input devices put into a low power state at -run time. 
- -The subsystem-level resume callback is _entirely_ _responsible_ for handling the -resume of the device as appropriate, which may, but need not include executing -the device driver's own ->runtime_resume() callback (from the PM core's point of -view it is not necessary to implement a ->runtime_resume() callback in a device -driver as long as the subsystem-level resume callback knows what to do to handle -the device). - - * Once the subsystem-level resume callback has completed successfully, the PM - core regards the device as fully operational, which means that the device - _must_ be able to complete I/O operations as needed. The runtime PM status - of the device is then 'active'. - - * If the subsystem-level resume callback returns an error code, the PM core - regards this as a fatal error and will refuse to run the helper functions - described in Section 4 for the device, until its status is directly set - either to 'active' or to 'suspended' (the PM core provides special helper - functions for this purpose). - -The subsystem-level idle callback is executed by the PM core whenever the device -appears to be idle, which is indicated to the PM core by two counters, the -device's usage counter and the counter of 'active' children of the device. +device_run_wake() returns 'true' for the device and the device is put into a +low-power state during the execution of the suspend callback, it is expected +that remote wakeup will be enabled for the device. Generally, remote wakeup +should be enabled for all input devices put into low-power states at run time. 
+ +The subsystem-level resume callback, if present, is _entirely_ _responsible_ for +handling the resume of the device as appropriate, which may, but need not +include executing the device driver's own ->runtime_resume() callback (from the +PM core's point of view it is not necessary to implement a ->runtime_resume() +callback in a device driver as long as the subsystem-level resume callback knows +what to do to handle the device). + + * Once the subsystem-level resume callback (or the driver resume callback, if + invoked directly) has completed successfully, the PM core regards the device + as fully operational, which means that the device _must_ be able to complete + I/O operations as needed. The runtime PM status of the device is then + 'active'. + + * If the resume callback returns an error code, the PM core regards this as a + fatal error and will refuse to run the helper functions described in Section + 4 for the device, until its status is directly set to either 'active', or + 'suspended' (by means of special helper functions provided by the PM core + for this purpose). + +The idle callback (a subsystem-level one, if present, or the driver one) is +executed by the PM core whenever the device appears to be idle, which is +indicated to the PM core by two counters, the device's usage counter and the +counter of 'active' children of the device. * If any of these counters is decreased using a helper function provided by the PM core and it turns out to be equal to zero, the other counter is checked. If that counter also is equal to zero, the PM core executes the - subsystem-level idle callback with the device as an argument. + idle callback with the device as its argument. 
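The counter rule in the bullet above can be modeled in a few lines of userspace C. This is a sketch under assumptions: struct model_dev, its fields, model_idle() and model_dec() are hypothetical names chosen for illustration, not the PM core's data structures or helper functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one device; not the kernel's struct device. */
struct model_dev {
	int usage_count;	/* models the device's usage counter */
	int active_children;	/* models the counter of 'active' children */
	int idle_calls;		/* how many times the idle callback ran */
};

/* Stand-in for the idle callback (subsystem-level or driver one). */
void model_idle(struct model_dev *dev)
{
	dev->idle_calls++;
}

/*
 * Decrease *counter, which must be one of dev's two counters.  If it
 * drops to zero, check the other counter and run the idle callback only
 * when that one is zero as well.
 */
void model_dec(struct model_dev *dev, int *counter)
{
	int *other;

	assert(dev != NULL);
	assert(counter == &dev->usage_count ||
	       counter == &dev->active_children);
	assert(*counter > 0);

	other = (counter == &dev->usage_count) ? &dev->active_children
					       : &dev->usage_count;
	if (--(*counter) == 0 && *other == 0)
		model_idle(dev);
}
```

With both counters starting at 1, dropping the usage counter alone does not trigger the idle callback in this model; it runs only once the children counter reaches zero too.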
-The action performed by a subsystem-level idle callback is totally dependent on -the subsystem in question, but the expected and recommended action is to check +The action performed by the idle callback is totally dependent on the subsystem +(or driver) in question, but the expected and recommended action is to check if the device can be suspended (i.e. if all of the conditions necessary for suspending the device are satisfied) and to queue up a suspend request for the device in that case. The value returned by this callback is ignored by the PM core. The helper functions provided by the PM core, described in Section 4, guarantee -that the following constraints are met with respect to the bus type's runtime -PM callbacks: +that the following constraints are met with respect to runtime PM callbacks for +one device: (1) The callbacks are mutually exclusive (e.g. it is forbidden to execute ->runtime_suspend() in parallel with ->runtime_resume() or with another diff --git a/Documentation/s390/Debugging390.txt b/Documentation/s390/Debugging390.txt index efe998b..462321c 100644 --- a/Documentation/s390/Debugging390.txt +++ b/Documentation/s390/Debugging390.txt @@ -41,7 +41,6 @@ ldd Debugging modules The proc file system Starting points for debugging scripting languages etc. -Dumptool & Lcrash SysRq References Special Thanks @@ -2455,39 +2454,6 @@ jdb <filename> another fully interactive gdb style debugger. -Dumptool & Lcrash ( lkcd ) -========================== -Michael Holzheu & others here at IBM have a fairly mature port of -SGI's lcrash tool which allows one to look at kernel structures in a -running kernel. - -It also complements a tool called dumptool which dumps all the kernel's -memory pages & registers to either a tape or a disk. -This can be used by tech support or an ambitious end user do -post mortem debugging of a machine like gdb core dumps. 
- -Going into how to use this tool in detail will be explained -in other documentation supplied by IBM with the patches & the -lcrash homepage http://oss.sgi.com/projects/lkcd/ & the lcrash manpage. - -How they work -------------- -Lcrash is a perfectly normal program,however, it requires 2 -additional files, Kerntypes which is built using a patch to the -linux kernel sources in the linux root directory & the System.map. - -Kerntypes is an objectfile whose sole purpose in life -is to provide stabs debug info to lcrash, to do this -Kerntypes is built from kerntypes.c which just includes the most commonly -referenced header files used when debugging, lcrash can then read the -.stabs section of this file. - -Debugging a live system it uses /dev/mem -alternatively for post mortem debugging it uses the data -collected by dumptool. - - - SysRq ===== This is now supported by linux for s/390 & z/Architecture. diff --git a/Documentation/scsi/53c700.txt b/Documentation/scsi/53c700.txt index 0da681d..e31aceb 100644 --- a/Documentation/scsi/53c700.txt +++ b/Documentation/scsi/53c700.txt @@ -16,32 +16,13 @@ fill in to get the driver working. Compile Time Flags ================== -The driver may be either io mapped or memory mapped. This is -selectable by configuration flags: - -CONFIG_53C700_MEM_MAPPED - -define if the driver is memory mapped. - -CONFIG_53C700_IO_MAPPED - -define if the driver is to be io mapped. - -One or other of the above flags *must* be defined. - -Other flags are: +A compile time flag is: CONFIG_53C700_LE_ON_BE define if the chipset must be supported in little endian mode on a big endian architecture (used for the 700 on parisc). -CONFIG_53C700_USE_CONSISTENT - -allocate consistent memory (should only be used if your architecture -has a mixture of consistent and inconsistent memory). Fully -consistent or fully inconsistent architectures should not define this. 
- Using the Chip Core Driver ========================== diff --git a/Documentation/serial/driver b/Documentation/serial/driver index 77ba0af..0a25a91 100644 --- a/Documentation/serial/driver +++ b/Documentation/serial/driver @@ -101,7 +101,7 @@ hardware. Returns the current state of modem control inputs. The state of the outputs should not be returned, since the core keeps track of their state. The state information should include: - - TIOCM_DCD state of DCD signal + - TIOCM_CAR state of DCD signal - TIOCM_CTS state of CTS signal - TIOCM_DSR state of DSR signal - TIOCM_RI state of RI signal diff --git a/Documentation/sound/alsa/soc/machine.txt b/Documentation/sound/alsa/soc/machine.txt index 3e2ec9c..d50c14d 100644 --- a/Documentation/sound/alsa/soc/machine.txt +++ b/Documentation/sound/alsa/soc/machine.txt @@ -50,8 +50,7 @@ Machine DAI Configuration The machine DAI configuration glues all the codec and CPU DAIs together. It can also be used to set up the DAI system clock and for any machine related DAI initialisation e.g. the machine audio map can be connected to the codec audio -map, unconnected codec pins can be set as such. Please see corgi.c, spitz.c -for examples. +map, unconnected codec pins can be set as such. struct snd_soc_dai_link is used to set up each DAI in your machine. e.g. @@ -83,8 +82,7 @@ Machine Power Map The machine driver can optionally extend the codec power map and to become an audio power map of the audio subsystem. This allows for automatic power up/down of speaker/HP amplifiers, etc. Codec pins can be connected to the machines jack -sockets in the machine init function. See soc/pxa/spitz.c and dapm.txt for -details. +sockets in the machine init function. 
Machine Controls diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt index b510564..bb24c2a0e 100644 --- a/Documentation/trace/events.txt +++ b/Documentation/trace/events.txt @@ -191,8 +191,6 @@ And for string fields they are: Currently, only exact string matches are supported. -Currently, the maximum number of predicates in a filter is 16. - 5.2 Setting filters ------------------- diff --git a/Documentation/usb/linux-cdc-acm.inf b/Documentation/usb/linux-cdc-acm.inf index 37a02ce..f0ffc27 100644 --- a/Documentation/usb/linux-cdc-acm.inf +++ b/Documentation/usb/linux-cdc-acm.inf @@ -90,10 +90,10 @@ ServiceBinary=%12%\USBSER.sys [SourceDisksFiles] [SourceDisksNames] [DeviceList] -%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02 +%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00 [DeviceList.NTamd64] -%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02 +%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00 ;------------------------------------------------------------------------------ diff --git a/Documentation/usb/usbmon.txt b/Documentation/usb/usbmon.txt index a4efa04..5335fa8 100644 --- a/Documentation/usb/usbmon.txt +++ b/Documentation/usb/usbmon.txt @@ -47,10 +47,11 @@ This allows to filter away annoying devices that talk continuously. 2. Find which bus connects to the desired device -Run "cat /proc/bus/usb/devices", and find the T-line which corresponds to -the device. Usually you do it by looking for the vendor string. If you have -many similar devices, unplug one and compare two /proc/bus/usb/devices outputs. -The T-line will have a bus number. Example: +Run "cat /sys/kernel/debug/usb/devices", and find the T-line which corresponds +to the device. Usually you do it by looking for the vendor string. 
If you have +many similar devices, unplug one and compare the two +/sys/kernel/debug/usb/devices outputs. The T-line will have a bus number. +Example: T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 @@ -58,7 +59,10 @@ P: Vendor=0557 ProdID=2004 Rev= 1.00 S: Manufacturer=ATEN S: Product=UC100KM V2.00 -Bus=03 means it's bus 3. +"Bus=03" means it's bus 3. Alternatively, you can look at the output from +"lsusb" and get the bus number from the appropriate line. Example: + +Bus 003 Device 002: ID 0557:2004 ATEN UC100KM V2.00 3. Start 'cat' diff --git a/Documentation/vgaarbiter.txt b/Documentation/vgaarbiter.txt index b7d401e..014423e 100644 --- a/Documentation/vgaarbiter.txt +++ b/Documentation/vgaarbiter.txt @@ -177,7 +177,7 @@ II. Credits Benjamin Herrenschmidt (IBM?) started this work when he discussed such design with the Xorg community in 2005 [1, 2]. In the end of 2007, Paulo Zanoni and -Tiago Vignatti (both of C3SL/Federal University of Paraná) proceeded his work +Tiago Vignatti (both of C3SL/Federal University of Paraná) proceeded his work enhancing the kernel code to adapt as a kernel module and also did the implementation of the user space side [3]. Now (2009) Tiago Vignatti and Dave Airlie finally put this work in shape and queued to Jesse Barnes' PCI tree. diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 7945b0b..e1d94bf 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1100,6 +1100,15 @@ emulate them efficiently. The fields in each entry are defined as follows: eax, ebx, ecx, edx: the values returned by the cpuid instruction for this function/index combination +The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned +as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC +support. 
Instead it is reported via + + ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER) + +if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the +feature in userspace, then you can enable the feature for KVM_SET_CPUID2. + 4.47 KVM_PPC_GET_PVINFO Capability: KVM_CAP_PPC_GET_PVINFO @@ -1151,6 +1160,13 @@ following flags are specified: /* Depends on KVM_CAP_IOMMU */ #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) +The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure +isolation of the device. Usages not specifying this flag are deprecated. + +Only PCI header type 0 devices with PCI BAR resources are supported by +device assignment. The user requesting this ioctl must have read/write +access to the PCI sysfs resource files associated with the device. + 4.49 KVM_DEASSIGN_PCI_DEVICE Capability: KVM_CAP_DEVICE_DEASSIGNMENT @@ -1450,6 +1466,31 @@ is supported; 2 if the processor requires all virtual machines to have an RMA, or 1 if the processor can use an RMA but doesn't require it, because it supports the Virtual RMA (VRMA) facility. +4.64 KVM_NMI + +Capability: KVM_CAP_USER_NMI +Architectures: x86 +Type: vcpu ioctl +Parameters: none +Returns: 0 on success, -1 on error + +Queues an NMI on the thread's vcpu. Note this is well defined only +when KVM_CREATE_IRQCHIP has not been called, since this is an interface +between the virtual cpu core and virtual local APIC. After KVM_CREATE_IRQCHIP +has been called, this interface is completely emulated within the kernel. + +To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the +following algorithm: + + - pause the vpcu + - read the local APIC's state (KVM_GET_LAPIC) + - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1) + - if so, issue KVM_NMI + - resume the vcpu + +Some guests configure the LINT1 NMI input to cause a panic, aiding in +debugging. + 5. 
The kvm_run structure Application code obtains a pointer to the kvm_run structure by diff --git a/Documentation/watchdog/00-INDEX b/Documentation/watchdog/00-INDEX index fc51128..fc9082a 100644 --- a/Documentation/watchdog/00-INDEX +++ b/Documentation/watchdog/00-INDEX @@ -1,5 +1,7 @@ 00-INDEX - this file. +convert_drivers_to_kernel_api.txt + - how-to for converting old watchdog drivers to the new kernel API. hpwdt.txt - information on the HP iLO2 NMI watchdog pcwd-watchdog.txt diff --git a/Documentation/watchdog/convert_drivers_to_kernel_api.txt b/Documentation/watchdog/convert_drivers_to_kernel_api.txt index ae1e900..be8119b 100644 --- a/Documentation/watchdog/convert_drivers_to_kernel_api.txt +++ b/Documentation/watchdog/convert_drivers_to_kernel_api.txt @@ -163,6 +163,25 @@ Here is a simple example for a watchdog device: +}; +Handle the 'nowayout' feature +----------------------------- + +A few drivers use nowayout statically, i.e. there is no module parameter for it +and only CONFIG_WATCHDOG_NOWAYOUT determines if the feature is going to be +used. This needs to be converted by initializing the status variable of the +watchdog_device like this: + + .status = WATCHDOG_NOWAYOUT_INIT_STATUS, + +Most drivers, however, also allow runtime configuration of nowayout, usually +by adding a module parameter. The conversion for this would be something like: + + watchdog_set_nowayout(&s3c2410_wdd, nowayout); + +The module parameter itself needs to stay, everything else related to nowayout +can go, though. This will likely be some code in open(), close() or write(). + + Register the watchdog device ---------------------------- diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt index 4f7c894..4b93c28 100644 --- a/Documentation/watchdog/watchdog-kernel-api.txt +++ b/Documentation/watchdog/watchdog-kernel-api.txt @@ -1,6 +1,6 @@ The Linux WatchDog Timer Driver Core kernel API. 
=============================================== -Last reviewed: 22-Jul-2011 +Last reviewed: 29-Nov-2011 Wim Van Sebroeck <wim@iguana.be> @@ -142,6 +142,14 @@ bit-operations. The status bits that are defined are: * WDOG_NO_WAY_OUT: this bit stores the nowayout setting for the watchdog. If this bit is set then the watchdog timer will not be able to stop. + To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog + timer device) you can either: + * set it statically in your watchdog_device struct with + .status = WATCHDOG_NOWAYOUT_INIT_STATUS, + (this will set the value the same as CONFIG_WATCHDOG_NOWAYOUT) or + * use the following helper function: + static inline void watchdog_set_nowayout(struct watchdog_device *wdd, int nowayout) + Note: The WatchDog Timer Driver Core supports the magic close feature and the nowayout feature. To use the magic close feature you must set the WDIOF_MAGICCLOSE bit in the options field of the watchdog's info structure.
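The two ways of setting the WDOG_NO_WAY_OUT status bit described in this hunk can be mimicked in userspace as follows. This is a sketch, not kernel code: every MODEL_ or model_ name is hypothetical, the bit number and the config value are arbitrary choices for the model, and the helper is assumed to set the bit only when its nowayout argument is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real definitions live in <linux/watchdog.h>. */
#define MODEL_WDOG_NO_WAY_OUT	3	/* bit number chosen for the model */
#define MODEL_CONFIG_NOWAYOUT	0	/* models CONFIG_WATCHDOG_NOWAYOUT */
#define MODEL_NOWAYOUT_INIT_STATUS \
	((unsigned long)MODEL_CONFIG_NOWAYOUT << MODEL_WDOG_NO_WAY_OUT)

struct model_wdd {
	unsigned long status;	/* models the watchdog_device status field */
};

/* Models watchdog_set_nowayout(): set the bit only if nowayout is true. */
void model_set_nowayout(struct model_wdd *wdd, int nowayout)
{
	assert(wdd != NULL);
	if (nowayout)
		wdd->status |= 1UL << MODEL_WDOG_NO_WAY_OUT;
}

/* A stop request is honoured only while the nowayout bit is clear. */
bool model_may_stop(const struct model_wdd *wdd)
{
	assert(wdd != NULL);
	return !(wdd->status & (1UL << MODEL_WDOG_NO_WAY_OUT));
}
```

A driver-like user of this model would initialise status with MODEL_NOWAYOUT_INIT_STATUS for the static case, or call model_set_nowayout() with the value of its module parameter before "registering" the device.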