aboutsummaryrefslogtreecommitdiff
path: root/Documentation/userspace-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/userspace-api')
-rw-r--r--Documentation/userspace-api/gpio/chardev.rst116
-rw-r--r--Documentation/userspace-api/gpio/chardev_v1.rst131
-rw-r--r--Documentation/userspace-api/gpio/error-codes.rst79
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-chipinfo-ioctl.rst41
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-lineevent-ioctl.rst84
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-linehandle-ioctl.rst125
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-lineinfo-ioctl.rst54
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-lineinfo-unwatch-ioctl.rst49
-rw-r--r--Documentation/userspace-api/gpio/gpio-get-lineinfo-watch-ioctl.rst74
-rw-r--r--Documentation/userspace-api/gpio/gpio-handle-get-line-values-ioctl.rst56
-rw-r--r--Documentation/userspace-api/gpio/gpio-handle-set-config-ioctl.rst63
-rw-r--r--Documentation/userspace-api/gpio/gpio-handle-set-line-values-ioctl.rst48
-rw-r--r--Documentation/userspace-api/gpio/gpio-lineevent-data-read.rst84
-rw-r--r--Documentation/userspace-api/gpio/gpio-lineinfo-changed-read.rst87
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-get-line-ioctl.rst152
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-ioctl.rst50
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-watch-ioctl.rst67
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-line-event-read.rst83
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-line-get-values-ioctl.rst51
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-line-set-config-ioctl.rst58
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-line-set-values-ioctl.rst47
-rw-r--r--Documentation/userspace-api/gpio/gpio-v2-lineinfo-changed-read.rst81
-rw-r--r--Documentation/userspace-api/gpio/index.rst18
-rw-r--r--Documentation/userspace-api/gpio/obsolete.rst11
-rw-r--r--Documentation/userspace-api/gpio/sysfs.rst170
-rw-r--r--Documentation/userspace-api/index.rst48
-rw-r--r--Documentation/userspace-api/ioctl/ioctl-number.rst4
-rw-r--r--Documentation/userspace-api/netlink/netlink-raw.rst42
-rw-r--r--Documentation/userspace-api/perf_ring_buffer.rst830
29 files changed, 2792 insertions, 11 deletions
diff --git a/Documentation/userspace-api/gpio/chardev.rst b/Documentation/userspace-api/gpio/chardev.rst
new file mode 100644
index 000000000000..c58dd9771ac9
--- /dev/null
+++ b/Documentation/userspace-api/gpio/chardev.rst
@@ -0,0 +1,116 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+GPIO Character Device Userspace API
+===================================
+
+This is latest version (v2) of the character device API, as defined in
+``include/uapi/linux/gpio.h.``
+
+First added in 5.10.
+
+.. note::
+ Do NOT abuse userspace APIs to control hardware that has proper kernel
+ drivers. There may already be a driver for your use case, and an existing
+ kernel driver is sure to provide a superior solution to bitbashing
+ from userspace.
+
+ Read Documentation/driver-api/gpio/drivers-on-gpio.rst to avoid reinventing
+ kernel wheels in userspace.
+
+ Similarly, for multi-function lines there may be other subsystems, such as
+ Documentation/spi/index.rst, Documentation/i2c/index.rst,
+ Documentation/driver-api/pwm.rst, Documentation/w1/index.rst etc, that
+ provide suitable drivers and APIs for your hardware.
+
+Basic examples using the character device API can be found in ``tools/gpio/*``.
+
+The API is based around two major objects, the :ref:`gpio-v2-chip` and the
+:ref:`gpio-v2-line-request`.
+
+.. _gpio-v2-chip:
+
+Chip
+====
+
+The Chip represents a single GPIO chip and is exposed to userspace using device
+files of the form ``/dev/gpiochipX``.
+
+Each chip supports a number of GPIO lines,
+:c:type:`chip.lines<gpiochip_info>`. Lines on the chip are identified by an
+``offset`` in the range from 0 to ``chip.lines - 1``, i.e. `[0,chip.lines)`.
+
+Lines are requested from the chip using gpio-v2-get-line-ioctl.rst
+and the resulting line request is used to access the GPIO chip's lines or
+monitor the lines for edge events.
+
+Within this documentation, the file descriptor returned by calling `open()`
+on the GPIO device file is referred to as ``chip_fd``.
+
+Operations
+----------
+
+The following operations may be performed on the chip:
+
+.. toctree::
+ :titlesonly:
+
+ Get Line <gpio-v2-get-line-ioctl>
+ Get Chip Info <gpio-get-chipinfo-ioctl>
+ Get Line Info <gpio-v2-get-lineinfo-ioctl>
+ Watch Line Info <gpio-v2-get-lineinfo-watch-ioctl>
+ Unwatch Line Info <gpio-get-lineinfo-unwatch-ioctl>
+ Read Line Info Changed Events <gpio-v2-lineinfo-changed-read>
+
+.. _gpio-v2-line-request:
+
+Line Request
+============
+
+Line requests are created by gpio-v2-get-line-ioctl.rst and provide
+access to a set of requested lines. The line request is exposed to userspace
+via the anonymous file descriptor returned in
+:c:type:`request.fd<gpio_v2_line_request>` by gpio-v2-get-line-ioctl.rst.
+
+Within this documentation, the line request file descriptor is referred to
+as ``req_fd``.
+
+Operations
+----------
+
+The following operations may be performed on the line request:
+
+.. toctree::
+ :titlesonly:
+
+ Get Line Values <gpio-v2-line-get-values-ioctl>
+ Set Line Values <gpio-v2-line-set-values-ioctl>
+ Read Line Edge Events <gpio-v2-line-event-read>
+ Reconfigure Lines <gpio-v2-line-set-config-ioctl>
+
+Types
+=====
+
+This section contains the structs and enums that are referenced by the API v2,
+as defined in ``include/uapi/linux/gpio.h``.
+
+.. kernel-doc:: include/uapi/linux/gpio.h
+ :identifiers:
+ gpio_v2_line_attr_id
+ gpio_v2_line_attribute
+ gpio_v2_line_changed_type
+ gpio_v2_line_config
+ gpio_v2_line_config_attribute
+ gpio_v2_line_event
+ gpio_v2_line_event_id
+ gpio_v2_line_flag
+ gpio_v2_line_info
+ gpio_v2_line_info_changed
+ gpio_v2_line_request
+ gpio_v2_line_values
+ gpiochip_info
+
+.. toctree::
+ :hidden:
+
+ error-codes
diff --git a/Documentation/userspace-api/gpio/chardev_v1.rst b/Documentation/userspace-api/gpio/chardev_v1.rst
new file mode 100644
index 000000000000..67124b1d0487
--- /dev/null
+++ b/Documentation/userspace-api/gpio/chardev_v1.rst
@@ -0,0 +1,131 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================================
+GPIO Character Device Userspace API (v1)
+========================================
+
+.. warning::
+ This API is obsoleted by chardev.rst (v2).
+
+ New developments should use the v2 API, and existing developments are
+ encouraged to migrate as soon as possible, as this API will be removed
+ in the future. The v2 API is a functional superset of the v1 API so any
+ v1 call can be directly translated to a v2 equivalent.
+
+ This interface will continue to be maintained for the migration period,
+ but new features will only be added to the new API.
+
+First added in 4.8.
+
+The API is based around three major objects, the :ref:`gpio-v1-chip`, the
+:ref:`gpio-v1-line-handle`, and the :ref:`gpio-v1-line-event`.
+
+Where "line event" is used in this document it refers to the request that can
+monitor a line for edge events, not the edge events themselves.
+
+.. _gpio-v1-chip:
+
+Chip
+====
+
+The Chip represents a single GPIO chip and is exposed to userspace using device
+files of the form ``/dev/gpiochipX``.
+
+Each chip supports a number of GPIO lines,
+:c:type:`chip.lines<gpiochip_info>`. Lines on the chip are identified by an
+``offset`` in the range from 0 to ``chip.lines - 1``, i.e. `[0,chip.lines)`.
+
+Lines are requested from the chip using either gpio-get-linehandle-ioctl.rst
+and the resulting line handle is used to access the GPIO chip's lines, or
+gpio-get-lineevent-ioctl.rst and the resulting line event is used to monitor
+a GPIO line for edge events.
+
+Within this documentation, the file descriptor returned by calling `open()`
+on the GPIO device file is referred to as ``chip_fd``.
+
+Operations
+----------
+
+The following operations may be performed on the chip:
+
+.. toctree::
+ :titlesonly:
+
+ Get Line Handle <gpio-get-linehandle-ioctl>
+ Get Line Event <gpio-get-lineevent-ioctl>
+ Get Chip Info <gpio-get-chipinfo-ioctl>
+ Get Line Info <gpio-get-lineinfo-ioctl>
+ Watch Line Info <gpio-get-lineinfo-watch-ioctl>
+ Unwatch Line Info <gpio-get-lineinfo-unwatch-ioctl>
+ Read Line Info Changed Events <gpio-lineinfo-changed-read>
+
+.. _gpio-v1-line-handle:
+
+Line Handle
+===========
+
+Line handles are created by gpio-get-linehandle-ioctl.rst and provide
+access to a set of requested lines. The line handle is exposed to userspace
+via the anonymous file descriptor returned in
+:c:type:`request.fd<gpiohandle_request>` by gpio-get-linehandle-ioctl.rst.
+
+Within this documentation, the line handle file descriptor is referred to
+as ``handle_fd``.
+
+Operations
+----------
+
+The following operations may be performed on the line handle:
+
+.. toctree::
+ :titlesonly:
+
+ Get Line Values <gpio-handle-get-line-values-ioctl>
+ Set Line Values <gpio-handle-set-line-values-ioctl>
+ Reconfigure Lines <gpio-handle-set-config-ioctl>
+
+.. _gpio-v1-line-event:
+
+Line Event
+==========
+
+Line events are created by gpio-get-lineevent-ioctl.rst and provide
+access to a requested line. The line event is exposed to userspace
+via the anonymous file descriptor returned in
+:c:type:`request.fd<gpioevent_request>` by gpio-get-lineevent-ioctl.rst.
+
+Within this documentation, the line event file descriptor is referred to
+as ``event_fd``.
+
+Operations
+----------
+
+The following operations may be performed on the line event:
+
+.. toctree::
+ :titlesonly:
+
+ Get Line Value <gpio-handle-get-line-values-ioctl>
+ Read Line Edge Events <gpio-lineevent-data-read>
+
+Types
+=====
+
+This section contains the structs that are referenced by the ABI v1.
+
+The :c:type:`struct gpiochip_info<gpiochip_info>` is common to ABI v1 and v2.
+
+.. kernel-doc:: include/uapi/linux/gpio.h
+ :identifiers:
+ gpioevent_data
+ gpioevent_request
+ gpiohandle_config
+ gpiohandle_data
+ gpiohandle_request
+ gpioline_info
+ gpioline_info_changed
+
+.. toctree::
+ :hidden:
+
+ error-codes
diff --git a/Documentation/userspace-api/gpio/error-codes.rst b/Documentation/userspace-api/gpio/error-codes.rst
new file mode 100644
index 000000000000..6bf2948990cd
--- /dev/null
+++ b/Documentation/userspace-api/gpio/error-codes.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _gpio_errors:
+
+*******************
+GPIO Error Codes
+*******************
+
+.. _gpio-errors:
+
+.. tabularcolumns:: |p{2.5cm}|p{15.0cm}|
+
+.. flat-table:: Common GPIO error codes
+ :header-rows: 0
+ :stub-columns: 0
+ :widths: 1 16
+
+ - - ``EAGAIN`` (aka ``EWOULDBLOCK``)
+
+ - The device was opened in non-blocking mode and a read can't
+ be performed as there is no data available.
+
+ - - ``EBADF``
+
+ - The file descriptor is not valid.
+
+ - - ``EBUSY``
+
+ - The ioctl can't be handled because the device is busy. Typically
+ returned when an ioctl attempts something that would require the
+ usage of a resource that was already allocated. The ioctl must not
+ be retried without performing another action to fix the problem
+ first.
+
+ - - ``EFAULT``
+
+ - There was a failure while copying data from/to userspace, probably
+ caused by an invalid pointer reference.
+
+ - - ``EINVAL``
+
+ - One or more of the ioctl parameters are invalid or out of the
+ allowed range. This is a widely used error code.
+
+ - - ``ENODEV``
+
+ - Device not found or was removed.
+
+ - - ``ENOMEM``
+
+ - There's not enough memory to handle the desired operation.
+
+ - - ``EPERM``
+
+ - Permission denied. Typically returned in response to an attempt
+ to perform an action incompatible with the current line
+ configuration.
+
+ - - ``EIO``
+
+ - I/O error. Typically returned when there are problems communicating
+ with a hardware device or requesting features that hardware does not
+ support. This could indicate broken or flaky hardware.
+ It's a 'Something is wrong, I give up!' type of error.
+
+ - - ``ENXIO``
+
+ - Typically returned when a feature requiring interrupt support was
+ requested, but the line does not support interrupts.
+
+.. note::
+
+ #. This list is not exhaustive; ioctls may return other error codes.
+ Since errors may have side effects such as a driver reset,
+ applications should abort on unexpected errors, or otherwise
+ assume that the device is in a bad state.
+
+ #. Request-specific error codes are listed in the individual
+ requests descriptions.
diff --git a/Documentation/userspace-api/gpio/gpio-get-chipinfo-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-chipinfo-ioctl.rst
new file mode 100644
index 000000000000..05f07fdefe2f
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-chipinfo-ioctl.rst
@@ -0,0 +1,41 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_CHIPINFO_IOCTL:
+
+***********************
+GPIO_GET_CHIPINFO_IOCTL
+***********************
+
+Name
+====
+
+GPIO_GET_CHIPINFO_IOCTL - Get the publicly available information for a chip.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_CHIPINFO_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_CHIPINFO_IOCTL, struct gpiochip_info *info)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``info``
+ The :c:type:`chip_info<gpiochip_info>` to be populated.
+
+Description
+===========
+
+Gets the publicly available information for a particular GPIO chip.
+
+Return Value
+============
+
+On success 0 and ``info`` is populated with the chip info.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-get-lineevent-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-lineevent-ioctl.rst
new file mode 100644
index 000000000000..09a9254f38cf
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-lineevent-ioctl.rst
@@ -0,0 +1,84 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_LINEEVENT_IOCTL:
+
+************************
+GPIO_GET_LINEEVENT_IOCTL
+************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-get-line-ioctl.rst.
+
+Name
+====
+
+GPIO_GET_LINEEVENT_IOCTL - Request a line with edge detection from the kernel.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_LINEEVENT_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_LINEEVENT_IOCTL, struct gpioevent_request *request)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``request``
+ The :c:type:`event_request<gpioevent_request>` specifying the line
+ to request and its configuration.
+
+Description
+===========
+
+Request a line with edge detection from the kernel.
+
+On success, the requesting process is granted exclusive access to the line
+value and may receive events when edges are detected on the line, as
+described in gpio-lineevent-data-read.rst.
+
+The state of a line is guaranteed to remain as requested until the returned
+file descriptor is closed. Once the file descriptor is closed, the state of
+the line becomes uncontrolled from the userspace perspective, and may revert
+to its default state.
+
+Requesting a line already in use is an error (**EBUSY**).
+
+Requesting edge detection on a line that does not support interrupts is an
+error (**ENXIO**).
+
+As with the :ref:`line handle<gpio-get-linehandle-config-support>`, the
+bias configuration is best effort.
+
+Closing the ``chip_fd`` has no effect on existing line events.
+
+Configuration Rules
+-------------------
+
+The following configuration rules apply:
+
+The line event is requested as an input, so no flags specific to output lines,
+``GPIOHANDLE_REQUEST_OUTPUT``, ``GPIOHANDLE_REQUEST_OPEN_DRAIN``, or
+``GPIOHANDLE_REQUEST_OPEN_SOURCE``, may be set.
+
+Only one bias flag, ``GPIOHANDLE_REQUEST_BIAS_xxx``, may be set.
+If no bias flags are set then the bias configuration is not changed.
+
+The edge flags, ``GPIOEVENT_REQUEST_RISING_EDGE`` and
+``GPIOEVENT_REQUEST_FALLING_EDGE``, may be combined to detect both rising
+and falling edges.
+
+Requesting an invalid configuration is an error (**EINVAL**).
+
+Return Value
+============
+
+On success 0 and the :c:type:`request.fd<gpioevent_request>` contains the file
+descriptor for the request.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-get-linehandle-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-linehandle-ioctl.rst
new file mode 100644
index 000000000000..9112a9d31174
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-linehandle-ioctl.rst
@@ -0,0 +1,125 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_LINEHANDLE_IOCTL:
+
+*************************
+GPIO_GET_LINEHANDLE_IOCTL
+*************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-get-line-ioctl.rst.
+
+Name
+====
+
+GPIO_GET_LINEHANDLE_IOCTL - Request a line or lines from the kernel.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_LINEHANDLE_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_LINEHANDLE_IOCTL, struct gpiohandle_request *request)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``request``
+ The :c:type:`handle_request<gpiohandle_request>` specifying the lines to
+ request and their configuration.
+
+Description
+===========
+
+Request a line or lines from the kernel.
+
+While multiple lines may be requested, the same configuration applies to all
+lines in the request.
+
+On success, the requesting process is granted exclusive access to the line
+value and write access to the line configuration.
+
+The state of a line, including the value of output lines, is guaranteed to
+remain as requested until the returned file descriptor is closed. Once the
+file descriptor is closed, the state of the line becomes uncontrolled from
+the userspace perspective, and may revert to its default state.
+
+Requesting a line already in use is an error (**EBUSY**).
+
+Closing the ``chip_fd`` has no effect on existing line handles.
+
+.. _gpio-get-linehandle-config-rules:
+
+Configuration Rules
+-------------------
+
+The following configuration rules apply:
+
+The direction flags, ``GPIOHANDLE_REQUEST_INPUT`` and
+``GPIOHANDLE_REQUEST_OUTPUT``, cannot be combined. If neither are set then the
+only other flag that may be set is ``GPIOHANDLE_REQUEST_ACTIVE_LOW`` and the
+line is requested "as-is" to allow reading of the line value without altering
+the electrical configuration.
+
+The drive flags, ``GPIOHANDLE_REQUEST_OPEN_xxx``, require the
+``GPIOHANDLE_REQUEST_OUTPUT`` to be set.
+Only one drive flag may be set.
+If none are set then the line is assumed push-pull.
+
+Only one bias flag, ``GPIOHANDLE_REQUEST_BIAS_xxx``, may be set, and
+it requires a direction flag to also be set.
+If no bias flags are set then the bias configuration is not changed.
+
+Requesting an invalid configuration is an error (**EINVAL**).
+
+
+.. _gpio-get-linehandle-config-support:
+
+Configuration Support
+---------------------
+
+Where the requested configuration is not directly supported by the underlying
+hardware and driver, the kernel applies one of these approaches:
+
+ - reject the request
+ - emulate the feature in software
+ - treat the feature as best effort
+
+The approach applied depends on whether the feature can reasonably be emulated
+in software, and the impact on the hardware and userspace if the feature is not
+supported.
+The approach applied for each feature is as follows:
+
+============== ===========
+Feature Approach
+============== ===========
+Bias best effort
+Direction reject
+Drive emulate
+============== ===========
+
+Bias is treated as best effort to allow userspace to apply the same
+configuration for platforms that support internal bias as those that require
+external bias.
+Worst case the line floats rather than being biased as expected.
+
+Drive is emulated by switching the line to an input when the line should not
+be driven.
+
+In all cases, the configuration reported by gpio-get-lineinfo-ioctl.rst
+is the requested configuration, not the resulting hardware configuration.
+Userspace cannot determine if a feature is supported in hardware, is
+emulated, or is best effort.
+
+Return Value
+============
+
+On success 0 and the :c:type:`request.fd<gpiohandle_request>` contains the
+file descriptor for the request.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-get-lineinfo-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-lineinfo-ioctl.rst
new file mode 100644
index 000000000000..c895b8910b25
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-lineinfo-ioctl.rst
@@ -0,0 +1,54 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_LINEINFO_IOCTL:
+
+***********************
+GPIO_GET_LINEINFO_IOCTL
+***********************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-get-lineinfo-ioctl.rst.
+
+Name
+====
+
+GPIO_GET_LINEINFO_IOCTL - Get the publicly available information for a line.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_LINEINFO_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_LINEINFO_IOCTL, struct gpioline_info *info)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``info``
+ The :c:type:`line_info<gpioline_info>` to be populated, with the
+ ``offset`` field set to indicate the line to be collected.
+
+Description
+===========
+
+Get the publicly available information for a line.
+
+This information is available independent of whether the line is in use.
+
+.. note::
+ The line info does not include the line value.
+
+ The line must be requested using gpio-get-linehandle-ioctl.rst or
+ gpio-get-lineevent-ioctl.rst to access its value.
+
+Return Value
+============
+
+On success 0 and ``info`` is populated with the chip info.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-get-lineinfo-unwatch-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-lineinfo-unwatch-ioctl.rst
new file mode 100644
index 000000000000..a82d0161daf8
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-lineinfo-unwatch-ioctl.rst
@@ -0,0 +1,49 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_LINEINFO_UNWATCH_IOCTL:
+
+*******************************
+GPIO_GET_LINEINFO_UNWATCH_IOCTL
+*******************************
+
+Name
+====
+
+GPIO_GET_LINEINFO_UNWATCH_IOCTL - Disable watching a line for changes to its
+requested state and configuration information.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_LINEINFO_UNWATCH_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_LINEINFO_UNWATCH_IOCTL, u32 *offset)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``offset``
+ The offset of the line to no longer watch.
+
+Description
+===========
+
+Remove the line from the list of lines being watched on this ``chip_fd``.
+
+This is the reverse of gpio-v2-get-lineinfo-watch-ioctl.rst (v2) and
+gpio-get-lineinfo-watch-ioctl.rst (v1).
+
+Unwatching a line that is not watched is an error (**EBUSY**).
+
+First added in 5.7.
+
+Return Value
+============
+
+On success 0.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-get-lineinfo-watch-ioctl.rst b/Documentation/userspace-api/gpio/gpio-get-lineinfo-watch-ioctl.rst
new file mode 100644
index 000000000000..f5c92b69a496
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-get-lineinfo-watch-ioctl.rst
@@ -0,0 +1,74 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_GET_LINEINFO_WATCH_IOCTL:
+
+*****************************
+GPIO_GET_LINEINFO_WATCH_IOCTL
+*****************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-get-lineinfo-watch-ioctl.rst.
+
+Name
+====
+
+GPIO_GET_LINEINFO_WATCH_IOCTL - Enable watching a line for changes to its
+request state and configuration information.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_GET_LINEINFO_WATCH_IOCTL
+
+``int ioctl(int chip_fd, GPIO_GET_LINEINFO_WATCH_IOCTL, struct gpioline_info *info)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``info``
+ The :c:type:`line_info<gpioline_info>` struct to be populated, with
+ the ``offset`` set to indicate the line to watch
+
+Description
+===========
+
+Enable watching a line for changes to its request state and configuration
+information. Changes to line info include a line being requested, released
+or reconfigured.
+
+.. note::
+ Watching line info is not generally required, and would typically only be
+ used by a system monitoring component.
+
+ The line info does NOT include the line value.
+
+ The line must be requested using gpio-get-linehandle-ioctl.rst or
+ gpio-get-lineevent-ioctl.rst to access its value, and the line event can
+ monitor a line for events using gpio-lineevent-data-read.rst.
+
+By default all lines are unwatched when the GPIO chip is opened.
+
+Multiple lines may be watched simultaneously by adding a watch for each.
+
+Once a watch is set, any changes to line info will generate events which can be
+read from the ``chip_fd`` as described in
+gpio-lineinfo-changed-read.rst.
+
+Adding a watch to a line that is already watched is an error (**EBUSY**).
+
+Watches are specific to the ``chip_fd`` and are independent of watches
+on the same GPIO chip opened with a separate call to `open()`.
+
+First added in 5.7.
+
+Return Value
+============
+
+On success 0 and ``info`` is populated with the current line info.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-handle-get-line-values-ioctl.rst b/Documentation/userspace-api/gpio/gpio-handle-get-line-values-ioctl.rst
new file mode 100644
index 000000000000..25263b8f0588
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-handle-get-line-values-ioctl.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIOHANDLE_GET_LINE_VALUES_IOCTL:
+
+********************************
+GPIOHANDLE_GET_LINE_VALUES_IOCTL
+********************************
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-line-get-values-ioctl.rst.
+
+Name
+====
+
+GPIOHANDLE_GET_LINE_VALUES_IOCTL - Get the values of all requested lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIOHANDLE_GET_LINE_VALUES_IOCTL
+
+``int ioctl(int handle_fd, GPIOHANDLE_GET_LINE_VALUES_IOCTL, struct gpiohandle_data *values)``
+
+Arguments
+=========
+
+``handle_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpiohandle_request>` by gpio-get-linehandle-ioctl.rst.
+
+``values``
+ The :c:type:`line_values<gpiohandle_data>` to be populated.
+
+Description
+===========
+
+Get the values of all requested lines.
+
+The values of both input and output lines may be read.
+
+For output lines, the value returned is driver and configuration dependent and
+may be either the output buffer (the last requested value set) or the input
+buffer (the actual level of the line), and depending on the hardware and
+configuration these may differ.
+
+This ioctl can also be used to read the line value for line events,
+substituting the ``event_fd`` for the ``handle_fd``. As there is only
+one line requested in that case, only the one value is returned in ``values``.
+
+Return Value
+============
+
+On success 0 and ``values`` populated with the values read.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-handle-set-config-ioctl.rst b/Documentation/userspace-api/gpio/gpio-handle-set-config-ioctl.rst
new file mode 100644
index 000000000000..d002a84681ac
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-handle-set-config-ioctl.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIOHANDLE_SET_CONFIG_IOCTL:
+
+***************************
+GPIOHANDLE_SET_CONFIG_IOCTL
+***************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-line-set-config-ioctl.rst.
+
+Name
+====
+
+GPIOHANDLE_SET_CONFIG_IOCTL - Update the configuration of previously requested lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIOHANDLE_SET_CONFIG_IOCTL
+
+``int ioctl(int handle_fd, GPIOHANDLE_SET_CONFIG_IOCTL, struct gpiohandle_config *config)``
+
+Arguments
+=========
+
+``handle_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpiohandle_request>` by gpio-get-linehandle-ioctl.rst.
+
+``config``
+ The new :c:type:`configuration<gpiohandle_config>` to apply to the
+ requested lines.
+
+Description
+===========
+
+Update the configuration of previously requested lines, without releasing the
+line or introducing potential glitches.
+
+The configuration applies to all requested lines.
+
+The same :ref:`gpio-get-linehandle-config-rules` and
+:ref:`gpio-get-linehandle-config-support` that apply when requesting the
+lines also apply when updating the line configuration.
+
+The motivating use case for this command is changing direction of
+bi-directional lines between input and output, but it may be used more
+generally to move lines seamlessly from one configuration state to another.
+
+To only change the value of output lines, use
+gpio-handle-set-line-values-ioctl.rst.
+
+First added in 5.5.
+
+Return Value
+============
+
+On success 0.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-handle-set-line-values-ioctl.rst b/Documentation/userspace-api/gpio/gpio-handle-set-line-values-ioctl.rst
new file mode 100644
index 000000000000..0aa05e623a6c
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-handle-set-line-values-ioctl.rst
@@ -0,0 +1,48 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_HANDLE_SET_LINE_VALUES_IOCTL:
+
+*********************************
+GPIO_HANDLE_SET_LINE_VALUES_IOCTL
+*********************************
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-line-set-values-ioctl.rst.
+
+Name
+====
+
+GPIO_HANDLE_SET_LINE_VALUES_IOCTL - Set the values of all requested output lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_HANDLE_SET_LINE_VALUES_IOCTL
+
+``int ioctl(int handle_fd, GPIO_HANDLE_SET_LINE_VALUES_IOCTL, struct gpiohandle_data *values)``
+
+Arguments
+=========
+
+``handle_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpiohandle_request>` by gpio-get-linehandle-ioctl.rst.
+
+``values``
+ The :c:type:`line_values<gpiohandle_data>` to set.
+
+Description
+===========
+
+Set the values of all requested output lines.
+
+Only the values of output lines may be set.
+Attempting to set the value of input lines is an error (**EPERM**).
+
+Return Value
+============
+
+On success 0.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-lineevent-data-read.rst b/Documentation/userspace-api/gpio/gpio-lineevent-data-read.rst
new file mode 100644
index 000000000000..68b8d4f9f604
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-lineevent-data-read.rst
@@ -0,0 +1,84 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_LINEEVENT_DATA_READ:
+
+************************
+GPIO_LINEEVENT_DATA_READ
+************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-line-event-read.rst.
+
+Name
+====
+
+GPIO_LINEEVENT_DATA_READ - Read edge detection events from a line event.
+
+Synopsis
+========
+
+``int read(int event_fd, void *buf, size_t count)``
+
+Arguments
+=========
+
+``event_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpioevent_request>` by gpio-get-lineevent-ioctl.rst.
+
+``buf``
+ The buffer to contain the :c:type:`events<gpioevent_data>`.
+
+``count``
+ The number of bytes available in ``buf``, which must be at
+ least the size of a :c:type:`gpioevent_data`.
+
+Description
+===========
+
+Read edge detection events for a line from a line event.
+
+Edge detection must be enabled for the input line using either
+``GPIOEVENT_REQUEST_RISING_EDGE`` or ``GPIOEVENT_REQUEST_FALLING_EDGE``, or
+both. Edge events are then generated whenever edge interrupts are detected on
+the input line.
+
+The kernel captures and timestamps edge events as close as possible to their
+occurrence and stores them in a buffer from where they can be read by
+userspace at its convenience using `read()`.
+
+The source of the clock for :c:type:`event.timestamp<gpioevent_data>` is
+``CLOCK_MONOTONIC``, except for kernels earlier than Linux 5.7 when it was
+``CLOCK_REALTIME``. There is no indication in the :c:type:`gpioevent_data`
+as to which clock source is used, it must be determined from either the kernel
+version or sanity checks on the timestamp itself.
+
+Events read from the buffer are always in the same order that they were
+detected by the kernel.
+
+The size of the kernel event buffer is fixed at 16 events.
+
+The buffer may overflow if bursts of events occur quicker than they are read
+by userspace. If an overflow occurs then the most recent event is discarded.
+Overflow cannot be detected from userspace.
+
+To minimize the number of calls required to copy events from the kernel to
+userspace, `read()` supports copying multiple events. The number of events
+copied is the lower of the number available in the kernel buffer and the
+number that will fit in the userspace buffer (``buf``).
+
+The `read()` will block if no event is available and the ``event_fd`` has not
+been set **O_NONBLOCK**.
+
+The presence of an event can be tested for by checking that the ``event_fd`` is
+readable using `poll()` or an equivalent.
+
+Return Value
+============
+
+On success the number of bytes read, which will be a multiple of the size of
+a :c:type:`gpio_lineevent_data` event.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-lineinfo-changed-read.rst b/Documentation/userspace-api/gpio/gpio-lineinfo-changed-read.rst
new file mode 100644
index 000000000000..c4f5e1787a9d
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-lineinfo-changed-read.rst
@@ -0,0 +1,87 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_LINEINFO_CHANGED_READ:
+
+**************************
+GPIO_LINEINFO_CHANGED_READ
+**************************
+
+.. warning::
+ This ioctl is part of chardev_v1.rst and is obsoleted by
+ gpio-v2-lineinfo-changed-read.rst.
+
+Name
+====
+
+GPIO_LINEINFO_CHANGED_READ - Read line info change events for watched lines
+from the chip.
+
+Synopsis
+========
+
+``int read(int chip_fd, void *buf, size_t count)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``buf``
+ The buffer to contain the :c:type:`events<gpioline_info_changed>`.
+
+``count``
+ The number of bytes available in ``buf``, which must be at least the size
+ of a :c:type:`gpioline_info_changed` event.
+
+Description
+===========
+
+Read line info change events for watched lines from the chip.
+
+.. note::
+ Monitoring line info changes is not generally required, and would typically
+ only be performed by a system monitoring component.
+
+ These events relate to changes in a line's request state or configuration,
+ not its value. Use gpio-lineevent-data-read.rst to receive events when a
+ line changes value.
+
+A line must be watched using gpio-get-lineinfo-watch-ioctl.rst to generate
+info changed events. Subsequently, a request, release, or reconfiguration
+of the line will generate an info changed event.
+
+The kernel timestamps events when they occur and stores them in a buffer
+from where they can be read by userspace at its convenience using `read()`.
+
+The size of the kernel event buffer is fixed at 32 events per ``chip_fd``.
+
+The buffer may overflow if bursts of events occur quicker than they are read
+by userspace. If an overflow occurs then the most recent event is discarded.
+Overflow cannot be detected from userspace.
+
+Events read from the buffer are always in the same order that they were
+detected by the kernel, including when multiple lines are being monitored by
+the one ``chip_fd``.
+
+To minimize the number of calls required to copy events from the kernel to
+userspace, `read()` supports copying multiple events. The number of events
+copied is the lower of the number available in the kernel buffer and the
+number that will fit in the userspace buffer (``buf``).
+
+A `read()` will block if no event is available and the ``chip_fd`` has not
+been set **O_NONBLOCK**.
+
+The presence of an event can be tested for by checking that the ``chip_fd`` is
+readable using `poll()` or an equivalent.
+
+First added in 5.7.
+
+Return Value
+============
+
+On success the number of bytes read, which will be a multiple of the size of
+a :c:type:`gpioline_info_changed` event.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-get-line-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-get-line-ioctl.rst
new file mode 100644
index 000000000000..56b975801b6a
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-get-line-ioctl.rst
@@ -0,0 +1,152 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_GET_LINE_IOCTL:
+
+**********************
+GPIO_V2_GET_LINE_IOCTL
+**********************
+
+Name
+====
+
+GPIO_V2_GET_LINE_IOCTL - Request a line or lines from the kernel.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_GET_LINE_IOCTL
+
+``int ioctl(int chip_fd, GPIO_V2_GET_LINE_IOCTL, struct gpio_v2_line_request *request)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``request``
+ The :c:type:`line_request<gpio_v2_line_request>` specifying the lines
+ to request and their configuration.
+
+Description
+===========
+
+On success, the requesting process is granted exclusive access to the line
+value, write access to the line configuration, and may receive events when
+edges are detected on the line, all of which are described in more detail in
+:ref:`gpio-v2-line-request`.
+
+A number of lines may be requested in the one line request, and request
+operations are performed on the requested lines by the kernel as atomically
+as possible. e.g. gpio-v2-line-get-values-ioctl.rst will read all the
+requested lines at once.
+
+The state of a line, including the value of output lines, is guaranteed to
+remain as requested until the returned file descriptor is closed. Once the
+file descriptor is closed, the state of the line becomes uncontrolled from
+the userspace perspective, and may revert to its default state.
+
+Requesting a line already in use is an error (**EBUSY**).
+
+Closing the ``chip_fd`` has no effect on existing line requests.
+
+.. _gpio-v2-get-line-config-rules:
+
+Configuration Rules
+-------------------
+
+For any given requested line, the following configuration rules apply:
+
+The direction flags, ``GPIO_V2_LINE_FLAG_INPUT`` and
+``GPIO_V2_LINE_FLAG_OUTPUT``, cannot be combined. If neither are set then
+the only other flag that may be set is ``GPIO_V2_LINE_FLAG_ACTIVE_LOW``
+and the line is requested "as-is" to allow reading of the line value
+without altering the electrical configuration.
+
+The drive flags, ``GPIO_V2_LINE_FLAG_OPEN_xxx``, require the
+``GPIO_V2_LINE_FLAG_OUTPUT`` to be set.
+Only one drive flag may be set.
+If none are set then the line is assumed push-pull.
+
+Only one bias flag, ``GPIO_V2_LINE_FLAG_BIAS_xxx``, may be set, and it
+requires a direction flag to also be set.
+If no bias flags are set then the bias configuration is not changed.
+
+The edge flags, ``GPIO_V2_LINE_FLAG_EDGE_xxx``, require
+``GPIO_V2_LINE_FLAG_INPUT`` to be set and may be combined to detect both rising
+and falling edges. Requesting edge detection from a line that does not support
+it is an error (**ENXIO**).
+
+Only one event clock flag, ``GPIO_V2_LINE_FLAG_EVENT_CLOCK_xxx``, may be set.
+If none are set then the event clock defaults to ``CLOCK_MONOTONIC``.
+The ``GPIO_V2_LINE_FLAG_EVENT_CLOCK_HTE`` flag requires supporting hardware
+and a kernel with ``CONFIG_HTE`` set. Requesting HTE from a device that
+doesn't support it is an error (**EOPNOTSUP**).
+
+The :c:type:`debounce_period_us<gpio_v2_line_attribute>` attribute may only
+be applied to lines with ``GPIO_V2_LINE_FLAG_INPUT`` set. When set, debounce
+applies to both the values returned by gpio-v2-line-get-values-ioctl.rst and
+the edges returned by gpio-v2-line-event-read.rst. If not
+supported directly by hardware, debouncing is emulated in software by the
+kernel. Requesting debounce on a line that supports neither debounce in
+hardware nor interrupts, as required for software emulation, is an error
+(**ENXIO**).
+
+Requesting an invalid configuration is an error (**EINVAL**).
+
+.. _gpio-v2-get-line-config-support:
+
+Configuration Support
+---------------------
+
+Where the requested configuration is not directly supported by the underlying
+hardware and driver, the kernel applies one of these approaches:
+
+ - reject the request
+ - emulate the feature in software
+ - treat the feature as best effort
+
+The approach applied depends on whether the feature can reasonably be emulated
+in software, and the impact on the hardware and userspace if the feature is not
+supported.
+The approach applied for each feature is as follows:
+
+============== ===========
+Feature Approach
+============== ===========
+Bias best effort
+Debounce emulate
+Direction reject
+Drive emulate
+Edge Detection reject
+============== ===========
+
+Bias is treated as best effort to allow userspace to apply the same
+configuration for platforms that support internal bias as those that require
+external bias.
+Worst case the line floats rather than being biased as expected.
+
+Debounce is emulated by applying a filter to hardware interrupts on the line.
+An edge event is generated after an edge is detected and the line remains
+stable for the debounce period.
+The event timestamp corresponds to the end of the debounce period.
+
+Drive is emulated by switching the line to an input when the line should not
+be actively driven.
+
+Edge detection requires interrupt support, and is rejected if that is not
+supported. Emulation by polling can still be performed from userspace.
+
+In all cases, the configuration reported by gpio-v2-get-lineinfo-ioctl.rst
+is the requested configuration, not the resulting hardware configuration.
+Userspace cannot determine if a feature is supported in hardware, is
+emulated, or is best effort.
+
+Return Value
+============
+
+On success 0 and the :c:type:`request.fd<gpio_v2_line_request>` contains the
+file descriptor for the request.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-ioctl.rst
new file mode 100644
index 000000000000..bc4d8df887d4
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-ioctl.rst
@@ -0,0 +1,50 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_GET_LINEINFO_IOCTL:
+
+**************************
+GPIO_V2_GET_LINEINFO_IOCTL
+**************************
+
+Name
+====
+
+GPIO_V2_GET_LINEINFO_IOCTL - Get the publicly available information for a line.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_GET_LINEINFO_IOCTL
+
+``int ioctl(int chip_fd, GPIO_V2_GET_LINEINFO_IOCTL, struct gpio_v2_line_info *info)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``info``
+ The :c:type:`line_info<gpio_v2_line_info>` to be populated, with the
+ ``offset`` field set to indicate the line to be collected.
+
+Description
+===========
+
+Get the publicly available information for a line.
+
+This information is available independent of whether the line is in use.
+
+.. note::
+ The line info does not include the line value.
+
+ The line must be requested using gpio-v2-get-line-ioctl.rst to access its
+ value.
+
+Return Value
+============
+
+On success 0 and ``info`` is populated with the chip info.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-watch-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-watch-ioctl.rst
new file mode 100644
index 000000000000..938ff85a9322
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-get-lineinfo-watch-ioctl.rst
@@ -0,0 +1,67 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_GET_LINEINFO_WATCH_IOCTL:
+
+********************************
+GPIO_V2_GET_LINEINFO_WATCH_IOCTL
+********************************
+
+Name
+====
+
+GPIO_V2_GET_LINEINFO_WATCH_IOCTL - Enable watching a line for changes to its
+request state and configuration information.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_GET_LINEINFO_WATCH_IOCTL
+
+``int ioctl(int chip_fd, GPIO_V2_GET_LINEINFO_WATCH_IOCTL, struct gpio_v2_line_info *info)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``info``
+ The :c:type:`line_info<gpio_v2_line_info>` struct to be populated, with
+ the ``offset`` set to indicate the line to watch
+
+Description
+===========
+
+Enable watching a line for changes to its request state and configuration
+information. Changes to line info include a line being requested, released
+or reconfigured.
+
+.. note::
+ Watching line info is not generally required, and would typically only be
+ used by a system monitoring component.
+
+ The line info does NOT include the line value.
+ The line must be requested using gpio-v2-get-line-ioctl.rst to access
+ its value, and the line request can monitor a line for events using
+ gpio-v2-line-event-read.rst.
+
+By default all lines are unwatched when the GPIO chip is opened.
+
+Multiple lines may be watched simultaneously by adding a watch for each.
+
+Once a watch is set, any changes to line info will generate events which can be
+read from the ``chip_fd`` as described in
+gpio-v2-lineinfo-changed-read.rst.
+
+Adding a watch to a line that is already watched is an error (**EBUSY**).
+
+Watches are specific to the ``chip_fd`` and are independent of watches
+on the same GPIO chip opened with a separate call to `open()`.
+
+Return Value
+============
+
+On success 0 and ``info`` is populated with the current line info.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-line-event-read.rst b/Documentation/userspace-api/gpio/gpio-v2-line-event-read.rst
new file mode 100644
index 000000000000..6513c23fb7ca
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-line-event-read.rst
@@ -0,0 +1,83 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_LINE_EVENT_READ:
+
+***********************
+GPIO_V2_LINE_EVENT_READ
+***********************
+
+Name
+====
+
+GPIO_V2_LINE_EVENT_READ - Read edge detection events for lines from a request.
+
+Synopsis
+========
+
+``int read(int req_fd, void *buf, size_t count)``
+
+Arguments
+=========
+
+``req_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpio_v2_line_request>` by gpio-v2-get-line-ioctl.rst.
+
+``buf``
+ The buffer to contain the :c:type:`events<gpio_v2_line_event>`.
+
+``count``
+ The number of bytes available in ``buf``, which must be at
+ least the size of a :c:type:`gpio_v2_line_event`.
+
+Description
+===========
+
+Read edge detection events for lines from a request.
+
+Edge detection must be enabled for the input line using either
+``GPIO_V2_LINE_FLAG_EDGE_RISING`` or ``GPIO_V2_LINE_FLAG_EDGE_FALLING``, or
+both. Edge events are then generated whenever edge interrupts are detected on
+the input line.
+
+The kernel captures and timestamps edge events as close as possible to their
+occurrence and stores them in a buffer from where they can be read by
+userspace at its convenience using `read()`.
+
+Events read from the buffer are always in the same order that they were
+detected by the kernel, including when multiple lines are being monitored by
+the one request.
+
+The size of the kernel event buffer is fixed at the time of line request
+creation, and can be influenced by the
+:c:type:`request.event_buffer_size<gpio_v2_line_request>`.
+The default size is 16 times the number of lines requested.
+
+The buffer may overflow if bursts of events occur quicker than they are read
+by userspace. If an overflow occurs then the oldest buffered event is
+discarded. Overflow can be detected from userspace by monitoring the event
+sequence numbers.
+
+To minimize the number of calls required to copy events from the kernel to
+userspace, `read()` supports copying multiple events. The number of events
+copied is the lower of the number available in the kernel buffer and the
+number that will fit in the userspace buffer (``buf``).
+
+Changing the edge detection flags using gpio-v2-line-set-config-ioctl.rst
+does not remove or modify the events already contained in the kernel event
+buffer.
+
+The `read()` will block if no event is available and the ``req_fd`` has not
+been set **O_NONBLOCK**.
+
+The presence of an event can be tested for by checking that the ``req_fd`` is
+readable using `poll()` or an equivalent.
+
+Return Value
+============
+
+On success the number of bytes read, which will be a multiple of the size of a
+:c:type:`gpio_v2_line_event` event.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-line-get-values-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-line-get-values-ioctl.rst
new file mode 100644
index 000000000000..e4e74a1926d8
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-line-get-values-ioctl.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_LINE_GET_VALUES_IOCTL:
+
+*****************************
+GPIO_V2_LINE_GET_VALUES_IOCTL
+*****************************
+
+Name
+====
+
+GPIO_V2_LINE_GET_VALUES_IOCTL - Get the values of requested lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_LINE_GET_VALUES_IOCTL
+
+``int ioctl(int req_fd, GPIO_V2_LINE_GET_VALUES_IOCTL, struct gpio_v2_line_values *values)``
+
+Arguments
+=========
+
+``req_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpio_v2_line_request>` by gpio-v2-get-line-ioctl.rst.
+
+``values``
+ The :c:type:`line_values<gpio_v2_line_values>` to get with the ``mask`` set
+ to indicate the subset of requested lines to get.
+
+Description
+===========
+
+Get the values of requested lines.
+
+The values of both input and output lines may be read.
+
+For output lines, the value returned is driver and configuration dependent and
+may be either the output buffer (the last requested value set) or the input
+buffer (the actual level of the line), and depending on the hardware and
+configuration these may differ.
+
+Return Value
+============
+
+On success 0 and the corresponding :c:type:`values.bits<gpio_v2_line_values>`
+contain the value read.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-line-set-config-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-line-set-config-ioctl.rst
new file mode 100644
index 000000000000..9b942a8a53ca
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-line-set-config-ioctl.rst
@@ -0,0 +1,58 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_LINE_SET_CONFIG_IOCTL:
+
+*****************************
+GPIO_V2_LINE_SET_CONFIG_IOCTL
+*****************************
+
+Name
+====
+
+GPIO_V2_LINE_SET_CONFIG_IOCTL - Update the configuration of previously requested lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_LINE_SET_CONFIG_IOCTL
+
+``int ioctl(int req_fd, GPIO_V2_LINE_SET_CONFIG_IOCTL, struct gpio_v2_line_config *config)``
+
+Arguments
+=========
+
+``req_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpio_v2_line_request>` by gpio-v2-get-line-ioctl.rst.
+
+``config``
+ The new :c:type:`configuration<gpio_v2_line_config>` to apply to the
+ requested lines.
+
+Description
+===========
+
+Update the configuration of previously requested lines, without releasing the
+line or introducing potential glitches.
+
+The new configuration must specify the configuration of all requested lines.
+
+The same :ref:`gpio-v2-get-line-config-rules` and
+:ref:`gpio-v2-get-line-config-support` that apply when requesting the lines
+also apply when updating the line configuration.
+
+The motivating use case for this command is changing direction of
+bi-directional lines between input and output, but it may also be used to
+dynamically control edge detection, or more generally move lines seamlessly
+from one configuration state to another.
+
+To only change the value of output lines, use
+gpio-v2-line-set-values-ioctl.rst.
+
+Return Value
+============
+
+On success 0.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-line-set-values-ioctl.rst b/Documentation/userspace-api/gpio/gpio-v2-line-set-values-ioctl.rst
new file mode 100644
index 000000000000..6d2d1886950b
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-line-set-values-ioctl.rst
@@ -0,0 +1,47 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_LINE_SET_VALUES_IOCTL:
+
+*****************************
+GPIO_V2_LINE_SET_VALUES_IOCTL
+*****************************
+
+Name
+====
+
+GPIO_V2_LINE_SET_VALUES_IOCTL - Set the values of requested output lines.
+
+Synopsis
+========
+
+.. c:macro:: GPIO_V2_LINE_SET_VALUES_IOCTL
+
+``int ioctl(int req_fd, GPIO_V2_LINE_SET_VALUES_IOCTL, struct gpio_v2_line_values *values)``
+
+Arguments
+=========
+
+``req_fd``
+ The file descriptor of the GPIO character device, as returned in the
+ :c:type:`request.fd<gpio_v2_line_request>` by gpio-v2-get-line-ioctl.rst.
+
+``values``
+ The :c:type:`line_values<gpio_v2_line_values>` to set with the ``mask`` set
+ to indicate the subset of requested lines to set and ``bits`` set to
+ indicate the new value.
+
+Description
+===========
+
+Set the values of requested output lines.
+
+Only the values of output lines may be set.
+Attempting to set the value of an input line is an error (**EPERM**).
+
+Return Value
+============
+
+On success 0.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/gpio-v2-lineinfo-changed-read.rst b/Documentation/userspace-api/gpio/gpio-v2-lineinfo-changed-read.rst
new file mode 100644
index 000000000000..24ad325e7253
--- /dev/null
+++ b/Documentation/userspace-api/gpio/gpio-v2-lineinfo-changed-read.rst
@@ -0,0 +1,81 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _GPIO_V2_LINEINFO_CHANGED_READ:
+
+*****************************
+GPIO_V2_LINEINFO_CHANGED_READ
+*****************************
+
+Name
+====
+
+GPIO_V2_LINEINFO_CHANGED_READ - Read line info changed events for watched
+lines from the chip.
+
+Synopsis
+========
+
+``int read(int chip_fd, void *buf, size_t count)``
+
+Arguments
+=========
+
+``chip_fd``
+ The file descriptor of the GPIO character device returned by `open()`.
+
+``buf``
+ The buffer to contain the :c:type:`events<gpio_v2_line_info_changed>`.
+
+``count``
+ The number of bytes available in ``buf``, which must be at least the size
+ of a :c:type:`gpio_v2_line_info_changed` event.
+
+Description
+===========
+
+Read line info changed events for watched lines from the chip.
+
+.. note::
+ Monitoring line info changes is not generally required, and would typically
+ only be performed by a system monitoring component.
+
+ These events relate to changes in a line's request state or configuration,
+ not its value. Use gpio-v2-line-event-read.rst to receive events when a
+ line changes value.
+
+A line must be watched using gpio-v2-get-lineinfo-watch-ioctl.rst to generate
+info changed events. Subsequently, a request, release, or reconfiguration
+of the line will generate an info changed event.
+
+The kernel timestamps events when they occur and stores them in a buffer
+from where they can be read by userspace at its convenience using `read()`.
+
+The size of the kernel event buffer is fixed at 32 events per ``chip_fd``.
+
+The buffer may overflow if bursts of events occur quicker than they are read
+by userspace. If an overflow occurs then the most recent event is discarded.
+Overflow cannot be detected from userspace.
+
+Events read from the buffer are always in the same order that they were
+detected by the kernel, including when multiple lines are being monitored by
+the one ``chip_fd``.
+
+To minimize the number of calls required to copy events from the kernel to
+userspace, `read()` supports copying multiple events. The number of events
+copied is the lower of the number available in the kernel buffer and the
+number that will fit in the userspace buffer (``buf``).
+
+A `read()` will block if no event is available and the ``chip_fd`` has not
+been set **O_NONBLOCK**.
+
+The presence of an event can be tested for by checking that the ``chip_fd`` is
+readable using `poll()` or an equivalent.
+
+Return Value
+============
+
+On success the number of bytes read, which will be a multiple of the size
+of a :c:type:`gpio_v2_line_info_changed` event.
+
+On error -1 and the ``errno`` variable is set appropriately.
+Common error codes are described in error-codes.rst.
diff --git a/Documentation/userspace-api/gpio/index.rst b/Documentation/userspace-api/gpio/index.rst
new file mode 100644
index 000000000000..f258de4ef370
--- /dev/null
+++ b/Documentation/userspace-api/gpio/index.rst
@@ -0,0 +1,18 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====
+GPIO
+====
+
+.. toctree::
+ :maxdepth: 1
+
+ Character Device Userspace API <chardev>
+ Obsolete Userspace APIs <obsolete>
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/userspace-api/gpio/obsolete.rst b/Documentation/userspace-api/gpio/obsolete.rst
new file mode 100644
index 000000000000..c42538b44ec8
--- /dev/null
+++ b/Documentation/userspace-api/gpio/obsolete.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Obsolete GPIO Userspace APIs
+============================
+
+.. toctree::
+ :maxdepth: 1
+
+ Character Device Userspace API (v1) <chardev_v1>
+ Sysfs Interface <sysfs>
diff --git a/Documentation/userspace-api/gpio/sysfs.rst b/Documentation/userspace-api/gpio/sysfs.rst
new file mode 100644
index 000000000000..116921048b18
--- /dev/null
+++ b/Documentation/userspace-api/gpio/sysfs.rst
@@ -0,0 +1,170 @@
+GPIO Sysfs Interface for Userspace
+==================================
+
+.. warning::
+ This API is obsoleted by the chardev.rst and the ABI documentation has
+ been moved to Documentation/ABI/obsolete/sysfs-gpio.
+
+ New developments should use the chardev.rst, and existing developments are
+ encouraged to migrate as soon as possible, as this API will be removed
+ in the future.
+
+ This interface will continue to be maintained for the migration period,
+ but new features will only be added to the new API.
+
+The obsolete sysfs ABI
+----------------------
+Platforms which use the "gpiolib" implementors framework may choose to
+configure a sysfs user interface to GPIOs. This is different from the
+debugfs interface, since it provides control over GPIO direction and
+value instead of just showing a gpio state summary. Plus, it could be
+present on production systems without debugging support.
+
+Given appropriate hardware documentation for the system, userspace could
+know for example that GPIO #23 controls the write protect line used to
+protect boot loader segments in flash memory. System upgrade procedures
+may need to temporarily remove that protection, first importing a GPIO,
+then changing its output state, then updating the code before re-enabling
+the write protection. In normal use, GPIO #23 would never be touched,
+and the kernel would have no need to know about it.
+
+Again depending on appropriate hardware documentation, on some systems
+userspace GPIO can be used to determine system configuration data that
+standard kernels won't know about. And for some tasks, simple userspace
+GPIO drivers could be all that the system really needs.
+
+.. note::
+ Do NOT abuse sysfs to control hardware that has proper kernel drivers.
+ Please read Documentation/driver-api/gpio/drivers-on-gpio.rst
+ to avoid reinventing kernel wheels in userspace.
+
+ I MEAN IT. REALLY.
+
+Paths in Sysfs
+--------------
+There are three kinds of entries in /sys/class/gpio:
+
+ - Control interfaces used to get userspace control over GPIOs;
+
+ - GPIOs themselves; and
+
+ - GPIO controllers ("gpio_chip" instances).
+
+That's in addition to standard files including the "device" symlink.
+
+The control interfaces are write-only:
+
+ /sys/class/gpio/
+
+ "export" ...
+ Userspace may ask the kernel to export control of
+ a GPIO to userspace by writing its number to this file.
+
+ Example: "echo 19 > export" will create a "gpio19" node
+ for GPIO #19, if that's not requested by kernel code.
+
+ "unexport" ...
+ Reverses the effect of exporting to userspace.
+
+ Example: "echo 19 > unexport" will remove a "gpio19"
+ node exported using the "export" file.
+
+GPIO signals have paths like /sys/class/gpio/gpio42/ (for GPIO #42)
+and have the following read/write attributes:
+
+ /sys/class/gpio/gpioN/
+
+ "direction" ...
+ reads as either "in" or "out". This value may
+ normally be written. Writing as "out" defaults to
+ initializing the value as low. To ensure glitch free
+ operation, values "low" and "high" may be written to
+ configure the GPIO as an output with that initial value.
+
+ Note that this attribute *will not exist* if the kernel
+ doesn't support changing the direction of a GPIO, or
+ it was exported by kernel code that didn't explicitly
+ allow userspace to reconfigure this GPIO's direction.
+
+ "value" ...
+ reads as either 0 (inactive) or 1 (active). If the GPIO
+ is configured as an output, this value may be written;
+ any nonzero value is treated as active.
+
+ If the pin can be configured as interrupt-generating interrupt
+ and if it has been configured to generate interrupts (see the
+ description of "edge"), you can poll(2) on that file and
+ poll(2) will return whenever the interrupt was triggered. If
+ you use poll(2), set the events POLLPRI and POLLERR. If you
+ use select(2), set the file descriptor in exceptfds. After
+ poll(2) returns, either lseek(2) to the beginning of the sysfs
+ file and read the new value or close the file and re-open it
+ to read the value.
+
+ "edge" ...
+ reads as either "none", "rising", "falling", or
+ "both". Write these strings to select the signal edge(s)
+ that will make poll(2) on the "value" file return.
+
+ This file exists only if the pin can be configured as an
+ interrupt generating input pin.
+
+ "active_low" ...
+ reads as either 0 (false) or 1 (true). Write
+ any nonzero value to invert the value attribute both
+ for reading and writing. Existing and subsequent
+ poll(2) support configuration via the edge attribute
+ for "rising" and "falling" edges will follow this
+ setting.
+
+GPIO controllers have paths like /sys/class/gpio/gpiochip42/ (for the
+controller implementing GPIOs starting at #42) and have the following
+read-only attributes:
+
+ /sys/class/gpio/gpiochipN/
+
+ "base" ...
+ same as N, the first GPIO managed by this chip
+
+ "label" ...
+ provided for diagnostics (not always unique)
+
+ "ngpio" ...
+ how many GPIOs this manages (N to N + ngpio - 1)
+
+Board documentation should in most cases cover what GPIOs are used for
+what purposes. However, those numbers are not always stable; GPIOs on
+a daughtercard might be different depending on the base board being used,
+or other cards in the stack. In such cases, you may need to use the
+gpiochip nodes (possibly in conjunction with schematics) to determine
+the correct GPIO number to use for a given signal.
+
+
+Exporting from Kernel code
+--------------------------
+Kernel code can explicitly manage exports of GPIOs which have already been
+requested using gpio_request()::
+
+ /* export the GPIO to userspace */
+ int gpiod_export(struct gpio_desc *desc, bool direction_may_change);
+
+ /* reverse gpiod_export() */
+ void gpiod_unexport(struct gpio_desc *desc);
+
+ /* create a sysfs link to an exported GPIO node */
+ int gpiod_export_link(struct device *dev, const char *name,
+ struct gpio_desc *desc);
+
+After a kernel driver requests a GPIO, it may only be made available in
+the sysfs interface by gpiod_export(). The driver can control whether the
+signal direction may change. This helps drivers prevent userspace code
+from accidentally clobbering important system state.
+
+This explicit exporting can help with debugging (by making some kinds
+of experiments easier), or can provide an always-there interface that's
+suitable for documenting as part of a board support package.
+
+After the GPIO has been exported, gpiod_export_link() allows creating
+symlinks from elsewhere in sysfs to the GPIO sysfs node. Drivers can
+use this to provide the interface under their own device in sysfs with
+a descriptive name.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 09f61bd2ac2e..afecfe3cc4a8 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -9,31 +9,59 @@ While much of the kernel's user-space API is documented elsewhere
also be found in the kernel tree itself. This manual is intended to be the
place where this information is gathered.
+
+System calls
+============
+
+.. toctree::
+ :maxdepth: 1
+
+ unshare
+ futex2
+ ebpf/index
+ ioctl/index
+
+Security-related interfaces
+===========================
+
.. toctree::
- :caption: Table of contents
- :maxdepth: 2
+ :maxdepth: 1
no_new_privs
seccomp_filter
landlock
- unshare
+ lsm
spec_ctrl
+ tee
+
+Devices and I/O
+===============
+
+.. toctree::
+ :maxdepth: 1
+
accelerators/ocxl
dma-buf-alloc-exchange
- ebpf/index
- ELF
- ioctl/index
+ gpio/index
iommu
iommufd
media/index
+ dcdbas
+ vduse
+ isapnp
+
+Everything else
+===============
+
+.. toctree::
+ :maxdepth: 1
+
+ ELF
netlink/index
sysfs-platform_profile
vduse
futex2
- lsm
- tee
- isapnp
- dcdbas
+ perf_ring_buffer
.. only:: subproject and html
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 457e16f06e04..c472423412bf 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -82,8 +82,9 @@ Code Seq# Include File Comments
0x10 00-0F drivers/char/s390/vmcp.h
0x10 10-1F arch/s390/include/uapi/sclp_ctl.h
0x10 20-2F arch/s390/include/uapi/asm/hypfs.h
-0x12 all linux/fs.h
+0x12 all linux/fs.h BLK* ioctls
linux/blkpg.h
+0x15 all linux/fs.h FS_IOC_* ioctls
0x1b all InfiniBand Subsystem
<http://infiniband.sourceforge.net/>
0x20 all drivers/cdrom/cm206.h
@@ -309,6 +310,7 @@ Code Seq# Include File Comments
0x89 0B-DF linux/sockios.h
0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range
0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range
+0x8A 00-1F linux/eventpoll.h
0x8B all linux/wireless.h
0x8C 00-3F WiNRADiO driver
<http://www.winradio.com.au/>
diff --git a/Documentation/userspace-api/netlink/netlink-raw.rst b/Documentation/userspace-api/netlink/netlink-raw.rst
index 1e14f5f22b8e..1990eea772d0 100644
--- a/Documentation/userspace-api/netlink/netlink-raw.rst
+++ b/Documentation/userspace-api/netlink/netlink-raw.rst
@@ -150,3 +150,45 @@ attributes from an ``attribute-set``. For example the following
Note that a selector attribute must appear in a netlink message before any
sub-message attributes that depend on it.
+
+If an attribute such as ``kind`` is defined at more than one nest level, then a
+sub-message selector will be resolved using the value 'closest' to the selector.
+For example, if the same attribute name is defined in a nested ``attribute-set``
+alongside a sub-message selector and also in a top level ``attribute-set``, then
+the selector will be resolved using the value 'closest' to the selector. If the
+value is not present in the message at the same level as defined in the spec
+then this is an error.
+
+Nested struct definitions
+-------------------------
+
+Many raw netlink families such as :doc:`tc<../../networking/netlink_spec/tc>`
+make use of nested struct definitions. The ``netlink-raw`` schema makes it
+possible to embed a struct within a struct definition using the ``struct``
+property. For example, the following struct definition embeds the
+``tc-ratespec`` struct definition for both the ``rate`` and the ``peakrate``
+members of ``struct tc-tbf-qopt``.
+
+.. code-block:: yaml
+
+ -
+ name: tc-tbf-qopt
+ type: struct
+ members:
+ -
+ name: rate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: peakrate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: limit
+ type: u32
+ -
+ name: buffer
+ type: u32
+ -
+ name: mtu
+ type: u32
diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst
new file mode 100644
index 000000000000..bde9d8cbc106
--- /dev/null
+++ b/Documentation/userspace-api/perf_ring_buffer.rst
@@ -0,0 +1,830 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Perf ring buffer
+================
+
+.. CONTENTS
+
+ 1. Introduction
+
+ 2. Ring buffer implementation
+ 2.1 Basic algorithm
+ 2.2 Ring buffer for different tracing modes
+ 2.2.1 Default mode
+ 2.2.2 Per-thread mode
+ 2.2.3 Per-CPU mode
+ 2.2.4 System wide mode
+ 2.3 Accessing buffer
+ 2.3.1 Producer-consumer model
+ 2.3.2 Properties of the ring buffers
+ 2.3.3 Writing samples into buffer
+ 2.3.4 Reading samples from buffer
+ 2.3.5 Memory synchronization
+
+ 3. The mechanism of AUX ring buffer
+ 3.1 The relationship between AUX and regular ring buffers
+ 3.2 AUX events
+ 3.3 Snapshot mode
+
+
+1. Introduction
+===============
+
+The ring buffer is a fundamental mechanism for data transfer. perf uses
+ring buffers to transfer event data from kernel to user space, another
+kind of ring buffer which is so called auxiliary (AUX) ring buffer also
+plays an important role for hardware tracing with Intel PT, Arm
+CoreSight, etc.
+
+The ring buffer implementation is critical but it's also a very
+challenging work. On the one hand, the kernel and perf tool in the user
+space use the ring buffer to exchange data and stores data into data
+file, thus the ring buffer needs to transfer data with high throughput;
+on the other hand, the ring buffer management should avoid significant
+overload to distract profiling results.
+
+This documentation dives into the details for perf ring buffer with two
+parts: firstly it explains the perf ring buffer implementation, then the
+second part discusses the AUX ring buffer mechanism.
+
+2. Ring buffer implementation
+=============================
+
+2.1 Basic algorithm
+-------------------
+
+That said, a typical ring buffer is managed by a head pointer and a tail
+pointer; the head pointer is manipulated by a writer and the tail
+pointer is updated by a reader respectively.
+
+::
+
+ +---------------------------+
+ | | |***|***|***| | |
+ +---------------------------+
+ `-> Tail `-> Head
+
+ * : the data is filled by the writer.
+
+ Figure 1. Ring buffer
+
+Perf uses the same way to manage its ring buffer. In the implementation
+there are two key data structures held together in a set of consecutive
+pages, the control structure and then the ring buffer itself. The page
+with the control structure in is known as the "user page". Being held
+in continuous virtual addresses simplifies locating the ring buffer
+address, it is in the pages after the page with the user page.
+
+The control structure is named as ``perf_event_mmap_page``, it contains a
+head pointer ``data_head`` and a tail pointer ``data_tail``. When the
+kernel starts to fill records into the ring buffer, it updates the head
+pointer to reserve the memory so later it can safely store events into
+the buffer. On the other side, when the user page is a writable mapping,
+the perf tool has the permission to update the tail pointer after consuming
+data from the ring buffer. Yet another case is for the user page's
+read-only mapping, which is to be addressed in the section
+:ref:`writing_samples_into_buffer`.
+
+::
+
+ user page ring buffer
+ +---------+---------+ +---------------------------------------+
+ |data_head|data_tail|...| | |***|***|***|***|***| | | |
+ +---------+---------+ +---------------------------------------+
+ ` `----------------^ ^
+ `----------------------------------------------|
+
+ * : the data is filled by the writer.
+
+ Figure 2. Perf ring buffer
+
+When using the ``perf record`` tool, we can specify the ring buffer size
+with option ``-m`` or ``--mmap-pages=``, the given size will be rounded up
+to a power of two that is a multiple of a page size. Though the kernel
+allocates at once for all memory pages, it's deferred to map the pages
+to VMA area until the perf tool accesses the buffer from the user space.
+In other words, at the first time accesses the buffer's page from user
+space in the perf tool, a data abort exception for page fault is taken
+and the kernel uses this occasion to map the page into process VMA
+(see ``perf_mmap_fault()``), thus the perf tool can continue to access
+the page after returning from the exception.
+
+2.2 Ring buffer for different tracing modes
+-------------------------------------------
+
+The perf profiles programs with different modes: default mode, per thread
+mode, per cpu mode, and system wide mode. This section describes these
+modes and how the ring buffer meets requirements for them. At last we
+will review the race conditions caused by these modes.
+
+2.2.1 Default mode
+^^^^^^^^^^^^^^^^^^
+
+Usually we execute ``perf record`` command followed by a profiling program
+name, like below command::
+
+ perf record test_program
+
+This command doesn't specify any options for CPU and thread modes, the
+perf tool applies the default mode on the perf event. It maps all the
+CPUs in the system and the profiled program's PID on the perf event, and
+it enables inheritance mode on the event so that child tasks inherits
+the events. As a result, the perf event is attributed as::
+
+ evsel::cpus::map[] = { 0 .. _SC_NPROCESSORS_ONLN-1 }
+ evsel::threads::map[] = { pid }
+ evsel::attr::inherit = 1
+
+These attributions finally will be reflected on the deployment of ring
+buffers. As shown below, the perf tool allocates individual ring buffer
+for each CPU, but it only enables events for the profiled program rather
+than for all threads in the system. The *T1* thread represents the
+thread context of the 'test_program', whereas *T2* and *T3* are irrelevant
+threads in the system. The perf samples are exclusively collected for
+the *T1* thread and stored in the ring buffer associated with the CPU on
+which the *T1* thread is running.
+
+::
+
+ T1 T2 T1
+ +----+ +-----------+ +----+
+ CPU0 |xxxx| |xxxxxxxxxxx| |xxxx|
+ +----+--------------+-----------+----------+----+-------->
+ | |
+ v v
+ +-----------------------------------------------------+
+ | Ring buffer 0 |
+ +-----------------------------------------------------+
+
+ T1
+ +-----+
+ CPU1 |xxxxx|
+ -----+-----+--------------------------------------------->
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 1 |
+ +-----------------------------------------------------+
+
+ T1 T3
+ +----+ +-------+
+ CPU2 |xxxx| |xxxxxxx|
+ --------------------------+----+--------+-------+-------->
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 2 |
+ +-----------------------------------------------------+
+
+ T1
+ +--------------+
+ CPU3 |xxxxxxxxxxxxxx|
+ -----------+--------------+------------------------------>
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 3 |
+ +-----------------------------------------------------+
+
+ T1: Thread 1; T2: Thread 2; T3: Thread 3
+ x: Thread is in running state
+
+ Figure 3. Ring buffer for default mode
+
+2.2.2 Per-thread mode
+^^^^^^^^^^^^^^^^^^^^^
+
+By specifying option ``--per-thread`` in perf command, e.g.
+
+::
+
+ perf record --per-thread test_program
+
+The perf event doesn't map to any CPUs and is only bound to the
+profiled process, thus, the perf event's attributions are::
+
+ evsel::cpus::map[0] = { -1 }
+ evsel::threads::map[] = { pid }
+ evsel::attr::inherit = 0
+
+In this mode, a single ring buffer is allocated for the profiled thread;
+if the thread is scheduled on a CPU, the events on that CPU will be
+enabled; and if the thread is scheduled out from the CPU, the events on
+the CPU will be disabled. When the thread is migrated from one CPU to
+another, the events are to be disabled on the previous CPU and enabled
+on the next CPU correspondingly.
+
+::
+
+ T1 T2 T1
+ +----+ +-----------+ +----+
+ CPU0 |xxxx| |xxxxxxxxxxx| |xxxx|
+ +----+--------------+-----------+----------+----+-------->
+ | |
+ | T1 |
+ | +-----+ |
+ CPU1 | |xxxxx| |
+ --|--+-----+----------------------------------|---------->
+ | | |
+ | | T1 T3 |
+ | | +----+ +---+ |
+ CPU2 | | |xxxx| |xxx| |
+ --|-----|-----------------+----+--------+---+-|---------->
+ | | | |
+ | | T1 | |
+ | | +--------------+ | |
+ CPU3 | | |xxxxxxxxxxxxxx| | |
+ --|-----|--+--------------+-|-----------------|---------->
+ | | | | |
+ v v v v v
+ +-----------------------------------------------------+
+ | Ring buffer |
+ +-----------------------------------------------------+
+
+ T1: Thread 1
+ x: Thread is in running state
+
+ Figure 4. Ring buffer for per-thread mode
+
+When perf runs in per-thread mode, a ring buffer is allocated for the
+profiled thread *T1*. The ring buffer is dedicated for thread *T1*, if the
+thread *T1* is running, the perf events will be recorded into the ring
+buffer; when the thread is sleeping, all associated events will be
+disabled, thus no trace data will be recorded into the ring buffer.
+
+2.2.3 Per-CPU mode
+^^^^^^^^^^^^^^^^^^
+
+The option ``-C`` is used to collect samples on the list of CPUs, for
+example the below perf command receives option ``-C 0,2``::
+
+ perf record -C 0,2 test_program
+
+It maps the perf event to CPUs 0 and 2, and the event is not associated to any
+PID. Thus the perf event attributions are set as::
+
+ evsel::cpus::map[0] = { 0, 2 }
+ evsel::threads::map[] = { -1 }
+ evsel::attr::inherit = 0
+
+This results in the session of ``perf record`` will sample all threads on CPU0
+and CPU2, and be terminated until test_program exits. Even there have tasks
+running on CPU1 and CPU3, since the ring buffer is absent for them, any
+activities on these two CPUs will be ignored. A usage case is to combine the
+options for per-thread mode and per-CPU mode, e.g. the options ``–C 0,2`` and
+``––per–thread`` are specified together, the samples are recorded only when
+the profiled thread is scheduled on any of the listed CPUs.
+
+::
+
+ T1 T2 T1
+ +----+ +-----------+ +----+
+ CPU0 |xxxx| |xxxxxxxxxxx| |xxxx|
+ +----+--------------+-----------+----------+----+-------->
+ | | |
+ v v v
+ +-----------------------------------------------------+
+ | Ring buffer 0 |
+ +-----------------------------------------------------+
+
+ T1
+ +-----+
+ CPU1 |xxxxx|
+ -----+-----+--------------------------------------------->
+
+ T1 T3
+ +----+ +-------+
+ CPU2 |xxxx| |xxxxxxx|
+ --------------------------+----+--------+-------+-------->
+ | |
+ v v
+ +-----------------------------------------------------+
+ | Ring buffer 1 |
+ +-----------------------------------------------------+
+
+ T1
+ +--------------+
+ CPU3 |xxxxxxxxxxxxxx|
+ -----------+--------------+------------------------------>
+
+ T1: Thread 1; T2: Thread 2; T3: Thread 3
+ x: Thread is in running state
+
+ Figure 5. Ring buffer for per-CPU mode
+
+2.2.4 System wide mode
+^^^^^^^^^^^^^^^^^^^^^^
+
+By using option ``–a`` or ``––all–cpus``, perf collects samples on all CPUs
+for all tasks, we call it as the system wide mode, the command is::
+
+ perf record -a test_program
+
+Similar to the per-CPU mode, the perf event doesn't bind to any PID, and
+it maps to all CPUs in the system::
+
+ evsel::cpus::map[] = { 0 .. _SC_NPROCESSORS_ONLN-1 }
+ evsel::threads::map[] = { -1 }
+ evsel::attr::inherit = 0
+
+In the system wide mode, every CPU has its own ring buffer, all threads
+are monitored during the running state and the samples are recorded into
+the ring buffer belonging to the CPU which the events occurred on.
+
+::
+
+ T1 T2 T1
+ +----+ +-----------+ +----+
+ CPU0 |xxxx| |xxxxxxxxxxx| |xxxx|
+ +----+--------------+-----------+----------+----+-------->
+ | | |
+ v v v
+ +-----------------------------------------------------+
+ | Ring buffer 0 |
+ +-----------------------------------------------------+
+
+ T1
+ +-----+
+ CPU1 |xxxxx|
+ -----+-----+--------------------------------------------->
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 1 |
+ +-----------------------------------------------------+
+
+ T1 T3
+ +----+ +-------+
+ CPU2 |xxxx| |xxxxxxx|
+ --------------------------+----+--------+-------+-------->
+ | |
+ v v
+ +-----------------------------------------------------+
+ | Ring buffer 2 |
+ +-----------------------------------------------------+
+
+ T1
+ +--------------+
+ CPU3 |xxxxxxxxxxxxxx|
+ -----------+--------------+------------------------------>
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 3 |
+ +-----------------------------------------------------+
+
+ T1: Thread 1; T2: Thread 2; T3: Thread 3
+ x: Thread is in running state
+
+ Figure 6. Ring buffer for system wide mode
+
+2.3 Accessing buffer
+--------------------
+
+Based on the understanding of how the ring buffer is allocated in
+various modes, this section explains access the ring buffer.
+
+2.3.1 Producer-consumer model
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the Linux kernel, the PMU events can produce samples which are stored
+into the ring buffer; the perf command in user space consumes the
+samples by reading out data from the ring buffer and finally saves the
+data into the file for post analysis. It’s a typical producer-consumer
+model for using the ring buffer.
+
+The perf process polls on the PMU events and sleeps when no events are
+incoming. To prevent frequent exchanges between the kernel and user
+space, the kernel event core layer introduces a watermark, which is
+stored in the ``perf_buffer::watermark``. When a sample is recorded into
+the ring buffer, and if the used buffer exceeds the watermark, the
+kernel wakes up the perf process to read samples from the ring buffer.
+
+::
+
+ Perf
+ / | Read samples
+ Polling / `--------------| Ring buffer
+ v v ;---------------------v
+ +----------------+ +---------+---------+ +-------------------+
+ |Event wait queue| |data_head|data_tail| |***|***| | |***|
+ +----------------+ +---------+---------+ +-------------------+
+ ^ ^ `------------------------^
+ | Wake up tasks | Store samples
+ +-----------------------------+
+ | Kernel event core layer |
+ +-----------------------------+
+
+ * : the data is filled by the writer.
+
+ Figure 7. Writing and reading the ring buffer
+
+When the kernel event core layer notifies the user space, because
+multiple events might share the same ring buffer for recording samples,
+the core layer iterates every event associated with the ring buffer and
+wakes up tasks waiting on the event. This is fulfilled by the kernel
+function ``ring_buffer_wakeup()``.
+
+After the perf process is woken up, it starts to check the ring buffers
+one by one, if it finds any ring buffer containing samples it will read
+out the samples for statistics or saving into the data file. Given the
+perf process is able to run on any CPU, this leads to the ring buffer
+potentially being accessed from multiple CPUs simultaneously, which
+causes race conditions. The race condition handling is described in the
+section :ref:`memory_synchronization`.
+
+2.3.2 Properties of the ring buffers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Linux kernel supports two write directions for the ring buffer: forward and
+backward. The forward writing saves samples from the beginning of the ring
+buffer, the backward writing stores data from the end of the ring buffer with
+the reversed direction. The perf tool determines the writing direction.
+
+Additionally, the tool can map buffers in either read-write mode or read-only
+mode to the user space.
+
+The ring buffer in the read-write mode is mapped with the property
+``PROT_READ | PROT_WRITE``. With the write permission, the perf tool
+updates the ``data_tail`` to indicate the data start position. Combining
+with the head pointer ``data_head``, which works as the end position of
+the current data, the perf tool can easily know where read out the data
+from.
+
+Alternatively, in the read-only mode, only the kernel keeps to update
+the ``data_head`` while the user space cannot access the ``data_tail`` due
+to the mapping property ``PROT_READ``.
+
+As a result, the matrix below illustrates the various combinations of
+direction and mapping characteristics. The perf tool employs two of these
+combinations to support buffer types: the non-overwrite buffer and the
+overwritable buffer.
+
+.. list-table::
+ :widths: 1 1 1
+ :header-rows: 1
+
+ * - Mapping mode
+ - Forward
+ - Backward
+ * - read-write
+ - Non-overwrite ring buffer
+ - Not used
+ * - read-only
+ - Not used
+ - Overwritable ring buffer
+
+The non-overwrite ring buffer uses the read-write mapping with forward
+writing. It starts to save data from the beginning of the ring buffer
+and wrap around when overflow, which is used with the read-write mode in
+the normal ring buffer. When the consumer doesn't keep up with the
+producer, it would lose some data, the kernel keeps how many records it
+lost and generates the ``PERF_RECORD_LOST`` records in the next time
+when it finds a space in the ring buffer.
+
+The overwritable ring buffer uses the backward writing with the
+read-only mode. It saves the data from the end of the ring buffer and
+the ``data_head`` keeps the position of current data, the perf always
+knows where it starts to read and until the end of the ring buffer, thus
+it don't need the ``data_tail``. In this mode, it will not generate the
+``PERF_RECORD_LOST`` records.
+
+.. _writing_samples_into_buffer:
+
+2.3.3 Writing samples into buffer
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When a sample is taken and saved into the ring buffer, the kernel
+prepares sample fields based on the sample type; then it prepares the
+info for writing ring buffer which is stored in the structure
+``perf_output_handle``. In the end, the kernel outputs the sample into
+the ring buffer and updates the head pointer in the user page so the
+perf tool can see the latest value.
+
+The structure ``perf_output_handle`` serves as a temporary context for
+tracking the information related to the buffer. The advantages of it is
+that it enables concurrent writing to the buffer by different events.
+For example, a software event and a hardware PMU event both are enabled
+for profiling, two instances of ``perf_output_handle`` serve as separate
+contexts for the software event and the hardware event respectively.
+This allows each event to reserve its own memory space for populating
+the record data.
+
+2.3.4 Reading samples from buffer
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the user space, the perf tool utilizes the ``perf_event_mmap_page``
+structure to handle the head and tail of the buffer. It also uses
+``perf_mmap`` structure to keep track of a context for the ring buffer, this
+context includes information about the buffer's starting and ending
+addresses. Additionally, the mask value can be utilized to compute the
+circular buffer pointer even for an overflow.
+
+Similar to the kernel, the perf tool in the user space first reads out
+the recorded data from the ring buffer, and then updates the buffer's
+tail pointer ``perf_event_mmap_page::data_tail``.
+
+.. _memory_synchronization:
+
+2.3.5 Memory synchronization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The modern CPUs with relaxed memory model cannot promise the memory
+ordering, this means it’s possible to access the ring buffer and the
+``perf_event_mmap_page`` structure out of order. To assure the specific
+sequence for memory accessing perf ring buffer, memory barriers are
+used to assure the data dependency. The rationale for the memory
+synchronization is as below::
+
+ Kernel User space
+
+ if (LOAD ->data_tail) { LOAD ->data_head
+ (A) smp_rmb() (C)
+ STORE $data LOAD $data
+ smp_wmb() (B) smp_mb() (D)
+ STORE ->data_head STORE ->data_tail
+ }
+
+The comments in tools/include/linux/ring_buffer.h gives nice description
+for why and how to use memory barriers, here we will just provide an
+alternative explanation:
+
+(A) is a control dependency so that CPU assures order between checking
+pointer ``perf_event_mmap_page::data_tail`` and filling sample into ring
+buffer;
+
+(D) pairs with (A). (D) separates the ring buffer data reading from
+writing the pointer ``data_tail``, perf tool first consumes samples and then
+tells the kernel that the data chunk has been released. Since a reading
+operation is followed by a writing operation, thus (D) is a full memory
+barrier.
+
+(B) is a writing barrier in the middle of two writing operations, which
+makes sure that recording a sample must be prior to updating the head
+pointer.
+
+(C) pairs with (B). (C) is a read memory barrier to ensure the head
+pointer is fetched before reading samples.
+
+To implement the above algorithm, the ``perf_output_put_handle()`` function
+in the kernel and two helpers ``ring_buffer_read_head()`` and
+``ring_buffer_write_tail()`` in the user space are introduced, they rely
+on memory barriers as described above to ensure the data dependency.
+
+Some architectures support one-way permeable barrier with load-acquire
+and store-release operations, these barriers are more relaxed with less
+performance penalty, so (C) and (D) can be optimized to use barriers
+``smp_load_acquire()`` and ``smp_store_release()`` respectively.
+
+If an architecture doesn’t support load-acquire and store-release in its
+memory model, it will roll back to the old fashion of memory barrier
+operations. In this case, ``smp_load_acquire()`` encapsulates
+``READ_ONCE()`` + ``smp_mb()``, since ``smp_mb()`` is costly,
+``ring_buffer_read_head()`` doesn't invoke ``smp_load_acquire()`` and it uses
+the barriers ``READ_ONCE()`` + ``smp_rmb()`` instead.
+
+3. The mechanism of AUX ring buffer
+===================================
+
+In this chapter, we will explain the implementation of the AUX ring
+buffer. In the first part it will discuss the connection between the
+AUX ring buffer and the regular ring buffer, then the second part will
+examine how the AUX ring buffer co-works with the regular ring buffer,
+as well as the additional features introduced by the AUX ring buffer for
+the sampling mechanism.
+
+3.1 The relationship between AUX and regular ring buffers
+---------------------------------------------------------
+
+Generally, the AUX ring buffer is an auxiliary for the regular ring
+buffer. The regular ring buffer is primarily used to store the event
+samples and every event format complies with the definition in the
+union ``perf_event``; the AUX ring buffer is for recording the hardware
+trace data and the trace data format is hardware IP dependent.
+
+The general use and advantage of the AUX ring buffer is that it is
+written directly by hardware rather than by the kernel. For example,
+regular profile samples that write to the regular ring buffer cause an
+interrupt. Tracing execution requires a high number of samples and
+using interrupts would be overwhelming for the regular ring buffer
+mechanism. Having an AUX buffer allows for a region of memory more
+decoupled from the kernel and written to directly by hardware tracing.
+
+The AUX ring buffer reuses the same algorithm with the regular ring
+buffer for the buffer management. The control structure
+``perf_event_mmap_page`` extends the new fields ``aux_head`` and ``aux_tail``
+for the head and tail pointers of the AUX ring buffer.
+
+During the initialisation phase, besides the mmap()-ed regular ring
+buffer, the perf tool invokes a second syscall in the
+``auxtrace_mmap__mmap()`` function for the mmap of the AUX buffer with
+non-zero file offset; ``rb_alloc_aux()`` in the kernel allocates pages
+correspondingly, these pages will be deferred to map into VMA when
+handling the page fault, which is the same lazy mechanism with the
+regular ring buffer.
+
+AUX events and AUX trace data are two different things. Let's see an
+example::
+
+ perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
+
+The above command enables two events: one is the event *cycles* from PMU
+and another is the AUX event *cs_etm* from Arm CoreSight, both are saved
+into the regular ring buffer while the CoreSight's AUX trace data is
+stored in the AUX ring buffer.
+
+As a result, we can see the regular ring buffer and the AUX ring buffer
+are allocated in pairs. The perf in default mode allocates the regular
+ring buffer and the AUX ring buffer per CPU-wise, which is the same as
+the system wide mode, however, the default mode records samples only for
+the profiled program, whereas the latter mode profiles for all programs
+in the system. For per-thread mode, the perf tool allocates only one
+regular ring buffer and one AUX ring buffer for the whole session. For
+the per-CPU mode, the perf allocates two kinds of ring buffers for
+selected CPUs specified by the option ``-C``.
+
+The below figure demonstrates the buffers' layout in the system wide
+mode; if there are any activities on one CPU, the AUX event samples and
+the hardware trace data will be recorded into the dedicated buffers for
+the CPU.
+
+::
+
+ T1 T2 T1
+ +----+ +-----------+ +----+
+ CPU0 |xxxx| |xxxxxxxxxxx| |xxxx|
+ +----+--------------+-----------+----------+----+-------->
+ | | |
+ v v v
+ +-----------------------------------------------------+
+ | Ring buffer 0 |
+ +-----------------------------------------------------+
+ | | |
+ v v v
+ +-----------------------------------------------------+
+ | AUX Ring buffer 0 |
+ +-----------------------------------------------------+
+
+ T1
+ +-----+
+ CPU1 |xxxxx|
+ -----+-----+--------------------------------------------->
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 1 |
+ +-----------------------------------------------------+
+ |
+ v
+ +-----------------------------------------------------+
+ | AUX Ring buffer 1 |
+ +-----------------------------------------------------+
+
+ T1 T3
+ +----+ +-------+
+ CPU2 |xxxx| |xxxxxxx|
+ --------------------------+----+--------+-------+-------->
+ | |
+ v v
+ +-----------------------------------------------------+
+ | Ring buffer 2 |
+ +-----------------------------------------------------+
+ | |
+ v v
+ +-----------------------------------------------------+
+ | AUX Ring buffer 2 |
+ +-----------------------------------------------------+
+
+ T1
+ +--------------+
+ CPU3 |xxxxxxxxxxxxxx|
+ -----------+--------------+------------------------------>
+ |
+ v
+ +-----------------------------------------------------+
+ | Ring buffer 3 |
+ +-----------------------------------------------------+
+ |
+ v
+ +-----------------------------------------------------+
+ | AUX Ring buffer 3 |
+ +-----------------------------------------------------+
+
+ T1: Thread 1; T2: Thread 2; T3: Thread 3
+ x: Thread is in running state
+
+ Figure 8. AUX ring buffer for system wide mode
+
+3.2 AUX events
+--------------
+
+Similar to ``perf_output_begin()`` and ``perf_output_end()``'s working for the
+regular ring buffer, ``perf_aux_output_begin()`` and ``perf_aux_output_end()``
+serve for the AUX ring buffer for processing the hardware trace data.
+
+Once the hardware trace data is stored into the AUX ring buffer, the PMU
+driver will stop hardware tracing by calling the ``pmu::stop()`` callback.
+Similar to the regular ring buffer, the AUX ring buffer needs to apply
+the memory synchronization mechanism as discussed in the section
+:ref:`memory_synchronization`. Since the AUX ring buffer is managed by the
+PMU driver, the barrier (B), which is a writing barrier to ensure the trace
+data is externally visible prior to updating the head pointer, is asked
+to be implemented in the PMU driver.
+
+Then ``pmu::stop()`` can safely call the ``perf_aux_output_end()`` function to
+finish two things:
+
+- It fills an event ``PERF_RECORD_AUX`` into the regular ring buffer, this
+ event delivers the information of the start address and data size for a
+ chunk of hardware trace data has been stored into the AUX ring buffer;
+
+- Since the hardware trace driver has stored new trace data into the AUX
+ ring buffer, the argument *size* indicates how many bytes have been
+ consumed by the hardware tracing, thus ``perf_aux_output_end()`` updates the
+ header pointer ``perf_buffer::aux_head`` to reflect the latest buffer usage.
+
+At the end, the PMU driver will restart hardware tracing. During this
+temporary suspending period, it will lose hardware trace data, which
+will introduce a discontinuity during decoding phase.
+
+The event ``PERF_RECORD_AUX`` presents an AUX event which is handled in the
+kernel, but it lacks the information for saving the AUX trace data in
+the perf file. When the perf tool copies the trace data from AUX ring
+buffer to the perf data file, it synthesizes a ``PERF_RECORD_AUXTRACE``
+event which is not a kernel ABI, it's defined by the perf tool to describe
+which portion of data in the AUX ring buffer is saved. Afterwards, the perf
+tool reads out the AUX trace data from the perf file based on the
+``PERF_RECORD_AUXTRACE`` events, and the ``PERF_RECORD_AUX`` event is used to
+decode a chunk of data by correlating with time order.
+
+3.3 Snapshot mode
+-----------------
+
+Perf supports snapshot mode for AUX ring buffer, in this mode, users
+only record AUX trace data at a specific time point which users are
+interested in. E.g. below gives an example of how to take snapshots
+with 1 second interval with Arm CoreSight::
+
+ perf record -e cs_etm/@tmc_etr0/u -S -a program &
+ PERFPID=$!
+ while true; do
+ kill -USR2 $PERFPID
+ sleep 1
+ done
+
+The main flow for snapshot mode is:
+
+- Before a snapshot is taken, the AUX ring buffer acts in free run mode.
+ During free run mode the perf doesn't record any of the AUX events and
+ trace data;
+
+- Once the perf tool receives the *USR2* signal, it triggers the callback
+ function ``auxtrace_record::snapshot_start()`` to deactivate hardware
+ tracing. The kernel driver then populates the AUX ring buffer with the
+ hardware trace data, and the event ``PERF_RECORD_AUX`` is stored in the
+ regular ring buffer;
+
+- Then perf tool takes a snapshot, ``record__read_auxtrace_snapshot()``
+ reads out the hardware trace data from the AUX ring buffer and saves it
+ into perf data file;
+
+- After the snapshot is finished, ``auxtrace_record::snapshot_finish()``
+ restarts the PMU event for AUX tracing.
+
+The perf only accesses the head pointer ``perf_event_mmap_page::aux_head``
+in snapshot mode and doesn’t touch tail pointer ``aux_tail``, this is
+because the AUX ring buffer can overflow in free run mode, the tail
+pointer is useless in this case. Alternatively, the callback
+``auxtrace_record::find_snapshot()`` is introduced for making the decision
+of whether the AUX ring buffer has been wrapped around or not, at the
+end it fixes up the AUX buffer's head which are used to calculate the
+trace data size.
+
+As we know, the buffers' deployment can be per-thread mode, per-CPU
+mode, or system wide mode, and the snapshot can be applied to any of
+these modes. Below is an example of taking snapshot with system wide
+mode.
+
+::
+
+ Snapshot is taken
+ |
+ v
+ +------------------------+
+ | AUX Ring buffer 0 | <- aux_head
+ +------------------------+
+ v
+ +--------------------------------+
+ | AUX Ring buffer 1 | <- aux_head
+ +--------------------------------+
+ v
+ +--------------------------------------------+
+ | AUX Ring buffer 2 | <- aux_head
+ +--------------------------------------------+
+ v
+ +---------------------------------------+
+ | AUX Ring buffer 3 | <- aux_head
+ +---------------------------------------+
+
+ Figure 9. Snapshot with system wide mode