aboutsummaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/cgroup-v2.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/cgroup-v2.rst')
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst99
1 files changed, 80 insertions, 19 deletions
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 17e6e9565156..86311c2907cd 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -239,6 +239,13 @@ cgroup v2 currently supports the following mount options.
will not be tracked by the memory controller (even if cgroup
v2 is remounted later on).
+ pids_localevents
+ The option restores v1-like behavior of pids.events:max, that is only
+ local (inside cgroup proper) fork failures are counted. Without this
+ option pids.events.max represents any pids.max enforcemnt across
+ cgroup's subtree.
+
+
Organizing Processes and Threads
--------------------------------
@@ -1058,12 +1065,15 @@ cpufreq governor about the minimum desired frequency which should always be
provided by a CPU, as well as the maximum desired frequency, which should not
be exceeded by a CPU.
-WARNING: cgroup2 doesn't yet support control of realtime processes and
-the cpu controller can only be enabled when all RT processes are in
-the root cgroup. Be aware that system management software may already
-have placed RT processes into nonroot cgroups during the system boot
-process, and these processes may need to be moved to the root cgroup
-before the cpu controller can be enabled.
+WARNING: cgroup2 doesn't yet support control of realtime processes. For
+a kernel built with the CONFIG_RT_GROUP_SCHED option enabled for group
+scheduling of realtime processes, the cpu controller can only be enabled
+when all RT processes are in the root cgroup. This limitation does
+not apply if CONFIG_RT_GROUP_SCHED is disabled. Be aware that system
+management software may already have placed RT processes into nonroot
+cgroups during the system boot process, and these processes may need
+to be moved to the root cgroup before the cpu controller can be enabled
+with a CONFIG_RT_GROUP_SCHED enabled kernel.
CPU Interface Files
@@ -1296,17 +1306,10 @@ PAGE_SIZE multiple when read back.
This is a simple interface to trigger memory reclaim in the
target cgroup.
- This file accepts a single key, the number of bytes to reclaim.
- No nested keys are currently supported.
-
Example::
echo "1G" > memory.reclaim
- The interface can be later extended with nested keys to
- configure the reclaim behavior. For example, specify the
- type of memory to reclaim from (anon, file, ..).
-
Please note that the kernel can over or under reclaim from
the target cgroup. If less bytes are reclaimed than the
specified amount, -EAGAIN is returned.
@@ -1318,6 +1321,17 @@ PAGE_SIZE multiple when read back.
This means that the networking layer will not adapt based on
reclaim induced by memory.reclaim.
+The following nested keys are defined.
+
+ ========== ================================
+ swappiness Swappiness value to reclaim with
+ ========== ================================
+
+ Specifying a swappiness value instructs the kernel to perform
+ the reclaim with that swappiness value. Note that this has the
+ same semantics as vm.swappiness applied to memcg reclaim with
+ all the existing limitations and potential future extensions.
+
memory.peak
A read-only single value file which exists on non-root
cgroups.
@@ -1432,7 +1446,7 @@ PAGE_SIZE multiple when read back.
sec_pagetables
Amount of memory allocated for secondary page tables,
this currently includes KVM mmu allocations on x86
- and arm64.
+ and arm64 and IOMMU page tables.
percpu (npn)
Amount of memory used for storing per-cpu kernel
@@ -1572,6 +1586,15 @@ PAGE_SIZE multiple when read back.
pglazyfreed (npn)
Amount of reclaimed lazyfree pages
+ zswpin
+ Number of pages moved in to memory from zswap.
+
+ zswpout
+ Number of pages moved out of memory to zswap.
+
+ zswpwb
+ Number of pages written from zswap to swap.
+
thp_fault_alloc (npn)
Number of transparent hugepages which were allocated to satisfy
a page fault. This counter is not present when CONFIG_TRANSPARENT_HUGEPAGE
@@ -2181,11 +2204,31 @@ PID Interface Files
Hard limit of number of processes.
pids.current
- A read-only single value file which exists on all cgroups.
+ A read-only single value file which exists on non-root cgroups.
The number of processes currently in the cgroup and its
descendants.
+ pids.peak
+ A read-only single value file which exists on non-root cgroups.
+
+ The maximum value that the number of processes in the cgroup and its
+ descendants has ever reached.
+
+ pids.events
+ A read-only flat-keyed file which exists on non-root cgroups. Unless
+ specified otherwise, a value change in this file generates a file
+ modified event. The following entries are defined.
+
+ max
+ The number of times the cgroup's total number of processes hit the pids.max
+ limit (see also pids_localevents).
+
+ pids.events.local
+ Similar to pids.events but the fields in the file are local
+ to the cgroup i.e. not hierarchical. The file modified event
+ generated on this file reflects only the local events.
+
Organisational operations are not blocked by cgroup policies, so it is
possible to have pids.current > pids.max. This can be done by either
setting the limit to be smaller than pids.current, or attaching enough
@@ -2320,8 +2363,12 @@ Cpuset Interface Files
is always a subset of it.
Users can manually set it to a value that is different from
- "cpuset.cpus". The only constraint in setting it is that the
- list of CPUs must be exclusive with respect to its sibling.
+ "cpuset.cpus". One constraint in setting it is that the list of
+ CPUs must be exclusive with respect to "cpuset.cpus.exclusive"
+ of its sibling. If "cpuset.cpus.exclusive" of a sibling cgroup
+ isn't set, its "cpuset.cpus" value, if set, cannot be a subset
+ of it to leave at least one CPU available when the exclusive
+ CPUs are taken away.
For a parent cgroup, any one of its exclusive CPUs can only
be distributed to at most one of its child cgroups. Having an
@@ -2337,8 +2384,8 @@ Cpuset Interface Files
cpuset-enabled cgroups.
This file shows the effective set of exclusive CPUs that
- can be used to create a partition root. The content of this
- file will always be a subset of "cpuset.cpus" and its parent's
+ can be used to create a partition root. The content
+ of this file will always be a subset of its parent's
"cpuset.cpus.exclusive.effective" if its parent is not the root
cgroup. It will also be a subset of "cpuset.cpus.exclusive"
if it is set. If "cpuset.cpus.exclusive" is not set, it is
@@ -2599,6 +2646,15 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
res_a 3
res_b 0
+ misc.peak
+ A read-only flat-keyed file shown in all cgroups. It shows the
+ historical maximum usage of the resources in the cgroup and its
+ children.::
+
+ $ cat misc.peak
+ res_a 10
+ res_b 8
+
misc.max
A read-write flat-keyed file shown in the non root cgroups. Allowed
maximum usage of the resources in the cgroup and its children.::
@@ -2628,6 +2684,11 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
The number of times the cgroup's resource usage was
about to go over the max boundary.
+ misc.events.local
+ Similar to misc.events but the fields in the file are local to the
+ cgroup i.e. not hierarchical. The file modified event generated on
+ this file reflects only the local events.
+
Migration and Ownership
~~~~~~~~~~~~~~~~~~~~~~~