Lines Matching +full:partition +full:- +full:file +full:- +full:system
1 .. _cgroup-v2:
11 conventions of cgroup v2. It describes all userland-visible aspects
14 v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgroup-v1>`.
19 1-1. Terminology
20 1-2. What is cgroup?
22 2-1. Mounting
23 2-2. Organizing Processes and Threads
24 2-2-1. Processes
25 2-2-2. Threads
26 2-3. [Un]populated Notification
27 2-4. Controlling Controllers
28 2-4-1. Enabling and Disabling
29 2-4-2. Top-down Constraint
30 2-4-3. No Internal Process Constraint
31 2-5. Delegation
32 2-5-1. Model of Delegation
33 2-5-2. Delegation Containment
34 2-6. Guidelines
35 2-6-1. Organize Once and Control
36 2-6-2. Avoid Name Collisions
38 3-1. Weights
39 3-2. Limits
40 3-3. Protections
41 3-4. Allocations
43 4-1. Format
44 4-2. Conventions
45 4-3. Core Interface Files
47 5-1. CPU
48 5-1-1. CPU Interface Files
49 5-2. Memory
50 5-2-1. Memory Interface Files
51 5-2-2. Usage Guidelines
52 5-2-3. Memory Ownership
53 5-3. IO
54 5-3-1. IO Interface Files
55 5-3-2. Writeback
56 5-3-3. IO Latency
57 5-3-3-1. How IO Latency Throttling Works
58 5-3-3-2. IO Latency Interface Files
59 5-3-4. IO Priority
60 5-4. PID
61 5-4-1. PID Interface Files
62 5-5. Cpuset
63 5-5-1. Cpuset Interface Files
64 5-6. Device
65 5-7. RDMA
66 5-7-1. RDMA Interface Files
67 5-8. HugeTLB
68 5-8-1. HugeTLB Interface Files
69 5-9. Misc
70 5-9-1. Miscellaneous cgroup Interface Files
71 5-9-2. Migration and Ownership
72 5-10. Others
73 5-10-1. perf_event
74 5-N. Non-normative information
75 5-N-1. CPU controller root cgroup process behaviour
76 5-N-2. IO controller root cgroup process behaviour
78 6-1. Basics
79 6-2. The Root and Views
80 6-3. Migration and setns(2)
81 6-4. Interaction with Other Namespaces
83 P-1. Filesystem Support for Writeback
86 R-1. Multiple Hierarchies
87 R-2. Thread Granularity
88 R-3. Competition Between Inner Nodes and Threads
89 R-4. Other Interface Issues
90 R-5. Controller Issues and Remedies
91 R-5-1. Memory
98 -----------
107 ---------------
110 distribute system resources along the hierarchy in a controlled and
113 cgroup is largely composed of two parts - the core and controllers.
116 distributing a specific type of system resource along the hierarchy
120 cgroups form a tree structure and every process in the system belongs
129 hierarchical - if a controller is enabled on a cgroup, it affects all
131 sub-hierarchy of the cgroup. When a controller is enabled on a nested
141 --------
146 # mount -t cgroup2 none $MOUNT_POINT
156 is no longer referenced in its current hierarchy. Because per-cgroup
163 to inter-controller dependencies, other controllers may need to be
170 controllers after system boot.
172 During transition to v2, system management software might still
182 option is system wide and can only be set on mount or modified
184 ignored on non-init namespace mounts. Please refer to the
199 This option is system wide and can only be set on mount or
201 option is ignored on non-init namespace mounts.
209 behavior but is a mount-option to avoid regressing setups
223 controller. The pre-allocated pool does not belong to anyone.
243 The option restores v1-like behavior of pids.events:max, that is only
251 --------------------------------
257 A child cgroup can be created by creating a sub-directory::
262 structure. Each cgroup has a read-writable interface file
264 belong to the cgroup one-per-line. The PIDs are not ordered and the
269 target cgroup's "cgroup.procs" file. Only one process can be migrated
289 cgroup is in use in the system, this file may contain multiple lines,
295 0::/test-cgroup/test-cgroup-nested
302 0::/test-cgroup/test-cgroup-nested (deleted)
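The two sample entries above follow the "hierarchy-ID:controller-list:path" layout of "/proc/$PID/cgroup". A minimal Python sketch of splitting such an entry is given here. It assumes that the v2 entry carries hierarchy ID 0 with an empty controller list and that a removed cgroup is marked by a trailing " (deleted)". The helper name parse_proc_cgroup_line is hypothetical.

```python
def parse_proc_cgroup_line(line):
    """Split one "/proc/$PID/cgroup" entry into its fields.

    Hypothetical helper for illustration; not part of any kernel tooling.
    """
    # Only the first two colons are separators; the path may contain more.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    deleted = path.endswith(" (deleted)")
    if deleted:
        path = path[: -len(" (deleted)")]
    return {
        "hierarchy_id": int(hierarchy_id),
        "controllers": controllers.split(",") if controllers else [],
        "path": path,
        "deleted": deleted,
    }


live = parse_proc_cgroup_line("0::/test-cgroup/test-cgroup-nested")
gone = parse_proc_cgroup_line("0::/test-cgroup/test-cgroup-nested (deleted)")
```

The sketch cannot tell a removed cgroup apart from a live one whose name really ends in " (deleted)", so it should be treated as a reading aid only.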
328 constraint - threaded controllers can be enabled on non-leaf cgroups
339 "cgroup.type" file which indicates whether the cgroup is a normal
344 threaded by writing "threaded" to the "cgroup.type" file. The
352 - As the cgroup will join the parent's resource domain. The parent
355 - When the parent is an unthreaded domain, it must not have any domain
359 Topology-wise, a cgroup can be in an invalid state. Please consider
362 A (threaded domain) - B (threaded) - C (domain, just created)
366 threaded cgroup. "cgroup.type" file will report "domain invalid" in
372 "cgroup.subtree_control" file while there are processes in the cgroup.
377 threads in the cgroup. Except that the operations are per-thread
378 instead of per-process, "cgroup.threads" has the same format and
400 between threads in a non-leaf cgroup and its child cgroups. Each
406 - cpu
407 - cpuset
408 - perf_event
409 - pids
412 --------------------------
414 Each non-root cgroup has a "cgroup.events" file which contains
415 "populated" field indicating whether the cgroup's sub-hierarchy has
419 example, to start a clean-up operation after all processes of a given
420 sub-hierarchy have exited. The populated state updates and
421 notifications are recursive. Consider the following sub-hierarchy
425 A(4) - B(0) - C(1)
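The recursive nature of the populated state can be sketched in Python. This is an illustration rather than kernel code. It assumes that the numbers in parentheses are per-cgroup process counts and it models only the A - B - C chain shown above. The function name populated and both dictionaries are hypothetical.

```python
def populated(tree, live_procs, name):
    """Return 1 if `name` or any of its descendants holds live processes."""
    if live_procs.get(name, 0) > 0:
        return 1
    # Otherwise the state is inherited from the children, recursively.
    return int(any(populated(tree, live_procs, child)
                   for child in tree.get(name, [])))


# Parent -> children, and the number of processes directly in each cgroup.
tree = {"A": ["B"], "B": ["C"], "C": []}
live_procs = {"A": 4, "B": 0, "C": 1}

states = {name: populated(tree, live_procs, name) for name in tree}
```

In this model B is populated only because of C, so once C's single process exits both B and C flip to 0 while A stays at 1.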
430 file modified events will be generated on the "cgroup.events" files of
435 -----------------------
440 Each cgroup has a "cgroup.controllers" file which lists all
447 disabled by writing to the "cgroup.subtree_control" file::
449 # echo "+cpu +memory -io" > cgroup.subtree_control
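The effect of such a write on the set of enabled controllers can be modelled with a short Python sketch. It only mirrors the "+name" / "-name" syntax. It does not reproduce the checks the kernel performs, such as whether a controller is listed in "cgroup.controllers". The function name apply_subtree_control is hypothetical.

```python
def apply_subtree_control(enabled, request):
    """Return the controller set after applying a "+name -name ..." request."""
    result = set(enabled)
    for token in request.split():
        sign, name = token[0], token[1:]
        if sign not in "+-" or not name:
            raise ValueError(f"malformed token: {token!r}")
        if sign == "+":
            result.add(name)
        else:
            result.discard(name)
    return result


# Start with only "io" enabled, then apply the request from the example.
after = apply_subtree_control({"io"}, "+cpu +memory -io")
```

Tokens are applied in order, so when the same controller appears more than once the last occurrence decides the outcome in this model.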
458 Consider the following sub-hierarchy. The enabled controllers are
461 A(cpu,memory) - B(memory) - C()
475 controller interface files - anything which doesn't start with
479 Top-down Constraint
482 Resources are distributed top-down and a cgroup can further distribute
484 parent. This means that all non-root "cgroup.subtree_control" files
486 "cgroup.subtree_control" file. A controller can be enabled only if
494 Non-root cgroups can distribute domain resources to their children
509 refer to the Non-normative information section in the Controllers
518 file.
522 ----------
544 delegated, the user can build sub-hierarchy under the directory,
548 happens in the delegated sub-hierarchy, nothing can escape the
552 cgroups in or nesting depth of a delegated sub-hierarchy; however,
559 A delegated sub-hierarchy is contained in the sense that processes
560 can't be moved into or out of the sub-hierarchy by the delegatee.
563 requiring the following conditions for a process with a non-root euid
565 "cgroup.procs" file.
567 - The writer must have write access to the "cgroup.procs" file.
569 - The writer must have write access to the "cgroup.procs" file of the
573 processes around freely in the delegated sub-hierarchy it can't pull
574 in from or push out to outside the sub-hierarchy.
580 ~~~~~~~~~~~~~ - C0 - C00
583 ~~~~~~~~~~~~~ - C1 - C10
587 file; however, the common ancestor of the source cgroup C10 and the
590 will be denied with -EACCES.
595 is not reachable, the migration is rejected with -ENOENT.
599 ----------
607 inherent trade-offs between migration and various hot paths in terms
612 should be assigned to a cgroup according to the system's logical and
613 resource structure once on start-up. Dynamic adjustments to resource
629 character for collision avoidance. Also, interface file names won't
646 -------
652 work-conserving. Due to the dynamic nature, this model is usually
667 .. _cgroupv2-limits-distributor:
670 ------
673 Limits can be over-committed - the sum of the limits of children can
678 As limits can be over-committed, all configuration combinations are
685 .. _cgroupv2-protections-distributor:
688 -----------
693 soft boundaries. Protections can also be over-committed in which case
700 As protections can be over-committed, all configuration combinations
704 "memory.low" implements best-effort memory protection and is an
709 -----------
712 resource. Allocations can't be over-committed - the sum of the
719 As allocations can't be over-committed, some configuration
724 "cpu.rt.max" hard-allocates realtime slices and is an example of this
732 ------
737 New-line separated values
745 (when read-only or multiple values can be written at once)
761 For a writable file, the format for writing should generally match
771 -----------
773 - Settings for a single feature should be contained in a single file.
775 - The root cgroup should be exempt from resource control and thus
778 - The default time unit is microseconds. If a different unit is ever
781 - A parts-per quantity should use a percentage decimal with at least
782 two digit fractional part - e.g. 13.40.
784 - If a controller implements weight based resource distribution, its
785 interface file should be named "weight" and have the range [1,
790 - If a controller implements an absolute resource guarantee and/or
799 - If a setting has a configurable default value and keyed specific
801 appear as the first entry in the file.
813 # cat cgroup-example-interface-file
819 # echo 125 > cgroup-example-interface-file
823 # echo "default 125" > cgroup-example-interface-file
827 # echo "8:16 170" > cgroup-example-interface-file
831 # echo "8:0 default" > cgroup-example-interface-file
832 # cat cgroup-example-interface-file
836 - For events which are not very high frequency, an interface file
838 Whenever a notifiable event happens, file modified event should be
839 generated on the file.
843 --------------------
848 A read-write single value file which exists on non-root
854 - "domain" : A normal valid domain cgroup.
856 - "domain threaded" : A threaded domain cgroup which is
859 - "domain invalid" : A cgroup which is in an invalid state.
863 - "threaded" : A threaded cgroup which is a member of a
867 "threaded" to this file.
870 A read-write new-line separated values file which exists on
874 the cgroup one-per-line. The PIDs are not ordered and the
883 - It must have write access to the "cgroup.procs" file.
885 - It must have write access to the "cgroup.procs" file of the
888 When delegating a sub-hierarchy, write access to this file
891 In a threaded cgroup, reading this file fails with EOPNOTSUPP
896 A read-write new-line separated values file which exists on
900 the cgroup one-per-line. The TIDs are not ordered and the
909 - It must have write access to the "cgroup.threads" file.
911 - The cgroup that the thread is currently in must be in the
914 - It must have write access to the "cgroup.procs" file of the
917 When delegating a sub-hierarchy, write access to this file
921 A read-only space separated values file which exists on all
928 A read-write space separated values file which exists on all
935 Space separated list of controllers prefixed with '+' or '-'
937 name prefixed with '+' enables the controller and '-'
943 A read-only flat-keyed file which exists on non-root cgroups.
945 otherwise, a value change in this file generates a file
955 A read-write single value file. The default is "max".
962 A read-write single value file. The default is "max".
969 A read-only flat-keyed file with the following entries:
978 on system load) before being completely destroyed.
983 A dying cgroup can consume system resources not exceeding
995 A read-write single value file which exists on non-root cgroups.
998 Writing "1" to the file causes freezing of the cgroup and all
1002 is completed, the "frozen" value in the cgroup.events control file
1018 create new sub-cgroups.
1021 A write-only single value file which exists in non-root cgroups.
1024 Writing "1" to the file causes the cgroup and all descendant cgroups to
1031 In a threaded cgroup, writing this file fails with EOPNOTSUPP as
1033 the whole thread-group.
1036 A read-write single value file whose allowed values are "0" and "1".
1039 Writing "0" to the file will disable the cgroup PSI accounting.
1040 Writing "1" to the file will re-enable the cgroup PSI accounting.
1048 This may cause non-negligible overhead for some workloads when under
1050 be used to disable PSI accounting in the non-leaf cgroups.
1053 A read-write nested-keyed file.
1061 .. _cgroup-v2-cpu:
1064 ---
1082 not apply if CONFIG_RT_GROUP_SCHED is disabled. Be aware that system
1084 cgroups during the system boot process, and these processes may need
1095 A read-only flat-keyed file.
1096 This file exists whether the controller is enabled or not.
1100 - usage_usec
1101 - user_usec
1102 - system_usec
1106 - nr_periods
1107 - nr_throttled
1108 - throttled_usec
1109 - nr_bursts
1110 - burst_usec
1113 A read-write single value file which exists on non-root
1123 A read-write single value file which exists on non-root
1126 The nice value is in the range [-20, 19].
1128 This interface file is an alternative interface for
1135 A read-write two value file which exists on non-root cgroups.
1147 A read-write single value file which exists on non-root
1153 A read-write nested-keyed file.
1159 A read-write single value file which exists on non-root cgroups.
1174 A read-write single value file which exists on non-root cgroups.
1185 A read-write single value file which exists on non-root cgroups.
1188 This is the cgroup analog of the per-task SCHED_IDLE sched policy.
1197 ------
1205 While not completely water-tight, all major memory usages by a given
1210 - Userland memory - page cache and anonymous memory.
1212 - Kernel data structures such as dentries and inodes.
1214 - TCP socket buffers.
1227 A read-only single value file which exists on non-root
1234 A read-write single value file which exists on non-root
1260 A read-write single value file which exists on non-root
1263 Best-effort memory protection. If the memory usage of a
1283 A read-write single value file which exists on non-root
1297 A read-write single value file which exists on non-root
1306 In default configuration regular 0-order allocations always
1311 as -ENOMEM or silently ignore in cases like disk readahead.
1314 A write-only nested-keyed file which exists for all cgroups.
1325 specified amount, -EAGAIN is returned.
1346 A read-write single value file which exists on non-root cgroups.
1351 A write of any non-empty string to this file resets it to the
1353 file descriptor.
1356 A read-write single value file which exists on non-root
1366 Tasks with the OOM protection (oom_score_adj set to -1000)
1374 A read-only flat-keyed file which exists on non-root cgroups.
1376 otherwise, a value change in this file generates a file
1379 Note that all fields in this file are hierarchical and the
1380 file modified event can be generated due to an event down the
1388 boundary is over-committed.
1408 considered as an option, e.g. for failed high-order
1419 Similar to memory.events but the fields in the file are local
1420 to the cgroup i.e. not hierarchical. The file modified event
1421 generated on this file reflects only the local events.
1424 A read-only flat-keyed file which exists on non-root cgroups.
1427 types of memory, type-specific details, and other information
1428 on the state and past events of the memory management system.
1436 If the entry has no per-node counter (or is not shown in the
1437 memory.numa_stat). We use 'npn' (non-per-node) as the tag
1444 file
1465 Amount of memory used for storing per-cpu kernel
1475 Amount of cached filesystem data that is swap-backed,
1512 Amount of memory, swap-backed and filesystem-backed,
1518 the value for the foo counter, since the foo counter is type-based, not
1519 list-based.
1530 Amount of memory used for storing in-kernel data
1537 Number of refaults of previously evicted file pages.
1544 Number of refaulted file pages that were immediately activated.
1551 Number of restored file pages which have been detected as an
1608 Number of zero-filled pages swapped out with I/O skipped due to the
1659 A read-only nested-keyed file which exists on non-root cgroups.
1662 types of memory, type-specific details, and other information
1663 per node on the state of the memory management system.
1684 A read-only single value file which exists on non-root
1691 A read-write single value file which exists on non-root
1696 allow userspace to implement custom out-of-memory procedures.
1707 A read-write single value file which exists on non-root cgroups.
1712 A write of any non-empty string to this file resets it to the
1714 file descriptor.
1717 A read-write single value file which exists on non-root
1724 A read-only flat-keyed file which exists on non-root cgroups.
1726 otherwise, a value change in this file generates a file
1740 because of running out of swap system-wide or max
1749 A read-only single value file which exists on non-root
1756 A read-write single value file which exists on non-root
1764 A read-write single value file. The default value is "1".
1782 A read-only nested-keyed file.
1792 Over-committing on high limit (sum of high limits > available memory)
1804 network to a file can use all available memory but can also operate as
1806 pressure - how much the workload is being impacted due to lack of
1807 memory - is necessary to determine whether a workload needs more
1821 To which cgroup the area will be charged is nondeterministic; however,
1832 --
1837 only if cfq-iosched is in use and neither scheme is available for
1838 blk-mq devices.
1845 A read-only nested-keyed file.
1865 A read-write nested-keyed file which exists only on the root
1868 This file configures the Quality of Service of the IO cost
1877 enable Weight-based control enable
1909 devices which show wide temporary behavior changes - e.g. a
1920 A read-write nested-keyed file which exists only on the root
1923 This file configures the cost model of the IO cost model based
1933 model The cost model in use - "linear"
1959 generate device-specific coefficients.
1962 A read-write flat-keyed file which exists on non-root cgroups.
1982 A read-write nested-keyed file which exists on non-root
1996 When writing, any number of nested key-value pairs can be
2021 A read-only nested-keyed file.
2040 writes out dirty pages for the memory domain. Both system-wide and
2041 per-cgroup dirty memory states are examined and the more restrictive
2079 memory controller and system-wide clean memory.
2112 your real setting, setting at 10-15% higher than the value in io.stat.
2122 - Queue depth throttling. This is the number of outstanding IOs a group is
2126 - Artificial delay induction. There are certain types of IO that cannot be
2173 no-change
2176 promote-to-rt
2177 For requests that have a non-RT I/O priority class, change it into RT.
2181 restrict-to-be
2191 none-to-rt
2192 Deprecated. Just an alias for promote-to-rt.
2196 +----------------+---+
2197 | no-change | 0 |
2198 +----------------+---+
2199 | promote-to-rt | 1 |
2200 +----------------+---+
2201 | restrict-to-be | 2 |
2202 +----------------+---+
2204 +----------------+---+
2208 +-------------------------------+---+
2210 +-------------------------------+---+
2211 | IOPRIO_CLASS_RT (real-time) | 1 |
2212 +-------------------------------+---+
2214 +-------------------------------+---+
2216 +-------------------------------+---+
2220 - If I/O priority class policy is promote-to-rt, change the request I/O
2223 - If I/O priority class policy is not promote-to-rt, translate the I/O priority
2229 ---
2248 A read-write single value file which exists on non-root
2254 A read-only single value file which exists on non-root cgroups.
2260 A read-only single value file which exists on non-root cgroups.
2266 A read-only flat-keyed file which exists on non-root cgroups. Unless
2267 specified otherwise, a value change in this file generates a file
2275 Similar to pids.events but the fields in the file are local
2276 to the cgroup i.e. not hierarchical. The file modified event
2277 generated on this file reflects only the local events.
2284 through fork() or clone(). These will return -EAGAIN if the creation
2289 ------
2296 memory placement to reduce cross-node memory access and contention
2297 can improve overall system performance.
2307 A read-write multiple values file which exists on non-root
2308 cpuset-enabled cgroups.
2315 The CPU numbers are comma-separated numbers or ranges.
2319 0-4,6,8-10
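A list in this comma-separated number-or-range format can be expanded with a few lines of Python. This is a minimal sketch under the assumption that ranges are inclusive and written as "first-last". The function name parse_cpu_list is hypothetical.

```python
def parse_cpu_list(text):
    """Expand a list such as "0-4,6,8-10" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file yields an empty list
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


cpus = parse_cpu_list("0-4,6,8-10")
```

The same helper applies to memory node lists, which use the same number-or-range syntax.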
2322 setting as the nearest cgroup ancestor with a non-empty
2329 A read-only multiple values file which exists on all
2330 cpuset-enabled cgroups.
2336 If "cpuset.cpus" is empty, the "cpuset.cpus.effective" file shows
2346 A read-write multiple values file which exists on non-root
2347 cpuset-enabled cgroups.
2354 The memory node numbers are comma-separated numbers or ranges.
2358 0-1,3
2361 setting as the nearest cgroup ancestor with a non-empty
2368 Setting a non-empty value to "cpuset.mems" causes memory of
2380 A read-only multiple values file which exists on all
2381 cpuset-enabled cgroups.
2396 A read-write multiple values file which exists on non-root
2397 cpuset-enabled cgroups.
2400 to create a new cpuset partition. Its value is not used
2401 unless the cgroup becomes a valid partition root. See the
2402 "cpuset.cpus.partition" section below for a description of what
2403 a cpuset partition is.
2405 When the cgroup becomes a partition root, the actual exclusive
2406 CPUs that are allocated to that partition are listed in
2426 The root cgroup is a partition root and all its available CPUs
2430 A read-only multiple values file which exists on all non-root
2431 cpuset-enabled cgroups.
2433 This file shows the effective set of exclusive CPUs that
2434 can be used to create a partition root. The content
2435 of this file will always be a subset of its parent's
2440 formation of a local partition.
2443 A read-only multiple values file which exists only on the root cgroup.
2445 This file shows the set of all isolated CPUs used in existing
2446 isolated partitions. It will be empty if no isolated partition
2449 cpuset.cpus.partition
2450 A read-write single value file which exists on non-root
2451 cpuset-enabled cgroups. This flag is owned by the parent cgroup
2457 "member" Non-root member of a partition
2458 "root" Partition root
2459 "isolated" Partition root without load balancing
2462 A cpuset partition is a collection of cpuset-enabled cgroups with
2463 a partition root at the top of the hierarchy and its descendants
2464 except those that are separate partition roots themselves and
2465 their descendants. A partition has exclusive access to the
2467 of that partition cannot use any CPUs in that set.
2469 There are two types of partitions - local and remote. A local
2470 partition is one whose parent cgroup is also a valid partition
2471 root. A remote partition is one whose parent cgroup is not a
2472 valid partition root itself. Writing to "cpuset.cpus.exclusive"
2473 is optional for the creation of a local partition as its
2474 "cpuset.cpus.exclusive" file will assume an implicit value that
2477 before the target partition root is mandatory for the creation
2478 of a remote partition.
2480 Currently, a remote partition cannot be created under a local
2481 partition. All the ancestors of a remote partition root except
2482 the root cgroup cannot be a partition root.
2484 The root cgroup is always a partition root and its state cannot
2485 be changed. All other non-root cgroups start out as "member".
2488 partition or scheduling domain. The set of exclusive CPUs is
2491 When set to "isolated", the CPUs in that partition will be in
2494 a partition with multiple CPUs should be carefully distributed
2497 A partition root ("root" or "isolated") can be in one of the
2498 two possible states - valid or invalid. An invalid partition
2505 On read, the "cpuset.cpus.partition" file can show the following
2509 "member" Non-root member of a partition
2510 "root" Partition root
2511 "isolated" Partition root without load balancing
2512 "root invalid (<reason>)" Invalid partition root
2513 "isolated invalid (<reason>)" Invalid isolated partition root
2516 In the case of an invalid partition root, a descriptive string on
2517 why the partition is invalid is included within parentheses.
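The values listed above are regular enough to be split mechanically. The Python sketch below separates the partition type from the optional invalid reason. The function name parse_partition_state is hypothetical, and the reason string used in the example is a made-up placeholder, not a message the kernel is known to emit.

```python
import re

# "member", "root", "isolated", optionally followed by " invalid (<reason>)".
_STATE = re.compile(r"^(member|root|isolated)(?: invalid \((.*)\))?$")


def parse_partition_state(text):
    """Return (type, reason); reason is None unless the root is invalid."""
    match = _STATE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized partition state: {text!r}")
    kind, reason = match.groups()
    if kind == "member" and reason is not None:
        raise ValueError("a member cannot be an invalid partition root")
    return kind, reason


valid_root = parse_partition_state("root\n")
# Placeholder reason text, chosen only for this example.
bad_isolated = parse_partition_state("isolated invalid (example reason)")
```

A monitoring script could pair such a parser with the file modified events on "cpuset.cpus.partition" instead of polling the file.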
2519 For a local partition root to be valid, the following conditions
2522 1) The parent cgroup is a valid partition root.
2523 2) The "cpuset.cpus.exclusive.effective" file cannot be empty,
2526 no task associated with this partition.
2528 For a remote partition root to be valid, all the above conditions
2532 "cpuset.cpus.exclusive" can cause a valid partition root to
2536 A valid non-root parent partition may distribute out all its CPUs
2540 Care must be taken to change a valid partition root to "member"
2544 their parent is switched back to a partition root with a proper
2548 "cpuset.cpus.partition" changes. That includes changes caused
2549 by writes to "cpuset.cpus.partition", cpu hotplug or other
2550 changes that modify the validity status of the partition.
2552 to "cpuset.cpus.partition" without the need to do continuous
2555 A user can pre-configure certain CPUs to an isolated state
2558 into a partition, they have to be used in an isolated partition.
2562 -----------------
2572 device file, corresponding BPF programs will be executed, and depending
2573 on the return value the attempt will succeed or fail with -EPERM.
2578 If the program returns 0, the attempt fails with -EPERM, otherwise it
2586 ----
2595 A read-write nested-keyed file that exists for all the cgroups
2616 A read-only file that describes current resource usage.
2625 -------
2642 A read-only flat-keyed file which exists on non-root cgroups.
2648 Similar to hugetlb.<hugepagesize>.events but the fields in the file
2649 are local to the cgroup i.e. not hierarchical. The file modified event
2650 generated on this file reflects only the local events.
2655 use hugetlb pages are included. The per-node values are in bytes.
2658 ----
2666 include/linux/misc_cgroup.h file and the corresponding name via misc_res_name[]
2667 in the kernel/cgroup/misc.c file. Provider of the resource must set its
2680 A read-only flat-keyed file shown only in the root cgroup. It shows
2689 A read-only flat-keyed file shown in all cgroups. It shows
2697 A read-only flat-keyed file shown in all cgroups. It shows the
2706 A read-write flat-keyed file shown in non-root cgroups. Allowed
2722 file.
2725 A read-only flat-keyed file which exists on non-root cgroups. The
2727 change in this file generates a file modified event. All fields in
2728 this file are hierarchical.
2735 Similar to misc.events but the fields in the file are local to the
2736 cgroup i.e. not hierarchical. The file modified event generated on
2737 this file reflects only the local events.
2748 ------
2759 Non-normative information
2760 -------------------------
2775 kernel/sched/core.c file (values from this array should be scaled
2776 appropriately so the neutral - nice 0 - value is 100 instead of 1024).
2792 ------
2795 "/proc/$PID/cgroup" file and cgroup mounts. The CLONE_NEWCGROUP clone
2802 Without cgroup namespace, the "/proc/$PID/cgroup" file shows the
2805 "/proc/$PID/cgroup" file may leak potential system level information
2811 The path '/batchjobs/container_id1' can be considered as system-data
2816 # ls -l /proc/self/ns/cgroup
2817 lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
2823 # ls -l /proc/self/ns/cgroup
2824 lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
2828 When some thread from a multi-threaded process unshares its cgroup
2840 ------------------
2851 # ~/unshare -c # unshare cgroupns in some cgroup
2859 Each process gets its namespace-specific view of "/proc/$PID/cgroup"
2890 ----------------------
2919 ---------------------------------
2922 running inside a non-init cgroup namespace::
2924 # mount -t cgroup2 none $MOUNT_POINT
2930 The virtualization of /proc/self/cgroup file combined with restricting
2931 the view of cgroup hierarchy by namespace-private cgroupfs mount
2944 --------------------------------
2947 address_space_operations->writepage[s]() to annotate bios using the
2964 super_block by setting SB_I_CGROUPWB in ->s_iflags. This allows for
2981 - Multiple hierarchies including named ones are not supported.
2983 - None of the v1 mount options are supported.
2985 - The "tasks" file is removed and "cgroup.procs" is not sorted.
2987 - "cgroup.clone_children" is removed.
2989 - /proc/cgroups is meaningless for v2. Use "cgroup.controllers" or
2997 --------------------
3050 ------------------
3056 individual applications and system management interface.
3058 Generally, in-process knowledge is available only to the process
3059 itself; thus, unlike service-level organization of processes,
3066 sub-hierarchies and control resource distributions along them. This
3067 effectively raised cgroup to the status of a syscall-like API exposed
3077 that the process would actually be operating on its own sub-hierarchy.
3081 system-management pseudo filesystem. cgroup ended up with interface
3084 individual applications through the ill-defined delegation mechanism
3094 -------------------------------------------
3105 cycles and the number of internal threads fluctuated - the ratios
3121 clearly defined. There were attempts to add ad-hoc behaviors and
3135 ----------------------
3139 was how an empty cgroup was notified - a userland helper binary was
3142 to in-kernel event delivery filtering mechanism further complicating
3164 ------------------------------
3171 global reclaim prefers is opt-in, rather than opt-out. The costs for
3179 introduces high allocation latencies into the system, but also impacts
3180 system performance due to overreclaim, to the point where the feature
3181 becomes self-defeating.
3183 The memory.low boundary on the other hand is a top-down allocated
3213 system than killing the group. Otherwise, memory.max is there to
3221 new limit is met - or the task writing to memory.max is killed.
3230 groups can sabotage swapping by other means - such as referencing its
3231 anonymous memory in a tight loop - and an admin cannot assume full
3237 resources. Swap space is a resource like all others in the system,