.. SPDX-License-Identifier: GPL-2.0+


==========
Maple Tree
==========

:Author: Liam R. Howlett

Overview
========

The Maple Tree is a B-Tree data type which is optimized for storing
non-overlapping ranges, including ranges of size 1.  The tree was designed to
be simple to use and does not require a user written search method.  It
supports iterating over a range of entries and going to the previous or next
entry in a cache-efficient manner.  The tree can also be put into an RCU-safe
mode of operation which allows reading and writing concurrently.  Writers must
synchronize on a lock, which can be the default spinlock, or the user can set
the lock to an external lock of a different type.

The Maple Tree maintains a small memory footprint and was designed to use
modern processor cache efficiently.  The majority of the users will be able to
use the normal API.  An :ref:`maple-tree-advanced-api` exists for more complex
scenarios.  The most important usage of the Maple Tree is the tracking of the
virtual memory areas.

The Maple Tree can store values between ``0`` and ``ULONG_MAX``.  The Maple
Tree reserves values with the bottom two bits set to '10' which are below 4096
(i.e. 2, 6, 10 .. 4094) for internal use.  If the entries may use reserved
values, then users can convert the entries using xa_mk_value() and convert
them back by calling xa_to_value().  If the user needs to use a reserved
value, then the user can convert the value when using the
:ref:`maple-tree-advanced-api`, but such values are blocked by the normal API.

The Maple Tree can also be configured to support searching for a gap of a given
size (or larger).

Pre-allocation of nodes is also supported using the
:ref:`maple-tree-advanced-api`.  This is useful for users who must guarantee a
successful store operation within a given
code segment when allocating cannot be done.  Allocations of nodes are
relatively small at around 256 bytes.

.. _maple-tree-normal-api:

Normal API
==========

Start by initialising a maple tree, either with DEFINE_MTREE() for statically
allocated maple trees or mt_init() for dynamically allocated ones.  A
freshly-initialised maple tree contains a ``NULL`` pointer for the range ``0``
- ``ULONG_MAX``.  There are currently two types of maple trees supported: the
allocation tree and the regular tree.  The regular tree has a higher branching
factor for internal nodes.  The allocation tree has a lower branching factor
but allows the user to search for a gap of a given size or larger from either
``0`` upwards or ``ULONG_MAX`` down.  An allocation tree can be used by
passing in the ``MT_FLAGS_ALLOC_RANGE`` flag when initialising the tree.

You can then set entries using mtree_store() or mtree_store_range().
mtree_store() will overwrite any entry with the new entry and return 0 on
success or an error code otherwise.  mtree_store_range() works in the same way
but takes a range.  mtree_load() is used to retrieve the entry stored at a
given index.  You can use mtree_erase() to erase an entire range by only
knowing one value within that range, or a mtree_store() call with an entry of
NULL may be used to partially erase a range or many ranges at once.

If you want to only store a new entry to a range (or index) if that range is
currently ``NULL``, you can use mtree_insert_range() or mtree_insert() which
return -EEXIST if the range is not empty.

You can search for an entry from an index upwards by using mt_find().

You can walk each entry within a range by calling mt_for_each().  You must
provide a temporary variable to store a cursor.  If you want to walk each
element of the tree then ``0`` and ``ULONG_MAX`` may be used as the range.  If
the caller is going to hold the lock for the duration of the walk then it is
worth looking at the mas_for_each() API in the :ref:`maple-tree-advanced-api`
section.
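
The reserved-value rule described in the Overview can be illustrated with a
small userspace sketch.  This is a model for illustration only, not kernel
code: the ``model_*`` names are hypothetical, and the encoding shown (shift
left by one bit and set bit 0) is an assumption based on how xa_mk_value()
and xa_to_value() are understood to behave.  Kernel code should call the real
helpers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only; the model_* helpers are hypothetical names and
 * do not exist in the kernel.
 */

/* True if a raw value is below 4096 and its bottom two bits are '10'. */
static bool model_is_reserved(uintptr_t v)
{
	return v < 4096 && (v & 3) == 2;
}

/* Models xa_mk_value(): shift the integer left one bit and set bit 0. */
static uintptr_t model_mk_value(unsigned long v)
{
	return ((uintptr_t)v << 1) | 1;
}

/* Models xa_to_value(): drop the tag bit to recover the integer. */
static unsigned long model_to_value(uintptr_t entry)
{
	assert(entry & 1);	/* only tagged entries can be converted back */
	return (unsigned long)(entry >> 1);
}
```

Because a converted entry always has bit 0 set, its bottom two bits can never
be '10', so it cannot collide with the reserved set.  For example, the raw
value 2 is reserved, while its converted form is 5 and is safe to store.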

Sometimes it is necessary to ensure the next call to store to a maple tree does
not allocate memory.  Please see :ref:`maple-tree-advanced-api` for this use case.

You can use mtree_dup() to duplicate an entire maple tree.  This is more
efficient than inserting all elements one by one into a new tree.

Finally, you can remove all entries from a maple tree by calling
mtree_destroy().  If the maple tree entries are pointers, you may wish to free
the entries first.

Allocating Nodes
----------------

The allocations are handled by the internal tree code.  See
:ref:`maple-tree-advanced-alloc` for other options.

Locking
-------

You do not have to worry about locking.  See :ref:`maple-tree-advanced-locks`
for other options.

The Maple Tree uses RCU and an internal spinlock to synchronise access:

Takes RCU read lock:
 * mtree_load()
 * mt_find()
 * mt_for_each()
 * mt_next()
 * mt_prev()

Takes ma_lock internally:
 * mtree_store()
 * mtree_store_range()
 * mtree_insert()
 * mtree_insert_range()
 * mtree_erase()
 * mtree_dup()
 * mtree_destroy()
 * mt_set_in_rcu()
 * mt_clear_in_rcu()

If you want to take advantage of the internal lock to protect the data
structures that you are storing in the Maple Tree, you can call mtree_lock()
before calling mtree_load(), then take a reference count on the object you
have found before calling mtree_unlock().  This will prevent stores from
removing the object from the tree between looking up the object and
incrementing the refcount.  You can also use RCU to avoid dereferencing
freed memory, but an explanation of that is beyond the scope of this
document.

.. _maple-tree-advanced-api:

Advanced API
============

The advanced API offers more flexibility and better performance at the
cost of an interface which can be harder to use and has fewer safeguards.
You must take care of your own locking while using the advanced API.
You can use the ma_lock, RCU or an external lock for protection.
You can mix advanced and normal operations on the same tree, as long
as the locking is compatible.  The :ref:`maple-tree-normal-api` is implemented
in terms of the advanced API.

The advanced API is based around the ma_state; this is where the 'mas'
prefix originates.  The ma_state struct keeps track of tree operations to make
life easier for both internal and external tree users.

Initialising the maple tree is the same as in the :ref:`maple-tree-normal-api`.
Please see above.

The maple state keeps track of the range start and end in mas->index and
mas->last, respectively.

mas_walk() will walk the tree to the location of mas->index and set the
mas->index and mas->last according to the range for the entry.

You can set entries using mas_store().  mas_store() will overwrite any entry
with the new entry and return the first existing entry that is overwritten.
The range is passed in as members of the maple state: index and last.

You can use mas_erase() to erase an entire range by setting index and
last of the maple state to the desired range to erase.  This will erase
the first range that is found in that range, set the maple state index
and last as the range that was erased and return the entry that existed
at that location.

You can walk each entry within a range by using mas_for_each().  If you want
to walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
the range.  If the lock needs to be periodically dropped, see the locking
section and mas_pause().

Using a maple state allows mas_next() and mas_prev() to function as if the
tree was a linked list.  With such a high branching factor the amortized
performance penalty is outweighed by cache optimization.
mas_next() will
return the next entry which occurs after the entry at index.  mas_prev()
will return the previous entry which occurs before the entry at index.

mas_find() will find the first entry which exists at or above index on
the first call, and the next entry from every subsequent call.

mas_find_rev() will find the first entry which exists at or below the last on
the first call, and the previous entry from every subsequent call.

If the user needs to yield the lock during an operation, then the maple state
must be paused using mas_pause().

There are a few extra interfaces provided when using an allocation tree.
If you wish to search for a gap within a range, then mas_empty_area()
or mas_empty_area_rev() can be used.  mas_empty_area() searches for a gap
starting at the lowest index given up to the maximum of the range.
mas_empty_area_rev() searches for a gap starting at the highest index given
and continues downward to the lower bound of the range.

.. _maple-tree-advanced-alloc:

Advanced Allocating Nodes
-------------------------

Allocations are usually handled internally to the tree; however, if allocations
need to occur before a write occurs then calling mas_expected_entries() will
allocate the worst-case number of needed nodes to insert the provided number of
ranges.  This also causes the tree to enter mass insertion mode.  Once
insertions are complete, calling mas_destroy() on the maple state will free the
unused allocations.

.. _maple-tree-advanced-locks:

Advanced Locking
----------------

The maple tree uses a spinlock by default, but external locks can be used for
tree updates as well.  To use an external lock, the tree must be initialized
with the ``MT_FLAGS_LOCK_EXTERN`` flag; this is usually done with the
MTREE_INIT_EXT() #define, which takes an external lock as an argument.

Functions and structures
========================

.. kernel-doc:: include/linux/maple_tree.h
.. kernel-doc:: lib/maple_tree.c