1 .. SPDX-License-Identifier: GPL-2.0
4 Page Tables
10 feature of all Unix-like systems as time went by. In 1985 the feature was
13 Page tables map virtual addresses as seen by the CPU into physical addresses
16 Linux defines page tables as a hierarchy which is currently five levels in
21 by the underlying physical page frame. The **page frame number** or **pfn**
22 is the physical address of the page (as seen on the external memory bus)
26 the last page of physical memory the external address bus of the CPU can
29 With a page granularity of 4KB and an address range of 32 bits, pfn 0 is at
31 and so on until we reach pfn 0xfffff at 0xfffff000. With 16KB pages pfns are
34 As you can see, with 4KB pages the page base address uses bits 12-31 of the
36 `PAGE_SIZE` is usually defined in terms of the page shift as `(1 << PAGE_SHIFT)`
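
The relationship between a pfn, the page shift and a physical address can be
sketched in plain userspace C. This is only an illustration under the
assumption of 4KB pages (a page shift of 12); every `pfn_sketch_*` and
`PFN_SKETCH_*` name below is hypothetical and not a kernel symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4KB pages, i.e. a page shift of 12. */
#define PFN_SKETCH_PAGE_SHIFT 12
#define PFN_SKETCH_PAGE_SIZE  (UINT64_C(1) << PFN_SKETCH_PAGE_SHIFT)

static_assert(PFN_SKETCH_PAGE_SIZE == 4096, "this sketch assumes 4KB pages");

/* Physical base address of the page frame with the given pfn. */
static inline uint64_t pfn_sketch_to_phys(uint64_t pfn)
{
	return pfn << PFN_SKETCH_PAGE_SHIFT;
}

/* Page frame number of the page containing a physical address. */
static inline uint64_t pfn_sketch_from_phys(uint64_t phys)
{
	return phys >> PFN_SKETCH_PAGE_SHIFT;
}

/* Byte offset of a physical address inside its page. */
static inline uint64_t pfn_sketch_offset(uint64_t phys)
{
	return phys & (PFN_SKETCH_PAGE_SIZE - 1);
}
```

With these helpers, pfn 0xfffff maps to the base address 0xfffff000, matching
the 32-bit example above.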
39 sizes. When Linux was created, 4KB pages and a single page table called
40 `swapper_pg_dir` with 1024 entries were used, covering 4MB, which coincided with
41 the fact that Torvalds's first computer had 4MB of physical memory. Entries in
42 this single table were referred to as *PTEs* - page table entries.
44 The software page table hierarchy reflects the fact that page table hardware has
45 become hierarchical and that in turn is done to save page table memory and
48 One could of course imagine a single, linear page table with enormous amounts
49 of entries, breaking down the whole memory into single pages. Such a page table
51 remains unused. By using hierarchical page tables large holes in the virtual
52 address space do not waste valuable page table memory, because it will suffice
53 to mark large areas as unmapped at a higher level in the page table hierarchy.
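
To make the cost of a single linear page table concrete, the following
userspace sketch computes how many bytes such a table would need. The function
name and its parameters are hypothetical; the 48-bit address width and 8-byte
entry size used in the note afterwards are example values, not something this
document specifies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes of a hypothetical flat page table that has one entry for
 * every page of a virtual address space that is va_bits wide.
 */
static inline uint64_t linear_sketch_table_bytes(unsigned int va_bits,
						 unsigned int page_shift,
						 unsigned int entry_bytes)
{
	/* Keep the result within 64 bits. */
	assert(va_bits > page_shift);
	assert(va_bits - page_shift <= 60 && entry_bytes <= 8);

	return (UINT64_C(1) << (va_bits - page_shift)) * entry_bytes;
}
```

For a 32-bit address space with 4KB pages and 4-byte entries this gives 4MB of
page table per address space; for a 48-bit address space with 8-byte entries it
gives 512GB, which is why unmapped holes need to be cheap.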
55 Additionally, on modern CPUs, a higher level page table entry can point directly
57 megabytes or even gigabytes in a single high-level page table entry, taking
61 The page table hierarchy has now developed into this::
63 +-----+
65 +-----+
67 | +-----+
68 +-->| P4D |
69 +-----+
71 | +-----+
72 +-->| PUD |
73 +-----+
75 | +-----+
76 +-->| PMD |
77 +-----+
79 | +-----+
80 +-->| PTE |
81 +-----+
84 Symbols on the different levels of the page table hierarchy have the following
87 - **pte**, `pte_t`, `pteval_t` = **Page Table Entry** - mentioned earlier.
89 mapping a single page of virtual memory to a single page of physical memory.
92 A typical example is that the `pteval_t` is a 32- or 64-bit value with the
93 upper bits being a **pfn** (page frame number), and the lower bits being some
94 architecture-specific bits such as memory protection.
97 this did refer to a single page table entry in the single top level page
98 table, it was retrofitted to be an array of mapping elements when two-level
99 page tables were first introduced, so the *pte* is the lowermost page
100 *table*, not a page table *entry*.
102 - **pmd**, `pmd_t`, `pmdval_t` = **Page Middle Directory**, the hierarchy right
105 - **pud**, `pud_t`, `pudval_t` = **Page Upper Directory** was introduced after
106 the other levels to handle 4-level page tables. It is potentially unused,
109 - **p4d**, `p4d_t`, `p4dval_t` = **Page Level 4 Directory** was introduced to
110 handle 5-level page tables after the *pud* was introduced. Now it was clear
113 is only used on systems which actually have 5 levels of page tables, otherwise
116 - **pgd**, `pgd_t`, `pgdval_t` = **Page Global Directory** - the Linux kernel
117 main page table handling the PGD for the kernel memory is still found in
122 `pgd_t *pgd` pointer to the corresponding page global directory.
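
The typical `pteval_t` layout mentioned in the **pte** entry of the list above
(a pfn in the upper bits, architecture-specific bits in the lower ones) can be
sketched as follows. The bit positions and all `pte_sketch_*` names are
assumptions made for illustration only; real layouts are defined by each
architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit entry: flags in bits 0-11, pfn in bits 12 and up. */
typedef uint64_t pte_sketch_val_t;

#define PTE_SKETCH_PFN_SHIFT  12
#define PTE_SKETCH_FLAGS_MASK ((UINT64_C(1) << PTE_SKETCH_PFN_SHIFT) - 1)

/* Made-up flag bits standing in for architecture-specific ones. */
#define PTE_SKETCH_PRESENT (UINT64_C(1) << 0)
#define PTE_SKETCH_WRITE   (UINT64_C(1) << 1)
#define PTE_SKETCH_DIRTY   (UINT64_C(1) << 6)

/* Combine a pfn and a set of flag bits into one entry value. */
static inline pte_sketch_val_t pte_sketch_make(uint64_t pfn, uint64_t flags)
{
	assert((flags & ~PTE_SKETCH_FLAGS_MASK) == 0);
	return (pfn << PTE_SKETCH_PFN_SHIFT) | flags;
}

/* Extract the pfn stored in the upper bits. */
static inline uint64_t pte_sketch_pfn(pte_sketch_val_t pte)
{
	return pte >> PTE_SKETCH_PFN_SHIFT;
}

/* Extract the flag bits stored in the lower bits. */
static inline uint64_t pte_sketch_flags(pte_sketch_val_t pte)
{
	return pte & PTE_SKETCH_FLAGS_MASK;
}
```

Keeping the flags in the bits that the page alignment leaves free is what lets
a single machine word carry both the frame location and its attributes.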
124 To repeat: each level in the page table hierarchy is an *array of pointers*, so
127 pointers on each level is architecture-defined::
130 --> +-----+ PTE
131 | ptr |-------> +-----+
132 | ptr |- | ptr |-------> PAGE
137 +-----+ +----> +-----+
138 | ptr |-------> PAGE
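
Since every level is an array of pointers, a virtual address can be viewed as a
sequence of array indices followed by an offset into the page. The sketch below
assumes, purely for illustration, 4KB pages and 512 pointers (9 index bits) per
level on all five levels; the real numbers are architecture-defined and the
`walk_sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 12 offset bits, then 9 index bits for each level. */
#define WALK_SKETCH_PAGE_SHIFT     12
#define WALK_SKETCH_BITS_PER_LEVEL 9
#define WALK_SKETCH_PTRS_PER_TABLE (1u << WALK_SKETCH_BITS_PER_LEVEL)

/* Levels numbered from the lowermost table upwards. */
enum walk_sketch_level {
	WALK_SKETCH_PTE = 0,
	WALK_SKETCH_PMD,
	WALK_SKETCH_PUD,
	WALK_SKETCH_P4D,
	WALK_SKETCH_PGD,
};

/* Index into the table of the given level for a virtual address. */
static inline unsigned int walk_sketch_index(uint64_t vaddr,
					     enum walk_sketch_level level)
{
	unsigned int shift = WALK_SKETCH_PAGE_SHIFT +
			     WALK_SKETCH_BITS_PER_LEVEL * (unsigned int)level;

	assert(level <= WALK_SKETCH_PGD);
	return (unsigned int)((vaddr >> shift) &
			      (WALK_SKETCH_PTRS_PER_TABLE - 1));
}
```

A walk would use the PGD index first, follow the pointer found there, then use
the P4D index in the next table, and so on down to the PTE.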
143 Page Table Folding
146 If the architecture does not use all the page table levels, they can be *folded*
147 which means skipped, and all operations performed on page tables will be
148 compile-time augmented to just skip a level when accessing the next lower
151 Page table handling code that wishes to be architecture-neutral, such as the
154 architecture-specific code, so as to be robust to future changes.
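
One way to picture folding is a compile-time switch that turns a level into a
table with a single entry aliasing the entry of the level above. The following
userspace sketch is only loosely inspired by that idea; the macro, types and
function are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch: set to 0 to model a real, unfolded P4D level. */
#ifndef FOLD_SKETCH_P4D_FOLDED
#define FOLD_SKETCH_P4D_FOLDED 1
#endif

/* A top-level entry pointing at the next table down. */
typedef struct { void *next; } fold_sketch_pgd_t;

#if FOLD_SKETCH_P4D_FOLDED

/* Folded: the p4d "table" has one entry that simply wraps the pgd entry. */
#define FOLD_SKETCH_PTRS_PER_P4D 1
typedef struct { fold_sketch_pgd_t pgd; } fold_sketch_p4d_t;

static inline fold_sketch_p4d_t *
fold_sketch_p4d_offset(fold_sketch_pgd_t *pgd, size_t index)
{
	(void)index;			/* no table to index into */
	return (fold_sketch_p4d_t *)pgd;	/* same storage, next level's type */
}

#else

/* Unfolded: the pgd entry points at a real array of p4d entries. */
#define FOLD_SKETCH_PTRS_PER_P4D 512
typedef struct { void *next; } fold_sketch_p4d_t;

static inline fold_sketch_p4d_t *
fold_sketch_p4d_offset(fold_sketch_pgd_t *pgd, size_t index)
{
	assert(index < FOLD_SKETCH_PTRS_PER_P4D);
	return (fold_sketch_p4d_t *)pgd->next + index;
}

#endif
```

Callers always go through `fold_sketch_p4d_offset()`, so the same walking code
compiles for both configurations and the folded case costs nothing at run time.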
157 MMU, TLB, and Page Faults
162 called `Translation Lookaside Buffers (TLBs)` and `Page Walk Caches` to speed up
166 which checks if there is an existing translation in the TLB or in the Page
168 MMU uses the page walks to determine the physical address and create the map.
170 The dirty bit for a page is set (i.e., turned on) when the page is written to.
171 Each page of memory has associated permission and dirty bits. The latter
172 indicate that the page has been modified since it was loaded into memory.
181 When these conditions happen, the MMU triggers page faults, which are types of
185 There are common and expected causes of page faults. These are triggered by
187 "Copy-on-Write". Page faults may also happen when frames have been swapped out
193 and "Copy-on-Write" because these subjects are out of scope as they belong to
204 because they avoid the need for complex page table lookups at the expense of
208 physical frames, the kernel invokes the out-of-memory (OOM) killer to make room
212 Additionally, page faults may also be caused by code bugs or by maliciously
214 could use instructions to address (non-shared) memory which does not belong to
216 to a read-only location.
218 If the above-mentioned conditions happen in user-space, the kernel sends a
223 Linux kernel handles these page faults, creates tables and tables' entries,
233 `__handle_mm_fault()` to carry out the actual work of allocating the page
239 condition resolves to the kernel sending the above-mentioned SIGSEGV signal
243 find the entry's offsets of the upper layers of the page tables and allocate
249 above-mentioned convention to name them after the corresponding types of tables
252 The page table walk may end at one of the middle or upper layers (PMD, PUD).
254 Linux supports larger page sizes than the usual 4KB (i.e., the so-called
256 directly map them, with no need to use lower level page entries (PTE). Huge
258 1GB. They are respectively mapped by the PMD and PUD page entries.
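
The sizes mapped at the PMD and PUD levels follow from the table geometry.
Assuming, for illustration only, 4KB base pages and 512 entries per table (the
actual values are architecture-defined), the span covered by one entry at each
level can be computed like this; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes mapped by a single entry at the given level, where level 0 is the
 * PTE, level 1 the PMD and level 2 the PUD. Assumes 12 offset bits and
 * 9 index bits per level.
 */
static inline uint64_t huge_sketch_entry_span(unsigned int level)
{
	assert(level <= 2);
	return UINT64_C(1) << (12 + 9 * level);
}
```

Under these assumptions a PTE entry spans 4KB, a PMD entry spans 512 times that
(2MB), and a PUD entry spans 512 times the PMD span (1GB).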
261 reduced page table overhead, memory allocation efficiency, and performance
263 trade-offs, like wasted memory and allocation challenges.
272 Linux to handle page faults in a way that is tailored to the specific
276 To conclude this high-altitude view of how Linux handles page faults, let's
277 add that the page fault handler can be disabled and enabled respectively with
281 disable traps into the page fault handler, mostly to prevent deadlocks.