.. SPDX-License-Identifier: GPL-2.0-only

Design of dm-vdo
================
The dm-vdo (virtual data optimizer) target provides inline deduplication,
compression, zero-block elimination, and thin provisioning. A dm-vdo target
can be backed by up to 256TB of storage, and can present a logical size of
up to 4PB. This target was originally developed at Permabit Technology
Corporation, and has been used in production environments since its first
release. It was made open-source in 2017 after Permabit was acquired by
Red Hat, and has since been merged upstream as dm-vdo. For usage, see
vdo.rst in the same directory as this file.
dm-vdo supports deduplication rates of 254:1, i.e. up to 254 copies of a
given 4K block can be stored using a single physical block.
The design of dm-vdo is based on the idea that deduplication is a two-part
problem: recognizing duplicate data, and then avoiding storing multiple
copies of those duplicates. Therefore, dm-vdo has two main parts: a
deduplication index used to discover duplicate data, and a data store with
a reference-counted block map so that duplicates are stored only once.
Zones and Threading
-------------------
design attempts to be lock-free.

reflected in the on-disk representation of each data structure. Therefore,
The Deduplication Index
-----------------------
trade-off between the storage saved and the resources expended to achieve
those savings.
Each block of data is hashed to produce a 16-byte block name. An index

because it is too costly to update the index when a block is over-written

with the blocks, which is difficult to do efficiently in block-based
When the open chapter fills up, it is closed and a new open chapter is
created.

chapters are read-only structures and their contents are never altered in
any way.

Once enough records have been written to fill up all the available index
space, the oldest chapter is removed to make space for new chapters.
look up the name in the volume index. This search will either indicate that

request looks up its name in the chapter index. This will indicate either

This process may require up to two page reads per request (one for the
chapter index page and one for the record page).
memory-efficient structure called a delta index. Instead of storing the

the deltas are expressed using a Huffman code to take up even less space.

table, but it is slightly more expensive to look up entries, because a
request must read every entry in a delta list to add up the deltas in
order to find the record it needs. The delta index reduces this lookup
cost by splitting its key space into many sub-lists, each starting at a
fixed key value, so that each individual list is short.
indexing, the memory requirements do not increase. The trade-off is

duplicate data, sparse indexing will detect 97-99% of the deduplication
opportunities.
The vio and data_vio Structures
-------------------------------

fields and data to track vdo-specific information. A struct vio maintains a
The Data Store
--------------

collection of slabs. The slabs can be up to 32GB, and are divided into
three sections.
to free up space. The slab journal is used both to ensure that the main
recovery journal can regularly free up space, and also to amortize the cost
of updating the on-disk reference counts.

memory and are written out, a block at a time in oldest-dirtied-order, only
when there is a need to reclaim slab journal space.
"zones" in round-robin fashion. If there are P physical zones, then slab n
is assigned to zone n mod P.
Logical addresses 0-811 belong to tree 0, logical addresses 812-1623 belong
to tree 1, and so on. The interleaving is maintained all the way up to the
60 root nodes.
need to pre-allocate space for the entire set of logical mappings and also

time, and is large enough to hold all the non-leaf pages of the entire
tree.
compression packer (step 8d) rather than allowing it to continue waiting.
a. If any page-node in the tree has not yet been allocated, it must be
   allocated before the write can continue. This step requires the
   data_vio to lock the page-node that needs to be allocated. This
   lock, like the logical block lock in step 2, is a hashtable entry

   step 4. Once a new node has been allocated, that node is added to

   map tree (step 10), updates the reference count of the new block
   (step 11), and reacquires the implicit logical zone lock to add the
   new mapping to the parent tree node (step 12). Once the tree is
b. In the steady-state case, the block map tree nodes will already be
   allocated.
4. If the block is a zero block, skip to step 9. Otherwise, an attempt is
   made to allocate a free data block.
added to a hashtable like the logical block locks in step 2. This

sub-component of the slab and are thus also covered by the implicit
physical zone lock.

tracked in step 2. This hashtable is covered by the implicit lock on the
hash zone.
step 8h and attempts to write its data directly. This can happen if two

physical block as their new physical address and proceed to step 9

data_vios becomes the new agent and continues to step 8d as if no

it has an allocated physical block (from step 3) that it can write

are out of space, so they proceed to step 13 for cleanup.

compress, the data_vio will continue to step 8h to write its data directly.

The packer can combine up to 14 compressed blocks in a single 4k data
block.
data_vio will proceed to step 8h to write its data directly.

using the allocated physical block from one of its data_vios. Step

zone lock and proceeds to step 8i.

step 3. It will write its data to that allocated physical block.

possible. Each data_vio will then proceed to step 9 to record its
new mapping.

physical mapping"), if any, and records it. This step requires a lock

recovery blocks up to the one containing its entry have been written

logical-to-physical mapping in the block map to point to the new
physical block.

the logical block lock acquired in step 2.

data from the write data_vio and return it. Otherwise, it will look up the
logical-to-physical mapping by traversing the block map tree as in step 3,

acknowledgment as in step 13, although it only needs to release the logical
block lock.

a read-modify-write operation that reads the relevant 4K block, copies the
new data over the relevant sectors of the block, and then launches a write
operation for the modified data block.
recovery journal. During the pre-resume phase of the next start, the

*Read-only Rebuild*

If a vdo encounters an unrecoverable error, it will enter read-only mode.

to the possibility that data has been lost. During a read-only rebuild, the