1 .. SPDX-License-Identifier: GPL-2.0
33 details logged are made up of the changes to in-core structures rather than
34 on-disk structures. Other objects - typically buffers - have their physical
64 place. This means that permanent transactions can be used for one-shot
65 modifications, but one-shot reservations cannot be used for permanent
68 In the code, a one-shot transaction pattern looks somewhat like this::
97 While this might look similar to a one-shot transaction, there is an important
123 the on-disk journal.
165 transaction, we have to reserve enough space to record a full leaf-to-root split
183 For one-shot transactions, a single unit space reservation is all that is
190 transaction rolling mechanism to re-reserve space on every transaction roll. We
194 For example, an inode allocation is typically two transactions - one to
205 means we can roll the transaction multiple times before we have to re-reserve
210 re-reserve physical space in the log. This is somewhat complex, and requires
219 of a cycle number - the number of times the log has been overwritten - and the
233 reservations currently held by active transactions. It is a purely in-memory
251 - and it mostly does track exactly the same location as the reserve grant head -
269 grant head does not track physical space - it only accounts for the amount of
278 xfs_trans_commit() calls, while the physical log space reservation - tracked by
279 the write head - is then reserved separately by a call to xfs_log_reserve()
287 "Re-logging" the locked items on every transaction roll ensures that the items
292 move the tail of the log forwards to free up write grant space. Re-logging the
294 making cannot self-deadlock.
303 Re-logging Explained
309 method called "re-logging". Conceptually, this is quite simple - all it requires
324 E E Y (> X+n+m+o)
325 F E+F Y+p
334 implement long-running, multiple-commit permanent transactions.
347 the log - repeated operations to the same objects write the same changes to
357 in memory - batching them, if you like - to minimise the impact of the log IO on
362 buffers available and the size of each is 32kB - the size can be increased up
366 that can be made to the filesystem at any point in time - if all the log
383 but only one of those copies needs to be there - the last one "D", as it
402 actually relatively easy to do - all the changes to logged items are already
438 4. No on-disk format change (metadata or log format).
446 ---------------
463 The solution is relatively simple - it just took a long time to recognise it.
486 Object +---------------------------------------------+
487 Vector 1 +----+
488 Vector 2 +----+
489 Vector 3 +----------+
493 Log Buffer +-V1-+-V2-+----V3----+
497 Object +---------------------------------------------+
498 Vector 1 +----+
499 Vector 2 +----+
500 Vector 3 +----------+
504 Memory Buffer +-V1-+-V2-+----V3----+
505 Vector 1 +----+
506 Vector 2 +----+
507 Vector 3 +----------+
518 buffer writing (i.e. double encapsulation). This would be an on-disk format
525 self-describing object that can be passed to the log buffer write code to be
527 Hence we avoid needing a new on-disk format to handle items that have been
532 ----------------
543 and as such are stored in the Active Item List (AIL) which is a LSN-ordered
561 its place in the list and re-inserted at the tail. This is entirely arbitrary
562 and done to make it easy for debugging - the last items in the list are the
569 ----------------------------
576 log replay - all the changes in all the objects in a given transaction must
594 to any other transaction - it contains a transaction header, a series of
596 perspective, the checkpoint transaction is also no different - just a lot
607 per-checkpoint context that travels through the log write process through to
638 Log Item <-> log vector 1 -> memory buffer
639 | -> vector array
641 Log Item <-> log vector 2 -> memory buffer
642 | -> vector array
647 Log Item <-> log vector N-1 -> memory buffer
648 | -> vector array
650 Log Item <-> log vector N -> memory buffer
651 -> vector array
659 log vector 1 -> memory buffer
660 | -> vector array
661 | -> Log Item
663 log vector 2 -> memory buffer
664 | -> vector array
665 | -> Log Item
670 log vector N-1 -> memory buffer
671 | -> vector array
672 | -> Log Item
674 log vector N -> memory buffer
675 -> vector array
676 -> Log Item
703 --------------------------------------
710 re-using a freed metadata extent for a data extent), a special, optimised log
720 As discussed in the checkpoint section, delayed logging uses per-checkpoint
725 atomic counter - we can just take the current context sequence number and add
754 else for such serialisation - it only matters when we do a log force.
767 ------------------------------------------------
785 inode changes. If you modify lots of inode cores (e.g. ``chmod -R g+w *``), then
792 buffer format structure for each buffer - roughly 800 vectors or 1.51MB total
810 reservation of around 150KB, which is a non-trivial amount of space.
812 A static reservation needs to manipulate the log grant counters - we can take a
832 maximal amount of log metadata space they require, and such a delta reservation
843 the maximum threshold, we need to push the CIL to the log. This is effectively
859 ---------------------------------
875 That is, we now have a many-to-one relationship between transaction commit and
883 pin the object the first time it is inserted into the CIL - if it is already in
900 ---------------------------------------
910 points in the design - the three important ones are:
917 that we have a many-to-one interaction here. That is, the only restriction on
924 relatively long period of time - the pinning of log items needs to be done
932 really needs to be a sleeping lock - if the CIL flush takes the lock, we do not
941 compared to transaction commit for asynchronous transaction workloads - only
942 time will tell if using a read-write semaphore for exclusion will limit
979 -----------------
1019 Essentially, steps 1-6 operate independently from step 7, which is also
1020 independent of steps 8-9. An item can be locked in steps 1-6 or steps 8-9
1021 at the same time step 7 is occurring, but only steps 1-6 or 8-9 can occur
1023 and steps 1-6 are re-entered, then the item is relogged. Only when steps 8-9
1075 logging methods are in the middle of the life cycle - they still have the same
1081 As a result of this zero-impact "insertion" of delayed logging infrastructure