Lines Matching refs:repair
47 then present case studies of how each repair function actually works.
109 `kernel changes <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair…
111 …ges <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=repair-dirs>`_.
112 Each kernel patchset adding an online repair function will use the same branch
119 XFS (on Linux) to check and repair filesystems.
125 metadata, though it lacks any ability to repair what it finds.
126 Due to its high memory requirements and inability to repair things, this
176 metadata, an in-kernel facility to repair metadata, and a userspace driver
192 | "online repair". |
209 metadata to enable targeted checking and repair operations while the system
224 and repair each type of online fsck work item.
240 In principle, online fsck should be able to check and to repair everything that
279 resubmit the kernel scrub call with the repair flag enabled; this is
337 2. The repair function is called to rebuild the data structure.
340 If the repair fails, the scan results from the first step are returned to
385 Repairs for this class of scrub item are simple, since the repair function
387 The repair function scans available metadata as needed to record all the
390 atomically to complete the repair.
393 Because ``xfs_scrub`` locks a primary object for the duration of the repair,
394 this is effectively an offline repair operation performed on a subset of the
396 This minimizes the complexity of the repair code because it is not necessary to
401 The only infrastructure needed by the repair code is the staging area for
403 Despite these limitations, the advantage that online repair holds is clear:
412 Most primary metadata repair functions stage their intermediate results in an
417 duration of the repair is *always* an offline algorithm.
442 duration of the repair.
444 Instead, repair functions set up an in-memory staging structure to store
446 Depending on the requirements of the specific repair function, the staging
448 specific to that repair function.
450 When the repair scanner needs to record an observation, the staging data are
452 While the filesystem scan is in progress, the repair function hooks the
460 Introducing concurrency helps online repair avoid various locking problems, but
462 Live filesystem code has to be hooked so that the repair function can observe
477 Inspiration for the secondary metadata repair strategy was drawn from section
498 To minimize changes to the rest of the codebase, XFS online repair keeps the
501 while repair is running.
506 facilitate a repair also be used to implement a comprehensive check?
532 Check and repair require full filesystem scans, but resource and lock
537 Check and repair of the other types of summary counters (quota resource counts
543 Inspiration for quota and file link count repair strategies was drawn from
574 reduces the ability of online fsck to find inconsistencies and repair them.
587 - **Inability to repair**: Sometimes, a filesystem is too badly damaged to be
590 coherent narrative cannot be formed from records collected, then the repair
592 To reduce the chance that a repair will fail with a dirty transaction and
593 render the filesystem unusable, the online repair functions have been
668 To start development of online repair, fstests was modified to run
670 This ensures that offline repair does not crash, leave a corrupt filesystem
673 To complete the first phase of development of online repair, fstests was
675 This enables a comparison of the effectiveness of online repair as compared to
676 the existing offline repair tools.
707 2. Offline repair (``xfs_repair``) to detect and fix
708 3. Online repair (``xfs_scrub``) to detect and fix
746 3. Offline repair (``xfs_repair``)
748 5. Online repair (``xfs_scrub``)
749 … 6. Both repair tools (``xfs_scrub`` and then ``xfs_repair`` if online repair doesn't succeed)
756 used to discover incorrect repair code and missing functionality for entire
780 impact on the running system, the online repair code should never introduce
794 * Race ``xfs_scrub`` in check and force-repair mode against ``fsstress`` while
796 * Race ``xfs_scrub`` in check and force-repair mode against ``fsstress`` while
813 repair.
816 that performs autonomous checking and repair.
903 service window to run the online repair tool to correct the problem.
905 run the traditional offline repair tool to correct the problem.
910 notifications and initiate a repair?
926 code that provide the ability to check and repair metadata while the system
1326 before starting a repair.
1501 After performing a repair, the checking code is run a second time to validate
1519 Furthermore, online repair must not run when operations are pending because
1676 If a repair is attempted in this state, the results will be catastrophic!
1700 The checking and repair operations must factor these pending operations into
1769 repair code as much as possible.
1849 For online repair to rebuild a metadata structure, it must compute the record
1883 | The first edition of online repair inserted records into a new btree as |
1922 error as an out of memory error. For online repair, squashing error conditions
1987 During a repair, scrub needs to stage new records during the gathering step and
2080 During the fourth demonstration of online repair, a community reviewer remarked
2081 that for performance reasons, online repair ought to load batches of records
2171 However, it should be noted that these repair functions only use blob storage
2176 `extended attribute repair
2177 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_ serie…
2336 As mentioned previously, early iterations of online repair built new btree
2340 blocks if the system went down during a repair.
2341 Loading records one at a time also meant that repair could not control the
2434 Once repair knows the number of blocks needed for the new btree, it allocates
2448 While repair is writing these new btree blocks, the EFIs created for the space
2464 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-bitmap-rework>`_
2467 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-l…
2532 individual repair function that called the bulk loader.
2533 The repair function must log the location of the new root in a transaction,
2556 the repair transaction.
2558 The transaction rolling in steps 2c and 3 represents a weakness in the repair
2561 Online repair functions minimize the chances of this occurring by using very
2617 Once the repair function accumulates one chunk's worth of data, it calls
2627 `AG btree repair
2628 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_
2720 `AG btree repair
2721 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_
2761 `file mapping repair
2762 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-file-mappings>`_
2777 Offline repair rebuilds all space metadata after recording the usage of
2781 As part of a repair, online fsck relies heavily on the reverse mapping records
2843 As stated earlier, online repair functions use very large transactions to
2848 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-prep-for-bulk-l…
2873 If it is possible to maintain the AGF lock throughout the repair (which is the
2903 btree repair:
2912 However, repair holds the AGF buffer lock for the duration of the free space
2919 information changes the number of free space records, repair must re-estimate
2922 As part of committing the new btrees, repair must ensure that reverse mappings
2926 is atomic, similar to the other btree repair functions.
2928 Third, finding the blocks to reap after the repair is not overly
2936 When repair walks reverse mapping records to synthesize free space records, it
2939 The repair context maintains a second bitmap corresponding to the rmap btree
2947 `AG btree repair
2948 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_
2956 Old reverse mapping btrees are less difficult to reap after a repair.
2976 `AG btree repair
2977 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-ag-btrees>`_
3016 If the second ``iget`` fails, the repair has failed.
3018 Once the in-memory representation is loaded, repair can lock the inode and can
3029 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-inodes>`_
3030 repair series.
3050 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quota>`_
3051 repair series.
3150 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-fscounters>`_
3159 Like every other type of online repair, repairs are made by writing those
3286 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_
3430 Consider the directory parent pointer repair code as an example.
3447 `directory repair
3448 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_
3607 It is useful to compare the mount time quotacheck code to the online repair
3691 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-quotacheck>`_
3748 `file link count repair
3757 Most repair functions follow the same pattern: lock filesystem resources,
3761 repair code -- code and data are entirely contained within the scrub module,
3764 A secondary advantage of this repair approach is atomicity -- once the kernel
3772 btree repair strategy because it must scan every space mapping of every fork of
3774 Therefore, rmap repair forgoes atomicity between scrub and repair.
3827 to :ref:`reap after rmap btree repair <rmap_reap>`.
3832 `rmap repair
3833 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rmap-btree>`_
3853 Therefore, online repair of file-based metadata creates a temporary file in
3856 fork contents) to commit the repair.
3857 Once the repair is complete, the old fork can be reaped as necessary; if the
3863 This dependency is the reason why online repair can only use pageable kernel
3876 Temporary files created for repair are similar to ``O_TMPFILE`` files created
3887 | In the initial iteration of file metadata repair, the damaged metadata |
3891 | This strategy did not survive the introduction of the atomic repair |
3908 | - Even if repair could build an alternate copy of a data structure in a |
3909 | different part of the fork address space, the atomic repair commit |
3910 | requirement means that online repair would have to be able to perform |
3918 | - Reaping blocks after a repair is not a simple operation, and |
3943 Online repair code should use the ``xrep_tempfile_create`` function to create a
3975 `repair temporary files
3976 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-tempfiles>`_
3982 Once repair builds a temporary file with a new data structure written into
3988 for online repair because:
4004 d. Online repair needs to swap the contents of two files that are by definition
4010 not reappear if the system goes down mid-repair.
4251 repair builds every block in the new data structure with the owner field of the
4254 After a successful exchange operation, the repair operation must reap the old
4256 extent reaping <reaping>` mechanism that is done post-repair.
4257 If the filesystem should go down during the reap part of the repair, the
4261 repair, and is not completely foolproof.
4266 To repair a metadata file, online repair proceeds as follows:
4268 1. Create a temporary repair file.
4270 2. Use the staging data to write out new contents into the temporary repair
4283 6. Commit the transaction to complete the repair.
4322 To repair the summary file, write the xfile contents into the temporary file
4327 `realtime summary repair
4328 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-rtsummary>`_
4373 `extended attribute repair
4374 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-xattrs>`_
4382 The offline repair tool scans all inodes to find files with nonzero link count,
4388 The best that online repair can do at this time is to read directory data
4430 **Future Work Question**: Should repair revalidate the dentry cache when
4452 `directory repair
4453 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-dirs>`_
4477 Both online and offline repair can use this strategy.
4495 | 2. Referential integrity was not integrated into offline repair. |
4509 | a file system repair to depend on. |
4524 | of repair tools needing to ensure that the ``dirent_pos`` field |
4549 | name uniqueness that we require, without forcing repair code to |
4616 `parent pointers directory repair
4666 `parent pointers repair
4673 Examining parent pointers in offline repair works differently because corrupt
4759 `offline parent pointers repair
4763 Rebuilding directories from parent pointers in offline repair would be very
4779 Free space metadata has not been ensured yet, so repair cannot yet use the
4809 However, one of online repair's design goals is to avoid locking the entire
4907 `directory tree repair
4943 The directory and file link count repair setup functions must use the regular
4959 3. Use ``xrep_adoption_trans_alloc`` to reserve resources to the repair
4978 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage>`_
5126 repairs for a given filesystem object with a single repair item.
5127 Each repair item represents a single lockable object -- AGs, metadata files,
5130 Phase 4 is responsible for scheduling a lot of repair work in as quick a
5133 means that ``xfs_scrub`` must try to complete the repair work scheduled by
5134 phase 2 before trying repair work scheduled by phase 3.
5135 The repair process is as follows:
5137 1. Start a round of repair with a workqueue and enough workers to keep the CPUs
5140 a. For each repair item queued by phase 2,
5142 i. Ask the kernel to repair everything listed in the repair item for a
5150 If the revalidation succeeds, drop the repair item.
5155 c. For each repair item queued by phase 3,
5157 i. Ask the kernel to repair everything listed in the repair item for a
5165 If the revalidation succeeds, drop the repair item.
5170 2. If step 1 made any repair progress of any kind, jump back to step 1 to start
5171 another round of repair.
5173 3. If there are items left to repair, run them all serially one more time.
5175 to repair anything.
5183 `repair warning improvements
5184 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-better-repair…
5186 `repair data dependency
5187 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-data-d…
5192 `repair scheduling
5193 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-repair-schedu…
5313 necessary refinements to online repair and lack of customer demand mean that
5378 As it turns out, the :ref:`refactoring <scrubrepair>` of repair items mentioned
5413 be too much work to allow userspace to specify a timeout for a scrub/repair
5415 However, most repair functions have the property that once they begin to touch
5434 The third piece is the ability to force an online repair.
5440 ``GETFSMAP`` and issues forced repair requests on the data structure.