==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches.  That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage.
This behavior speeds up various workloads, but it means the operating
system needs to force data out to non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device.  These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.  This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts.  In addition, the REQ_PREFLUSH flag can
be set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O.  It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing or
how Forced Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
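
The effect of combining both flags can be illustrated with a small user-space
model.  This is only a sketch, not kernel code: the ``toy_dev`` structure, the
``toy_*`` helpers and the ``TOY_PREFLUSH`` / ``TOY_FUA`` values are
hypothetical stand-ins for a device with a volatile cache and for the
REQ_PREFLUSH and REQ_FUA flags.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <string.h>

   /* Hypothetical stand-ins; the kernel's real flag values differ. */
   #define TOY_PREFLUSH (1u << 0)
   #define TOY_FUA      (1u << 1)

   #define TOY_BLOCKS 8

   /* Toy device: plain writes sit in a volatile cache until flushed. */
   struct toy_dev {
       int  cache[TOY_BLOCKS];   /* volatile write back cache */
       bool dirty[TOY_BLOCKS];   /* block has unflushed data in the cache */
       int  media[TOY_BLOCKS];   /* non-volatile storage */
   };

   /* Move every dirty cache block to the non-volatile media. */
   static void toy_flush(struct toy_dev *dev)
   {
       for (int i = 0; i < TOY_BLOCKS; i++) {
           if (dev->dirty[i]) {
               dev->media[i] = dev->cache[i];
               dev->dirty[i] = false;
           }
       }
   }

   /* Submit one write; flags may combine TOY_PREFLUSH and TOY_FUA. */
   static void toy_write(struct toy_dev *dev, int block, int value,
                         unsigned int flags)
   {
       assert(block >= 0 && block < TOY_BLOCKS);

       if (flags & TOY_PREFLUSH)
           toy_flush(dev);            /* earlier writes become durable first */

       if (flags & TOY_FUA) {
           dev->media[block] = value; /* durable before completion */
           dev->dirty[block] = false;
       } else {
           dev->cache[block] = value; /* completes while still volatile */
           dev->dirty[block] = true;
       }
   }

   /* Simulate power loss: cached data that was never flushed is gone. */
   static void toy_power_loss(struct toy_dev *dev)
   {
       memset(dev->cache, 0, sizeof(dev->cache));
       memset(dev->dirty, 0, sizeof(dev->dirty));
   }

In this model, two plain writes followed by a third write carrying
``TOY_PREFLUSH | TOY_FUA`` leave all three blocks on the media even if
``toy_power_loss()`` is called immediately afterwards, which mirrors how a
filesystem might order data blocks before a commit record.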

Feature settings for block drivers
----------------------------------

For devices that do not support volatile write caches, no driver support is
required: the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.

For devices with volatile write caches the driver needs to tell the block layer
that it supports flushing caches by setting the

   BLK_FEAT_WRITE_CACHE

flag in the features field of the queue_limits structure.  For devices that
also support the FUA bit the block layer needs to be told to pass on the
REQ_FUA bit by also setting the

   BLK_FEAT_FUA

flag in the features field of the queue_limits structure.

Implementation details for bio based block drivers
--------------------------------------------------

For bio based drivers the REQ_PREFLUSH and REQ_FUA bits are simply passed on to
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag, and the driver
needs to handle them.

*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flag is
*not* set.  Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
handle REQ_FUA.

For remapping drivers the REQ_FUA bit needs to be propagated to underlying
devices, and a global flush needs to be implemented for bios with the
REQ_PREFLUSH bit set.
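
The rules above for what a bio based driver gets to see can be summarized in a
small user-space model.  This is a sketch under stated assumptions, not the
block layer's actual code: ``struct toy_bio``, ``toy_bio_reaches_driver()`` and
all ``TOY_*`` constants are hypothetical stand-ins for the kernel's bio,
REQ_* flags and BLK_FEAT_* flags.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>

   /* Hypothetical stand-ins; the kernel's real values differ. */
   #define TOY_REQ_PREFLUSH         (1u << 0)
   #define TOY_REQ_FUA              (1u << 1)
   #define TOY_BLK_FEAT_WRITE_CACHE (1u << 0)
   #define TOY_BLK_FEAT_FUA         (1u << 1)

   struct toy_bio {
       unsigned int flags;   /* TOY_REQ_* bits set by the submitter */
       bool has_payload;     /* false for an empty flush bio */
   };

   /*
    * Returns true if the bio reaches the driver, false if the block layer
    * completes it first.  On return, bio->flags holds what the driver sees.
    */
   static bool toy_bio_reaches_driver(struct toy_bio *bio,
                                      unsigned int features)
   {
       assert(bio);

       /* With a write cache, both bits are passed on unchanged, even
        * when TOY_BLK_FEAT_FUA is not set. */
       if (features & TOY_BLK_FEAT_WRITE_CACHE)
           return true;

       /* No write cache: an empty flush never enters the driver. */
       if (!bio->has_payload && (bio->flags & TOY_REQ_PREFLUSH))
           return false;

       /* No write cache: strip both bits from bios with a payload. */
       bio->flags &= ~(TOY_REQ_PREFLUSH | TOY_REQ_FUA);
       return true;
   }

The model makes the consequence of the note explicit: a driver that sets only
the write cache feature still receives the FUA bit and has to honor it itself.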

Implementation details for blk-mq drivers
-----------------------------------------

When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
with a payload are automatically turned into a sequence of a REQ_OP_FLUSH
request followed by the actual write by the block layer.

When the BLK_FEAT_FUA flag is set, the REQ_FUA bit is simply passed on for the
REQ_OP_WRITE request; otherwise a REQ_OP_FLUSH request is sent by the block
layer after the completion of the write request for bio submissions with the
REQ_FUA bit set.
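
The resulting request sequences can be worked through with a small user-space
model.  It is only an illustration of the rules described in this document,
not the block layer's flush machinery: ``toy_mq_sequence()``,
``enum toy_step`` and the ``TOY_*`` constants are hypothetical names, and the
model only covers writes that carry a payload.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical stand-ins; the kernel's real values differ. */
   #define TOY_REQ_PREFLUSH         (1u << 0)
   #define TOY_REQ_FUA              (1u << 1)
   #define TOY_BLK_FEAT_WRITE_CACHE (1u << 0)
   #define TOY_BLK_FEAT_FUA         (1u << 1)

   /* Requests as seen by the modeled blk-mq driver. */
   enum toy_step {
       TOY_STEP_FLUSH,       /* stands in for REQ_OP_FLUSH */
       TOY_STEP_WRITE,       /* stands in for REQ_OP_WRITE */
       TOY_STEP_WRITE_FUA,   /* stands in for REQ_OP_WRITE | REQ_FUA */
   };

   /*
    * Fill steps[] with the requests issued for one write with a payload
    * and return how many were issued (at most 3).
    */
   static size_t toy_mq_sequence(unsigned int features, unsigned int flags,
                                 enum toy_step steps[3])
   {
       size_t n = 0;

       assert(steps);

       /* No volatile cache: both bits are stripped, only the write runs. */
       if (!(features & TOY_BLK_FEAT_WRITE_CACHE)) {
           steps[n++] = TOY_STEP_WRITE;
           return n;
       }

       /* Preflush becomes a separate flush ahead of the write. */
       if (flags & TOY_REQ_PREFLUSH)
           steps[n++] = TOY_STEP_FLUSH;

       if ((flags & TOY_REQ_FUA) && (features & TOY_BLK_FEAT_FUA)) {
           /* Device supports FUA: the bit is passed on with the write. */
           steps[n++] = TOY_STEP_WRITE_FUA;
       } else {
           steps[n++] = TOY_STEP_WRITE;
           /* FUA emulated by a flush after the write completes. */
           if (flags & TOY_REQ_FUA)
               steps[n++] = TOY_STEP_FLUSH;
       }
       return n;
   }

For a write with both bits set, the model yields flush, write, flush on a
device that only advertises a write cache, and flush followed by a FUA write
on a device that also advertises FUA support.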