
Commit c6be848

btrfs-progs: docs: add more chapters (part 2)
The feature pages share the contents with the manual page section 5, so put the contents into separate files. Progress: 2/3.

Signed-off-by: David Sterba <[email protected]>
1 parent b871bf4

19 files changed: +772, -332 lines

Documentation/Auto-repair.rst

Lines changed: 5 additions & 1 deletion
@@ -1,4 +1,8 @@
 Auto-repair on read
 ===================
 
-...
+Data or metadata that are found to be damaged (e.g. because the checksum does
+not match) at the time they're read from the device can be salvaged in case the
+filesystem has another valid copy, i.e. when using a block group profile with
+redundancy (DUP, RAID1, RAID5/6). The correct data are returned to the user
+application and the damaged copy is replaced by it.
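
Repairs of this kind happen automatically, but their effects can be observed; a minimal sketch, assuming a filesystem mounted at /mnt (the mount point is an assumption):

.. code-block:: shell

   # per-device error counters, including corruption errors detected on read
   btrfs device stats /mnt
   # the kernel typically also logs detected/corrected checksum errors
   dmesg | grep -i 'btrfs.*csum'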

Documentation/Convert.rst

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 Convert
 =======
 
-...
+.. include:: ch-convert-intro.rst

Documentation/Deduplication.rst

Lines changed: 41 additions & 1 deletion
@@ -1,4 +1,44 @@
 Deduplication
 =============
 
-...
+In the context of filesystems, deduplication is the process of looking up
+identical data blocks that are tracked separately and creating a shared logical
+link while removing one of the copies of the data blocks. This leads to data
+space savings while it increases metadata consumption.
+
+There are two main deduplication types:
+
+* **in-band** *(sometimes also called on-line)* -- all newly written data are
+  considered for deduplication before writing
+* **out-of-band** *(sometimes also called offline)* -- data for deduplication
+  have to be actively looked for and deduplicated by the user application
+
+Both have their pros and cons. BTRFS implements only the **out-of-band** type.
+
+BTRFS provides the basic building blocks for deduplication, allowing other
+tools to choose the strategy and scope of the deduplication. There are multiple
+tools that take different approaches to deduplication, offer additional
+features or make trade-offs. The following table lists tools that are known to
+be up-to-date, maintained and widely used.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Name
+     - File based
+     - Block based
+     - Incremental
+   * - `BEES <https://github.com/Zygo/bees>`_
+     - No
+     - Yes
+     - Yes
+   * - `duperemove <https://github.com/markfasheh/duperemove>`_
+     - Yes
+     - No
+     - Yes
+
+Legend:
+
+- *File based*: the tool takes a list of files and deduplicates blocks only from that set
+- *Block based*: the tool enumerates blocks and looks for duplicates
+- *Incremental*: repeated runs of the tool utilize information gathered from previous runs
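
Out-of-band deduplication is driven entirely by tools like the ones above; as a rough sketch of how this looks in practice (the directory and hash file paths are assumptions), a duperemove run could be:

.. code-block:: shell

   # hash blocks under /data recursively and submit duplicate ranges to the
   # kernel for deduplication (-d); the hash file makes repeated runs
   # incremental by reusing previously computed block hashes
   duperemove -dr --hashfile=/var/tmp/dedup.hash /data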

Documentation/Defragmentation.rst

Lines changed: 19 additions & 1 deletion
@@ -1,4 +1,22 @@
 Defragmentation
 ===============
 
-...
+Defragmentation of files is supposed to make the layout of the file extents
+more linear, or at least to coalesce the file extents into larger ones that can
+be stored on the device more efficiently. The need for defragmentation stems
+from the COW design that BTRFS is built on and is inherent to it. Fragmentation
+is caused by in-place rewrites of the same file data, which have to be handled
+by creating a new copy that may lie at a distant location on the physical
+device. Fragmentation is most problematic on rotational hard disks due to the
+delay caused by moving the drive heads to the distant location. With modern
+seek-less devices it's not a problem, though defragmentation may still make
+sense because it reduces the size of the metadata that's needed to track the
+scattered extents.
+
+File data that are in use can be safely defragmented because the whole process
+happens inside the page cache, which is the central point caching the file data
+and takes care of synchronization. Once a filesystem sync or flush is started
+(either manually or automatically), all the dirty data get written to the
+devices. This however reduces the chances to find an optimal layout, as the
+writes happen together with other data and the result depends on the remaining
+free space layout and fragmentation.
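
Defragmentation is started manually with the btrfs command; a minimal sketch, assuming files under /data on a mounted BTRFS filesystem (the path and the compression choice are assumptions):

.. code-block:: shell

   # recursively defragment files under /data, printing files as they are
   # processed; -czstd additionally recompresses the data with zstd
   btrfs filesystem defragment -r -v -czstd /data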

Documentation/Flexibility.rst

Lines changed: 14 additions & 2 deletions
@@ -1,6 +1,18 @@
 Flexibility
 ===========
 
-* dynamic inode creation (no preallocated space)
+The underlying design of the BTRFS data structures allows a lot of flexibility
+and makes changes possible after filesystem creation, like resizing, adding or
+removing space, or enabling some features on-the-fly.
 
-* block group profile change on-the-fly
+* **dynamic inode creation** -- there's no fixed space or tables for tracking
+  inodes, so the number of inodes that can be created is bounded by the
+  metadata space and its utilization
+
+* **block group profile change on-the-fly** -- the block group profiles can be
+  changed on a mounted filesystem by running the balance operation and
+  specifying the conversion filters
+
+* **resize** -- the space occupied by the filesystem on each device can be
+  resized up (grow) or down (shrink) as long as the amount of data can still be
+  contained on the device
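
The block group profile change mentioned above is performed with balance conversion filters; a minimal sketch, assuming a filesystem mounted at /mnt and RAID1 as the target profile (both are assumptions):

.. code-block:: shell

   # convert data and metadata block groups to the RAID1 profile on a
   # mounted filesystem; it remains usable while the conversion runs,
   # though the operation can take a long time
   btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt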

Documentation/Qgroups.rst

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 Quota groups
 ============
 
-...
+.. include:: ch-quota-intro.rst

Documentation/Reflink.rst

Lines changed: 26 additions & 1 deletion
@@ -1,4 +1,29 @@
 Reflink
 =======
 
-...
+Reflink is a type of shallow copy of file data that shares the blocks, but
+otherwise the files are independent and any change to one file will not affect
+the other. This builds on the underlying COW mechanism. A reflink will
+effectively create only separate metadata pointing to the shared blocks, which
+is typically much faster than a deep copy of all blocks.
+
+A reflink is typically meant for whole files, but a partial file range can
+also be copied, though there are no ready-made tools for that.
+
+.. code-block:: shell
+
+   cp --reflink=always source target
+
+There are some constraints:
+
+- cross-filesystem reflink is not possible; there's nothing in common between
+  two distinct filesystems, so the block sharing can't work
+- reflink crossing two mount points of the same filesystem does not work due
+  to an artificial limitation in VFS (this may change in the future)
+- reflink requires that the source and target files have the same status
+  regarding NOCOW and checksums; for example, if the source file is NOCOW (once
+  created with the chattr +C attribute), then the above command won't work
+  unless the target file is pre-created with the +C attribute as well, or the
+  NOCOW attribute is inherited from the parent directory (chattr +C on the
+  directory), or the whole filesystem is mounted with *-o nodatacow*, which
+  would create the NOCOW files by default
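
The NOCOW constraint from the last item can be illustrated with a short sketch (the file names are assumptions); the target is pre-created with the +C attribute so both files have the same NOCOW status before cloning:

.. code-block:: shell

   # source is assumed to be a NOCOW file (created after chattr +C)
   touch target
   chattr +C target              # must be set while the file is still empty
   cp --reflink=always source target
   lsattr source target          # both files should show the 'C' attribute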

Documentation/Resize.rst

Lines changed: 9 additions & 1 deletion
@@ -1,4 +1,12 @@
 Resize
 ======
 
-...
+A mounted BTRFS filesystem can be resized after creation, grown or shrunk. On a
+multi-device filesystem the space occupied on each device can be resized
+independently. Data that reside in the area that would be beyond the new size
+are relocated to the remaining space below the limit, so this constrains the
+minimum size to which a filesystem can be shrunk.
+
+Growing a filesystem is quick as it only needs to take note of the available
+space, while shrinking a filesystem needs to relocate potentially lots of data
+and this is IO intensive. It is possible to shrink a filesystem in smaller steps.
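
Resizing is done online on a mounted filesystem; a minimal sketch, assuming a mount point of /mnt (the mount point, sizes and device id are assumptions):

.. code-block:: shell

   # shrink the filesystem by 10 GiB (a relative size change)
   btrfs filesystem resize -10g /mnt
   # grow the filesystem to use all available space on its device
   btrfs filesystem resize max /mnt
   # on a multi-device filesystem, resize only the device with devid 2
   btrfs filesystem resize 2:max /mnt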

Documentation/Scrub.rst

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 Scrub
 =====
 
-...
+.. include:: ch-scrub-intro.rst

Documentation/Send-receive.rst

Lines changed: 22 additions & 3 deletions
@@ -1,4 +1,23 @@
-Balance
-=======
+Send/receive
+============
 
-...
+Send and receive are complementary features that allow transferring data from
+one filesystem to another in a streamable format. The send part traverses a
+given read-only subvolume and either creates a full stream representation of
+its data and metadata (*full mode*), or, given a set of subvolumes for
+reference, generates a difference relative to that set (*incremental mode*).
+
+Receive, on the other hand, takes the stream and reconstructs a subvolume with
+files and directories equivalent to the filesystem that was used to produce the
+stream. The result is not exactly 1:1, e.g. inode numbers and other unique
+identifiers (like the subvolume UUIDs) can be different. The full mode starts
+with an empty subvolume, creates all the files and then turns the subvolume
+read-only. At this point it can be used as a starting point for a future
+incremental send stream, provided that stream is generated from the same
+source subvolume on the other filesystem.
+
+The stream is a sequence of encoded commands that change e.g. file metadata
+(owner, permissions, extended attributes), data extents (create, clone,
+truncate) and whole file operations (rename, delete). The stream can be sent
+over the network, piped directly to the receive command or saved to a file.
+Each command in the stream is protected by a CRC32C checksum.
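
In practice the stream is produced and consumed with the btrfs command; a minimal sketch, assuming snapshots under /mnt and a destination filesystem mounted at /backup (all paths are assumptions):

.. code-block:: shell

   # create a read-only snapshot to serve as the send source
   btrfs subvolume snapshot -r /mnt/data /mnt/snap1

   # full mode: send the whole snapshot and receive it on another filesystem
   btrfs send /mnt/snap1 | btrfs receive /backup

   # incremental mode: send only the difference against a parent snapshot
   # that already exists on the receiving side
   btrfs subvolume snapshot -r /mnt/data /mnt/snap2
   btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /backup

   # alternatively, save the stream to a file instead of piping it
   btrfs send -f /tmp/snap1.stream /mnt/snap1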
