Creating a mirrored ZFS pool on Debian
ZFS is a powerful filesystem that combines the functionality of a volume manager with a filesystem. One of its key features is the ability to create mirrored storage pools for redundancy and increased reliability. This guide will walk you through the process of creating a mirrored ZFS pool on Debian (Ubuntu/Mint etc) using SATA SSDs.
I recently acquired two high-endurance Intel SATA SSDs: the Intel DC S3710 and Intel DC S3700.
Advantages of ZFS
- Pooled Storage: Combine multiple drives into a single storage pool.
- Copy-on-Write: Ensures data is never overwritten in place, enhancing data integrity.
- Snapshots: Create point-in-time copies of your data for easy recovery.
- Data Integrity Verification and Automatic Repair: Detects and corrects data corruption automatically.
- RAID-Z: Provides RAID-like redundancy with the benefits of ZFS.
I primarily use ZFS for mirroring (similar to RAID 1) and data integrity. Since both SSDs are enterprise-grade drives previously used in servers, my goal is to create a mirrored pool to ensure data redundancy and reliability.
Prerequisites
Before you begin, ensure you have:
- A Debian-based OS installed on a separate boot drive (in my case, an NVMe SSD).
- Two extra SATA SSDs to create the mirrored ZFS pool.
- Basic knowledge of command-line operations.
Step 1: Install ZFS Utilities
First, you need to install the ZFS utilities. Open a terminal and run:
sudo apt update
sudo apt install zfsutils-linux
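To confirm that the utilities and the kernel module are in place, you can print the installed versions (the exact numbers will depend on your release):
zfs version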
Step 2: Prepare the SATA SSDs
1. Identify the Disks
Use the lsblk or fdisk -l command to identify your SATA SSDs. Assume they are /dev/sdb and /dev/sdc.
lsblk should output something like this:
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0   1.8T  0 disk
sdb           8:16   0 372.6G  0 disk
sdc           8:32   0 372.6G  0 disk
nvme0n1     259:0    0 931.5G  0 disk
├─nvme0n1p1 259:1    0   524M  0 part /boot/efi
├─nvme0n1p2 259:2    0 465.2G  0 part /
└─nvme0n1p3 259:3    0 465.8G  0 part
or fdisk -l should have something like this:
Disk /dev/nvme0n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P3SSD8
...
Disk /dev/sdb: 372.61 GiB, 400088457216 bytes, 781422768 sectors
Disk model: INTEL SSDSC2BA40
...
Disk /dev/sdc: 372.61 GiB, 400088457216 bytes, 781422768 sectors
Disk model: INTEL SSDSC2BA40
...
2. Partition the Disks (Optional)
If you want to create partitions, use fdisk or parted. For simplicity, we will use the entire disks.
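If you do want partitions, a minimal sketch with parted might look like the following, assuming /dev/sdb and /dev/sdc from the listing above; the partition name zfs0 is arbitrary, and this destroys any existing data on the disk:
sudo parted --script /dev/sdb mklabel gpt
sudo parted --script /dev/sdb mkpart zfs0 1MiB 100%
Repeat for /dev/sdc, then use the partitions (for example /dev/sdb1) instead of the whole disks in the zpool create command below.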
Step 3: Create the ZFS Pool
To create a mirrored ZFS pool, use the zpool create command:
sudo zpool create mypool mirror /dev/sdb /dev/sdc
In this command:
- mypool is the name of your ZFS pool.
- mirror specifies that the disks will be mirrored.
- /dev/sdb and /dev/sdc are the SATA SSDs.
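One thing to keep in mind: names like /dev/sdb can change between boots. If you prefer stable device names, the same mirror can be created with /dev/disk/by-id paths instead; the ata-... names below are hypothetical placeholders, so substitute the IDs listed for your drives:
ls -l /dev/disk/by-id/ | grep -E 'sd[bc]$'
sudo zpool create mypool mirror /dev/disk/by-id/ata-INTEL_SSDSC2BA400G4_SERIAL1 /dev/disk/by-id/ata-INTEL_SSDSC2BA400G3_SERIAL2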
Step 4: Verify the Pool
Check the status of your ZFS pool to ensure it's created correctly:
sudo zpool status
You should see output indicating that the pool mypool is created and the disks are mirrored.
pool: mypool
state: ONLINE
config:
  NAME        STATE     READ WRITE CKSUM
  mypool      ONLINE       0     0     0
    mirror-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0
      sdc     ONLINE       0     0     0
errors: No known data errors
Step 5: Create a ZFS Filesystem
Create a ZFS filesystem within your pool:
sudo zfs create mypool/mydataset
Replace mypool/mydataset with your preferred dataset name.
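To confirm the dataset exists and see where it is mounted, you can list it:
zfs list -r mypool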
Step 6: Mount the Filesystem
ZFS filesystems are mounted automatically by default. You can check the mount point with:
sudo zfs get mountpoint mypool/mydataset
NAME PROPERTY VALUE SOURCE
mypool/mydataset mountpoint /mypool/mydataset default
If you need to change the mount point, use:
sudo zfs set mountpoint=/path/to/mount mypool/mydataset
Replace /path/to/mount with your desired directory.
Step 7: Check Space Usage
To check the free space available on your ZFS pool, use:
sudo zpool list
This command provides an overview of the pool’s capacity, including the total space, used space, and free space.
Example Output:
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
mypool 1.00T 200G 800G 20% 1.00x ONLINE -
In this example:
- SIZE is the total size of the pool.
- ALLOC is the amount of space currently allocated.
- FREE is the amount of free space available.
- CAP is the percentage of space used.
To see space usage for a specific ZFS filesystem or dataset, use:
sudo zfs list
To get detailed information about a dataset, use:
sudo zfs get all mypool/mydataset
Things to keep in mind
1. User Access
Since we are running almost all of the commands with sudo, the local user won't have access to the newly created mount. We can use the chown command to give access to the current user.
sudo chown -R $USER /mypool/mydataset
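You can verify the new ownership afterwards with:
ls -ld /mypool/mydataset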
2. Monitoring and Maintenance
Proper maintenance of your ZFS pool is crucial for ensuring long-term performance and data integrity. Two important maintenance tasks are running TRIM and SCRUB operations. Here’s a guide on how to perform these tasks:
2.1. Running TRIM
TRIM is a command used to inform the SSD that certain blocks of data are no longer in use and can be cleaned up. This helps in maintaining SSD performance and prolonging the lifespan of the drives.
Why TRIM?
- Performance: Helps SSDs maintain their performance by freeing up unused blocks.
- Lifespan: Reduces unnecessary write operations, which can extend the lifespan of the SSD.
How to Enable and Run TRIM:
1. Check if TRIM is Enabled:
To ensure TRIM is enabled on your ZFS pool, check the autotrim property:
sudo zpool get autotrim mypool
If autotrim is set to off, you should enable it.
2. Enable TRIM:
Enable TRIM on your ZFS pool with the following command:
sudo zpool set autotrim=on mypool
This will ensure that TRIM operations are automatically performed on your pool as needed.
3. Manual TRIM (Optional):
If you want to manually trigger a TRIM operation, use:
sudo zpool trim mypool
This command runs TRIM on all devices in the pool. Depending on the size of your pool and the amount of data, this operation may take some time.
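To watch the progress of a manual TRIM, add the -t flag to zpool status (the same flag used below when reviewing SCRUB results):
sudo zpool status -t mypool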
2.2. Running SCRUB
SCRUB is a maintenance operation that checks the integrity of your ZFS pool’s data. It scans the pool for errors, verifies checksums, and attempts to correct any detected issues.
Why SCRUB?
- Data Integrity: Ensures that all data is accurate and free from corruption.
- Error Detection: Identifies and repairs silent data corruption before it becomes a problem.
How to Perform a SCRUB:
1. Start a SCRUB Operation:
Initiate a SCRUB on your ZFS pool with the following command:
sudo zpool scrub mypool
This command starts the SCRUB process, which may take some time depending on the size and health of the pool.
2. Monitor SCRUB Progress:
To check the status and progress of the SCRUB operation:
sudo zpool status
This command will show you detailed information about the SCRUB, including the progress and any errors detected.
3. Review Results:
After the SCRUB is complete, review the output of the zpool status command. It will show if any errors were found and if they were repaired. Regularly reviewing these results helps ensure the ongoing health of your ZFS pool.
~ sudo zpool status -t mypool
pool: mypool
state: ONLINE
scan: scrub repaired 0B in 00:06:06 with 0 errors on Tue Aug 6 20:58:36 2024
config:
  NAME        STATE     READ WRITE CKSUM
  mypool      ONLINE       0     0     0
    mirror-0  ONLINE       0     0     0
      sdb     ONLINE       0     0     0  (100% trimmed, completed at Tue 06 Aug 2024 20:51:02)
      sdc     ONLINE       0     0     0  (100% trimmed, completed at Tue 06 Aug 2024 20:51:02)
errors: No known data errors
By regularly running TRIM and SCRUB operations, you ensure that your ZFS pool remains in optimal condition, providing reliable performance and data integrity for your storage needs.
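If you would rather not run these by hand, a cron entry can schedule a periodic scrub. The following is only a sketch: the file name and schedule are arbitrary, and Debian's zfsutils-linux already ships its own periodic scrub job, so check before doubling up.
# /etc/cron.d/zfs-scrub-mypool: scrub mypool every Sunday at 03:00
0 3 * * 0 root /usr/sbin/zpool scrub mypool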
3. Limiting RAM usage
Managing RAM usage in ZFS is essential for systems with limited memory resources. ZFS utilizes a caching mechanism known as the Adaptive Replacement Cache (ARC) to keep frequently accessed data in RAM, enhancing access speed. By default, ZFS can use up to 50% of the system's memory for ARC. If the system requires additional memory for other tasks, ZFS will automatically release some of the ARC's memory to accommodate those needs, returning it to the pool of available memory. This behavior ensures efficient memory usage while maintaining ZFS performance.
You can check how much the ARC is using by running the arc_summary command, which will show the minimum and maximum amount of memory that will be used for caching:
...
ARC size (current): 1.4 % 56.3 MiB
Target size (adaptive): 25.0 % 1.0 GiB
Min size (hard limit): 25.0 % 1.0 GiB
Max size (high water): 16:1 16.0 GiB
...
A good rule of thumb is to set a minimum of 1GB and the maximum as:
1GB + [1GB per TB of storage] + [5GB per TB for dedupe]
which in my case comes to roughly 3.4GB. Since I have plenty of RAM, I will round this up to 4GB.
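The zfs_arc_max parameter is given in bytes, so 4 GiB works out to the value used below:
echo $((4 * 1024 * 1024 * 1024))
4294967296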
echo "options zfs zfs_arc_max=4294967296" | sudo tee -a /etc/modprobe.d/zfs.conf
After this we need to either reboot the system or reload the zfs module (if the module is reported as in use, export the pool with sudo zpool export mypool first and re-import it with sudo zpool import mypool afterwards).
sudo modprobe -r zfs
sudo modprobe zfs
After a reboot and rerunning arc_summary:
Max size (high water): 4:1 4.0 GiB
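You can also read the live module parameter directly; it should match the limit you configured:
cat /sys/module/zfs/parameters/zfs_arc_max
4294967296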
4. Properly Shut Down the PC
Before shutting down, ensure that all data has been correctly copied and that there are no ongoing write operations. You can verify the pool’s status:
sudo zpool status
Since ZFS is copy-on-write (COW), it should be fine even if you have a power loss. In my case, both disks also have an enhanced power-loss data protection feature.
5. Unmount the ZFS Filesystem (Optional)
While ZFS usually handles unmounting automatically, you can manually unmount the filesystem if needed:
sudo zfs unmount mypool/mydataset
If you have multiple datasets, you might need to unmount them individually or use:
sudo zfs unmount -a
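If you need them back later, zfs mount is the counterpart command:
sudo zfs mount mypool/mydataset
sudo zfs mount -a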
Snapshots and backup
To take a ZFS snapshot:
sudo zfs snapshot mypool/mydataset@backup_123
Snapshots can be listed using:
zfs list -t snapshot
Use the send subcommand to copy the snapshot to another location:
sudo zfs send mypool/mydataset@backup_123 > /mnt/backups/backup_123.zfs
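To restore from such a file later, the stream can be piped back into zfs receive; mypool/restored below is just a hypothetical target dataset name:
sudo zfs receive mypool/restored < /mnt/backups/backup_123.zfs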
Snapshots can be deleted using:
sudo zfs destroy mypool/mydataset@backup_123
Upgrading or replacing disks
When I first wrote the article, I was only storing a few GitHub repositories in mypool. However, ever since I moved my Docker data location (data-root) to mypool, I've been running out of space very quickly. ZFS switches from performance mode to space-saving write mode once you hit around 90% of its capacity (source: 45drives). To address this, I upgraded from 400GB disks to 960GB disks. Despite the upgrade, the process for replacing a bad or failed disk in ZFS remains the same. Make sure to back up your data before doing this.
Turn off the computer and replace one disk. When you turn it back on, the pool will be in a DEGRADED state:
~ zpool status
pool: mypool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
scan: scrub repaired 0B in 00:10:04 with 0 errors on Mon Dec 2 21:39:41 2024
config:
  NAME                      STATE     READ WRITE CKSUM
  mypool                    DEGRADED     0     0     0
    mirror-0                DEGRADED     0     0     0
      13722977832891186657  UNAVAIL      0     0     0  was /dev/sdb1
      sdc                   ONLINE       0     0     0
Here I've replaced the sdb disk. To resilver, run:
sudo zpool replace mypool sdb
Once the resilvering process has started, you can check the status using:
~ zpool status
pool: mypool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Dec 3 10:58:34 2024
135G scanned at 555M/s, 55.0G issued at 226M/s, 135G total
55.8G resilvered, 40.78% done, 00:06:01 to go
config:
  NAME                        STATE     READ WRITE CKSUM
  mypool                      DEGRADED     0     0     0
    mirror-0                  DEGRADED     0     0     0
      replacing-0             DEGRADED     0     0     0
        13722977832891186657  UNAVAIL      0     0     0  was /dev/sdb1/old
        sdb                   ONLINE       0     0     0  (resilvering)
      sdc                     ONLINE       0     0     0
errors: No known data errors
If you are upgrading the disks to a higher capacity, you need to repeat the process for all disks. Once done, the pool should automatically expand to the new capacity if the autoexpand flag is set. You can check and set the flag using:
~ zpool get autoexpand mypool
NAME PROPERTY VALUE SOURCE
mypool autoexpand off default
~ sudo zpool set autoexpand=on mypool
If autoexpand was off during the process, the pool won't expand automatically.
~ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
mypool 372G 135G 237G - 520G 1% 36% 1.00x ONLINE -
But you can manually expand the pool after resilvering using:
sudo zpool online -e mypool /dev/sdb /dev/sdc
~ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
mypool 894G 135G 759G - - 0% 15% 1.00x ONLINE -
You may need to reapply the dataset permissions once everything is done:
sudo chown -R $USER /mypool/mydataset
Conclusion
By following these steps, you can create a mirrored ZFS pool on Debian (or Ubuntu, Linux Mint, and other derivatives) using SATA SSDs. ZFS offers robust data protection features, and a mirrored setup ensures that your data is duplicated across multiple drives for enhanced reliability.
Always back up important data. Remember the old saying:
Feel free to explore further ZFS features and configurations to optimize your setup based on your specific needs.
First published on 2024-08-04