What is a ZFS scrub?
Last updated: April 1, 2026
Key Facts
- ZFS was first released in November 2005 with Solaris 10, and the scrub mechanism has been a core feature of the filesystem since that initial public release.
- ZFS uses the Fletcher-4 checksum algorithm by default (256-bit non-cryptographic hash) and optionally SHA-256 or SHA-512 for every data and metadata block in the pool.
- Carnegie Mellon University research found that approximately 1–2% of hard drives per year experience at least one silent data error undetectable by traditional filesystems without checksum verification.
- A ZFS scrub on a 1 TB HDD pool takes approximately 2–6 hours; the same pool on NVMe SSDs typically completes in 15–30 minutes due to the ~10× speed difference in sequential read throughput.
- The ability to pause and resume a running scrub via 'zpool scrub -p' was added to ZFS on Linux in version 0.7.0 (2017) and is standard in OpenZFS 2.0 (released November 2020) and later; for roughly the first decade of ZFS's existence, a running scrub could only be cancelled outright.
Overview of ZFS Scrub
A ZFS scrub is a proactive data integrity verification operation built into ZFS (originally the Zettabyte File System) that systematically reads every data and metadata block stored in a ZFS pool and verifies each block's checksum against a separately stored known-good value. If a discrepancy is found, indicating data corruption, ZFS can automatically repair the corrupted block using redundant copies available from mirror or RAID-Z vdev configurations. This mechanism is one of the defining characteristics that set ZFS apart from traditional filesystems such as ext4, NTFS, and XFS, which historically lacked block-level checksumming of file data and therefore could neither detect nor repair silent data corruption.
ZFS was originally developed by Sun Microsystems engineer Jeff Bonwick and his team, with the filesystem first made publicly available in November 2005 as part of Solaris 10. The scrub mechanism has been a core feature since that initial release. Following Oracle's acquisition of Sun Microsystems in January 2010, ZFS development continued through the open-source OpenZFS project, formally established in September 2013. OpenZFS today maintains ZFS implementations for Linux (distributed as an out-of-tree kernel module, since its CDDL license keeps it out of the mainline kernel), FreeBSD (included natively since FreeBSD 8.0 in 2009), and other platforms. Major NAS operating systems including TrueNAS CORE, TrueNAS SCALE, and OmniOS are built on OpenZFS.
The fundamental problem that ZFS scrub addresses is known as silent data corruption or bit rot — the gradual, undetected degradation of stored data caused by hardware failures, cosmic rays flipping bits in DRAM or storage media, storage controller firmware bugs, or magnetic domain decay on spinning hard drives. A landmark study by researchers at Carnegie Mellon University found that approximately 1–2% of hard drives per year experience at least one silent read error that conventional filesystems cannot detect, let alone repair. A 2007 Google study analyzing over 100,000 consumer and enterprise drives reached similar conclusions, finding that a significant fraction of drives develop undetected errors within 3–5 years of operation. ZFS scrub provides a systematic, scheduled defense against this class of failure.
How ZFS Scrub Works: Technical Architecture and Process
ZFS uses an end-to-end checksumming strategy that is architecturally different from the checksums used in older filesystems. In ext4 or NTFS, checksums (where they exist) protect only filesystem metadata structures. In ZFS, every data block and every metadata block has a checksum stored in its parent block pointer — a completely separate location from the data itself. This separation is critical: even if the data block is silently corrupted on disk, the stored checksum remains valid in the parent pointer, enabling ZFS to detect the mismatch reliably.
By default, ZFS uses the Fletcher-4 checksum algorithm, a fast non-cryptographic hash that produces a 256-bit value. For applications requiring stronger integrity guarantees — such as deduplication workloads or high-security environments — ZFS supports SHA-256 and SHA-512/256, full cryptographic hash functions. The checksum algorithm is a per-dataset property configurable with zfs set checksum=sha256 poolname/datasetname.
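The property change described above can be sketched as follows. This is a minimal example assuming a hypothetical pool/dataset named tank/data; it requires a system with ZFS installed, so treat it as a command fragment rather than something to paste blindly.

```shell
# Hypothetical dataset name "tank/data"; requires a live ZFS system.
# Show the current checksum algorithm ("on" means the fletcher4 default):
zfs get checksum tank/data

# Switch to SHA-256 for stronger integrity guarantees:
sudo zfs set checksum=sha256 tank/data
```

Note that the property only affects blocks written after the change; existing blocks keep the checksums they were written with until they are rewritten.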
When a scrub is initiated — either manually via zpool scrub [pool-name] or automatically by a scheduled systemd timer or cron job — ZFS traverses its entire Merkle-tree-like block reference structure, reading every live block from the physical storage devices and computing its checksum. The computed checksum is compared against the stored value. The scrub process runs as a background I/O task with lower priority than normal read/write operations, minimizing performance impact on active workloads, though total disk I/O throughput is reduced during the operation.
When a checksum mismatch is detected, ZFS responds based on pool topology:
- Mirrored pool (2-way or 3-way mirror): ZFS reads the corresponding block from the mirror device. If that copy passes its own checksum verification, ZFS automatically overwrites the corrupted block with the good data — transparent self-healing with no administrator intervention required. On a 2-way mirror, ZFS can repair any corruption confined to a single device, provided the surviving copy passes its checksum.
- RAID-Z1 pool (single parity): ZFS uses parity information to reconstruct and repair the corrupted block, similar in concept to software RAID-5 but more reliably because ZFS knows the exact location of the bad block rather than scanning blindly.
- RAID-Z2 or RAID-Z3 pool: Same reconstruction approach but tolerates 2 or 3 simultaneous device errors respectively, providing progressively higher protection against concurrent failures.
- Single-disk pool (no redundancy): ZFS detects and reports the corruption but cannot repair it. The error is logged to the system journal, and zpool status -v lists the affected files. Manual data recovery from a backup is required.
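On a pool where a scrub has flagged unrepairable damage, the verbose status listing identifies exactly which files were hit. A sketch, assuming a hypothetical pool named tank on a live ZFS system:

```shell
# Hypothetical pool name "tank"; requires a live ZFS system.
# The -v flag appends a "Permanent errors have been detected in the
# following files:" section listing paths that need restoring from backup:
sudo zpool status -v tank
```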
After a scrub completes, detailed results are visible via zpool status [pool-name], which reports how much data was scanned, how much was repaired, any errors detected, and the timestamp of the last completed scrub. A clean scrub producing zero errors is the expected and desired result, confirming that all stored data matched its recorded checksums as of the scrub date.
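The completion summary appears on the scan: line of zpool status output. As a sketch of how that line can be checked programmatically, here is a captured, hypothetical status line (rather than a live pool) with the error count extracted using standard text tools:

```shell
# A hypothetical "scan:" line, in the shape zpool status prints after a clean scrub:
scan_line='scan: scrub repaired 0B in 02:41:33 with 0 errors on Sun Feb  1 05:41:34 2026'

# Pull out the error count with sed:
errors=$(printf '%s\n' "$scan_line" | sed -n 's/.*with \([0-9][0-9]*\) errors.*/\1/p')
echo "errors found: $errors"
```

A monitoring script can alert whenever that count is non-zero, rather than relying on an administrator reading the output by eye.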
Common Misconceptions About ZFS Scrub
Misconception 1: ZFS scrub is equivalent to running fsck on traditional filesystems. This comparison is misleading and obscures an important distinction. Traditional fsck (filesystem check) tools verify and repair the logical structure of filesystem metadata — directory trees, inode tables, and block allocation maps — after events like unexpected shutdowns or power failures. ZFS scrub instead verifies the actual binary content of every stored data block against its checksum, detecting physical media errors, cosmic ray bit flips, and controller-induced corruption that are completely invisible to structural checks. Critically, because ZFS uses copy-on-write (COW) transactions, the filesystem metadata is always in a consistent state after any crash — a ZFS pool never requires an fsck-equivalent after sudden power loss. Scrub is therefore an ongoing data-quality operation, not a crash-recovery tool.
Misconception 2: Running a ZFS scrub is risky and can cause data loss. This concern is unfounded. A ZFS scrub is a predominantly read-only operation. Writes to disk occur only when ZFS has confirmed corruption in a block AND has a verified good copy from a redundant device to replace it with. ZFS never overwrites data based on assumption — it only writes during repair when a checksum-verified good copy is available. The operation is explicitly designed to be safe to run on active production pools at any time. The only practical concern is the temporary increase in disk I/O, which can slow application response times on I/O-intensive workloads. Scheduling scrubs during off-peak periods (overnight or on weekends) addresses this entirely.
Misconception 3: ZFS scrub eliminates the need for external backups. While ZFS scrub and the underlying RAID-Z/mirror redundancy can repair a wide range of hardware-induced corruption, it is absolutely not a substitute for independent backups. ZFS scrub provides no protection against accidental deletion of files (immediately reflected across all mirrors), ransomware that encrypts pool data, logical errors in applications, or catastrophic events like fire or flood that destroy all physical devices in the pool simultaneously. The classic 3-2-1 backup strategy — 3 copies of data, on 2 different storage media types, with 1 copy stored off-site — remains essential even with ZFS scrub enabled and operating correctly.
Practical Guide to Using ZFS Scrub
For system administrators and home users managing ZFS pools, the following practical steps maximize the protection provided by ZFS scrubbing.
Starting a manual scrub: Run sudo zpool scrub [pool-name] to initiate a scrub immediately. The command returns instantly and the scrub runs in the background. Replace [pool-name] with the actual pool name visible in zpool list.
Monitoring scrub progress: Use sudo zpool status [pool-name] to check real-time scrub progress. The output includes a percentage complete, current scan rate in MB/s, estimated time remaining, and a summary of any errors found. Refreshing this command every few seconds provides a live progress view.
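The two steps above can be combined in a short session. A sketch, assuming a hypothetical pool named tank on a live ZFS system:

```shell
# Hypothetical pool name "tank"; requires a live ZFS system.
sudo zpool scrub tank           # kick off the scrub; the command returns immediately
sudo zpool status tank          # one-shot progress check (percentage, MB/s, ETA)
watch -n 10 zpool status tank   # live view, refreshed every 10 seconds
```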
Pausing and resuming scrubs: On any reasonably recent OpenZFS release — pause/resume was added in ZFS on Linux 0.7.0 (2017) and is present in OpenZFS 2.0 and later, including Ubuntu 22.04 LTS, Debian 12, and FreeBSD 13 — scrubs can be paused with sudo zpool scrub -p [pool-name] and resumed by running sudo zpool scrub [pool-name] again. The scrub continues from where it left off. On older ZFS versions, the only way to stop a running scrub is to cancel it entirely with sudo zpool scrub -s [pool-name], after which the next scrub must start from the beginning.
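The pause/resume/cancel operations look like this in practice, assuming a hypothetical pool named tank and an OpenZFS version with scrub pause support:

```shell
# Hypothetical pool name "tank"; requires OpenZFS with scrub pause support.
sudo zpool scrub -p tank   # pause the running scrub; progress is preserved
sudo zpool scrub tank      # resume from where it stopped
sudo zpool scrub -s tank   # cancel outright; progress is discarded
```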
Automating monthly scrubs on Linux with systemd: Recent versions of the zfsutils-linux package on Ubuntu and Debian include a pre-built systemd timer template for monthly scrubs. Enable it for a specific pool with: sudo systemctl enable --now zfs-scrub-monthly@[pool-name].timer. On FreeBSD, periodic(8) ships /etc/periodic/daily/800.scrub-zfs, which is disabled by default; enable it by setting daily_scrub_zfs_enable="YES" in /etc/periodic.conf, after which each pool is scrubbed every 35 days by default (tunable via daily_scrub_zfs_threshold).
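Enabling and verifying the timer can be sketched as follows, assuming a hypothetical pool named tank and a zfsutils-linux version that ships the zfs-scrub-monthly@ timer template:

```shell
# Hypothetical pool name "tank"; assumes the zfs-scrub-monthly@ template
# is shipped by the installed zfsutils-linux / OpenZFS version.
sudo systemctl enable --now zfs-scrub-monthly@tank.timer

# Confirm the timer is active and when the next run is scheduled:
systemctl list-timers 'zfs-scrub-*'
```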
Setting up error alerting with ZED: The ZFS Event Daemon (zed) monitors ZFS events and can send email notifications when scrubs detect errors. Install and configure zed (part of the zfs-zed package on Debian/Ubuntu) and edit /etc/zfs/zed.d/zed.rc to set ZED_EMAIL_ADDR to your address. With this configured, any scrub that detects corruption triggers an immediate email alert, ensuring errors are not silently missed between scheduled checks.
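The relevant zed.rc settings look like the fragment below. The email address is a placeholder, and delivery assumes a working local mail transport:

```shell
# Excerpt of /etc/zfs/zed.d/zed.rc (configuration fragment, not a script).
# Address to notify; "admin@example.com" is a placeholder.
ZED_EMAIL_ADDR="admin@example.com"

# Optionally notify on every scrub completion, not just on errors:
ZED_NOTIFY_VERBOSE=1
```

After editing the file, restart the daemon (sudo systemctl restart zfs-zed on systemd-based distributions) so the new settings take effect.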
Enterprise monitoring: For organizations managing large numbers of ZFS pools, the open-source Prometheus ZFS exporter combined with Grafana dashboards provides centralized visibility into scrub status, error counts, and pool health across hundreds of systems. TrueNAS (both CORE and SCALE editions) includes a built-in web interface for scheduling and monitoring scrubs without any command-line access, making ZFS scrub accessible to users without Linux administration experience.
Related Questions
How do I start a ZFS scrub on Linux?
To start a ZFS scrub on Linux, run the command <code>sudo zpool scrub [pool-name]</code> from a terminal, replacing [pool-name] with your pool's actual name (visible via <code>zpool list</code>). The scrub begins immediately in the background and does not block the terminal. Progress can be monitored in real time with <code>sudo zpool status [pool-name]</code>, which shows percentage complete, scan speed in MB/s, and estimated time remaining. No special options are required for a standard scrub — the single command is sufficient.
How long does a ZFS scrub take?
ZFS scrub duration depends primarily on pool size and storage device speed. A 1 TB pool on spinning hard drives (typically 150–200 MB/s sequential read) takes approximately 2–6 hours, while a 10 TB HDD pool can take 10–30 hours. On NVMe SSDs with sequential reads exceeding 3,000 MB/s, a 1 TB pool completes in approximately 15–30 minutes. The scrub runs at background I/O priority, so heavy foreground disk activity can extend completion time significantly. System administrators often schedule scrubs during low-activity periods — overnight or on weekends — to minimize user impact.
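A lower bound on scrub time falls out of simple arithmetic: amount of allocated data divided by sequential read rate. Both figures below are assumptions (1 TiB used, ~180 MB/s for a typical HDD):

```shell
# Back-of-envelope lower bound: used data / sequential read rate.
# 1 TiB of used data at ~180 MiB/s (both figures are assumptions):
used_mib=$((1 * 1024 * 1024))
rate_mib_s=180
minutes=$(( used_mib / rate_mib_s / 60 ))
echo "lower bound: ~${minutes} min"
```

Real scrubs on the same pool commonly land in the 2–6 hour range quoted above, because scrub I/O is deprioritized behind foreground work and fragmented pools force the heads to seek rather than stream.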
What happens when ZFS scrub finds an error?
When ZFS scrub detects a checksum mismatch on a block, its response depends on pool redundancy. On a mirrored or RAID-Z pool, ZFS automatically repairs the corrupted block using the verified good copy from a redundant device — this self-healing happens transparently with no administrator action required. On a single-disk pool with no redundancy, ZFS records the error in the pool's error counters, visible in <code>zpool status</code>, but cannot repair the data. After a scrub with errors, the administrator should check <code>zpool status</code> for a count of repaired blocks, investigate the cause (often a failing disk), and restore unrecoverable data from backups.
How often should I run a ZFS scrub?
The OpenZFS project and Oracle's Solaris documentation both recommend running ZFS scrubs at least once every 30 days for pools containing important data. Many Linux distributions with ZFS support (including Ubuntu 20.04 LTS and later) configure a monthly scrub timer automatically via systemd upon installation. For archival cold-storage pools accessed infrequently, scrubbing every 90 days may be acceptable, though monthly remains the conservative best practice. Pools that have experienced recent hardware issues, unusual errors, or power outages should be scrubbed immediately rather than waiting for the next scheduled interval.
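On systems where the distribution does not ship a scrub timer, a cron entry achieves the same monthly cadence. A sketch, with a hypothetical pool name and a path that may differ by distribution:

```shell
# /etc/cron.d/zfs-scrub — fallback where no systemd timer exists.
# Hypothetical pool name "tank"; verify the zpool path with `command -v zpool`.
# Runs at 03:00 on the 1st of each month:
0 3 1 * * root /usr/sbin/zpool scrub tank
```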
What is the difference between ZFS scrub and ZFS resilver?
ZFS scrub and ZFS resilver both perform block-level reads and checksum verifications, but they serve different purposes and cover different data sets. A scrub reads and verifies every allocated block in the pool on demand or schedule, checking for silent corruption across the entire dataset. A resilver is triggered automatically when a device is replaced or brought back online after failure — it reads only the blocks that are relevant to the missing or new device and reconstructs them onto the replacement drive using parity or mirror data. Resilver is effectively a targeted rebuild operation focused on restoring full redundancy, while scrub is a comprehensive health audit of existing data.