As with many things in life, deduplication is a compromise. In order to gain increased levels of storage efficiency, additional cluster resources (CPU, memory and disk IO) are utilized to find and execute the sharing of common data blocks.
Another important performance impact consideration with dedupe is the potential for data fragmentation. After deduplication, files that previously enjoyed contiguous on-disk layout will often have chunks spread across less optimal file system regions. This can lead to slightly increased latencies when accessing these files directly from disk, rather than from cache.
To help reduce this risk, SmartDedupe will not share blocks across node pools or data tiers, and will not attempt to deduplicate files smaller than 32KB in size. On the other end of the spectrum, the largest contiguous region that will be matched is 4MB.
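These two thresholds can be illustrated with a short sketch. This is purely conceptual code, not OneFS internals; the function names and structure are invented for illustration, while the 32KB and 4MB values come from the text above.

```python
# Illustrative sketch of SmartDedupe's size thresholds (not OneFS code).
MIN_FILE_SIZE = 32 * 1024            # files below 32KB are never deduplicated
MAX_MATCH_REGION = 4 * 1024 * 1024   # largest contiguous region matched: 4MB

def is_dedupe_candidate(file_size: int) -> bool:
    """Files smaller than 32KB are skipped entirely."""
    return file_size >= MIN_FILE_SIZE

def split_into_match_regions(file_size: int):
    """Yield (offset, length) regions, each capped at the 4MB maximum."""
    offset = 0
    while offset < file_size:
        length = min(MAX_MATCH_REGION, file_size - offset)
        yield (offset, length)
        offset += length
```

For example, a 9MB file would be considered in three regions (4MB, 4MB, 1MB), while a 16KB file would be skipped outright.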
Because deduplication is a data efficiency product rather than a performance-enhancing tool, in most cases the primary consideration is cluster impact management. This applies both to client data access performance, since by design multiple files will share common data blocks, and to dedupe job execution, since additional cluster resources are consumed to detect and share commonality.
The first deduplication job run will often take a substantial amount of time to run, since it must scan all files under the specified directories to generate the initial index and then create the appropriate shadow stores. However, deduplication job performance will typically improve significantly on the second and subsequent job runs (incrementals), once the initial index and the bulk of the shadow stores have already been created.
If incremental deduplication jobs do take a long time to complete, this is most likely indicative of a data set with a high rate of change. If a deduplication job is paused or interrupted, it will automatically resume the scanning process from where it left off.
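The reason incremental runs are faster can be sketched conceptually: once a block-fingerprint index exists, a subsequent pass finds most blocks already indexed and does far less new work. This is a simplified illustration only; it does not reflect the actual SmartDedupe index format, and SHA-1 here stands in generically for content fingerprinting.

```python
import hashlib

def block_fingerprint(block: bytes) -> str:
    # SHA-1 is purely illustrative of content fingerprinting.
    return hashlib.sha1(block).hexdigest()

def dedupe_pass(blocks, index):
    """Add unseen block fingerprints to the persistent index and
    count how many blocks matched existing entries (shareable)."""
    shared = 0
    for block in blocks:
        fp = block_fingerprint(block)
        if fp in index:
            shared += 1
        else:
            index[fp] = True
    return shared
```

A first pass over a data set populates the index (the slow, full scan); a second pass over unchanged data finds every block already present and completes much faster.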
Deduplication can significantly increase the storage efficiency of data. However, the actual space savings will vary depending on the specific attributes of the data itself. As mentioned above, the deduplication assessment job can be run to help predict the likely space savings that deduplication would provide on a given data set.
For example, virtual machine files often contain duplicate data, much of which is rarely modified. Deduplicating similar OS-type virtual machine images (VMware VMDK files, etc.) that have been block-aligned can significantly decrease the amount of storage space consumed. However, the potential for performance degradation as a result of block sharing and fragmentation should be carefully considered first.
OneFS SmartDedupe does not deduplicate across files that have different protection settings. For example, if two files share blocks, but file1 is parity protected at +2:1, and file2 has its protection set at +3, SmartDedupe will not attempt to deduplicate them. This ensures that all files and their constituent blocks are protected as configured. Additionally, SmartDedupe won’t deduplicate files that are stored on different node pools. For example, if file1 and file2 are stored on tier 1 and tier 2 respectively, and tier1 and tier2 are both protected at +2:1, OneFS won’t deduplicate them. This helps guard against performance asynchronicity, where some of a file’s blocks could live on a different tier, or class of storage, than others.
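The two eligibility rules above can be sketched as a simple compatibility check. The attribute names below are hypothetical, chosen only to mirror the protection-policy and node-pool examples in the text:

```python
from dataclasses import dataclass

@dataclass
class FileInfo:
    # Hypothetical attributes for illustration only.
    protection: str   # e.g. "+2:1", "+3"
    node_pool: str    # e.g. "tier1", "tier2"

def can_share_blocks(a: FileInfo, b: FileInfo) -> bool:
    """Two files may share blocks only when both their protection
    policy and their node pool / tier match."""
    return a.protection == b.protection and a.node_pool == b.node_pool
```

So a +2:1 file and a +3 file never share blocks, and neither do two +2:1 files that live on different tiers.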
OneFS performance resource management provides statistics for the resources used by jobs – both cluster-wide and per-node. This information is provided via the ‘isi statistics workload’ CLI command. Available in a ‘top’ format, this command displays the top jobs and processes, and periodically updates the information.
For example, the following syntax shows, and indefinitely refreshes, the top five processes on a cluster:
# isi statistics workload --limit 5 --format=top
last update: 2020-09-23T16:45:25 (s)ort: default

CPU   Reads  Writes  L2    L3     Node  SystemName  JobType
1.4s  9.1k   0.0     3.5k  497.0  2     Job: 237    IntegrityScan
1.2s  85.7   714.7   4.9k  0.0    1     Job: 238    Dedupe
1.2s  9.5k   0.0     3.5k  48.5   1     Job: 237    IntegrityScan
1.2s  7.4k   541.3   4.9k  0.0    3     Job: 238    Dedupe
1.1s  7.9k   0.0     3.5k  41.6   2     Job: 237    IntegrityScan
From the output, we can see that two job engine jobs are in progress: Dedupe (job ID 238), which runs at low impact and priority level 4, is contending with IntegrityScan (job ID 237), which runs by default at medium impact and priority level 1.
The resource statistics tracked per job, per job phase, and per node include CPU, reads, writes, and L2 & L3 cache hits. This per-job granularity, which the generic ‘top’ command output lacks, makes it easier to diagnose resource issues with individual jobs.
Below are some examples of typical space reclamation levels that have been achieved by running SmartDedupe on various data types. Be aware, though, that these space savings values are provided solely as rough guidance. Since no two data sets are alike (unless they’re replicated), actual results can and will vary considerably from these examples.
| Workflow / Data Type | Typical Space Savings |
| --- | --- |
| Virtual Machine Data | |
| Home Directories / File Shares | |
| Engineering Source Code | |
SmartDedupe is included as a core component of OneFS but requires a valid product license key in order to activate. An unlicensed cluster will show a SmartDedupe warning until a valid product license has been applied to the cluster.
For optimal cluster performance, observing the following SmartDedupe best practices is recommended.
- Deduplication is most effective when applied to data sets with a low rate of change – for example, archived data.
- Enable SmartDedupe to run at subdirectory level(s) below /ifs.
- Avoid adding more than ten subdirectory paths to the SmartDedupe configuration policy.
- SmartDedupe is ideal for home directories, departmental file shares and warm and cold archive data sets.
- Run SmartDedupe against a smaller sample data set first to evaluate performance impact versus space efficiency.
- Schedule deduplication to run during the cluster’s low usage hours – i.e. overnight, weekends, etc.
- After the initial dedupe job has completed, schedule incremental dedupe jobs to run every two weeks or so, depending on the size and rate of change of the dataset.
- Always run SmartDedupe with the default ‘low’ impact Job Engine policy.
- Run the dedupe assessment job on a single root directory at a time. If multiple directory paths are assessed in the same job, you will not be able to determine which directory should be deduplicated.
- When replicating deduplicated data, to avoid running out of space on the target, it is important to verify that the logical data size (i.e. the amount of storage space saved plus the actual storage space consumed) does not exceed the total available space on the target cluster.
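The capacity check in that last point is simple arithmetic, sketched below with illustrative numbers (the function and values are not part of any OneFS tool):

```python
def fits_on_target(space_saved: int, space_consumed: int, target_free: int) -> bool:
    """When replicating deduplicated data, the target must be able to hold
    the full logical size: space saved by dedupe plus space actually consumed."""
    logical_size = space_saved + space_consumed
    return logical_size <= target_free

# Example: 40TB saved by dedupe plus 60TB consumed = 100TB logical,
# so a target with only 80TB free would run out of space.
```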
- Run a deduplication job on an appropriate data set prior to enabling a snapshots schedule.
- Where possible, perform any snapshot restores (reverts) before running a deduplication job, and run a dedupe job directly after restoring a prior snapshot version.
With dedupe, there’s always a trade-off between cluster resource consumption (CPU, memory, disk IO), the potential for data fragmentation, and the benefit of increased space efficiency. Therefore, SmartDedupe is not ideally suited for high-performance workloads.
- Depending on an application’s I/O profile and the effect of deduplication on the data layout, read and write performance and overall space savings can vary considerably.
- SmartDedupe will not permit block sharing across different hardware types or node pools to reduce the risk of performance asymmetry.
- SmartDedupe will not share blocks across files with different protection policies applied.
- OneFS metadata, including the deduplication index, is not deduplicated.
- Deduplication is a long running process that involves multiple job phases that are run iteratively.
- SmartDedupe will not attempt to deduplicate files smaller than 32KB in size.
- Dedupe job performance will typically improve significantly on the second and subsequent job runs, once the initial index and the bulk of the shadow stores have already been created.
- SmartDedupe will not deduplicate the data stored in a snapshot. However, snapshots can certainly be created of deduplicated data.
- If deduplication is enabled on a cluster that already has a significant amount of data stored in snapshots, it will take time before the snapshot data is affected by deduplication. Newly created snapshots will contain deduplicated data, but older snapshots will not.
- Any file on a cluster that is ‘un-deduped’ is automatically marked ‘do not re-dedupe’. In order for deduplication to be reapplied to an un-deduped file, this flag on the shadow store needs to be cleared.

For example, first check the current setting:

# isi get -D /ifs/data/test | grep -i dedupe
* Do not dedupe: 0

Un-dedupe the file via isi_sstore:

# isi_sstore undedupe /ifs/data/test

Verify that the flag is now set:

# isi get -D /ifs/data/test | grep -i dedupe
* Do not dedupe: 1

If you want that file to participate in dedupe again, reset the ‘Do not dedupe’ flag:

# isi_sstore attr --no_dedupe=false <path>
SmartDedupe is one of several components of OneFS that enable OneFS to deliver a very high level of raw disk utilization. Another major storage efficiency attribute is the way that OneFS natively manages data protection in the file system. Unlike most file systems that rely on hardware RAID, OneFS protects data at the file level and, using software-based erasure coding, allows most customers to enjoy raw disk space utilization levels in the 80% range or higher. This is in contrast to the industry mean of around 50-60% raw disk capacity utilization. SmartDedupe serves to further extend this storage efficiency headroom, bringing an even more compelling and demonstrable TCO advantage to primary file-based storage.
SmartDedupe post-process deduplication is compatible with OneFS in-line data reduction (which we’ll cover in another blog post series), and vice versa. In-line compression is able to compress OneFS shadow stores. However, for SmartDedupe to process compressed data, the SmartDedupe job has to decompress it first in order to perform deduplication, which is an additional resource overhead.