April 2022 – Unstructured Data Quick Tips

OneFS Signed Upgrades

Introduced as part of the OneFS 9.4 security enhancements, signed upgrades help maintain system integrity by preventing a cluster from being compromised by the installation of maliciously modified upgrade packages. This is required by several industry security compliance mandates, such as the DoD Network Device Management Security Requirements Guide, which stipulates “The network device must prevent the installation of patches, service packs, or application components without verification the software component has been digitally signed using a certificate that is recognized and approved by the organization”.

With this new OneFS 9.4 signed upgrade functionality, all packages must be cryptographically signed before they can be installed. This applies to all upgrade types including core OneFS, patches, cluster firmware, and drive firmware. The underlying components that comprise this feature include an updated .isi format for all package types plus a new OneFS Catalog to store the verified packages. In OneFS 9.4, the actual upgrades themselves are still performed via either the CLI or WebUI, and are very similar to previous versions.

Under the hood, the new signed upgrade process works as follows:

The primary change is that, in OneFS 9.4, everything goes through the catalog, which comprises four basic components. There’s a small SQLite database that tracks metadata, a library which has the basic logic for the catalog, the signature library based around OpenSSL which handles all of the verification, and a couple of directories to store the verified packages.

With signed upgrades, there’s a single file to download that contains the upgrade package, README text, and all signature data, and no file unpacking required.

The .isi file format is a follows:

A ‘readme’ text file can be incorporated directly in the second region of the package file, providing instructions, version compatibility requirements, etc.

The first region, which contains the main package data, is also compatible with previous OneFS versions that don’t support the .isi format. This allows a signed firmware of DSP package to be installed on OneFS 9.3 and earlier.

The new OneFS catalog provides a secure place to store verified .isi packages, and only the root account has direct access. The catalog itself is stored at /ifs/,ifsvar/catalog and all maintenance and interaction is via the ‘isi upgrade catalog’ CLI command set. The contents, or artifacts, of the catalog each have an ID which corresponds to the SHA256 hash of the file.

Any user account with ISI_PRIV_SYS_UPGRADE privilege can perform the following catalog-related actions, expressed as flags to the ‘isi upgrade catalog’ command:

Action	Description
Clean	List packages in the catalog
Export	Save a catalog item to a user specified file location
Import	Verify and add a new .isi package file into the catalog
List	List packages in the catalog
Readme	Display the README text from a catalog item or .isi package file
Remove	Manually remove a package from the catalog
Repair	Re-verify all catalog packages an rebuild the database
Verify	Verify the signature of a catalog item or .isi package file

Package verification leverages the OneFS’ OpenSSL library, which enables a SHA256 hash of the manifest to be verified against the certificate. As part of this process, the chain-of-trust for the included certificate is compared with contents of the /etc/ssl/certs directory, and the distinguished name on the checked against /etc/upgrade/identities file. Finally, the SHA256 hash of the data regions is compared against values from manifest.

The signature can be checked using the ‘isi upgrade catalog verify’ command. For example:

# isi upgrade catalog verify --file /ifs/install.isi

Item             Verified

--------------------------

/ifs/install.isi True

--------------------------

Total: 1

Additional install image details are available via the ‘isi_packager view’ command

# isi_packager view --package /ifs/install.isi

== Region 1 ==

Type: OneFS Install Image

Name: OneFS_Install_0x90500B000000AC8_B_MAIN_2760(RELEASE)

Hash: ef7926cfe2255d7a620eb4557a17f7650314ce1788c623046929516d2d672304

Size: 397666098

== Footer Details ==

Format Version: 1

 Manifest Size: 296

Signature Size: 2838

Timestamp Size: 1495

 Manifest Hash: 066f5d6e6b12081d3643060f33d1a25fe3c13c1d13807f49f51475a9fc9fd191

Signature Hash: 5be88d23ac249e6a07c2c169219f4f663220d4985e58b16be793936053a563a3

Timestamp Hash: eca62a3c7c3f503ca38b5daf67d6be9d57c4fadbfd04dbc7c5d7f1ff80f9d948

== Signature Details ==

Fingerprint:     33fba394a5a0ebb11e8224a30627d3cd91985ccd

Issuer:          ISLN

Subject:         US / WA / Sea / Isln OneFS.

Organization:    Isln Powerscale OneFS

Expiration:      2022-09-07 22:00:22

Ext Key Usage:   codesigning

Packages in the catalog can be listed as follows:

# isi upgrade catalog list

ID    Type  Description                                               README

-----------------------------------------------------------------------------

cdb88 OneFS OneFS 9.4.0.0_build(2797)style(11) / B_MAIN_2797(RELEASE) -

3a145 DSP   Drive_Support_v1.39.1                                    Included

840b8 Patch HealthCheck_9.2.1_2021-09                                Included

aa19b Patch 9.3.0.2_GA-RUP_2021-12_PSP-1643                          Included

-----------------------------------------------------------------------------

Total: 4

Note that the package ID is comprised from first few characters of SHA256 hash

Packages are automatically imported when used, and verified upon import. Verification and import can also be performed manually, if desired:

# isi upgrade catalog verify --file Drive_Support_v1.39.1.isi

Item                                      Verified

------------------------------------------------- /ifs/packages/Drive_Support_v1.39.1.isi True

-------------------------------------------------

# isi upgrade catalog import Drive_Support_v1.39.1.isi

Packages can also be exported from the catalog and copy to another cluster, for example. Generally, exported packages can be re-imported, too.

# isi upgrade catalog list

ID    Type Description                                               README

----------------------------------------------------------------------------- 00b9c OneFS OneFS 9.4.0.0_build(2625)style(11) / B_MAIN_2625(RELEASE) –

3a145 DSP Drive_Support_v1.39.1 Included

----------------------------------------------------------------------------- Total: 5

# isi upgrade catalog export --id 3a145 --file /ifs/Drive_Support_v1.39.1.isi

However, auto-generated OneFS images cannot be reimported.

The README column of the ‘isi upgrade catalog list’ output indicates whether release notes are included for a .isi file or catalog item. If available, these can be viewed as follows:

# isi upgrade catalog readme --file HealthCheck_9.2.1_2021-09.isi | less Updated: September 02, 2021 *****************************************************************************

HealthCheck_9.2.1_2021-09: Patch for OneFS 9.2.1.x.

This patch contains the 2021-09 RUP for the Isilon HealthCheck System

***************************************************************************** This patch can be installed on clusters running the following OneFS version:

* 9.2.1.x

:

Within a readme file, details typically include a short description of the artefact, and also which minimum OneFS version the cluster is required to be running for installation.

Cleanup of patches and OneFS images is performed automatically upon commit, and any installed packages require the artefact to be present in the catalog for successful uninstall. Similarly, the committed OneFS image is required for both patch removal and cluster expansion via node addition.

Artifacts can be removed manually as follows:

# isi upgrade catalog remove --id 840b8

This will remove the specified artifact and all related metadata.

Are you sure? (yes/[no]): yes

However, always use caution if attempting to manually removing a package.

When it comes to catalog housekeeping, the ‘clean’ function will remove any catalog artifact files without database entries, although normally this happens automatically when an item is removed.

# isi upgrade catalog clean

This will remove any artifacts that do not have associated metadata in the database.

Are you sure? (yes/[no]): yes

Additionally, the catalog ‘repair’ function will rebuild the database and re-import all valid items, as well as re-verifying their signatures:

# isi upgrade catalog repair

This will attempt to repair the catalog directory. This will result in all stored artifacts being re-verified. Artifacts that fail to be verified will be deleted. Additionally, a new catalog directory will be initialized with the remaining artifacts.

Are you sure? (yes/[no]): yes

When installing a signed upgrade, patch, firmware or drive support package (DSP) on a cluster running OneFS 9.4, the command syntax used is fundamentally the same as in prior OneFS versions, with only the file extension itself having changed. The actual install file will have the ‘.isi’ extension, and the file containing the hash value for download verification will have a ‘.isi.sha256’ suffix. For example, take the OneFS 9.4 install files:

4.0.0_Install.isi
4.0.0_Install.isi.sha256

The following syntax can be used to initiate a parallel OneFS signed upgrade:

# isi upgrade start --install-image-path /ifs/install.isi -–parallel

Alternatively, if the desired upgrade image package is already in the catalog, it can be installed using the ‘—install-image-id’ flag instead:

# isi upgrade start --install-image-id 00b9c –parallel

Or to upgrade a cluster’s firmware:

# isi upgrade firmware start --fw-pkg /ifs/IsiFw_Package_v10.3.7.isi –-rolling

And upgrading a cluster’s firmware using the ID of a package that’s in the catalog:

# isi upgrade firmware start --fw-pkg-id cf01b -–rolling

To initiate a simultaneous upgrade of a patch:

# isi upgrade patches install --patch /ifs/patch.isi -–simultaneous

And finally, to initiate a simultaneous upgrade of a drive firmware package:

# isi_dsp_install Drive_Support_v1.39.1.isi

Note that patches and drive support firmware are not currently able to be installed by their package IDs.

The current version of node firmware that a cluster is running can be determined by viewing the contents of the isi_hwmon file, located in a log set under each node’s root path. Ie: <node-lnn>/isi_hwmon

--- FirmwareCheck Diagnostics ---

Packages:

IsiFw_Package_v11.5.1.tar

Drive_Support_v1.41.1.tgz

Similarly, there is a JSON file in a cluster’s log gather that holds all the firmware versions of the components on the system, in addition to the firmware package used to install the firmware. This file is located on each node under the path:

<node-lnn>/upgrade_local.tar/var/ifs/upgrade/firmware_status.json

A committed upgrade image from the previous OneFS upgrade is automatically saved in the catalog, and also created automatically when a new cluster is configured. This image is required for new node joins, as well as when uninstalling patches. However, it’s worth noting that auto-created images will not have a signature and, while they may be exported, they cannot be re-imported back into the catalog.

In the event that the committed upgrade image is missing, CELOG events will be generated and the ‘isi upgrade catalog repair’ command output will display an error. Additionally, when it comes to troubleshooting the signed upgrade process, it can pay to check both /var/log/messages and /var/log/isi_papi_d.log, as well as to the OneFS upgrade logs .

OneFS Data Reduction and Efficiency Reporting

Among the objectives of OneFS reduction and efficiency reporting is to provide ‘industry standard’ statistics, allowing allow easier comprehension of cluster efficiency. It’s an ongoing process, and prior to OneFS 9.2 there was limited tracking of certain filesystem statistics – particularly application physical and filesystem logical – which meant that data reduction and storage efficiency ratios had to be estimated. This is no longer the case, and OneFS 9.2 and later provides accurate data reduction and efficiency metrics at a per-file, quota, and cluster-wide granularity.

The following table provides descriptions for the various OneFS reporting metrics, while also attempting to rationalize their naming conventions with other general industry terminology:

OneFS Metric	Also Known As	Description
Protected logical	Application logical	Data size including sparse data, zero block eliminated data, and CloudPools data stubbed to a cloud tier.
Logical data	Effective Filesystem logical	Data size excluding protection overhead and spars data, and including data efficiency savings (compression and deduplication).
Zero-removal saved		Capacity savings from zero removal.
Dedupe saved		Capacity savings from deduplication.
Compression saved		Capacity savings from in-line compression.
Preprotected physical	Usable Application physical	Data size excluding protection overhead and including storage efficiency savings.
Protection overhead		Size of erasure coding used to protect data.
Protected physical	Raw Filesystem physical	Total footprint of data including protection overhead FEC erasure coding) and excluding data efficiency savings (compression and deduplication).
Dedupe ratio		Deduplication ratio. Will be displayed as 1.0:1 if there are no deduplicated blocks on the cluster.
Compression ratio		Usable reduction ratio from compression, calculated by dividing ‘logical data’ by ‘preprotected physical’ and expressed as x:1.
Inlined data ratio		Efficiency ratio from storing small files’ data within their inodes, thereby not require any data or protection blocks for their storage.
Data reduction ratio	Effective to Usable	Usable efficiency ratio from compression and deduplication. Will display the same value as the compression ratio if there is no deduplication on the cluster.
Efficiency ratio	Effective to Raw	Overall raw efficiency ratio expressed as x:1

So let’s take these metrics and look at what they represent and how they’re calculated.

Application logical, or protected logical, is the application data that can be written to the cluster, irrespective of where it’s stored.

Removing the sparse data from application logical results in filesystem logical, also known simply as logical data or effective. This can be data that was always sparse, was zero block eliminated, or data that has been tiered off-cluster via CloudPools, etc.

Note that filesystem logical was not accurately tracked in releases prior to OneFS 9.2, so metrics prior to this were somewhat estimated.

Next, data reduction techniques such as compression and deduplication further reduce filesystem logical to application physical, or pre-protected physical. This is the physical size of the application data residing on the filesystem drives, and does not include metadata, protection overhead, or data moved to the cloud.

Filesystem physical is application physical with data protection overhead added – including inode, mirroring and FEC blocks, etc. Filesystem physical is also referred to as protected physical.

The data reduction ratio is the amount that’s been reduced from the filesystem logical down to the application physical.

Finally, the storage efficiency ratio is the filesystem logical divided by the filesystem physical.

With the enhanced data reduction reporting in OneFS 9.2 and later, the actual statistics themselves are largely the same, just calculated more accurately.

The storage efficiency data was available in releases prior to OneFS 9.2, albeit somewhat estimated, but the data reduction metrics were introduced with OneFS 9.2.

The following tools are available to query these reduction and efficiency metrics at the file, quota, and cluster-wide granularity:

Realm	OneFS Command	OneFS Platform API
File	isi get -D
Quota	isi quota list -v	12/quota/quotas
Cluster-wide	isi statistics data-reduction	1/statistics/current?key=cluster.data.reduce.*
Detailed Cluster-wide	isi_cstats	1/statistics/current?key=cluster.cstats.*

Note that the ‘isi_cstats’ CLI command provides some additional, behind-the-scenes details. The interface goes through platform API to fetch these stats.

The ‘isi statistics data-reduction’ CLI command is the most comprehensive of the data reduction reporting CLI utilities. For example:

# isi statistics data-reduction

                      Recent Writes Cluster Data Reduction

                           (5 mins)

--------------------- ------------- ----------------------

Logical data                  6.18M                  6.02T

Zero-removal saved                0                      -

Deduplication saved          56.00k                  3.65T

Compression saved             4.16M                  1.96G

Preprotected physical         1.96M                  2.37T

Protection overhead           5.86M                910.76G

Protected physical            7.82M                  3.40T

Zero removal ratio         1.00 : 1                      -

Deduplication ratio        1.01 : 1               2.54 : 1

Compression ratio          3.12 : 1               1.02 : 1

Data reduction ratio       3.15 : 1               2.54 : 1

Inlined data ratio         1.04 : 1               1.00 : 1

Efficiency ratio           0.79 : 1               1.77 : 1

--------------------- ------------- ----------------------

The ‘recent writes’ data to the left of the output provides precise statistics for the five-minute period prior to running the command. By contrast, the ‘cluster data reduction’ metrics on the right of the output are slightly less real-time but reflect the overall data and efficiencies across the cluster. Be aware that, in OneFS 9.1 and earlier, the right-hand column metrics are designated by the ‘Est’ prefix, denoting an estimated value. However, in OneFS 9.2 and later, the ‘logical data’ and ‘preprotected physical’ metrics are tracked and reported accurately, rather than estimated.

The ratio data in each column is calculated from the values above it. For instance, to calculate the data reduction ratio, the ‘logical data’ (effective) is divided by the ‘preprotected physical’ (usable) value. From the output above, this would be:

6.02 / 2.37 = 2.54 Or a Data Reduction ratio of 2.54:1

Similarly, the ‘efficiency ratio’ is calculated by dividing the ‘logical data’ (effective) by the ‘protected physical’ (raw) value. From the output above, this yields:

6.02 / 3.40= 1.77 Or an Efficiency ratio of 1.77:1

OneFS SmartQuotas reports the capacity saving from in-line data reduction as a storage efficiency ratio. SmartQuotas reports efficiency as a ratio across the desired data set as specified in the quota path field. The efficiency ratio is for the full quota directory and its contents, including any overhead, and reflects the net efficiency of compression and deduplication. On a cluster with licensed and configured SmartQuotas, this efficiency ratio can be easily viewed from the WebUI by navigating to ‘File System > SmartQuotas > Quotas and Usage’. In OneFS 9.2 and later, in addition to the storage efficiency ratio, the data reduction ratio is also displayed.

Similarly, the same data can be accessed from the OneFS command line via is ‘isi quota quotas list’ CLI command. For example:

# isi quota quotas list

Type      AppliesTo  Path  Snap  Hard  Soft  Adv  Used  Reduction  Efficiency

------------------------------------------------------------------------------

directory DEFAULT    /ifs  No    -     -     -    6.02T 2.54 : 1   1.77 : 1

------------------------------------------------------------------------------

Total: 1

More detail, including both the physical (raw) and logical (effective) data capacities, is also available via the ‘isi quota quotas view <path> <type>’ CLI command. For example:

# isi quota quotas view /ifs directory

                        Path: /ifs

                        Type: directory

                   Snapshots: No

                    Enforced: No

                   Container: No

                      Linked: No

                       Usage

                           Files: 5759676

         Physical(With Overhead): 6.93T

        FSPhysical(Deduplicated): 3.41T

         FSLogical(W/O Overhead): 6.02T

        AppLogical(ApparentSize): 6.01T

                   ShadowLogical: -

                    PhysicalData: 2.01T

                      Protection: 781.34G

     Reduction(Logical/Data): 2.54 : 1

Efficiency(Logical/Physical): 1.77 : 1

To configure SmartQuotas for in-line data efficiency reporting, create a directory quota at the top-level file system directory of interest, for example /ifs. Creating and configuring a directory quota is a simple procedure and can be performed from the WebUI by navigate to ‘File System > SmartQuotas > Quotas and Usage’ and selecting ‘Create a Quota’. In the create pane, field, set the Quota type to ‘Directory quota’, add the preferred top-level path to report on, select ‘application logical size’ for Quota Accounting, and set the Quota Limits to ‘Track storage without specifying a storage limit’. Finally, select the ‘Create Quota’ button to confirm the configuration and activate the new directory quota.

The efficiency ratio is a single, current-in time efficiency metric that is calculated per quota directory and includes the sum of in-line compression, zero block removal, in-line dedupe and SmartDedupe. This is in contrast to a history of stats over time, as reported in the ‘isi statistics data-reduction’ CLI command output, described above. As such, the efficiency ratio for the entire quota directory will reflect what is actually there.

OneFS Inline Dedupe

Among the features and functionality delivered in the new OneFS 9.4 release is the promotion of inline dedupe to enabled by default, further enhancing PowerScale’s dollar per TB economics, rack density, and value.

Part of the OneFS data reduction suite, inline dedupe initially debuted in OneFS 8.2.1. However, until now, it needed to be manually enabled, so often customers simply didn’t use it. However, with this enhancement, new clusters running OneFS 9.4 will now have inline dedupe on by default.

Cluster Configuration	Inline Dedupe	Inline Compression
New cluster running OneFS 9.4	Enabled	Enabled
New cluster running OneFS 9.3 or earlier	Disabled	Enabled
Cluster with inline dedupe enabled that is upgraded to OneFS 9.4	Enabled	Enabled
Cluster with inline dedupe disabled that is upgraded to OneFS 9.4	Disabled	Enabled

That said, any clusters that upgrade to 9.4 will not see any change to their current inline dedupe config during upgrade. Additionally, there is also no change to the behavior for inline compression, which remains enabled by default in all OneFS versions from 8.1.3 onwards.

But before we examine the under the hood changes in OneFS 9.4, first, a quick dedupe refresher.

Currently OneFS inline data reduction, which encompasses compression, dedupe, and zero block removal, is supported on the F900, F600, F200 all-flash nodes, plus the F810, H5600, H700/7000, and A300/3000 Gen6.x chassis.

Within the OneFS data reduction pipeline, zero block removal is performed first, followed by dedupe, and then compression, and this order allows each phase to reduce the scope of work each subsequent phase.

Unlike SmartDedupe, which performs deduplication once data has been written to disk, or post-process, inline dedupe acts in real time, deduplicating data as is ingested into the cluster. Storage efficiency is achieved by scanning the data for identical blocks as it is received and then eliminating the duplicates.

When inline dedupe discovers a duplicate block, it moves a single copy of the block to a special set of files known as shadow stores. These are file system containers that allow data to be stored in a sharable manner. As such, files stored under OneFS can contain both physical data and pointers, or references, to shared blocks in shadow stores.

Shadow stores are similar to regular files but are hidden from the file system namespace, so cannot be accessed via a pathname. A shadow store typically grows to a maximum size of 2GB, which is around 256K blocks, with each block able to be referenced by 32,000 files. If the reference count limit is reached, a new block is allocated, which may or may not be in the same shadow store. Additionally, shadow stores do not reference other shadow stores. And snapshots of shadow stores are not permitted because the data contained in shadow stores cannot be overwritten.

When a client writes a file to a node pool configured for inline dedupe on a cluster, the write operation is divided up into whole 8KB blocks. Each of these blocks is then hashed and its cryptographic ‘fingerprint’ compared against an in-memory index for a match. At this point, one of the following will happen:

If a match is discovered with an existing shadow store block, a byte-by-byte comparison is performed. If the comparison is successful, the data is removed from the current write operation and replaced with a shadow reference.
When a match is found with another LIN, the data is written to a shadow store instead and replaced with a shadow reference. Next, a work request is generated and queued that includes the location for the new shadow store block, the matching LIN and block, and the data hash. A byte-by-byte data comparison is performed to verify the match and the request is then processed.
If no match is found, the data is written to the file natively and the hash for the block is added to the in-memory index.

In order for inline dedupe to be performed on a write operation, the following conditions need to be true:

Inline dedupe must be globally enabled on the cluster.
The current operation is writing data (ie. not a truncate or write zero operation).
The ‘no_dedupe’ flag is not set on the file.
The file is not a special file type, such as an alternate data stream (ADS) or an EC (endurant cache) file.
Write data includes fully overwritten and aligned blocks.
The write is not part of a ‘rehydrate’ operation.
The file has not been packed (containerized) by SFSE (small file storage efficiency).

OneFS inline dedupe uses the 128-bit CityHash algorithm, which is both fast and cryptographically strong. This contrasts with OneFS’ post-process SmartDedupe, which uses SHA-1 hashing.

Each node in a cluster with inline dedupe enabled has its own in-memory hash index that it compares block ‘fingerprints’ against. The index lives in system RAM and is allocated using physically contiguous pages and accessed directly with physical addresses. This avoids the need to traverse virtual memory mappings and does not incur the cost of translation lookaside buffer (TLB) misses, minimizing dedupe performance impact.

The maximum size of the hash index is governed by a pair of sysctl settings, one of which caps the size at 16GB, and the other which limits the maximum size to 10% of total RAM. The strictest of these two constraints applies. While these settings are configurable, the recommended best practice is to use the default configuration. Any changes to these settings should only be performed under the supervision of Dell support.

Since inline dedupe and SmartDedupe use different hashing algorithms, the indexes for each are not shared directly. However, the work performed by each dedupe solution can be leveraged by each other. For instance, if SmartDedupe writes data to a shadow store, when those blocks are read, the read hashing component of inline dedupe will see those blocks and index them.

When a match is found, inline dedupe performs a byte-by-byte comparison of each block to be shared to avoid the potential for a hash collision. Data is prefetched prior to the byte-by-byte check and then compared against the L1 cache buffer directly, avoiding unnecessary data copies and adding minimal overhead. Once the matching blocks have been compared and verified as identical, they are then shared by writing the matching data to a common shadow store and creating references from the original files to this shadow store.

Inline dedupe samples every whole block written and handles each block independently, so it can aggressively locate block duplicity. If a contiguous run of matching blocks is detected, inline dedupe will merge the results into regions and process them efficiently.

Inline dedupe also detects dedupe opportunities from the read path, and blocks are hashed as they are read into L1 cache and inserted into the index. If an existing entry exists for that hash, inline dedupe knows there is a block sharing opportunity between the block it just read and the one previously indexed. It combines that information and queues a request to an asynchronous dedupe worker thread. As such, it is possible to deduplicate a data set purely by reading it all. To help mitigate the performance impact, the hashing is performed out-of-band in the prefetch path, rather than in the latency-sensitive read path.

The original inline dedupe control path design had its limitations, since it did not provide a gconfig control settings for default disabled inline dedupe. The previous control path logic had no gconfig control settings for default disabled inline dedupe. But in OneFS 9.4, there are now two separate features that interact together to distinguishing between a new cluster or an upgrade to an existing cluster configuration: The first one is, upon upgrade to 9.4 on an existing cluster, if there is no inline dedupe config present, then explicitly set it to disabled in gconfig as part of the upgrade. This has no effect on an existing cluster since it’s already disabled. Similarly, if the upgrading cluster already has an existing inline dedupe setting in gconfig, then OneFS takes no action.

The other half of the functionality is that, when booting OneFS 9.4, a node looks in gconfig to see if there’s an inline dedupe setting. If no config is present, OneFS enables it by default. Therefore new OneFS 9.4 clusters automatically enable dedupe, and existing clusters retain their legacy setting upon upgrade.

Since inline dedupe’s configuration is binary, either on or off across a cluster, it can be easily manually controlled via the OneFS command line interface (CLI). As such, the ‘isi dedupe inline settings modify’ CLI command to either enable or disable dedupe at will – before, during, or after the upgrade, it doesn’t matter.

For example, inline dedupe can be globally disabled and verified via the following CLI command:

# isi dedupe inline settings viewMode: enabled# isi dedupe inline settings modify –-mode disabled

# isi dedupe inline settings view

Mode: disabled

Similarly, the following syntax will enable inline dedupe:

# isi dedupe inline settings view

Mode: disabled

# isi dedupe inline settings modify –-mode enabled

# isi dedupe inline settings view

Mode: enabled

While there are no visible userspace changes when files are deduplicated, if deduplication has occurred, both the ‘disk usage’ and the ‘physical blocks’ metric reported by the ‘isi get –DD’ CLI command will be reduced. Additionally, at the bottom of the command’s output, the logical block statistics will report the number of shadow blocks. For example:

Metatree logical blocks:   zero=260814 shadow=362 ditto=0 prealloc=0 block=2 compressed=0

Inline dedupe can also be paused from the CLI as follows:

# isi dedupe inline settings modify –-mode paused

# isi dedupe inline settings view

Mode: paused

However, it’s worth noting that this global setting states what you’d like to happen, after which each node attempts to enact the new configuration, but can’t guaranty the change, because not all node types support inline dedupe. For example, the following output is from a heterogenous cluster with an F200 three-node pool supporting inline dedupe, and an H400 four-node pool not supporting it.

Here, we can see that inline dedupe is globally enabled on the cluster:

# isi dedupe inline settings view

Mode: enabled

However, the ‘isi_for_array isi_inline_dedupe_status’ command can be used to display the actual setting and state of each node:

# isi dedupe inline settings view

Mode: enabled

# isi_for_array -s isi_inline_dedupe_status

1: OK Node setting enabled is correct

2: OK Node setting enabled is correct

3: OK Node setting enabled is correct

4: OK Node does not support inline dedupe and current is disabled

5: OK Node does not support inline dedupe and current is disabled

6: OK Node does not support inline dedupe and current is disabled

7: OK Node does not support inline dedupe and current is disabled

Additionally, any changes to the dedupe configuration are also logged to /var/log/messages, and can be found by grepping for ‘inline_dedupe’

So, in a nutshell: In-line compression has always been enabled by default since its introduction in OneFS 8.1.3. For new clusters running 9.4 and above, inline dedupe is on by default. For clusters running 9.3 and earlier, inline dedupe remains disabled by default. And existing clusters that upgrade to 9.4 will not see any change to their current inline dedupe config during upgrade.

And here’s the OneFS in-line data reduction platform support matrix for good measure:

PowerScale OneFS 9.4

Arriving in time for Dell Technologies World 2022, the new PowerScale OneFS 9.4 release shipped on Monday 4^th April 2022.

OneFS 9.4 brings with it a wide array of new features and functionality, including:

Feature	Description
SmartSync Data Mover	Introduction of a new OneFS SmartSync data mover, allowing flexible data movement and copying, incremental resyncs, push and pull data transfer, and one-time file to object copy. Complimentary to SyncIQ, SmartSync provides an additional option for data transfer, including to object storage targets such as ECS, AWS and Azure.
IB to Ethernet Backend Migration	Non-disruptive rolling Infiniband to Ethernet back-end network migration for legacy Gen6 clusters.
Secure Boot	Secure boot support is extended to include the F900, F600, F200, H700/7000, and A700/7000 platforms.
Smarter SmartConnect Diagnostics	Identifies non-resolvable nodes and provides their detailed status, allowing the root cause to be easily pinpointed.
In-line Dedupe	In-line deduplication will be enabled by default on new OneFS 9.4 clusters. Clusters upgraded to OneFS 9.4 will maintain their current dedupe configuration.
Healthcheck Auto-updates	Automatic monitoring, download, and installation of new healthcheck packages as they are released.
CloudIQ Protocol Statistics	New protocol statistics ‘count’ keys are added, allowing performance to be measured over a specified time window and providing point-in-time protocol information.
SRS Alerts and CELOG Event Limiting	Prevents CELOG from sending unnecessary event types to Dell SRS and restricts CELOG alerts from customer-created channels.
CloudPools Statistics	Automated statistics gathering on CloudPools accounts and policies providing insights for planning and troubleshooting CloudPools-related activities.

We’ll be taking a deeper look at some of these new features in blog articles over the course of the next few weeks.

Meanwhile, the new OneFS 9.4 code is available for download on the Dell Online Support site, in both upgrade and reimage file formats.

Enjoy your OneFS 9.4 experience!