PowerScale Cybersecurity Suite

The Dell PowerScale Cybersecurity Suite represents a forward-thinking, integrated approach to addressing the growing challenges in cybersecurity and disaster recovery.

It aligns with Gartner’s definition of Cyberstorage, a category of solutions specifically designed to secure data storage systems against modern threats. These threats include ransomware, data encryption attacks, and theft, and the emphasis of Cyberstorage is on active prevention, early detection, and the ability to block attacks before they cause damage. Recovery capabilities are also tailored to the unique demands of data storage environments, making the protection of data itself a central layer of defense.

Unlike traditional solutions that often prioritize post-incident recovery – leaving organizations exposed during the critical early stages of an attack – Dell’s PowerScale Cybersecurity Suite embraces the Cyberstorage paradigm by offering a comprehensive set of capabilities that proactively defend data and ensure operational resilience. These include:

Capability Details
Active Defense at the Data Layer The suite integrates AI-driven tools to detect and respond to threats in real-time, analyzing user behavior and unauthorized data access attempts. Bidirectional threat intelligence integrates seamlessly into SIEM, SOAR, and XDR platforms for coordinated protection at every layer.
Automated Threat Prevention Includes features like automated snapshots, operational air-gapped vaults, and immediate user lockouts to stop attacks before damage spreads. Attack simulation tools also ensure that the defenses are always optimized for emerging threats.
NIST Framework Alignment Adheres to the National Institute of Standards and Technology (NIST) cybersecurity framework, providing a structured approach to identifying, protecting, detecting, responding to, and recovering from threats. This comprehensive protection eliminates vulnerabilities overlooked by traditional backup and security tools, enabling organizations to stay ahead of today’s evolving cyber risks while ensuring business continuity.
Rapid Recovery and Resilience With secure backups and precision recovery, organizations can rapidly restore specific files or entire datasets without losing unaffected data. Recovery is accelerated by integrated workflows that minimize downtime.

By embedding detection, protection, and response directly into the data layer, the Dell PowerScale Cybersecurity Suite adopts a proactive and preventive approach to safeguarding enterprise environments:

Approach Details
Identification & Detection Detecting potential incursions in real time using AI-driven behavioral analytics.
Protection Protecting data at its source with advanced security measures and automated threat monitoring.
Response Responding decisively with automated remediation to minimize damage and accelerate recovery, ensuring seamless continuity.
Recovery Providing recovery tools and forensic data and recovery tools to quickly restore clean data in the event of a breach, minimizing business disruption.

It begins with identification and detection, where data is protected at its source through advanced security measures and continuous automated threat monitoring. Protection is achieved by identifying potential incursions in real time using AI-driven behavioral analytics, allowing organizations to act before threats escalate. When a threat is detected, the suite responds with automated remediation processes that minimize damage and accelerate recovery, ensuring uninterrupted operations.

This integrated approach enables Dell to address the full lifecycle of security and recovery within the PowerScale platform, delivering exceptional resilience across IT environments.

Released globally on August 28, 2025, the Dell PowerScale Cybersecurity Suite is available in three customizable bundles tailored to meet diverse operational and regulatory needs.

Bundle Details
Cybersecurity Bundle Leverages AI-driven threat detection, Zero Trust architecture, and automated risk mitigation to fortify security.
Airgap Vault Bundle Extends the capabilities of the Cybersecurity Bundle by adding isolated, secure backups for robust ransomware protection. This bundle requires the Cybersecurity Bundle and the Disaster Recovery Bundle..
Disaster Recovery Bundle Prioritizes rapid recovery with near-zero Recovery Point Objectives (RPO), Recovery Time Objectives (RTO), and seamless failover capabilities.

The Cybersecurity Bundle leverages AI-driven threat detection, Zero Trust architecture, and automated risk mitigation to strengthen data protection. The Airgap Vault Bundle builds on this by adding isolated, secure backups for enhanced ransomware defense, and requires both the Cybersecurity and Disaster Recovery bundles for full deployment. The Disaster Recovery Bundle focuses on rapid recovery, offering rapid Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO), along with seamless failover capabilities to ensure business continuity.

Customers can choose specific bundles based on their requirements, with the only prerequisite being that the Airgap Vault bundle must be deployed alongside both the Cybersecurity and Disaster Recovery bundles to ensure full functionality and integration.

Built upon the venerable PowerScale platform, the suite is engineered to protect unstructured data and maintain operational availability in today’s increasingly complex threat landscape. As such, it offers a comprehensive set of tools, techniques, and architectural flexibility to deliver multilayered security and responsive recovery.

This enables organizations to design robust solutions that align with their specific security priorities—from advanced threat detection to seamless disaster recovery.

Among its key benefits, the suite includes automated threat response capabilities that swiftly mitigate risks such as malicious data encryption and exfiltration.

Feature Details
Automated threat response The suite features automated responses to cybersecurity threats, such as data encryption, and the prevention of data exfiltration, helping mitigate risks swiftly and effectively.
Secure Operational Airgap Vault Data is protected within an isolated operational airgap vault, and will only be transferred to the operational airgap vault if the production storage environment is not under attack, ensuring critical assets remain secure and inaccessible to unauthorized actors.
Ecosystems Integration Seamlessly integrates with leading endpoint protection and incident response software, automating and simplifying operations during a cyberattack to ensure a coordinated and efficient response.
DoD-Certified Hardware Integration Designed to enhance PowerScale’s DoD APL certified hardware, meeting rigorous cybersecurity standards, and providing customers with a trusted platform on which to build their defenses. The suite’s advanced capabilities, robust protection, and proven hardware deliver a comprehensive cyber and DR solution tailored to meet today’s complex security challenges.

It also features a secure operational airgap vault, which ensures that data is only transferred when the production environment is verified to be safe, keeping critical assets isolated and protected from unauthorized access. Integration with leading endpoint protection and incident response platforms allows for coordinated and efficient responses during cyberattacks, streamlining operations and reducing complexity.

The suite is also designed to complement PowerScale’s DoD APL-certified hardware, meeting stringent cybersecurity standards and providing a trusted foundation for enterprise defense strategies. Its advanced capabilities, combined with proven hardware, deliver a comprehensive cybersecurity and disaster recovery solution tailored to modern security challenges.

The Dell PowerScale Cybersecurity Suite is engineered to support petabyte-scale environments containing billions of files distributed across multiple PowerScale clusters. There is no hard-coded limit on the volume of data it can manage. However, actual throughput and failover times are influenced by factors such as SyncIQ bandwidth, the degree of policy parallelism, and the overall readiness of the environment. To help forecast recovery timelines, the suite includes an Estimated Failover Time engine that uses real metrics to generate policy-specific projections.

Superna software, which underpins the suite, interacts with PowerScale systems via REST APIs. These APIs provide access to inventory data, facilitate the orchestration of snapshot creation and deletion, and enable user lockout and restoration of access to shared resources. Specifically, the suite utilizes the file system API and the system configuration API to perform these operations. Meanwhile, configuration and management can be performed via the comprehensive, intuitive WebUI:

The performance impact of running the Data Security Edition—which includes the Cybersecurity Bundle with Ransomware Defender and Easy Auditor—is primarily tied to the frequency and volume of API calls made to PowerScale. These calls can be managed and tuned by administrators and support teams during deployment and system configuration. One known scenario that may affect cluster performance occurs when user permissions are broadly configured to allow all users access to all shares. In such cases, default settings combined with enabled snapshots can lead to excessive API endpoint activity, as the system attempts to create and delete snapshots across all shares. To mitigate this, Dell recommends disabling default snapshots during deployment and instead configuring snapshots only for critical data paths. Outside of this specific configuration, Dell has not identified any significant performance impacts under normal operating conditions.

The Dell-branded software is built on the same codebase as the standard Superna 2.x release, with branding determined by licensing. This branding helps  ensure consistency, simplify user interactions, and reinforce the alignment between the suite and Dell Technologies’ broader product portfolio.

With regard to release cadence, there is typically a 60-day lead time between Superna releasing a new version and Dell launching its branded equivalent, allowing for additional QA, regression, and longevity testing.

With the Cybersecurity suite, Dell Technologies will directly manage implementation services and provide first-call support. This approach ensures a seamless and customer-focused experience, with faster response times and streamlined service delivery. It also guarantees that customers receive consistent and integrated support throughout their deployment and operational lifecycle.

It is important to note that Dell-branded and Superna-branded software cannot be mixed within the same PowerScale cluster. Currently, the Dell PowerScale Cybersecurity Suite is intended exclusively for new deployments and is not compatible with existing Superna-branded environments. Migration from Superna to Dell-branded software is not supported at this time, unless an entire Superna Eyeglass solution is being renewed. However, Dell is actively working to expand migration options in future releases.

PowerScale InsightIQ 6.1

It’s been a sizzling summer for Dell PowerScale to date! Hot on the heels of the OneFS 9.12 launch comes the unveiling of the innovative new PowerScale InsightIQ 6.1 release.

InsightIQ provides powerful performance and health monitoring and reporting functionality, helping to maximize PowerScale cluster efficiency. This includes advanced analytics to optimize applications, correlate cluster events, and the ability to accurately forecast future storage needs.

So what new goodness does this InsightIQ 6.1 release add to the PowerScale metrics and monitoring mix?

Additional functionality includes:

Feature New IIQ 6.1 Functionality
Ecosystem support ·         InsightIQ is qualified on Ubuntu
Flexible Alerting ·         Defining custom alerts on the most used set of granular metrics.

·         Nine new KPIs for a total of 16 KPIs.

·         Increased granularity of alerting and many more

Online Migration from Simple to Scale ·         Customers with InsightIQ 6.0.1 Simple (OVA) can now migrate data and functionalities to InsightIQ 6.1.0 Scale.
Self-service Admin Password Reset ·         Administrators can reset their own password through a simple, secure flow with reduced IT dependency

InsightIQ 6.1 continues to offer the same two deployment models as its predecessors:

Deployment Model Description
InsightIQ Scale Resides on bare-metal Linux hardware or virtual machine.
InsightIQ Simple Deploys on a VMware hypervisor (OVA).

The InsightIQ Scale version resides on bare-metal Linux hardware or virtual machine, whereas InsightIQ Simple deploys via OVA on a VMware hypervisor.

InsightIQ v6.x Scale enjoys a substantial breadth-of-monitoring scope, with the ability to encompass 504 nodes across up to 20 clusters.

Additionally, InsightIQ v6.x Scale can be deployed on a single Linux host. This is in stark contrast to InsightIQ 5’s requirements for a three Linux node minimum installation platform.

Deployment:

The deployment options and hardware requirements for installing and running InsightIQ 6.x are as follows:

Attribute InsightIQ 6.1 Simple InsightIQ 6.1 Scale
Scalability Up to 10 clusters or 252 nodes Up to 20 clusters or 504 nodes
Deployment On VMware, using OVA template RHEL, SLES, or Ubuntu with deployment script
Hardware requirements VMware v15 or higher:

·         CPU: 8 vCPU

·         Memory: 16GB

·         Storage: 1.5TB (thin provisioned);

Or 500GB on NFS server datastore

Up to 10 clusters and 252 nodes:

·         CPU: 8 vCPU or Cores

·         Memory: 16GB

·         Storage: 500GB

Up to 20 clusters and 504 nodes:

·         CPU: 12 vCPU or Cores

·         Memory: 32GB

·         Storage: 1TB

Networking requirements 1 static IP on the PowerScale cluster’s subnet 1 static IP on the PowerScale cluster’s subnet

Ecosystem support:

The InsightIQ ecosystem itself is also expanded in version 6.1 to also include Ubuntu 24.04 Online deployment and OpenStack RHOSP 21 with RHEL 9.6, in addition to SLES 15 SP4 and Red Hat Enterprise Linux (RHEL) versions 9.6 and 8.10. This allows customers who have standardized on Ubuntu Linux to now run an InsightIQ 6.1 Scale deployment on a v24.04 host to monitor the latest OneFS versions.

Qualified on InsightIQ 6.0 InsightIQ 6.1
OS (IIQ Scale Deployment) RHEL 8.10, RHEL 9.4, and SLES 15 SP4 RHEL 8.10, RHEL 9.6, and SLES 15 SP4
PowerScale OneFS 9.4 to 9.11 OneFS 9.5 to 9.12
VMware ESXi ESXi v7.0U3 and ESXi v8.0U3 ESXi v8.0U3
VMware Workstation Workstation 17 Free Version Workstation 17 Free Version
Ubuntu Ubuntu 24.04 Online deployment
OpenStack RHOSP 17 with RHEL 9.4 RHOSP 21 with RHEL 9.6

Similarly, in addition to deployment on VMware ESXi 8, the InsightIQ Simple version can also be installed for free on VMware Workstation 17, providing the ability to stand up InsightIQ in a non-production or lab environment for trial or demo purposes, without incurring a VMware licensing charge.

Additionally, the InsightIQ OVA template is now reduced in size to under 5GB, and with an installation time of less than 12 minutes.

Online Upgrade

The IIQ upgrade in 6.1 is a six step process:

First, the installer checks the current Insight IQ version, verifies there’s sufficient free disk space, and confirms that setup is ready. Next, IIQ is halted and dependencies met, followed by the installation of the 6.1 infrastructure and a migration of legacy InsightIQ configuration and historical report data to the new platform. The cleanup phase removes the old configuration files, etc, followed by the final phase which upgrades alerts and removes the lock, leaving InsightIQ 6.1 ready to roll.

Phase Details
Pre-check •       docker command

•        IIQ version check 6.0.1

•       Free disk space

•       IIQ services status

•       OS compatibility

Pre-upgrade •       EULA accepted

•       Extract the IIQ images

•       Stop IIQ

•       Create necessary directories

Upgrade •       Upgrade addons services

•       Upgrade IIQ services except alerts

•       Upgrade EULA

•       Status Check

Post-upgrade •       Update admin email

•       Update IIQ metadata

Cleanup •       Replace scripts

•       Remove old docker images

•       Remove upgrade and backup folders

Upgrade Alerts and Unlock •       Trigger alert upgrade

•       Clean lock file

The prerequisites for upgrading to InsightIQ 6.1 are either a Simple or Scale deployment with 6.0.1 installed, and with a minimum of 40GB free disk space.

The actual upgrade is performed by the ‘upgrade-iiq.sh’ script:

Specific steps in the upgrade process are as follows:

  • Download and uncompress the bundle
# tar xvf iiq-install-6.1.0.tar.gz
  • Enter InsightIQ folder and un-tar upgrade scripts
# cd InsightIQ
# tar xvf upgrade.tar.gz
  • Enter upgrade scripts folder
# cd upgrade/
  • Start upgrade. Note that the usage is same for both the Simple and Scale InsightIQ deployments.
# ./upgrade-iiq.sh -m <admin_email>

Upon successful upgrade completion, InsightIQ will be accessible via the primary node’s IP address.

Online Simple-to-Scale Migration

The Online Simple-to-Scale Migration feature enables seamless migration of data and functionalities from InsightIQ version 6.0.1 to version 6.1. This process is specifically designed to support migrations from InsightIQ 6.0.1 Simple (OVA) deployments to InsightIQ 6.1 Scale deployments.

Migration is supported only from InsightIQ version 6.0.1. To proceed, the following prerequisites must be met:

  • An InsightIQ 6.0.1 Simple deployment running IIQ 6.0.1.
  • An InsightIQ Scale deployment running IIQ 6.1.0 installed and the EULA accepted.

The ‘iiq_data_migration’ script can be run as follows to initiate a migration:

# cd /usr/share/storagemonitoring/online_migration
# bash iiq_data_migration.sh

Additionally, detailed logs are available at the following locations for monitoring and verifying the migration process:

Logfile Location
Metadata Migration Log /usr/share/storagemonitoring/logs/online_migration/insightiq_online_migration.log
Cluster Data Migration Log /usr/share/storagemonitoring/logs/clustermanagement/insightiq_cluster_migration.log

 Self-service Admin Password Reset

InsightIQ 6.1 introduces a streamlined self-service password reset feature for administrators. This secure process allows admins to reset their own passwords without IT intervention.

Key features include one-time password (OTP) verification, ensuring only authorized users can reset passwords. Timeout enforcement means OTPs expire after 5 minutes for added security, and accounts are locked after five failed attempts to prevent brute-force attacks.

Note that SMTP must be configured in order to receive OTPs via email.

Flexible Alerting

InsightIQ 6.1 enhances alerting capabilities with 16 total KPIs/metrics, including 9 new ones. Key improvements include:

  • Greater granularity (beyond cluster-level alerts)
  • Support for sub-filters and breakouts
  • Multiple operators and unit-based thresholding
  • Aggregator and extended duration support

Several metrics have been transformed and/or added in version 6.1. For example:

IIQ 6.1 Metric IIQ.6.0 Metric
Active Clients ·         Active Clients NFS

·         Active Clients SMB1

·         Active Clients SMB2

Average Disk Hardware Latency
Average Disk Operation Size
Average Pending Disk Operation Count ·         Pending Disk Operation Count
Capacity ·         Drive Capacity

·         Cluster Capacity

·         Node Capacity

·         Nodepool Capacity

Connected Clients ·         Connected Clients NFS

·         Connected Clients SMB

CPU Usage ·         CPU Usage
Disk Activity
Disk Operations Rate ·         Pending Disk Operation Count
Disk Throughput Rate
External Network Errors Rate
External Network Packets Rate
External Network Throughput Rate ·         Network Throughput Equivalency
File System Throughput Rate
Protocol Operations Average Latency ·         Protocol Latency NFS

·         Protocol Latency SMB

Also, clusters can now be directly associated with alert rules:

The generated alerts page sees the addition of a new ‘Metric’ field:

For example, an alert can now be generated at the nodepool level for the metric ‘External Network Throughput Rate’:

IIQ 6.1 also includes an updated email format, as follows:

Alert Migration

The alerting system has transitioned from predefined alerting to flexible alerting. During this migration, all alert policies, associated rules, resources, notification rules, and generated alerts are automatically migrated—no additional steps are required.

Key differences include:

IIQ 6.1 Flexible Alerting IIQ 6.0 Predefined Alerting
·         Each alert rule is associated with only one cluster (1:1 mapping). ·         Alert rules and resources are tightly coupled with alert policies.
·         A policy can still have multiple rules, but resources are now linked directly to rules, not policies.

·         This results in N × M combinations of alert rules and clusters (N = resources, M = rules).

·         A single policy can be linked to multiple rules and resources.

For example, imagine the following scenario:

  • Pre-upgrade (Predefined Alerting):

An IIQ 6.0 1 instance has a policy (Policy1), which is associated with two rules (CPU Rule & Capacity Ruleand 4 clusters (Cluster1-4).

  • Post-Upgrade (Flexible Alerting):

Since only one resource can be associated with one alert rule, a separate alert rule will be created for each cluster. So, after upgrading to IIQ 6.1, Policy1 will now have four individual cluster CPU Alert rules and four individual cluster Capacity Alert rules:

If an IIQ 6.1 upgrade happens to fail due to alert migration, a backup of the predefined alerting database is automatically created. To retry the migration, run:

# bash /usr/share/storagemonotoring/scripts/retrigger_alerts_upgrade.sh

Plus, for additional context and troubleshooting, the alert migration logs can be found at:

 /usr/share/storagemonitoring/logs/alerts/alerts_migration.log

Durable Data Collection

Data collection and processing in IIQ 6.x provides both performance and fault tolerance, with the following decoupled architecture:

Component Role
Data Processor Responsible for processing and storing the data in TimescaleDB for display by Reporting service.
Temporary Datastore Stores historical statistics fetched from PowerScale cluster, in-between collection and processing.
Message Broker Facilitates inter-service communication. With the separation of data collection and data processing, this allows both services to signal to each other when their respective roles come up.
Timescale DB New database storage for the time-series data. Designed for optimized handling of historical statistics.

InsightIQ TimescaleDB database stores long-term historical data via the following retention strategy:

Telemetry data is summarized and stored in the following cascading levels, each with a different data retention period:

Level Sample Length Data Retention Period
Raw table Varies by metric type. Raw data sample lengths range from 30s to 5m. 24 hours
5m summary 5 minutes 7 days
15m summary 15 minutes 4 weeks
3h summary 3 hours Infinite

Note that the actual raw sample length may vary by graph/data type – from 30 seconds for CPU % Usage data up to 5 minutes for cluster capacity metrics.

Meanwhile, the new InsightIQ v6.1 code is available for download on the Dell Support site, allowing both the installation of and upgrade to this new release.

ObjectScale XF960

Fresh off the launch of the new ObjectScale 4.1 release, Dell Technologies has announced the general availability of the ObjectScale XF960 platform, a next-generation all-flash object storage appliance designed to meet the performance demands of AI, analytics, and unstructured data workloads. The XF960 is now ready to ship, offering a compelling blend of speed, scalability, and efficiency.

Built to support performance-driven workloads, the XF960 enables organizations to unlock the full potential of their data. Whether training complex AI models, managing large datasets, or deploying cloud-native applications, the XF960 provides the object storage substrate needed to drive innovation.

As Dell’s highest-performing object storage platform to-date, the new XF960 delivers up to 300% more read throughput, 42% more write throughput, plus 75% lower read and 42% lower write response times than the previous-generation EXF900.

The XF960 scales effortlessly from small clusters to large enterprise deployments, maintaining performance and manageability throughout. It also introduces advanced data efficiency features, including five user-configurable compression modes—LZ4, Zstandard, and Deflate among them—allowing for up to a 9:1 compression ratio on certain workloads.

Enhancing its S3 protocol compatibility, the XF960 supports push-based event notifications, up to three times faster object listing, S3FS file system mounting, and seamless integration with the latest AWS SDKs, improving both data access and developer productivity.

Designed for flexible integration, the XF960 accommodates the expansion of existing ECS environments, and is initially supported with the new ObjectScale 4.1 code which dropped last week (8/12/25).

As compared to the EXF900, the XF960 features up-rev’d hardware, including a 2RU PowerEdge R760 chassis, dual Intel Sapphire Rapids CPUs with 32 cores, 256GB DDR5 memory, and support for NVMe drives ranging from 7.68TB to 61.44TB. It also includes 100GbE front-end and back-end NICs, dual 1400W power supplies, and S5448 switches.

  EXF900 XF960
CPU Dual Intel Cascade Lake 24 Cores (165 W) Dual Intel Sapphire Rapids 32 Cores (270W)
RAM 192GB RAM per node, installed as 12x16GB DDR4 RDIMMs 256GB RAM per node, installed as 16x16G DDR5 RDIMMs
SSDs

(NVMe)

3.84TB ISE

7.68TB ISE

15.36TB TLC ISE

61.44TB QLC ISE

12 and 24 drive configurations

7.68TB TLC ISE

15.36TB TLC SED FIPS

30TB QLC ISE

61.44TB QLC ISE

6, 12 and 24 drive configurations

Front End NIC 25GbE, 100GbE 100GbE
Back End NIC 25GbE, 100GbE 100GbE
Power Dual 1100W PSUs Dual 1400W PSUs
Back/front-end Switches S5248 S5448

For existing ObjectScale all-flash customers, while it is technically possible to intermix EXF900 and XF960 nodes, it is not recommended due to performance limitations. Mixed clusters will operate at EXF900 performance levels. That said, the  next ObjectScale release, v4.2, will introduce improvements for mixed environments.

PowerScale OneFS 9.12

Dell PowerScale is already powering up the summer with the launch of the innovative OneFS 9.12 release, which shipped today (14th August 2025). This new 9.12 release has something for everyone, introducing PowerScale innovations in security, serviceability, reliability, protocols, and ease of use.

OneFS 9.12 represents the latest version of PowerScale’s common software platform for on-premises and cloud deployments. This can make it a excellent choice for traditional file shares and home directories, vertical workloads like M&E, healthcare, life sciences, financial services, plus generative and agentic AI, and other ML/DL and analytics applications.

PowerScale’s scale-out architecture can be deployed on-site, in co-lo facilities, or as customer-managed PowerScale for Amazon AWS and Microsoft Azure deployments, providing core to edge to cloud flexibility, plus the scale and performance needed to run a variety of unstructured workflows on-prem or in the public cloud.

With data security, detection, and monitoring being paramount in this era of unprecedented cyber threats, OneFS 9.12 brings an array of new features and functionality to keep your unstructured data and workloads more available, manageable, and secure than ever.

Protocols

On the S3 object protocol front, OneFS 9.12 sees the debut of new security and immutability functionality. S3 Object Lock extends the standard AWS S3 Object Lock model with PowerScale’s own ‘Bucket-Lock’ protection mode semantics. Object Lock capabilities can operate on a per-zone basis and per-bucket, using the cluster’s compliance clock for the date and time evaluation of object’s retention. Additionally, S3 protocol access logging and bucket logging are also enhanced in this new 9.12 release.

Networking

As part of PowerScale’s seamless protocol failover experience for customers, OneFS 9.12 sees SmartConnect’s default IP allocation method for new pools move to ‘dynamic’. While SMB2 and SMB3 are the primary focus, all protocols benefit from this enhancement, including SMB, NFS, S3, and HDFS. Legacy pools will remain unchanged upon upgrade to 9.12, but any new pools will automatically be provisioned as dynamic (unless manually configured as ‘static’).

Security

In the interests of increased security and ransomware protection, OneFS 9.12 includes new Secure Snapshots functionality. Secure Snapshots provide true snapshot immutability, as well as protection for snapshot schedules, in order to protect against alteration or deletion, either accidentally or by a malicious actor.

Secure snapshots are built upon Multi-party Authorization (MPA), also introduced in OneFS 9.12. MPA prevents an individual administrator from executing privileged operations, such as configuration changes on snapshots and snapshot schedules, by requiring two or more trusted parties to sign off on a requested change for the privileged actions within a PowerScale cluster.

OneFS 9.12 also introduces support for common access cards (CAC) and personal identity verification (PIV) smart cards, providing physical multi-factor authentication (MFA), allowing users to SSH to a PowerScale cluster using the same security badge that grants them access into their office. In addition to US Federal mandates, CAC/PIV integration is a requirement for many security conscious organizations across the public and private sectors.

Upgrade

One-click upgrades in OneFS 9.12 allow a cluster to automatically display and download available  trusted upgrade packages from Dell Support, which can be easily applied via ‘one click installation’ from the OneFS WebUI or CLI. Upgrade package versions are automatically managed by Dell in accordance with a cluster’s telemetry data.

Support

OneFS 9.12 introduces an auto-healing capability, where the cluster detects problems using the HealthCheck framework and automatically executes a repair action for known issues and failures. This helps to increase cluster availability and durability, while reducing the time to resolution and the need for technical support engagements. Furthermore, additional repair-actions can be added at any point, outside of the general OneFS release cycle.

Hardware Innovation

On the platform hardware front, OneFS 9.12 also introduces an HDR Infiniband front-end connectivity option for the PowerScale PA110 performance and backup accelerator. Plus, 9.12 also brings a fast reboot enhancement to the high-memory PowerScale F-series nodes.

In summary, OneFS 9.12 brings the following new features and functionality to the Dell PowerScale ecosystem:

Area Feature
Networking ·         SmartConnect dynamic allocation as the default.
Platform ·         PowerScale PA110 accelerator front-end Infiniband support.

·         Conversion of front-end Ethernet to Infiniband support for F710 & F910.

·         F-series fast reboots.

Protocol ·         S3 Object Lock.

·         S3 Immutable SmartLock bucket for tamper-proof objects.

·         S3 protocol access logging.

·         S3 bucket logging.

Security ·         Multi-party authorization for privileged actions.

·         CAC/PIV smartcard SSH access.

·         Root lockdown mode.

·         Secure Snapshots with MPA override to protect data when retention period has not expired.

Support ·         Custer-level inventory request API.

·         In-field support for back-end NIC changes.

Reliability ·         Auto Remediation self-diagnosis and healing capability.
Upgrade ·         One-click upgrade.

We’ll be taking a deeper look at OneFS 9.12’s new features and functionality in future blog articles over the course of the next few weeks.

Meanwhile, the new OneFS 9.12 code is available on the Dell Support site, as both an upgrade and reimage file, allowing both installation and upgrade of this new release.

For existing clusters running a prior OneFS release, the recommendation is to open a Service Request with to schedule an upgrade. To provide a consistent and positive upgrade experience, Dell is offering assisted upgrades to OneFS 9.12 at no cost to customers with a valid support contract. Please refer to Knowledge Base article KB544296 for additional information on how to initiate the upgrade process.

ObjectScale 4.1

Hot off the press comes ObjectScale version 4.1 – a major release of Dell’s enterprise-grade object storage platform. As a foundational component of the Dell AI Data Platform, ObjectScale 4.1 delivers enhanced scalability, performance, and resilience that’s engineered to meet the evolving demands of AI-driven workloads and modern data ecosystems.

This release is available as a software upgrade for existing ECS and ObjectScale environments, and the core new features and functionality introduced in this ObjectScale 4.1 release include:

Storage Efficiency and Operational Experience

On the storage efficiency and operation experience front, ObjectScale 4.1 introduces support for multiple compression modes including LZ4, Zstandard, Deflate, and Snappy, configurable via both the UI and API. This flexibility allows admins to fine-tune compression strategies to balance performance, cost, and workload characteristics.

Post-upgrade to ObjectScale 4.1, the default algorithms are updated to LZ4 for AFA appliances (EXF900 and XF960) and Zstandard for HDD appliances (EX300, EX3000, EX500, EX5000, X560). Storage admins can change the algorithm at any time via the UI or API, based on workload or use case.

Improved garbage collection throughput enables faster reclamation of deleted capacity. Enhanced monitoring, alerting, and logging tools provide greater visibility into background processes, contributing to overall cluster stability.

An updated dashboard offers refined views of user, available, and reserved capacity. Automated alerts notify administrators when usage exceeds 90%, indicating a transition to Read-Only mode for the affected Virtual Data Center (VDC).

New port-level bandwidth controls for replication traffic allow for more predictable performance and optimized resource allocation across distributed environments.

Security and Data Protection

Within the security and data protection realm, ObjectScale now provides support for Self-Encrypting Drives (SEDs) with local key management via Dell iDRAC. This ensures hardware-level encryption and secure, appliance-local key handling for enhanced data protection.

TLS 1.3, the latest version of the Transport Layer Security protocol, is also supported in ObjectScale 4.1. This upgrade delivers stronger encryption, faster handshakes, and the removal of legacy algorithms, improving both control and data path security.

Expanded Capabilities for Modern Workloads

ObjectScale 4.1 now offers up to 3x faster object listing performance in multi-VDC environments. This enhancement improves data browsing and discovery, with better handling of deleted metadata and validation of Untrusted Listing Keys.

Through webhook-based APIs, ObjectScale can now push real-time notifications to external applications when events such as object creation, deletion, or modification occur—enabling responsive, event-driven architectures.

Support for S3FS in 4.1 allows users to mount S3 buckets on Linux systems as local file systems. This simplifies access and management, particularly for legacy applications that rely on traditional file system operations.

On the integration front, ObjectScale 4.1 is compatible with the latest AWS SDK v2.29, so Java developers can immediately use new S3 features and performance fixes in their applications, and build cloud-native applications with full access to modern AWS features and APIs.

The following hardware platforms are supported by the new ObjectScale 4.1 release:

Gen 2 systems U480E, U400T, U400E, U4000, U400, U2800, U2000, D6200, D5600, D4500
Gen 3 systems EX3000, EX300, EXF900, EX5000, EX500
Gen 4 systems X560, XF960

Note that upgrading to ObjectScale 4.1 is only supported from ECS 3.8.x and 4.0.x releases.

In summary, ObjectScale 4.1 represents a strategic advancement in Dell’s commitment to delivering intelligent, secure, and scalable storage solutions for the AI era. Whether upgrading existing infrastructure or deploying new systems, this new 4.1 release empowers organizations to meet the challenges of data growth, complexity, and innovation with confidence.

OneFS SmartSync Backup-to-Object Management and Troubleshooting

As we saw in the previous articles in this series, SmartSync in OneFS 9.11 enjoys the addition of backup-to-object functionality, which delivers high performance, full-fidelity incremental replication to ECS, ObjectScale, Wasabi, and AWS S3 & Glacier IR object stores.

This new SmartSync backup-to-object functionality supports the full spectrum of OneFS path lengths, encodings, and file sizes up to 16TB – plus special files and alternate data streams (ADS), symlinks and hardlinks, sparse regions, and POSIX and SMB attributes.

In addition to the standard ‘isi dm’ command set, the following CLI utility can also come in handy for tasks such as verifying the dataset ID for restoration, etc:

# isi_dm browse

For example, to query the SmartSync accounts and datasets:

# isi_dm browse

<no account>:<no dataset> $ list-accounts

000000000000000100000000000000000000000000000000 (tme-tgt)

ec2a72330e825f1b7e68eb2352bfb09fea4f000000000000 (DM Local Account)

fd0000000000000000000000000000000000000000000000 (DM Loopback Account)

<no account>:<no dataset> $ connect-account 000000000000000100000000000000000000000000000000

tme-tgt:<no dataset> $ list-datasets

1       2025-07-22T10:23:33+0000        /ifs/data/zone3

2       2025-07-22T10:23:33+0000        /ifs/data/zone4

1025    2025-07-22T10:25:01+0000        /ifs/data/zone3

2049    2025-07-22T10:30:04+0000        /ifs/data/zone4

tme-tgt:<no dataset> $ connect-dataset 2

tme-tgt:2 </ifs/data/zone4:> $ ls

home                           [dir]

zone2_sync1753179349           [dir]

tme-tgt:2 </ifs/data/zone4:> $ cd zone2_sync1753179349

tme-tgt:2 </ifs/data/zone4:zone2_sync1753179349/> $ ls

home                           [dir]

tme-tgt:2 </ifs/data/zone4:zone2_sync1753179349/> $

Or for additional detail:

tme-tgt:2 </ifs/data/zone4:zone2_sync1753179349/> $ settings output-to-file-on /tmp/out.txt

tme-tgt:2 </ifs/data/zone4:zone2_sync1753179349/> $ settings verbose-on

tme-tgt:2 </ifs/data/zone4:zone2_sync1753179349/> $ list-datasets

1       2025-07-22T10:23:33+0000        /ifs/data/zone3 { dmdi_tree_id={ dmdti_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdti_local_unid=1 } dmdi_revision={ dmdr_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdr_local_unid=1 } }

2       2025-07-22T10:23:33+0000        /ifs/data/zone4 { dmdi_tree_id={ dmdti_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdti_local_unid=2 } dmdi_revision={ dmdr_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdr_local_unid=2 } }

1025    2025-07-22T10:25:01+0000        /ifs/data/zone3 { dmdi_tree_id={ dmdti_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdti_local_unid=1 } dmdi_revision={ dmdr_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdr_local_unid=3 } }

2049    2025-07-22T10:30:04+0000        /ifs/data/zone4 { dmdi_tree_id={ dmdti_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdti_local_unid=2 } dmdi_revision={ dmdr_system_guid={dmg_guid=0060486e3954c1b470687f084aa83df6c07d} dmdr_local_unid=4 } }

But when it comes to monitoring and troubleshooting SmartSync, there are a variety of diagnostic tools available. These include:

Component Tools Issue
Logging ·         /var/log/isi_dm.log

·         /var/log/messages

·         ifs/data/Isilon_Support/datamover/transfer_failures/baseline_failures_ <jobid>

General SmartSync info and  triage.
Accounts ·         isi dm accounts list / view Authentication, trust and encryption.
CloudCopy ·         S3 Browser (ie. Cloudberry), Microsoft Azure Storage Explorer Cloud access and connectivity.
Dataset ·         isi dm dataset list/view Dataset creation and health.
File system ·         isi get Inspect replicated files and objects.
Jobs ·         isi dm jobs list/view

·         isi_datamover_job_status -jt

Job and task execution, auto-pausing, completion, control, and transfer.
Network ·         isi dm throttling bw-rules list/view

·         isi_dm network ping/discover

Network connectivity and throughput.
Policies ·         isi dm policies list/view

·         isi dm base-policies list/view

Copy and dataset policy execution and transfer.
Service ·         isi services -a isi_dm_d <enable/disable> Daemon configuration and control.
Snapshots ·         isi snapshot snapshots list/view Snapshot execution and access.
System ·         isi dm throttling settings CPU load and system performance.

SmartSync info and errors are typically written to /var/log/isi_dm.log and /var/log/messages, while DM jobs transfer failures generate a log specific to the job ID under /ifs/data/Isilon_Support/datamover/transfer_failures.

Once a policy is running, the job status is reported via ‘isi dm jobs list’. Once complete, job histories are available by running ‘isi dm historical jobs list’. More details for a specific job can be gleaned from the ‘isi dm job view’ command, using the pertinent job ID from the list output above. Additionally, the ‘isi_datamover_job_status’ command with the job ID as an argument will also supply detailed information about a specific job.

Once running, a DM job can be further controlled via the ‘isi dm jobs modify’ command, and available actions include cancel, partial-completion, pause, or resume.

If a certificate authority (CA) is not correctly configured on a PowerScale cluster, the SmartSync daemon will not start, even though accounts and policies can still be configured. Be aware that the failed policies will not be reported via ‘isi dm jobs list’ or ‘isi dm historical-jobs list’ since they never started. Instead, an improperly configured CA is reported in the /var/log/isi_dm.log as follows:

Certificates not correctly installed, Data Mover service sleeping: At least one CA must be installed: No such file or directory from dm_load_certs_from_store (/b/mnt/src/isilon/lib/isi_dm/isi_dm_remote/src/rpc/dm_tls.cpp:197 ) from dm_tls_init (/b/mnt/src/isilon/lib/isi_dm/isi_dm_remote/src/rpc/dm_tls.cpp:279 ): Unable to load certificate information

Once a CA and identity are correctly configured, the SmartSync service automatically activates. Next, SmartSync attempts a handshake with the target. If the CA or identity is mis-configured, the handshake process fails, and generates an entry in /var/log/isi_dm.log. For example:

2025-07-30T12:38:17.864181+00:00 GEN-HOP-NOCL-RR-1(id1) isi_dm_d[52758]: [0x828c0a110]: /b/mnt/src/isilon/lib/isi_dm/isi_dm_remote/src/acct_mon.cpp:dm_acc tmon_try_ping:348: [Fiber 3778] ping for account guid: 0000000000000000c4000000000000000000000000000000, result: dead

Note that the full handshake error detail is logged if the SmartSync service (isi_dm_d) is set to log at the ‘info’ or ‘debug’ level using isi_ilog:

# isi_ilog -a isi_dm_d --level info+

Valid ilog levels include:

fatal error err notice info debug trace

error+ err+ notice+ info+ debug+ trace+

A copy or repeat-copy policy requires an available dataset for replication before running. If a dataset has not been successfully created prior to the copy or repeat-copy policy job starting for the same base path, the job is paused. In the following example, the base path of the copy policy is not the same as that of the dataset policy, hence the job fails with a “path doesn’t match…” error.

# ls -l /ifs/data/Isilon_support/Datamover/transfer_failures

Total 9

-rw-rw----   1 root  wheel  679  July 20 10:56 baseline_failure_10

# cat /ifs/data/Isilon_support/Datamover/transfer_failures/baseline_failure_10

Task_id=0x00000000000000ce, task_type=root task ds base copy, task_state=failed-fatal path doesn’t match dataset base path: ‘/ifs/test’ != /ifs/data/repeat-copy’:

from bc_task)initialize_dsh (/b/mnt/src/isilon/lib/isi_dm/isi_dm/src/ds_base_copy

from dmt_execute (/b/mnt/src/isilon/lib/isi_dm/isi_dm/src/ds_base_copy_root_task

from dm_txn_execute_internal (/b/mnt/src/isilon/lib/isi_dm/isi_dm_base/src/txn.cp

from dm_txn_execute (/b/mnt/src/isilon/lib/isi_dm/isi_dm_base/src/txn.cpp:2274)

from dmp_task_spark_execute (/b/mnt/src/isilon/lib/isi_dm/isi_dm/src/task_runner.

Once any errors for a policy have been resolved, the ‘isi dm jobs modify’ command can be used to resume the job.

OneFS SmartSync Backup-to-Object Configuration

As we saw in the previous article in this series, SmartSync in OneFS 9.11 sees the addition of backup-to-object, which provides high performance, full-fidelity incremental replication to ECS, ObjectScale, Wasabi, and AWS S3 & Glacier IR object stores.

This new SmartSync backup-to-object functionality supports the full spectrum of OneFS path lengths, encodings, and file sizes up to 16TB – plus special files and alternate data streams (ADS), symlinks and hardlinks, sparse regions, and POSIX and SMB attributes. Specifically:

Copy-to-object (OneFS 9.10 & earlier) Backup-to-object (OneFS 9.11)
·         One-time file system copy to object

·         Baseline replication only, no support for incremental copies

·         Browsable/accessible filesystem-on-object representation

·         Certain object limitations

o   No support for sparse regions and hardlinks

o   Limited attribute/metadata support

o   No compression

·         Full-fidelity file system baseline & incremental replication to object

o   Supports ADS, special files, symlinks, hardlinks, sparseness, POSIX/NT attributes, and encoding

o   Any file size and any path length

·         Fast incremental copies

·         Compact file system snapshot representation in native cloud

·         Object representation

o   Grouped by target base-path in policy configuration

o   Further grouped by Dataset ID, Global File ID

SmartSync backup-to-object operates on user-defined data set, which are essentially OneFS file system snapshots with plus additional properties.

A data set creation policy takes snapshots and creates a data set out of it. Additionally, there are also copy and repeat copy policies which are the policies that will transfer that data set to another system. And the execution of these two policy types can be linked and scheduled separately. So you can have one schedule for your data set creation, say to create a data set every hour on a particular path. And you can have a tiered or different distribution system for the actual copy itself. For example, to copy every hour to a hot DR cluster in data center A. But also copy every month to a deep archive cluster in data center B. So all these things are possible now, without increasing the bloat of snapshots on the system, since they’re now able to be shared.

Currently, SmartSync does not have a WebUI presence, so all its configuration is either via the command-line or platform API.

Here’s the procedure for crafting a baseline replication config:

Essentially, create the replication account, which in OneFS 9.11 will be either Dell ECS or Amazon AWS. Then configure that dataset creation policy, run it, and, if desired, create a repeat-copy policy. These specific steps with their CLI syntax include:

  1. Create a replication account:
# isi dm account create --account-type [AWS_S3 | ECS_S3]
  1. Configure a dataset creation policy
# isi dm policies create [Policy Name] --policy-type CREATION
  1. Run the dataset creation policy:
# isi dm policies list

# isi dm policies modify [Creation policy id] –-run-now=true

# isi dm jobs list

# isi dm datasets list
  1. create a repeat-copy policy
# isi dm policies create [Policy Name] --policy-type=' REPEAT_COPY'
  1. Run the repeat-copy policy:
# isi dm policies list

# isi dm policies modify [Repeat-copy policy id] –-run-now=true
  1. View the data replication job status
# isi dm jobs list

Similarly for an incremental replication config:

Note that the dataset creation policy and repeat-copy policy are already created in the baseline replication configure and can be ignored.

Incremental replication using the dataset create and repeat-copy policies from the previous slide’s baseline config.

  1. Run the dataset creation policy
# isi dm policies list

# isi dm policies modify [Creation policy id] –-run-now=true

# isi dm jobs list

# isi dm datasets list
  1. Run the repeat-copy policy:
# isi dm policies list

# isi dm policies modify [Repeat-copy policy id] –-run-now=true
  1. View the data replication incremental job status
# isi dm jobs list

And here’s the basic procedure for creating and running a partial or full restore:

Note that the replication account is already created on the original cluster and the creation step can be ignored.  Replication account creation is only required if restoring the dataset to a new cluster.

Additionally, partial restoration involves a subset of the directory structure, specified via the ‘source path’ , whereas full restoration invokes a restore of the entire dataset.

The process includes creating the replication account if needed, finding the ID of the dataset to be restored, creating and running the partial or full restoration policy, and checking the job status to verify it ran successfully.

  1. Create a replication account:
# isi dm account create --account-type [AWS_S3 | ECS_S3]

For example:

# isi dm account create --account-type ECS_S3 --name [Account Name] --access-id [access-id] --uri [URI with bucket-name] --auth-mode CLOUD --secret-key [secret-key] --storage-class=[For AWS_S3 only: STANDARD or GLACIER_IR]
  1. Verify the dataset ID for restoration:
# isi_dm browse

Checking the following attributes:

  • list-accounts
  • connect-account [Source Account ID created in step 1]
  • list-datasets
  • connect-dataset [Dataset id]
  1. Create a partial or full restoration policy
# isi dm policies create [Policy Name] --policy-type='COPY'
  1. Run the partial or full restoration policy:
# isi dm policies modify [Restoration policy id] –-run-now=true
  1. View the data restoration job status
# isi dm jobs list

OneFS 9.11 also introduces recovery point objective or RPO alerts for SmartSync, but note that these are for repeat-copy policies only. These RPO alerts can be configured through the replication policy by adding the desired time value to the ‘repeat-copy-rpo-alert’ parameter. If this configured threshold is exceeded, an RPO alert is triggered. This RPO alert is automatically resolved after the next successful policy job run.

Also be aware that the default time value for a repeat copy RPO is zero, which instructs SmartSync to not generate RPO alerts for that policy.

The following CLI syntax can be used to create a replication policy, with the ‘–repeat-copy-rpo-alert’ flag set for the desired time:

# isi dm policies create [Policy Name] --policy-type=' REPEAT_COPY' --enabled='true' --priority='NORMAL' --repeat-copy-source-base-path=[Source Path] --repeat-copy-base-base-account-id=[Source account id] --repeat-copy-base-source-account-id=[Source account id] --repeat-copy-base-target-account-id=[Target account id] --repeat-copy-base-new-tasks-account=[Source account id] --repeat-copy-base-target-dataset-type='FILE_ON_OBJECT_BACKUP' --repeat-copy-base-target-base-path=[Bucket Name] --repeat-copy-rpo-alert=[time]

And similarly to change the RPO alert configuration on an existing replication policy:

# isi dm policies modify [Policy id] --repeat-copy-rpo-alert=[time]

An alert is triggered and corresponding CELOG event created if the specified RPO for the policy is exceeded. For example:

# isi event list

ID   Started     Ended       Causes Short                     Lnn  Events  Severity

--------------------------------------------------------------------------------------

1898 07/15 00:00 07/15 00:00 SW_CELOG_HEARTBEAT               1    1       information

2012 07/15 06:03 --          SW_DM_RPO_EXCEEDED               2    1       warning

--------------------------------------------------------------------------------------

And then once RPO alert has been resolved after a successful replication policy job run:

# isi event list

ID   Started     Ended       Causes Short                     Lnn  Events  Severity

--------------------------------------------------------------------------------------

1898 07/15 00:00 07/15 00:00 SW_CELOG_HEARTBEAT               1    1       information

2012 07/15 06:03 07/15 06:12 SW_DM_RPO_EXCEEDED               2    2       warning

--------------------------------------------------------------------------------------

OneFS SmartSync Backup-to-Object

Another significant benefactor of new functionality in the recent OneFS 9.11 release is SmartSync. As you may recall, SmartSync allows multiple copies of a dataset to be copied, replicated, and stored across locations and regions, both on and off-prem, providing increased data resilience and the ability to rapidly recover from catastrophic events.

In addition to fast, efficient, scalable protection with granular recovery, SmartSync allows organizations to utilize lower cost object storage as the target for backups, reduce data protection complexity and cost by eliminating the need for separate backup applications. Plus disaster recovery options include restoring a dataset to its original state, or cloning a new cluster.

SmartSync sees the following enhancements in OneFS 9.11:

  • Automated incremental-forever replication to object storage.
  • Unparalleled scalability and speed, with seamless pause/resume for robust resiliency and control.
  • End-to-end encryption for security of data-in-flight and at rest.
  • Complete data replication, including soft/hard links, full file paths, and sparse files
  • Object storage targets: AWS S3, AWS Glacier IR, Dell ECS/ObjectScale, and Wasabi (with the addition of Azure and GCP support in a future release).

But first, a bit of background. Introduced back in OneFS 9.4, SmartSync operates in two distinct modes:

  • Regular push-and-pull transfer of file data between PowerScale clusters.
  • CloudCopy, copying of file-to-object data from a source cluster to a cloud object storage target.

CloudCopy copy-to-object in OneFS 9.10 and earlier releases is strictly a one-time copy tool, rather than a replication utility. So, after a copy, viewing the bucket contents from AWS, console or S3 browser yielded an object format tree-like representation of the OneFS file system. However, there were a number of significant shortcomings, such as no native support for attributes like ACLs, or certain file types like character files, and no method to represent hard links in a reasonable way. So OneFS had to work around these things by expanding hard links, and redirecting objects that had too long of a path. The other major limitation was that it really had just been a one-and-done copy. After creating and running a policy, once the job had completed the data was in the cloud, and that was it. OneFS had no provision for any incremental transfer of any subsequent changes to the cloud copy when the source data changed.

In order to address these limitations, SmartSync in OneFS 9.11 sees the addition of backup-to-object functionality. This includes a full-fidelity file system baseline, plus fast incremental replication to Dell ECS and ObjectScale, Wasabi, and AWS S3 and Glacier IR object stores.

This new backup-to-object functionality supports the full range of OneFS path lengths, encodings, and file sizes up to 16TB – plus special files and alternate data streams (ADS), symlinks and hardlinks, sparse regions, and POSIX and SMB attributes.

Copy-to-object (OneFS 9.10 & earlier):

  • One-time file system copy to object.
  • Baseline replication only, with no support for incremental copies.
  • Browsable/accessible filesystem-on-object representation.
  • Certain object limitations:
      o   No support for sparseness and hardlinks.
      o   Limited attribute/metadata support.
      o   No compression.

Backup-to-object (OneFS 9.11):

  • Full-fidelity file system baseline and incremental replication to object:
      o   Supports ADS, special files, symlinks, hardlinks, sparseness, POSIX/NT attributes, and encoding.
      o   Any file size and any path length.
  • Fast incremental copies.
  • Compact file system snapshot representation in native cloud.
  • Object representation:
      o   Grouped by target basepath in policy configuration.
      o   Further grouped by Dataset ID and Global File ID.

Architecturally, SmartSync incorporates the following concepts:

Concept Description
Account References to systems that participate in jobs (PowerScale clusters, cloud hosts). Made up of a name, a URI, and auth info.
Dataset Abstraction of a filesystem snapshot; the entity that gets copied between systems. Identified by a Dataset ID.
Global File ID Conceptually a global LIN that references a specific file on a specific system.
Policy A dataset creation policy creates a dataset. Copy/repeat-copy policies take an existing dataset and put it on another system. Policy execution can be linked and scheduled.
Push/Pull, Cascade/Reconnect Clusters syncing to each other in sequence (A>B>C). Clusters can skip the baseline copy and directly perform incremental updates (A>C). Clusters can both request and send datasets.
Transfer resiliency Small errors don’t need to halt a policy’s progress.

Under the hood, SmartSync uses the concept of a dataset, which is fundamentally an abstraction of a OneFS file system snapshot, albeit with some additional properties attached to it.

Each dataset is identified by a unique ID. With this notion of datasets, OneFS can perform both an A-to-B replication and an A-to-C replication, that is, two replications of the same dataset to two different targets. B and C can then also reference each other and perform incremental replication between themselves, provided they share a common ancestor snapshot.

A SmartSync dataset creation policy takes a snapshot and creates a dataset from it. Additionally, there are copy and repeat-copy policies, which are used to transfer that dataset to another system. The execution of these two policy types can be linked and scheduled separately. One schedule can govern dataset creation, say creating a dataset every hour on a particular path, while another schedule drives the actual copies. For example, a dataset could be copied hourly to a hot DR cluster in data center A and monthly to a deep archive cluster in data center B, all without proliferating snapshots on the source system, since the datasets can now be shared.
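For reference, SmartSync is driven from the OneFS CLI via the 'isi dm' (Datamover) command set. The following is a rough sketch only; sub-commands, options, and output columns vary by release. For example, to list the accounts (participating clusters and cloud targets):

# isi dm accounts list

To view existing datasets, plus the policies that create and copy them:

# isi dm datasets list
# isi dm policies list

And to check on policy execution:

# isi dm jobs list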

SmartSync in OneFS 9.11 also introduces the foundational concept of a global file ID (GFID), which is essentially a global LIN that represents a specific file on a particular system. OneFS can use a GFID, in combination with a dataset, to reference a file anywhere and guarantee that it means the same thing on every cluster.

Security-wise, each SmartSync daemon has an identity certificate that acts as both a client and server certificate depending on the direction of the data movement. This identity certificate is signed by a non-public certificate authority. To establish trust between two clusters, they must have each other’s CAs. These CAs may be the same. Trust groups (daemons that may establish connections to each other) are formed by having shared CAs installed.
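For example, the cluster's TLS certificate store can be reviewed from the CLI. Treat the following as a general illustration only; depending on the release, the SmartSync daemon's identity certificates and CAs may be surfaced under a Datamover-specific namespace rather than the generic 'isi certificate' commands shown here:

# isi certificate authority list
# isi certificate server list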

There are no usernames or passwords; authentication is authorization for V1. All cluster-to-cluster communication is performed via TLS-encrypted traffic. If absolutely necessary, encryption (but not authorization) can be disabled by setting a ‘NULL’ encryption cipher for specific use cases that require unencrypted traffic.

The SmartSync daemon supports checking certificate revocation status via the Online Certificate Status Protocol (OCSP). If the cluster is hardened and/or in FIPS-compliant mode, OCSP checking is forcibly enabled and set to the Strict stringency level, where any failure in OCSP processing results in a failed TLS handshake. Otherwise, OCSP checking can be totally disabled or set to a variety of values corresponding to desired behavior in cases where the responder is unavailable, the responder does not have information about the cert in question, and where information about the responder is missing entirely. Similarly, an override OCSP responder URI is configurable to support cases where preexisting certificates do not contain responder information.

SmartSync also supports a ‘strict hostname check’ option which mandates that the common name and/or subject alternative name fields of the peer certificate match the URI used to connect to that peer. This option, along with strict OCSP checking and disabling the null cipher option, are forcibly set when the cluster is operating in a hardened or FIPS-compliant mode.

For object storage connections, SmartSync uses ‘isi_cloud_api’ just as CloudPools does. As such, the considerations that apply to CloudPools also apply to SmartSync.
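For instance, because the underlying plumbing is shared, the familiar CloudPools CLI can serve as a useful sanity check for settings SmartSync will inherit, such as configured cloud accounts and any HTTP proxies. This is purely illustrative; SmartSync's own object accounts are configured via the Datamover rather than CloudPools:

# isi cloud accounts list
# isi cloud proxies list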

In the next article in this series, we’ll turn our attention to the core architecture and configuration of SmartSync backup-to-object.

PowerScale H and A-series Journal Mirroring and Hardware Resilience

The last couple of articles generated several questions from the field around durability and resilience in the newly released PowerScale H710/0 and A310/0 nodes. In this article, we’ll take a deeper look at the OneFS journal and boot drive mirroring functionality in these H and A-series platforms.

PowerScale chassis-based hardware, such as the new H710/7100 and A310/3100, stores the local filesystem journal and its mirror on persistent, battery-backed flash media within each node, with a 4RU PowerScale chassis housing four nodes. Each node comprises a ‘compute node’ enclosure for the CPU, memory, and network cards, plus associated drive containers, or sleds.

The PowerScale H and A-series employ a node-pair architecture to dramatically increase system reliability, with each pair of nodes residing within a chassis power zone. This means that if a node’s PSU fails, the peer PSU supplies redundant power. It also drives a minimum cluster or node pool size of four nodes (one chassis) for the PowerScale H and A-series platforms, pairwise node population, and the need to scale the cluster two nodes at a time.

A node’s file system journal is protected against sudden power loss or hardware failure by OneFS’ journal vault functionality – otherwise known as ‘powerfail memory persistence’, or PMP. PMP automatically stores both the local journal and journal mirror on a separate flash drive across both nodes in a node pair:

This journal de-staging process is known as ‘vaulting’, during which the journal is protected by a dedicated battery in each node until it’s safely written from DRAM to SSD on both nodes in a node-pair. With PMP, constant power isn’t required to protect the journal in a degraded state since the journal is saved to M.2 flash and mirrored on the partner node.

The mirrored journal comprises both hardware and software components, including the following constituent parts:

Journal Hardware Components

  • System DRAM
  • M.2 vault flash
  • Battery Backup Unit (BBU)
  • Non-Transparent Bridge (NTB) PCIe link to partner node
  • Clean copy on disk

Journal Software Components

  • Power-fail Memory Persistence (PMP)
  • Mirrored Non-volatile Interface (MNVI)
  • IFS Journal + Node State Block (NSB)
  • Utilities

Asynchronous DRAM Refresh (ADR) preserves RAM contents when the operating system is not running. ADR is important for preserving RAM journal contents across reboots, and it does not require any software coordination to do so.

The journal vaulting functionality encompasses the hardware, firmware, and operating system, ensuring that the journal’s contents are preserved across power failure. The mechanism is similar to the software journal mirroring employed on the PowerScale F-series nodes, albeit using a PCIe-based NTB on the chassis based platforms, instead of using the back-end network as with the all-flash nodes.

On power failure, the PMP vaulting functionality is responsible for copying both the local journal and the local copy of the partner node’s journal to persistent flash. On restoration of power, PMP is responsible for restoring the contents of both journals from flash to RAM, and notifying the operating system.

A single dedicated 480GB NVMe flash device (nvd0) is attached via an M.2 slot on the motherboard of the H710/0 and A310/0 node’s compute module, residing under the battery backup unit (BBU) pack.

This is in contrast to the prior H and A-series chassis generations, which used a 128GB SATA M.2 device (/dev/ada0).

For example, the following CLI commands show the NVMe M.2 flash device in an A310 node:

# isi_hw_status | grep -i prod
Product: A310-4U-Single-96GB-1x1GE-2x25GE SFP+-60TB-1638GB SSD-SED

# nvmecontrol devlist
 nvme0: Dell DN NVMe FIPS 7400 RI M.2 80 480GB
    nvme0ns1 (447GB)

# gpart show | grep nvd0
=>       40  937703008  nvd0  GPT  (447G)

# gpart show -l nvd0
=>       40  937703008  nvd0  GPT  (447G)
         40       2008        - free -  (1.0M)
       2048   41943040     1  isilon-pmp  (20G)
   41945088  895757960        - free -  (427G)

In the above, the ‘isilon-pmp’ partition on the M.2 flash device is used by the file system journal for its vaulting activities.

The NVMe M.2 device is housed on the node compute module’s riser card, and its firmware is managed by the OneFS DSP (drive support package) framework:

Note that the entire compute module must be removed in order for its M.2 flash to be serviced. If the M.2 flash does need to be replaced for any reason, it will be properly partitioned and the PMP structure will be created as part of arming the node for vaulting.
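As a quick check, the firmware levels of devices under DSP management can be listed from the CLI. Note that whether the M.2 journal device itself appears in this listing depends on the platform and DSP version, so treat this as illustrative:

# isi devices drive firmware list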

For clusters using data-at-rest encryption (DARE), an encrypted M.2 device is used, in conjunction with SED data drives, to provide full FIPS compliance.

The battery backup unit (BBU), when fully charged, provides enough power to vault both the local and partner journal during a power failure event:

A single battery is utilized in the BBU, which also supports back-to-back vaulting:

On the software side, the journal’s Power-fail Memory Persistence (PMP) provides an equivalent of the NVRAM controller’s vault/restore capabilities to preserve the journal. The PMP partition on the M.2 flash drive provides an interface between the OS and firmware.

If a node boots and its primary journal is found to be invalid for whatever reason, it has three paths for recourse:

  • Recover journal from its M.2 vault.
  • Recover journal from its disk backup copy.
  • Recover journal from its partner node’s mirrored copy.

The mirrored journal must guard against rolling back to a stale copy of the journal on reboot, which necessitates storing information about the state of journal copies outside the journal itself. As such, the Node State Block (NSB) is a persistent disk block that stores local and remote journal status (clean/dirty, valid/invalid, etc), as well as other non-journal information, ensuring that a node does not revert to a stale journal copy upon reboot.

Here’s the detail of an individual node’s compute module:

Of particular note is the ‘journal active’ LED, which is displayed as a white ‘hand icon’:

When this white hand icon is illuminated, it indicates that the mirrored journal is actively vaulting, and it is not safe to remove the node!

There is also a blue ‘power’ LED, and a yellow ‘fault’ LED per node. If the blue LED is off, the node may still be in standby mode, in which case it may still be possible to pull debug information from the baseboard management controller (BMC).

The flashing yellow ‘fault’ LED has several state indication frequencies:

Blink Speed Blink Frequency Indicator
Slow blink ¼ Hz BIOS
Medium blink 1 Hz Extended POST
Fast blink 4 Hz Booting OS
Off Off OS running

The mirrored non-volatile interface (MNVI) sits below /ifs and above RAM and the NTB, providing the abstraction of a reliable memory device to the /ifs journal. MNVI is responsible for synchronizing journal contents to peer node RAM, at the direction of the journal, and persisting writes to both systems while in a paired state. It upcalls into the journal on NTB link events, and notifies the journal of operation completion (mirror sync, block IO, etc). For example, when rebooting after a power outage, a node automatically loads the MNVI. It then establishes a link with its partner node and synchronizes its journal mirror across the PCIe Non-Transparent Bridge (NTB).

The Non-transparent Bridge (NTB) connects node pairs for OneFS Journal Replica:

The NTB Link itself is PCIe Gen3 X8, but there is no guarantee of NTB interoperability between different CPU generations. As such, the H710/0 and A310/0 use version 4 of the NTB driver, whereas the previous hardware generation uses NTBv3. This therefore means mixed-generation node pairs are unsupported.

Prior to mounting the /ifs file system, OneFS locates a valid copy of the journal from one of the following locations in order of preference:

Order Journal Location Description
1st Local disk A local copy that has been backed up to disk
2nd Local vault A local copy of the journal restored from Vault into DRAM
3rd Partner node A mirror copy of the journal from the partner node

Assuming the node was shut down cleanly, it will boot using a local disk copy of the journal. The journal will be restored into DRAM and /ifs will mount. On the other hand, if the node suffered a power disruption, the journal will be restored into DRAM from the M.2 vault flash instead (the PMP copies the journal into the M.2 vault during a power failure).

In the event that OneFS is unable to locate a valid journal on either the hard drives or M.2 flash on a node, it will retrieve a mirrored copy of the journal from its partner node over the NTB.  This is referred to as ‘Sync-back’.

Note: Sync-back state only occurs when attempting to mount /ifs.

On booting, if a node detects that its journal mirror on the partner node is out of sync (invalid), but the local journal is clean, /ifs will continue to mount.  Subsequent writes are then copied to the remote journal in a process known as ‘sync-forward’.

Here’s a list of the primary journal states:

Journal State Description
Sync-forward State in which writes to a journal are mirrored to the partner node.
Sync-back Journal is copied back from the partner node. Only occurs when attempting to mount /ifs.
Vaulting Storing a copy of the journal on M.2 flash during power failure. Vaulting is performed by PMP.

During normal operation, writes to the primary journal and its mirror are managed by the MNVI device module, which writes through local memory to the partner node’s journal via the NTB. If the NTB is unavailable for an extended period, write operations can still be completed successfully on each node. For example, if the NTB link goes down in the middle of a write operation, the local journal write operation will complete. Read operations are processed from local memory.

Additional journal protection for PowerScale chassis-based platforms is provided by OneFS’ powerfail memory persistence (PMP) functionality, which guards against PCI bus errors that can cause the NTB to fail. If an error is detected, the CPU requests a ‘persistent reset’, during which the memory state is protected and the node rebooted. When back up again, the journal is marked as intact and no further repair action is needed.

If a node loses power, the hardware notifies the BMC, initiating a memory persistent shutdown.  At this point the node is running on battery power. The node is forced to reboot and load the PMP module, which preserves its local journal and its partner’s mirrored journal by storing them on M.2 flash.  The PMP module then disables the battery and powers itself off.

Once power is back on and the node restarted, the PMP module first restores the journal before attempting to mount /ifs.  Once done, the node then continues through system boot, validating the journal, setting sync-forward or sync-back states, etc.

The mirrored journal has the following CLI commands, although these should seldom be needed during normal cluster operation:

  • isi_save_journal
  • isi_checkjournal
  • isi_testjournal
  • isi_pmp

A node’s journal can be checked and confirmed healthy as follows:

# isi_testjournal
Checking One external batteries Health...
Batteries good
Checking PowerScale Journal integrity...
Mounted DRAM journal check: good
IFS is mounted.

During boot, isi_checkjournal and isi_testjournal will invoke isi_pmp. If the M.2 vault devices are unformatted, isi_pmp will format the devices.

On clean shutdown, isi_save_journal stashes a backup copy of the /dev/mnv0 device on the root filesystem, just as it does for the NVRAM journals in previous generations of hardware.

If a mirrored journal issue is suspected, or notified via cluster alerts, the best place to start troubleshooting is to take a look at the node’s log events. The journal logs to /var/log/messages, with entries tagged as ‘journal_mirror’.
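For example, recent journal mirror activity can be surfaced with a simple grep (the exact message text varies):

# grep journal_mirror /var/log/messages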

Additionally, the following sysctls also provide information about the state of the journal mirror itself and the MNVI connection respectively:

# sysctl efs.journal.mirror_state
efs.journal.mirror_state:
{
    Journal state: valid_protected
    Journal Read-only: false
    Need to inval mirror: false
    Sync in progress: false
    Sync error: 0
    Sync noop in progress: false
    Mirror work queued: false
    Local state:
    {
        Clean: dirty
        Valid: valid
    }
    Mirror state:
    {
        Connection: up
        Validity: valid
    }
}

And the MNVI connection state:

# sysctl hw.mnv0.state
hw.mnv0.state.iocnt: 0
hw.mnv0.state.cb_active: 0
hw.mnv0.state.io_gate: 0
hw.mnv0.state.state: 3

OneFS provides the following CELOG events for monitoring and alerting about mirrored journal issues:

CELOG Event Description
HW_GEN6_NTB_LINK_OUTAGE Non-transparent bridge (NTB) PCIe link is unavailable
FILESYS_JOURNAL_VERIFY_FAILURE No valid journal copy found on node
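These events surface through the standard OneFS event framework, so they can also be checked from the CLI. For example (illustrative only; output columns vary by release):

# isi event events list | grep -i -e ntb -e journal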

Another OneFS reliability optimization for the PowerScale chassis-based platforms is boot partition mirroring. OneFS boot and other OS partitions are stored on a node’s internal drives, and these partitions are mirrored (with the exception of crash dump partitions). The mirroring protects against disk sled removal: since each drive in a disk sled belongs to a separate disk pool, both halves of a mirror cannot live on the same sled.

With regard to the nodes’ internal drives, the boot disk reservation size has increased to 18GB on these new platforms, up from 8GB on the previous generation. Partition sizes have also been expanded on these new platforms in OneFS 9.11, as follows:

Partition H71x and A31x H70x and A30x
hw 1GB 500MB
journal backup 8197MB 8GB
kerneldump 5GB 2GB
keystore 64MB 64MB
root 4GB 2GB
var 4GB 2GB
var-crash 7GB 3GB

OneFS automatically rebalances these mirrors in anticipation of, and in response to, service events. Mirror rebalancing is triggered by drive events such as suspend, softfail and hard loss.

The ‘isi_mirrorctl verify’ and ‘gmirror status’ CLI commands can be used to confirm that boot mirroring is working as intended. For example, on an A310 node:

# gmirror status
Name Status Components
mirror/root0 COMPLETE da10p3 (ACTIVE)
da11p3 (ACTIVE)
mirror/mfg COMPLETE da15p7 (ACTIVE)
da12p6 (ACTIVE)
mirror/kernelsdump COMPLETE da15p6 (ACTIVE)
mirror/kerneldump COMPLETE da15p5 (ACTIVE)
mirror/var-crash COMPLETE da15p3 (ACTIVE)
da9p3 (ACTIVE)
mirror/journal-backup COMPLETE da14p5 (ACTIVE)
da12p5 (ACTIVE)
mirror/jbackup-peer COMPLETE da14p3 (ACTIVE)
da12p3 (ACTIVE)
mirror/keystore COMPLETE da12p7 (ACTIVE)
da10p10 (ACTIVE)
mirror/root1 COMPLETE da11p7 (ACTIVE)
da10p7 (ACTIVE)
mirror/var0 COMPLETE da11p6 (ACTIVE)
da10p6 (ACTIVE)
mirror/hw COMPLETE da10p9 (ACTIVE)
da7p5 (ACTIVE)
mirror/var1 COMPLETE da10p8 (ACTIVE)
da7p3 (ACTIVE)

Or:

# isi_mirrorctl verify
isi.sys.distmirror - INFO - Mirror root1: has an ACTIVE consumer of da11p5
isi.sys.distmirror - INFO - Mirror root1: has an ACTIVE consumer of da10p7
isi.sys.distmirror - INFO - Mirror var1: has an ACTIVE consumer of da13p5
isi.sys.distmirror - INFO - Mirror var1: has an ACTIVE consumer of da16p5
isi.sys.distmirror - INFO - Mirror journal-backup: has an ACTIVE consumer of da12p5
isi.sys.distmirror - INFO - Mirror journal-backup: has an ACTIVE consumer of da16p6
isi.sys.distmirror - INFO - Mirror jbackup-peer: has an ACTIVE consumer of da12p3
isi.sys.distmirror - INFO - Mirror jbackup-peer: has an ACTIVE consumer of da14p3
isi.sys.distmirror - INFO - Mirror var-crash: has an ACTIVE consumer of da10p6
isi.sys.distmirror - INFO - Mirror var-crash: has an ACTIVE consumer of da11p3
isi.sys.distmirror - INFO - Mirror kerneldump: has an ACTIVE consumer of da14p5
isi.sys.distmirror - INFO - Mirror root0: has an ACTIVE consumer of da10p3
isi.sys.distmirror - INFO - Mirror root0: has an ACTIVE consumer of da13p6
isi.sys.distmirror - INFO - Mirror var0: has an ACTIVE consumer of da13p3
isi.sys.distmirror - INFO - Mirror var0: has an ACTIVE consumer of da16p3
isi.sys.distmirror - INFO - Mirror kernelsdump: has an ACTIVE consumer of da14p6
isi.sys.distmirror - INFO - Mirror mfg: has an ACTIVE consumer of da13p9
isi.sys.distmirror - INFO - Mirror mfg: has an ACTIVE consumer of da16p7
isi.sys.distmirror - INFO - Mirror hw: has an ACTIVE consumer of da10p8
isi.sys.distmirror - INFO - Mirror hw: has an ACTIVE consumer of da13p8
isi.sys.distmirror - INFO - Mirror keystore: has an ACTIVE consumer of da13p10
isi.sys.distmirror - INFO - Mirror keystore: has an ACTIVE consumer of da16p8

The A310 node’s disks in the output above are laid out as follows:

# isi devices drive list
Lnn  Location  Device    Lnum  State   Serial       Sled
---------------------------------------------------------
128  Bay  1    /dev/da1  15    L3      X3X0A0JFTMSJ N/A
128  Bay  2    -         N/A   EMPTY                N/A
128  Bay  A0   /dev/da4  12    HEALTHY WQB0QKBR     A
128  Bay  A1   /dev/da3  13    HEALTHY WQB0QHV4     A
128  Bay  A2   /dev/da2  14    HEALTHY WQB0QHN3     A
128  Bay  B0   /dev/da7  9     HEALTHY WQB0QH4S     B
128  Bay  B1   /dev/da6  10    HEALTHY WQB0QGY3     B
128  Bay  B2   /dev/da5  11    HEALTHY WQB0QJWE     B
128  Bay  C0   /dev/da10 6     HEALTHY WQB0QJ26     C
128  Bay  C1   /dev/da9  7     HEALTHY WQB0QHYW     C
128  Bay  C2   /dev/da8  8     HEALTHY WQB0QK6Q     C
128  Bay  D0   /dev/da13 3     HEALTHY WQB0QJES     D
128  Bay  D1   /dev/da12 4     HEALTHY WQB0QHGG     D
128  Bay  D2   /dev/da11 5     HEALTHY WQB0QKH5     D
128  Bay  E0   /dev/da16 0     HEALTHY WQB0QHFR     E
128  Bay  E1   /dev/da15 1     HEALTHY WQB0QJWD     E
128  Bay  E2   /dev/da14 2     HEALTHY WQB0QKGB     E
---------------------------------------------------------

When it comes to SmartFailing nodes, there are a couple of additional caveats to be aware of with mirrored journal and the PowerScale chassis-based platforms:

  • When SmartFailing one node in a pair, there is no compulsion to smartfail its partner node too.
  • A node will still run indefinitely with its partner absent. However, this significantly increases the window of risk since there is no journal mirror to rely on (in addition to lack of redundant power supply, etc).
  • If a single node in a pair is SmartFailed, the other node’s journal is still protected by the vault and powerfail memory persistence.

PowerScale A310 and A3100 Platforms

In this article, we’ll examine the new PowerScale A310 and A3100 hardware platforms that were released a couple of weeks back.

The A310 and A3100 comprise the latest generation of PowerScale A-series ‘archive’ platforms:

The PowerScale A-series systems are designed for cooler, infrequently accessed data use cases. These include active archive workflows for the A310, such as regulatory compliance data, medical imaging archives, financial records, and legal documents. And deep archive/cold storage for the A3100 platform, including surveillance video archives, backup, and DR repositories.

Representing the archive tier, the A310 and A3100 both utilize a single-socket Xeon processor with 96GB of memory, plus fifteen (A310) or twenty (A3100) hard drives per node and SSDs for metadata/caching, with four nodes residing within a 4RU chassis. From an initial four-node (one chassis) starting point, A310 and A3100 clusters can be easily and non-disruptively scaled two nodes at a time, up to a maximum of 252 nodes (63 chassis) per cluster.

The A31x modular platform is based on Dell’s ‘Infinity’ chassis. Each node’s compute module contains a single 8-core Intel Sapphire Rapids CPU running at 1.8 GHz with 22.5MB of cache, plus 96GB of DDR5 DRAM. Front-end networking options include 10/25 GbE, with either Ethernet or InfiniBand selectable for the back-end network.

As such, the new A31x core hardware specifications are as follows:

Hardware Class PowerScale A-Series (Archive)
Model A310 A3100
OS version Requires OneFS 9.11 or above, NFP 13.1 or greater; BIOS based on Dell’s PowerBIOS Requires OneFS 9.11 or above, NFP 13.1 or greater; BIOS based on Dell’s PowerBIOS
Platform Four nodes per 4RU chassis; upgradeable per pair; node-compatible with prior gens Four nodes per 4RU chassis; upgradeable per pair; node-compatible with prior gens
CPU 8 Cores @ 1.8GHz, 22.5MB Cache 8 Cores @ 1.8GHz, 22.5MB Cache
Memory 96GB DDR5 DRAM 96GB DDR5 DRAM
Journal M.2: 480GB NVMe with 3-cell battery backup (BBU) M.2: 480GB NVMe with 3-cell battery backup (BBU)
Depth Standard 36.7 inch chassis Deep 42.2 inch chassis
Cluster size Max of 63 chassis (252 nodes) per cluster Max of 63 chassis (252 nodes) per cluster
Storage Drives 60 per chassis (15 per node) 80 per chassis (20 per node)
HDD capacities 2TB, 4TB, 8TB, 12TB, 16TB, 20TB, 24TB 12TB, 16TB, 20TB, 24TB
SSD (cache) capacities 0.8TB, 1.6TB, 3.2TB, 7.68TB 0.8TB, 1.6TB, 3.2TB, 7.68TB
Max raw capacity 1.4PB per chassis 1.9PB per chassis
Front-end network 10/25 Gb Ethernet 10/25 Gb Ethernet
Back-end network Ethernet or Infiniband Ethernet or Infiniband

These node hardware attributes can be easily viewed from the OneFS CLI via the ‘isi_hw_status’ command. For example, from an A3100:

# isi_hw_status
  SerNo: CF2BC243400025
 Config: H6R28
ChsSerN:
ChsSlot: 1
FamCode: A
ChsCode: 4U
GenCode: 10
PrfCode: 3
   Tier: 3
  Class: storage
 Series: n/a
Product: A3100-4U-Single-96GB-1x1GE-2x25GE SFP+-240TB-6554GB SSD
  HWGen: PSI
Chassis: INFINITY (Infinity Chassis)
    CPU: GenuineIntel (1.80GHz, stepping 0x000806f8)
   PROC: Single-proc, Octa-core
    RAM: 103079215104 Bytes
   Mobo: INFINITYPIFANO (Custom EMC Motherboard)
  NVRam: INFINITY (Infinity Memory Journal) (4096MB card) (size 4294967296B)
 DskCtl: LSI3808 (LSI 3808 SAS Controller) (8 ports)
 DskExp: LSISAS35X36I (LSI SAS35x36 SAS Expander - Infinity)
PwrSupl: Slot1-PS0 (type=ACBEL POLYTECH, fw=03.01)
PwrSupl: Slot2-PS1 (type=ACBEL POLYTECH, fw=03.01)
  NetIF: bge0,lagg0,mce0,mce1,mce2,mce3
 BEType: 25GigE
 FEType: 25GigE
 LCDver: IsiVFD2 (Isilon VFD V2)
 Midpln: NONE (No Midplane Support)
Power Supplies OK
Power Supply Slot1-PS0 good
Power Supply Slot2-PS1 good
CPU Operation (raw 0x882C0800)  = Normal
CPU Speed Limit                 = 100.00%
Fan0_Speed                      = 12360.000
Fan1_Speed                      = 12000.000
Slot1-PS0_In_Voltage            = 212.000
Slot2-PS1_In_Voltage            = 209.000
SP_CMD_Vin                      = 12.100
CMOS_Voltage                    = 3.120
Slot1-PS0_Input_Power           = 290.000
Slot2-PS1_Input_Power           = 290.000
Pwr_Consumption                 = 590.000
SLIC0_Temp                      = na
SLIC1_Temp                      = na
DIMM_Bank0                      = 42.000
DIMM_Bank1                      = 40.000
CPU0_Temp                       = -43.000
SP_Temp0                        = 40.000
MP_Temp0                        = na
MP_Temp1                        = 29.000
Embed_IO_Temp0                  = 51.000
Hottest_SAS_Drv                 = -45.000
Ambient_Temp                    = 29.000
Slot1-PS0_Temp0                 = 47.000
Slot1-PS0_Temp1                 = 40.000
Slot2-PS1_Temp0                 = 47.000
Slot2-PS1_Temp1                 = 40.000
Battery0_Temp                   = 38.000
Drive_IO0_Temp                  = 43.000

Also note that the A310 and A3100 are only available in a 96GB memory configuration.

On the front of each chassis is an LCD front panel control with back-lit buttons and four LED light bar segments, one per node. These LEDs typically display blue for normal operation or yellow to indicate a node fault. The LCD display is articulated, allowing it to be swung clear of the drive sleds for non-disruptive HDD replacement, etc.

The rear of the chassis houses the compute modules for each node, which contain the CPU, memory, networking, cache SSDs, and power supplies. Specifically, an individual compute module contains a multi-core Sapphire Rapids CPU, memory, an M.2 flash journal, up to two SSDs for L3 cache, six DIMM channels, front-end 10/25 Gb Ethernet, back-end 40/100 or 10/25 Gb Ethernet or InfiniBand, an Ethernet management interface, and a power supply and cooling fans:

As shown above, the field replaceable components are indicated via colored ‘touchpoints’. Two touchpoint colors, orange and blue, indicate respectively which components are hot swappable versus replaceable via a node shutdown.

Touchpoint Detail
Blue Cold (offline) field serviceable component
Orange Hot (Online) field serviceable component

The serviceable components within a PowerScale A310 or A3100 chassis are as follows:

Component Hot Swap CRU FRU
Drive sled Yes Yes Yes
·         Hard drives (HDDs) Yes Yes Yes
Compute node No Yes Yes
·         Compute module No No No
o   M.2 journal flash No No Yes
o   CPU complex No No No
o   DIMMs No No Yes
o   Node fans No No Yes
o   NICs/HBAs No No Yes
o   HBA riser No No Yes
o   Battery backup unit (BBU) No No Yes
o   DIB No No No
·         Flash drives (SSDs) Yes Yes Yes
·         Power supply with fan Yes Yes Yes
Front panel Yes No Yes
Chassis No No Yes
Rail kits No No Yes
Mid-plane Replace entire chassis

Nodes are paired for resilience and durability, with each pair sharing a mirrored journal and two power supplies.

Storage-wise, each of the four nodes within a PowerScale A310 or A3100 chassis has five associated drive containers, or sleds. These sleds occupy bays in the front of each chassis, with a node’s drive sleds stacked vertically. For example:

Nodes are numbered 1 through 4, left to right looking at the front of the chassis, while the drive sleds are labeled A  through E, with A at the top.

The drive sled is the tray which slides into the front of the chassis. Within each sled, the 3.5-inch SAS hard drives are numbered sequentially starting from drive zero, which is the HDD adjacent to the air dam.

Each drive bay in a sled has an associated yellow ‘drive fault’ LED:

Even when a sled is removed from its chassis and its power source, these fault LEDs will remain active for 10+ minutes. LED viewing holes are also provided so the sled’s top cover does not need to be removed.

The A3100’s 42.2 inch chassis accommodates four HDDs per sled, compared to three drives for the standard (36.7 inch) depth A310 shown above. As such, the A3100 requires a deep rack, such as the Dell Titan cabinet, whereas the A310 can reside in a regular 17” data center cabinet.

The A310 and A3100 platforms support a range of HDD capacities, currently including 2TB, 4TB, 8TB, 12TB, 16TB, 20TB, and 24TB, in both regular ISE (instant secure erase) and self-encrypting drive (SED) formats.

A node’s drive details can be queried with OneFS CLI utilities such as ‘isi_radish’ and ‘isi_drivenum’. For example, the command output from an A3100 node:

# isi_drivenum

Bay  1   Unit 6      Lnum 20    Active      SN:GXNG0X800253     /dev/da1
Bay  2   Unit 7      Lnum 21    Active      SN:GXNG0X800263     /dev/da2
Bay  A0   Unit 19     Lnum 16    Active      SN:ZRT1A5JR         /dev/da6
Bay  A1   Unit 18     Lnum 17    Active      SN:ZRT1A4SE         /dev/da5
Bay  A2   Unit 17     Lnum 18    Active      SN:ZRT1A42D         /dev/da4
Bay  A3   Unit 16     Lnum 19    Active      SN:ZRT19494         /dev/da3
Bay  B0   Unit 25     Lnum 12    Active      SN:ZRT18NEY         /dev/da10
Bay  B1   Unit 24     Lnum 13    Active      SN:ZRT1FJCJ         /dev/da9
Bay  B2   Unit 23     Lnum 14    Active      SN:ZRT18N7F         /dev/da8
Bay  B3   Unit 22     Lnum 15    Active      SN:ZRT1FDJL         /dev/da7
Bay  C0   Unit 31     Lnum 8     Active      SN:ZRT1FJ0T         /dev/da14
Bay  C1   Unit 30     Lnum 9     Active      SN:ZRT1F6BF         /dev/da13
Bay  C2   Unit 29     Lnum 10    Active      SN:ZRT1FJMS         /dev/da12
Bay  C3   Unit 28     Lnum 11    Active      SN:ZRT18NE6         /dev/da11
Bay  D0   Unit 37     Lnum 4     Active      SN:ZRT18N9P         /dev/da18
Bay  D1   Unit 36     Lnum 5     Active      SN:ZRT18N8V         /dev/da17
Bay  D2   Unit 35     Lnum 6     Active      SN:ZRT18NBE         /dev/da16
Bay  D3   Unit 34     Lnum 7     Active      SN:ZRT1FR62         /dev/da15
Bay  E0   Unit 43     Lnum 0     Active      SN:ZRT1FDJ4         /dev/da22
Bay  E1   Unit 42     Lnum 1     Active      SN:ZRT1FR86         /dev/da21
Bay  E2   Unit 41     Lnum 2     Active      SN:ZRT1EJ4H         /dev/da20
Bay  E3   Unit 40     Lnum 3     Active      SN:ZRT1E9MS         /dev/da19

The first two lines of output above (bays 1 and 2) reference the cache SSDs, which are contained within the compute module. The remaining ‘bay’ locations indicate both the sled (A to E) and drive (0 to 3). The presence of four HDDs per sled (i.e. bay numbers 0 to 3) indicates that this is an A3100 node, rather than an A310 with only three HDDs per sled.

With regard to the nodes’ internal drives, the boot disk reservation size has increased to 18GB on these new platforms, up from 8GB on the previous generation. Partition sizes have also been expanded on these new platforms in OneFS 9.11, as follows:

Partition A310 / A3100 A300 / A3000
hw 1GB 500MB
journal backup 8197MB 8GB
kerneldump 5GB 2GB
keystore 64MB 64MB
root 4GB 2GB
var 4GB 2GB
var-crash 7GB 3GB

The PowerScale A310 and A3100 platforms are available in the following networking configurations, with a 10/25Gb Ethernet front-end and either Ethernet or Infiniband back-end:

Model A310 A3100
Front-end network 10/25 GigE 10/25 GigE
Back-end network 10/25 GigE, Infiniband 10/25 GigE, Infiniband

These NICs and their PCI bus addresses can be determined via the ’pciconf’ CLI command, as follows:

# pciconf -l | grep mlx
mlx5_core0@pci0:16:0:0: class=0x020000 card=0x002015b3 chip=0x101f15b3 rev=0x00 hdr=0x00
mlx5_core1@pci0:16:0:1: class=0x020000 card=0x002015b3 chip=0x101f15b3 rev=0x00 hdr=0x00
mlx5_core2@pci0:65:0:0: class=0x020000 card=0x002015b3 chip=0x101f15b3 rev=0x00 hdr=0x00
mlx5_core3@pci0:65:0:1: class=0x020000 card=0x002015b3 chip=0x101f15b3 rev=0x00 hdr=0x00

Similarly, the NIC hardware details and firmware versions can be viewed as follows:

# mlxfwmanager

Querying Mellanox devices firmware ...

Device #1:
----------
  Device Type:      ConnectX6LX
  Part Number:      06XJXK_0R5WK9_Ax
  Description:      NVIDIA ConnectX-6 LX Dual Port 25 GbE SFP Network Adapter
  PSID:             DEL0000000031
  PCI Device Name:  pci0:16:0:0
  Base GUID:        58a2e10300e22a24
  Base MAC:         58a2e1e22a24
  Versions:         Current        Available
     FW             26.36.1010     N/A
     PXE            3.6.0901       N/A
     UEFI           14.29.0014     N/A

  Status:           No matching image found

Device #2:
----------
  Device Type:      ConnectX6LX
  Part Number:      06XJXK_0R5WK9_Ax
  Description:      NVIDIA ConnectX-6 LX Dual Port 25 GbE SFP Network Adapter
  PSID:             DEL0000000031
  PCI Device Name:  pci0:65:0:0
  Base GUID:        58a2e10300e22bf4
  Base MAC:         58a2e1e22bf4
  Versions:         Current        Available
     FW             26.36.1010     N/A
     PXE            3.6.0901       N/A
     UEFI           14.29.0014     N/A

  Status:           No matching image found

Compared to their A30x predecessors, the A310 and A3100 see a number of generational hardware upgrades. These include a shift to DDR5 memory, a Sapphire Rapids CPU, and an up-spec’d power supply.

In terms of performance, the new A31x nodes provide a significant increase over the prior generation, as shown in the following streaming read and writes comparison chart for the A3100 and A3000:

OneFS node compatibility provides the ability to have similar node types and generations within the same node pool. In OneFS 9.11 and later, compatibility between the A310 and A3100 nodes and the previous generation platform is supported. Specifically, this node pool compatibility includes:

OneFS Node Pool Compatibility Gen6 MLK New
A200 A300/L A310/L
A2000 A3000/L A3100/L
H400 A300 A310

Node pool compatibility checking includes drive capacities, for both data HDDs and SSD cache. This pool compatibility permits the addition of A310 node pairs to an existing node pool comprising four or more A300s if desired, rather than creating a new A310 node pool. A similar compatibility exists for A3100/A3000 nodes.
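After adding new node pairs, pool membership can be confirmed from the CLI. For example (a simple check; any explicit compatibility settings are managed under the 'isi storagepool' namespace, whose sub-commands vary by release):

# isi storagepool nodepools list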

Note that, while the A31x is node pool compatible with the A30x, the A31x nodes are effectively throttled to match the performance envelope of the A30x nodes. Regarding storage efficiency, support for OneFS inline data reduction on mixed A-series diskpools is as follows:

Node Pool Mix Data Reduction Enabled
A200 + A300/L + A310/L False
A2000 + A3000/L + A3100/L False
H400 + A300 + A310 False
A200 + A310 False
A300 + A310 True
H400 + A310 False
A2000 + A3100 False
A3000 + A3100 True
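Where inline data reduction is enabled on a pool, its effectiveness can be gauged from the cluster-wide reporting CLI. For example (illustrative; the report layout varies by release):

# isi statistics data-reduction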

To summarize, in combination with OneFS 9.11, these new PowerScale A31x archive platforms deliver a compelling value proposition in terms of efficiency, density, flexibility, scalability, and affordability.