OneFS Cluster Quick Checks

OneFS brings simplicity, scalability, and ease of use to the complexity of unstructured data management. However, as with many things in life, an ounce of prevention is key to keeping a cluster happily humming along. As such, the following daily, monthly, and quarterly checks provide a quick and easy method for keeping an eye on a PowerScale environment and ensuring smooth cluster operation.

1 Daily Checks
Daily health checks are important to ensure a cluster is operating at optimal performance and capacity. These checks require minimal effort and build familiarity with the cluster, that is, an awareness of alerts that point to an area of interest, or to something requiring further investigation, before it becomes a larger problem.
Category | Check Method Option(s) | Description / Recommendation
Health Score | CloudIQ, Alerts | Target a Health Score of 100
Cluster Capacity | WebUI, CLI, DataIQ, InsightIQ, CloudIQ, Alerts | Maintain a storage utilization below 90%
Events and Alerts | WebUI, CLI, Alerts | Address all Hardware and Software Alerts as they occur
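
The three daily targets above can be folded into a single pass/fail summary. The following is only a minimal sketch: the `DailyReading` type and `daily_findings` function are hypothetical names, and the values are assumed to be copied by hand from CloudIQ, the WebUI, or the CLI rather than fetched through any OneFS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailyReading:
    """Values read manually from the monitoring tools (hypothetical structure)."""
    health_score: int   # CloudIQ health score, 0-100
    used_tb: float      # capacity currently in use
    total_tb: float     # total usable capacity
    open_alerts: int    # unresolved hardware and software alerts


def daily_findings(reading: DailyReading, capacity_limit: float = 0.90) -> List[str]:
    """Return a list of items needing attention; an empty list means all targets are met."""
    if reading.total_tb <= 0:
        raise ValueError("total_tb must be positive")

    findings = []
    if reading.health_score < 100:
        findings.append(f"health score is {reading.health_score}, target is 100")

    utilization = reading.used_tb / reading.total_tb
    if utilization >= capacity_limit:
        findings.append(f"utilization is {utilization:.0%}, target is below {capacity_limit:.0%}")

    if reading.open_alerts > 0:
        findings.append(f"{reading.open_alerts} open alert(s) to address")
    return findings
```

An empty result means the health score, capacity, and alert targets from the table are all satisfied for that day.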
 

2 Monthly Checks
Monthly cluster health checks help ensure the environment is performing as expected, while also providing an opportunity to measure progress against service level objectives and review new OneFS feature enhancements.
Category | Check Method Option(s) | Description / Recommendation
Upgrade Strategy | Isilon On Cluster Analysis (IOCA), Current OneFS Patch | OneFS and firmware code currency. Run ioca -u <target code version>. Example: ioca -u 8.2.2 to get the latest upgrade plan
SRS/Alerts/Events/Email | WebUI, CLI, SRS | Confirm that all alerting and notifications are operating as planned
Failed Drive Status | WebUI, CLI | Review drive status and ensure that all drives that have SmartFailed have been replaced
SmartPools | WebUI, CLI | Review SmartPools settings and Node Pools utilization to identify if tiering changes are necessary
SnapshotIQ | WebUI, CLI | Investigate snapshot utilization above 10%, or any single snapshot of excessive size (for example, an fsa snapshot)
SyncIQ | WebUI, CLI | Check the SyncIQ configuration to ensure RPO/RTO targets are being met and required data is being replicated as needed
Monitoring Tools | CloudIQ, DataIQ, InsightIQ | Confirm that all monitoring tools are continuing to gather data from all storage devices
Healthcheck Framework | CLI | Ensure that the latest Healthcheck Framework checks are installed and a Healthcheck Assessment is executed
DSA/DTA (Dell Security Advisory/Dell Technical Advisory) | Isilon/PowerScale Alert Subscription, Service360, DSA, DTA | Verify if a DSA/DTA is applicable to your cluster and, if so, take immediate action
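
The SnapshotIQ check can be illustrated with a small calculation. This is a sketch under stated assumptions: the 10% figure is taken to mean total snapshot usage as a share of cluster capacity, the 5 TB per-snapshot limit is an arbitrary placeholder for "excessive size", and `snapshot_findings` is a hypothetical helper fed with sizes read from the WebUI or CLI.

```python
from typing import Dict, List


def snapshot_findings(
    snapshot_sizes_tb: Dict[str, float],
    cluster_capacity_tb: float,
    total_limit: float = 0.10,      # assumed: share of total cluster capacity
    single_limit_tb: float = 5.0,   # assumed placeholder for "excessive size"
) -> List[str]:
    """Flag total snapshot usage above the limit and any oversized single snapshot."""
    if cluster_capacity_tb <= 0:
        raise ValueError("cluster_capacity_tb must be positive")

    findings = []
    total_tb = sum(snapshot_sizes_tb.values())
    share = total_tb / cluster_capacity_tb
    if share > total_limit:
        findings.append(f"snapshots use {share:.0%} of capacity, limit is {total_limit:.0%}")

    # Report oversized snapshots in name order so the output is stable.
    for name, size_tb in sorted(snapshot_sizes_tb.items()):
        if size_tb > single_limit_tb:
            findings.append(f"snapshot '{name}' is {size_tb} TB, limit is {single_limit_tb} TB")
    return findings
```

Both limits are parameters so they can be adjusted to whatever a given environment considers acceptable.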

 

3 Quarterly Checks
Quarterly cluster health checks help ensure that the environment is running at optimal performance and capacity, while also providing an opportunity to evaluate it against broader business-level objectives and review new OneFS feature enhancements.
Category | Check Method Option(s) | Description / Recommendation
OneFS and Firmware Update Considerations | Dell Technologies Support Site (https://www.dell.com/support/home/), Target Code Information, Contact your Unstructured Data Solutions Systems Engineer | Review newer OneFS versions to determine the benefit of upgrading to a newer release. In addition to OneFS, node and drive firmware should also be considered. Obtain the latest code information via the Dell EMC support site and utilize the Dell EMC team for the upgrade where possible.
OneFS Patch Updates | Dell Technologies Support Site (https://www.dell.com/support/home/) | Check the latest Roll-up Patches (RUPs) and upgrade as needed.
Monitoring Tools Updates | InsightIQ, DataIQ, CloudIQ, SRS | Review all monitoring tools to determine if upgrades are needed.
Capacity Trending | InsightIQ, DataIQ, CloudIQ | Evaluate capacity growth trends to determine if additional node purchases will be required over the next 6 months.
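
As an illustration of the capacity trending check, the sketch below fits a straight line to monthly usage samples and estimates how many months remain before the 90% utilization target from the daily checks is reached. It assumes linear growth, which real clusters may not follow, and `months_until_limit` is a hypothetical helper working on figures exported by hand from InsightIQ, DataIQ, or CloudIQ.

```python
from typing import Optional, Sequence


def months_until_limit(
    monthly_used_tb: Sequence[float],
    total_tb: float,
    limit: float = 0.90,
) -> Optional[float]:
    """Estimate months until usage reaches limit * total_tb.

    Uses the least-squares slope of the samples (one per month, oldest first)
    and projects forward from the most recent sample. Returns 0.0 if the limit
    is already reached, or None if usage is flat or shrinking.
    """
    n = len(monthly_used_tb)
    if n < 2:
        raise ValueError("need at least two monthly samples")

    mean_x = (n - 1) / 2
    mean_y = sum(monthly_used_tb) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_used_tb))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope_tb_per_month = numerator / denominator

    target_tb = limit * total_tb
    latest_tb = monthly_used_tb[-1]
    if latest_tb >= target_tb:
        return 0.0
    if slope_tb_per_month <= 0:
        return None
    return (target_tb - latest_tb) / slope_tb_per_month


def purchase_needed(monthly_used_tb: Sequence[float], total_tb: float, horizon_months: float = 6.0) -> bool:
    """True if the projection reaches the limit within the planning horizon."""
    months = months_until_limit(monthly_used_tb, total_tb)
    return months is not None and months <= horizon_months
```

For example, `purchase_needed([700, 720, 740, 760], 1000)` looks at four monthly samples on a 1000 TB cluster and reports whether the 90% mark falls inside the six-month window.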

Further details on OneFS recommended practices and checklist items are available in the OneFS Best Practices white paper.
