As we saw in the previous article, OneFS 9.8 introduces new SmartLog functionality to help simplify and streamline PowerScale’s issue investigation and time to resolution. SmartLog optimizes the log gathering process, while also integrating with OneFS health-checking and with CELOG events and alerting. Specifically:
| Activity | Description |
|---|---|
| Gather | • Scope of gathers can be limited by specifying one or more functional groups. • Extends time-based gather functionality (both shorthand, e.g. 2h, and timestamp). • Allows for small and highly optimized gathers. |
| Healthcheck | • Gathers can be triggered via the ‘isi healthcheck evaluations gather’ CLI command. • Healthcheck gathers cannot be triggered for passing evaluations. |
| CELOG | • Gathers can now be triggered via the ‘isi event groups gather’ CLI command. • CELOG gathers can only be triggered for Critical and Emergency events. |
In addition to the OneFS command line options in support of this new functionality, the WebUI diagnostics section has also seen a significant overhaul. This can be accessed by navigating to Cluster management > Diagnostics > Gather logs.
A gather can be easily started either by clicking the WebUI ‘Start Gather’ button below:
Or via the following CLI command:
# isi diagnostics gather start
Gather started.
Finished gathers can be found in: /ifs/data/Isilon_Support/pkg
The WebUI status monitor indicates when a gather is currently underway:
Or via the CLI:
# isi diagnostics gather status
Gather is running.
Finished gathers can be found in: /ifs/data/Isilon_Support/pkg
A running gather can also be easily terminated, either by clicking the ‘Stop Gather’ button:
Or via the following CLI command:
# isi diagnostics gather stop
Gather stopped.
When complete, SmartLog writes its gather tarfile to the /ifs/data/Isilon_Support/pkg/ directory by default. These gather files can be identified by their ‘IsilonLogs’ prefix. For example:
# ls -lsia /ifs/data/Isilon_Support/pkg/IsilonLogs*
6952453633 3124592 -rw-r--r-- 1 ese ese 2838789143 May 1 16:26 /ifs/data/Isilon_Support/pkg/IsilonLogs-HAL-9000-New1-20240501-162000-b8b6755a-eb48-467d-a5e3-3f6f650ae0d1.tgz
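For scripted housekeeping, the gather filename can be split into its component parts. The following is a minimal Python sketch. The field layout (cluster name, date, time, and a UUID) is inferred from the single example filename above rather than from a documented format, and the function name `parse_gather_name` is purely illustrative:

```python
import re
from datetime import datetime

# Assumed layout, inferred from the one example filename in this article:
#   IsilonLogs-<cluster name>-<YYYYMMDD>-<HHMMSS>-<uuid>.tgz
# The cluster name may itself contain hyphens, so it is matched greedily.
GATHER_NAME = re.compile(
    r"^IsilonLogs-(?P<cluster>.+)-(?P<date>[0-9]{8})-(?P<time>[0-9]{6})-"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\.tgz$"
)

def parse_gather_name(filename):
    """Return (cluster, timestamp, uuid), or None if the name does not match."""
    match = GATHER_NAME.match(filename)
    if match is None:
        return None
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["cluster"], stamp, match["uuid"]
```

Names that do not follow the assumed layout return None rather than raising, so the helper can be applied to an arbitrary directory listing.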
Note that the WebUI will display a warning recommending that gather log tarfiles greater than 20 MB in size be downloaded via the CLI, rather than using the WebUI option. For example:
When done, the gather file can be easily removed via the WebUI ‘Delete’ Actions button above, and successful deletion is confirmed:
The ‘Gather settings’ WebUI page remains largely unchanged in OneFS 9.8, with the choice of either a full or an incremental gather, and the auto upload and various transport protocol options available:
Successful changes to the gather settings, in this case to incremental gather mode, are confirmed by a WebUI popup:
SmartLog in OneFS 9.8 adds three new options for initiating a more granular gather:
| Gather Option | Description | CLI syntax |
|---|---|---|
| Group | Gather based on the feature group(s), e.g. protocol, data service, auth, security, cloud, etc. | isi_gather_info --group <g1,g2,…,gn> |
| Time interval | Gather based on a past duration. Time range specified as an interval (hours, days, weeks). | isi_gather_info --gather-past <nw/nd/nh> |
| Timestamp | Gather based on the beginning timestamp. | isi_gather_info --gather-begin <YYYY-MM-DD [HH:MM]> |
Gather based on the timestamp.
The WebUI ‘Start Gather’ page’s ‘Time Range’ option allows timestamp-based log gathers to be specified:
Timestamp-based gathers can also be initiated from the CLI with the following syntax:
# isi diagnostics gather start --gather-begin <YYYY-MM-DD [HH:MM]>
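When a gather is launched from a script, it can be worth checking the timestamp string before passing it to the CLI. The following Python sketch accepts the two shapes shown in the syntax above. The function name `parse_gather_begin` is hypothetical, and treating a date with no time as midnight is an assumption about how the tool interprets it:

```python
from datetime import datetime

def parse_gather_begin(value):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM'; raise ValueError otherwise."""
    for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            # A bare date parses to 00:00 on that day (assumed semantics).
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"expected YYYY-MM-DD [HH:MM], got {value!r}")
```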
Past Gather based on duration.
Similarly, the ‘Gather Past’ option on the WebUI ‘Start Gather’ page allows past duration log gathers to be specified:
Past-duration-based gathers can also be initiated from the CLI with the following syntax:
# isi diagnostics gather start --gather-past <nw/nd/nh>
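To illustrate what the shorthand means, the following Python sketch converts an `nw`, `nd`, or `nh` value into a duration and derives the start of the gather window from a supplied reference time. The helper names are illustrative only, and the assumption that the window is measured back from the time the gather starts is not confirmed by the article:

```python
import re
from datetime import datetime, timedelta

# Unit letters from the <nw/nd/nh> syntax: weeks, days, hours.
_UNITS = {"w": "weeks", "d": "days", "h": "hours"}

def parse_gather_past(value):
    """Convert shorthand such as '2h', '3d' or '1w' into a timedelta."""
    match = re.fullmatch(r"([0-9]+)([wdh])", value.strip())
    if match is None:
        raise ValueError(f"expected <n>w, <n>d or <n>h, got {value!r}")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: count})

def window_start(value, now):
    """Start of the gather window, assuming it is measured back from 'now'."""
    return now - parse_gather_past(value)
```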
Gather based on the feature group.
Upon initiating a gather via the WebUI, when the ‘Gather Group’ mode is selected, the full array of feature groups is displayed:
The full list of valid gather feature groups can also be displayed with the following CLI command:
# isi diagnostics gather groups
Valid components are 'abr, acct, acct_sensitive, admin, antivirus, application, auth, backup, bootmessages, celog, cloud, cloudpools, cluster, datamover, eth_backend, firmware, fs, hardware, hdfs, http, ib, iceage, job_engine, logs, messages, ndmp, network, nfs, node, performance, protocol, quotas, s3, security, smartpools, smb, snapshots, storage, synciq, usage'
For the more curious among us, the ‘isi_gather_info -l’ CLI command lists all the gather commands that SmartLog can run, and also indicates which feature group(s) each command is a member of. For example:
# isi_gather_info -l | more
Known commands are listed by name first with important attributes nested under the commands name.
brand_data:
full_command_text=`cd /etc && tar -c -f /ifs/data/Isilon_Support/2024-05-02T16:47:52.717194/brand_data.tar brand`
timeout=`300`
is_default=True
isi_gconfig:
full_command_text=`/usr/bin/isi_gconfig`
timeout=`150`
is_default=True
groups=[auth, celog, cloudpools, fs, hdfs, job_engine, nfs, protocol, s3, smb]
isi_fputil_leds:
full_command_text=`/usr/bin/isi_hwtools/isi_fputil -g`
timeout=`150`
is_default=True
groups=[hardware]
upgrade_local:
full_command_text=`cd / && tar -c -f /ifs/data/Isilon_Support/2024-05-02T16:47:52.717194/upgrade_local.tar --exclude '/var/ifs/upgrade/AgentPersistent.db*' var/ifs/upgrade`
timeout=`150`
is_default=True
groups=[admin]
efs.lbm.drive_space:
full_command_text=`/sbin/sysctl efs.lbm.drive_space`
timeout=`150`
is_default=True
groups=[usage]
< snip >
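A listing like this lends itself to simple post-processing, for instance to see which commands a given feature group would run. The following Python sketch parses text in the shape shown above. It assumes that a command name appears on its own line ending in a colon and that attributes follow as `key=value` lines. The function names are hypothetical, and the real output may contain cases this does not handle:

```python
from collections import defaultdict

def parse_command_listing(text):
    """Map each command name to a dict of its attributes."""
    commands = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" not in line and line.endswith(":"):
            # A bare 'name:' line starts a new command entry.
            current = commands.setdefault(line[:-1], {})
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            value = value.strip("`")
            if key == "groups":
                value = [g.strip() for g in value.strip("[]").split(",") if g.strip()]
            current[key] = value
    return commands

def commands_by_group(commands):
    """Invert the mapping: feature group -> list of command names."""
    index = defaultdict(list)
    for name, attrs in commands.items():
        for group in attrs.get("groups", []):
            index[group].append(name)
    return dict(index)
```

Commands with no `groups` attribute, such as `brand_data` in the excerpt, simply do not appear in the inverted index.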
The desired feature group(s) can be selected by clicking on their associated checkbox and then using the right arrow button to add them to the active groups column. In the following example, NFS, network, S3 and SMB have been selected, and clicking the ‘Start Gather’ button will start the gather:
Similarly, the corresponding selected feature groups gather can be initiated from the CLI as follows:
# isi diagnostics gather start --group nfs,network,s3,smb
Gather started.
Finished gathers can be found in: /ifs/data/Isilon_Support/pkg
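If group-based gathers are launched from automation, validating the group names up front avoids a failed invocation. The following Python sketch checks names against the list of valid components shown earlier and builds the argument list for the command above. The list is copied from this article and may differ on other releases, and `gather_start_argv` is an illustrative name, not part of OneFS:

```python
# Group names copied from the 'isi diagnostics gather groups' output above.
VALID_GROUPS = frozenset(
    """abr acct acct_sensitive admin antivirus application auth backup
    bootmessages celog cloud cloudpools cluster datamover eth_backend
    firmware fs hardware hdfs http ib iceage job_engine logs messages ndmp
    network nfs node performance protocol quotas s3 security smartpools smb
    snapshots storage synciq usage""".split()
)

def gather_start_argv(groups):
    """Build the argv for a group-scoped gather, rejecting unknown groups."""
    groups = list(groups)
    if not groups:
        raise ValueError("at least one feature group is required")
    unknown = sorted(set(groups) - VALID_GROUPS)
    if unknown:
        raise ValueError(f"unknown feature group(s): {', '.join(unknown)}")
    return ["isi", "diagnostics", "gather", "start", "--group", ",".join(groups)]
```

On a cluster node, the returned list could be handed to `subprocess.run`. The sketch deliberately stops short of executing anything.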
In OneFS 9.5 and later, the ‘Edit gather settings’ page defaults to FTPS as the transport, with the associated radio buttons and text boxes for its configuration. These settings can also be viewed and/or modified via the CLI:
# isi diagnostics gather settings view
Upload: Yes
ESRS: Yes
Supportassist: Yes
Gather Mode: full
HTTP Insecure Upload: No
HTTP Upload Host:
HTTP Upload Path:
HTTP Upload Proxy:
HTTP Upload Proxy Port: -
Ftp Upload: Yes
Ftp Upload Host: ftp.isilon.com
Ftp Upload Path: /incoming
Ftp Upload Proxy:
Ftp Upload Proxy Port: -
Ftp Upload User: anonymous
Ftp Upload Ssl Cert:
Ftp Upload Insecure: No
Group:
Gather Begin:
Gather Past:
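Since the settings view is plain `Key: Value` text, it is straightforward to audit from a script, for example to flag clusters still configured for plaintext FTP. The following Python sketch parses output in the shape shown above. Both function names are hypothetical, and the check assumes the `Yes`/`No` values appear exactly as in this example:

```python
def parse_settings(text):
    """Turn 'Key: Value' lines into a dict; empty values become ''."""
    settings = {}
    for raw in text.splitlines():
        # Split on the first colon only, so values containing ':' survive.
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        settings[key.strip()] = value.strip()
    return settings

def uses_plaintext_ftp(settings):
    """True when FTP upload is enabled and marked insecure (assumed values)."""
    return (
        settings.get("Ftp Upload") == "Yes"
        and settings.get("Ftp Upload Insecure") == "Yes"
    )
```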
While FTPS is the default and (highly) recommended transport, the legacy plaintext FTP upload method is still available if necessary. Dell’s log server, ftp.isilon.com, supports both encrypted FTPS and plaintext FTP, so FTP log uploads from older (pre-OneFS 9.5) releases are not affected.
However, a warning is displayed if the cluster admin elects to continue using non-secure FTP as the transport for SmartLog:
Similarly from the CLI, if the ‘--ftp-upload-insecure’ option is configured, the following message is displayed, informing the user that plain text FTP upload is being used, and that the connection and data stream will not be encrypted:
# isi diagnostics gather start --ftp-upload-insecure
You are performing plain text FTP logs upload.
This feature is deprecated and will be removed
in a future release. Please consider the possibility
of using FTPS for logs upload. For further information,
please contact PowerScale support
...
Once a logfile gather arrives at Dell, it is automatically unpacked by a support process and analyzed using the ‘logviewer’ tool.
Note that ‘isi diagnostics gather’ is a limited-scope wrapper for the underlying ‘isi_gather_info’ utility. For example, the following two CLI commands can be used interchangeably:
# isi diagnostics gather start --group nfs,network,s3,smb
Or:
# isi_gather_info --group nfs,network,s3,smb
For reference, the comprehensive ‘isi_gather_info’ CLI utility in OneFS 9.8 includes the following options:
| Option | Description |
|---|---|
| --upload <boolean> | Enable gather upload. |
| --esrs <boolean> | Use ESRS for gather upload. |
| --noesrs | Do not attempt to upload via ESRS. |
| --supportassist | Attempt SupportAssist upload. |
| --nosupportassist | Do not attempt to upload via SupportAssist. |
| --gather-mode (incremental \| full) | Type of gather: incremental, or full. |
| --gather-begin <YYYY-MM-DD [HH:MM]> | Time to begin the gather. |
| --gather-past <nw/nd/nh> | How far in the past to gather logs. |
| --group <g1,g2,…,gn> | Which feature group(s) to gather logs for. |
| --http-insecure <boolean> | Enable insecure HTTP upload on completed gather. |
| --http-host <string> | HTTP host to use for HTTP upload. |
| --http-path <string> | Path on HTTP server to use for HTTP upload. |
| --http-proxy <string> | Proxy server to use for HTTP upload. |
| --http-proxy-port <integer> | Proxy server port to use for HTTP upload. |
| --ftp <boolean> | Enable FTP upload on completed gather. |
| --noftp | Do not attempt FTP upload. |
| --set-ftp-password | Interactively specify alternate password for FTP. |
| --ftp-host <string> | FTP host to use for FTP upload. |
| --ftp-path <string> | Path on FTP server to use for FTP upload. |
| --ftp-port <string> | Specifies alternate FTP port for upload. |
| --ftp-proxy <string> | Proxy server to use for FTP upload. |
| --ftp-proxy-port <integer> | Proxy server port to use for FTP upload. |
| --ftp-mode <value> | Mode of FTP file transfer. Valid values are: both, active, passive. |
| --ftp-user <string> | FTP user to use for FTP upload. |
| --ftp-pass <string> | Specify alternative password for FTP. |
| --ftp-ssl-cert <string> | Specifies the SSL certificate to use in FTPS connection. |
| --ftp-upload-insecure <boolean> | Whether to attempt a plain text FTP upload. |
| --ftp-upload-pass <string> | Password to use for FTP upload. |
| --set-ftp-upload-pass | Specify the FTP upload password interactively. |
HealthCheck Enhancements
Failing HealthCheck evaluations also now support small gathers in OneFS 9.8. HealthCheck evaluation gathers are automatically sent to Dell Support, per the cluster’s SmartLog transport configuration (‘isi diagnostics gather settings’):
From the CLI, the corresponding healthcheck gather syntax is as follows:
# isi healthcheck evaluations gather --id <evaluation id>
Note that for dark sites with no external routing, SmartLog also offers the ability to download the log gather locally:
CELOG Enhancements
CELOG event groups also support SmartLog small gathers in OneFS 9.8. However, the event severity must be either Emergency or Critical for the gather option to be available. For example:
Additionally, the corresponding CELOG event group gather CLI syntax is as follows:
# isi event groups gather --id <event group id>
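The eligibility rules for these targeted gathers are simple enough to encode as a pre-check in automation. The following Python sketch captures the two rules stated in this article: healthcheck gathers apply only to failing evaluations, and CELOG gathers apply only to Critical and Emergency event groups. The function names and the lower-case severity strings are illustrative assumptions:

```python
# Severities for which a CELOG event group gather is offered, per the article.
GATHERABLE_SEVERITIES = frozenset({"critical", "emergency"})

def event_group_gather_allowed(severity):
    """True if an event group of this severity can trigger a gather."""
    return severity.strip().lower() in GATHERABLE_SEVERITIES

def healthcheck_gather_allowed(evaluation_passed):
    """Gathers cannot be triggered for passing evaluations."""
    return not evaluation_passed
```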
Similar to healthchecks, SmartLog also offers the ability to download the log gather locally for dark sites with no external routing: