PowerScale Multipath Client Driver Configuration and Management

As discussed earlier in this series of articles, the multipath driver allows Linux clients to mount a PowerScale cluster’s NFS exports over NFS v3, NFS v4.1, or NFS v4.2 over RDMA.

The principal NFS mount options of interest with the multipath client driver are:

Mount option Description
nconnect Allows the admin to specify the number of TCP connections the client can establish between itself and the NFS server. It works with remoteports to spread load across multiple target interfaces.
localports Mount option that allows a client to use its multiple NICs to multiplex I/O.
localports_failover Mount option allowing transports to temporarily move from local client interfaces that are unable to serve NFS connections.
proto The underlying transport protocol that the NFS mount will use. Typically, either TCP or RDMA.
remoteports Mount option that allows a client to target multiple servers/ NICS to multiplex I/O. Remoteports spreads the load to multiple file handles instead of taking a single file handle to avoid thrashing on locks.
version The version of the NFS protocol that is to be used. The multipath driver supports NFSv3, NFSv4.1, and NFSv4.2. Note that NFSv4.0 is unsupported.

These options allow the multipath driver to be configured such that an IO stream to a single NFS mount can be spread across a number of local (client) and remote (cluster) network interfaces (ports). Nconnect allows you to specify how many socket connections you want to open to each combination of local and remote ports.

Below are some example topologies and NFS client configurations using the multipath driver.

  1. Here, NFSv3 with RDMA is used to spread traffic across all the front-end interfaces (remoteports) on the PowerScale cluster:
# mount -o proto=rdma,port=20049,vers=3,nconnect=18,remoteports=10.231.180.95- 10.231.180.98 10.231.180.98:/ifs/data /mnt/test

The above client NFS  mount configuration would open 5 socket connections to two of the ‘remoteports’ (cluster node) IP address specified and 4 socket connections to the other two.

As you can see, this driver can be incredibly powerful given its ability to multipath comprehensively. Clearly, there are many combinations of local and remote ports and socket connections that can be configured.

  1. This next example uses NFSv3 with RDMA across three ‘localports’ (client) and with 8 socket connections to each:
# mount -o proto=rdma,port=20049,vers=3,nconnect=24,localports=10.219.57.225- 10.219.57.227, remoteports=10.231.180.95-10.231.180.98 10.231.180.98:/ifs/data /mnt/test

  1. This final config specifies NFSv4.1 with RDMA using a high connection count to target multiple nodes (remoteports) on the cluster:
# mount -t nfs –o proto=rdma,port=20049,vers=4.1,nconnect=64,remoteports=10.231.180.95- 10.231.180.98 10.231.180.98:/ifs/data /mnt/test

In this case, 16 socket connections will be opened to each of the four specified remote (cluster) ports for a total of 64 connections.

Note that the Dell multipath driver has a hard-coded limit of 64 nconnect socket connections per mount.

Behind the scenes, the driver uses a network map to store the local and remote port and nconnect socket configuration.

The multipath driver supports both IPv4 and IPv6 addresses for local and remote port specification. If a specified IP address is unresponsive, the driver will remove the offending address from its network map.

Note that the Dell multipath driver supports NFSv3, NFSv4.1, and NFSv4.2 but is incompatible with NFSv4.0.

On Ubuntu 20.04, for example, NFSv3 and NFSv4.1 are both fully-featured. In addition, the ‘remoteports’ behavior is more obvious with NFSv4.1 because the client state is tracked:

# mount -t nfs -o vers=4.1,nconnect=8,remoteports=10.231.180.95-10.231.180.98 10.231.180.95:/ifs /mnt/test

And from the cluster:

cluster# isi_for_array 'isi_nfs4mgmt'

ID                Vers   Conn   SessionId     Client Address Port   O-Owners      Opens  Handles L-Owners

2685351452398170437  4.1    tcp    5           10.219.57.229  872    0     0     0     0

5880148330078271256   4.1    tcp    11          10.219.57.229  680    0     0     0     0

2685351452398170437   4.1    tcp    5           10.219.57.229  872    0     0     0     0

6230063502872509892   4.1    tcp    1           10.219.57.229  895    0     0     0     0

6786883841029053972   4.1    tcp    1           10.219.57.229  756    0     0     0     0

With a single mount, the client has created many connections across the server.

This also works with RDMA:

# mount -t nfs -o vers=4.1,proto=rdma,nconnect=4 10.231.180.95:/ifs /mnt/test

# isi_for_array 'isi_nfs4mgmt’

ID                 Vers   Conn   SessionId     ClientAddress  Port   O-Owners      Opens  Handles L-Owners

6786883841029053974   4.1    rdma   2           10.219.57.229  54807  0     0     0     0

6230063502872509894   4.1    rdma   2           10.219.57.229  34194  0     0     0     0

5880148330078271258   4.1    rdma   12          10.219.57.229  43462  0     0     0     0

2685351452398170443  4.1    rdma   8           10.219.57.229  57401  0     0     0     0




0

Once the Linux client NFS mounts have been configured, their correct functioning can be easily verified by generating read and/or write traffic to the PowerScale cluster and viewing the OneFS performance statistics. Running a load generator like ‘iozone’ is a useful way to generate traffic on the NFS mount. The iozone utility can be invoked with its ‘-a’ flag to select full automatic mode. This produces output that covers all tested file operations for record sizes of 4k to 16M for file sizes of 64k to 512M.

# iozone -a

For example:

From the PowerScale cluster, if the multipath driver is working correctly the ‘isi statistics client’ CLI command output will show the Linux client connecting and generating traffic to all of the cluster’s nodes that are specified in the’ –remoteports’ option for the NFS mount.

# isi statistics client

For example:

Alternatively, from the client, the ‘netstat’ CLI command can be used from the Linux client to query the number of TCP connections established:

# netstat -an | grep 2049 | grep EST | sort -k 5

On Linux systems, the ‘netstat’ command line utility typically requires the ‘net-tools’ package to be installed.

Since NFS is a network protocol, when it comes to investigating, troubleshooting, and debugging multipath driver issues, one of the most useful and revealing troubleshooting tools is a packet capturing device or sniffer. These provide visibility into the IP packets as they are transported across the Ethernet network between the Linux client and the PowerScale cluster nodes.

Packet captures (PCAPs) of traffic between client and cluster can be filtered and analyzed by tools such as Wireshark to ensure that requests, authentication, and transmission are occurring as expected across the desired NICs.

Gathering PCAPs is best performed on the Linux client-side and on the multiple specified interfaces if the client is using the ‘–localports’ NFS mount option.

In addition to PCAPs, the following three client-side logs are another useful place to check when debugging a multipath driver issue:

Log file Description
/var/log/kern.log Most client logging is written to this log file
/var/log/auth.log Authentication logging
/var/log/messages Error level message will appear here

 

Verbose logging can be enabled on the Linux client with the following CLI syntax:

# sudo rpcdebug -m nfs -s all

Conversely, the following command will revert logging back to the default level:

# sudo rpcdebug -m nfs -c all

 

Additionally, a ‘dellnfs-ctl’ CLI tool comes packaged with the multipath driver module and is automatically available on the Linux client after the driver module installation.

The command usage syntax for the ‘dellnfs-ctl’ tool is as follows:

# dellnfs-ctl

syntax: /usr/bin/dellnfs-ctl [reload/status/trace]

Note that to operate it in trace mode, the ‘dellnfs-ctl’ tool requires the ‘trace-cmd’ package to be installed.

For example, the trace-cmd package can be installed on an Ubuntu Linux system using the ‘apt install’ package utility command:

# sudo apt install trace-cmd

The current version of the dellnfs-ctl tool, plus the associated services and kernel modules, can be queried with the following CLI command syntax:

# dellnfs-ctl status

version: 4.0.22-dell

kernel modules: sunrpc rpcrdma compat_nfs_ssc lockd nfs_acl auth_rpcgss rpcsec_gss_krb5 nfs nfsv3 nfsv4

services: rpcbind.socket rpcbind rpc-gssd rpc_pipefs: /run/rpc_pipefs

With the ‘reload’ option, the ‘dellnfs-ctl’ tool uses ‘modprobe’ to reload and restart the NFS RPC services. For example:

# dellnfs-ctl reload

dellnfs-ctl: stopping service rpcbind.socket

dellnfs-ctl: umounting fs /run/rpc_pipefs

dellnfs-ctl: unloading kmod nfsv3

dellnfs-ctl: unloading kmod nfs

dellnfs-ctl: unloading kmod nfs_acl

dellnfs-ctl: unloading kmod lockd

dellnfs-ctl: unloading kmod compat_nfs_ssc

dellnfs-ctl: unloading kmod rpcrdma

dellnfs-ctl: unloading kmod sunrpc

dellnfs-ctl: loading kmod sunrpc

dellnfs-ctl: loading kmod rpcrdma

dellnfs-ctl: loading kmod compat_nfs_ssc

dellnfs-ctl: loading kmod lockd

dellnfs-ctl: loading kmod nfs_acl

dellnfs-ctl: loading kmod nfs

dellnfs-ctl: loading kmod nfsv3

dellnfs-ctl: mounting fs /run/rpc_pipefs

dellnfs-ctl: starting service rpcbind.socket

dellnfs-ctl: starting service rpcbind

In the event that a problem is detected, it is important to run the reload script before uninstalling or reinstalling the driver. Because the script runs modprobe, the kernel modules that are modified by the driver will be reloaded.

Note that this will affect existing NFS mounts. As such, any active mounts will need to be re-mounted after a reload is performed.

 

As we have seen throughout this series of articles, the core benefits of the Dell multipath driver include:

  • Better single NFS mount performance.
  • Increased bandwidth for single NFS mount with multiple R/W streaming files.
  • Improved performance for heavily used NIC’s to a single PowerScale Node.

The Dell multipath driver allows NFS clients to direct I/O to multiple PowerScale nodes through a single NFS mount point for higher single-client throughput. This enables Dell to deliver the first Ethernet storage solution validated on NVIDIA’s DGX SuperPOD.

PowerScale Multipath Client Driver – Compiling on OpenSUSE Linux

The previous article in this series explored building the PowerScale multipath client driver from source on Ubuntu Linux. Now we’ll turn our attention to compiling the driver on the OpenSUSE Linux platform.

Unlike the traditional one-to-one NFS server/client mapping, this multipath client driver allows the performance of multiple PowerScale nodes to be aggregated through a single NFS mount point to one or many compute nodes.

Building the PowerScale multipath client driver from scratch, rather than just installing it from a pre-built Linux package, helps guard against minor version kernel mismatches on the Linux client that would result in the driver not installing correctly.

The driver itself is available for download on the Dell Support Site. There is no license or cost for this driver, either for the pre-built Linux package or source code. The zipped tarfile download contains a README doc which provides basic instruction.

For an OpenSUSE Linux client to successfully connect to a PowerScale cluster using the multipath driver, there are a couple prerequisites that must be met:

  • The NFS client system or virtual machine must be running the following OpenSUSE version:
Supported Linux Distribution Kernel Version
OpenSUSE 15.4 5.14.x
  • If RDMA is being configured, the system must contain an RDMA-capable Ethernet NIC, such as the Mellanox CX series.
  • The ‘trace-cmd’ package should be installed, along with NFS client related packages.

For example,:

# zypper install trace-cmd nfs-common
  • Unless already installed, developer tools may also need to be added. For example:
# zypper install rpmbuild tar gzip git kernel-devel

The following CLI commands can be used to verify the kernel version and other pertinent details of the OpenSUSE client:

# uname -a

Unless all the Linux clients are known to be identical, the best practice is to build and install the driver per-client or you may experience failed installs.

Overall, the driver source code compilation process is as follows:

  1. Download the driver source code from the driver download site.
  2. Unpack the driver source code on the Linux client

Once downloaded, this file can be extracted using the Linux ‘tar’ utility. For example:

# tar -xvf <source_tarfile>
  1. Build the driver source code on the Linux client

Once downloaded to the Linux client, the multipath driver package source can be built with the following CLI command:

# ./build.sh bin

A successful build is underway when the following output appears on the console:

…<build takes about ten minutes>

------------------------------------------------------------------

When the build is complete, a package file is created in the ./dist directory, located under the top level source code directory.

  1. Install the driver binaries on the OpenSUSE client
# zypper in ./dist/dellnfs-4.0.22-kernel_5.14.21_150400.24.97_default.x86_64.rpm
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following NEW package is going to be installed:
dellnfs

1 new package to install.
  1. Check installed files
# rpm -qa | grep dell
dellnfs-4.0.22-kernel_5.14.21_150400.24.100_default.x86_64
  1. Reboot
# reboot
  1. Check services are started
# systemctl start portmap
# systemctl start nfs
# systemctl status nfs
nfs.service - Alias for NFS client
Loaded: loaded (/usr/lib/systemd/system/nfs.service; disabled; vendor preset: disabled)
Active: active (exited) since Tues 2024-10-15 15:11:09 PST; 2s ago
Process: 15577 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 15577 (code=exited, status=0/SUCCESS)

Oct 15 15:11:09 CLI22 systemd[1]: Starting Alias for NFS client...
Oct 15 15:11:09 CLI22 systemd[1]: Finished Alias for NFS client.
  1. Check client driver is loaded with dellnfs-ctl script
# dellnfs-ctl status
version: 4.0.22
kernel modules: sunrpc
services: rpcbind.socket rpcbind
rpc_pipefs: /var/lib/nfs/rpc_pipefs

Note, however, that when building and installing on an OpenSUSE virtual instance (VM), additional steps are required.

Since OpenSUSE does not reliably install a kernel-devel kit that matches the running kernel version, this must be forced to happen as follows:

  1. Install dependencies
# zypper install rpmbuild tar gzip git
  1. Install CORRECT kernel-devel package

The recommended way to install kernel-devel package according to OpenSUSE documentation is to use:

# zypper install kernel-default-devel

Beware that the ‘zypper install kernel-default-devel’ command occasionally fails to install the correct kernel-devel package. This can be verified by looking at the following paths:

# ls /lib/modules/

5.14.21-150500.55.39-default

# uname -a

Linux 6f8edb8b881a 5.15.0-91-generic #101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Note that the contents of /lib/modules above does not match the ‘uname’ command output:

‘5.14.21-150500.55.39-default’  vs.  ‘5.15.0-91-generic’

Another issue with installing with ‘kernel-devel’ is that sometimes the /lib/modules/$(uname -r) directory will not include the /build subdirectory.

If this occurs, the client side driver will fail with the following error:

# ls -alh /lib/modules/$(uname -r)/build

ls: cannot access '/lib/modules/5.14.21-150400.24.63-default/build': No such file or directory

....

Kernel root not found

The recommendation is to install the specific kernel-devel package for the client’s Linux version. For example:

# ls -alh /lib/modules/$(uname -r)

total 5.4M

drwxr-xr-x 1 root root  488 Dec 13 19:37 .

drwxr-xr-x 1 root root  164 Dec 13 19:42 ..

drwxr-xr-x 1 root root   94 May  3  2023 kernel

drwxr-xr-x 1 root root   60 Dec 13 19:37 mfe_aac

-rw-r--r-- 1 root root 1.2M May  9  2023 modules.alias

-rw-r--r-- 1 root root 1.2M May  9  2023 modules.alias.bin

-rw-r--r-- 1 root root 6.4K May  3  2023 modules.builtin

-rw-r--r-- 1 root root  17K May  9  2023 modules.builtin.alias.bin

-rw-r--r-- 1 root root 8.2K May  9  2023 modules.builtin.bin

-rw-r--r-- 1 root root  49K May  3  2023 modules.builtin.modinfo

-rw-r--r-- 1 root root 610K May  9  2023 modules.dep

-rw-r--r-- 1 root root 809K May  9  2023 modules.dep.bin

-rw-r--r-- 1 root root  455 May  9  2023 modules.devname

-rw-r--r-- 1 root root  802 May  3  2023 modules.fips

-rw-r--r-- 1 root root 181K May  3  2023 modules.order

-rw-r--r-- 1 root root 1.2K May  9  2023 modules.softdep

-rw-r--r-- 1 root root 610K May  9  2023 modules.symbols

-rw-r--r-- 1 root root 740K May  9  2023 modules.symbols.bin

drwxr-xr-x 1 root root   36 May  9  2023 vdso




# rpm -qf /lib/modules/$(uname -r)/

kernel-default-5.14.21-150400.24.63.1.x86_64 <---------------------

kernel-default-extra-5.14.21-150400.24.63.1.x86_64

kernel-default-optional-5.14.21-150400.24.63.1.x86_64

Take the package name and prefix it with ‘kernel-default-devel’:

====================================================================

# zypper install kernel-default-devel-5.14.21-150400.24.63.1.x86_64

Loading repository data...

Reading installed packages...

The selected package 'kernel-default-devel-5.14.21-150400.24.63.1.x86_64' from repository 'Update repository with updates from SUSE Linux Enterprise 15' has lower version than the installed one. Use 'zypper install --oldpackage kernel-default-devel-5.14.21-150400.24.63.1.x86_64' to force installation of the package.

Resolving package dependencies...

Nothing to do.

# zypper install --oldpackage  kernel-default-devel-5.14.21-150400.24.63.1.x86_64

Loading repository data...

Reading installed packages...

Resolving package dependencies...

The following 2 NEW packages are going to be installed:

kernel-default-devel-5.14.21-150400.24.63.1 kernel-devel-5.14.21-150400.24.63.1

2 new packages to install.

Now the build directory exists:

# ls -alh /lib/modules/$(uname -r)

total 5.4M

drwxr-xr-x 1 root root 510 Dec 13 19:46 .

drwxr-xr-x 1 root root 164 Dec 13 19:42 ..

lrwxrwxrwx 1 root root 54 May 3 2023 build -> /usr/src/linux-5.14.21-150400.24.63-obj/x86_64/default

drwxr-xr-x 1 root root 94 May 3 2023 kernel

drwxr-xr-x 1 root root 60 Dec 13 19:37 mfe_aac

It is less likely you will run into this if you run ‘zypper update’ first. Note that this can take more than fifteen minutes to complete.

Next, reboot the Linux client and then run:

# zypper install kernel-default-devel

In the next and final article of this series, we’ll be looking at the configuration and management of the multipath client driver.

PowerScale Multipath Client Driver – Compiling on Ubuntu Linux

As discussed in the first article in this series, the new PowerScale multipath client driver enables performance aggregation of multiple PowerScale nodes through a single NFS mount point to one or many compute nodes.

There are several good reasons to build the PowerScale multipath client driver from scratch rather than just installing it from a pre-built Linux package. The primary motivation is typically that any minor version kernel mismatch on the Linux client will result in the driver not installing correctly. For example, kernel version 5.4.0-150-generic is incompatible with 5.4.0-167-generic. Both are incompatible with 5.15.0-91-generic, which has an upgraded kernel.

The multipath driver bits are available for download on the Dell Support Site to any customer that has OneFS entitlement:

https://www.dell.com/support/home/en-us/product-support/product/isilon-onefs/drivers

There is no license requirement for this driver, nor charge for it, and it’s provided as both pre-built Linux package, and customer-compliable source code. There’s a README file included with the code that provides basic instruction.

This multipath client driver runs on both physical and virtual machines, and across several popular Linux distros. The following matrix shows the currently supported variants, plus the availability of a pre-compiled driver package and/or self-compilation option.

Linux distribution Kernel version Upstream driver version (minimum) Multipath driver version Package

available

Self-

compile

OpenSUSE 15.4 5.14.x 4.x 1.x Yes Yes
Ubuntu 20.04 5.4.x 4.x 1.x Yes Yes
Ubuntu 22.04 5.15.x 4.x 1.x Yes Yes

While the multipath driver’s major release version—1.x—is correct in the table, the second digit release number will be frequently incremented as updated versions of the multipath client driver are released.

By design, the multipath driver only supports newer and most recent versions of the popular Linux distributions. Older Linux kernel versions often do not support full NFS client functionality, particularly for the ‘–remoteports’ and/or ‘–localports’ mount configuration options. Additionally, older and end-of-life Linux versions can often present significant security risks, especially once current vulnerability patches and hotfixes are no longer being made available.

Both x86 CPU architectures and GPU-based platforms, such as the NVIDIA DGX range, are supported.

Linux system Processor type Example
Physical CPU Dell PE R760
Physical GPU Dell PE XE9680

NVIDIA DGX H100

Virtual machine CPU VMware ESXi
Virtual machine GPU VMware vDGA

While there is no specific NFS or OneFS core configuration required on the PowerScale cluster side when using Linux clients with the Dell multipath driver, there are a couple of basic prerequisites The following OneFS support matrix on the top right of this slide lays out which driver functionality is available in what release, from OneFS 9.5 to current.

Version NFSv3, NFSv4.1 TCP NFSv3 RDMA NFSv4.1 RDMA NVIDIA SuperPOD
OneFS 9.5 Yes Yes No No
OneFS 9.7 Yes Yes Yes No
OneFS 9.9 Yes   Yes  Yes Yes

Also note that OneFS 9.9 is required for any NVIDIA SuperPOD deployments, because there are some performance optimizations in 9.9 specifically for that platform.

The following CLI commands can be run on the PowerScale cluster to verify its compatibility. The cluster’s current OneFS version can be easily determined using the following CLI command:

# uname -or

Isilon OneFS 9.9.0.0

Also, to confirm RDMA is supported and enabled:

# isi nfs settings global view | grep -i RDMA

   NFS RDMA Enabled: Yes

Additionally, both the dynamic and static network pools can be configured on the cluster for use with the multipath driver. If F710 nodes are being deployed in the cluster, OneFS 9.7 or later is required.

Note that when deploying an NVIDIA SuperPOD or BasePOD solution, the reference architecture mandates a PowerScale cluster composed of F710 all-flash nodes running OneFS 9.9 or later.

For a Linux client to successfully connect to a PowerScale cluster using the multipath driver, there are a few prerequisites that must be met:

  • The NFS client system or virtual machine must be running one of the following Linux versions:
Supported Linux Distribution Kernel Version
OpenSUSE 15.4 5.14.x
Ubuntu 20.04 5.4.x
Ubuntu 22.04 5.15.x

By design, the multipath driver only supports newer and most recent versions of the popular Linux distributions. Older Linux kernels often don’t include full NFS client functionality, particularly for the ‘–remoteports’ and ‘–localports’ mount options.

  • If RDMA is being configured, the client must contain an RDMA-capable Ethernet NIC, such as the Mellanox CX series.
  • The Linux client should have the ‘trace-cmd’ package installed, along with NFS client related packages.

For example, on an Ubuntu system:

# sudo apt install trace-cmd nfs-common

The following CLI commands can be used to verify the kernel version, and other pertinent details, of a Linux client:

# uname -a

Similarly, depending on the flavor of Linux, the following commands will show the details of the particular distribution:

# lsb_release -a

Or:

# cat /etc/os-release

 For security purposes, the driver downloads are signed with SHA256. Using Debian linux as an example, the driver components can be accessed as follows:

  1. Verify the SHA256 Checksum Manually

If you have a separate SHA256 checksum file, you can verify the checksum manually as follows:

First, calculate the checksum:

# sha256sum dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb.signed

Then compare the output with the checksum provided.

  1. Extract the Signed Package

First, if not already available, install the necessary extraction tools:

# sudo apt-get install ar

Next, extract the signed driver file:

# ar x dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb.signed

This should yield multiple extracted files, including the following:

 control.tar.xz, data.tar.xz, and debian-binary.

Unless all the Linux clients are known to be identical, the best practice is to build and install the driver per-client or you may experience failed installs.

Within Ubuntu (and certain other Linux distros) , Dynamic Kernel Module Support (DKMS) can be used to allow kernel modules to be generated from sources outside the kernel source tree. As such, DKMS modules can be automatically rebuilt when a new kernel is installed, enabling drivers to continue operating after a Linux kernel upgrade. Similarly, DKMS can enable a new driver to be installed on a Linux client running a slightly different kernel version, without the need for manual recompilation.

Considerations when building the multipath driver from source include:

  • The recommendation is to build the multipath driver on all supported Linux versions unless all NFS clients have exactly the same kernel versions.
  • Since a multipath driver binary install package is not provided for NVIDIA DGX platforms, the driver must be manually built with DKMS.

Dependent packages that should be added before building and installing the multipath driver include:

OS Dependent package Install command
Ubuntu 20 debhelper sudo apt-get install debhelper
Ubuntu 22 debhelper

nfs-kernel-server

sudo apt-get install debhelper

sudo apt-get install nfs-kernel-server

The driver source code compilation process is as follows:

  1. Download the driver source code from the repository

As mentioned previously, the driver source code can be obtained as a tarfile from the Dell Support driver download site.

  1. Unpack the driver source code on the Linux client

Once downloaded, this file can be extracted using the ‘tar’ utility. For example:

# tar -xvf <source_tarfile>
  1. Build the driver source code on the Linux client

Once downloaded to the Linux client, the multipath driver package source can be built with the following CLI command:

# ./build.sh bin

Note that, in general, a successful build is underway when the following output appears on the console:

…<build takes about ten minutes>

------------------------------------------------------------------

When the build is complete, a package file is created in the ./dist directory, located under the top level source code directory. For example:

# ls -lsia ./dist

total 1048-rw-r--r-- 1 root root 1069868 Mar 24 16:23 dellnfs-modules_4.0.22-dell.kver.5.15.0-89-generic_amd64.deb

If the kernel version, Linux distribution, or OFED are not supported, an error message will be displayed during the build process.

  1. Install the driver binaries on the Linux client

The previous blog article in this series describes the process for adding the driver binary package to a Linux client.

That said, when building the driver on Ubuntu, there are a couple of idiosyncrasies to be aware of.

When running Ubuntu 18.04 or later, the “dellnfs-ctl” script can be used to reload the NFS client modules as follows:

# dellnfs-ctl reload

Also, Ubuntu 22.x clients should also have the ‘nfs-kernel-server’ package installed. For example:

# sudo apt install nfs-kernel-server
  1. Install the *.deb package generated:
# sudo apt-get install ./dist/dellnfs-modules_*-generic_amd64.deb
  1. Regenerate running kernel image:
# sudo update-initramfs -u -k `uname -r`
  1. Check that the package is installed correctly:
# dpkg -l | grep dellnfs-modules

dellnfs-modules   2:4.0.22-dell.kver.5.4.0-150-generic amd64        NFS RDMA kernel modules

# dpkg -S /lib/modules/`uname -r`/updates/bundle/net/sunrpc/xprtrdma/rpcrdma.ko

dellnfs-modules: /lib/modules/5.4.0-150-generic/updates/bundle/net/sunrpc/xprtrdma/rpcrdma.ko
  1. Optionally reboot the Linux client:
# reboot
  1. Verify the module is running:
# dellnfs-ctl status

version: 4.0.22-dell
kernel modules: sunrpc rpcrdma compat_nfs_ssc lockd nfs_acl nfs nfsv3
services: rpcbind.socket rpcbind rpc-gssdrpc_pipefs: /run/rpc_pipefs

For NVIDIA DGX clients in particular, the Mellanox OpenFabrics Enterprise Driver for Linux (MLNX_OFED) is a single Virtual Protocol Interconnect (VPI) software stack that operates across the network adapters in DGX systems. MLNX_OFED is an NVIDIA-tested and packaged version of OFED which supports Ethernet, as well as Infiniband, using RDMA and kernel bypass APIs (OFED verbs).

When building the multipath driver on an DGX platform, or alongside a DKMS install of the Mellanox OFED driver package, there are a few extra steps required beyond the package manager install itself.

If a DKMS install is required by your system (typically NVDIA DGX platforms), the package will be formatted with the term ‘multikernel’ in the package name. For example:

# ls

dellnfs-dkms_4.5-OFED.4.5.1.0.1.1.gb4fdfac.multikel_all.deb

This indicates the package is built for DKMS and therefore must be installed by DKMS. After installing with your package manager, the following files will be present under the /usr/src directory:

# ls /usr/src

dellnfs-99.4.5  kernel-mft-dkms-4.22.1           linux-hwe-5.19-headers-5.19.0-45  mlnx-ofed-kernel-5.8  srp-5.8

iser-5.8        knem-1.1.4.90mlnx3               mlnx-nfsrdma-5.8                  ofa_kernel

isert-5.8       linux-headers-5.19.0-45-generic  mlnx-nvme-5.8                     ofa_kernel-5.8

Next, the following CLI command will install the driver:

# dkms install -m dellnfs -v 99.4.5

Creating symlink /var/lib/dkms/dellnfs/99.4.5/source ->

/usr/src/dellnfs-99.4.5

Kernel preparation unnecessary for this kernel. Skipping...

Building module:

cleaning build area...

./_dkms-run.sh -j8 KVER=5.19.0-45-generic

K_BUILD=/lib/modules/5.19.0-45-generic/build......................................

Once the DKMS installation is complete, either reload the driver with the ‘dellnfs-ctl’ utility, or reboot the client:

# dellnfs-ctl reload

Or:

# reboot

In the next article in this series, we’ll turn our attention to compiling the driver source on the OpenSUSE Linux platform.

PowerScale Multipath Client Driver Pre-built Package Installation

As mentioned in the first article in this series, the PowerScale multipath client driver driver is able to aggregate the performance of multiple PowerScale nodes through a single NFS mount point to one or more Linux clients.

The driver itself is a kernel module, and that means it needs to be installed alongside a corresponding kernel version to that which it was built on. Version matching is strict, right down to the minor build version.

There are two installation options provided for the PowerScale multipath client driver:

  • As a pre-built binary installation package for each of the supported Linux distribution versions listed below.
  • Or via source code under the GPL 2 open source license, which can be compiled at a customer site.

This article covers the first option, and outlines the steps involved with the installation of the pre-built binary driver package for the following, currently supported Linux versions:

Linux distribution Kernel version Upstream driver version (minimum) Multipath driver version Package

available

OpenSUSE 15.4 5.14.x 4.x 1.x ü
Ubuntu 20.04 5.4.x 4.x 1.x ü
Ubuntu 22.04 5.15.x 4.x 1.x ü

Package installation is typically handled best by the client Linux distro’s native package manager. Since it’s a kernel module, installing and updating the driver typically requires a reboot. PowerScale engineering anticipate periodically releasing updated driver packages to keep up with Linux on the supported platforms list, as well as fix bugs and add additional functionality.

This multipath client driver runs on both physical and virtual machines, and both x86 CPU architectures and GPU-based platforms, such as the NVIDIA DGX range, are supported.

While there is no specific NFS or OneFS core configuration required on the PowerScale cluster side when using Linux clients with the Dell multipath driver, there are a couple of basic prerequisites. The following OneFS support matrix on the top right of this slide lays out which driver functionality is available in what release, from OneFS 9.5 to current.

Version NFSv3, NFSv4.1 TCP NFSv3 RDMA NFSv4.1 RDMA NVIDIA SuperPOD
OneFS 9.5 Yes Yes No No
OneFS 9.7 Yes Yes Yes No
OneFS 9.9 Yes Yes Yes Yes

Also note that OneFS 9.9 is required for any NVIDIA SuperPOD deployments, because there are some performance optimizations in 9.9 specifically for that platform.

The following CLI commands can be run on the PowerScale cluster to verify its compatibility. The cluster’s current OneFS version can be easily determined using the following CLI command:

# uname -or

Isilon OneFS 9.9.0.0

Also, to confirm RDMA is supported and enabled:

# isi nfs settings global view | grep -i RDMA

   NFS RDMA Enabled: Yes

Additionally, both the dynamic and static network pools can be configured on the cluster for use with the multipath driver. If F710 nodes are being deployed in the cluster, OneFS 9.7 or later is required.

Note that when deploying an NVIDIA SuperPOD or BasePOD solution, the reference architecture mandates a PowerScale cluster composed of F710 all-flash nodes running OneFS 9.9 or later.

For a Linux client to successfully connect to a PowerScale cluster using the multipath driver, there are a few prerequisites that must be met, in addition to running one of the Linux versions listed above. These include:

  • If RDMA is being configured, the client must contain an RDMA-capable Ethernet NIC, such as the Mellanox CX series.
  • The Linux client should have the ‘trace-cmd’ package installed, along with NFS client related packages.

For example, on an Ubuntu system:

# sudo apt install trace-cmd nfs-common

The following CLI commands can be used to verify the kernel version, and other pertinent details, of a Linux client:

# uname -a

Similarly, depending on the flavor of Linux, the following commands will show the details of the particular distribution:

# lsb_release -a

Or:

# cat /etc/os-release

The multipath client driver is available for download on the Dell Support Site to any customer that has OneFS entitlement: For security purposes, the download files are signed with SHA256. Using Debian linux as an example, the driver components can be accessed as follows:

  1. Verify the SHA256 Checksum Manually

The downloaded driver package’s authenticity can be manually verified via its SHA256 checksum as follows:

First, calculate the checksum on the signed driver package:

# sha256sum dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb.signed

Then compare the output with the value in the accompanying checksum file.

  1. Extract the Signed Package

First, if not already available, install the necessary extraction tools:

# sudo apt-get install ar

Next, extract the signed driver file:

# ar x dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb.signed

This should yield multiple extracted files, including the following:

 control.tar.xz, data.tar.xz, and debian-binary.
  1. Repackage the File

If needed, the extracted contents can be repackaged into a standard ‘.deb’ package file:

# ar rcs dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb debian-binary control.tar.xz data.tar.xz
  1. Install the Repackaged Debian File

Once repackaged, you can install the Debian package using the ‘dpkg’ utility. For example:

# sudo dpkg -i dellnfs-modules_4.0.24-Dell-Technologies.kver.5.4.0-190-generic_amd64.deb

Package installation is handled using the native package manager, and each of the supported Linux distributions uses the following format and package installation utility:

Linux Distribution Package Manager Package Utility
OpenSUSE RPM Zipper
Ubuntu Deb apt-get / dpkg

The RPM and DEB packages can either be obtained from the Dell download site or built manually at the customer site.

The multipath client driver is provided as a pre-built binary installation package for each of the supported Linux distribution versions. The process for package installation varies slightly across the different Linux versions, but the basic process is as follows:

Note that the multipath driver installation does require a reboot of the Linux system.

For Ubuntu, the following procedure describes how to install the multipath driver package:

  1. Download the DEB package.
  2. Verify that DEB package and Kernel Version Match.

Compare the package version and the kernel version to ensure they are an exact match. If they are not an exact  match, do not install the package. Instead, build the driver from source according to the instructions in the “Building and Installing the Driver’ section later in this document.

  1. Install the DEB package.
# sudo apt-get install ./dist/dellnfs-modules_*-generic_amd64.deb
  1. Check package is installed correctly.
# dpkg -l | grep dellnfs-modules

dellnfs-modules   2:4.0.22-dell.kver.5.4.0-150-generic amd64        NFS RDMA kernel modules

# dpkg -S /lib/modules/`uname -r`/updates/bundle/net/sunrpc/xprtrdma/rpcrdma.ko

dellnfs-modules: /lib/modules/5.4.0-150-generic/updates/bundle/net/sunrpc/xprtrdma/rpcrdma.ko

Regenerate the running kernel image.

# sudo update-initramfs -u -k `uname -r`
  1. Reboot the Linux client.
# reboot
  1. Confirm the module and services are running.
# dellnfs-ctl status

version: 4.0.22-dell
kernel modules: sunrpc rpcrdma compat_nfs_ssc lockd nfs_acl nfs nfsv3
services: rpcbind.socket rpcbind rpc-gssdrpc_pipefs: /run/rpc_pipefs

If the services are not up, run the ‘dellnfs-ctl reload’ command to start the services.

  1. Verify an NFS mount

Attempt a mount, since NFS occasionally needs to create a symlink to rpc.statd. For example:

# mount -t nfs -o proto=rdma,port=20049,rsize=1048576,wsize=1048576,vers=3,nconnect=32,remoteports=10.231.180.95-10.231.180.98,remoteports_offset=1 10.231.180.95:/ifs/data/fio /mnt/test/

Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service → /lib/systemd/system/rpc-statd.service.

In the above, a symlink created on the first mount. Perform a reload to confirm the service is running correctly. For example:

# dellnfs-ctl reload

dellnfs-ctl: stopping service rpcbind.socket

dellnfs-ctl: umounting fs /run/rpc_pipefs

dellnfs-ctl: unloading kmod nfsv3

dellnfs-ctl: unloading kmod nfs

dellnfs-ctl: unloading kmod nfs_acl

dellnfs-ctl: unloading kmod lockd

dellnfs-ctl: unloading kmod compat_nfs_ssc

dellnfs-ctl: unloading kmod rpcrdma

dellnfs-ctl: unloading kmod sunrpc

dellnfs-ctl: loading kmod sunrpc

dellnfs-ctl: loading kmod rpcrdma

dellnfs-ctl: loading kmod compat_nfs_ssc

dellnfs-ctl: loading kmod lockd

dellnfs-ctl: loading kmod nfs_acl

dellnfs-ctl: loading kmod nfs

dellnfs-ctl: loading kmod nfsv3

dellnfs-ctl: mounting fs /run/rpc_pipefs

dellnfs-ctl: starting service rpcbind.socket

dellnfs-ctl: starting service rpcbind

 

Similarly, for OpenSUSE and SLES, the driver package installation steps are as follows:

  1. Download the driver RPM package.
  2. Verify that RPM package and Kernel Version Match.

Compare the package version and the kernel version to ensure they are an exact match. If they are not an exact  match, do not install the package. Instead, build the driver from source according to the instructions in the “Building and Installing the Driver’ section later in this document.

  1. Install the downloaded RPM package.
# zypper in ./dist/dellnfs-4.0.22-kernel_5.14.21_150400.24.97_default.x86_64.rpm
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following NEW package is going to be installed:
  dellnfs

1 new package to install.
  1. Check the installed files.
# rpm -qa | grep dell
dellnfs-4.0.22-kernel_5.14.21_150400.24.100_default.x86_64
  1. Reboot the Linux client.
# reboot
  1. Verify the services are started.
# systemctl start rpcbind
# systemctl start nfs
# systemctl status nfs
nfs.service - Alias for NFS client
     Loaded: loaded (/usr/lib/systemd/system/nfs.service; disabled; vendor preset: disabled)
     Active: active (exited) since Wed 2023-12-13 15:11:09 PST; 2s ago
    Process: 15577 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 15577 (code=exited, status=0/SUCCESS)
  1. Verify the client driver is loaded with ‘dellnfs-ctl’ script.
# dellnfs-ctl status
version: 4.0.22
kernel modules: sunrpc
services: rpcbind.socket rpcbind
rpc_pipefs: /var/lib/nfs/rpc_pipefs
  1. Verify an NFS mount

Attempt a mount, since NFS occasionally needs to create a symlink to rpc.statd. For example:

# mount -t nfs -o proto=rdma,port=20049,rsize=1048576,wsize=1048576,vers=3,nconnect=32,remoteports=10.231.180.95-10.231.180.98,remoteports_offset=1 10.231.180.95:/ifs/data/fio /mnt/test/

Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service → /lib/systemd/system/rpc-statd.service.

In the above, a symlink is created on the first mount. Next, perform a reload to confirm the service is running correctly. For example:

# dellnfs-ctl reload

dellnfs-ctl: stopping service rpcbind.socket

dellnfs-ctl: umounting fs /run/rpc_pipefs

dellnfs-ctl: unloading kmod nfsv3

dellnfs-ctl: unloading kmod nfs

dellnfs-ctl: unloading kmod nfs_acl

dellnfs-ctl: unloading kmod lockd

dellnfs-ctl: unloading kmod compat_nfs_ssc

dellnfs-ctl: unloading kmod rpcrdma

dellnfs-ctl: unloading kmod sunrpc

dellnfs-ctl: loading kmod sunrpc

dellnfs-ctl: loading kmod rpcrdma

dellnfs-ctl: loading kmod compat_nfs_ssc

dellnfs-ctl: loading kmod lockd

dellnfs-ctl: loading kmod nfs_acl

dellnfs-ctl: loading kmod nfs

dellnfs-ctl: loading kmod nfsv3

dellnfs-ctl: mounting fs /run/rpc_pipefs

dellnfs-ctl: starting service rpcbind.socket

dellnfs-ctl: starting service rpcbind

Unlike installing the driver, removing it does not typically require a client reboot, and can also be performed via the appropriate package manager for the Linux flavor.

Also, be aware that there are no upgrade or patching systems for the driver. This means that if a client’s kernel version is updated, the module must be re-built or a matching package re-installed. And correspondingly, if there is an update to the dellnfs package, the module must be re-installed.

# sudo update-initramfs -u -k `uname -r`

Next, perform a reload to confirm the service is running correctly. For example:

# dellnfs-ctl reload

dellnfs-ctl: stopping service rpcbind.socket

dellnfs-ctl: umounting fs /run/rpc_pipefs

dellnfs-ctl: unloading kmod nfsv3

dellnfs-ctl: unloading kmod nfs

dellnfs-ctl: unloading kmod nfs_acl

dellnfs-ctl: unloading kmod lockd

dellnfs-ctl: unloading kmod compat_nfs_ssc

dellnfs-ctl: unloading kmod rpcrdma

dellnfs-ctl: unloading kmod sunrpc

dellnfs-ctl: loading kmod sunrpc

dellnfs-ctl: loading kmod rpcrdma

dellnfs-ctl: loading kmod compat_nfs_ssc

dellnfs-ctl: loading kmod lockd

dellnfs-ctl: loading kmod nfs_acl

dellnfs-ctl: loading kmod nfs

dellnfs-ctl: loading kmod nfsv3

dellnfs-ctl: mounting fs /run/rpc_pipefs

dellnfs-ctl: starting service rpcbind.socket

dellnfs-ctl: starting service rpcbind

Note that there are no upgrade or patching systems available for the ‘dellnfs’ multipath driver module. If a Linux client’s kernel version is updated, the module must be rebuilt or a matching package reinstalled. Similarly, if there is an update to the ‘dellnfs’ package, the module must be reinstalled.

Uninstalling the multipath client driver does not require a reboot and can be performed using the standard package manager for the pertinent Linux distribution. This unloads the loaded module and then removes the files. As such, uninstallation is a fairly trivial process.

The package manager commands for each Linux version to remove a package are as follows:

OS Package removal command
Ubuntu sudo apt-get autoremove <package_name>
OpenSUSE/SLES zypper remove -u <package_name>

In the next article in this series, we’ll take a look at the specifics of multipath driver binary package installation.

PowerScale Multipath Client Driver and AI Enablement

In order for success with large AI model customization, inferencing, and training, GPUs require data served to them quickly and efficiently. Compute and storage must be architected and provisioned accordingly, in order to eliminate potential bottlenecks in the infrastructure.

To meet this demand, the new PowerScale multipath client driver enables performance aggregation of multiple PowerScale nodes through a single NFS mount point to one or many compute nodes. As a result, this driver, in conjunction with OneFS GPUDirect support, has enabled Dell to deliver the first Ethernet storage solution to be certified for NVIDIA’s DGX SuperPOD.

SuperPOD is an AI-optimized data center architecture that delivers the formidable computational power required to train deep learning (DL) models at scale, accelerating time to outcomes which drive future innovation.

Using DGX A200, B200, or H200 GPU-based compute in concert with a PowerScale F710 clustered storage layer, NVIDIA’s SuperPOD is able to deliver groundbreaking performance.

Deployed as a fully integrated scalable system, SuperPOD is purpose-built for solving challenging computational problems across a diverse range of AI workloads. These include streamlining supply chains, building large language models, and extracting insights from petabytes of unstructured data.

The performance envelope delivered by DGX SuperPOD enables rapid multi-node training of LLMs at significant scale. This integrated approach of provisioning, management, compute, networking, and fast storage, enables a diverse system that can span data analytics, model development, and AI inferencing, right up to the largest, most complex transformer-based AI workloads, deep learning systems, and trillion-parameter generative AI models.

To drive the throughput required for larger NVIDIA SuperPOD deployments, NFS client connectivity to a PowerScale cluster needs to utilize both RDMA and nconnect, in addition to GPUDirect.

While the native Linux NFS stack supports their use, it does not allow nconnect and RDMA to be configured simultaneously.

To address this, the multipath driver permits Linux NFS clients to use RDMA in conjunction with nconnect mount options, while also increasing the maximum nconnect limit from 16 to 64 connections. Additionally, the SuperPOD solution mandates the use of the ‘localports_failover’ NFS mount option, which only works with RDMA currently.

The Dell multipath client driver can be of considerable performance benefit for workloads with streaming reads and writes to and from individual high-powered servers, particularly to multiple files within a single NFS mount – in addition to SuperPOD and BasePOD AI workloads. Conversely, single file streams, and multiple concurrent writes to the same file across multiple nodes typically don’t benefit substantially from the multipath driver.

Without the multipath client driver, a single NFS mount can only route to one PowerScale storage node IP address.

By way of contrast, the multipath driver allows NFS clients to direct I/O to multiple PowerScale nodes for higher aggregate single-client throughput.

The multipath driver enables a single NFS mount point to route to multiple node IP addresses. A group of IP addresses consists of one logical NFS client with the remote endpoint (cluster) using multiple remote machines (nodes), implementing a distributed server architecture.

The principle NFS mount options of interest with the multipath client driver are:

Mount option Description
nconnect Allows the admin to specify the number of TCP connections the client can establish between itself and the NFS server. It works with remoteports to spread load across multiple target interfaces.
localports Mount option that allows a client to use its multiple NICs to multiplex I/O.
localports_failover Mount option allowing transports to temporarily move from local client interfaces that are unable to serve NFS connections.
proto The underlying transport protocol that the NFS mount will use. Typically, either TCP or RDMA.
remoteports Mount option that allows a client to target multiple servers/ NICS to multiplex I/O. Remoteports spreads the load to multiple file handles instead of taking a single file handle to avoid thrashing on locks.
version The version of the NFS protocol that is to be used. The multipath driver supports NFSv3, NFSv4.1, and NFSv4.2. Note that NFSv4.0 is unsupported.

There are also several advanced mount options which can be useful to squeeze out some extra throughput, particularly with SuperPOD deployments. These options include ‘remoteport offsets’, which can help with loading up L1 cache, and’ spread reads and writes’, which can assist with load balancing.

The Dell multipath driver is available for download on the Dell Support Site to any customer that has OneFS entitlement:

https://www.dell.com/support/home/en-us/product-support/product/isilon-onefs/drivers

There is no license requirement for this driver, nor charge for it, and its provided as both pre-built Linux package, and customer-compliable source code. There’s a README file included with the code that provides basic instruction.

This multipath client driver runs on both physical and virtual machines, and across several popular Linux distros. The following matrix shows the currently supported variants, plus the availability of a pre-compiled driver package and/or self-compilation option.

Linux distribution Kernel version Upstream driver version (minimum) Multipath driver version Package

available

Self-

compile

OpenSUSE 15.4 5.14.x 4.x 1.x ü ü
Ubuntu 20.04 5.4.x 4.x 1.x ü ü
Ubuntu 22.04 5.15.x 4.x 1.x ü ü

While the multipath driver’s major release version—1.x—is correct in the table, the second digit release number will be frequently incremented as updated versions of the multipath client driver are released.

By design, the multipath driver only supports newer and most recent versions of the popular Linux distributions. Older Linux kernel versions often do not support full NFS client functionality, particularly for the ‘–remoteports’ and/or ‘–localports’ mount configuration options. Additionally, older and end-of-life Linux versions can often present significant security risks, especially once current vulnerability patches and hotfixes are no longer being made available.

Both x86 CPU architectures and GPU-based platforms, such as the NVIDIA DGX range, are supported.

Linux system Processor type Example
Physical CPU Dell PE R760
Physical GPU Dell PE XE9680

NVIDIA DGX H100

Virtual machine CPU VMware ESXi
Virtual machine GPU VMware vDGA

While there is no specific NFS or OneFS core configuration required on the PowerScale cluster side for multipath driver support, there are a couple of basic prerequisites The following OneFS support matrix on the top right of this slide lays out which driver functionality is available in what release, from OneFS 9.5 to current.

Version NFSv3, NFSv4.1 TCP NFSv3 RDMA NFSv4.1 RDMA NVIDIA SuperPOD
OneFS 9.5 Yes Yes No No
OneFS 9.7 Yes Yes Yes No
OneFS 9.9 Yes Yes Yes Yes

Also note that OneFS 9.9 is required for any NVIDIA SuperPOD deployments, because there are some performance optimizations in 9.9 specifically for that platform.

Additionally, both the dynamic and static network pools can be configured on the cluster for use with the multipath driver. If F710 nodes are being deployed in the cluster, OneFS 9.7 or later is required.

Note that when deploying an NVIDIA SuperPOD or BasePOD solution, the reference architecture mandates a PowerScale cluster composed of F710 all-flash nodes running OneFS 9.9 or later.

 

For a Linux client to successfully connect to a PowerScale cluster using the multipath driver it currently must be running one of the following Linux flavors:

Supported Linux Distribution Kernel Version
OpenSUSE 15.4 5.14.x
Ubuntu 20.04 5.4.x
Ubuntu 22.04 5.15.x

By design, the multipath driver only supports newer and most recent versions of the popular Linux distributions. Older Linux kernels often don’t include full NFS client functionality, particularly for the ‘–remoteports’ and ‘–localports’ mount options. You will also likely notice the conspicuous absence of Red Hat Enterprise Linux from this matrix. However, engineering do anticipate supporting both RHEL 8 and 9 in a near-future version.

There are also a couple of additional client prerequisites that must be met:

  • If RDMA is being configured, the client must contain an RDMA-capable Ethernet NIC, such as the Mellanox CX series.
  • The Linux client should have the ‘trace-cmd’ package installed, along with NFS client related packages.

In the next article in this series, we’ll take a look at the specifics of the multipath driver binary package installation.

OneFS QoS and DSCP Tagging – Configuration and Management

As we saw in the previous article in this series, OneFS 9.9 introduces support for DSCP marking, and the configuration is cluster-wide, and based on the class of network traffic. This is performed by the OneFS firewall, which inspects outgoing network traffic on the front-end ports and assigns it to the appropriate QoS class based on a set of DSCP tagging rules:

Configuration-wise, DSCP requires OneFS 9.9 or later, and is disabled by default – both for new installations and legacy cluster upgrades. The QoS feature can be configured through the CLI, WebUI, and pAPI endpoints. And for clusters that are upgrading to OneFS 9.9, the release must be committed before DSCP configuration can proceed.

Before enabling DSCP tagging, verify the current firewall and DSCP settings:

# isi network firewall settings view

Enabled: True

DSCP Enabled: False

Update these as required, remembering that both the firewall and DSCP must be running in order for QoS tagging to work. DSCP is off by default, but can be easily started with the following CLI syntax:

# isi network firewall settings modify dscp-enabled true

The OneFS DCSP implementation includes four default tagging rules:

Class  Traffic   Default DSCP Value  Source Ports  Destination Ports 
Transactional File Access and Sharing Protocols:

NFS, FTP, HTTPS data, HDFS, S3, RoCE

Security and Authentication Protocols:

Kerberos, LDAP, LSASS, DCE/RPC

RPC and Inter-Process Communication Protocols:

rpc.bind, mountd, statd, lockd, quotd, mgmntd

Naming Services Protocols: NetBIOS, Microsoft-DS

18 20, 21, 80, 88, 111, 135 137, 138, 139, 300, 302, 304, 305, 306, 389, 443, 445, 585, 636, 989, 990, 2049, 3268, 3269, 8020, 8082, 8440, 8441, 8443, 9020, 9021 Not defined by default, but administrator may configure.
Network Management WebUI, SSH, SMTP, syslog, DNS, NTP, SNMP, Perf collector, CEE, alerts 16 22, 25, 53, 123, 161, 162, 514, 6514, 6567, 8080, 9443, 12228 Not defined by default, but administrator may configure.
Bulk Data SmartSync, SyncIQ, NDMP 10 2097, 2098, 3148, 3149, 5667, 5668, 7722, 8470, 10000 Not defined by default, but administrator may configure.
Catch-All All other traffic that does not match any of the above 0 all Not defined by default, but administrator may configure.

The ‘isi network firewall dscp list’ command can be used to view all of a cluster’s DSCP firewall rules. For example:

# isi network firewall dscp list
DSCP Rules in Priority Order From High To Low:
ID                      Description                      DSCP Value  Src Ports  Dst Ports
------------------------------------------------------------------------------------------
rule_transactional_data DSCP Rule for transactional data 18          20         -
                                                                     21
                                                                     80
                                                                     88
                                                                    111
                                                                    135
                                                                    137
                                                                    138
                                                                    139
                                                                    300
                                                                    302
                                                                    304
                                                                    305
                                                                    306
                                                                    389
                                                                    443
                                                                    445
                                                                    585
                                                                    636
                                                                    989
                                                                    990
                                                                   2049
                                                                   3268
                                                                   3269
                                                                   8020
                                                                   8082
                                                                   8440
                                                                   8441
                                                                   8443
                                                                   9020
                                                                   9021
                                                                  20049

rule_network_management DSCP Rule for network management 16          22         -
                                                                     25
                                                                     53
                                                                    123
                                                                    161
                                                                    162
                                                                    514
                                                                   6514
                                                                   6567
                                                                   8080
                                                                   9443
                                                                  12228

rule_bulk_data          DSCP Rule for bulk data          10          2097       -
                                                                   2098
                                                                   3148
                                                                   3149
                                                                   5667
                                                                   5668
                                                                   7722
                                                                   8470
                                                                  10000

rule_best_effort        DSCP Rule for best effort        0           all        all
------------------------------------------------------------------------------------------
Total: 4

If desired, the ‘isi network firewall dscp modify’, followed by the appropriate rule name, can be used to modify a rule’s associated DSCP value, source ports, or destination ports. For example:

# isi network firewall dscp modify rule_transactional_data –src-port 123 –dst-ports 456 –dscp-value 10

Note that a ‘–live’ option is also available to effect the changes immediately on active rules. If the –live option is used when DSCP is inactive, the command is automatically rejected.

If needed, all of the DSCP configuration can be easily reset to its OneFS defaults and DSCP disabled as follows:

# isi network firewall reset-dscp-setting

This command will reset the global firewall DSCP setting to the original system defaults. Are you sure you want to continue? (yes/[no]): yes

GUI-wise, DSCP has a new ‘settings’ tab under the WebUI’s firewall section for managing its operation and configuration, and editing the rules:

Again, although the DSCP feature can be configured and enabled with the firewall itself still disabled, DSCP will only activate once the firewall is up and running too.

The WebUI allows modification of a rule’s associated DSCP value, source ports, or destination ports. For example:

Like the CLI, the WebUI also has a ‘Reset Default Settings’ option which clears all the current DSCP configuration parameters and resets them to the OneFS defaults:

Also, there’s a comprehensive set of RESTful platform API endpoints, which include:

  • GET/PUT platform/network/firewall/settings
  • POST platform/network/firewall/reset-dscp-setting?live=true
  • GET platform/network/firewall/dscp
  • PUT platform/network/firewall/dscp/<rule_name>?live=true

All DSCP’s configuration data is stored in gconfig at the cluster level, and all the firewall daemon instances across the nodes work as peers. So if it becomes necessary to troubleshooting QoS and tagging, the following logs and utilities are a great place to start.

  • /var/log/isi_firewall_d.log, which includes information from the Firewall daemon.
  • /var/log/isi_papi_d.log, which covers all the command handlers, including the firewall and DSCP related ones.
  • ‘isi_gconfig -t firewall’ utility, which returns all the firewall’s configuration info.
  • ‘ipfw show’ command, which dumps the kernel’s ipfw table.

Also note that all these logs and command outputs are included in a standard isi_gather_info log collection.

OneFS QoS and DSCP Tagging

As more applications contend for shared network links with finite bandwidth, ensuring Quality of Service (QoS) becomes more critical. Each application or workload can have varying QoS requirements to deliver not only service availability, but also an optimal client experience. Associating each app with an appropriate QoS marking helps provide some traffic policing, by allowing certain packets to be prioritized across a shared network, all while meeting SLAs.

QoS can be implemented using a variety of methods, but the most common is through a Differentiated Services Code Point, or DSCP, which specifies a value in the packet header that maps to a traffic effort level.

OneFS 9.9 introduces support for DSCP marking, and the configuration is cluster-wide, and based on the class of network traffic. Once configured, OneFS inserts the DSCP marking in the Traffic Class or Type of Service fields of the IP packet header, and away you go.

The pertinent part of each IPv4 and IPv6 packet header is as follows:

OneFS QoS tagging separates network traffic into four default classes, each with an associated DSCP value, plus configurable source and destination ports. The four classes OneFS provides are ‘transactional’, ‘network management’, ‘bulk data’, and ‘catch all’:

Class  Traffic   Default DSCP Value  Source Ports  Destination Ports 
Transactional File Access and Sharing Protocols:

NFS, FTP, HTTPS data, HDFS, S3, RoCE

Security and Authentication Protocols:

Kerberos, LDAP, LSASS, DCE/RPC

RPC and Inter-Process Communication Protocols:

rpc.bind, mountd, statd, lockd, quotd, mgmntd

Naming Services Protocols: NetBIOS, Microsoft-DS

18 20, 21, 80, 88, 111, 135 137, 138, 139, 300, 302, 304, 305, 306, 389, 443, 445, 585, 636, 989, 990, 2049, 3268, 3269, 8020, 8082, 8440, 8441, 8443, 9020, 9021 Not defined by default, but administrator may configure.
Network Management WebUI, SSH, SMTP, syslog, DNS, NTP, SNMP, Perf collector, CEE, alerts 16 22, 25, 53, 123, 161, 162, 514, 6514, 6567, 8080, 9443, 12228 Not defined by default, but administrator may configure.
Bulk Data SmartSync, SyncIQ, NDMP 10 2097, 2098, 3148, 3149, 5667, 5668, 7722, 8470, 10000 Not defined by default, but administrator may configure.
Catch-All All other traffic that does not match any of the above 0 all Not defined by default, but administrator may configure.

The default DSCP feature values for each were specifically chosen to meet US government requirements and satisfy the Fed APL needs. While destination ports are undefined in the classes by default, cluster admins can customize the DSCP values, source ports, and destination ports per site requirements.

Under the hood, QoS tagging is built upon the OneFS firewall (ipfw):

As such, QoS tagging is only functional when both the firewall and the DSCP features are enabled.

The firewall inspects outgoing network traffic on the front-end ports and assigns it to the appropriate QoS class. The outbound IP packets are matched to the cluster’s four DSCP rules, one by one, from top to bottom, using the source ports, and destination ports too, if configured.

When a good match is found, the Firewall engine marks the packets’ DSCP bits as specified by that rule. If no match is found, the last ‘Best Effort’ rule will catch all outgoing IP packets which are unmatched with the other 3 DSCP rules.

The firewall assigns the DSCP value based on the QoS class, and the DSCP configuration and values are cluster wide and preserved across upgrades.

Note though, that this DSCP feature does not allow the creation of any additional or custom DSCP rules currently. Additionally, DSCP tagging is disabled by default in both STIG hardening and compliance modes.

Also, consider that in order to provide QoS, the firewall has to inspect and filter the outgoing packets, which obviously comes with a performance cost. Although this overhead should be fairly minimal, the recommendation is to test DSCP tagging in a lab environment first, to confirm workloads are not significantly impacted, before letting it loose on a production cluster.

In the next article in this series, we’ll look at the DSCP configuration and management, plus some basic troubleshooting tools.

OneFS Namespace API (RAN) – Advanced Requests and Troubleshooting

A cluster’s files and directories can be accessed programmatically, and controlled by filesystem permissions, through the OneFS RESTful Access to Namespace (RAN) API, similarly to the way they’re accessed through the core NAS protocols such as NFS and SMB.

Within the RAN namespace, the following system attributes are common to directories and files:

Attribute Description Type
name Specifies the name of the object. String
size Specifies the size of the object in bytes. Integer
block_size Specifies the block size of the object. Integer
blocks Specifies the number of blocks that compose the object. Integer
last_modified Specifies the time when the object data was last modified in HTTP date/time format. HTTP date
create_time Specifies the date when the object data was created in HTTP date/time format. HTTP date
access_time Specifies the date when the object was last accessed in HTTP date/time format. HTTP date
change_time Specifies the date when the object was last changed (including data and metadata changes) in HTTP date/time format. String
type Specifies the object type, which can be one of the following values: container, object, pipe, character_device, block_device, symbolic_link, socket, or whiteout_file. String
mtime_val Specifies the time when the object data was last modified in UNIX Epoch format. Integer
btime_val Specifies the time when the object data was created in UNIX Epoch format. Integer
atime_val Specifies the time when the object was last accessed in UNIX Epoch format. Integer
ctime_val Specifies the time when the object was last changed (including data and metadata changes) in UNIX Epoch format. Integer
owner Specifies the user name for the owner of the object. String
group Specifies the group name for the owner of the object. String
uid Specifies the UID for the owner. Integer
gid Specifies the GID for the owner. Integer
mode Specifies the UNIX mode octal number. String
id Specifies the object ID, which is also the INODE number. Integer
nlink Specifies the number of hard links to the object. Integer
is_hidden Specifies whether the file is hidden or not. Boolean

In response, the following response headers may be returned when sending a request to RAN.

Attribute Description Type
Content-length Provides the length of the body message in the response. Integer
Connection Provides the state of connection to the server. String
Date Provides the date when the object store last responded. HTTP-date
Server Provides platform and version information about the server that responded to the request. String
x-isi-ifs-target-type Provides the resource type. This value can be a container or an object. String

For diagnostic and troubleshooting purposes, failed requests to the namespace can often be resolved via common error codes and viewing activity logs. Activity logs capture server and object activity and can help identify problems. The following table shows the location of different types of activity logs.

Log Location
Server logs
  • /var/log/<server>/webui_httpd_error.log

·         /var/log/<server>/webui_httpd_access.log

Object Daemon Log ·         /var/log/isi_object_d.log
Generic Log ·         /var/log/message

For <server> above, the path to the server directory should be used. For example: /apache2.

The common JSON error is returned in the following format:

{
"errors":[
{
"code":"<Error code>",
"message":"<some detailed error msg>"
}
]
}

The following table includes the common error codes, plus their status and description:

Error Code Description HTTP status
AEC_TRANSIENT The specified request returned a transient error code that is treated as OK. 200 OK
AEC_BAD_REQUEST The specified request returned a bad request error. 400 Bad Request
AEC_ARG_REQUIRED The specified request requires an argument for the operation. 400 Bad Request
AEC_ARG_SINGLE_ONLY The specified request requires only a single argument for the operation. 400 Bad Request
AEC_UNAUTHORIZED The specified request requires user authentication. 401 Unauthorized
AEC_FORBIDDEN The specified request was denied by the server. Typically, this response includes permission errors on OneFS. 403 Forbidden
AEC_NOT_FOUND The specified request has a target object that was not found. 404 Not Found
AEC_METHOD_NOT_ALLOWED The specified request sent a method that is not allowed for the target object. 405 Method Not Allowed
AEC_NOT_ACCEPTABLE The specified request is unacceptable. 406 Not Acceptable
AEC_CONFLICT The specified request has a conflict that prevents the operation from completing. 409 Conflict
AEC_PRE_CONDITION_FAILED The specified request has failed a precondition. 412 Precondition failed
AEC_INVALID_REQUEST_RANGE The specified request has requested a range that cannot be satisfied. 416 Requested Range not Satisfiable
AEC_NOT_MODIFIED The specified request was not modified. 304 Not Modified
AEC_LIMIT_EXCEEDED The specified request exceeded the limit set on the server side. 403 Forbidden
AEC_INVALID_LICENSE The specified request has an invalid license. 403 Forbidden
AEC_NAMETOOLONG The specified request has an object name size that is too long. 403 Forbidden
AEC_SYSTEM_INTERNAL_ERROR The specified request has failed because the server encountered an unexpected condition. 500 Internal Server Error

For example, an invalid copy source path yields the ‘AEC_BAD_REQUEST’ code:

# curl -X PUT --insecure --basic --user <name>:<passwd> --header "clone=true" --header "x-isi-ifs-copy-source:/namespace/ifs/data-other/testfile1/" https://10.1.10.20:8080/namespace/ifs/data/testfile1/
{
"errors" :
[
{
"code" : "AEC_BAD_REQUEST",
"message" : "Unable to open object '/data-other/testfile1/' in store 'ifs' -- a component of the path is not a directory."
}
]
}

When crafting straightforward HTTP requests to RAN, such as create a file (object), the ‘curl’ CLI utility can be a useful asset:

# curl -X PUT --insecure --basic --user <username>:<passwd> -H "x-isi-ifs-target-type:object" https://<cluster_ip>:8080/namespace/<path>/<file>/

For example, to create ‘file1’ under ‘/ifs/data’:

# curl -X PUT --insecure --basic --user <username>:<passwd> -H "x-isi-ifs-target-type:object" https://10.1.10.20:8080/namespace/ifs/data/file1/

# ls -lsia /ifs/data/file1

6668484639 64 -rw-------     1 root  wheel  0 Aug 28 00:58 /ifs/data/file1

And to read the contents of the file via RAN:

# echo "This is file1" > /ifs/data/file1

# curl -X GET --insecure --basic --user <username>:<passwd> https://10.1.10.20:8080/namespace/ifs/data/file1

This is file1

However, ‘curl’ and its ‘-H’ header option can quickly get unwieldy for more complex HTML requests, such as setting ACLs and configuring SmartLock immutability via RAN. As such, more versatile dev tools and/or scripting languages may be a better alternative in these cases. Plus, familiarity with HTTP/1.1 and experience writing HTTP-based client utilities is of considerable help when implementing RAN endpoints in production environments.

Next, are a couple of examples of more complex HTTP requests to RAN.

Set the ACL on a file

In the first instance, the following request syntax can be used to configure the access control list (ACL) of a file:

PUT /namespace/<access_point>/<container_path>/<file_name>?acl HTTP/1.1
Host: <hostname>[:<port>]
Content-Length: <length>
Date: <date>
Authorization: <signature>
x-isi-ifs-target-type: object
Content-Type: application/json

{
"owner":{
"id":"<owner id>",
"name":"<owner name>",
"type":"<type>"
},
"group":{
"id":"<group id>",
"name":"<group name>",
"type":"<type>"
},
"authoritative":"acl"|"mode",
"mode":"<POSIX mode>",
"action":"<action_value>",
"acl":[
{
"trustee":{
"id":"<trustee id>",
"name":"<trustee name>",
"type":"<trustee type>"
},
"accesstype":"allow"|"deny",
"accessrights":"<accessrights_list>",
"op":"<operation_value>"
}
]
}

The ACL endpoint parameters for RAN include:

Parameter Description
acl The acl argument must be placed at the first position of the argument list in the URI.
owner Specifies the JSON object for the owner persona. You should only specify the owner or group persona if you want to change the owner or group of the target.
group Specifies the JSON object for the group persona of the owner. You should only specify the owner or group persona if you want to change the owner or group of the target.
authoritative The authoritative field is mandatory and can take the value of either acl or mode.

acl: You can modify the owner, group personas, or access rights for the file by setting the authoritative field to acl and by setting <action_value> to update. When the authoritative field is set to acl, access rights are set for the file from the acl structure. Any value that is specified for the mode parameter is ignored.

Note: When the authoritative field is set to acl, the default value for the <action_value> field is replace. If the <action_value> field is set to replace, the system replaces the existing access rights of the file with the access rights that are specified in the acl structure. If the acl structure is empty, the existing access rights are deleted and default access rights are provided by the system. The default access rights for files are read access control list (‘std_read_dac’) and write access control list (‘std_write_dac’) for the owner.

mode: You can modify the owner and group personas by setting the authoritative field to mode. When the authoritative field is set to mode, POSIX permissions are set on the file. The <action_value> field and acl structure are ignored. If mode is set on a file that already has access rights or if access rights are set on a file that already has POSIX permissions set, the result of the operation varies based on the Global ACL Policy.

mode Specifies the POSIX mode as an cctal number string. By default, these are 0700 for directories and 0600 for files.
action The <action_value> field is applied when the authoritative field is set to acl. You can set the <action_value> field to either update or replace. The default value is replace.

When set to update, the existing access control list of the file is modified with the access control entries that are specified in the acl structure of the JSON body.

When set to replace, the entire access control list is deleted and replaced with the access control entries that are specified in the acl structure of the JSON body.

Also, when set to replace, the acl structure is optional. If the acl structure is left empty, the entire access control list is deleted and replaced with the system set default access rights. The default access rights for files are read access control list (‘ std_read_dac’) and write access control list (‘ std_write_dac’) for the owner.

acl Specifies the JSON array of access rights.
accesstype Can be set to allow or deny.

allow: Allows access to the file based on the access rights set for the trustee.

deny: Denies access to the file based on the access rights set for the trustee.

accessrights Specifies the access right values that are defined for the file.
inherit_flags Specifies the inherit flag values for the file.
op The <operation_value> field is applied when the <action_value> field is set to update. You can set the <operation_value> field to add, replace, or delete. If no <operation_value> field is specified, the default value is add.

add: Creates an access control entry (ACE) if an ACE is not already present for a trustee and trustee access type. If an entry is already present for that trustee and trustee access type, this operation appends the access rights list to the current ACE for that trustee and trustee access type.

delete: Removes the access rights list provided from the existing ACE for a trustee and trustee access type. If the input access rights list is empty , the entire ACE that corresponds to the trustee and trustee access type is deleted.

replace: Replaces the entire ACE for the trustee and trustee access type with the input access rights list.

The following HTTP ‘put’ syntax can be used to set the ACL of a file, in this case ‘file1’.

PUT /namespace/ifs/dir1/dir2/ns/file1?acl HTTP/1.1
Host: my_cluster:8080
Content-Length: <length>
Date: Tue, 22 May 2024 12:00:00 GMT
Authorization: <signature>
Content-Type: application/json

{
"owner":{
"id":"UID:0",
"name":"root",
"type":"user"
},
"group":{
"id":"GID:0",
"name”:"wheel",
"type":"group"
},
"authoritative":"acl",
"action":"update",
"acl": [
{
"trustee":{
"id":"UID:0",
"name":"root",
"type":"user"
},
"accesstype":"allow",
"accessrights":[
"file_read",
"file_write"
],
"op":"add"
},
{
"trustee":{
"id":"GID:1201",
"name":"group12",
"type":"group"
},
"accesstype":"allow",
"accessrights":"std_write_dac"
],
"op":"replace"
}
]
}

And the corresponding successful response from RAN is along the following lines:

HTTP/1.1 200 OK

Date: Tue, 22 May 2024 12:00:00 GMT

Content-Length: <length>

Connection: close

Server: Apache2/2.2.19

 

Set the retention period and commit a file in a SmartLock directory

Similarly, the following request syntax can be used to set the retention period and commits a file in a SmartLock directory.

PUT /namespace/<access_point>/<WORM_directory>/<file_name>?worm HTTP/1.1
Host: <hostname>[:<port>]
Date: <date>
Authorization: <signature>

{
"worm_retention_date":<"YYYY-MM-DD hh:mm:ss GMT">,
"commit_to_worm":<Boolean>
}

Note that if a file is not explicitly committed when an autocommit time period is configured for the SmartLock directory where the file resides, the file is automatically committed when the autocommit period elapses.

If the file is committed without setting a retention expiration date, the default retention period that is specified for the SmartLock directory where the file resides is applied. The retention date on the file can also be limited by the maximum retention period set on the SmartLock directory.

The pertinent WORM endpoint parameters in RAN include:

Parameter Description
Parameter Description
worm The worm argument must be placed at the first position of the argument list in the URI.
worm_committed Indicates whether the file was committed to the WORM state.
worm_retention_date Provides the retention expiration date in Coordinated Universal Time (such as UTC/GMT). If a value is not specified, the field has a null value.
worm_retention_date_val Provides the retention expiration date in seconds from UNIX Epoch or UTC.
worm_override_retention_date Provides the override retention date that is set on the SmartLock directory where the file resides. If the date is not set or is earlier than or equal to the existing file retention date, this field has a null value. Otherwise, the date is expressed in UTC/GMT, and is the retention expiration date for the file if the worm_committed parameter is also set to true.
worm_override_retention_date_val Provides the override retention date that is set on the SmartLock directory where the file resides. If the date is not set or if the date is set to earlier than or equal to the file retention date, this field has a null value. Otherwise, the date is expressed in seconds from UNIX Epoch and UTC, and is the retention expiration date set for the file if the worm_committed parameter is set to true. This parameter is the same as worm_override_retention_date, but is expressed in seconds from the Epoch or UTC.

For example, the following request will set the retention date for a ‘file1’ in the SmartLock directory ‘dir1’ to 25th December 2024:

PUT /namespace/ifs/dir1/file1?worm HTTP/1.1
Host: my_cluster:8080
Date: Wed, 25 Dec 2024 12:00:00 GMT
Authorization: <signature>

{
"worm_retention_date":"2024-12-25 12:00:00 GMT",
"commit_to_worm":true
}

And the corresponding successful response:

HTTP/1.1 200 OK
Date: Tue, 25 Dec 2024 12:00:00 GMT
Content-Length: 0
Connection: close
Server: Apache2/2.2.19

OneFS Namespace API (RAN) – Part 2

As we saw in the previous article in this series, a cluster’s files and directories can be accessed programmatically through the OneFS RESTful Access to Namespace (RAN) API, similarly to the way they’re accessed through SMB or NFS protocols – as well as controlled by filesystem permissions.

Under the hood, the general architecture and workflow of the OneFS RAN namespace API is as follows:

Upon receiving an HTTP request sent through the OneFS API, the cluster’s web server (Apache) verifies the username and password credentials – either through HTTP Basic Authentication for single requests or via an established session to a single node for multiple requests.

Once the user has been successfully authenticated, OneFS role-based access control (RBAC) then verifies the privileges associated with the account and, if sufficient, enables access to either the /ifs file system, or to the cluster configuration, as specified in the request URL.

The request URL that calls the API is comprised of a base URL and end-point, with the ‘namespace’ argument denoting the RAN API. For example:

And the GET request response to a <path><object> endpoint typically yields the object’s payload. For example, the ASCII contents of the ‘file1’, in this case:

Or from the CLI with ‘curl’:

# curl -X GET https://10.1.10.20:8080/namespace/ifs/data/dir1/file2 --insecure --basic --user <user>:<passwd>

Test file for RAN access...

If the object is unavailable, a response similar to the following is displayed:

As we saw in the previous article in this series, RAN supports the following types of file system operations:

Operation Action Description
Access points CREATE, DELETE Identify and configure access points (shares) and obtain protocol information.
Directory CREATE, GET, PUT, LIST, DELETE List directory content.; get and set directory attributes; delete directories from the file system.
File CREATE, GET, PUT, LIST, DELETE View, move, copy, and delete files from the file system.
Access control GET/SET ACLs Manage user rights; set ACL or POSIX permissions for files and directories. Set access list on access points (RAN Share Permissions).
Query QUERY

 

Search system metadata or extended attributes, and tag files.
SmartLock GET, SET, Commit Allow retention dates to be set on files; commit files to a WORM state.

In support of these, RAN allows pre-defined keywords to be appended to the URL when sending a namespace request. These keywords must be placed first in the argument list and must not contain any value. If these keywords are placed in any other position in the argument list, the keywords are ignored. Pre-defined keywords include: ‘acl’, ‘metadata’, ‘worm’, and ‘query’.

For example:

https://<cluster_ip>:8080/namespace/ifs/data/dir1?acl

When using the ‘curl’ CLI utility, the following syntax options can be useful for crafting PUT or POST requests to RAN:

  1. When sending form data:
# curl -X PUT -H "Content-Type: multipart/form-data;" -F "key1=val1" "YOUR_URI"
  1. If sending raw data as json:
# curl -X PUT -H "Content-Type: application/json" -d '{"key1":"value"}' "YOUR_URI"
  1. When sending a file with a POST request:
# curl -X POST "YOUR_URI" -F 'file=@/file-path.csv'

Where:

-X – option can be used for request command,

-d – option can be used in order to put data on remote URL.

-H – header option can express the content type.

-v – Plus the verbose option, which is handy for debugging..

When sending a request to RAN, data can be accessed through customized headers, in addition to the standard HTTP headers. The common RAN HTTP request headers include:

Name Description Type Required
Authorization Specifies the authentication signature. String Yes
Content-length Specifies the length of the message body. Integer Conditional
Date Specifies the current date according to the requestor. HTTP-date No. A client should only send a Date header in a request that includes an entity-body, such as in PUT and POST requests. A client without a clock must not send a Date header in a request.
x-isi-ifs-spec-version Specifies the protocol specification version. The client specifies the protocol version, and the server determines if the protocol version is supported. You can test backwards compatibility with this header. String Conditional
x-isi-ifs-target-type Specifies the resource type. For PUT operations, this value can be container or object. For GET operations, this value can be container, object, or any, or this parameter can be omitted. String Yes, for PUT operations.

Conditional, for GET operations.

The following curl syntax can be used to instruct RAN to create a file, or ‘object’:

# curl -X PUT --insecure --basic --user <username>:<passwd> -H "x-isi-ifs-target-type:object" https://<cluster_ip>:8080/namespace/<path>/<file>/

For example, to create ‘testfile1’ under ‘/ifs/data’:

# ls -lsia /ifs/data/testfile1

ls: /ifs/data/testfile1: No such file or directory

# curl -X PUT --insecure --basic --user <username>:<passwd> -H "x-isi-ifs-target-type:object" https://10.1.10.20:8080/namespace/ifs/data/testfile1/

# ls -lsia /ifs/data/testfile1

6668484639 64 -rw-------     1 root  wheel  0 Aug 28 00:58 /ifs/data/testfile1

And to read the contents of the file via RAN:

# echo "This is testfile1" > /ifs/data-other/testfile1

# curl -X GET --insecure --basic --user <username>:<passwd> https://10.1.10.20:8080/namespace/ifs/data-other/testfile1

This is testfile1

Or using the ‘POST’ option to move the file, say from the /ifs/data/ directory to /ifs/data-other/:

# curl -X POST --insecure --basic --user <username>:<passwd> --header "x-isi-ifs-target-type=object" --header "x-isi-ifs-set-location:/namespace/ifs/data-other/testfile1" https://10.1.10.20:8080/namespace/ifs/data/testfile1/

Then using ‘PUT’ in conjunction with ‘clone’ and ‘x-isi-ifs-copy-source’ headers to create a clone of ‘/usr/data-other/testfile1’ under /usr/data:

# curl -X PUT --insecure --basic --user <username>:<passwd> --header "clone=true" --header "x-isi-ifs-copy-source:/namespace/ifs/data-other/" https://10.1.10.20:8080/namespace/ifs/data/testfile1/

Note that, if the response body contains a JSON message, the operation has partially failed. If the server fails to initiate a copy due to an issue, such as an invalid copy source, an error is returned. If the server initiates the copy, and then fails, ‘copy_errors’ are returned in structured JSON format. Because the copy operation is synchronous, the client cannot stop an ongoing copy operation or check the status of a copy operation asynchronously.

To remove a file, the ‘DELETE’ option can be  used in the request. For example, to delete ‘testfile1’:

# curl -X DELETE --insecure --basic --user <username>:<passwd> -H "x-isi-ifs-target-type:object" https://10.1.10.20:8080/namespace/ifs/data/testfile1/

The following curl ‘PUT’ syntax can be to create a directory, or ‘container’:

# curl -X PUT --insecure --basic --user <username>:<passwd> --header "x-isi-ifs-target-type:container" https://10.1.10.20:8080/namespace/ifs/data/testdir1/

The ‘HEAD’ option can also be used to view the attributes of the directory, including its ACL (x-isi-ifs-access-control). For example:

# curl --head --insecure --basic --user <username>:<passwd> https://10.1.10.20:8080/namespace/ifs/data/testdir1

HTTP/1.1 200 Ok

Date: Wed, 28 Aug 2024 01:29:16 GMT

Server: Apache

Allow: GET, PUT, POST, DELETE, HEAD

Etag: "6668484641-18446744073709551615-1"

Last-Modified: Wed, 28 Aug 2024 01:16:28 GMT

x-isi-ifs-access-control: 0700

x-isi-ifs-spec-version: 1.0

x-isi-ifs-target-type: container

X-Frame-Options: sameorigin

X-Content-Type-Options: nosniff

X-XSS-Protection: 1; mode=block

Strict-Transport-Security: max-age=31536000;

Content-Security-Policy: default-src 'none'

Content-Type: application/json

The curl verbose option (-v) provides step by step insight into the HTTP client/server interaction, which can be valuable for debugging. For example, the output from a request to create the file /ifs/data/testfile2:

# curl -v -X PUT --insecure --basic --user <name>:<passwd> --header "x-isi-ifs-target-type:object" https://10.1.10.20:8080/namespace/ifs/data/testfile2/
*   Trying 10.1.10.20:8080...
* Connected to 10.1.10.20 (10.1.10.20) port 8080
* ALPN: curl offers http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384 / [blank] / UNDEF
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=US; ST=Washington; L=Seattle; O=Isilon Systems, Inc.; OU=Isilon Systems; CN=Isilon Systems; emailAddress=support@isilon.com
*  start date: Aug  4 17:39:14 2024 GMT
*  expire date: Nov  6 17:39:14 2026 GMT
*  issuer: C=US; ST=Washington; L=Seattle; O=Isilon Systems, Inc.; OU=Isilon Systems; CN=Isilon Systems; emailAddress=support@isilon.com
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
* using HTTP/1.x
* Server auth using Basic with user 'root'
> PUT /namespace/ifs/data/testfile2/ HTTP/1.1
> Host: 10.1.10.20:8080
> Authorization: Basic cm9vdDph
> User-Agent: curl/8.7.1
> Accept: */*
> x-isi-ifs-target-type:object
> 
* Request completely sent off
< HTTP/1.1 200 Ok
< Date: Wed, 28 Aug 2024 00:46:36 GMT
< Server: Apache
< Allow: GET, PUT, POST, DELETE, HEAD
< x-isi-ifs-spec-version: 1.0
< X-Frame-Options: sameorigin
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
< Strict-Transport-Security: max-age=31536000;
< Content-Security-Policy: default-src 'none'
< Transfer-Encoding: chunked
< Content-Type: text/plain
< 
* Connection #0 to host 10.219.64.11 left intact
#

Beyond this, crafting more complex HTML requests with the curl utility can start to become unwieldy, and more powerful dev tools can be beneficial instead. Plus a solid understanding of HTTP/1.1, and experience writing HTTP-based client software, before getting too heavily involved with implementing the RAN API in production environments.

OneFS Namespace API (RAN)

 Among the array of HTTP services, and in addition to the platform API, OneFS also provides a namespace API.

This RESTful Access to Namespace, or RAN, provides HTTP access to any of the files and directories within a cluster’s /ifs filesystem hierarchy.

RAN can be accessed by making an HTTP call that references the /namespace/ API, rather than the /platform/ API that we’ve seen in the previous articles in this series. For example:

namespace == “http share”

This provides HTTPS access to the files & directories on the filesystem, but more as a data management API rather than object.

As you would expect, by default, the root of a cluster’s RAN namespace is the top level /ifs directory:

Namespace resources are accessed through a URL, such as:

Where:

Attribute Description
Access point Access Point is the root path of the file system endpoint (RAN share), ie. /ifs.
Authority Authority is the IP address or FQDN for the cluster.
Container Container is a directory or folder.
Data object Data object contains content data, such as a file on the system.
Endpoint Endpoint is the targetable URL.
File File is the data object you wish to query/modify.
Namespace Namespace is the file system structure on the cluster.
Object Object is either a container or data object.
Path Path is the file path to the object you want to access.
Port Port is the number of the port; the default port number is 8080
Scheme Scheme is the access protocol method

For example, using the RAN API to access the file ‘file1’ which resides in the ‘dir1’ directory under the access point and path ‘/ifs/data/dir1/’.

From the OneFS CLI, the ‘curl’ utility can be used to craft a ‘GET’ request for this file:

# curl -u <user:password> -k https://10.1.10.20:8080/namespace/ifs/data/dir1/file1

Or from a browser:

Also, using ‘curl’ via the CLI to view the RAN access point:

# curl -X GET https://10.1.10.20:8080/namespace --insecure --basic --user root:a

{"namespaces":[{

   "name" : "ifs",

   "path" : "/ifs"

}

]}#

Additionally, you can append a pre-defined keyword to the end of the URL when you send a request to the namespace. These keywords must be placed first in the argument list and must not contain any value. If these keywords are placed in any other position in the argument list, the keywords are ignored. Pre-defined keywords include: ‘acl’, ‘metadata’, ‘worm’, and ‘query’.

For example:

https://10.1.10.20:8080/namespace/ifs/data/dir1?acl

Or for metadata. For example:

https://10.1.10.20:8080/namespace/ifs/data/dir1/file1?metadata

A cluster’s files and directories can be accessed programmatically through RAN, similarly to the way they’re accessed through SMB or NFS protocols, as well as limited by filesystem permissions. As such RAN enables the following types of file system operations to be performed.

Operation Action Description
Access points CREATE, DELETE Identify and configure access points (shares) and obtain protocol information.
Directory CREATE, GET, PUT, LIST, DELETE List directory content.; get and set directory attributes; delete directories from the file system.
File CREATE, GET, PUT, LIST, DELETE View, move, copy, and delete files from the file system.
Access control GET/SET ACLs Manage user rights; set ACL or POSIX permissions for files and directories. Set access list on access points (RAN Share Permissions).
Query QUERY

 

Search system metadata or extended attributes, and tag files.
SmartLock GET, SET, Commit Allow retention dates to be set on files; commit files to a WORM state.

Additionally, applications or external clients can be built to access RAN in any major programming or scripting language, such as C++, Java, .net, Python, etc.

Note, though, that RAN access in general is disabled for clusters running in ‘hardened’ mode. In such cases a warning will be displayed notifying that HTTP browse is disabled, similar to the following: