OneFS and Client Bandwidth Measurement with iPerf

Sometimes in a storage admin’s course of duty there’s a need to quickly and easily assess the bandwidth between a PowerScale cluster and client. The ubiquitous iPerf tool is a handy utility for taking active measurements of the maximum achievable bandwidth between a PowerScale cluster and client, across the node’s front-end IP network(s).

iPerf was developed by NLANR/DAST as a modern alternative for measuring maximum TCP and UDP bandwidth performance. iPerf is a flexible tool, allowing the tuning of various parameters and UDP characteristics, and reporting network performance stats including bandwidth, delay jitter, datagram loss, etc.

In contrast to the classic iPerf (typically version 2.x), a newer and more feature-rich iPerf3 is also available. Unlike the classic incarnation, iPerf3 is primarily developed and maintained by ESnet and the Lawrence Berkeley National Laboratory, and is made available under BSD licensing. Note that iPerf3 neither shares code nor provides backwards compatibility with the classic iPerf.

Additional optional features of iPerf3 include:

  • CPU affinity setting
  • IPv6 flow labeling
  • SCTP
  • TCP congestion algorithm settings
  • Sendfile / zerocopy
  • Socket pacing
  • Authentication

Both iPerf and iPerf3 are available preinstalled on OneFS, and can be useful for measuring and verifying anticipated network performance prior to running any performance benchmark. The standard ‘iperf’ CLI command automatically invokes the classic (v2) version:

# iperf -v

iperf version 2.0.4 (7 Apr 2008) pthreads

Within OneFS, the iPerf binary can be found in the /usr/local/bin/ directory on each node:

# whereis iperf

iperf: /usr/local/bin/iperf /usr/local/man/man1/iperf.1.gz

The enhanced iPerf version 3, on the other hand, is invoked with the ‘iperf3’ CLI command, and also lives under /usr/local/bin:

# iperf3 -v

iperf 3.4 (cJSON 1.5.2)

# whereis iperf3

iperf3: /usr/local/bin/iperf3 /usr/local/man/man1/iperf3.1.gz

For Linux and Windows clients, iPerf binaries can also be downloaded and installed from the following location:

https://iperf.fr/

The iPerf source code is also available at SourceForge for those ‘build-your-own’ aficionados among us:

http://sourceforge.net/projects/iperf/

Under the hood, iPerf allows the configuration and tuning of a variety of buffering and timing parameters across both TCP and UDP, and with support for IPv4 and IPv6 environments. For each test, iPerf reports the maximum bandwidth, loss, and other salient metrics.

More specifically, iPerf supports the following features:

Attribute         Details

TCP               • Measure bandwidth
                  • Report MSS / MTU size and observed read sizes
                  • Supports SCTP multi-homing and redundant paths for reliability and resilience

UDP               • Client can create UDP streams of specified bandwidth
                  • Measure packet loss
                  • Measure delay jitter
                  • Supports multicast

Platform support  • Windows, Linux, macOS, BSD UNIX, Solaris, Android, VxWorks

Concurrency       • Client and server can support multiple simultaneous connections (-P flag)
                  • iPerf3 server accepts multiple simultaneous connections from the same client

Duration          • Can be configured to run for a specified time (-t flag), or until a set amount of data has been sent (-n and -k flags)
                  • Server can be run as a daemon (-D flag)

Reporting         • Can display periodic, intermediate bandwidth, jitter, and loss reports at configurable intervals (-i flag)

When it comes to running iPerf, the most basic use case is testing a single connection from a client to a node on the cluster. This can be initiated as follows:

On the cluster node, the following CLI command will initiate the iPerf server:

# iperf -s

Similarly, on the client, the following CLI syntax will target the iPerf server on the cluster node:

# iperf -c <server_IP>

For example, with a FreeBSD client with IP address 10.11.12.9 connecting to a cluster node at 10.10.11.12:

# iperf -c 10.10.11.12

------------------------------------------------------------

Client connecting to 10.10.11.12, TCP port 5001

TCP window size:   131 KByte (default)

------------------------------------------------------------

[  3] local 10.11.12.9 port 65001 connected with 10.10.11.12 port 5001

[ ID] Interval       Transfer     Bandwidth

[  3]  0.0-10.0 sec  31.8 GBytes  27.3 Gbits/sec

And from the cluster node:

# iperf -s

------------------------------------------------------------

Server listening on TCP port 5001

TCP window size:   128 KByte (default)

------------------------------------------------------------

[  4] local 10.10.11.12 port 5001 connected with 10.11.12.9 port 65001

[ ID] Interval       Transfer     Bandwidth

[  4]  0.0-10.0 sec  31.8 GBytes  27.3 Gbits/sec

As indicated in the above output, the iPerf server reports a default TCP window size of 128 KB (the client reports its own default, 131 KB in this example). Also note that the classic iPerf (v2) uses TCP port 5001 by default on OneFS. As such, this port must be open on any firewalls and/or packet filters situated between client and node for the above to work. Similarly, iPerf3 defaults to TCP port 5201, and the same open port requirement between clients and cluster applies.
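
The Transfer and Bandwidth columns in the output above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, assuming that iPerf reports byte counts with binary prefixes (1 GByte = 2^30 bytes) and bit rates with decimal prefixes (1 Gbit = 10^9 bits). That assumption is consistent with the figures shown here; the function name is illustrative only.

```python
# Sketch: relate iPerf's "Transfer" and "Bandwidth" columns.
# Assumption: byte units are binary (1 GByte = 2**30 bytes) and
# bit-rate units are decimal (1 Gbit = 10**9 bits).

BYTES_PER_GBYTE = 2 ** 30   # assumed binary prefix for the Transfer column
BITS_PER_GBIT = 10 ** 9     # assumed decimal prefix for the Bandwidth column


def gbits_per_sec(transfer_gbytes: float, seconds: float) -> float:
    """Convert a transfer size in GBytes over an interval to Gbits/sec."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    total_bits = transfer_gbytes * BYTES_PER_GBYTE * 8
    return total_bits / seconds / BITS_PER_GBIT


if __name__ == "__main__":
    # Figures from the classic iPerf run above: 31.8 GBytes in 10.0 seconds.
    print(f"{gbits_per_sec(31.8, 10.0):.1f} Gbits/sec")
```

With the values from the run above (31.8 GBytes in 10.0 seconds), this works out to roughly 27.3 Gbits/sec, in line with the reported Bandwidth figure.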

Here’s the output from the same configuration, but using iPerf3. First, from the server:

# iperf3 -s

-----------------------------------------------------------

Server listening on 5201

-----------------------------------------------------------

Accepted connection from 10.11.12.9, port 12543

[  5] local 10.10.11.12 port 5201 connected to 10.11.12.9 port 55439

[ ID] Interval           Transfer     Bitrate

[  5]   0.00-1.00   sec  3.22 GBytes  27.7 Gbits/sec

[  5]   1.00-2.00   sec  3.59 GBytes  30.9 Gbits/sec

[  5]   2.00-3.00   sec  3.52 GBytes  30.3 Gbits/sec

[  5]   3.00-4.00   sec  3.95 GBytes  33.9 Gbits/sec

[  5]   4.00-5.00   sec  4.07 GBytes  34.9 Gbits/sec

[  5]   5.00-6.00   sec  4.10 GBytes  35.2 Gbits/sec

[  5]   6.00-7.00   sec  4.14 GBytes  35.6 Gbits/sec

- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval           Transfer     Bitrate

[  5]   0.00-7.00   sec  27.8 GBytes  34.1 Gbits/sec                  receiver

iperf3: the client has terminated

-----------------------------------------------------------

Server listening on 5201

-----------------------------------------------------------

And from the client:

# iperf3 -c 10.10.11.12

Connecting to host 10.10.11.12, port 5201

[  5] local 10.11.12.9 port 55439 connected to 10.10.11.12 port 5201

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd

[  5]   0.00-1.00   sec  3.22 GBytes  27.7 Gbits/sec    0    316 KBytes

[  5]   1.00-2.00   sec  3.59 GBytes  30.9 Gbits/sec    0    316 KBytes

[  5]   2.00-3.00   sec  3.52 GBytes  30.3 Gbits/sec    0    504 KBytes

[  5]   3.00-4.00   sec  3.95 GBytes  33.9 Gbits/sec    2    671 KBytes

[  5]   4.00-5.00   sec  4.07 GBytes  34.9 Gbits/sec    0    671 KBytes

[  5]   5.00-6.00   sec  4.10 GBytes  35.2 Gbits/sec    1    664 KBytes

[  5]   6.00-7.00   sec  4.14 GBytes  35.6 Gbits/sec    0    664 KBytes

^C[  5]   7.00-7.28   sec  1.17 GBytes  35.6 Gbits/sec    0    664 KBytes

- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval           Transfer     Bitrate         Retr

[  5]   0.00-7.28   sec  27.8 GBytes  32.8 Gbits/sec    3             sender

[  5]   0.00-7.28   sec  0.00 Bytes  0.00 bits/sec                  receiver

iperf3: interrupt - the client has terminated
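
When a run produces many interval lines, it can be handy to pull the numbers out programmatically. The sketch below parses the iPerf3 client output shown above with a regular expression. It assumes the units appear exactly as in this run (GBytes, Gbits/sec, KBytes); iPerf3 scales its units automatically, so other runs may need a more general pattern. The names `INTERVAL_RE`, `SAMPLE`, and `parse_intervals` are illustrative only.

```python
import re
from typing import Dict, List

# Matches iperf3 client interval lines that carry both Retr and Cwnd columns.
# Assumption: units are GBytes, Gbits/sec and KBytes, as in the run above.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+"                                  # stream ID, e.g. "[  5]"
    r"(?P<start>[\d.]+)-(?P<end>[\d.]+)\s+sec\s+"     # interval start and end
    r"(?P<gbytes>[\d.]+)\s+GBytes\s+"                 # data transferred
    r"(?P<gbits>[\d.]+)\s+Gbits/sec\s+"               # bitrate
    r"(?P<retr>\d+)\s+"                               # TCP retransmits
    r"(?P<cwnd>[\d.]+)\s+KBytes"                      # congestion window
)

# Client output copied from the run above (blank lines omitted).
SAMPLE = """\
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.22 GBytes  27.7 Gbits/sec    0    316 KBytes
[  5]   1.00-2.00   sec  3.59 GBytes  30.9 Gbits/sec    0    316 KBytes
[  5]   2.00-3.00   sec  3.52 GBytes  30.3 Gbits/sec    0    504 KBytes
[  5]   3.00-4.00   sec  3.95 GBytes  33.9 Gbits/sec    2    671 KBytes
[  5]   4.00-5.00   sec  4.07 GBytes  34.9 Gbits/sec    0    671 KBytes
[  5]   5.00-6.00   sec  4.10 GBytes  35.2 Gbits/sec    1    664 KBytes
[  5]   6.00-7.00   sec  4.14 GBytes  35.6 Gbits/sec    0    664 KBytes
^C[  5]   7.00-7.28   sec  1.17 GBytes  35.6 Gbits/sec    0    664 KBytes
[  5]   0.00-7.28   sec  27.8 GBytes  32.8 Gbits/sec    3             sender
"""


def parse_intervals(output: str) -> List[Dict[str, float]]:
    """Return one dict per interval line; header and summary lines are skipped."""
    rows = []
    for line in output.splitlines():
        match = INTERVAL_RE.search(line)
        if match:
            rows.append({key: float(val) for key, val in match.groupdict().items()})
    return rows


if __name__ == "__main__":
    rows = parse_intervals(SAMPLE)
    print("intervals:", len(rows))
    print("total retransmits:", sum(int(r["retr"]) for r in rows))
    print("peak bitrate (Gbits/sec):", max(r["gbits"] for r in rows))
```

For the run above, this finds eight intervals and three retransmits in total, which agrees with the Retr count on the sender summary line.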

Regarding iPerf CLI syntax, the following options are available in each version of the tool:

Options Description iPerf iPerf3
<none> Default settings X
--authorized-users-path Path to the configuration file containing authorized users credentials to run iperf tests (if built with OpenSSL support) X
-A Set the CPU affinity, if possible (Linux, FreeBSD, and Windows only). X
-b Set target bandwidth/bitrate to n bits/sec. In classic iPerf this applies to UDP only (default 1 Mbit/sec); in iPerf3 the default is 1 Mbit/sec for UDP and unlimited for TCP. X X
-B Bind to <host>, an interface or multicast address X X
-c Run in client mode, connecting to <host> X X
-C Compatibility; for use with older versions. Does not send extra messages X
-C Set the congestion control algorithm (Linux and FreeBSD only) X
--cport Bind data streams to a specific client port (for TCP and UDP only, default is to use an ephemeral port) X
--connect-timeout Set timeout for establishing the initial control connection to the server, in milliseconds. Default behavior is the OS’ timeout for TCP connection establishment. X
-d Simultaneous bi-directional bandwidth X
-d Emit debugging output X
-D Run the server as a daemon X X
--dscp Set the IP DSCP bits X
-f Format to report: Kbits/Mbits/Gbits/Tbits X
-F Input the data to be transmitted from a file X X
--forceflush Force flushing output at every interval, to avoid buffering when sending output to pipe. X
--fq-rate Set a rate to be used with fair-queueing based socket-level pacing, in bits per second. X
--get-server-output Get the output from the server. The output format is determined by the server (i.e., JSON ‘-J’) X
-h Help X X
-i Interval: Pause n seconds between periodic bandwidth reports. X X
-I Input the data to be transmitted from stdin X
-I Write a file with the process ID X
-J Output in JSON format X
-k Number of blocks (packets) to transmit (instead of -t or -n) X
-l Length of buffer to read or write.  For TCP tests, the default value is 128KB.  With UDP, iperf3 tries to dynamically determine a reasonable sending size based on the path MTU; if that cannot be determined it uses 1460 bytes as a sending size. For SCTP tests, the default size is 64KB. X
-l Set length of read/write buffer (defaults to 8 KB) X
-L Set the IPv6 flow label X
--logfile Send output to a log file. X
-m Print TCP maximum segment size (MTU – TCP/IP header) X
-M Set TCP maximum segment size (MTU – 40 bytes) X X
-n number of bytes to transmit (instead of -t) X X
-N Set TCP no delay, disabling Nagle’s Algorithm X X
--nstreams Set number of SCTP streams. X
-o Output the report or error message to a specified file X
-O Omit the first n seconds of the test, to skip past the TCP slow-start period. X
-p Port: set server port to listen on/connect to X X
-P Number of parallel client threads to run X X
--pacing-timer Set pacing timer interval in microseconds (default 1000 microseconds, or 1 ms). This controls iperf3’s internal pacing timer for the -b/--bitrate option. X
-r Bi-directional bandwidth X
-R Reverse the direction of a test, so that the server sends data to the client X
--rsa-private-key-path Path to the RSA private key (not password-protected) used to decrypt authentication credentials from the client (if built with OpenSSL support). X
--rsa-public-key-path Path to the RSA public key used to encrypt authentication credentials (if built with OpenSSL support) X
-s Run iPerf in server mode X X
-S Set the IP type of service. X
--sctp Use SCTP rather than TCP (FreeBSD and Linux) X
-t Time in seconds to transmit for (default 10 secs) X X
-T Time-to-live, for multicast (default 1) X
-T Prefix every output line with this title string X
-u Use UDP rather than TCP. X X
-U Run in single threaded UDP mode X
--username Username to use for authentication to the iperf server (if built with OpenSSL support). The password will be prompted for interactively when the test is run. X
-v Print version information and quit X X
-V Set the domain to IPv6 X
-V Verbose – give more detailed output X
-w TCP window size (socket buffer size) X X
-x Exclude C(connection), D(data), M(multicast), S(settings), V(server) reports X
-X Bind SCTP associations to a specific subset of links using sctp_bindx X
-y If set to C or c, report results as CSV (comma separated values) X
-Z Set TCP congestion control algorithm (Linux only) X
-Z Use a ‘zero copy’ method of sending data, such as sendfile instead of the usual write. X
-1 Handle one client connection, then exit. X
-4 Only use IPv4 X
-6 Only use IPv6 X

To run the iPerf server across all nodes in a cluster, it can be initiated in conjunction with the OneFS ‘isi_for_array’ CLI utility, as follows:

# isi_for_array iperf -s

Testing in the opposite direction can also sometimes be a useful sanity-check, with a OneFS node acting as the iPerf client and pointing to a client OS running the server instance of iPerf. For example:

# iperf -c 10.10.11.205 -i 5 -t 60 -P 4

Conversely, start the iPerf client on a Linux client and connect to one of the PowerScale nodes:

# iperf -c 10.10.1.100

For a Windows client, the same CLI syntax, issued from the command shell (cmd.exe), can be used to start the iPerf client and connect to a PowerScale node. For example:

C:\Users\pocadmin\Downloads\iperf-2.0.9-win64\iperf-2.0.9-win64>iperf.exe -c 10.10.0.196
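
Since the client invocations above all share the same shape, they are easy to assemble in a script when testing against many nodes. The following is a minimal Python sketch that builds the argument list using only the -c, -i, -t, and -P options from the table above. The function name `build_client_cmd` and its parameters are hypothetical, chosen purely for illustration.

```python
import shlex
from typing import List, Optional


def build_client_cmd(server: str,
                     interval: Optional[int] = None,
                     duration: Optional[int] = None,
                     parallel: Optional[int] = None,
                     binary: str = "iperf") -> List[str]:
    """Return an argument list such as ['iperf', '-c', host, '-i', '5', ...]."""
    cmd = [binary, "-c", server]
    # Flags are appended in the same order the examples above use them:
    # -i (report interval), -t (duration in seconds), -P (parallel streams).
    for flag, value in (("-i", interval), ("-t", duration), ("-P", parallel)):
        if value is None:
            continue
        if value <= 0:
            raise ValueError(f"{flag} expects a positive integer, got {value}")
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Reproduces the command line used in the example above.
    cmd = build_client_cmd("10.10.11.205", interval=5, duration=60, parallel=4)
    print(shlex.join(cmd))
```

The returned list can be handed directly to Python's subprocess.run() on a host where iPerf is installed, or joined with shlex.join() for logging.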

iPerf Write Testing

When it comes to write performance testing, the following CLI syntax can be used on the client to execute a write speed (Client -> Cluster) test:

# iperf -P 8 -c <clusterIP>

Note that the ‘-P’ flag designates the number of parallel client threads, allowing the iPerf threads to be matched to the number of physical CPU cores (not hyper-threads) available to the client.

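Similarly, the following CLI command can be used on the client to initiate a read speed (Client <- Cluster) test. Note that reverse mode (-R) is an iPerf3 option, so this test requires iperf3 on the client and an iperf3 server on the cluster node:

# iperf3 -P 8 -R -c <clusterIP>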

Below is an example command from a Linux VM to a single PowerScale node. Testing was repeated from each Linux client to each node in the cluster to validate results and verify consistent network performance. With the cluster nodes acting as the server, the measured bandwidth was approximately 7.2 Gbps per VM. (Note that, in this case, the VM limit is 8.0 Gbps):

# iperf -c onefs-node1 -i 5 -t 60 -P 4

------------------------------------------------------------

Client connecting to onefs-node1, TCP port 5001

TCP window size: 94.5 KByte (default)

------------------------------------------------------------

[  4] local 10.10.0.205 port 44506 connected with 172.16.0.5 port 5001

[SUM]  0.0-60.0 sec  50.3 GBytes  7.20 Gbits/sec
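
To put the [SUM] figure in context against the 8.0 Gbps VM limit mentioned above, the aggregate line can be parsed and expressed as a fraction of that limit. This is a rough sketch in Python, assuming the [SUM] line is formatted exactly as shown (GBytes and Gbits/sec units); the helper names are illustrative only.

```python
import re

# Matches a classic iPerf aggregate line such as:
#   [SUM]  0.0-60.0 sec  50.3 GBytes  7.20 Gbits/sec
# Assumption: units are GBytes and Gbits/sec, as in the run above.
SUM_RE = re.compile(
    r"\[SUM\]\s+([\d.]+)-\s*([\d.]+)\s+sec\s+"   # interval start and end
    r"([\d.]+)\s+GBytes\s+"                      # total transfer
    r"([\d.]+)\s+Gbits/sec"                      # aggregate bandwidth
)


def sum_bandwidth_gbps(line: str) -> float:
    """Extract the aggregate bandwidth (Gbits/sec) from a [SUM] line."""
    match = SUM_RE.search(line)
    if match is None:
        raise ValueError("not a recognised [SUM] line")
    return float(match.group(4))


def utilization(measured_gbps: float, limit_gbps: float) -> float:
    """Return measured bandwidth as a fraction of the stated limit."""
    if limit_gbps <= 0:
        raise ValueError("limit must be positive")
    return measured_gbps / limit_gbps


if __name__ == "__main__":
    line = "[SUM]  0.0-60.0 sec  50.3 GBytes  7.20 Gbits/sec"
    measured = sum_bandwidth_gbps(line)
    # 8.0 Gbps is the VM limit quoted in the text above.
    print(f"{measured:.2f} Gbits/sec = {utilization(measured, 8.0):.0%} of the VM limit")
```

For this run, 7.20 Gbits/sec works out to about 90% of the 8.0 Gbps VM limit.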

Two Linux VMs were also tested running iPerf in parallel to saturate the ExpressRoute network link. This test involved concurrent iPerf writes from the two Linux clients to separate cluster nodes.

[admin@Linux64GB16c-3 ~]$ iperf -c onefs-node3 -i 5 -t 40 -P 4

[SUM]  0.0-40.0 sec  22.5 GBytes  4.83 Gbits/sec 

[admin@linux-vm2 ~]$ iperf -c onefs-node2 -i 5 -t 40 -P 4

[SUM]  0.0-40.0 sec  22.1 GBytes  4.75 Gbits/sec

As can be seen from the results of the iPerf tests, writes split almost evenly across the two Linux clients and their cluster nodes (4.83 and 4.75 Gbits/sec, or roughly 9.6 Gbits/sec combined), while saturating the bandwidth of the Azure ExpressRoute link.
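
The split between the two parallel runs can be quantified with a few lines of Python. This is only a sketch: it takes the two [SUM] bandwidth figures reported above as inputs, keyed by the node each client targeted, and the function name `aggregate` is hypothetical.

```python
from typing import Dict, Tuple


def aggregate(results_gbps: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return the combined bandwidth and each run's share of that total."""
    total = sum(results_gbps.values())
    if total <= 0:
        raise ValueError("combined bandwidth must be positive")
    shares = {name: rate / total for name, rate in results_gbps.items()}
    return total, shares


if __name__ == "__main__":
    # [SUM] figures from the two parallel runs above, in Gbits/sec.
    runs = {"onefs-node3": 4.83, "onefs-node2": 4.75}
    total, shares = aggregate(runs)
    print(f"combined: {total:.2f} Gbits/sec")
    for node, share in shares.items():
        print(f"{node}: {share:.1%} of combined throughput")
```

For the two runs above, the combined figure is about 9.58 Gbits/sec, with each client carrying close to half of it.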
