OneFS System Partition Hygiene

Like most UNIX-derived operating systems, OneFS uses several system partitions in addition to its /ifs clustered data partition. These include:

Partition    Description
/            Root partition containing the data needed to start up and run the system, including the base OneFS software image.
/dev         Device files partition. Drives, for example, are accessed through block device files such as /dev/ad0.
/ifs         Clustered filesystem partition, which spans all of a cluster’s nodes. Includes /ifs/.ifsvar.
/usr         Partition for user programs.
/var         Partition for variable data, such as log files. In OneFS, this partition is mostly used for /var/run and /var/log.
/var/crash   Crash partition, configured for binary dumps.

One advantage of having separate partitions rather than one big chunk of space is that different parts of the OS are somewhat protected from each other. For example, if /var fills up, it doesn’t affect the root / partition.
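
A node’s partition layout and current usage can be viewed with the standard ‘df’ utility (output omitted here, since partition sizes vary by node type and OneFS version):

# df -h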

While OneFS automatically performs the vast majority of its system housekeeping, occasionally the OneFS /var partition on one or more of a cluster’s nodes will fill up, typically as the result of heavy log writing activity and/or the presence of corefile(s). If /var reaches 75%, 85%, or 95% of capacity, a CELOG event is automatically fired and an alert sent.
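
Any such outstanding alerts can be checked from the CLI of any node. For example (a sketch; the exact event wording varies by OneFS release):

# isi event events list | grep -i var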

The following CLI command will provide a view of /var usage across the cluster:

# isi_for_array -s "du -kx /var | sort -n | tail -n 10"

The typical resolution for this scenario is to rotate the logfiles under /var/log. Log rotation usually resolves a full-partition issue by compressing or removing large and old logs, thereby automatically reducing partition usage. If, after log rotation, the /var partition returns to a normal usage level, reviewing the list of recently written logs will usually reveal whether a specific log is rotating frequently or excessively.
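
As a quick sketch, the largest and most recently written logs can be listed, and an immediate rotation forced, using standard FreeBSD tooling (OneFS normally triggers rotation automatically on a schedule):

# ls -lhS /var/log | head    # largest logs first
# ls -lt /var/log | head     # most recently written logs first
# newsyslog -F               # force a rotation per /etc/newsyslog.conf
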
The ‘df -i’ CLI command, run on the node that reported the error, will display the details of the /var partition. For example:

# df -i | grep var | grep -v crash
Filesystem         1K-blocks   Used    Avail   Capacity  iused  ifree   %iused  Mounted on
/dev/mirror/var0   1013068     49160   882864  5%        1650   139276  92%     /var

If the inode usage (%iused) value is 90% or higher, as above, the recommendation is to reduce the number of files in the /var partition. To identify files that do not belong in the /var partition, first run the following ‘find’ command on the node that generated the alert. This will display any files in the /var partition that are greater than 5 MB in size:

# find -x /var -type f -size +10000 -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'

The output will show any large files that do not typically belong in the /var partition. These could include artifacts such as OneFS install packages, cluster log gathers, packet captures, or other user-created files. Remove these files or move them to the /ifs filesystem. If you are unsure which, if any, files are viable candidates for removal, contact Dell Support for assistance.
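
For example, a stray log gather or package could be relocated out of /var and onto /ifs as follows (the filename here is purely hypothetical):

# mv /var/tmp/cluster_log_gather.tgz /ifs/data/Isilon_Support/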

The ‘fstat’ CLI command is a useful tool for listing the open files on a node or in a directory, or for displaying files that were opened by a particular process. This information can be invaluable for determining whether a process is holding a large file open. For example, a node’s open files can be displayed as follows:

# fstat

A list of the open files can help in monitoring the processes that are writing large files.
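
Since fstat reports each open file’s size in its SZ|DV column, the output can also be sorted to surface the largest open files (a sketch, assuming the default column layout where size is the eighth field; non-file entries sort to the bottom):

# fstat | sort -rn -k8 | head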

Using the ‘-f’ flag will narrow the fstat output to a particular directory:

# fstat -f <directory_path>

Similarly, to list the files opened by a particular process:

# fstat -p <pid>

If no large files are found in the /var directory, it is entirely possible that a large file has become unlinked but is still consuming space because one or more processes have it open. The fstat command can be used to confirm this, as follows:

# fstat -f /var | grep var

If a process is holding a file open, output similar to the following is displayed:

root lwio 98281 4 /var 69612 -rw------- 100120000 rw

Here, the lwio daemon (PID 98281) has a file open that is approximately 100 MB (100120000 bytes) in size. The file’s inode number, 69612, can be used to retrieve its name:

# find -x /var -inum 69612 -print

/var/log/lwiod.log

If a process is holding a large file open and its inode cannot be found, the file is considered to be ‘unlinked’. In this case, the recourse is typically to restart the offending process. Note that, before stopping and restarting a process, consider any possible negative consequences. For example, stopping the OneFS SMB daemon, lwiod, in the example above would potentially disconnect SMB users.
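
As a minimal sketch, assuming the daemon is managed by the OneFS service monitor (which typically restarts managed services automatically), the process can simply be sent a termination signal by PID, and its restart then verified:

# kill 98281                               # send SIGTERM to the offending process
# ps -auxww | grep lwio | grep -v grep     # confirm the daemon has restarted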

If neither of the suggestions above resolves the issue, the logfile’s rollover size limit can be reduced and its rotated archives compressed. To do this, first create a backup of the /etc/newsyslog.conf file as follows:

# cp /etc/newsyslog.conf /ifs/newsyslog.conf
# cp /etc/newsyslog.conf /etc/newsyslog.bak

Next, open the /ifs/newsyslog.conf file in emacs, vi, or the editor of your choice and locate the following line:

/var/log/wtmp 644 3 * @01T05 B

Change the line to:

/var/log/wtmp 644 3 10000 @01T05 ZB

These changes instruct the system to roll over the /var/log/wtmp file when it reaches 10 MB (the size field is expressed in kilobytes) and then to compress the rotated file with gzip. Save and close the /ifs/newsyslog.conf file, and then run the following command to copy the updated ‘newsyslog.conf’ file to every node in the cluster:

# isi_for_array 'cp /ifs/newsyslog.conf /etc/newsyslog.conf'
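
To confirm the change has propagated, and to preview what a forced rotation would do before actually running one, something like the following can be used (‘newsyslog -n’ prints the actions it would take without performing them):

# isi_for_array 'grep wtmp /etc/newsyslog.conf'    # verify the edit on every node
# newsyslog -n -F /var/log/wtmp                    # dry-run a forced rotation on one node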

If other logs are rotating frequently, or if the preceding solutions do not resolve the issue, run the isi_gather_info command to gather logs, and then contact Dell Support for assistance.

There are several options available to stop processes and create a corefile under OneFS:

CLI Command   Description
gcore         Generate a core dump file of a running process without actually killing it.
kill -6       Stop a single process and generate a core dump file.
killall -6    Stop all processes with a given name and generate core dump files.
kill -9       Force a process to stop, without generating a core dump.
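
For example, sending signal 6 (SIGABRT) to a PID stops that process and writes a corefile to the node’s configured core dump location. Since this terminates the target, ‘gcore’ is usually the safer choice on production daemons:

# kill -6 <pid>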

The ‘gcore’ CLI command can generate a core dump file from a running process without actually killing it. First, the ‘ps’ CLI command can be used to find and display the process ID (PID) for a running process:

# ps -auxww | egrep 'USER|lsass' | grep -v grep

USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
root   68547  0.0  0.3 150464 38868 ??   S    Sun11PM   0:06.87 lw-container lsass (lsass)

In the above example, the PID for the lsass process is 68547. Next, the ‘gcore’ CLI command can be used to generate a core dump of this PID and write the output to a location of choice, in this example a file aptly named ‘lsass.core’.

# gcore -c /ifs/data/Isilon_Support/lsass.core 68547

# ls -lsia /ifs/data/Isilon_Support/lsass.core
4297467006 58272 -rw-------     1 root  wheel  239280128 Jun 10 19:10 /ifs/data/Isilon_Support/lsass.core

Typically, the /ifs/data/Isilon_Support directory provides an excellent location to write the coredump to. Clearly, /var is not a great choice, since the partition is likely already full.
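
If this directory does not already exist on a cluster, it can simply be created first:

# mkdir -p /ifs/data/Isilon_Support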

Finally, when the coredump has been written, the ‘isi_gather_info’ tool can be used to coalesce both the core file and pertinent cluster logs into a convenient tarfile.

# isi_gather_info --local-only -f /ifs/data/Isilon_Support/lsass.core

# ls -lsia /ifs/data/Isilon_Support | grep -i gather
4298180122    26 -rw-r--r-- +    1 root  wheel         19 Jun 10 15:44 last_full_gather

The resulting log set, referenced by ‘/ifs/data/Isilon_Support/last_full_gather’, is then ready for upload to Dell Support for further investigation and analysis.
