OneFS Small File Storage Efficiency – Part 2

There are three main CLI commands that report on the status and effect of small file efficiency:

  • isi job reports view <job_id>
  • isi_packing --fsa
  • isi_sfse_assess

When running the isi job reports view command, enter the job ID as an argument. In the command output, the ‘Files packed’ field indicates how many files have been successfully containerized. For example, for job ID 1018:

# isi job reports view -v 1018

SmartPools[1018] phase 1 (2020-08-02T10:29:47

---------------------------------------------

Elapsed time                        12 seconds

Working time                        12 seconds

Group at phase end                  <1,6>: { 1:0-5, smb: 1, nfs: 1, hdfs: 1, swift: 1, all_enabled_protocols: 1}

Errors

'dicom':

      {'Policy Number': 0,

      'Files matched': {'head': 512, 'snapshot': 256}

      'Directories matched': {'head': 20, 'snapshot': 10},

      'ADS containers matched': {'head': 0, 'snapshot': 0},

      'ADS streams matched': {'head': 0, 'snapshot': 0},

      'Access changes skipped': 0,

      'Protection changes skipped': 0,

      'Packing changes skipped': 0,

      'File creation templates matched': 0,

      'Skipped packing non-regular files': 2,

      'Files packed': 48672,

      'Files repacked': 0,

      'Files unpacked': 0,

      },

}
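When many job reports need to be checked, the counters can be pulled out of saved report text programmatically. The following is a minimal Python sketch, not part of OneFS: the function name parse_counters is hypothetical, and it assumes the report has been captured as text in the layout shown above, with one quoted field name and one integer per line.

```python
import re

# Matches lines such as:  'Files packed': 48672,
# Straight and typographic quotes are both accepted, since copied
# output may contain either. Nested values such as
# {'head': 512, 'snapshot': 256} are deliberately not matched.
COUNTER_LINE = re.compile(
    r"^[ \t]*['‘’]([^'‘’]+)['‘’][ \t]*:[ \t]*(\d+)[ \t]*,?[ \t]*$",
    re.MULTILINE,
)

def parse_counters(report_text: str) -> dict:
    """Return a dict of simple integer counters found in report text."""
    return {name: int(value) for name, value in COUNTER_LINE.findall(report_text)}

# Sample lines copied from the report above.
sample = """
'Packing changes skipped': 0,
'Skipped packing non-regular files': 2,
'Files packed': 48672,
'Files repacked': 0,
'Files unpacked': 0,
"""

counters = parse_counters(sample)
print(counters["Files packed"])  # 48672
```

Anchoring the pattern to whole lines keeps the per-policy nested values out of the result, at the cost of skipping any counter that shares a line with other text.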

The second command, isi_packing --fsa, provides a storage efficiency percentage in the last line of its output. This command requires InsightIQ to be licensed on the cluster and a successful run of the file system analysis (FSA) job.

If FSA has not been run previously, it can be kicked off with the isi job jobs start FSAnalyze command. For example:

# isi job jobs start FSAnalyze

Started job [1018]

When this job has completed, run:

# isi_packing --fsa --fsa-jobid 1018

FSAnalyze job: 1018 (Mon Aug 2 22:01:21 2020)

Logical size:  47.371T

Physical size: 58.127T

Efficiency:    81.50%

In this case, the storage efficiency achieved after containerizing the data is 81.50%, as reported by isi_packing.
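The reported percentage is consistent with dividing the logical size by the physical size. The short Python sketch below reproduces the figure from the output above; the function name storage_efficiency is illustrative, and the formula is an inference from the sample numbers rather than a documented definition.

```python
def storage_efficiency(logical: float, physical: float) -> float:
    """Return logical/physical as a percentage (both in the same unit)."""
    if physical <= 0:
        raise ValueError("physical size must be positive")
    return 100.0 * logical / physical

# Values from the isi_packing output above, both in terabytes.
eff = storage_efficiency(47.371, 58.127)
print(f"{eff:.2f}%")  # 81.50%
```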

If you don’t specify an FSAnalyze job ID, the --fsa option defaults to the results of the last successful FSAnalyze job.

Be aware that the isi_packing --fsa command reports on the whole /ifs filesystem. This means that the reported efficiency percentage can be misleading if other, non-containerized data is also present on the cluster.

There is also a storage efficiency assessment tool, which can be run from the CLI with the following syntax:

# isi_sfse_assess <options>

The tool’s output presents the estimated storage efficiency as raw space savings (both a total and a percentage) and as a percentage reduction in protection group overhead:

SFSE estimation summary:

* Raw space saving: 1.7 GB (25.86%)

* PG reduction: 25978 (78.73%)

When containerized files with shadow references are deleted, truncated, or overwritten, unreferenced blocks can be left behind in shadow stores. These blocks are later freed, which can leave holes that reduce storage efficiency.

The actual efficiency loss depends on the protection level layout used by the shadow store.  Smaller protection group sizes are more susceptible, as are containerized files, since all the blocks in containers have at most one referring file and the packed sizes (file size) are small.

A shadow store defragmenter helps reduce the fragmentation that results from overwrites and deletes of files. This defragmenter is integrated into the ShadowStoreDelete job. The defragmentation process works by dividing each containerized file into logical chunks (~32MB each) and assessing each chunk for fragmentation.

If the storage efficiency of a fragmented chunk is below target, that chunk is processed by evacuating the data to another location. The default target efficiency is 90% of the maximum storage efficiency available with the protection level used by the shadow store. Larger protection group sizes can tolerate a higher level of fragmentation before the storage efficiency drops below this threshold.
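The threshold test can be illustrated with a small Python sketch. This is a simplified model, not the OneFS implementation: it assumes the maximum efficiency of an N+M layout is N / (N + M), and the function names and example layouts are hypothetical.

```python
def max_efficiency(data_units: int, fec_units: int) -> float:
    """Best-case efficiency of a fully populated N+M protection group (assumed model)."""
    return data_units / (data_units + fec_units)

def needs_defrag(chunk_efficiency: float, data_units: int, fec_units: int,
                 target_fraction: float = 0.90) -> bool:
    """True if a chunk's efficiency falls below the target for its layout."""
    target = target_fraction * max_efficiency(data_units, fec_units)
    return chunk_efficiency < target

# Hypothetical 4+2 layout: maximum is about 0.667, so the target is about 0.60.
print(needs_defrag(0.55, 4, 2))  # True
print(needs_defrag(0.65, 4, 2))  # False
```

The 0.90 default mirrors the 90% target described above; everything else in the model is an assumption made for illustration.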

The ‘isi_sstore list’ command will display fragmentation and efficiency scores. For example:

# isi_sstore list -v

SIN                 lsize   psize   refs  filesize date         sin type  underfull frag score efficiency

4100:0001:0001:0000 128128K 192864K 32032 128128K  Sep 20 22:55 container no        0.01       0.66

The fragmentation score is the ratio of holes in the data where FEC is still required, whereas the efficiency value is a ratio of logical data blocks to total physical blocks used by the shadow store. Fully sparse stripes don’t need FEC so are not included. The general rule is that lower fragmentation scores and higher efficiency scores are better.
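In the sample output above, the efficiency column is consistent with dividing lsize by psize. The Python sketch below checks that arithmetic; the helper names are hypothetical, and it assumes both sizes carry the same "K" suffix and that byte sizes are proportional to block counts.

```python
def parse_k(size: str) -> int:
    """Convert a size string such as '128128K' to an integer count of K units."""
    if not size.endswith("K"):
        raise ValueError(f"expected a size ending in 'K', got {size!r}")
    return int(size[:-1])

def sstore_efficiency(lsize: str, psize: str) -> float:
    """Ratio of logical size to physical size for one shadow store."""
    return parse_k(lsize) / parse_k(psize)

# lsize and psize from the isi_sstore list output above.
ratio = sstore_efficiency("128128K", "192864K")
print(round(ratio, 2))  # 0.66
```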

The defragmenter does not require a license to run and is disabled by default. However, it can be easily activated using the following CLI command:

# isi_gconfig -t defrag-config defrag_enabled=true

Once enabled, the defragmenter can be started via the job engine’s ShadowStoreDelete job, either from the OneFS WebUI or via the following CLI command:

# isi job jobs start ShadowStoreDelete

The defragmenter can also be run in an assessment mode. This reports on and helps to determine the amount of disk space that will be reclaimed, without moving any actual data. The ShadowStoreDelete job can run the defragmenter in assessment mode but the statistics generated are not reported by the job. The isi_sstore CLI command has a ‘defrag’ option and can be run with the following syntax to generate a defragmentation assessment:

# isi_sstore defrag -d -a -c -p -v

…

Processed 1 of 1 (100.00%) shadow stores, space reclaimed 31M

Summary:

    Shadows stores total: 1

    Shadows stores processed: 1

    Shadows stores skipped: 0

    Shadows stores with error: 0

    Chunks needing defrag: 4

    Estimated space savings: 31M
