OneFS Job Engine Exclusion Sets

In order for the OneFS Job Engine framework to support concurrent job execution, it provides the concept of exclusion sets. These are classes of similar jobs that determine which jobs can run simultaneously.

A job is not required to be part of any exclusion set, and jobs may also belong to multiple exclusion sets. As the graphic above shows, there are currently two exclusion sets that jobs can be part of:

  • Restriping
  • Marking.

Let’s first take a look at the restriping exclusion set.

OneFS protects data by writing file blocks across multiple drives on different nodes. This process is known as ‘restriping’ in the OneFS lexicon. The Job Engine defines a restripe exclusion set that contains these jobs that involve file system management, protection and on-disk layout.

The restripe exclusion set contains the following jobs:

Job Name Job Description Access Method
AutoBalance Balances free space in the cluster. Drive + LIN
AutoBalanceLin Balances free space in the cluster. LIN
FlexProtect Rebuilds and re-protects the file system to recover from a failure scenario. Drive + LIN
FlexProtectLin Re-protects the file system. LIN
MediaScan Scans drives for media-level errors. Drive + LIN
MultiScan Runs Collect and AutoBalance jobs concurrently. LIN
SetProtectPlus Applies the default file policy. This job is disabled if SmartPools is activated on the cluster. LIN
ShadowStoreProtect Protect shadow stores which are referenced by a LIN with higher requested protection. LIN
SmartPools Job that runs and moves data between the tiers of nodes within the same cluster. Also executes the CloudPools functionality if licensed and configured. LIN
Upgrade Manage OneFS upgrades. LIN

The restriping exclusion set is per-phase instead of per job. This helps to more efficiently parallelize restripe jobs when they don’t need to lock down resources.  Restriping jobs only block each other when the current phase may perform restriping. This is most evident with MultiScan, whose final phase only sweeps rather than restripes. Similarly, MediaScan, which rarely ever restripes, is usually able to run to completion more without contending with other restriping jobs.

For example, below the two restripe jobs, MediaScan and AutoBalanceLin, are both running their respective first job phases. ShadowStoreProtect, also a restriping job, is in a ‘waiting’ state, blocked by AutoBalanceLin.

Running and queued jobs:

ID    Type               State       Impact  Pri  Phase  Running Time

----------------------------------------------------------------------

26850 AutoBalanceLin     Running     Low     4    1/3    20d 18h 19m 

26910 ShadowStoreProtect Waiting     Low     6    1/1    -           

28133 MediaScan          Running     Low     8    1/8    1d 15h 37m  

----------------------------------------------------------------------

MediaScan restripes in phases 3 and 5 of the job, and only if there are disk errors (ECCs) which require data reprotection. If MediaScan reaches its third job phase with ECCs, it will pause until AutoBalanceLin is no longer running. If MediaScan’s priority were in the range 1-3, it would cause AutoBalanceLin to pause instead.

If two jobs happen to reach their restriping phases simultaneously and the jobs have different priorities, the higher priority job (ie. priority value closer to “1”) will continue to run, and the other will pause. If the two jobs have the same priority, the one already in its restriping phase will continue to run, and the one newly entering its restriping phase will pause.

Next, we’ll examine the marking exclusion set.

The OneFS file system actively marks the blocks which are actually in use by the file system. A good example of this class of job is IntegrityScan, which traverses the live file system, marking every block of every LIN in the cluster to proactively detect and resolve any issues with the structure of data in a cluster.

The jobs that comprise the marking exclusion set are:

Job Name Job Description Access Method
Collect Reclaims disk space that could not be freed due to a node or drive being unavailable while they suffer from various failure conditions. Drive + LIN
IntegrityScan Performs online verification and correction of any file system inconsistencies. LIN
MultiScan Runs Collect and AutoBalance jobs concurrently. LIN

Jobs may also belong to both exclusion sets. An example of this is MultiScan, since it includes both AutoBalance and Collect.

Multiple jobs from the same exclusion set will not run at the same time. For example, Collect and IntegrityScan cannot be executed simultaneously, as they are both members of the marking jobs exclusion set. Similarly, MediaScan and SetProtectPlus won’t run concurrently, as they are both part of the restripe exclusion set.

The majority of the jobs do not belong to an exclusion set. These are typically the data services feature support jobs, and they can coexist and contend with any of the other jobs.

Exclusion sets do not change the scope of the individual jobs themselves, so any runtime improvements via parallel job execution are the result of job management and impact control. The Job Engine monitors node CPU load and drive I/O activity per worker thread every twenty seconds to ensure that maintenance jobs do not cause cluster performance problems.

If a job affects overall system performance, Job Engine reduces the activity of maintenance jobs and yields resources to clients. Impact policies limit the system resources that a job can consume and when a job can run. You can associate jobs with impact policies, ensuring that certain vital jobs always have access to system resources.

It’s worth noting that the job engine exclusion sets are pre-defined, and cannot be modified or reconfigured.

Leave a Reply

Your email address will not be published. Required fields are marked *