OneFS Quota Domains

In the previous article, we looked at the use of protection domains in OneFS, focusing on SyncIQ replication, SmartLock immutable archiving, and Snapshots and SnapRevert.

Under the hood, SmartQuotas is also based on the concept of domains – the linchpins of quota accounting. Since OneFS is a single file system, it relies on accounting domains for defining the scope of a quota in place of the typical volume boundaries found in most storage systems. As such, a domain defines which files belong to a quota, accounts for each resource type in that set and defines the top-level directory configuration point.

For SmartQuotas, the three main resource types are:

Resource Type   Description
Directory       A specific directory and all its subdirectories
User            A specific user
Group           All members of a specific group

A domain defined as “name@folder” is the set of files under “folder” that are owned by “name”, which can be either a user or a group. The files accounted include all files reachable from the given path, without traversing any soft links. The owner “name” can be ALL, and “/ifs”, the OneFS root directory, is also an effective ALL for “folder”.

With SmartQuotas it’s easy to create traditional domain types quickly by using “ALL”. The following are examples of domain types:

  • All files belonging to user Jane: user:Jane@/ifs
  • All files under /ifs/home, belonging to any user: ALL@/ifs/home
  • All files under /ifs/home that belong to user Jane: user:Jane@/ifs/home
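
To make the notation concrete, here is a minimal Python sketch of how a domain string could be parsed and matched against a file. It is an illustration only, not how SmartQuotas is implemented: the QuotaDomain class and the parse_domain and in_domain helpers are hypothetical names, and the sketch assumes that a group domain is matched against the file's owning group.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class QuotaDomain:
    kind: str   # "user", "group", or "ALL"
    name: str   # owner name; empty when kind is "ALL"
    path: str   # top-level directory of the domain


def parse_domain(spec):
    """Parse strings such as 'user:Jane@/ifs/home' or 'ALL@/ifs/home'."""
    owner, _, path = spec.partition("@")
    if owner == "ALL":
        return QuotaDomain("ALL", "", path)
    kind, _, name = owner.partition(":")
    return QuotaDomain(kind, name, path)


def in_domain(domain, file_path, user, group):
    """True if a file with the given owner and group falls inside the domain."""
    root = PurePosixPath(domain.path)
    target = PurePosixPath(file_path)
    if target != root and root not in target.parents:
        return False  # the file is not under the domain's directory
    if domain.kind == "ALL":
        return True
    if domain.kind == "user":
        return user == domain.name
    if domain.kind == "group":
        # Assumption: group domains match on the file's owning group.
        return group == domain.name
    return False
```

For instance, in_domain(parse_domain("user:Jane@/ifs"), "/ifs/data/x", "Jane", "staff") is true, because /ifs acts as an effective ALL for the folder part.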

Domains cannot be created on anything but directories. More specifically, domains are associated with the actual directories themselves, not directory paths. For example, if the domain is ALL@/ifs/home/data, but /ifs/home/data gets renamed to /ifs/home/files, the domain stays with the directory.

Domains can also be nested and may overlap. For example, say a hard quota is set on /ifs/data/marketing for 5TB. 1TB soft quotas are then placed on individual users in the marketing department. This ensures that the marketing directory as a whole never exceeds 5TB, while limiting the users in the marketing department to 1TB each.
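
The marketing example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the SmartQuotas enforcement path: the Quota class and allocate function are hypothetical, a hard limit rejects the allocation outright, and a soft limit is only reported as exceeded (grace periods are not modeled).

```python
from dataclasses import dataclass

TB = 1024 ** 4


@dataclass
class Quota:
    name: str
    limit: int
    hard: bool
    used: int = 0


def allocate(quotas, size):
    """Charge `size` bytes to every quota domain that covers the file.

    Raises OSError if any hard limit would be exceeded, otherwise returns
    the names of soft quotas that are now over their limit.
    """
    for q in quotas:
        if q.hard and q.used + size > q.limit:
            raise OSError(f"hard quota exceeded on {q.name}")
    over_soft = []
    for q in quotas:
        q.used += size
        if not q.hard and q.used > q.limit:
            over_soft.append(q.name)
    return over_soft
```

A write by a marketing user is charged to both the directory domain and that user's domain, so either one can be the first to react.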

A default quota domain is one that does not account for any specific set of files but instead specifies a policy for new domains that match a specific trigger. In other words, default domains are configuration templates for actual domains. SmartQuotas uses the identity notation ‘default-user’, ‘default-group’, and ‘default-directory’ to describe domains with default policies. For example, the domain default-user@/ifs/home becomes specific-user@/ifs/home for each specific-user that is not otherwise defined. All enforcements on default-user are copied to specific-user when specific-user allocates within the domain, and the new inherited domain quota is termed a Linked Quota. There may be overlapping defaults (i.e. default-user@/ifs and default-user@/ifs/home may both be defined).

Default quota domains help drastically simplify quota management for large environments by providing a mechanism to define top-level template configurations from which many actual quotas are cloned, or linked. When a default quota domain is configured on a directory, any subdirectories created directly underneath it will automatically inherit the quota limits specified in the parent domain. This streamlines the provisioning and management of quotas for large enterprise environments. Furthermore, default directory quotas can co-exist with user and/or group quotas and legacy default quotas.

Default directory quotas have been available since OneFS 8.2, in addition to the default user and group quotas available in earlier releases. For example:

  • Create a default-directory quota
# isi quota create --path=/ifs/parent-dir --type=default-directory --hard-threshold=10M
  • Modify a default-directory quota
# isi quota modify --path=/ifs/parent-dir --type=default-directory --advisory-threshold=6M --soft-threshold=7M --soft-grace=1D
  • List default-directory quotas
# isi quota list
  Type              AppliesTo  Path            Snap  Hard   Soft  Adv  Used
  --------------------------------------------------------------------------
  default-directory DEFAULT    /ifs/parent-dir No    10.00M -    6.00M 0.00
  --------------------------------------------------------------------------
  Total: 1
  • Delete a default-directory quota
# isi quota delete --path=/ifs/parent-dir --type=default-directory

If the enforcements on a default domain change, SmartQuotas will automatically propagate the changes to the Linked Quota domains. If a default quota domain is deleted, SmartQuotas will delete all children marked as inherited. An administrator may also choose to delete the default without deleting the children, but this will break inheritance on all inherited children.
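
The template-and-link behavior described above can be modeled roughly as follows. This is a sketch for illustration only: the DefaultQuota class and its method names are hypothetical, and only a single hard threshold is tracked.

```python
class DefaultQuota:
    """A default domain (a template) plus the quotas linked to it."""

    def __init__(self, path, hard_threshold):
        self.path = path
        self.hard_threshold = hard_threshold
        self.linked = {}  # identity -> {"hard": int, "linked": bool}

    def on_first_allocation(self, identity):
        # A linked quota is cloned from the template the first time an
        # identity allocates within the domain.
        return self.linked.setdefault(
            identity, {"hard": self.hard_threshold, "linked": True}
        )

    def set_hard_threshold(self, value):
        # Changes to the default propagate to every linked quota.
        self.hard_threshold = value
        for quota in self.linked.values():
            if quota["linked"]:
                quota["hard"] = value

    def delete(self, keep_children=False):
        # Deleting the default removes inherited children, unless the
        # administrator keeps them, which breaks inheritance instead.
        survivors = {}
        if keep_children:
            for identity, quota in self.linked.items():
                quota["linked"] = False
                survivors[identity] = quota
        self.linked = {}
        return survivors
```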

For example, creating or deleting a subdirectory under a directory that has a default-directory quota causes the corresponding inherited directory quota to be created or removed.

A domain may be in one of three accounting states, as follows:

Quota Accounting State   Description
Ready                    A domain in the ready state is fully accounted. SmartQuotas displays “ready” domains in all interfaces and all enforcements apply to such domains.
Accounting               A domain is placed in the Accounting state when it’s waiting on accounting updates.
Deleting                 After a request to delete a domain, SmartQuotas will place the domain in the deleting state until tear-down is complete. Domain removal may be a lengthy process.

SmartQuotas displays accounting domains, including their usage data, in all interfaces, but indicates that they are still in the process of being “Accounted”. SmartQuotas applies all enforcements to accounting domains, even though this may reject an allocation that would have succeeded had the QuotaScan already completed.

Domains in the deleting state are hidden from all interfaces and the top-level directory of a domain may be deleted while the domain is still in the deleting state (assuming there are no domains in “Ready” or “Accounting” state defined on the directory). No enforcements are applied for domains in “Deleting” state.
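
The three states and their visible behavior can be summarized in a small lookup table. The Python below is only a restatement of the rules above; the DomainState enum and the helper function are hypothetical names.

```python
from enum import Enum


class DomainState(Enum):
    READY = "ready"
    ACCOUNTING = "accounting"
    DELETING = "deleting"


# Whether a domain is shown in the interfaces and whether its
# enforcements apply, per accounting state.
BEHAVIOR = {
    DomainState.READY: {"visible": True, "enforced": True},
    DomainState.ACCOUNTING: {"visible": True, "enforced": True},
    DomainState.DELETING: {"visible": False, "enforced": False},
}


def can_delete_top_level_dir(states):
    """The directory may be removed only if no Ready or Accounting
    domain is still defined on it."""
    return all(state is DomainState.DELETING for state in states)
```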

A QuotaScan is performed while a domain is in the Accounting state. This can occur during quota creation, to account the new domain if a quota has been set on it, and during quota deletion, to un-account the domain. A QuotaScan is required when creating a quota on a non-empty directory. If quotas are created up-front on an empty directory, no QuotaScan is necessary.

In addition, a QuotaScan job may be started from the WebUI or command line interface using the “isi job” command. Any path specified on the command line is treated as the root of a tree that should be processed. This is provided primarily as a means to re-scan a directory for maintenance reasons.

There are three main processes or daemons associated with SmartQuotas:

  • isi_quota_notify_d
  • isi_quota_sweeper_d
  • isi_quota_report_d

The job of the notification daemon, isi_quota_notify_d, is to listen for ‘limit exceeded’ and ‘link denied’ events and generate notifications for each. It also responds to configuration change events and instructs the QDB to generate ‘expired’ and ‘violated’ over-threshold notifications.

A quota sweeper daemon, isi_quota_sweeper_d, is responsible for a number of quota housekeeping tasks such as propagating default changes, domain and notification rule garbage collection and kicking off QuotaScan jobs when necessary.

Finally, the reporting daemon, isi_quota_report_d, is responsible for generating quota reports. Since the QDB only produces real-time resource usage, reports are necessary for providing point-in-time views of a quota domain’s usage. These historical reports are useful for trend analysis of quota resource usage.

OneFS 8.2 and subsequent releases use the rpc.quotad service to facilitate client-side quota reporting on UNIX and Linux clients via native ‘quota’ tools. The service, which runs on tcp/udp port 762, is enabled by default and is controlled under the NFS global settings.

Additionally, in OneFS 8.2 and later, users can now see their available user capacity set by soft and/or hard user and group quotas rather than the entire cluster capacity or parent directory-quotas. This avoids the ‘illusion’ of seeing available space that may not be associated with their quotas.

OneFS Protection Domains

In OneFS, a domain defines a set of behaviors for a collection of files under a specified directory tree. More specifically, a protection domain is a marker which prevents a configured subset of files and directories from being deleted or modified.

If a directory has a protection domain applied to it, that domain will also affect all of the files and subdirectories under that top-level directory. As we’ll see, in some instances, OneFS creates protection domains automatically, but they can also be configured manually.

With the recent introduction of domain-based snapshots, OneFS now supports four types of protection domain:

  • SnapRevert domains
  • SmartLock domains
  • SyncIQ domains
  • Snapshot domains

The process of restoring a snapshot in full to its top level directory can easily be accomplished by the SnapRevert job. This enables cluster administrators to quickly revert to a previous, known-good recovery point, for example in the event of a virus or malware outbreak. The SnapRevert job can be run from the job engine WebUI or CLI, and simply requires adding the desired snapshot ID.

SnapRevert domains are assigned to directories that are contained in snapshots to prevent files and directories from being modified while a snapshot is being reverted. OneFS does not automatically create SnapRevert domains. The SnapRevert domain is described as a ‘restricted writer’ domain, in OneFS jargon. Essentially, this is a piece of extra filesystem metadata and associated locking that prevents a domain’s files being written to while restoring a last known good snapshot.

Because the SnapRevert domain is essentially just a metadata attribute, or marker, placed onto a file or directory, a preferred practice is to create the domain before there is data. This avoids having to wait for DomainMark or DomainTag (the aptly named Job Engine jobs that mark a domain’s files) to walk the entire tree, setting that attribute on every file and directory within it.

There are two main components to SnapRevert:

  • The file system domain that the objects are put into.
  • The job that reverts everything back to what’s in a snapshot.

The SnapRevert job itself actually uses a local SyncIQ policy to copy data out of the snapshot, discarding any changes to the original directory. When the SnapRevert job completes, the original data is left in the directory tree. In other words, after the job completes, the file system (HEAD) is exactly as it was at the point in time that the snapshot was taken. The LINs for the files/directories don’t change, because what’s there is not a copy.

The SnapRevert job can either be scheduled or manually run from the OneFS WebUI by navigating to Cluster Management > Job Operations > Job Types > SnapRevert and clicking the ‘Start Job’ button.

A snapshot can’t be reverted until a SnapRevert domain has been created on its top level directory. If necessary, SnapRevert domains can also be nested. For example, domains could be successfully created on both /ifs/snap1 and /ifs/snap1/snap2. Also, a SnapRevert domain can easily be deleted if you no longer need to restore snapshots of that directory.

It’s worth noting that CloudPools also supports SnapRevert for SmartLink (stub) files. For example, if CloudPools archived “/ifs/cold_data”, the files in this directory would be replaced with stubs and the data moved off to the cloud provider of choice. If you then created a domain for the directory and ran the SnapRevert job, the original files would be restored to the directory, and CloudPools would remove any cloud data that was created as part of the original archive process.

SmartLock domains are assigned to WORM (write once, read many) immutable archive directories to prevent committed files from being modified or deleted. OneFS automatically sets up a SmartLock domain when a SmartLock directory is created. Note that a SmartLock domain cannot be manually deleted. However, if you remove a SmartLock directory, OneFS automatically deletes the associated SmartLock domain.

Once a file is SmartLocked (WORM committed), it can never be modified or moved. It cannot be deleted until its ‘committed until’ or ‘expiry’ date has passed. Even after the expiry date has passed (i.e. the file is in an ‘expired’ state), it still cannot be modified or moved. All you can do with an expired file is either delete it or extend its ‘committed until’ date into the future.
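
These rules can be restated as a small decision function. The Python below is a sketch of the logic only, not of SmartLock itself: is_allowed is a hypothetical helper, and it assumes that a file which has not yet been committed is unrestricted.

```python
from datetime import datetime


def is_allowed(operation, committed, committed_until, now):
    """Return True if `operation` ("modify", "move" or "delete") is permitted."""
    if operation not in {"modify", "move", "delete"}:
        raise ValueError(f"unknown operation: {operation}")
    if not committed:
        # Assumption: files that are not yet WORM committed are unrestricted.
        return True
    if operation == "delete":
        # Deletion only becomes possible once the retention date has passed.
        return now > committed_until
    # Committed files can never be modified or moved, expired or not.
    return False
```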

SyncIQ domains can be assigned to both the source and target directories of replication policies. OneFS automatically creates a SyncIQ domain for the target directory of a replication policy the first time that the policy is run. OneFS also automatically creates a SyncIQ domain for the source directory of a replication policy during the failback process.

A SyncIQ domain can be manually created for a source directory before initiating the failback process, by configuring the policy for accelerated failback. However, a SyncIQ domain that marks the target directory of a replication policy cannot be deleted.

SnapshotIQ also uses a domain-based model for governance of scheduled snapshots in OneFS 8.2 and later releases. By utilizing the OneFS IFS domains infrastructure, recurring snapshot efficiency and performance is increased by limiting the scope of governance to a smaller, well defined domain boundary.

IFS Domains provide a Mark Job that proactively marks all the files in the domain. Creating a new snapshot on a fully marked domain will not cause further “painting” operations, thereby avoiding a significant portion of the resource overhead caused by taking a new snapshot.

Once a domain has been fully marked, subsequent snapshot creation operations will not cause any further painting. The new snapshot ID is simply added to the domain data section, so the creation of a new snapshot will not trigger a system-wide painting event anymore. Domains are re-used whenever possible.

Creating two domains of the same type on the same directory will cause the second domain to become an alias of the first domain. Aliases don’t require marking since they share the already existing marks. This benefits both snapshots and snapshot schedules taken on the same directory. For all these reasons, the number of I/O and locking operations needed to resolve snapshot governance is greatly reduced. Because the SnapIDs are stored in a single location (as opposed to being stored on individual inodes), Snapshot ID garbage collection is also greatly simplified whenever a snapshot is deleted.

The illustration above shows an example of domain-based snapshots. In this case, a snapshot was taken on the ‘projects’ directory, and then on the directory named ‘video’. File v1.mp4 is tagged with the domain IDs, making it more efficient to determine snapshot governance.

A snapshot of file v1.mp4 creates a snap_ID in the domain’s SBT (system b-tree) providing a single place to store snapshot metadata. In previous OneFS versions, snapIDs were stored in the inode, which resulted in duplication of the snap_IDs and metadata usage.
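
The bookkeeping difference can be illustrated with a toy model. The Python below is a conceptual sketch only and does not reflect OneFS data structures: SnapshotDomain and DomainRegistry are hypothetical classes, and a counter stands in for the marking job.

```python
class SnapshotDomain:
    def __init__(self, root):
        self.root = root
        self.snap_ids = []   # stored once, in the domain, not on each inode
        self.marked = False


class DomainRegistry:
    def __init__(self):
        self._by_root = {}
        self.mark_jobs_run = 0

    def get_or_create(self, root):
        # A second request for the same directory reuses the existing
        # domain, much like an alias sharing the existing marks.
        if root not in self._by_root:
            self._by_root[root] = SnapshotDomain(root)
        return self._by_root[root]

    def take_snapshot(self, root, snap_id):
        domain = self.get_or_create(root)
        if not domain.marked:
            self.mark_jobs_run += 1   # files are marked once per domain
            domain.marked = True
        domain.snap_ids.append(snap_id)  # later snapshots only add an ID
        return domain

    def delete_snapshot(self, snap_id):
        # Garbage collection touches the domain records, not every inode.
        for domain in self._by_root.values():
            if snap_id in domain.snap_ids:
                domain.snap_ids.remove(snap_id)
```

Taking two snapshots of the same directory in this model runs the marking step once and leaves both IDs in a single list.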

Note that only snapshots taken after upgrade to OneFS 8.2 will use IFS domains backing. Any snapshots created prior to upgrade will not be converted and will remain in their original form.

Additionally, the new domain-based snapshot functionality in OneFS 8.2 brings other benefits including:

  • Improved management of SnapIDs
  • Reduced number of operations needed to resolve snapshot governance.
  • More efficient use of metadata
  • The automatic exclusion of the cluster’s /ifs/.ifsvar subtree from all root (/ifs) snapshots – although this behavior is configurable.
  • The write cache, or coalescer, is enhanced to better support parallel snapshot creates.
  • The snapshot create path is improved to reduce contention on the STF during copy-on-write.

Sync and snap domains can be easily created to enable snapshot revert and replication failover operations. SmartLock domains, however, cannot be manually created, since OneFS automatically creates a domain upon creation of a SmartLock directory.

For example, the following CLI syntax will create a SnapRevert domain for /ifs/snap1:

# isi job jobs start domainmark --root /ifs/snap1 --dm-type SnapRevert

And from the WebUI:

You can delete a replication or snapshot revert domain if you want to move directories out of the domain. However, SmartLock domains cannot be manually removed, but will be automatically removed upon deletion of a SmartLock directory.

The following CLI command will delete a SnapRevert domain on /ifs/snap1:

# isi job jobs start domainmark --root /ifs/snap1 --dm-type SnapRevert --delete

Similarly, via the WebUI:

Protection domains can (and usually should) be manually created before they are required by OneFS to perform certain actions. However, manually creating protection domains can limit the ability to interact with the data marked by the domain.

OneFS 8.2 and later releases provide an ‘isi_pdm’ CLI utility for managing protection domains, with the following syntax:

# isi_pdm -h
usage: isi_pdm [-h] [-v]
               {base,domains,exclusions,operations,ifsvar-sysdom} ...

positional arguments:
  {base,domains,exclusions,operations,ifsvar-sysdom}
    base                Read base domains.
    domains             Read or manipulate domain instances.
    exclusions          Add or list domain exclusions.
    operations          Read pending pdm operations.
    ifsvar-sysdom       Manage .ifsvar system domain.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose

For example:

# isi_pdm domains list /ifs/data All
[ 2.0100, 315.0100 ]

# isi_pdm exclusions list 2.0100
{
    DomID = 16.8100
    Owner LIN = 1:0000:0001
}

Domain membership can also be viewed via the ‘isi get’ command.

Here are some OneFS domain recommendations, constraints, and considerations:

  • Copying a large number of files into a protection domain can be a lengthy process, since each file must be marked individually as belonging to the protection domain.
  • The best practice is to create protection domains for directories while the directories are empty, and then add files to the directory.
  • The isi sync policies create command contains an ‘--accelerated-failback true’ option, which automatically marks the domain. This can save considerable time during failback.
  • If you use SyncIQ to create a replication policy for a SmartLock compliance directory, the SyncIQ and SmartLock compliance domains must be configured at the same root directory level. A SmartLock compliance domain cannot be nested inside a SyncIQ domain.
  • If a domain is currently preventing the modification or deletion of a file, you cannot create a protection domain for a directory that contains that file. For example, if /ifs/data/smartlock/file.txt is set to a WORM state by a SmartLock domain, you cannot create a SnapRevert domain for /ifs/data/.
  • Directories cannot be moved in or out of protection domains. However, you can move a directory to another location within the same protection domain.
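
The last constraint lends itself to a short illustration. The Python below is a sketch under simplifying assumptions, not the check OneFS performs: covering_domains and can_move are hypothetical helpers, and each domain is identified only by the path of its root directory.

```python
from pathlib import PurePosixPath


def covering_domains(path, domain_roots):
    """Return the domain roots whose directory tree contains `path`."""
    target = PurePosixPath(path)
    return {
        root
        for root in domain_roots
        if PurePosixPath(root) == target or PurePosixPath(root) in target.parents
    }


def can_move(src, dst, domain_roots):
    # A move is allowed only if it neither enters nor leaves any domain,
    # i.e. source and destination are covered by the same set of domains.
    return covering_domains(src, domain_roots) == covering_domains(dst, domain_roots)
```

With a single domain on /ifs/snap1, moving /ifs/snap1/a to /ifs/snap1/b/a passes this check, while moving it to /ifs/other/a does not.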

OneFS MultiScan, AutoBalance, & Collect

As we’ve seen throughout the recent file system maintenance job articles, OneFS utilizes file system scans to perform such tasks as detecting and repairing drive errors, reclaiming freed blocks, etc. Since these scans typically involve complex sequences of operations, they are implemented via syscalls and coordinated by the Job Engine. These jobs are generally intended to run as minimally disruptive background tasks in the cluster, using spare or reserved capacity.

FS Maintenance Job   Description
AutoBalance          Restores node and drive free space balance
Collect              Reclaims leaked blocks
FlexProtect          Replaces the traditional RAID rebuild process
MediaScan            Scrub disks for media-level errors
MultiScan            Run AutoBalance and Collect jobs concurrently

In this final article of the series, we’ll turn our attention to MultiScan. This job is a combination of the AutoBalance job, which rebalances data across drives, and the Collect job, which recovers leaked blocks from the filesystem. In addition to reclaiming unused capacity as a result of drive replacements, snapshot and data deletes, etc, MultiScan also helps expose and remediate any filesystem inconsistencies.

The OneFS job engine defines two exclusion sets that govern which jobs can execute concurrently on a cluster. MultiScan straddles both of the job engine’s exclusion sets, with AutoBalance (and AutoBalanceLin) in the restripe set, and Collect in the mark set.

The restriping exclusion set is per-phase instead of per job, which helps to more efficiently parallelize restripe jobs when they don’t need to lock down resources. However, with the marking exclusion set, OneFS can only accommodate a single marking job at any point in time.

MultiScan is an unscheduled job that runs by default at ‘LOW’ impact and executes AutoBalance and Collect simultaneously. It is triggered by cluster group change events, which include node boot, shutdown, reboot, drive replacement, etc. While AutoBalance will execute each time the MultiScan job is triggered, Collect typically won’t be run more often than once every 2 weeks. AutoBalance and/or Collect are typically only run manually if MultiScan has been disabled.

When a new node or drive is added to the cluster, its blocks are almost entirely free, whereas the rest of the cluster is usually considerably more full, capacity-wise. AutoBalance restores the balance of free blocks in the cluster. As such, AutoBalance runs if a cluster’s nodes have a greater than 5% imbalance in capacity utilization. In addition, AutoBalance also fixes recovered writes that occurred due to transient unavailability and also addresses fragmentation.
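
As a rough illustration of the 5% trigger, the check might look like the Python below. This is a sketch under an explicit assumption: the article does not define how the imbalance is measured, so the hypothetical needs_autobalance function takes it to be the spread, in percentage points, between the most and least utilized nodes.

```python
def needs_autobalance(nodes, threshold_pct=5.0):
    """nodes maps a node name to a (used_blocks, total_blocks) tuple."""
    utilization = [100.0 * used / total for used, total in nodes.values()]
    # Assumption: imbalance = most utilized node minus least utilized node.
    return max(utilization) - min(utilization) > threshold_pct
```

A freshly added, empty node next to nodes that are around 80% full exceeds the threshold by a wide margin, which matches the scenario described above.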

If the cluster’s nodes contain SSDs, AutoBalanceLin (as opposed to the regular AutoBalance job) runs most efficiently by performing a LIN scan using a flash-backed metadata mirror. When a cluster is unbalanced, there is not an obvious subset of files to filter, since the files to be restriped are the ones which are not using the node or drive with less free space. In the case of an added node or drive, no files will be using it. As a result, almost any file scanned is enumerated for restripe.

As mentioned, the Collect job reclaims leaked blocks using a mark and sweep process. In traditional UNIX systems this function is typically performed by the ‘fsck’ utility. With OneFS, however, the other traditional functions of fsck are not required, since the transaction system keeps the file system consistent. Leaks only affect free space.

Collect’s ‘mark and sweep’ gets its name from the in-memory garbage collection algorithm. First, the in-use blocks and any new allocations are marked with the current generation in the Mark phase. When this is complete, the drives are swept of any blocks which don’t have the current generation in the Sweep phase.
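
The generation-based mark and sweep can be sketched in a few lines of Python. This is a toy model of the idea rather than the Collect job itself: the collect function is hypothetical, and a dictionary stands in for the on-disk allocation state.

```python
def collect(allocated, reachable, generation):
    """Mark and sweep over a toy block map.

    allocated  -- dict mapping block id to the generation it was last stamped with
    reachable  -- iterable of block ids that are still in use
    generation -- the current generation number
    Returns the set of leaked block ids that were freed.
    """
    # Mark phase: stamp every in-use block with the current generation.
    # Blocks allocated while the job runs already carry this stamp.
    for block in reachable:
        allocated[block] = generation

    # Sweep phase: anything without the current generation is a leak.
    leaked = {block for block, gen in allocated.items() if gen != generation}
    for block in leaked:
        del allocated[block]
    return leaked
```

In this model, block 3 below is allocated but no longer referenced, so it is the only one swept: collect({1: 6, 2: 6, 3: 6, 4: 7}, [1, 2], 7) frees block 3 and keeps block 4, which was newly allocated in generation 7.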

In addition to automatic job execution following a group change event, MultiScan can also be initiated on demand. The following CLI syntax will kick off a manual job run:

# isi job start multiscan
Started job [209]

# isi job list
ID   Type      State   Impact  Pri  Phase  Running Time
--------------------------------------------------------
209  MultiScan Running Low     4    1/4    1s
--------------------------------------------------------
Total: 1

The Multiscan job’s progress can be tracked via a CLI command as follows:

# isi job jobs view 209
               ID: 209
             Type: MultiScan
            State: Running
           Impact: Low
           Policy: LOW
              Pri: 4
            Phase: 1/4
       Start Time: 2021-01-03T20:15:16
     Running Time: 34s
     Participants: 1, 2, 3
         Progress: Collect: 225 LINs, 0 errors
                   AutoBalance: 225 LINs, 0 errors
                   LIN Estimate based on LIN count of 2793 done on Jan 04 20:02:57 2021
                   LIN Based Estimate:  3m 2s Remaining (8% Complete)
                   Block Based Estimate: 5m 48s Remaining (4% Complete)
                   0 errors total
Waiting on job ID: -
      Description: Collect, AutoBalance

The LIN (logical inode) statistics above include both files and directories.

Be aware that the estimated LIN percentage can occasionally be misleading/anomalous. If concerned, verify that the stated total LIN count is roughly in line with the file count for the cluster’s dataset. Even if the LIN count is in doubt, the estimated block progress metric should always be accurate and meaningful.

If the job is in its early stages and no estimation can be given (yet), isi job will instead report its progress as “Started”. Note that all progress is reported per phase, with MultiScan phase 1 being the one where the lion’s share of the work is done. By comparison, phases 2-4 of the job are comparatively short.

A job’s resource usage can be traced from the CLI as such:

# isi job statistics view
     Job ID: 209
      Phase: 1
   CPU Avg.: 11.46%
Memory Avg.
        Virtual: 301.06M
       Physical: 28.71M
        I/O
            Ops: 3513425
          Bytes: 26.760G

Finally, upon completion, the Multiscan job report, detailing all four stages, can be viewed by using the following CLI command with the job ID as the argument:

# isi job reports view 209
MultiScan[209] phase 1 (2021-01-03T20:02:57)
--------------------------------------------
Elapsed time          307 seconds (5m7s)
Working time          307 seconds (5m7s)
Errors                0
Rebalance/LINs        2793
Rebalance/Files       2416
Rebalance/Directories 377
Rebalance/Errors      0
Rebalance/Bytes       372607773184 bytes (347.018G)
Collect/LINs          2788
Collect/Files         2411
Collect/Directories   377
Collect/Errors        0
Collect/Bytes         130187742208 bytes (121.247G)

MultiScan[209] phase 2 (2021-01-03T20:02:57)
--------------------------------------------
Elapsed time     0 seconds
Working time     0 seconds
Errors           0
LINs traversed   0
LINs processed   0
SINs traversed   0
SINs processed   0
Files seen       0
Directories seen 0
Total bytes      0 bytes

MultiScan[209] phase 3 (2021-01-03T20:02:58)
--------------------------------------------
Elapsed time          1 seconds
Working time          1 seconds
Errors                0
Rebalance/SINs        0
Rebalance/Files       0
Rebalance/Directories 0
Rebalance/Errors      0
Rebalance/Bytes       0 bytes
Collect/SINs          0
Collect/Files         0
Collect/Directories   0
Collect/Errors        0
Collect/Bytes         0 bytes
Unbalanced diskpools  Pool_Name = h600_18tb_3.2tb-ssd_256gb:2, free_blocks = 8693136159, total_blocks = 8715355092
                      Pool_Name = h600_18tb_3.2tb-ssd_256gb:3, free_blocks = 7259260440, total_blocks = 7262795910

MultiScan[209] phase 4 (2021-01-03T20:03:17)
--------------------------------------------
Elapsed time 19 seconds
Working time 19 seconds
Errors       0
Drives swept 33
LINs freed   0
Inodes freed 128359
Bytes freed  80022503424 bytes (74.527G)
Keys freed   0
Inodes lost  0
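
The figures in the report above can be sanity-checked with a little arithmetic. The short Python sketch below uses a hypothetical to_gib helper and assumes the ‘G’ suffix in the report denotes binary gigabytes (2^30 bytes), which is consistent with the numbers shown. It also confirms that each LIN count is the sum of the corresponding file and directory counts.

```python
def to_gib(num_bytes):
    """Convert a byte count to binary gigabytes, rounded to three decimals."""
    return round(num_bytes / 1024 ** 3, 3)


# Byte totals from phases 1 and 4 of the report.
rebalance_gib = to_gib(372607773184)   # Rebalance/Bytes
collect_gib = to_gib(130187742208)     # Collect/Bytes
freed_gib = to_gib(80022503424)        # Bytes freed

# LINs cover both files and directories.
rebalance_lins = 2416 + 377
collect_lins = 2411 + 377
```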

Enable RFC2307 for OneFS and Active Directory

Windows Active Directory (AD) supports authenticating Unix/Linux clients using the RFC2307 attributes (e.g. UID and GID). Isilon OneFS is also RFC2307 compatible, so using Active Directory as the OneFS authentication provider is recommended to enable centralized identity management and authentication. This post covers the configuration needed to integrate AD and OneFS with RFC2307 compatibility, using Windows 2012R2 AD and OneFS 8.1.0 to show the process.

Prepare Windows 2012R2 AD for Unix/Linux

Unlike Windows 2008, Windows 2012 comes equipped with the UNIX attributes already loaded within the schema. As of this release, the Identity Management for UNIX (IDMU) feature has been deprecated, although it remains available until Windows 2016. The NIS and Psync services are not required.

The UI elements for configuring RFC2307 attributes are not as nice as they were in 2008, since the IDMU MMC snap-in has also been deprecated. So we will install the IDMU component first to make it easier to configure the UID/GID attributes. The following commands install the IDMU components in Windows 2012R2.

  • To install the administration tools for Identity Management for UNIX.
dism.exe /online /enable-feature /featurename:adminui /all
  • To install Server for NIS.
dism.exe /online /enable-feature /featurename:nis /all
  • To install Password Synchronization.
dism.exe /online /enable-feature /featurename:psync /all

After restarting the AD server, the UNIX Attributes tab appears in the UI, the same as in Windows 2008R2, as shown below. You can now configure your AD users and groups to be compatible with a Unix/Linux environment. It is recommended to set UIDs and GIDs to 10000 and above, while not overlapping with the OneFS default auto-assign UID/GID range (1000000 – 2000000).
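
The recommendation can be expressed as a simple range check, for example when validating a batch of planned UID/GID assignments. The Python below is a sketch: is_recommended_id is a hypothetical helper, and it assumes both ends of the auto-assign range are inclusive.

```python
MIN_RECOMMENDED_ID = 10_000
# Assumption: the OneFS default auto-assign range is inclusive at both ends.
AUTO_ASSIGN_RANGE = range(1_000_000, 2_000_001)


def is_recommended_id(value):
    """True if a UID or GID is 10000 or above and outside the auto-assign range."""
    return value >= MIN_RECOMMENDED_ID and value not in AUTO_ASSIGN_RANGE
```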

Configure the OneFS Active Directory authentication provider to enable RFC2307

For mixed-mode (Unix/Linux/Windows) authentication operations, several advanced options need to be set on the Active Directory authentication provider:

  • Services for UNIX: rfc2307 – This leverages the Identity Management for UNIX services in the Active Directory schema
  • Auto-Assign UIDs: No – By default, OneFS generates pseudo UIDs for users it cannot match to SIDs, which can cause user mapping issues.
  • Auto-Assign GIDs: No – By default, OneFS generates pseudo GIDs for groups it cannot match to SIDs. As with user mapping, this can cause group mapping mismatches.

You can apply this configuration from either the CLI or the WebUI. From the CLI, use the command isi auth ads modify EXAMPLE.LOCAL --sfu-support=rfc2307 --allocate-uids=false --allocate-gids=false. Or change the settings from the WebUI, shown below:

With the configuration above, OneFS can use Active Directory as the identity source for Unix/Linux clients. This also simplifies identity management, since a single central identity source (AD) serves both Unix/Linux clients and Windows clients.

Configure SSH Multi-Factor Authentication on OneFS 8.2 Using Duo

SSH Multi-Factor Authentication (MFA) with Duo is a new feature introduced in OneFS 8.2. Currently, OneFS supports SSH MFA with Duo service through SMS (short message service), phone callback, and Push notification via the Duo app. This blog will cover the configuration to integrate OneFS SSH MFA with Duo service.

Duo provides service to many kinds of applications, such as Microsoft Azure Active Directory, Cisco Webex, and Amazon Web Services. A OneFS cluster is represented as a “Unix Application” entry. Integrating OneFS with the Duo service requires configuration on both the Duo service and the OneFS cluster. Before configuring OneFS with Duo, you need to have a Duo account. In this blog, we used a trial version account for demonstration purposes.

Failback mode

By default, the SSH failback mode for Duo in OneFS is “safe”, which allows common authentication to proceed if the Duo service is not available. The “secure” mode denies SSH access if the Duo service is not available. This includes the bypass users, because they are defined and validated in the Duo service. To configure the failback mode in OneFS, specify the --failmode option with the isi auth duo modify command.

Exclusion group

By default, all groups are required to use Duo unless the Duo group is configured to bypass Duo auth. The groups option lets you limit Duo authentication to specific user groups or exclude specific groups from it. This provides a way to configure users who can still SSH into the cluster even when the Duo service is not available and the failback mode is set to “secure”. Otherwise, all users may be locked out of the cluster in this situation.

To configure the exclusion group option, prefix the group name with an exclamation mark “!”, and precede that entry with an asterisk so that all other groups still use the Duo service. An example is shown below:

# isi auth duo modify --groups="*,!groupname"

Note: The zsh shell requires the “!” to be escaped. In this case, the example above should be changed to isi auth duo modify --groups="*,\!groupname"
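To make the “*,!groupname” pattern concrete, the following sketch models how such a list could be evaluated. It assumes that a negated entry takes precedence over any wildcard match and that membership in an excluded group exempts the user; the actual matching is done by the Duo integration, and the function name requires_duo is hypothetical.

```python
from fnmatch import fnmatchcase


def requires_duo(user_groups, pattern_list):
    """Hypothetical model: does a user with these groups have to use Duo?

    pattern_list is a comma-separated string such as "*,!groupname".
    Assumption: a match on a "!" entry wins over every other match.
    """
    patterns = [p.strip() for p in pattern_list.split(",") if p.strip()]
    matched = False
    for group in user_groups:
        for pattern in patterns:
            negated = pattern.startswith("!")
            glob = pattern[1:] if negated else pattern
            if fnmatchcase(group, glob):
                if negated:
                    return False   # user belongs to an excluded group
                matched = True
    return matched
```

Under these assumptions, a user whose groups include “groupname” skips Duo, while every other user matches the asterisk and is required to use it.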

Prepare Duo service for OneFS

  1. Use your new Duo account to log in to the Duo Admin Panel. Select the “Application” item from the left menu, and then click “Protect an Application”, as shown in Figure 1.
Figure 1 Protect an Application
  2. Type “Unix Application” in the search bar. Click “Protect this Application” to create a new Unix Application entry. See Figure 2.
Figure 2 Search for Unix Application
  3. Scroll down the creation page and find the “Settings” section. Type a name for the new Unix Application. It is recommended to use a name that identifies your OneFS cluster, as shown in Figure 3. This section also contains Duo’s username normalization setting. By default, Duo username normalization is not AD aware: it alters incoming usernames before trying to match them to a user account. For example, “DOMAIN\username”, “username@domain.com”, and “username” are treated as the same user. For other options, refer to the Duo documentation.
Figure 3 Unix Application Name
  4. Check the information required by OneFS under the “Details” section, including the API hostname, integration key, and secret key, as shown in Figure 4.
Figure 4 Required Information for OneFS
  5. Manually enroll a user. In this example, we create a user named “admin”, which is the default OneFS administrator user. Switch the menu item to “Users” and click the “Add User” button, as shown in Figure 5. For details about user enrollment on the Duo service, refer to the Duo documentation, Enrolling Users.
Figure 5 User Enrollment
  6. Type the user name, as shown in Figure 6.
Figure 6 Manual User Enrollment
  7. Find the “Phones” settings in the user page and click the “Add Phone” button to add a device for the user, as shown in Figure 7.
Figure 7 Add Phone for User
  8. Type your phone number.
Figure 8 Add New Phone
  9. (Optional) If you want to use the Duo Push authentication method, install the Duo Mobile app on the phone and activate it. As highlighted in Figure 9, click the link to activate Duo Mobile.
Figure 9 Activate Duo Mobile
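The default username normalization mentioned in step 3 can be illustrated with a short sketch. This is an approximation written for this article, not Duo’s implementation; the function name normalize_username is hypothetical, and the only behavior modeled is the stripping of a domain prefix or suffix described above.

```python
def normalize_username(name: str) -> str:
    """Hypothetical model of non-AD-aware normalization.

    Strips a leading "DOMAIN\\" prefix and a trailing "@domain" suffix so
    that the three spellings from step 3 map to the same account name.
    """
    name = name.strip()
    if "\\" in name:
        name = name.split("\\", 1)[1]   # drop "DOMAIN\"
    if "@" in name:
        name = name.split("@", 1)[0]    # drop "@domain.com"
    return name
```

Under this model, “DOMAIN\username”, “username@domain.com”, and “username” all resolve to the same Duo user, which matters when the enrolled Duo user name has to match the OneFS account used for SSH.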

OneFS Configuration and Verification

  1. By default, the authentication setting template is set to “any”. To use OneFS with the Duo service, the authentication setting template must not be set to “any” or “custom”. It should be set to “password”, “publickey”, or “both”. In this example, we set it to “password”, which uses the user password and Duo for SSH MFA, as shown in the following command:
# isi ssh modify --auth-settings-template=password
  2. Confirm the authentication method using the following command:
# isi ssh settings view | grep "Auth Settings Template"
      Auth Settings Template: password
  3. Configure the required Duo service information and enable it for SSH MFA, as shown below. Use the information from the Unix Application you set up in Duo: the API hostname, integration key, and secret key.
# isi auth duo modify --enabled=true --failmode=safe --host=api-13b1ee8c.duosecurity.com --ikey=DIRHW4IRSC7Q4R1YQ3CQ --set-skey

Enter skey:

Confirm:
  4. Verify SSH MFA using the user “admin”. In this example, an SMS passcode and the user’s password are used for authentication, as shown in Figure 10.
Figure 10 SSH MFA Verification

Using Dell EMC Isilon with Microsoft’s SQL Server Big Data Clusters

By Boni Bruno, Chief Solutions Architect | Dell EMC

Dell EMC Isilon

Dell EMC Isilon solves the hard scaling problems our customers have with consolidating and storing large amounts of unstructured data. Isilon’s scale-out design and multi-protocol support provide efficient deployment of data lakes as well as support for big data platforms such as Hadoop, Spark, and Kafka.

In fact, the embedded HDFS implementation that comes with Isilon OneFS has been CERTIFIED by Cloudera for both HDP and CDH Hadoop distributions.  Dell EMC has also been recognized by Gartner as a Leader in the Gartner Magic Quadrant for Distributed File Systems and Object Storage four years in a row.  To that end, Dell EMC is delighted to announce that Isilon is a validated HDFS tiering solution for Microsoft’s SQL Server Big Data Clusters.

SQL Server Big Data Clusters & HDFS Tiering with Dell EMC Isilon

SQL Server Big Data Clusters allow you to deploy clusters of SQL Server, Spark, and HDFS containers on Kubernetes. With these components, you can combine and analyze MS SQL relational data with high-volume unstructured data on Dell EMC Isilon. This means that Dell EMC customers who have data on their Isilon clusters can now make their data available to their SQL Server Big Data Clusters for analytics using the embedded HDFS interface that comes with Isilon OneFS.

Note: The HDFS Tiering feature of SQL Server 2019 Big Data Clusters currently does not support Cloudera Hadoop. Isilon, however, provides immediate access to HDFS data with or without a Hadoop distribution being deployed in the customer’s environment. This is a unique value proposition of the Dell EMC Isilon storage solution for SQL Server Big Data Clusters. Unstructured data stored on Isilon is accessed directly over HDFS and transparently appears as local data to the SQL Server Big Data Cluster platform.

The figure below depicts the overall architecture between the SQL Server Big Data Cluster platform and the Dell EMC Isilon or ECS storage solutions.

Dell EMC provides two storage solutions that can integrate with SQL Server Big Data Clusters. Dell EMC Isilon provides a high-performance scale-out HDFS solution, and Dell EMC ECS provides a high-capacity scale-out S3A solution. Both are on-premises storage solutions.

We are currently working with Microsoft’s Azure team to make these storage solutions available to customers in the cloud as well. The remainder of this article provides details on how Dell EMC Isilon integrates with SQL Server Big Data Clusters over HDFS.

Setting up HDFS on Dell EMC Isilon

Enabling HDFS on Isilon is as simple as clicking a button in the OneFS GUI. Customers can configure multiple access zones if needed. Access zones provide a logical separation of data and users, with support for independent role-based access controls. For the purposes of this article, an “msbdc” access zone is used for reference. By default, HDFS is disabled on a given access zone, as shown below:

To activate HDFS, simply click the Activate HDFS button. Note: HDFS licenses are free with the purchase of Isilon and can be installed under Cluster Management\Licenses.

Once an HDFS license is installed and HDFS is activated on a given access zone, the HDFS settings can be viewed as shown below:

The GUI allows you to easily change the HDFS block size, Authentication Type, Enable the Ranger Security Plugin, etc.  Isilon OneFS also supports various authentication providers and additional protocols as shown below:

Simply pick the authentication provider of your choice and specify the provider details to enable remote authentication services on Isilon. Note: Isilon OneFS has a robust security architecture with a full authentication, identity management, and authorization stack. More details are available in the Isilon documentation.

The multi-protocol support included with Isilon allows customers to land data on Isilon over SMB, NFS, FTP, or HTTP and make all or part of the data available to SQL Server Big Data Clusters over HDFS without having a Hadoop cluster installed – Beautiful!

A key performance aspect of Dell EMC Isilon is the scale-out design of both the hardware and the integrated OneFS storage operating system. Isilon OneFS includes a unique SmartConnect feature that provides HDFS namenode and datanode load balancing and redundancy.

To use SmartConnect, simply delegate a sub-domain of your choice on your internal DNS server to Isilon and OneFS will automatically load balance all the associated HDFS connections from SQL Server Big Data Clusters transparently across all the physical nodes on the Isilon storage cluster.

The SmartConnect zone name is configured under Cluster Management\Network Configuration\ in the OneFS GUI as shown below:

 

In the example screenshot above, the SmartConnect zone name is msbdc.dellemc.com. This means the delegated subdomain on the internal DNS server should be msbdc, and a nameserver (NS) record for this msbdc subdomain needs to point to the defined SmartConnect Service IP.

The Service IP information is in the subnet details in the OneFS GUI as shown below:

In the above example, the service IP address is 10.10.10.10. All that is needed on the internal DNS server to take advantage of Isilon’s built-in load balancing is a host record for 10.10.10.10 (e.g. isilon.dellemc.com) and an NS record for msbdc.dellemc.com that is served by isilon.dellemc.com (10.10.10.10).

Use “ping” to validate the SmartConnect/DNS configuration. Multiple ping tests to msbdc.dellemc.com should return different IP addresses from Isilon; the range of IP addresses returned is defined by the IP Pool Range in the Isilon GUI.
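The same check can be scripted. The sketch below resolves the zone name several times and collects the distinct addresses returned. It assumes each lookup actually reaches SmartConnect; a caching resolver on the client can hide the rotation. The function name distinct_addresses and the resolve parameter are hypothetical, added so the logic can be exercised without a live cluster.

```python
import socket


def distinct_addresses(hostname, attempts=10, resolve=socket.gethostbyname):
    """Resolve hostname repeatedly and return the distinct IPs, in order seen.

    resolve is injectable for testing; by default it performs a real
    DNS lookup through the operating system resolver.
    """
    seen = []
    for _ in range(attempts):
        ip = resolve(hostname)
        if ip not in seen:
            seen.append(ip)
    return seen


# Example call against the zone name used in this article:
#   distinct_addresses("msbdc.dellemc.com")
```

If the returned list contains only one address after many attempts, either the pool holds a single IP or the answers are being cached somewhere between the client and the SmartConnect Service IP.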

SQL Server Big Data Cluster would simply have a single mount configuration pointing to the defined SmartConnect zone name on Isilon. Details on how to set up the HDFS mount to Isilon from SQL Server Big Data Cluster are presented in the next section.

SmartConnect makes storage administration easy. If more storage capacity is required, simply add more Isilon nodes to the cluster, and storage capacity and I/O performance increase without a single configuration change to the SQL Server Big Data Clusters – BRILLIANT!

With HDFS enabled, the access zone defined, and the network/DNS configuration complete, the Isilon storage system can now be mounted by SQL Server Big Data Clusters.

Mounting Dell EMC Isilon from SQL Server Big Data Cluster

Assuming you have a SQL Server Big Data Cluster running, begin by opening a terminal session to connect to it. You can obtain the IP address of your cluster’s controller-svc-external service endpoint with the following command:

Using the IP of the controller end point obtained from the above command, log into your big data cluster:

Mount Isilon using HDFS on your SQL Server Big Data Cluster with the following command:

Note: hdfs://msbdc.dellemc.com is shown as an example. The HDFS URI must match the SmartConnect zone name defined in the Isilon configuration. The data directory specified is also an example; any directory that exists within the Isilon access zone can be used. Likewise, the mount point /mount1 shown above is just an example, and any name can be used for the mount point.

An example of a successful response to the above mount command is shown below:
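The rule that the HDFS URI must match the SmartConnect zone name can be checked before submitting a mount. The following is a minimal sketch using only the Python standard library; the function name uri_matches_zone and the /data path are hypothetical and chosen for illustration.

```python
from urllib.parse import urlparse


def uri_matches_zone(remote_uri: str, smartconnect_zone: str) -> bool:
    """Return True if remote_uri is an hdfs:// URI whose host is the zone name."""
    parsed = urlparse(remote_uri)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "hdfs" and host == smartconnect_zone.lower()


# "/data" is a placeholder directory, not a path taken from the article.
print(uri_matches_zone("hdfs://msbdc.dellemc.com/data", "msbdc.dellemc.com"))
```

A URI that points at an individual node address rather than the zone name would fail this check, and would also bypass the SmartConnect load balancing described earlier.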

Create mount /mount1 submitted successfully.  Check mount status for progress.

Check the mount status with the following command:

sample output:

Run an hdfs shell and list the contents on Isilon:

sample output:

In addition to using hdfs shell commands, you can use tools like Azure Data Studio to access and browse files over the HDFS service on Dell EMC Isilon.  The example below is using Spark to read the data over HDFS:

To learn more about Dell EMC Isilon, please visit us at DellEMC.com.

 

Setting Up Share Host ACLs Isilon OneFS

How do you allow or deny hosts for SMB shares?

In Isilon’s OneFS, administrators can set Host ACLs on SMB shares. Setting up these ACLs can add an extra layer of security for files in a specific share. For example, administrators can deny all traffic except from certain servers.

OneFS Setting Up Share Host ACLs Commands

Below are the commands used in the Setting Up Share Host ACLs demo. “nasa” is the SMB share used in the demo, which denies all traffic except from a specific host or hosts.

List all the shares in a specific zone

isi smb shares list

View details of a particular share in an access zone

isi smb shares view nasa

Modify Host ACLs on a particular share in an access zone

isi smb shares modify nasa --add-host-acl=<host ACL entries>

Clear Host ACLs on a specific share

isi smb shares modify nasa --clear-host-acl
or 
isi smb shares modify nasa --revert-host-acl
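The effect of an “allow one host, then deny all” Host ACL can be pictured with a small model. This is a sketch of the behavior described in the demo, not OneFS code. It assumes entries are written as "allow:<host>" or "deny:<host>" with ALL as a wildcard, that the first matching entry decides, and that an empty ACL leaves access to the normal share permissions. The function name host_allowed is hypothetical.

```python
def host_allowed(client_ip, rules):
    """Hypothetical first-match evaluation of host ACL entries.

    rules is a list such as ["allow:192.170.170.1", "deny:ALL"].
    Assumption: the first entry that matches the client decides.
    """
    for rule in rules:
        action, _, host = rule.partition(":")
        if host.upper() == "ALL" or host == client_ip:
            return action.lower() == "allow"
    # No entry matched (or no ACL is set): the Host ACL does not block access.
    return True


# The address from the demo, written without leading zeros.
demo_rules = ["allow:192.170.170.1", "deny:ALL"]
print(host_allowed("192.170.170.1", demo_rules))
print(host_allowed("10.0.0.5", demo_rules))
```

Under these assumptions, only the allowed server reaches the share, and clearing or reverting the Host ACL returns the share to ordinary permission checks.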

 

Video – Setting Up Host ACLs on Isilon File Share

Transcript

 

Hi, folks. Thomas Henson here with thomashenson.com. And today is another episode of Isilon Quick Tips. So, what we want to cover on today’s episode is I want to go in through the CLI, and look at some of the commands that we can do on isi shares. And specifically, I want to look at some of the advanced features. So, something around the ACLs where we can deny certain hosts or allow certain hosts, too. So, follow along with me right after this. [Music].

So, in today’s episode we want to look at SMB Shares, but specifically from the Command Line. What we’re really going to focus on as I open this Share here is some of these advanced settings. So, you can see that we have some of these advanced settings, like continuous availability timeout. And it looks like we can change some of these. But when we change them, we’re just going to type in how we want to change those here. So, if you wanted to, for example in the host ACL, be able to deny or allow certain hosts, this is where we can do that. But let’s find out how we can do this from the Command Line. Because there are a couple of different options, and a couple of ways we can do it, and specifically we want to learn how to do it from the Command Line. So, here we are. I’m logged back in to my Command Line. So, you can see I’m on Isilon-2. So, the first command I want to do is I want to list out all those SMB Shares that we had. So, we had three of those. So, the command that we’re going to use is isi smb shares. And I’m just going to type return, so we can see what those actions are. So, you can see that we can do a list, which is the first thing we want to do. But you can also create those shares, you can delete shares, and we can view specific properties on each one of those shares. So, going back in. Let’s run a list on our shares. And you can see… All right. So, we have all those shares that we were just looking at from our [INAUDIBLE 00:02:00]. 
One thing to note here is if you are using this shares list command and you don’t see your zones, make sure that you type in the zone here. So, we will type in a specific zone. So, if you didn’t see the shares, make sure that you’re specifying exactly what zone there is. I only have one zone in my lab environment here on the system, so I can see that all my shares are there. So, now that I know my shares are there, let’s go back. I want to look at the nasa share that we have. So, let’s use the view command on nasa. And you can see here that it’s going to give me my permissions, but then also those advanced features that we were talking about, we can see those here. So, for example we have the Access Based Enumeration. So, if you’re looking to be able to hide files or folders for users that don’t have those permissions, you can see if that’s set here. Then also the File Mask. So, you can see that the default Directory and File Mask is 700. So, if you’re looking about [INAUDIBLE 00:02:54] the File Mask is, if you’re not familiar, that’s the default permissions that are set whenever you have a File or Directory that’s created in this share. So, you can see that in mine, the default setting is 700. Then specifically, the one that I really want to go over was the Host ACL. So, you can see the Host ACL. I don’t have anything set here. And this is the property we can change, that will allow or deny certain hosts to the specific share. So, one of the reasons this came up is we were trying to secure an application from a share, and we wanted to be able to say, “Hey, it’s only going to accept traffic from two or one specific server, and then we’re going to deny all those.” So, what we’re going to do is I want to walk through how to do that. So specifically, we’re still going to use our isi smb shares. But now we’re going to use the modify. So, you see the isi smb shares modify command. You can see that when we do that… I’m just going to show you some of the commands that we have here. But you can see we have a lot of different options we can do. But the first thing is, remember, we’re going to type in that share.

So, here I want to pass in my nasa string. I don’t have to pass in zone, because I only have one zone. But if you have different zones, then you’re going to want to pass that zone in. The command that we’re specifically looking for is this host-acl. So, we have some options here with the host-acl. We can clear the host, we can add a host, and we can remove a host. So, what we want to do is we want to add a host that’s going to allow for host coming from. We’re just going to say 192.170.170.001. Then we’re going to deny our host from that. So, we’re going to clear this out, so we can have that at the top of the screen. So, you can see we have it here. So, that isi smb shares modify. Then you’re going to put in here your share name. So, mine is nasa. And we’re going to do --add-host-acl=, the first thing that we’re going to do is we’re going to allow. So, we’re going to allow traffic from 192.170.170.001. Then we’re going to use a comma to separate that out, and then we’re going to say that we’re going to deny all. So, specifically we could do this different, and say that we want to allow traffic from all and then deny from specific ones. But from this use case, and this is probably the most common one especially when you’re trying to lock down a certain share, you’re going to want to use this command. So, we’re typing the command, get the command prompt back again. And now let’s do that view. So, let’s view our nasa, and see if our changes are in there. So, you can see in our Host ACL, we have it. Then if we wanted to go back to our share from the [INAUDIBLE 00:05:43] and just see if those changes took. You can see in our advanced setting here, now it’s showing us our allow and deny all. Now, [INAUDIBLE 00:05:52] to say that I want to keep this going on my [INAUDIBLE 00:05:55] or if I want to revert back. So, there are a couple of different options. If you remember we had the clear-host-acl or the revert back. 
So, now I can just use this isi smb shares modify on my nasa directory. Once again, just as a reminder, use your zone name if you have a specific zone. Then now I can revert my Host ACL. Now, we have that, I’m going to clear this out, and check. You can see our Host ACL is reverted back. We don’t have one set there. So, now we’re allowing traffic as long as you have the permissions to get to this file, and we don’t have one set. Well, that’s all for Isilon Quick Tips for today. Make sure to subscribe so that you never miss an episode of Isilon Quick Tips, or some of the other amazing content that I have on my YouTube Channel here. And I will see you next time. [Music]