OneFS SmartSync Backup-to-Object Configuration

As we saw in the previous article in this series, SmartSync in OneFS 9.11 sees the addition of backup-to-object, which provides high-performance, full-fidelity incremental replication to ECS, ObjectScale, Wasabi, and AWS S3 & Glacier IR object stores.

This new SmartSync backup-to-object functionality supports the full spectrum of OneFS path lengths, encodings, and file sizes up to 16TB, plus special files and alternate data streams (ADS), symlinks and hardlinks, sparse regions, and POSIX and SMB attributes. Specifically:

Copy-to-object (OneFS 9.10 and earlier):

  • One-time file system copy to object
  • Baseline replication only; no support for incremental copies
  • Browsable/accessible filesystem-on-object representation
  • Certain object limitations:
      ◦ No support for sparse regions or hardlinks
      ◦ Limited attribute/metadata support
      ◦ No compression

Backup-to-object (OneFS 9.11):

  • Full-fidelity file system baseline and incremental replication to object:
      ◦ Supports ADS, special files, symlinks, hardlinks, sparseness, POSIX/NT attributes, and encoding
      ◦ Any file size and any path length
  • Fast incremental copies
  • Compact file system snapshot representation in native cloud
  • Object representation:
      ◦ Grouped by target base path in policy configuration
      ◦ Further grouped by Dataset ID and Global File ID

SmartSync backup-to-object operates on user-defined datasets, which are essentially OneFS file system snapshots with additional properties.

A dataset creation policy takes a snapshot and creates a dataset from it. Copy and repeat-copy policies then transfer that dataset to another system, and the execution of these two policy types can be linked yet scheduled independently. So you can have one schedule for dataset creation, say creating a dataset every hour on a particular path, and a different distribution scheme for the copies themselves. For example, one policy could copy every hour to a hot DR cluster in data center A, while another copies every month to a deep archive cluster in data center B. All of this is possible without inflating the snapshot footprint on the system, since datasets can now be shared across policies.
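As a minimal sketch of this decoupling (the policy names here are hypothetical, and the path, schedule, and account flags are elided with '…' since they vary by release; see 'isi dm policies create --help' for the full option set), a single creation policy can feed multiple independently scheduled repeat-copy policies:

# isi dm policies create projects-hourly --policy-type CREATION ...

# isi dm policies create projects-to-dr-hot --policy-type='REPEAT_COPY' ...

# isi dm policies create projects-to-archive --policy-type='REPEAT_COPY' ...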

Currently, SmartSync has no WebUI presence, so all of its configuration is performed via either the CLI or the platform API.

Here’s the procedure for crafting a baseline replication config:

Essentially, create the replication account, which in OneFS 9.11 targets either Dell ECS or Amazon AWS. Then configure a dataset creation policy, run it, and, if desired, create and run a repeat-copy policy. The specific steps, with their CLI syntax, are:

  1. Create a replication account:
# isi dm account create --account-type [AWS_S3 | ECS_S3]
  2. Configure a dataset creation policy:
# isi dm policies create [Policy Name] --policy-type CREATION
  3. Run the dataset creation policy:
# isi dm policies list

# isi dm policies modify [Creation policy id] --run-now=true

# isi dm jobs list

# isi dm datasets list
  4. Create a repeat-copy policy:
# isi dm policies create [Policy Name] --policy-type='REPEAT_COPY'
  5. Run the repeat-copy policy:
# isi dm policies list

# isi dm policies modify [Repeat-copy policy id] --run-now=true
  6. View the data replication job status:
# isi dm jobs list
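By way of illustration, here's what a hypothetical baseline sequence against an ECS target might look like, borrowing the account-creation example syntax from the restore procedure below (the account name, URI, and access credentials are invented for this example, and the policy ID placeholder comes from the 'isi dm policies list' output):

# isi dm account create --account-type ECS_S3 --name ecs-backup --access-id smartsync-usr --uri https://ecs.example.com:9021/backup-bucket --auth-mode CLOUD --secret-key [secret-key]

# isi dm policies create projects-baseline --policy-type CREATION

# isi dm policies list

# isi dm policies modify [projects-baseline policy id] --run-now=true

# isi dm jobs list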

Similarly, for an incremental replication config:

Note that the dataset creation policy and repeat-copy policy were already created in the baseline replication configuration above, so those steps can be skipped. Incremental replication simply reuses the dataset creation and repeat-copy policies from the baseline config.

  1. Run the dataset creation policy:
# isi dm policies list

# isi dm policies modify [Creation policy id] --run-now=true

# isi dm jobs list

# isi dm datasets list
  2. Run the repeat-copy policy:
# isi dm policies list

# isi dm policies modify [Repeat-copy policy id] --run-now=true
  3. View the data replication incremental job status:
# isi dm jobs list
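Since the incremental sequence is just a re-run of the two existing policies, it lends itself to simple scripting. Here's a minimal, hypothetical wrapper around the documented CLI steps (the fixed sleep is a crude placeholder for proper job-completion polling, and the policy IDs are passed in as arguments):

#!/bin/sh
# Hypothetical SmartSync incremental trigger.
# Usage: ./smartsync_incremental.sh <creation-policy-id> <repeat-copy-policy-id>
CREATE_ID="$1"
COPY_ID="$2"

# Kick off the dataset creation policy to capture the current state.
isi dm policies modify "$CREATE_ID" --run-now=true

# Crude wait for the creation job; a real script would poll 'isi dm jobs list'.
sleep 300

# Ship the new dataset to the object target, then report job status.
isi dm policies modify "$COPY_ID" --run-now=true
isi dm jobs list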

And here’s the basic procedure for creating and running a partial or full restore:

Note that the replication account already exists on the original cluster, so the account creation step can be skipped there; it is only required when restoring the dataset to a new cluster.

Additionally, partial restoration involves a subset of the directory structure, specified via the 'source path', whereas full restoration invokes a restore of the entire dataset.

The process includes creating the replication account if needed, finding the ID of the dataset to be restored, creating and running the partial or full restoration policy, and checking the job status to verify it ran successfully.

  1. Create a replication account:
# isi dm account create --account-type [AWS_S3 | ECS_S3]

For example:

# isi dm account create --account-type ECS_S3 --name [Account Name] --access-id [access-id] --uri [URI with bucket-name] --auth-mode CLOUD --secret-key [secret-key] --storage-class=[For AWS_S3 only: STANDARD or GLACIER_IR]
  2. Verify the dataset ID for restoration (an example session follows this list):
# isi_dm browse

Using the following browse commands:

  • list-accounts
  • connect-account [Source Account ID created in step 1]
  • list-datasets
  • connect-dataset [Dataset id]
  3. Create a partial or full restoration policy:
# isi dm policies create [Policy Name] --policy-type='COPY'
  4. Run the partial or full restoration policy:
# isi dm policies modify [Restoration policy id] --run-now=true
  5. View the data restoration job status:
# isi dm jobs list
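As an illustration of step 2, a hypothetical 'isi_dm browse' session to locate the dataset ID might look like the following (the account ID, dataset ID, and the '>' prompt shown are invented for this example):

# isi_dm browse
> list-accounts
> connect-account 1
> list-datasets
> connect-dataset 5

With the dataset ID in hand, the restoration policy can then be created and run as in steps 3 and 4 above.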

OneFS 9.11 also introduces recovery point objective (RPO) alerts for SmartSync, but note that these apply to repeat-copy policies only. RPO alerts are configured on the replication policy by setting the desired time value via the 'repeat-copy-rpo-alert' parameter. If this configured threshold is exceeded, an RPO alert is triggered. The alert is automatically resolved after the next successful policy job run.

Also be aware that the default time value for a repeat-copy RPO is zero, which instructs SmartSync not to generate RPO alerts for that policy.

The following CLI syntax can be used to create a replication policy, with the '--repeat-copy-rpo-alert' flag set to the desired time:

# isi dm policies create [Policy Name] --policy-type='REPEAT_COPY' --enabled='true' --priority='NORMAL' --repeat-copy-source-base-path=[Source Path] --repeat-copy-base-base-account-id=[Source account id] --repeat-copy-base-source-account-id=[Source account id] --repeat-copy-base-target-account-id=[Target account id] --repeat-copy-base-new-tasks-account=[Source account id] --repeat-copy-base-target-dataset-type='FILE_ON_OBJECT_BACKUP' --repeat-copy-base-target-base-path=[Bucket Name] --repeat-copy-rpo-alert=[time]

And similarly to change the RPO alert configuration on an existing replication policy:

# isi dm policies modify [Policy id] --repeat-copy-rpo-alert=[time]
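For example, to switch RPO alerting off again for a policy, the value can simply be reset to its zero default (the policy ID remains a placeholder here; for the accepted time formats, consult the 'isi dm policies modify' CLI help):

# isi dm policies modify [Policy id] --repeat-copy-rpo-alert=0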

An alert is triggered and a corresponding CELOG event created if the specified RPO for the policy is exceeded. For example:

# isi event list
ID   Started     Ended       Causes Short                     Lnn  Events  Severity
--------------------------------------------------------------------------------------
1898 07/15 00:00 07/15 00:00 SW_CELOG_HEARTBEAT               1    1       information
2012 07/15 06:03 --          SW_DM_RPO_EXCEEDED               2    1       warning
--------------------------------------------------------------------------------------

And then once the RPO alert has been resolved after a successful replication policy job run:

# isi event list
ID   Started     Ended       Causes Short                     Lnn  Events  Severity
--------------------------------------------------------------------------------------
1898 07/15 00:00 07/15 00:00 SW_CELOG_HEARTBEAT               1    1       information
2012 07/15 06:03 07/15 06:12 SW_DM_RPO_EXCEEDED               2    2       warning
--------------------------------------------------------------------------------------
