Another feature addition that OneFS 9.7 delivers is support for Google Cloud (GCP) as a target for SmartSync, PowerScale’s next-gen data mover. With this enhancement, SmartSync Cloud Copy now supports all three of the principal public cloud hyperscalers – Amazon S3, Google Cloud Platform, and Microsoft Azure.
As you may be aware, this is not OneFS’ first foray into Google Cloud integration. CloudPools has supported GCP as a remote tiering target for several years now. Also, from the SmartSync perspective, while GCP represents a new account type, it fits within the existing cloud authentication mechanism and uses an object protocol spec that’s based heavily on Amazon’s S3.
CloudCopy uses HTTP as the data replication transport layer to cloud storage, while traditional cluster-to-cluster SmartSync leverages a proprietary RPC-based messaging system.
In order to use SmartSync with GCP, the cluster must be running OneFS 9.7 or later and have SyncIQ licensed and active across all nodes in the cluster. Additionally, a cluster account with the ISI_PRIV_DATAMOVER privilege is needed in order to configure and run SmartSync data mover policies. While file-to-file replication requires SmartSync to be running on both source and target clusters, for OneFS Cloud Copy to transfer to/from cloud storage, only the cluster requires the SmartSync platform, and no data mover is required on the cloud systems. Be aware that inbound TCP port 7722 must be open across any intermediate gateways and firewalls to allow SmartSync replication to occur.
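The port requirement above can be spot-checked from another host before any policies are configured. The following is a minimal sketch using only the Python standard library. The function name `tcp_port_open` is hypothetical, and the check only succeeds if a service is actually listening on the port, so a failure can mean either a blocked path or a disabled service.

```python
import socket


def tcp_port_open(host: str, port: int = 7722, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    7722 is the SmartSync port named in the article; the rest of this
    helper is an illustrative assumption, not part of OneFS.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, using the host name from the article's sample URI:
#   tcp_port_open("remotenas.isln.com")
```

Running this from the far side of each gateway or firewall in the path helps narrow down where the port is being blocked.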
Under the covers, replication is executed by the ‘isi_dm_d’ service, and the SmartSync data mover’s basic architecture is as follows:
The ‘isi_dm_d’ service is disabled by default and needs to be enabled prior to configuring and using SmartSync. SmartSync also uses TLS (Transport Layer Security, the successor to SSL) and, as such, requires trust to be established between the cluster and cloud target.
The SmartSync Datamover also includes a purpose-built, integrated scheduler and job control and execution framework, which operates along these lines:
Shared Key-Value Stores (KVS) are used for job and task distribution, and extra indexing is implemented for quick lookups by task state, task type, and alive time. There are no dependencies or communication between tasks, and job cancellation and pausing are handled by posting a ‘request’ into a job record, which is then picked up by polling (request polling).
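To make the request-polling idea concrete, here is a small in-memory sketch. It is not the OneFS implementation: the class and function names (`SharedStore`, `JobRecord`, `run_worker`) are hypothetical, a plain dictionary stands in for the shared KVS, and only the index by task state is modeled. The point it illustrates is that a worker never receives a cancel or pause message directly; it checks the job record before picking up each independent task.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Set


class State(Enum):
    PENDING = "pending"
    DONE = "done"


@dataclass
class Task:
    task_id: str
    job_id: str
    state: State = State.PENDING


@dataclass
class JobRecord:
    job_id: str
    request: Optional[str] = None  # "pause" or "cancel", posted by an operator


class SharedStore:
    """In-memory stand-in for a shared key-value store with a state index."""

    def __init__(self) -> None:
        self.jobs: Dict[str, JobRecord] = {}
        self.tasks: Dict[str, Task] = {}
        # Secondary index: task state -> task ids, for quick lookups.
        self.by_state: Dict[State, Set[str]] = {s: set() for s in State}

    def add_task(self, task: Task) -> None:
        self.tasks[task.task_id] = task
        self.by_state[task.state].add(task.task_id)

    def set_state(self, task_id: str, new_state: State) -> None:
        task = self.tasks[task_id]
        self.by_state[task.state].discard(task_id)
        task.state = new_state
        self.by_state[new_state].add(task_id)

    def post_request(self, job_id: str, request: Optional[str]) -> None:
        self.jobs[job_id].request = request


def run_worker(store: SharedStore, job_id: str,
               after_each: Optional[Callable[[str], None]] = None) -> List[str]:
    """Run one job's pending tasks, polling the job record before each task."""
    completed: List[str] = []
    for task_id in sorted(store.by_state[State.PENDING]):
        if store.tasks[task_id].job_id != job_id:
            continue
        if store.jobs[job_id].request in ("pause", "cancel"):
            break  # request polling: stop before starting the next task
        store.set_state(task_id, State.DONE)
        completed.append(task_id)
        if after_each is not None:
            after_each(task_id)  # hook used below to simulate an operator
    return completed


store = SharedStore()
store.jobs["j1"] = JobRecord("j1")
for name in ("t1", "t2", "t3"):
    store.add_task(Task(name, "j1"))

# An operator posts a pause request while the first task is finishing.
print(run_worker(store, "j1", after_each=lambda _t: store.post_request("j1", "pause")))  # ['t1']
```

Because the tasks do not depend on one another, pausing simply leaves the remaining task IDs in the pending index, and clearing the request lets a later worker pass resume them.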
Within the SmartSync hierarchy, accounts define the connections to remote systems, policies define the replication configurations, and jobs perform the work, or tasks:
|Accounts
|– URI, eg. dm://remotenas.isln.com:7722
– Network pools defining nodes/interfaces to use for data transfer
– Client and server certificates to enable TLS
– Account type (AWS S3, Azure, GCP, ECS S3)
– URI, eg. https://cloudcluster.isln.com:9002/cloudbucket
|Policies
|– Dataset creation policy
– Dataset copy policy
– Dataset repeat copy policy
– Dataset expiration policy
|Jobs
|Runtime entities created based on policy schedules. There are two major types of data transfer jobs:
– Baseline jobs for initial transfers and
– Incremental jobs for subsequent transfers between FILE Datamover systems.
|Tasks
|Spawned by jobs; these are the individual chunks of work that a job must perform. There is no 1-to-1 relationship between tasks and their associated files.
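The relationship between these four entities can be pictured as a simple object model. The sketch below is purely illustrative: the class names, fields, and the `spawn_job` helper are assumptions made for this example, and the storage URI is a placeholder. None of it reflects how OneFS stores these records internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Account:
    """Connection to a remote system (hypothetical model)."""
    name: str
    account_type: str  # e.g. "DM", "AWS_S3", "ECS_S3", "AZURE", "GCP"
    uri: str


@dataclass
class Policy:
    """Replication configuration that refers to an account."""
    name: str
    policy_type: str  # e.g. "CREATION" or "COPY"
    target: Account


@dataclass
class Task:
    """One chunk of work; deliberately not tied to a single file."""
    description: str


@dataclass
class Job:
    """Runtime entity created from a policy."""
    policy: Policy
    kind: str  # "baseline" or "incremental"
    tasks: List[Task] = field(default_factory=list)


def spawn_job(policy: Policy, kind: str, chunk_count: int) -> Job:
    """Create a job for a policy and split its work into chunk_count tasks."""
    job = Job(policy=policy, kind=kind)
    for i in range(chunk_count):
        job.tasks.append(Task(description=f"{policy.name} chunk {i + 1} of {chunk_count}"))
    return job


# Placeholder values for illustration only.
gcp = Account(name="gcp-acct", account_type="GCP", uri="https://storage.example.com/mybucket")
copy_policy = Policy(name="copy-to-gcp", policy_type="COPY", target=gcp)
job = spawn_job(copy_policy, kind="baseline", chunk_count=3)
print(len(job.tasks), job.policy.target.account_type)
```

Reading from the bottom up: each task belongs to a job, each job was created from a policy, and each policy points at the account that describes the remote system.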
So, in order to configure SmartSync to use GCP as a cloud target, the following prerequisites must be met:
|GCP account and credentials to use with the feature.
|SyncIQ license across the cluster.
|OneFS 9.7 or higher installed and committed for GCP.
|Cluster account with the ISI_PRIV_DATAMOVER privilege to configure & manage.
While SmartSync is automatically installed in OneFS 9.4 and later, it is inactive by default. As such, there is no impact from the feature unless it is enabled.
To verify that GCP support is available, check that the account type is listed in the output of the ‘isi dm account create --help’ CLI command:
# uname -sr
Isilon OneFS 184.108.40.206
# isi dm account create --help | grep -i gcp
<account-type> (DM | AWS_S3 | ECS_S3 | AZURE | GCP)
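If this check needs to be scripted, for instance across several clusters, the help line can be parsed rather than eyeballed. The following is a minimal sketch that assumes the help text keeps the `<account-type> (A | B | ...)` shape shown above. The function name `supported_account_types` is hypothetical, and how the help text is captured (SSH, a saved file, and so on) is left out.

```python
import re
from typing import List


def supported_account_types(help_text: str) -> List[str]:
    """Extract the account types from a '<account-type> (A | B | ...)' help line."""
    match = re.search(r"<account-type>\s*\(([^)]*)\)", help_text)
    if match is None:
        return []
    # Split the parenthesised list on '|' and drop surrounding whitespace.
    return [item.strip() for item in match.group(1).split("|") if item.strip()]


# Sample line copied from the CLI output shown in the article.
sample = "<account-type> (DM | AWS_S3 | ECS_S3 | AZURE | GCP)"
types = supported_account_types(sample)
print("GCP" in types)  # prints True
```

An empty list means the expected line was not found at all, which is worth treating differently from a list that merely lacks GCP.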
Currently, SmartSync configuration is limited to the CLI and platform API, with WebUI support planned for a future release. As such, configuration is typically performed via the ‘isi dm’ CLI utility, which contains the following principal subcommands:
|isi dm accounts
|Manage Datamover accounts. An active SyncIQ license is required to create Datamover accounts.
|isi dm base-policies
|Manage Datamover base policies. Base policies are templates that provide common values to groups of related concrete Datamover policies. For example, define a base policy to override the run schedule of a concrete policy.
|isi dm certificates
|Manage Datamover certificates.
|isi dm config
|Show Datamover Manual Configuration.
|isi dm datasets
|Show Datamover Dataset Information.
|isi dm historical-jobs
|Manage Datamover historical jobs.
|isi dm jobs
|Manage Datamover jobs.
|isi dm policies
|Manage Datamover policy. Policies can be either:
CREATION – Creates/replicates a dataset, either once or on a schedule.
COPY – Defines a one-time copy of a dataset to or from a remote system.
|isi dm throttling
|Manage Datamover bandwidth and CPU throttling. Bandwidth throttling rules can be configured for each Datamover job.
In the next article in this series, we’ll look at the configuration required to use SmartSync with Google Cloud (GCP).