Qumulo Cloud Q on the AWS Cloud

Quick Start Reference Deployment


January 2022
Dack Busch and Gokul Kuppuraj, Qumulo
Dave May, AWS Integration & Automation team

Visit our GitHub repository for source files and to post feedback, report bugs, or submit feature ideas for this Quick Start.

This Quick Start was created by Qumulo in collaboration with Amazon Web Services (AWS). Quick Starts are automated reference deployments that use AWS CloudFormation templates to deploy key technologies on AWS, following AWS best practices.

Overview

This guide provides instructions for deploying the Qumulo Cloud Q Quick Start reference architecture in the AWS Cloud.

Amazon may share user-deployment information with the AWS Partner that collaborated with AWS on the Quick Start.

Qumulo Cloud Q on AWS

The Qumulo Cloud Q Quick Start provisions a 1-TB to 6-PB cluster of Qumulo file-storage nodes in the AWS Cloud. The Qumulo multiprotocol file data platform delivers enterprise scale and performance for compute-intensive workloads, accelerating the monetization of your unstructured data.

For more information, see the Qumulo knowledge base.

AWS costs

You are responsible for the cost of the AWS services and any third-party licenses used while running this Quick Start. There is no additional cost for using the Quick Start.

The AWS CloudFormation templates for Quick Starts include configuration parameters that you can customize. Some of the settings, such as the instance type, affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you use. Prices are subject to change.

After you deploy the Quick Start, create AWS Cost and Usage Reports to deliver billing metrics to an Amazon Simple Storage Service (Amazon S3) bucket in your account. These reports provide cost estimates based on usage throughout each month and aggregate the data at the end of the month. For more information, see What are AWS Cost and Usage Reports?

Software licenses

Before you deploy the Cloud Q Quick Start, subscribe to a Qumulo Amazon Machine Image (AMI) in the AWS Marketplace. See the Subscribe to a Qumulo Marketplace AMI section in this guide.

Architecture

Deploying this Quick Start for a new virtual private cloud (VPC) with default parameters builds the following Cloud Q environment in the AWS Cloud.

Figure 1. Quick Start architecture for Cloud Q on AWS

As shown in Figure 1, the Quick Start sets up the following:

  • Two Availability Zones: one for the Qumulo cluster and another that you could use for a disaster recovery Qumulo cluster.*

  • A VPC configured with public and private subnets, according to AWS best practices, to provide you with your own virtual network on AWS.*

  • In the public subnet, a managed network address translation (NAT) gateway to allow outbound internet access for resources in the private subnet.

  • In the private subnet:

    • A cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances that run the Qumulo Core software. (Qumulo uses the term node instead of instance.)

    • Amazon Elastic Block Store (Amazon EBS) volumes, which store the files for the Qumulo cluster.

    • A provisioner EC2 instance (node), which automatically stops running after provisioning the Qumulo cluster. It automatically restarts during stack updates.

    • (Optional) An Amazon Route 53 hosted zone to configure DNS A records for the cluster.

  • AWS Key Management Service (AWS KMS) to use a customer managed key for encryption of EBS volumes.

  • AWS Secrets Manager to store credentials.

  • AWS Identity and Access Management (IAM) to manage roles.

  • Amazon CloudWatch to log metrics for the Qumulo cluster and access a CloudWatch dashboard for the cluster.

  • Amazon Simple Notification Service (Amazon SNS) to send alerts for EBS volume anomalies and EC2-instance recovery events.

  • AWS Systems Manager for monitoring and storing the Qumulo cluster’s provisioning state.

  • Amazon S3 for populating content on the Qumulo cluster.

  • AWS Lambda to collect metrics for the Qumulo cluster and monitor EBS volume health. (Qumulo refers to Lambda as Sidecar.)

* The template that deploys the Quick Start into an existing VPC skips the components marked by an asterisk and prompts you for your existing VPC configuration.

Planning the deployment

Specialized knowledge

This deployment requires a moderate level of familiarity with AWS services. If you’re new to AWS, see Getting Started Resource Center and AWS Training and Certification. These sites provide materials for learning how to design, deploy, and operate your infrastructure and applications on the AWS Cloud.

AWS account

If you don’t already have an AWS account, create one at https://aws.amazon.com by following the on-screen instructions. Part of the sign-up process involves receiving a phone call and entering a PIN using the phone keypad.

Your AWS account is automatically signed up for all AWS services. You are charged only for the services you use.

Technical requirements

Before you launch the Quick Start, review the following information and ensure that your account is properly configured. Otherwise, deployment might fail.

Resource quotas

If necessary, request service quota increases for the following resources. You might need to request increases if your existing deployment currently uses these resources and if this Quick Start deployment could result in exceeding the default quotas. The Service Quotas console displays your usage and quotas for some aspects of some services. For more information, see What is Service Quotas? and AWS service quotas.

Resource | This deployment uses
VPCs | 1
Elastic IP addresses (optional for public management) | 1
Security groups | 2
IAM roles | 4
Qumulo EC2 instances | 4–20
Provisioner EC2 instance | 1
Lambda functions | 2
EBS volumes | Depends on the disk configuration
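
As a rough planning aid, the counts in the table can be compared against the headroom left in your account. The following is a minimal sketch, not part of the Quick Start. The function names are hypothetical, the quota and usage numbers you pass in must come from your own account (for example, from the Service Quotas console), and treating EC2 quotas as a simple instance count is a simplification, since EC2 On-Demand quotas are normally expressed in vCPUs. EBS volumes are omitted because they depend on the disk configuration.

```python
def planned_usage(node_count, public_management=False):
    """Resource counts from the table above for a given number of Qumulo nodes."""
    if not 4 <= node_count <= 20:
        raise ValueError("this guide lists 4 to 20 Qumulo EC2 instances")
    return {
        "VPCs": 1,  # applies to the new-VPC template only
        "Elastic IP addresses": 1 if public_management else 0,
        "Security groups": 2,
        "IAM roles": 4,
        "EC2 instances": node_count + 1,  # cluster nodes plus the provisioner
        "Lambda functions": 2,
    }


def quota_shortfalls(planned, in_use, quotas):
    """Return {resource: shortfall} for each resource that would exceed its quota."""
    shortfalls = {}
    for resource, needed in planned.items():
        available = quotas[resource] - in_use.get(resource, 0)
        if needed > available:
            shortfalls[resource] = needed - available
    return shortfalls
```

An empty result suggests the listed resources fit within the quotas you supplied. Any entry in the result is a candidate for a quota-increase request before launch.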

Supported AWS Regions

For any Quick Start to work in a Region other than its default Region, all the services it deploys must be supported in that Region. You can launch a Quick Start in any Region and see if it works. If you get an error such as “Unrecognized resource type,” the Quick Start is not supported in that Region.

For an up-to-date list of AWS Regions and the AWS services they support, see AWS Regional Services.

Certain Regions are available on an opt-in basis. For more information, see Managing AWS Regions.

IAM permissions

Before launching the Quick Start, you must sign in to the AWS Management Console with IAM permissions for the resources that the templates deploy. The AdministratorAccess managed policy within IAM provides sufficient permissions, although your organization may choose to use a custom policy with more restrictions. For more information, see AWS managed policies for job functions.

Deployment options

This Quick Start provides three deployment options:

  • Deploy Cloud Q into a new VPC with advanced parameters. This option builds a new AWS environment consisting of the VPC, subnets, NAT gateway, security groups, and other infrastructure components. It then deploys Cloud Q into the new VPC, exposing all parameter options for full flexibility.

  • Deploy Cloud Q into an existing VPC with advanced parameters. This option provisions Cloud Q in your existing AWS infrastructure, exposing all parameter options for full flexibility.

  • Deploy Cloud Q into an existing VPC with standard parameters. This option provisions Cloud Q in your existing AWS infrastructure with the minimal set of required parameters.

The Quick Start provides separate templates for these options. It also lets you configure Classless Inter-Domain Routing (CIDR) blocks, instance types, and Cloud Q settings.

Allow Qumulo public access if you use AWS Direct Connect

If you use AWS Direct Connect to connect between an on-premises environment and your VPC, and if you restrict public access to specific URLs with a corporate firewall, ensure that your firewall allows access to the following Qumulo public URLs. Because the public IP addresses behind these URLs may occasionally change, allow access by URL if you can. If you can’t, allow access to the IP addresses that the URLs resolve to.

URL | Description
https://trends.qumulo.com | Pulls code for software upgrades during provisioning. You can use this URL for statistics for your cluster.
https://missionq.qumulo.com | Delivers statistics to Qumulo’s remote monitoring service, which is included in your Qumulo subscription.
https://ep1.qumulo.com | (Disabled by default) The cluster uses this URL if you enable remote VPN support for Qumulo Customer Success.
https://monitor.qumulo.com | Uses remote VPN support to deliver logs for collaborating with Qumulo Customer Success.
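
If your firewall can only allow IP addresses, you need the addresses that these URLs currently resolve to. The following is a minimal sketch that uses only the Python standard library. The function name and the injectable `resolver` argument are illustrative assumptions, and the output is only a snapshot, because the addresses can change.

```python
import socket
from urllib.parse import urlparse

# URLs copied from the table above.
QUMULO_URLS = [
    "https://trends.qumulo.com",
    "https://missionq.qumulo.com",
    "https://ep1.qumulo.com",
    "https://monitor.qumulo.com",
]


def resolve_hosts(urls, resolver=socket.getaddrinfo):
    """Map each URL's hostname to the sorted, de-duplicated IPs it resolves to."""
    resolved = {}
    for url in urls:
        host = urlparse(url).hostname
        try:
            infos = resolver(host, 443, proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            resolved[host] = []  # name did not resolve from this network
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        resolved[host] = sorted({info[4][0] for info in infos})
    return resolved
```

Calling `resolve_hosts(QUMULO_URLS)` from a machine behind the same DNS as your cluster gives the addresses to allow at that moment. Rerun it periodically if you rely on IP-based rules.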

Additional planning and design information

Documentation | Description
Cloud Q Quick Start: Template Comparison | Details on the parameters and options provided by this Quick Start’s templates.
DNS options in AWS to enable IP failover and client distribution | Details on the DNS options in AWS.
Cloud Q Quick Start: Supported AWS Regions | Details on supported AWS Regions for this Quick Start.
Cloud Q Quick Start: Deploying in a VPC with no internet access | Details on deploying this Quick Start into a VPC that has no internet access.
Cloud Q Quick Start: Deploying with an AWS Custom IAM role | Details on deploying this Quick Start with a custom IAM role, including policy requirements.
Cloud Q Quick Start: AWS resources & EBS Service Quota planning | Details on service-quota planning for Amazon EBS, including EC2 instance types and EBS volume types.
Cloud Q Quick Start: Qumulo sizing & performance on AWS | Details on Qumulo cluster performance and scalability on AWS.

Deployment steps

Confirm your AWS account configuration

  1. Sign in to your AWS account at https://aws.amazon.com with an IAM user or role that has the necessary permissions. For details, see Planning the deployment earlier in this guide.

  2. Make sure that your AWS account is configured correctly, as discussed in the Technical requirements section.

Subscribe to a Qumulo Marketplace AMI

This Quick Start supports all Qumulo AWS Marketplace offerings. The 1-TB and 12-TB offerings are free for 30 days.

  1. Go to the AWS Marketplace.

  2. In the search bar, enter Qumulo.

  3. Choose the offering with the appropriate capacity for your configuration and deployment Region.

  4. Choose Continue to Subscribe on the upper right. The subscription processes within a few minutes.

  5. If you have a private offer, accept the offer by clicking the link you receive in an email. For example, the Qumulo Customizable File Storage Node offering (unless you choose 320 TiB per EC2 instance) requires a private offer.

Launch the Quick Start

If you’re deploying Cloud Q into an existing VPC, ensure that your VPC has both a private and public subnet. This Quick Start doesn’t support shared subnets. These subnets require NAT gateways in their route tables to allow the instances to download packages and software without exposing them to the internet. Also ensure that the domain name option in the Dynamic Host Configuration Protocol (DHCP) options is configured as explained in DHCP options sets. You provide your VPC settings when you launch the Quick Start.

Each deployment takes about 15 minutes to complete.

  1. Sign in to your AWS account, and choose one of the following options to launch the AWS CloudFormation template for Cloud Q. For help with choosing an option, see Deployment options earlier in this guide.

    Deploy into a new VPC with advanced parameters

    View template

    Deploy into an existing VPC with advanced parameters

    View template

    Deploy into an existing VPC with standard parameters

    View template

  2. Check the AWS Region that’s displayed in the upper-right corner of the navigation bar, and change it if necessary. This Region is where you build the network infrastructure. The template is launched in the us-east-1 Region by default. You can deploy this Quick Start in all AWS Regions except those in China. You can deploy it in AWS Local Zones and on AWS Outposts. For more information, see Supported AWS Regions earlier in this guide.

  3. On the Create stack page, keep the default setting for the template URL, and then choose Next.

  4. On the Specify stack details page, change the stack name if needed. Review the parameters for the template. Provide values for the parameters that require input. For all other parameters, review the default settings and customize them as necessary. When you finish reviewing and customizing the parameters, choose Next.

  5. On the Configure stack options page, you can specify tags (key-value pairs) for resources in your stack and set advanced options. When you finish, choose Next.

  6. On the Review page, review and confirm the template settings. Under Capabilities, select the two check boxes to acknowledge that the template creates IAM resources and might require the ability to automatically expand macros.

  7. Choose Create stack to deploy the stack.

  8. Monitor the status of the stack. When the status is CREATE_COMPLETE, the Cloud Q deployment is ready.

  9. To view the created resources, see the values displayed in the Outputs tab for the stack.
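
The console steps above can also be approximated from a script. The following is a minimal sketch that assumes the third-party boto3 library is installed and credentials are configured. The helper names are hypothetical, and the template URL and parameter names you pass in are placeholders you must supply yourself, because this guide doesn't list them. The two capabilities correspond to the check boxes in step 6. `CAPABILITY_NAMED_IAM` is used here on the assumption that it is the safer superset of `CAPABILITY_IAM`.

```python
def build_create_stack_request(stack_name, template_url, parameters,
                               disable_rollback=False):
    """Build keyword arguments for the CloudFormation create_stack call."""
    return {
        "StackName": stack_name,
        # Typically an S3 URL, which avoids template-size limits.
        "TemplateURL": template_url,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
        # Acknowledge IAM resource creation and macro expansion.
        "Capabilities": ["CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
        # True keeps failed resources in place for troubleshooting.
        "DisableRollback": disable_rollback,
    }


def launch(request, region="us-east-1"):
    """Create the stack and block until CloudFormation reports completion."""
    import boto3  # third-party; imported lazily so the builder has no dependency

    cfn = boto3.client("cloudformation", region_name=region)
    stack_id = cfn.create_stack(**request)["StackId"]
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_id)
    return stack_id
```

Keeping the request builder separate from the call makes it possible to review or log exactly what will be sent before anything is created.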

Test the deployment

Check the EC2 instances

Follow these steps to confirm that all the cluster instances are running and that the provisioner instance has stopped running.

  1. Open the EC2 console.

  2. Choose the stack name.

  3. Clear the Instance state = running filter.

  4. Verify that all the cluster instances are running.

  5. Verify that the provisioner instance (…​Qumulo Provisioning Node) has stopped running. If it’s still running, wait. It takes up to 15 minutes after stack creation for this instance to finish initializing. If it hasn’t stopped running after 15 minutes, see the troubleshooting section The provisioner instance is still running.
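
The same check can be scripted against the response of the EC2 `describe_instances` API. The following is a minimal sketch that works on the response dictionary only. How you filter the call to your stack's instances is left to you, the function names are hypothetical, and matching the provisioner by a Name tag ending in "Qumulo Provisioning Node" is an assumption based on the instance name shown in step 5.

```python
def summarize_instances(response):
    """Flatten a describe_instances response to {instance_id: (name, state)}."""
    summary = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
            summary[instance["InstanceId"]] = (
                tags.get("Name", ""),
                instance["State"]["Name"],
            )
    return summary


def check_deployment(summary, provisioner_marker="Qumulo Provisioning Node"):
    """Return a list of problems; an empty list means the states look as expected."""
    problems = []
    provisioner_seen = False
    for instance_id, (name, state) in sorted(summary.items()):
        if name.endswith(provisioner_marker):
            provisioner_seen = True
            if state != "stopped":
                problems.append(
                    f"{instance_id}: provisioner is {state}, expected stopped")
        elif state != "running":
            problems.append(
                f"{instance_id}: cluster instance is {state}, expected running")
    if not provisioner_seen:
        problems.append("no provisioner instance found")
    return problems
```

A provisioner that is still reported as running shortly after stack creation is not necessarily a fault, so rerun the check after the 15-minute window before troubleshooting.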

Check cluster quorum formation and data protection

Follow these steps to confirm that the cluster formed quorum, that you have the expected number of instances in the cluster, and that your data is protected.

  1. Open the CloudFormation console.

  2. Choose the top-level stack name.

  3. Choose Outputs. A list of URLs appears.

  4. Copy the appropriate URL from the Value column, and paste it into your browser as follows:

    • If connecting by the public internet, copy the QumuloPublicIP URL, and open a page from your local machine.

    • If connecting from within your VPC, copy the QumuloPrivateIP URL, and paste it into the browser of an EC2 instance running Chrome.

  5. Log in to the Qumulo user interface with the user name 'admin' and the administrator password you provided during deployment. When you see the Qumulo dashboard, shown in Figure 2, you know that your cluster formed quorum.

    Figure 2. Qumulo dashboard

    If, instead of a prompt for user name and password, you see the End User Agreement screen, the cluster failed to form quorum. See the troubleshooting section Qumulo doesn’t prompt me for user name and password.

  6. Choose More details. Verify the following, as shown in Figure 3.

    1. The number of instances (nodes) listed matches the number you expect.

    2. Each instance has a green checkmark in the Status column.

    3. This message appears: "Data is protected from 2 drive failures or 1 node failure at a time. The cluster is in balance."

      Figure 3. Qumulo dashboard details

Postdeployment steps

(Optional) Set up disaster recovery - Qumulo Recover Q

For disaster recovery and business continuity, you can deploy one or more Qumulo Recover Q clusters in other Availability Zones or AWS Regions. For more information, see Cloud Q Quick Start: Deploy a Recover Q Cluster.

(Optional) Copy data into your cluster from an S3 bucket

If you’re using Qumulo Core version 4.3.0 or newer, you can populate data on your Qumulo cluster by copying data from an Amazon S3 bucket using Qumulo Shift for Amazon S3. To create a Shift job, follow these steps:

  1. Log in to the Qumulo UI.

  2. Choose Cluster, Copy to/from S3.

  3. Fill in the parameters.

For more information on the Qumulo Shift feature set, user interface, and command line interface, see the Qumulo knowledge base.

Additional information

To learn how to use the stack to maintain the Qumulo cluster through its lifecycle and view metrics in CloudWatch, see the following:

Documentation | Description
Cloud Q Quick Start: Supported CloudFormation Stack Updates | Details on CloudFormation stack update options and examples, including adding instances (nodes) to the cluster and upgrading the Qumulo Sidecar.
Cloud Q Quick Start: Deleting the CloudFormation Stack | Details on termination protection and on cleaning up an AWS KMS customer managed key policy.
Cloud Q Quick Start: Using the Custom CloudWatch Dashboard | Details on viewing the CloudWatch dashboard and resource groups that are created for the Qumulo cluster.
Cloud Q Quick Start: Provisioning Instance Functions | Details on the functions of the provisioner instance.
Cloud Q Quick Start: Updating to the Advanced Template | Details on updating to the advanced parameters if you originally deployed the Qumulo Cloud Q Quick Start using the template with standard parameters.

FAQ

Q. I encountered a CREATE_FAILED error when I launched the Quick Start.

A. If AWS CloudFormation fails to create the stack, relaunch the template with Rollback on failure set to Disabled. This setting is under Advanced in the AWS CloudFormation console on the Configure stack options page. With this setting, the stack’s state is retained, and the instance keeps running so that you can troubleshoot the issue. (For Windows, look at the log files in %ProgramFiles%\Amazon\EC2ConfigService and C:\cfn\log.)

When you set Rollback on failure to Disabled, you continue to incur AWS charges for this stack. Delete the stack when you finish troubleshooting.

For more information, see Troubleshooting AWS CloudFormation.


Q. I encountered a size-limitation error when I deployed the AWS CloudFormation templates.

A. Launch the Quick Start templates from the links in this guide or from another S3 bucket. If you deploy the templates from a local copy on your computer or from a location other than an S3 bucket, you might encounter template-size limitations. For more information, see AWS CloudFormation quotas.

Troubleshooting

I need to find the UUID for the cluster

You may need to know your cluster’s universally unique identifier (UUID) for troubleshooting. The provisioner instance stores a copy of the UUID in Parameter Store, which is a capability of AWS Systems Manager.

To find the UUID, follow these steps:

  1. Open the Systems Manager console.

  2. Choose Parameter Store.

  3. Look for /qumulo/<my stack>/<uuid> (where the text in brackets represents your stack name). The value associated with the name is the UUID for the cluster.

I don’t remember the cluster administrator password

To retrieve the cluster administrator password, follow these steps:

  1. Open the Secrets Manager console.

  2. Choose Secrets.

  3. Choose the top-level stack name.

  4. Under ClusterSecrets, choose Retrieve secret value.

The stack failed when provisioning the nested stack AWSVPCSTACK or CloudQStack

To determine and remedy the cause of the failure, follow these steps:

  1. Open the CloudFormation console.

  2. Ensure that the View nested slider is set so that you can view nested stacks.

  3. Choose the failed stack.

  4. Under the Events tab, find the failure message.

  5. Take appropriate action. For example, if the message indicates that the S3 bucket, S3 key prefix, or object URL parameter values are incorrect (a common reason that these stacks fail), delete the stack and relaunch with the correct parameter values.
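
Finding the failure message in a long Events tab can be tedious, so the same lookup can be sketched against the response of the CloudFormation `describe_stack_events` API. The following is a minimal sketch that operates on the response dictionary only, and the function name is hypothetical. It sorts failures oldest first on the assumption that the earliest failure is usually closest to the root cause.

```python
def failure_events(response):
    """Return (logical_id, status, reason) for failed events, oldest first."""
    failed = [
        event for event in response.get("StackEvents", [])
        if event["ResourceStatus"].endswith("_FAILED")
    ]
    # The API returns newest events first; reverse that to surface the root cause.
    failed.sort(key=lambda event: event["Timestamp"])
    return [
        (
            event["LogicalResourceId"],
            event["ResourceStatus"],
            event.get("ResourceStatusReason", ""),
        )
        for event in failed
    ]
```

For nested stacks, run this against the events of the failed nested stack rather than the top-level stack, since the top-level stack often reports only that an embedded stack failed.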

The stack failed when provisioning the nested stack QSTACK

To determine and remedy the cause of the failure, follow these steps:

  1. Open the CloudFormation console.

  2. Ensure that the View nested slider is set so that you can view nested stacks.

  3. Select the failed stack.

  4. Under the Events tab, find the failure message.

  5. Take appropriate action.

    Common causes of QSTACK failing | Actions
    An AWS Marketplace offer that matches the QMarketPlaceType parameter value you entered has not been accepted. | Open AWS Marketplace, search for the correct Qumulo Marketplace offering, and subscribe.
    The EBS volumes configuration doesn’t match the requirements for the QAmiID parameter value you entered in the template. | Check the EBS volume configuration selected in the template, and relaunch the stack with EBS parameter values supported by the AMI.
    The cluster failed to place in the placement group. | Deploy the cluster into a different Availability Zone, or use a different private subnet ID within the VPC to find more available resources.
    The message "Service limit exceeded" appears, which indicates that QSTACK failed because the deployment exceeded AWS service quotas (formerly referred to as limits). | Either delete resources to free available capacity or contact AWS Support and request an increase in service quotas.

Qumulo doesn’t prompt me for user name and password

When you open the Qumulo software, if you see the End User Agreement screen instead of a prompt for your user name and password, the cluster didn’t form quorum.

Common causes of the cluster not forming quorum | Actions
The software version specified in the template doesn’t exist or is older than the AMI software version. | Ensure that the software version specified for the cluster is equal to or newer than the version that the Marketplace offer lists.
The VPC doesn’t have public internet access. | Either add a NAT gateway to your existing VPC or, if you want to deploy without internet access, follow these instructions: Cloud Q Quick Start: Deploying in a VPC with no internet access.

Do not form quorum manually, or the provisioner instance won’t be able to complete the secondary provisioning of the cluster and AWS infrastructure.

The provisioner instance is still running

The provisioner instance usually stops running within five minutes of the stack completing deployment. It can take longer if your AMI ID has an older software version. This is because each quarterly software upgrade takes about four minutes, and the upgrades happen one at a time until the instance reaches the desired version. (Cluster instances are upgraded in parallel, so instance count has a minimal impact on the time this takes.) If the provisioner instance hasn’t stopped running after 15 minutes, there’s probably an issue.
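
As a rough worked example of that arithmetic, the time spent on upgrades grows linearly with the number of quarterly releases between the AMI version and the desired version. The following is a minimal sketch. The function name is hypothetical, and the four-minute figure is the approximate value quoted above, not a guarantee.

```python
MINUTES_PER_QUARTERLY_UPGRADE = 4  # "about four minutes" per the text above


def estimated_upgrade_minutes(quarterly_upgrades):
    """Estimate upgrade time when upgrades are applied one at a time."""
    if quarterly_upgrades < 0:
        raise ValueError("quarterly_upgrades must not be negative")
    return quarterly_upgrades * MINUTES_PER_QUARTERLY_UPGRADE
```

For example, an AMI that is three quarterly releases behind the desired version would spend roughly 12 minutes on upgrades alone, which is already close to the 15-minute mark.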

Common causes of the provisioner instance continuing to run | Actions
The VPC doesn’t have access to the public internet. Without access to public infrastructure, the provisioner instance can’t talk to AWS services (such as Secrets Manager, AWS KMS, and Systems Manager) and can’t download the desired version of Qumulo Core software. | Review the public and private subnets, their route tables, and the NAT gateway. Make any needed corrections. Then reboot the provisioner instance as follows: Open the EC2 console. Select the provisioner instance. Choose Instance state, Reboot Instance. (If deploying without internet access, see Cloud Q Quick Start: Deploying in a VPC with no internet access.)
A customer managed key ID was entered in the VolumesEncryptionKey parameter, and the key policy could not be modified because the key policy didn’t have valid statement identifiers (SIDs) before the template was launched. | Go to AWS KMS and correct the key policy for the key you specified. Then reboot the provisioner instance as follows: Open the EC2 console. Select the provisioner instance. Choose Instance state, Reboot Instance. (To learn more about KMS key policies and cleanup, see Deleting the CloudFormation Stack.)
A stack update was executed to add cluster instances. The stack update succeeded, but the instances were not added to the cluster. The cluster’s administrator password was probably changed after deployment. | Open the Secrets Manager console, and choose the top-level stack name. Under ClusterSecrets, choose Retrieve secret value, Edit. Update the administrator password, and save the secret. Then reboot the provisioner instance as follows: Open the EC2 console. Select the provisioner instance. Choose Instance state, Reboot Instance.

My problem is not described in this guide

If the earlier troubleshooting steps don’t rectify your problem, review the AWS Parameter Store history. This history, as shown in Figure 4, often helps you discover where the provisioner instance is failing. To see this history, open the Parameter Store with the name /qumulo/<my stack>/last-run-status (where the text in brackets represents your stack name).

Figure 4. Parameter Store history
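
The same history can be read programmatically through the Systems Manager `get_parameter_history` API. The following is a minimal sketch that assumes the third-party boto3 library and configured credentials for the fetch step. The helper names are hypothetical, and only the first page of results is read, so a long history may need pagination.

```python
def last_run_status_name(stack_name):
    """Build the parameter name described above for a given stack name."""
    return f"/qumulo/{stack_name}/last-run-status"


def history_lines(response):
    """Format a get_parameter_history response as lines, oldest version first."""
    entries = sorted(response.get("Parameters", []),
                     key=lambda entry: entry["Version"])
    return [f'v{entry["Version"]}: {entry["Value"]}' for entry in entries]


def fetch_history(stack_name, region="us-east-1"):
    """Fetch the first page of history for the stack's last-run-status parameter."""
    import boto3  # third-party; imported lazily so the helpers above stay pure

    ssm = boto3.client("ssm", region_name=region)
    return ssm.get_parameter_history(Name=last_run_status_name(stack_name))
```

Reading the lines in version order shows the sequence of states the provisioner instance reported, and the last line is where it stopped making progress.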

Finally, review the provisioning-instance log, which often shows an error that points you to the resolution. You can review the log in the console or download it to collaborate with Qumulo Care.

To retrieve the log, follow these steps:

  1. Open the EC2 console.

  2. Select the provisioner instance.

  3. Choose Actions on the upper right.

  4. Choose Monitor & troubleshoot, Get system log.

  5. (Optional) Download the log by choosing Download on the upper right.
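
Once you have the log text, a quick scan for likely error lines can narrow down where to look. The following is a minimal sketch. The function names and the keyword list are illustrative assumptions, not patterns documented by Qumulo, and the fetch step assumes the third-party boto3 library, whose `get_console_output` call returns the log under the `Output` key.

```python
def suspicious_lines(log_text, keywords=("error", "fail", "traceback")):
    """Return (line_number, line) pairs whose text contains any keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.strip()))
    return hits


def fetch_system_log(instance_id, region="us-east-1"):
    """Fetch the system log text for an instance (requires boto3)."""
    import boto3  # third-party; imported lazily so the scanner has no dependency

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.get_console_output(InstanceId=instance_id).get("Output", "")
```

The line numbers in the result make it easier to point Qumulo Care at the relevant part of a downloaded log.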

Customer responsibility

After you successfully deploy this Quick Start, confirm that your resources and services are updated and configured — including any required patches — to meet your security and other needs. For more information, see the AWS Shared Responsibility Model.

Send us feedback

To post feedback, submit feature ideas, or report bugs, use the Issues section of the GitHub repository for this Quick Start. To submit code, see the Quick Start Contributor’s Guide.

Quick Start reference deployments

GitHub repository

Visit our GitHub repository to download the templates and scripts for this Quick Start, to post your comments, and to share your customizations with others.


Notices

This document is provided for informational purposes only. It represents AWS’s current product offerings and practices as of the date of issue of this document, which are subject to change without notice. Customers are responsible for making their own independent assessment of the information in this document and any use of AWS’s products or services, each of which is provided “as is” without warranty of any kind, whether expressed or implied. This document does not create any warranties, representations, contractual commitments, conditions, or assurances from AWS, its affiliates, suppliers, or licensors. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

The software included with this paper is licensed under the Apache License, version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the accompanying "license" file. This code is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either expressed or implied. See the License for specific language governing permissions and limitations.