Discngine 3decision on AWS

Partner Solution Deployment Guide


April 2022
Jonathan Manassen and Alexandre Gillet, Discngine S.A.S.
Dattaprasad Sadwelkar and Vinod Shukla, AWS Integration & Automation team

Refer to the GitHub repository to view source files, report bugs, submit feature ideas, and post feedback about this Partner Solution. To comment on the documentation, refer to Feedback.

This Partner Solution was created by Discngine S.A.S. in collaboration with Amazon Web Services (AWS). Partner Solutions are automated reference deployments that help people deploy popular technologies on AWS according to AWS best practices. If you’re unfamiliar with AWS Partner Solutions, refer to the AWS Partner Solution General Information Guide.

Overview

This Partner Solution deploys Discngine 3decision on the AWS Cloud. If you are unfamiliar with AWS Partner Solutions, we recommend that you read the AWS Partner Solution General Content Guide.

Costs and licenses

This Partner Solution requires a license for 3decision for use in a production environment. To obtain a license, contact Discngine at contact@discngine.com. A license is not required for use in development and test environments.

There is no cost to use this Partner Solution; however, you will be billed by AWS for the resources deployed. For more information, see the AWS Partner Solution General Content Guide.

Architecture

Deploying this Partner Solution for a new virtual private cloud (VPC) with default parameters builds the following 3decision environment in the AWS Cloud.

Figure 1. Partner Solution architecture for 3decision on AWS

As shown in Figure 1, the Partner Solution sets up the following:

  • A highly available architecture that spans three Availability Zones.*

  • A virtual private cloud (VPC) configured with public and private subnets, according to AWS best practices, to provide you with your own virtual network on AWS.*

  • In the public subnets:

    • Managed network address translation (NAT) gateways to allow outbound internet access for resources in the private subnets.

    • Linux bastion hosts in an Auto Scaling group to allow inbound SSH (Secure Shell) access to Amazon Elastic Compute Cloud (Amazon EC2) instances in public and private subnets.

  • In the private subnets, Kubernetes nodes in an Auto Scaling group to set up 3decision software.

  • Amazon Elastic Load Balancing for the 3decision application and extract, transform, and load (ETL) configuration panel.

  • Amazon Elastic Kubernetes Service (Amazon EKS) for Kubernetes orchestration of the EKS cluster of 3decision nodes.

  • Amazon Elastic Block Store (Amazon EBS) for persistent storage of Kubernetes data.

  • Amazon Relational Database Service (Amazon RDS) for 3decision data and analysis.

  • Amazon Route 53 for a Domain Name System (DNS) record.

* The template that deploys the Partner Solution into an existing VPC skips the components marked by asterisks and prompts you for your existing VPC configuration.

Deployment options

This Partner Solution provides three deployment options:

The Partner Solution provides separate templates for these options. It also lets you configure Classless Inter-Domain Routing (CIDR) blocks, instance types, and 3decision settings.

Predeployment steps

Administrator privileges

To deploy the Partner Solution, you must have an administrator account with permissions to deploy the resources the Partner Solution contains, which include IAM roles. IAM roles deployed by this Partner Solution follow the principle of least privilege.

Authentication

During deployment, you are prompted to provide information about your authentication provider. 3decision supports OpenID Connect for both Azure and Okta. Refer to the following instructions to create Azure and Okta OpenID Connect applications.

Creating an Azure OpenID Connect application

  1. Sign in to the Azure portal using an account with administrator permissions.

  2. Choose Azure Active Directory.

  3. Under App registrations, choose New registration.

  4. Enter a name for your application (for example, Discngine 3decision prod).

  5. For Redirect URI, enter the following. Replace <your_domain> and <your_top_level_domain> in the URI shown with your information (for example, discngine.com).

    https://3decision-api.<your_domain>.<your_top_level_domain>/auth/azure/callback

  6. Choose Register.

  7. In the Authentication section of your application, under Implicit grant and hybrid flows, select ID Token to enable ID tokens.

  8. In the Token configuration section, add the following optional claims to the ID token:

    • email

    • preferred_username

    • family_name

    • given_name

  9. On the Certificates & secrets page, generate a new secret and copy it. You’ll use it during the Partner Solution deployment.

  10. On the Overview page, choose the Endpoints tab.

  11. Copy the URL in OpenID Connect metadata document. You’ll use it during the Partner Solution deployment.

  12. In the Azure portal, navigate to Enterprise Applications.

  13. Choose your 3decision application and open it.

  14. Navigate to Properties.

  15. Set Assignment required? to Yes if you want to restrict access to users who are assigned to this application.
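The redirect URI entered in step 5 follows a fixed pattern, and the Okta application described later in this guide uses the same pattern with a different provider segment. The following is a minimal sketch of that pattern, not part of 3decision itself. The function name callback_uri is hypothetical, and the sketch assumes that the domain and top-level domain are supplied together as one string (for example, discngine.com).

```python
def callback_uri(domain: str, provider: str) -> str:
    """Return the OpenID Connect redirect URI for a 3decision deployment.

    `domain` is the registered domain including its top-level domain,
    for example "discngine.com". `provider` is "azure" or "okta".
    """
    if provider not in ("azure", "okta"):
        raise ValueError(f"unsupported provider: {provider!r}")
    # Normalize casing and drop stray whitespace or surrounding dots.
    domain = domain.strip().strip(".").lower()
    if "." not in domain:
        raise ValueError("domain must include a top-level domain")
    return f"https://3decision-api.{domain}/auth/{provider}/callback"
```

For example, callback_uri("discngine.com", "azure") yields https://3decision-api.discngine.com/auth/azure/callback, which is the value to paste into the Redirect URI field.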

Creating an Okta OpenID Connect application

  1. In the Okta Admin Console, under Applications, choose Applications.

  2. Choose Create app integration.

  3. For Sign In method, choose OIDC - OpenID Connect.

  4. For the application type, choose Web Application.

  5. Choose Next.

  6. Name the application 3decision prod.

  7. For Grant type, choose Client Credentials.

  8. Select Refresh Token.

  9. For Sign-in redirect URI, enter the following. Replace <your_domain> and <your_top_level_domain> in the URI shown with your information (for example, discngine.com).

    https://3decision-api.<your_domain>.<your_top_level_domain>/auth/okta/callback

  10. On the Assignments tab, under Assign, choose Assign to Groups. Choose group assignments to manage who will be able to access the application.

  11. Choose Save.

  12. Copy the secret, as you’ll use it during the Partner Solution deployment.

  13. In the upper-right corner of the Okta dashboard, copy the Okta domain in the global header (for example, example.okta-emea.com). You’ll use it during the Partner Solution deployment.

During Partner Solution deployment, you can choose default as your Okta server ID. 3decision will use the scopes email, profile, openid, and offline_access and the claims email and name from the ID token. For more information, refer to Create OIDC app integrations using AIW.
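Before deployment, it can help to confirm that a test ID token from your provider actually carries the claims that 3decision reads. The following sketch decodes the payload of a JSON Web Token and reports which expected claims are missing. It is an illustration only: the function names are hypothetical, the default claim list (email and name) is the Okta list stated above, and the sketch deliberately does not verify the token signature, so it must not be used for authentication decisions.

```python
import base64
import json


def decode_payload(id_token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("an ID token has three dot-separated segments")
    segment = parts[1]
    # JWT segments are base64url encoded with the padding stripped.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


def missing_claims(id_token: str, required=("email", "name")) -> list:
    """Return the required claims that are absent from the ID token payload."""
    payload = decode_payload(id_token)
    return [claim for claim in required if claim not in payload]
```

An empty list means that every expected claim is present. For an Azure token, pass the optional claims configured in the Azure application as the required argument.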

VPC Tagging

If you are deploying 3decision in an existing VPC, ensure that the public and private subnets have the correct tags. Add the tag kubernetes.io/role/elb to each public subnet and the tag kubernetes.io/role/internal-elb to each private subnet. Leave the value of each tag empty.
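The tagging rule above can be sketched as a small helper that builds one tagging request per subnet role. This is an illustrative sketch, not part of the Partner Solution: the function names and subnet IDs are hypothetical, and the optional apply_tags helper assumes that boto3 is installed and that AWS credentials are configured.

```python
PUBLIC_TAG = "kubernetes.io/role/elb"
PRIVATE_TAG = "kubernetes.io/role/internal-elb"


def subnet_tag_requests(public_subnet_ids, private_subnet_ids):
    """Build keyword arguments for the EC2 CreateTags call, one per subnet role."""
    requests = []
    if public_subnet_ids:
        requests.append({
            "Resources": list(public_subnet_ids),
            "Tags": [{"Key": PUBLIC_TAG, "Value": ""}],  # value stays empty
        })
    if private_subnet_ids:
        requests.append({
            "Resources": list(private_subnet_ids),
            "Tags": [{"Key": PRIVATE_TAG, "Value": ""}],  # value stays empty
        })
    return requests


def apply_tags(requests, region_name):
    """Send the requests to EC2 (assumes boto3 and configured credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    for kwargs in requests:
        ec2.create_tags(**kwargs)
```

Building the requests separately from sending them makes it possible to review the exact subnets and tags before anything is changed in the account.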

Encrypting the Database

The RDS database created during the deployment is based on a public unencrypted snapshot. To use an encrypted database snapshot, create and encrypt a copy of the default snapshot.

To find the default snapshot:

  1. Navigate to the Amazon RDS console.

  2. In the navigation pane, choose Snapshots.

  3. Choose the Public tab to find the latest version of the db3dec snapshot.

Enter the ARN of the snapshot copy in the Database snapshot identifier (DBSnapShotIdentifier) parameter during deployment.
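Creating the encrypted copy can also be scripted. The sketch below builds the parameters for the Amazon RDS CopyDBSnapshot operation, in which supplying a KMS key encrypts the copy. It is a sketch under stated assumptions: the function names, the target identifier, and the KMS key are placeholders, the copy is assumed to happen in the same Region as the public snapshot, and the optional copy_snapshot helper assumes that boto3 is installed and that AWS credentials are configured.

```python
def encrypted_copy_request(source_snapshot_arn, target_identifier, kms_key_id):
    """Build keyword arguments for an encrypted RDS snapshot copy."""
    # A snapshot owned by another account must be referenced by its ARN.
    if not source_snapshot_arn.startswith("arn:"):
        raise ValueError("the public snapshot must be referenced by its ARN")
    return {
        "SourceDBSnapshotIdentifier": source_snapshot_arn,
        "TargetDBSnapshotIdentifier": target_identifier,
        "KmsKeyId": kms_key_id,  # supplying a KMS key encrypts the copy
    }


def copy_snapshot(request, region_name):
    """Start the copy and return the ARN of the new snapshot."""
    import boto3  # imported lazily so the builder above works without boto3

    rds = boto3.client("rds", region_name=region_name)
    response = rds.copy_db_snapshot(**request)
    return response["DBSnapshot"]["DBSnapshotArn"]
```

The ARN returned by copy_snapshot is the value to enter for the DBSnapShotIdentifier parameter once the copy is available.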

Deployment steps

  1. Sign in to your AWS account, and launch this Partner Solution, as described under Deployment options. The AWS CloudFormation console opens with a prepopulated template.

  2. Choose the correct AWS Region, and then choose Next.

  3. On the Create stack page, keep the default setting for the template URL, and then choose Next.

  4. On the Specify stack details page, change the stack name if needed. Review the parameters for the template. Provide values for the parameters that require input. For all other parameters, review the default settings and customize them as necessary. When you finish reviewing and customizing the parameters, choose Next.

    Unless you’re customizing the Partner Solution templates or are instructed otherwise in this guide’s Predeployment section, don’t change the default settings for the following parameters: QSS3BucketName, QSS3BucketRegion, and QSS3KeyPrefix. Changing the values of these parameters will modify code references that point to the Amazon Simple Storage Service (Amazon S3) bucket name and key prefix. For more information, refer to the AWS Partner Solutions Contributor’s Guide.

  5. On the Configure stack options page, you can specify tags (key-value pairs) for resources in your stack and set advanced options. When you finish, choose Next.

  6. On the Review page, review and confirm the template settings. Under Capabilities, select all of the check boxes to acknowledge that the template creates AWS Identity and Access Management (IAM) resources and might require the ability to automatically expand macros.

  7. Choose Create stack. The stack takes about 2 hours to deploy.

  8. Monitor the stack’s status, and when the status is CREATE_COMPLETE, the Discngine 3decision deployment is ready.

  9. To view the created resources, choose the Outputs tab.
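Because the stack takes about 2 hours to deploy, you may prefer to poll its status from a script instead of watching the console. The following is a minimal sketch of such a polling loop. The function names, the 60-second polling interval, and the 3-hour timeout are assumptions made for illustration. The get_status argument is any callable that returns the current stack status string; with boto3, it could wrap describe_stacks and read Stacks[0]["StackStatus"] from the response.

```python
import time


def classify_stack_status(status: str) -> str:
    """Map a CloudFormation stack status to ready, in_progress, or failed."""
    if status == "CREATE_COMPLETE":
        return "ready"
    if status == "CREATE_IN_PROGRESS":
        return "in_progress"
    # CREATE_FAILED, ROLLBACK_* and DELETE_* all mean the creation did not succeed.
    return "failed"


def wait_for_stack(get_status, poll_seconds=60, timeout_seconds=3 * 60 * 60,
                   sleep=time.sleep):
    """Poll until the stack leaves CREATE_IN_PROGRESS and return its final status."""
    waited = 0
    while True:
        status = get_status()
        if classify_stack_status(status) != "in_progress":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"stack still {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep function is injectable so that the loop can be exercised quickly in a test without waiting.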

Postdeployment steps

(Optional) Add your SSL certificate to the Application Load Balancer

If you did not provide an SSL certificate during the Partner Solution deployment, edit the listener to configure an SSL certificate.

  1. Open the Amazon EC2 console.

  2. In the navigation pane, under Load Balancing, choose Load Balancers.

  3. Select the load balancer and choose Listeners.

  4. Select the check box for the listener and then choose Edit.

  5. For Protocol, choose HTTPS.

  6. Choose Update.

  7. Change the default Secure Sockets Layer (SSL) certificate. For Default SSL certificate, do one of the following:

    • If you created or imported a certificate using AWS Certificate Manager, choose From ACM and choose the certificate.

    • If you uploaded a certificate using IAM, choose From IAM and choose the certificate.

    • If you have a certificate that isn’t managed by ACM or IAM, import it to ACM and add it to your listener. For more information, refer to Listeners for your Network Load Balancer.

(Optional) Create a CNAME record in your DNS provider

If you did not provide a Route 53 hosted zone ID during the Partner Solution installation, you must create the following three DNS records.

  1. 3decision user URI: 3decision.<your domain>.

  2. 3decision API URI: 3decision-api.<your domain>.

  3. 3decision help URI: 3decision-help.<your domain>.

All three records must target the 3decision Application Load Balancer. To obtain the address of the Application Load Balancer, open the Amazon EC2 console. In the navigation pane, under Load Balancing, choose Load Balancers. The address of the Application Load Balancer is in the DNS Name column. For more information, refer to Network Load Balancers.
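The three records listed above differ only in their host prefix and share one target. The following sketch generates them from your domain and the DNS name of the load balancer. The function name, the dictionary layout, and the load balancer name in the usage note are hypothetical; adapt the output to the format that your DNS provider expects.

```python
def cname_records(domain: str, load_balancer_dns_name: str) -> list:
    """Return the three CNAME records that 3decision needs for a domain."""
    hosts = ("3decision", "3decision-api", "3decision-help")
    # Drop a trailing dot so the target is consistent across providers.
    target = load_balancer_dns_name.rstrip(".")
    return [
        {"name": f"{host}.{domain}", "type": "CNAME", "value": target}
        for host in hosts
    ]
```

For example, cname_records("discngine.com", "example-alb.eu-west-1.elb.amazonaws.com") returns records for 3decision.discngine.com, 3decision-api.discngine.com, and 3decision-help.discngine.com, each pointing at the same load balancer address.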

Additional information

Infrastructure monitoring and log collection

This Partner Solution does not include infrastructure monitoring or log collection. 3decision updates are provided as Helm chart updates. Although these updates are free, you must apply Helm upgrades manually to the deployed EKS cluster.

Data administration

In 3decision, the data owner (the user who uploads the structure to 3decision or is granted ownership by another user) manages data administration.

User access management

Configure user access management in the authentication provider (Okta or Azure) by granting access to the OpenID Connect application. For more information, refer to Authentication, earlier in this guide. By default, the data access policy in 3decision denies access to private data, so users can access only public data.

Operations and security

By default, Amazon RDS and Amazon EBS back up databases and volumes, respectively. You can create a new 3decision environment from backups using the solution’s CloudFormation template.

You do not need to manage patching of AWS managed services. You must maintain security patching of the optional bastion host if you choose to deploy it.

Consult your Discngine representative for information about available deployment and maintenance services and upload requirements.

3decision synchronizes with public structures made available by the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB). Data synchronization uses the rsync protocol. 3decision calls the following domains:

  • rsync.ebi.ac.uk on port 873.

  • rsync.wwpdb.org on ports 873 and 33444.
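If synchronization fails, outbound access to these endpoints is a reasonable first thing to check. The sketch below attempts a TCP connection to each host and port listed above and returns the ones that cannot be reached. The function name and the 5-second timeout are assumptions, and the check only confirms that a TCP connection can be opened; it does not exercise the rsync protocol itself.

```python
import socket

# Hosts and ports taken from the list above.
RSYNC_ENDPOINTS = [
    ("rsync.ebi.ac.uk", 873),
    ("rsync.wwpdb.org", 873),
    ("rsync.wwpdb.org", 33444),
]


def unreachable_endpoints(endpoints=RSYNC_ENDPOINTS, timeout=5.0,
                          connect=socket.create_connection):
    """Return the (host, port) pairs to which a TCP connection cannot be opened."""
    failed = []
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:  # covers DNS failures, refusals, and timeouts
            failed.append((host, port))
        else:
            conn.close()
    return failed
```

Run the check from a host that shares the network path of the Kubernetes nodes, for example an instance in one of the private subnets, so that the result reflects the NAT gateway and security group configuration that 3decision uses.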

This Partner Solution supports updating 3decision from the command line. Contact Discngine for 3decision update commands and instructions. For more information about 3decision SaaS with continuous deployment, contact your Discngine representative.

Testing and verification

This Partner Solution does not currently support automated testing. Each 3decision microservice provides liveness and readiness endpoints, but you must configure the monitoring probes yourself.

Database size and instance type

The recommended database instance type for up to 20 concurrent users is t3.xlarge. You can choose the database instance type during deployment. The deployed Amazon RDS for Oracle database storage is 1 TB (extensible to 3 TB). The overall Amazon EBS storage for the solution is about 1.2 TB. For more information, refer to Amazon Elastic Block Store (EBS).

Uploading large-scale datasets like AlphaFold may require up to 1 TB additional Amazon EBS storage and up to 1 TB additional Amazon RDS storage. A typical cryo-electron microscopy (cryo-EM) entry of 3.0 Å resolution is 1–2 MB, with an associated Medical Research Council (MRC) file of 30–150 MB. MRC files can reach over 1 GB depending on the number of protein chains. A typical X-ray file is between 100 KB and 1 MB, with associated file sizes depending on the amount of customer data included.
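The file sizes above allow a rough, back-of-the-envelope estimate of the extra storage that a planned upload needs. The following sketch multiplies entry counts by the upper end of each typical range (2 MB per cryo-EM entry, 150 MB per MRC file, and 1 MB per X-ray file). The function name and the choice of upper bounds as defaults are assumptions; the estimate ignores MRC files above the typical range and any additional customer data, so treat the result as a planning aid rather than a guarantee.

```python
def estimate_storage_gb(cryo_em_entries, xray_entries,
                        entry_mb=2.0, mrc_mb=150.0, xray_mb=1.0):
    """Estimate storage in GB from entry counts and per-file sizes in MB."""
    # Each cryo-EM entry is counted together with its associated MRC file.
    cryo_em_total_mb = cryo_em_entries * (entry_mb + mrc_mb)
    xray_total_mb = xray_entries * xray_mb
    return (cryo_em_total_mb + xray_total_mb) / 1000.0
```

For example, 1,000 cryo-EM entries and 10,000 X-ray files come to 1,000 × 152 MB + 10,000 × 1 MB = 162,000 MB, or about 162 GB.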

For AWS Partner Solution costs and licenses information, refer to the AWS Partner Solution General Information Guide or contact your Discngine sales representative.

Kubernetes nodes

The minimal configuration is three Amazon EKS nodes, one per Availability Zone. The recommended EC2 instance type for EKS worker nodes is t3.xlarge.

Troubleshooting

To troubleshoot common Partner Solution issues, refer to the AWS Partner Solution General Content Guide or the Troubleshooting CloudFormation page in the AWS documentation.

Customer responsibility

After you deploy a Partner Solution, confirm that your resources and services are updated and configured—including any required patches—to meet your security and other needs. For more information, refer to the Shared Responsibility Model.

Feedback

To submit feature ideas and report bugs, use the Issues section of the GitHub repository for this Partner Solution. To submit code, refer to the Partner Solution Contributor’s Guide. To submit feedback on this deployment guide, use the following GitHub links:

Notices

This document is provided for informational purposes only. It represents current AWS product offerings and practices as of the date of issue of this document, which are subject to change without notice. Customers are responsible for making their own independent assessment of the information in this document and any use of AWS products or services, each of which is provided "as is" without warranty of any kind, whether expressed or implied. This document does not create any warranties, representations, contractual commitments, conditions, or assurances from AWS, its affiliates, suppliers, or licensors. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

The software included with this paper is licensed under the Apache License, version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at https://aws.amazon.com/apache2.0/ or in the accompanying "license" file. This code is distributed on an "as is" basis, without warranties or conditions of any kind, either expressed or implied. Refer to the License for specific language governing permissions and limitations.