Integrating Megaport with Snowflake on AWS
You can use Megaport to create an AWS Direct Connect Layer 2 connection between your on-premises or colocation-based infrastructure and your Snowflake environment on AWS.
Before you begin, ensure that you have created a Port. After you create the Port, you can connect a Virtual Cross Connect (VXC) from the Port to the virtual gateway associated with the AWS VPC infrastructure. A VXC is a point-to-point Ethernet connection between an A-End (your Port) and a B-End (in this case, your AWS instance).
If you aren’t a Megaport customer, you can create a 1 Gbps, 10 Gbps, or 100 Gbps Port in one of our global data centers, or Points of Presence (PoPs). If your company isn’t located in one of our PoPs, you can procure a last-mile circuit to one of the sites to connect to Megaport. Contact Megaport for more information.
If you require a Port in a different location to physically separate this solution from other existing traffic traversing your Ports, we recommend that you create a new one before proceeding.
This figure shows a high-level topology diagram of a solution integrating Megaport with Snowflake Data Warehousing on AWS using a single connection.
Setting up your Snowflake environment
To set up your Snowflake environment, you will:
- Log in to Snowflake
- Create Snowflake objects
- Stage the data files
- Copy data into the target table
- Query the loaded data
For details on setting up Snowflake in an AWS environment, see Snowflake Prerequisites.
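The setup steps above can be sketched as an ordered list of SQL statements. This is a minimal illustration, not the procedure from the Snowflake documentation: the function name, the two-column table layout, and the example database, table, and file names are all hypothetical. The statements would be run after logging in through a client that supports PUT, such as SnowSQL or a Snowflake driver.

```python
def snowflake_setup_statements(database, table, local_file):
    """Return the SQL statements for the setup steps, in order.

    All names and the table layout are hypothetical placeholders.
    """
    return [
        # Create Snowflake objects: a database and a target table.
        f"CREATE DATABASE IF NOT EXISTS {database}",
        f"USE DATABASE {database}",
        f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER, name STRING)",
        # Stage the data file in the table's own stage (@%table).
        f"PUT file://{local_file} @%{table}",
        # Copy the staged data into the target table.
        f"COPY INTO {table} FROM @%{table} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)",
        # Query the loaded data.
        f"SELECT * FROM {table} LIMIT 10",
    ]


if __name__ == "__main__":
    for statement in snowflake_setup_statements(
        "demo_db", "customers", "/tmp/customers.csv"
    ):
        print(statement + ";")
```

Keeping the statements in a list makes the order of the steps explicit: the objects must exist before the file is staged, and the file must be staged before it is copied.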
To keep data transfer between your Snowflake solution and your VPC secure, you can use AWS PrivateLink to connect the two environments. PrivateLink is an AWS service that provides direct, private connectivity between AWS VPCs. Your data stays within the AWS infrastructure and is not exposed to the internet, which reduces the security risks of sending data over the public internet.
Enabling AWS PrivateLink can take up to two business days. For more information on enabling AWS PrivateLink for your Snowflake on AWS solution, see AWS PrivateLink & Snowflake.
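Once PrivateLink is enabled, one way to sanity-check the setup is to confirm that your Snowflake PrivateLink hostname resolves to private addresses only. The sketch below assumes that DNS for the PrivateLink hostname is already configured in your environment and that the endpoint uses private address space; the function names are hypothetical, and the hostname you pass in is your own.

```python
import ipaddress
import socket


def is_private_address(address):
    """Return True if the IP address is in a private range."""
    return ipaddress.ip_address(address).is_private


def resolves_privately(hostname, port=443):
    """Return True if every address the hostname resolves to is private.

    Raises socket.gaierror if the hostname cannot be resolved at all.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_private_address(a) for a in addresses)
```

A result of False suggests that traffic to that hostname would leave the private path, for example because the DNS records still point at public endpoints.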
You can use the Megaport Portal to create the VXC to the Snowflake on AWS environment.
In the Megaport Portal, go to the Services page and select the Port you want to use.
If you haven’t already created a Port, see Creating a Port.
Add an AWS connection for the Port.
If this is the first connection for the Port, click the AWS tile. The tile is a shortcut to the configuration page. Alternatively, click +Connection, click Cloud, and click AWS.
For AWS Connection Type, click Hosted VIF or Hosted Connection and click Next.
For this example, we will click Hosted Connection.
Next, you’ll create a new VXC. In the Select Destination Port list, select the AWS region and the interconnection point for your connection and click Next.
You can use the Country filter to narrow the selection.
Specify these connection details:
- Connection Name – The name of your VXC to be shown in the Megaport Portal.
- Service Level Reference (optional) – Specify a unique identifying number for the VXC to be used for billing purposes, such as a cost center number or a unique customer ID. The service level reference number appears for each service under the Product section of the invoice. You can also edit this field for an existing service.
- Rate Limit – The speed of your connection in Mbps. You must choose from the provided bandwidth options (50 Mbps to 10 Gbps). The sum of all hosted virtual VXCs to a service can exceed the Port capacity (1, 10, or 100 Gbps); however, the total aggregate will never burst beyond the Port capacity.
- Preferred A-End VLAN – Optionally, specify an unused VLAN ID for this connection. This VLAN ID must be a unique ID on this Port and can range from 2 to 4093. If you specify a VLAN ID that is already in use, the system displays the next available VLAN number. The VLAN ID must be unique to proceed with the order. If you don’t specify a value, Megaport will assign one.
Alternatively, you can click Untag to remove the VLAN tagging for this connection. The untagged option limits you to only one VXC deployed on this Port.
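The VLAN and rate limit rules above can be illustrated with a short sketch. It assumes that "next available" means the first unused ID at or above the preferred one; how the Megaport Portal actually chooses is not described here. All function names are hypothetical.

```python
VLAN_MIN, VLAN_MAX = 2, 4093  # valid A-End VLAN range stated above


def next_available_vlan(preferred, in_use):
    """Return the preferred VLAN ID if unused, else the next unused ID.

    Assumption: the search moves upward from the preferred ID.
    Returns None if no unused ID remains in the range.
    """
    if not VLAN_MIN <= preferred <= VLAN_MAX:
        raise ValueError(f"VLAN ID must be between {VLAN_MIN} and {VLAN_MAX}")
    for vlan in range(preferred, VLAN_MAX + 1):
        if vlan not in in_use:
            return vlan
    return None


def oversubscription_ratio(vxc_rate_limits_mbps, port_speed_mbps):
    """Sum of VXC rate limits relative to Port capacity (may exceed 1.0)."""
    return sum(vxc_rate_limits_mbps) / port_speed_mbps


def max_aggregate_throughput_mbps(vxc_rate_limits_mbps, port_speed_mbps):
    """Aggregate traffic is capped by the Port capacity."""
    return min(sum(vxc_rate_limits_mbps), port_speed_mbps)
```

For example, three 500 Mbps VXCs on a 1 Gbps Port give a ratio of 1.5, but the combined traffic still cannot exceed 1000 Mbps.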
Specify the cloud details:
- AWS Connection Name – This is a text field and will be the name of your virtual interface that appears in the AWS console. For easy mapping, use the same name for this field as you did for the VXC name on the previous screen.
- AWS Account ID – This is the ID of the account you want to connect. You can find this value in the Account Settings section of your AWS console.
Review the connection details and click Add VXC.
Click Order.
Click Order Now.
Your work in the Megaport Portal is complete. Next, you will connect the new VXC to your AWS environment.
- After approximately 2 minutes, log in to your AWS account. The VXC you created appears in the AWS Direct Connect console under Connections.
- Click Create connection.
You will then need to connect to a Direct Connect Gateway. For details on connecting to various types of gateways, see Working with Direct Connect gateways - AWS Direct Connect.
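The AWS-side step can also be scripted. The sketch below assumes the hosted connection is waiting in the `ordering` state and is accepted with the Direct Connect `confirm_connection` call; it is an illustration of that API flow, not the console procedure described above. The helper names are hypothetical, and `dx_client` is expected to behave like a boto3 Direct Connect client.

```python
def pending_hosted_connections(dx_client, connection_name):
    """List hosted connections with the given name that await acceptance."""
    response = dx_client.describe_connections()
    return [
        conn
        for conn in response["connections"]
        if conn["connectionName"] == connection_name
        and conn["connectionState"] == "ordering"
    ]


def accept_hosted_connections(dx_client, connection_name):
    """Confirm each pending hosted connection; return the confirmed IDs."""
    accepted = []
    for conn in pending_hosted_connections(dx_client, connection_name):
        dx_client.confirm_connection(connectionId=conn["connectionId"])
        accepted.append(conn["connectionId"])
    return accepted
```

With boto3 installed and AWS credentials configured, a client could be created with `boto3.client("directconnect")` in the region of your interconnection point and passed in as `dx_client`. The connection name is the AWS Connection Name you entered in the Megaport Portal.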
To achieve redundancy for the connectivity portion of this solution, establish an additional VXC to a diverse edge router within the AWS environment. Follow the Integrating Megaport with Snowflake Data Warehousing on AWS procedure from step 3, and choose the alternate Diversity Zone (identified by the blue circle icon) when choosing the AWS region.
The Diversity Zones identified in the Megaport Portal only refer to AWS edge router locations, and not to the Availability Zones within the AWS infrastructure.
Complete the procedure, including the required steps within the AWS console.
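When planning the second VXC, it can help to record which Port and Diversity Zone each VXC uses and check what level of redundancy the pair provides. The sketch below is a hypothetical planning aid: the `Vxc` record, its field names, and the zone labels are assumptions for illustration and are not part of any Megaport API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vxc:
    name: str
    port: str            # A-End Port the VXC is attached to
    diversity_zone: str  # zone label of the AWS edge router (hypothetical values)


def redundancy_level(primary, secondary):
    """Describe the redundancy a pair of VXCs provides."""
    if primary.diversity_zone == secondary.diversity_zone:
        return "none: both VXCs terminate in the same Diversity Zone"
    if primary.port == secondary.port:
        return "diverse AWS edge routers, shared Port"
    return "diverse AWS edge routers and separate Ports"


if __name__ == "__main__":
    a = Vxc("snowflake-primary", "port-1", "red")
    b = Vxc("snowflake-secondary", "port-2", "blue")
    print(redundancy_level(a, b))
```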
This figure shows a high-level topology diagram of a solution integrating Megaport with diverse connections into the Snowflake Data Warehousing on AWS infrastructure.
For additional physical redundancy for the connectivity, you can implement the VXCs on separate Ports.
Snowflake is a Software-as-a-Service (SaaS) data warehouse service that is purpose-built for the cloud. The infrastructure is built within the AWS cloud and allows for fast and simple implementation and integration with your AWS VPC environment and on-premises infrastructure. In a Snowflake on AWS environment, you can take advantage of benefits such as elasticity and scalability while only paying for the resources that you use. It provides metadata management, data security, and an ANSI-compliant engine on a resilient and redundant platform that allows you to store, query, and analyze all of your data in one place. Snowflake is available within the AWS Marketplace; it competes with services that are offered directly by AWS, but can offer more functionality at lower costs. We recommend that you review available options to determine which is best for your business requirements.
For more information on Snowflake, see these additional links:
- The Modern Cloud Data Platform Built for Any Cloud
- AWS PrivateLink for Snowflake: No Internet Required
- AWS Marketplace: Snowflake On Demand - Premier
- Data Warehouse Architecture