Loading Data from Cloud Platforms into Snowflake: Part 1 - Stages

In our last blog we covered loading local data files into Snowflake via the GUI. In this blog we're going to get a little more advanced and look at how to bring in data from external cloud storage services, of which Snowflake supports three: Amazon S3, Microsoft Azure, and Google Cloud Storage.

However, in order to load data from these external cloud storage platforms, Snowflake requires something called a stage. In "Snowflake speak", a stage is simply the location of the data files.

To copy files into Snowflake from an external stage, such as Amazon S3, you will need to specify a named stage. This named stage acts as a pointer to the URL of the data files in the external cloud storage, while also allowing you to configure settings such as authentication, file format, directory tables, and encryption.


How to Create a Stage

A stage is created using the base command below:

create or alter stage STAGE_NAME

This command creates the stage if it doesn't already exist; if it does exist, it alters the existing stage to match the new definition.

However, several additional arguments are typically required to set up the stage properly. The exact syntax and format will vary based on which platform you are connecting to, but below is an example of how you might structure the command:

create or replace stage STAGE_NAME
  url = 'INSERT_URL_HERE'
  credentials = ( ... )  -- platform-specific, see the examples below
  directory = (enable = true)
  file_format = FILE_FORMAT;

Explaining the additional arguments

Essential

url = ''

  • This is an essential argument which must be included when creating your external stage. It specifies the location of the external data for the stage.

file_format =

  • This is an essential argument which also must be included when creating your external stage. It specifies the structure of the data files in the external stage and allows Snowflake to properly interpret the data. When specifying it, you can either reference an existing named file format object or define the format inline, as shown below.
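
For example, the same stage could be written either way below. This is a minimal sketch; STAGE_NAME, the URL, and MY_CSV_FORMAT are placeholder names:

-- option 1: reference an existing named file format object
create or replace stage STAGE_NAME
  url = 'INSERT_URL_HERE'
  file_format = MY_CSV_FORMAT;

-- option 2: define the format inline
create or replace stage STAGE_NAME
  url = 'INSERT_URL_HERE'
  file_format = (type = 'CSV' field_delimiter = ',' skip_header = 1);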

Useful/Helpful

directory = (enable = true)

  • This argument is set to true to enable a directory table to be created. A directory table acts as a virtual representation of the file system structure of your stage, through which you can query metadata about your files, such as file names and sizes - as in the sketch below.
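
As a rough sketch of how you might use it (reusing the placeholder stage name STAGE_NAME from above):

-- query file metadata (path, size, last modified) from the directory table
select relative_path, size, last_modified
from directory(@STAGE_NAME);

-- refresh the directory table so it reflects the latest files in cloud storage
alter stage STAGE_NAME refresh;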

credentials =

  • This argument provides the authentication details Snowflake needs to access and read from the external cloud storage service. These can take the form of an access key and secret key, among others.

...with Amazon S3

create or replace stage STAGE_NAME
  url = 's3://bucket_name/path/'
  credentials = ( AWS_KEY_ID = 'XXXX' AWS_SECRET_KEY = 'XXX' )
    -- OR -- credentials = ( AWS_ROLE = 'string' )
  directory = (enable = true)
  file_format = (type = 'JSON');
  • URL | The URL follows the format of 's3://bucket_name/path/', swapping out bucket_name and path for the details of your chosen S3 bucket.
  • Credentials | For the credentials argument you need to use either IAM user credentials (a key ID and secret key) or an IAM role - as shown in the example above.
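
Once the stage is created, a quick way to confirm Snowflake can actually reach the bucket is to list its contents (again using the placeholder stage name):

-- list the files Snowflake can see in the external stage
list @STAGE_NAME;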

...with Microsoft Azure

create or replace stage STAGE_NAME
  url = 'azure://account.blob.core.windows.net/container[/path/]'
  credentials = ( AZURE_SAS_TOKEN = 'string' )
  encryption = ( TYPE = 'AZURE_CSE' MASTER_KEY = 'string' | TYPE = 'NONE' )
  directory = (enable = true)
  file_format = FILE_FORMAT;
  • URL | The URL follows the format of 'azure://account.blob.core.windows.net/container[/path/]', swapping out account, container, and the optional path for the details of your chosen Azure storage account and container.
  • Credentials | I generated a SAS (shared access signature) token to allow access to my Azure container, and passed it in via the AZURE_SAS_TOKEN parameter shown above.
  • Encryption | Specifies the encryption type used. Possible values are:
    • AZURE_CSE: Client-side encryption; requires a MASTER_KEY value.
    • NONE: No encryption.
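
After creating the stage, you can double-check how it was configured (secret values such as the SAS token should appear masked in the output):

-- show the stage's url, file format, and other properties
describe stage STAGE_NAME;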

...with Google Cloud Storage

create or replace stage STAGE_NAME
  url = 'gcs://bucket_name/path/'
  storage_integration = INTEGRATION_NAME
  directory = (enable = true)
  file_format = FILE_FORMAT;
  • URL | The URL follows the format of gcs://bucket_name[/path/] - including the path portion if you want to direct to a specific folder.
  • Credentials | Snowflake doesn't accept GCP keys directly in the credentials argument. Instead, you grant access to your bucket through a storage integration, which the stage references with storage_integration = INTEGRATION_NAME - see the sketch after this list.
  • Encryption | Specifies the encryption type used. Possible values are:
    • GCS_SSE_KMS: Server-side encryption; requires a KMS_KEY_ID value.
    • NONE: No encryption.
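
A minimal sketch of that storage integration setup, assuming GCS_INT, bucket_name, and path are placeholders of your choosing:

-- the integration delegates authentication to a Snowflake-managed GCP
-- service account (run describe integration GCS_INT to find the service
-- account you need to authorise in your GCP project)
create storage integration GCS_INT
  type = external_stage
  storage_provider = 'GCS'
  enabled = true
  storage_allowed_locations = ('gcs://bucket_name/path/');

-- the stage then references the integration instead of raw credentials
create or replace stage STAGE_NAME
  url = 'gcs://bucket_name/path/'
  storage_integration = GCS_INT
  directory = (enable = true)
  file_format = FILE_FORMAT;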

Creating external stages in Snowflake is an integral step in setting up data pipelines from external cloud storage platforms like Amazon S3, Microsoft Azure, and Google Cloud Storage. Defining a stage not only specifies where to find your data files but also lets you configure authentication, file formats, and extras like directory tables and encryption - all of which allow Snowflake to easily access and pull in your data.

Stay tuned for Part 2, where we will learn how to copy data from these external stages into our Snowflake tables!

Author:
Trea McElhone