Integration: Snowflake

Configuration & Authentication

To configure this module, list every Snowflake account you want to collect metadata from, using its format 2 account identifier (Snowflake's account locator format). Every table within each listed account will be ingested as a dataset within the Registry.

Each account block must also specify the authentication type and its credentials. Of Snowflake's available authentication options, password-based and key pair authentication are currently supported. Set the authenticator key to 'SNOWFLAKE' (password) or 'SNOWFLAKE_JWT' (key pair) and add the details of the Snowflake user that Backstage will connect as.

[Screenshot: Snowflake Source password auth]

[Screenshot: Snowflake Source keyfile auth]
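As a rough illustration of the rules above, the following Python sketch checks that an account block pairs each authenticator value with the credential it needs. Only the authenticator values 'SNOWFLAKE' and 'SNOWFLAKE_JWT' come from this page; the `AccountConfig` class and its field names (`account`, `username`, `password`, `private_key_path`) are hypothetical and are not the module's real configuration schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# Authenticator values named in this page.
PASSWORD_AUTH = "SNOWFLAKE"
KEY_PAIR_AUTH = "SNOWFLAKE_JWT"


@dataclass
class AccountConfig:
    """Hypothetical shape of one account block (field names are assumptions)."""
    account: str                             # format 2 account identifier
    username: str
    authenticator: str
    password: Optional[str] = None           # expected with SNOWFLAKE
    private_key_path: Optional[str] = None   # expected with SNOWFLAKE_JWT


def validate_account(cfg: AccountConfig) -> List[str]:
    """Return a list of problems; an empty list means the block looks usable."""
    problems: List[str] = []
    if cfg.authenticator == PASSWORD_AUTH:
        if not cfg.password:
            problems.append("password auth requires a password")
    elif cfg.authenticator == KEY_PAIR_AUTH:
        if not cfg.private_key_path:
            problems.append("key pair auth requires a private key")
    else:
        problems.append(f"unsupported authenticator: {cfg.authenticator!r}")
    return problems
```

Calling `validate_account` on each account block before connecting would surface a mismatched authenticator early, instead of as a failed login.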

Roles

Users in Snowflake can have multiple roles, but the user's default role is the one used for all queries to Snowflake. To override it, add the optional role key to the config underneath the relevant account. Note that the role selection affects which tables are ingested: tables that are only accessible to roles higher in the role hierarchy than the one used in the connection won't appear.

[Screenshot: Snowflake Role]

In the example config above, the role is explicitly set to PUBLIC. This means any tables visible only to the ACCOUNTADMIN role would not be ingested into the Registry.

Naming

The naming structure for Datasets created from Snowflake is as follows: [database].[schema].[table_name].
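The naming pattern can be written as a one-line helper. This is a minimal sketch: the function name `dataset_name` is hypothetical, and the example values are invented for illustration.

```python
def dataset_name(database: str, schema: str, table_name: str) -> str:
    """Join the three parts in the order [database].[schema].[table_name]."""
    return f"{database}.{schema}.{table_name}"


# Invented example values; real names come from the Snowflake account.
print(dataset_name("ANALYTICS", "SALES", "ORDERS"))
```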

Tags & Labels

Snowflake Tags are pieces of metadata that can be added to various resources. Since they are key-value pairs in the source system, the Snowflake connector for Registry converts them to labels (also a key-value mapping) within the Software Catalog. Tags that can't be converted into Catalog labels (because they're too long, contain invalid UTF-8 characters, etc.) are omitted from the Dataset, and a WARN log is emitted for each.
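The conversion described above might look roughly like the following Python sketch. The length limit and character pattern are assumptions chosen for illustration, not the connector's actual validation rules, and the function and logger names are hypothetical.

```python
import logging
import re
from typing import Dict

logger = logging.getLogger("snowflake_connector_sketch")  # hypothetical logger name

# Assumed limits for illustration only; the real Catalog rules may differ.
MAX_LEN = 63
VALID = re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?")


def is_valid_label_part(text: str) -> bool:
    """True if the text fits the assumed length and character rules."""
    return len(text) <= MAX_LEN and VALID.fullmatch(text) is not None


def tags_to_labels(tags: Dict[str, str]) -> Dict[str, str]:
    """Keep tags that convert cleanly; log a warning for each omitted tag."""
    labels: Dict[str, str] = {}
    for key, value in tags.items():
        if is_valid_label_part(key) and is_valid_label_part(value):
            labels[key] = value
        else:
            logger.warning("Omitting tag %r: cannot be converted to a label", key)
    return labels
```

Omitting a single bad tag and logging it, as the page describes, keeps one malformed tag from blocking ingestion of the whole Dataset.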

The Snowflake connector parses the tags, looking for keys named OWNER or LIFECYCLE, and uses their values to populate the owner and lifecycle of the dataset. If no OWNER key is found among the tags, the connector uses the table's ownership metadata (usually the principal role on the table). Finally, it falls back to the values in the entityDefaults section of the config for both owner and lifecycle.
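The fallback order above can be sketched as a small Python function. The function name, its parameters, and the `owner` and `lifecycle` keys inside `entity_defaults` are assumptions made for illustration; only the OWNER and LIFECYCLE tag keys and the order of precedence come from this page.

```python
from typing import Dict, Optional, Tuple


def resolve_owner_and_lifecycle(
    tags: Dict[str, str],
    table_owner_role: Optional[str],
    entity_defaults: Dict[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Apply the documented precedence for owner and lifecycle."""
    # Owner: OWNER tag, then table ownership metadata, then entityDefaults.
    owner = tags.get("OWNER") or table_owner_role or entity_defaults.get("owner")
    # Lifecycle: LIFECYCLE tag, then entityDefaults.
    lifecycle = tags.get("LIFECYCLE") or entity_defaults.get("lifecycle")
    return owner, lifecycle
```

This sketch matches tag keys exactly as uppercase OWNER and LIFECYCLE; whether the real connector is case-sensitive is not stated here.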

Warehouses

Certain Snowflake queries require a warehouse in order to run. Basic ingestion of tables doesn't rely on any such queries, but ingesting the tags on each table does. For this step, the default warehouse of the user associated with the account in the config is used. To override this selection, add the optional warehouse key to the relevant account. See the screenshot in the Roles section above for where to configure this.

Since warehouses are also scoped to roles, it's important to ensure the warehouse used in the connection (be it the user's default or defined via the config) can be accessed by the role used in the connection. Also note that the warehouse must be running (or have auto-resume enabled) in order for tag ingestion to occur.
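To make the role and warehouse overrides concrete, here is a minimal Python sketch of how connection options could be assembled so that Snowflake falls back to the user's defaults when a key is omitted. The function name and the option keys are hypothetical and do not reflect the module's actual internals.

```python
from typing import Dict, Optional


def connection_options(
    account: str,
    role: Optional[str] = None,
    warehouse: Optional[str] = None,
) -> Dict[str, str]:
    """Build options for one account; omitted keys mean 'use the user's default'."""
    options = {"account": account}
    if role:
        options["role"] = role            # overrides the user's default role
    if warehouse:
        options["warehouse"] = warehouse  # overrides the user's default warehouse
    return options
```

Note that nothing in this sketch checks that the chosen role can actually use the chosen warehouse; that grant still has to exist in Snowflake.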