Data Experience Quick Start

This guide covers the minimal steps needed to connect to the Data Experience. For more detail on the Data Experience and each integration, see the pages in the sidebar.

What you'll achieve

  • Ingest data warehouse tables as APIs into Portal's Software Catalog
  • Make datasets searchable alongside your software components
  • Provide ownership and lifecycle information for data governance

Prerequisites

  • Admin access to Portal's Config Manager
  • Access to create credentials in your preferred integration(s) (Redshift, BigQuery, or Snowflake)

Step 1: Enable Data Experience

  1. Go to Config Manager in Portal
  2. Under Data Experience, click the Start plugin button.
  3. Under Catalog, select the Modules tab, and start @spotify/backstage-plugin-catalog-backend-module-data-registry-provider
  4. Under Search, select the Modules tab and start @spotify/backstage-plugin-search-backend-module-data-registry

Step 2: Configure Authentication

  1. Follow Snowflake's guide to generate a key-pair for authentication. Be sure to follow the instructions to assign the public key to a Snowflake user.
  2. Navigate to Config Manager > Data Experience
  3. Expand the keys on the sidebar dataExperience > registry > integrations > snowflake
  4. Under the sources key, select the Option 2 tab and add a new item to the list
    • Enter SNOWFLAKE_JWT in the authenticator field
    • Enter the username of the Snowflake user the public key was assigned to
    • Enter the private key from the key pair generated in step 1 in the privateKey field
    • Enter the warehouse that should be used for executing queries. If omitted, the user's default warehouse will be used.
    • Enter the role that should be used for executing queries. If omitted, the user's default role will be used.
  5. Scroll to the bottom of the page and click the Save changes button
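
Before pasting values into Config Manager, it can help to sanity-check the entry you are about to add. The sketch below is a hypothetical pre-flight helper, not part of Portal or Snowflake. It assumes the field names match those listed in step 4 (authenticator, username, privateKey, warehouse, role) and that the first three are required. The user name and key shown are placeholders.

```python
# Hypothetical pre-flight check for a Snowflake source entry.
# Field names are taken from the steps above; treating the first three as
# required is an assumption, and the real form may accept further fields.
REQUIRED_FIELDS = ("authenticator", "username", "privateKey")


def check_snowflake_source(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry looks complete."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {field}")
    auth = entry.get("authenticator")
    if isinstance(auth, str) and auth.strip() and auth != "SNOWFLAKE_JWT":
        problems.append("authenticator should be SNOWFLAKE_JWT for key-pair auth")
    return problems


example = {
    "authenticator": "SNOWFLAKE_JWT",
    "username": "PORTAL_SERVICE_USER",  # placeholder user name
    "privateKey": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----",  # placeholder
    # warehouse and role omitted: the user's defaults are used
}
print(check_snowflake_source(example))
```

An empty list means nothing obvious is missing. It does not confirm that Snowflake accepts the key.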

(Optional) Step 3: Configure Ingest Schedule

Configure how often datasets are ingested from your sources to the data registry.

  1. From Config Manager > Data Experience, expand the keys on the sidebar dataExperience > registry > integrations > defaults > schedule > frequency > cron
  2. Enter a valid crontab string. This is 0 */6 * * * (every 6 hours) by default.
  3. Scroll to the bottom of the page and click the Save changes button
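
To see what a crontab string means before saving it, you can expand its minute and hour fields into the times of day it would fire. The sketch below is an illustration only, with hypothetical helper names. It handles `*`, `*/n`, ranges, and comma lists in the first two fields and ignores the day and month fields. The time zone used by Portal's scheduler is not covered here.

```python
def expand_cron_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field (e.g. '*/6' or '0,30') into the values it matches."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        stride = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = int(base)
            # 'n/step' runs from n to the top of the range; a bare 'n' is just n.
            end = high if step else start
        values.update(range(start, end + 1, stride))
    return sorted(values)


def daily_run_times(cron: str) -> list[str]:
    """List HH:MM run times for one day, using only the minute and hour fields."""
    minute, hour, *_ = cron.split()
    minutes = expand_cron_field(minute, 0, 59)
    hours = expand_cron_field(hour, 0, 23)
    return [f"{h:02d}:{m:02d}" for h in hours for m in minutes]


print(daily_run_times("0 */6 * * *"))  # ['00:00', '06:00', '12:00', '18:00']
```

The default value therefore runs four times a day, on the hour.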

(Optional) Step 4: Configure Catalog Sync Schedule

Configure how often datasets are replicated to the Software Catalog from the data registry.

  1. From Config Manager > Data Experience, expand the keys on the sidebar dataExperience > catalog > schedule > frequency > cron
  2. Enter a valid crontab string. This is 0 */6 * * * (every 6 hours) by default.
  3. Scroll to the bottom of the page and click the Save changes button

Step 5: Test & Verify

  1. Wait for the first sync to complete. When this happens depends on the schedules you configured in steps 3 and 4. You can monitor progress on the Data Overview page, accessible from Portal's navigation.
  2. Search for your datasets in Portal's search

Next Steps

Troubleshooting