AiKA Data Flow & Security
This document provides a detailed overview of how data flows through AiKA.
Key features
- Customer-controlled models: You configure and manage your own LLM provider credentials. AiKA uses only the models you specify.
- No training on customer data: Spotify does not train models on your data.
- Data stays under your control: Customer queries, responses, and internal documentation flow only between your Portal instance and the services you configure; nothing is sent to other Spotify systems.
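To make "customer-controlled" concrete, a provider configuration might look like the following sketch. Every key and value here is a hypothetical illustration, not AiKA's actual configuration schema:

```python
# Hypothetical sketch of a customer-controlled LLM configuration.
# All keys and values are illustrative assumptions, not AiKA's real schema.
llm_config = {
    "provider": "openai",             # you choose the provider...
    "model": "gpt-4o",                # ...and the model AiKA should use
    "api_key_env": "LLM_API_KEY",     # credentials stay in your own environment
    "tracing": {
        "enabled": False,             # tracing is optional
        "collector": "http://phoenix.internal:6006",  # e.g., Arize Phoenix
    },
}
```

The point of the shape above is the ownership boundary: the provider, model, credentials, and optional tracing endpoint are all values you supply, not values Spotify holds.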
 
Data Flow Architecture

The diagram above shows how AiKA processes user queries through a ReAct (Reasoning and Acting) agent loop:
- User submits a query through the AiKA UI
- AiKA Backend receives the query and enters a reasoning loop where it:
  - Sends the query with context to your configured LLM Provider (e.g., OpenAI)
  - Receives completion responses that may include instructions to call tools
  - Calls Tools (Portal Actions, GitHub MCP, etc.) as needed to gather information
  - Optionally sends traces to your tracing collector (e.g., Arize Phoenix), if configured
  - Iterates through this loop until it has sufficient information
- Final response is streamed back to the user
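The loop above can be sketched in a few lines of Python. This is a minimal illustration of a ReAct-style agent loop, not AiKA's actual implementation; the function names and message shapes are assumptions:

```python
# Minimal sketch of a ReAct-style loop (reason -> act -> observe).
# Function names and message shapes are illustrative, not AiKA's actual API.

def react_loop(query, llm, tools, max_iterations=5):
    """Alternate between LLM completions and tool calls until a final answer emerges."""
    context = [{"role": "user", "content": query}]
    for _ in range(max_iterations):
        reply = llm(context)                 # completion from your configured provider
        if reply.get("tool_call") is None:   # no tool requested: this is the answer
            return reply["content"]
        call = reply["tool_call"]            # e.g., portal search, GitHub MCP
        observation = tools[call["name"]](**call["args"])
        context.append({"role": "tool", "content": observation})
    return "Stopped: iteration budget exhausted without a final answer."
```

A stubbed provider that first requests a tool call and then returns a final answer would drive exactly one pass through the loop, mirroring the steps listed above.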
 
All data remains within your Portal instance, your chosen LLM provider, and your optional tracing collector; no data flows to Spotify systems.
Frequently Asked Questions
Does Spotify train models on our data?
No. Spotify does not train any models. You configure which LLM provider and model to use. User queries and context are sent to your chosen provider's infrastructure for processing, subject to that provider's data handling policies.
Where does our data go?
Your queries and documentation only flow between:
- Your Portal instance (AiKA UI and Backend)
- Your Portal's search index and other configured tools
- Your configured LLM provider
- Your configured tools (e.g., GitHub MCP)
- Your tracing collector (e.g., Arize Phoenix), if configured

No data flows to Spotify's other systems.