Kubernetes

Kubernetes integration for Soundcheck.

This integration collects facts from your Kubernetes clusters through the Kubernetes API. It collects these facts in much the same way as the Backstage Kubernetes plugin.

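The Kubernetes integration currently supports the following facts:

  • k8s_services
  • k8s_pods
  • k8s_deployments
  • k8s_hpas
  • k8s_stateful_sets
  • k8s_pod_disruption_budgets
  • k8s_ingresses

See the facts section for more details.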

Prerequisites

Configure Kubernetes Integration in Backstage

Configure Kubernetes according to the Backstage Kubernetes plugin docs.

Note: this integration supports only part of what the Kubernetes plugin offers:

  1. Only the catalog and config clusterLocatorMethods are supported.
  2. Custom resources are currently not supported.
  3. apiVersionOverrides are currently not supported.
  4. Only certain server-side authentication strategies are supported. See the Authentication section for details.

Example:

kubernetes:
  serviceLocatorMethod:
    type: 'singleTenant'
  clusterLocatorMethods:
    - type: catalog
    - type: 'config'
      clusters:
        - url: https://127.0.0.1:59974
          name: minikube
          authProvider: 'serviceAccount'
          skipTLSVerify: true
          skipMetricsLookup: true
          serviceAccountToken: ${K8S_MINIKUBE_TOKEN}

Ensure your components are annotated correctly. The fact collector will only collect facts from entities that have the backstage.io/kubernetes-id annotation.

Labels in Kubernetes

By default, the label selector used to query the Kubernetes cluster is app={backstage.io/kubernetes-id}, where the placeholder is the value of the entity's backstage.io/kubernetes-id annotation.

You can use a custom label selector by using the backstage.io/kubernetes-label-selector annotation.
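The selection rule described above can be sketched as a small function. This is only an illustration of the documented behavior, not the collector's actual code; the helper name resolveLabelSelector is hypothetical, and the choice to return undefined for entities without the backstage.io/kubernetes-id annotation is an assumption based on the statement that such entities are skipped.

```typescript
// Annotation names taken from the text above.
const KUBERNETES_ID = 'backstage.io/kubernetes-id';
const KUBERNETES_LABEL_SELECTOR = 'backstage.io/kubernetes-label-selector';

type Annotations = Record<string, string>;

// Hypothetical helper: returns the label selector for an entity, or
// undefined when the entity has no kubernetes-id annotation (assumed to
// mean that no facts are collected for it).
function resolveLabelSelector(annotations: Annotations): string | undefined {
  const id = annotations[KUBERNETES_ID];
  if (!id) {
    return undefined;
  }
  // A custom selector, when present, replaces the default app=<id> selector.
  const custom = annotations[KUBERNETES_LABEL_SELECTOR];
  return custom ? custom : `app=${id}`;
}

const selector = resolveLabelSelector({
  'backstage.io/kubernetes-id': 'soundcheck-test-service',
});
console.log(selector); // app=soundcheck-test-service
```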

Authentication

Configure your authentication according to the Kubernetes plugin docs.

The fact collector only supports the following server-side auth strategies:

  • aws
  • azure
  • googleServiceAccount
  • serviceAccount

Note: the google and aks client-side strategies are mapped to googleServiceAccount and azure, respectively.
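As a rough illustration of the mapping in the note above, the following sketch normalizes an authProvider value to one of the supported server-side strategies. The function name toServerSideStrategy is hypothetical, and returning undefined for unsupported values is an assumption; the collector's real handling of unsupported strategies is not described here.

```typescript
// Server-side strategies listed in this section.
const SUPPORTED = ['aws', 'azure', 'googleServiceAccount', 'serviceAccount'] as const;
type ServerSideStrategy = (typeof SUPPORTED)[number];

// Client-side strategies and the server-side strategy each is mapped to.
const CLIENT_SIDE_MAPPING = new Map<string, ServerSideStrategy>([
  ['google', 'googleServiceAccount'],
  ['aks', 'azure'],
]);

// Hypothetical helper: returns the strategy the collector would use, or
// undefined when the configured authProvider is not supported.
function toServerSideStrategy(authProvider: string): ServerSideStrategy | undefined {
  if ((SUPPORTED as readonly string[]).includes(authProvider)) {
    return authProvider as ServerSideStrategy;
  }
  return CLIENT_SIDE_MAPPING.get(authProvider);
}

console.log(toServerSideStrategy('google')); // googleServiceAccount
console.log(toServerSideStrategy('aks')); // azure
```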

Add the Kubernetes Fact Collector to Soundcheck

First, add the @spotify/backstage-plugin-soundcheck-backend-module-kubernetes package:

yarn workspace backend add @spotify/backstage-plugin-soundcheck-backend-module-kubernetes

Legacy Backend

If you are still using the legacy backend, in packages/backend/src/plugins/soundcheck.ts, add the KubernetesFactCollector:

packages/backend/src/plugins/soundcheck.ts
import { SoundcheckBuilder } from '@spotify/backstage-plugin-soundcheck-backend';
import { Router } from 'express';
import { PluginEnvironment } from '../types';
import { KubernetesFactCollector } from '@spotify/backstage-plugin-soundcheck-backend-module-kubernetes';

export default async function createPlugin(
  env: PluginEnvironment,
): Promise<Router> {
  return SoundcheckBuilder.create({ ...env })
    .addFactCollectors(
      KubernetesFactCollector.create({ ...env }),
    )
    .build();
}

New Backend System

If you are using the New Backend System, you can just add the following:

packages/backend/src/index.ts
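import { createBackend } from '@backstage/backend-defaults';

const backend = createBackend();

// Register the Soundcheck backend and the Kubernetes fact collector module.
backend.add(import('@spotify/backstage-plugin-soundcheck-backend'));
backend.add(
  import('@spotify/backstage-plugin-soundcheck-backend-module-kubernetes'),
);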
// ...

backend.start();

Consult the Soundcheck Backend documentation for additional details on setting up the Soundcheck backend.

Plugin Configuration

The Kubernetes Fact Collector can be configured via YAML or the No-Code UI. If you configure it via both, the configurations will be merged. To avoid confusing merge results, choose a single source for the Fact Collector configuration (either the No-Code UI or YAML).

No-Code UI Configuration Option

  1. Make sure the prerequisite Configure Kubernetes integration in Backstage is completed and Kubernetes instance details are configured.

  2. To enable the Kubernetes Integration, go to Soundcheck > Integrations > Kubernetes and click the Configure button. To learn more about the No-Code UI config, see Configuring a fact collector (integration) via the no-code UI.

YAML Configuration Option

  1. Create a kubernetes-facts-collectors.yaml file in the root of your Backstage repository to use for configuration.

The following example collects service, pod, and deployment facts at the top of every hour:

kubernetes-facts-collectors.yaml
---
frequency:
  cron: '0 * * * *'
collects:
  - type: k8s_services
  - type: k8s_pods
  - type: k8s_deployments

Note: this file will be loaded at runtime along with the rest of your Backstage configuration files. Therefore, make sure that it's available in deployed environments in the same way as your app-config.yaml files are.

  2. Add a soundcheck collectors field to app-config.yaml and reference the newly created kubernetes-facts-collectors.yaml:

    app-config.yaml
    soundcheck:
      collectors:
        kubernetes:
          $include: ./kubernetes-facts-collectors.yaml

Rate Limiting (Optional)

This fact collector can be rate limited in Soundcheck using the following configuration:

soundcheck:
  job:
    workers:
      kubernetes:
        limiter:
          max: 1000
          duration: 60000

The configuration above limits the collection rate to 1000 requests per minute (at most max requests per duration, where duration is in milliseconds). It is only an example; adjust the numbers as needed for your setup.
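To make the meaning of max and duration concrete, here is a minimal fixed-window limiter sketch. It is not Soundcheck's implementation, which may use a different algorithm; the class name FixedWindowLimiter and its tryAcquire method are hypothetical, and the clock value is passed in explicitly so the behavior is easy to follow.

```typescript
interface LimiterConfig {
  max: number; // maximum number of requests allowed per window
  duration: number; // window length in milliseconds
}

// Hypothetical fixed-window limiter, for illustration only.
class FixedWindowLimiter {
  private readonly config: LimiterConfig;
  private windowStart = 0;
  private used = 0;

  constructor(config: LimiterConfig) {
    this.config = config;
  }

  // Returns true if a request may run at time `now` (in ms), false if it
  // has to wait for the next window.
  tryAcquire(now: number): boolean {
    if (now - this.windowStart >= this.config.duration) {
      // A new window begins: reset the counter.
      this.windowStart = now;
      this.used = 0;
    }
    if (this.used < this.config.max) {
      this.used += 1;
      return true;
    }
    return false;
  }
}

// Same numbers as the configuration above: 1000 requests per 60000 ms.
const limiter = new FixedWindowLimiter({ max: 1000, duration: 60000 });
console.log(limiter.tryAcquire(Date.now())); // true
```

With these values, the 1001st request inside the same minute would be rejected by the sketch and could only run once the next window starts.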

Soundcheck will automatically pause and retry the fact collector if it encounters rate limit errors.

Note: Scheduled checks do NOT respect rate limits set here. Checks that utilize fact collectors do NOT need to be scheduled. Instead schedule the fact collector: see frequency.

Defining Kubernetes Fact Collector

This section describes the data shape and semantics of Kubernetes Fact Collector.

Overall Shape Of A Kubernetes Fact Collector

The following is an example of a descriptor file for a Kubernetes Fact Collector:

---
frequency:
  cron: '0 * * * *'
initialDelay:
  seconds: 30
filter:
  kind: 'Component'
cache:
  duration:
    hours: 2
collects:
  - type: k8s_services
  - type: k8s_pods
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'service'
    frequency:
      cron: '0 1/2 * * *'
  - type: k8s_deployments
  - type: k8s_hpas
  - type: k8s_stateful_sets
  - type: k8s_pod_disruption_budgets
  - type: k8s_ingresses

Below are the details for each field.

frequency [optional]

The frequency at which the collector should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration. This is the default frequency for each fact type. Example:

frequency:
  cron: '0 * * * *'

frequency:
  hours: 3

If you use a long HumanDuration, we recommend that you also add an initialDelay to ensure timely fact collection after a deployment or restart.
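The two accepted shapes can be told apart with a simple type guard. The sketch below assumes that a HumanDuration is a plain object of unit fields, as in the hours: 3 example above; the SimpleDuration type covers only fixed-length units, and all names in it are hypothetical.

```typescript
// Fixed-length subset of a duration object. Months and years are left out
// because their length in milliseconds is not constant.
interface SimpleDuration {
  weeks?: number;
  days?: number;
  hours?: number;
  minutes?: number;
  seconds?: number;
  milliseconds?: number;
}

type Frequency = { cron: string } | SimpleDuration;

// Distinguishes the cron form from the duration form.
function isCron(frequency: Frequency): frequency is { cron: string } {
  return typeof (frequency as { cron?: unknown }).cron === 'string';
}

const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;
const DAY = 24 * HOUR;
const WEEK = 7 * DAY;

// Converts a duration object to milliseconds by summing its unit fields.
function durationToMs(d: SimpleDuration): number {
  return (
    (d.weeks ?? 0) * WEEK +
    (d.days ?? 0) * DAY +
    (d.hours ?? 0) * HOUR +
    (d.minutes ?? 0) * MINUTE +
    (d.seconds ?? 0) * SECOND +
    (d.milliseconds ?? 0)
  );
}

console.log(isCron({ cron: '0 * * * *' })); // true
console.log(durationToMs({ hours: 3 })); // 10800000
```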

initialDelay [optional]

The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration. This is the default initial delay for each fact type. Example:

initialDelay:
  seconds: 30

filter [optional]

A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. This is the default filter for each fact type. Example:

filter:
  - spec.lifecycle: 'production'
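For readers unfamiliar with the Catalog API filter format: a list of filter objects matches when any one object matches, and a single object matches when all of its keys match. The sketch below illustrates that semantics only. The helper names are hypothetical, the case-insensitive comparison is an assumption, and the naive dot-path lookup does not handle keys that themselves contain dots (such as annotation names).

```typescript
type FilterValue = string | string[];
type EntityFilter = Record<string, FilterValue>;

// Naive dot-path lookup, e.g. 'spec.lifecycle'.
function getPath(entity: unknown, path: string): unknown {
  let current: unknown = entity;
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Array entries are alternatives (OR); keys within one entry must all match (AND).
function matchesFilter(entity: unknown, filter: EntityFilter | EntityFilter[]): boolean {
  const alternatives = Array.isArray(filter) ? filter : [filter];
  return alternatives.some(conditions =>
    Object.entries(conditions).every(([path, expected]) => {
      const actual = getPath(entity, path);
      if (actual === undefined || actual === null) return false;
      const allowed = Array.isArray(expected) ? expected : [expected];
      // Assumption: values are compared case-insensitively.
      return allowed.some(v => v.toLowerCase() === String(actual).toLowerCase());
    }),
  );
}

const entity = { kind: 'Component', spec: { lifecycle: 'production', type: 'service' } };
console.log(matchesFilter(entity, [{ 'spec.lifecycle': 'production' }])); // true
```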

cache [optional]

Whether the collected facts should be cached and, if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field. This is the default cache config for each fact type. Example:

cache:
  duration:
    hours: 24

google [optional]

This is used for GKE authentication. By default that auth strategy uses the application default credentials. However, you can pass in a JSON key to use instead.

google:
  jsonKey: ${GOOGLE_JSON_KEY}

collects [required]

An array describing which facts to collect and how to collect them. See below for details about the configuration of fact collection for each fact type.

  • type [required]

    The type of the collector. The following types are available:

    k8s_services
    k8s_pods
    k8s_deployments
    k8s_hpas
    k8s_stateful_sets
    k8s_pod_disruption_budgets
    k8s_ingresses
  • frequency [optional]

    The frequency at which the fact collection should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration. If provided, it overrides the default frequency provided at the top level. If not provided, it defaults to the frequency provided at the top level. If neither collector's frequency, nor default frequency is provided, the fact will only be collected on demand.

  • initialDelay [optional]

    The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration. If provided, it overrides the default initial delay provided at the top level. If not provided, it defaults to the initial delay provided at the top level. If neither collector's initial delay, nor default initial delay is provided, the fact will be collected with no initial delay.

  • filter [optional]

    A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. If provided, it overrides the default filter provided at the top level. If not provided, it defaults to the filter provided at the top level. If neither collector's filter, nor default filter is provided, the fact will be collected for all entities.

  • cache [optional]

    Whether the collected facts should be cached and, if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field. If provided, it overrides the default cache config provided at the top level. If not provided, it defaults to the cache config provided at the top level. If neither the collector's cache config nor the default cache config is provided, the fact will not be cached.
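The override rules above follow one pattern: a value set on an entry in collects wins, otherwise the top-level default applies, otherwise a built-in fallback is used. The following sketch shows that resolution under those stated rules; the type and function names are hypothetical and are not part of the Soundcheck API.

```typescript
type Schedule = { cron: string } | Record<string, number>;
type CacheConfig = boolean | { duration: Record<string, number> };

interface CollectorDefaults {
  frequency?: Schedule;
  initialDelay?: Schedule;
  filter?: unknown;
  cache?: CacheConfig;
}

interface CollectEntry extends CollectorDefaults {
  type: string;
}

interface CollectorConfig extends CollectorDefaults {
  collects: CollectEntry[];
}

// Hypothetical helper: computes the effective settings for one fact type.
// `??` is used instead of `||` so that an explicit `cache: false` on an
// entry still overrides a top-level cache config.
function resolveEntry(config: CollectorConfig, entry: CollectEntry) {
  return {
    type: entry.type,
    frequency: entry.frequency ?? config.frequency, // undefined: on demand only
    initialDelay: entry.initialDelay ?? config.initialDelay, // undefined: no delay
    filter: entry.filter ?? config.filter, // undefined: all entities
    cache: entry.cache ?? config.cache ?? false, // false: not cached
  };
}

const services: CollectEntry = { type: 'k8s_services' };
const pods: CollectEntry = {
  type: 'k8s_pods',
  filter: [{ 'spec.lifecycle': 'production', 'spec.type': 'service' }],
  frequency: { cron: '0 1/2 * * *' },
};
const exampleConfig: CollectorConfig = {
  frequency: { cron: '0 * * * *' },
  initialDelay: { seconds: 30 },
  filter: { kind: 'Component' },
  cache: { duration: { hours: 2 } },
  collects: [services, pods],
};

// k8s_services inherits every default; k8s_pods overrides filter and frequency.
console.log(resolveEntry(exampleConfig, services));
console.log(resolveEntry(exampleConfig, pods));
```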

Collecting Facts

All Kubernetes fact collectors fetch their data from the Kubernetes API.

Schema

The general type of each fact type is as follows:

{
  clusters: string[];
  labelSelector: string;
  items: {
    spec: ObjectSpec;
    metadata: V1ObjectMeta;
    clusterName: string;
  }[];
}

JSON Schemas are available in the dist package.

clusters

A list of cluster names where the items were retrieved from.

labelSelector

The label selector used to query the clusters.

items

A list of retrieved items. If items were retrieved from multiple clusters, they are all combined into this single list; use each item's clusterName to tell them apart.
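Because items from all clusters share one list, consumers typically group them by clusterName. The sketch below shows one way to do that. The interfaces are simplified stand-ins for the schema above (spec is left as unknown and metadata keeps only a couple of fields), and the function name is hypothetical.

```typescript
// Simplified stand-ins for the fact schema described above.
interface KubernetesFactItem {
  spec: unknown;
  metadata: { name?: string; namespace?: string };
  clusterName: string;
}

interface KubernetesFactData {
  clusters: string[];
  labelSelector: string;
  items: KubernetesFactItem[];
}

// Groups item names by the cluster they were retrieved from. Clusters that
// returned no items still appear in the result, with an empty list.
function itemNamesByCluster(data: KubernetesFactData): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const cluster of data.clusters) {
    result.set(cluster, []);
  }
  for (const item of data.items) {
    const names = result.get(item.clusterName) ?? [];
    names.push(item.metadata.name ?? '<unnamed>');
    result.set(item.clusterName, names);
  }
  return result;
}

const grouped = itemNamesByCluster({
  clusters: ['cluster-a', 'cluster-b'],
  labelSelector: 'app=soundcheck-test-service',
  items: [
    { spec: {}, metadata: { name: 'soundcheck-test-service' }, clusterName: 'cluster-a' },
  ],
});
console.log(grouped.get('cluster-a')); // [ 'soundcheck-test-service' ]
```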

Example Fact

Example for k8s_deployments in JSON:

{
  "factRef": {
    "source": "kubernetes",
    "scope": "default",
    "name": "k8s_deployments"
  },
  "entityRef": "component:default/soundcheck-test-service",
  "data": {
    "clusters": [
      "gke-test-cluster"
    ],
    "labelSelector": "app=soundcheck-test-service",
    "items": [
      {
        "spec": {
          "replicas": 1,
          "selector": {
            "matchLabels": {
              "app": "soundcheck-test-service"
            }
          },
          "template": {
            "metadata": {
              "creationTimestamp": null,
              "labels": {
                "app": "soundcheck-test-service"
              }
            },
            "spec": {
              "containers": [
                {
                  "name": "echo-server",
                  "image": "kicbase/echo-server:1.0",
                  "resources": {
                    "limits": {
                      "cpu": "500m",
                      "ephemeral-storage": "1Gi",
                      "memory": "2Gi"
                    },
                    "requests": {
                      "cpu": "500m",
                      "ephemeral-storage": "1Gi",
                      "memory": "2Gi"
                    }
                  },
                  "terminationMessagePath": "/dev/termination-log",
                  "terminationMessagePolicy": "File",
                  "imagePullPolicy": "IfNotPresent",
                  "securityContext": {
                    "capabilities": {
                      "drop": [
                        "NET_RAW"
                      ]
                    }
                  }
                }
              ],
              "restartPolicy": "Always",
              "terminationGracePeriodSeconds": 30,
              "dnsPolicy": "ClusterFirst",
              "securityContext": {
                "seccompProfile": {
                  "type": "RuntimeDefault"
                }
              },
              "schedulerName": "default-scheduler",
              "tolerations": [
                {
                  "key": "kubernetes.io/arch",
                  "operator": "Equal",
                  "value": "amd64",
                  "effect": "NoSchedule"
                }
              ]
            }
          },
          "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {
              "maxUnavailable": "25%",
              "maxSurge": "25%"
            }
          },
          "revisionHistoryLimit": 10,
          "progressDeadlineSeconds": 600
        },
        "metadata": {
          "name": "soundcheck-test-service",
          "namespace": "default",
          "uid": "327ee942-db74-4723-8d24-d812676edeaf",
          "resourceVersion": "15896657",
          "generation": 1,
          "creationTimestamp": "2024-06-04T15:12:02Z",
          "labels": {
            "app": "soundcheck-test-service"
          },
          "annotations": {
            "autopilot.gke.io/resource-adjustment": "{\"input\":{\"containers\":[{\"name\":\"echo-server\"}]},\"output\":{\"containers\":[{\"limits\":{\"cpu\":\"500m\",\"ephemeral-storage\":\"1Gi\",\"memory\":\"2Gi\"},\"requests\":{\"cpu\":\"500m\",\"ephemeral-storage\":\"1Gi\",\"memory\":\"2Gi\"},\"name\":\"echo-server\"}]},\"modified\":true}",
            "autopilot.gke.io/warden-version": "2.8.83",
            "deployment.kubernetes.io/revision": "1"
          },
          "managedFields": [
            {
              "manager": "kubectl-create",
              "operation": "Update",
              "apiVersion": "apps/v1",
              "time": "2024-06-04T15:12:02Z",
              "fieldsType": "FieldsV1",
              "fieldsV1": {
                "f:metadata": {
                  "f:labels": {
                    ".": {},
                    "f:app": {}
                  }
                },
                "f:spec": {
                  "f:progressDeadlineSeconds": {},
                  "f:replicas": {},
                  "f:revisionHistoryLimit": {},
                  "f:selector": {},
                  "f:strategy": {
                    "f:rollingUpdate": {
                      ".": {},
                      "f:maxSurge": {},
                      "f:maxUnavailable": {}
                    },
                    "f:type": {}
                  },
                  "f:template": {
                    "f:metadata": {
                      "f:labels": {
                        ".": {},
                        "f:app": {}
                      }
                    },
                    "f:spec": {
                      "f:containers": {
                        "k:{\"name\":\"echo-server\"}": {
                          ".": {},
                          "f:image": {},
                          "f:imagePullPolicy": {},
                          "f:name": {},
                          "f:resources": {},
                          "f:terminationMessagePath": {},
                          "f:terminationMessagePolicy": {}
                        }
                      },
                      "f:dnsPolicy": {},
                      "f:restartPolicy": {},
                      "f:schedulerName": {},
                      "f:securityContext": {},
                      "f:terminationGracePeriodSeconds": {}
                    }
                  }
                }
              }
            },
            {
              "manager": "kube-controller-manager",
              "operation": "Update",
              "apiVersion": "apps/v1",
              "time": "2024-06-18T17:58:40Z",
              "fieldsType": "FieldsV1",
              "fieldsV1": {
                "f:metadata": {
                  "f:annotations": {
                    "f:deployment.kubernetes.io/revision": {}
                  }
                },
                "f:status": {
                  "f:availableReplicas": {},
                  "f:conditions": {
                    ".": {},
                    "k:{\"type\":\"Available\"}": {
                      ".": {},
                      "f:lastTransitionTime": {},
                      "f:lastUpdateTime": {},
                      "f:message": {},
                      "f:reason": {},
                      "f:status": {},
                      "f:type": {}
                    },
                    "k:{\"type\":\"Progressing\"}": {
                      ".": {},
                      "f:lastTransitionTime": {},
                      "f:lastUpdateTime": {},
                      "f:message": {},
                      "f:reason": {},
                      "f:status": {},
                      "f:type": {}
                    }
                  },
                  "f:observedGeneration": {},
                  "f:readyReplicas": {},
                  "f:replicas": {},
                  "f:updatedReplicas": {}
                }
              },
              "subresource": "status"
            }
          ]
        },
        "clusterName": "gke-test-cluster"
      }
    ]
  },
  "timestamp": "2024-06-21T15:04:15.214Z"
}
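To show how the nested data in a k8s_deployments fact can be navigated, the sketch below lists containers that lack a limit for a given resource. It is purely an illustration of the data layout, not how Soundcheck checks are written; the interfaces model only the fields used here, and the function name is hypothetical.

```typescript
// Only the fields needed for this example are modelled.
interface ContainerSpec {
  name: string;
  resources?: { limits?: Record<string, string> };
}

interface DeploymentItem {
  spec: { template?: { spec?: { containers?: ContainerSpec[] } } };
  metadata: { name?: string };
  clusterName: string;
}

// Returns "<cluster>/<deployment>/<container>" for every container that
// has no limit set for the given resource (for example 'cpu' or 'memory').
function containersWithoutLimit(items: DeploymentItem[], resource: string): string[] {
  const missing: string[] = [];
  for (const item of items) {
    const containers = item.spec.template?.spec?.containers ?? [];
    for (const container of containers) {
      if (container.resources?.limits?.[resource] === undefined) {
        missing.push(
          `${item.clusterName}/${item.metadata.name ?? '<unnamed>'}/${container.name}`,
        );
      }
    }
  }
  return missing;
}

// A trimmed-down version of the item in the example fact above.
const exampleItems: DeploymentItem[] = [
  {
    spec: {
      template: {
        spec: {
          containers: [
            { name: 'echo-server', resources: { limits: { cpu: '500m', memory: '2Gi' } } },
          ],
        },
      },
    },
    metadata: { name: 'soundcheck-test-service' },
    clusterName: 'gke-test-cluster',
  },
];

console.log(containersWithoutLimit(exampleItems, 'memory')); // []
```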