
Fact Collectors

Fact Framework

  • Soundcheck allows organizations to push results through the Soundcheck API.
  • Organizations may also leverage the Soundcheck Fact Framework, which provides a mechanism by which Soundcheck itself can collect facts about an entity in Backstage, and execute checks based on those facts.
  • The Fact Framework allows Soundcheck to collect facts (un-opinionated information) about an entity in the software catalog. Facts are made available to Soundcheck through fact collectors.

Soundcheck comes with two built-in fact collectors: the Software Catalog Collector and the Soundcheck Collector. A fact collector collects one or more facts about entities, and Soundcheck can be extended with additional fact collectors.

NOTE: At the moment, only the GitHub fact collector can be configured via the Collectors UI, with support for others coming in future releases.

Soundcheck FC Main Screen

Configuring a fact collector via the no-code UI

To configure a fact collector, make sure you are on the Collectors tab. Click on a Fact's “Configure” link to open a modal that displays the Fact Collector's configuration form.

  • WARNING: If you already have a YAML configuration file set up for a collector, you cannot configure that collector through the No-Code UI. To use the No-Code UI, remove the YAML file (or the reference to it).

Soundcheck FC Modal

Once you choose to configure a collector, you will see the following page with 3 configuration options. You can see what each configuration collects in its description. All 3 configs have the following options:

Frequency

Soundcheck FC Frequency

You can set how often each collector option collects its details. The frequency of runs can be set as a regular interval or defined as a custom cron expression.

Filters

Soundcheck FC Filters

You can set filters for each option as well. These filters offer the same options as Tracks. You can learn more in the Creating a new track section.

Caching

Soundcheck FC Cache

Lastly, you can enable Caching and set up an optional duration for said cache.

Once you have finished making your changes, click the save button to save your configuration. You can click the cancel button at any time to discard your changes.

Configuring fact collectors via YAML configuration

Soundcheck can be extended with additional fact collectors. A fact collector can collect one or more facts on a given entity.

The built-in catalog and Soundcheck fact collectors are provided by default and do not require a collector.yaml file to be present, just the checks. A check using facts from these collectors can be defined normally. However, if you'd like the check to execute periodically, the check must have a schedule with a filter, because facts from these collectors are not otherwise collected on a schedule.

Catalog Fact Collector

The catalog fact collector exposes information from Backstage's Software Catalog as facts to Soundcheck. It provides a single fact on entities: catalog:default/entity_descriptor which provides the entity's descriptor as fact data.

This enables the creation of checks against an entity's metadata, to ensure that it complies with your organization's standards and best practices.

An example fact collected by the catalog fact collector:

factRef: catalog:default/entity_descriptor
entityRef: component:default/artist-web
data:
  apiVersion: backstage.io/v1alpha1
  kind: Component
  metadata:
    name: artist-web
    description: The place to be, for great artists
    labels:
      example.com/custom: custom_label_value
    annotations:
      example.com/service-discovery: artistweb
      circleci.com/project-slug: github/example-org/artist-website
    tags:
      - java
    links:
      - url: https://admin.example-org.com
        title: Admin Dashboard
        icon: dashboard
        type: admin-dashboard
  spec:
    type: website
    lifecycle: production
    owner: artist-relations-team
    system: public-websites
timestamp: 2023-02-20T13:50:35Z

An example catalog check:

- id: has_required_tags
  rule:
    any:
      - factRef: catalog:default/entity_descriptor
        path: $.metadata.tags
        operator: contains
        value: java
      - factRef: catalog:default/entity_descriptor
        path: $.metadata.tags
        operator: contains
        value: data
  schedule:
    frequency:
      cron: '* * * * *'
    filter:
      kind: 'Component'
  passedMessage: |
    Tag found, check passed.
  failedMessage: |
    No `java` or `data` tag found, check failed.

Note: Since the collector for catalog information is built into Soundcheck there is no need for a collector.yaml file. As a result, adding the schedule to the check is how Soundcheck is told when to collect these facts. Additionally, the filter informs Soundcheck which type of entities to collect these facts from. This is a break from our recommendation to configure schedule and filter at the collector level for other fact collectors.
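As a variation on the catalog check above, here is a minimal sketch of a check that requires both tags rather than either one. It assumes that an `all` combinator is available as the counterpart of `any`; the check id and messages are hypothetical and chosen only for illustration.

- id: has_all_required_tags # hypothetical check id
  rule:
    all: # assumption: counterpart of `any`, passes only if every condition holds
      - factRef: catalog:default/entity_descriptor
        path: $.metadata.tags
        operator: contains
        value: java
      - factRef: catalog:default/entity_descriptor
        path: $.metadata.tags
        operator: contains
        value: data
  schedule:
    frequency:
      cron: '* * * * *' # every minute, as in the example above
    filter:
      kind: 'Component'
  passedMessage: |
    Both tags found, check passed.
  failedMessage: |
    The `java` and `data` tags are not both present, check failed.

Because the catalog collector has no collector file, the schedule and filter stay on the check itself, exactly as in the original example.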

Soundcheck Fact Collector

The Soundcheck fact collector exposes Soundcheck track certifications as facts. It provides facts for each Soundcheck track using the fact reference: soundcheck:default/program/:programId where :programId is the identifier for the track whose certification is contained in the fact.

This enables the creation of checks against an entity's certification level in other tracks.

Here is an example of a check using the Soundcheck fact collector:

- id: is_level_one_certified_track_id
  rule:
    - factRef: soundcheck:default/program/track_id
      path: $.highestLevel.ordinal
      operator: greaterThanInclusive
      value: 1
  schedule:
    frequency:
      cron: '* * * * *'
    filter:
      kind: 'Component'

Note: Similar to the Catalog check, this check does not have a dedicated collector YAML file. As a result, it requires a schedule with a filter and frequency.

Overall shape of a fact collector configuration file

To use any of the available third-party integration fact collectors, or to create custom collectors yourself, you will need to create a <COLLECTOR_NAME>-fact-collector.yaml file. Here is an example for the GitHub TPI Fact Collector plugin:

github-fact-collector.yaml
---
frequency:
  cron: '* * * * *'
initialDelay:
  seconds: 30
filter:
  kind: 'Component'
cache:
  duration:
    hours: 24
collects:
  - factName: repo_details
    type: RepositoryDetails
  - factName: protections
    type: BranchProtections

Fact collector fields


| Field | Required | Description |
|---|---|---|
| frequency | Yes | How often the collector will look for facts. Possible values are either a cron expression { cron: ... } or HumanDuration. If the cache is being used and the fact is unchanged, no checks will be retriggered. |
| initialDelay | No | The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration. |
| cache | No | Duration to store facts in the cache. If a check requests a fact that is in the cache, the fact collector will NOT request new data from the endpoint. |
| filter | No | Filter for the entities to run fact collection on. |
| collects | No | Array of facts to store from the endpoint for use by checks. |
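
To illustrate the two forms that `frequency` accepts, here is a hedged sketch showing each as a separate YAML document. The HumanDuration form is an assumption based on the duration objects already used for `initialDelay` and `cache.duration` in the GitHub example; the specific intervals are illustrative only.

# Option 1: cron expression
frequency:
  cron: '0 * * * *' # at minute 0 of every hour
---
# Option 2: HumanDuration (assumed shape, mirroring `initialDelay: seconds: 30`)
frequency:
  hours: 1

Only one of the two forms would appear in a given collector file.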

Adding YAML fact collectors to your fact library

To add fact collectors to your fact library, you will need to add the following to your app-config.yaml file:

app-config.yaml
soundcheck:
  collectors:
    - $include: ./path-to-local-folder/github-fact-collector.yaml
    - $include: ./path-to-local-folder/scm-fact-collector.yaml
    - $include: ./path-to-local-folder/pagerduty-fact-collector.yaml

Note: unlike checks and tracks, most fact collectors cannot be managed via a remote repo, but we are working to add this capability. Your fact collector configuration must be present in the local Backstage instance. The exception to this is the SCM fact collector, which accepts a remote URL from which it can fetch its configuration.

Notes

  • The remote URL must be accessible by the Backstage instance. Soundcheck uses Backstage's shared URLReader to fetch the configuration, so you must have the proper integration(s) set up in Backstage for the URLReader to be able to fetch it. For example, to use a URL that points to a file on GitHub, you would need to have a GitHub integration set up in Backstage.
  • The remote URL must be a URL to a YAML file, not a JSON file or any other format.
  • There is a known limitation with configuring the SCM Fact Collector with a remote URL: The configuration will schedule fact collection appropriately on initial load, but subsequent changes to the configuration will not change which facts are collected nor their collection schedules. The SCM Fact Collector will only pick up changes to its fact collection configuration on a restart of the Backstage instance.

REST API

We include a REST API for Facts. See API Reference for details.

Third Party Integrations

We are always adding new third-party integrations to Soundcheck. You can find the list of available integrations here.