The Data Registry fact collector is installed by default on Portal and exposes information from Portal’s Data Registry as facts to Soundcheck. Before configuring the collector, ensure the Data Registry itself has been enabled and configured. This collector enables the creation of checks against metadata stored inside the Registry, to ensure that it complies with your organization’s standards and best practices. It supports the collection of the following fact:
dataset-schema - contains schema information about dataset entities.
Prerequisites
Configure Data Registry in Backstage
The dataset-schema fact is typically only applicable to Catalog entities with Kind:Api and Type:dataset, which are programmatically ingested by the Data Registry. To make full use of this collector, be sure to configure some dataset integrations.
Data Registry Fact Collector Configuration
The collection of Data Registry facts is driven by configuration. To learn more about the configuration, consult the Defining Data Registry Fact Collections section. Similar to other collectors, the Data Registry Fact Collector can be configured via YAML or the No-Code UI. If you configure it via both, the configurations will be merged. It’s preferable to choose a single source for the Fact Collector’s configuration (either No-Code UI or YAML) to avoid confusing merge results. Since this collector is designed for Portal, the No-Code UI is the easiest option, as YAML isn’t configurable within the Portal site.
No-Code UI Configuration
To enable the Data Registry Integration, go to Soundcheck > Integrations > Data Registry and click the Configure button. To learn more about the No-Code UI config, see Configuring a fact collector (integration) via the no-code UI.
By default, collection of the dataset-schema fact is filtered to Kind:Api, Type:dataset entities and runs at a one-hour interval.

YAML Configuration Option
Add a soundcheck.collectors.date-registry.collects field to app-config.yaml.
A simple example Data Registry fact collector configuration is listed below.
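As a minimal sketch of such a configuration, the entry below collects the dataset-schema fact hourly for dataset entities. The collector key is copied as written in this section, collects is assumed to be a list of entries as in other Soundcheck collectors, and the schedule and filter keys (kind, spec.type) are illustrative assumptions; verify all of them against your installation.

```yaml
# app-config.yaml
soundcheck:
  collectors:
    # Collector key as written in this section; confirm the exact spelling for your version.
    date-registry:
      collects:
        - type: dataset-schema    # the only required field
          frequency:
            cron: '0 * * * *'     # assumed schedule: minute 0 of every hour
          filter:                 # assumed Catalog-style filter keys
            kind: 'Api'
            spec.type: 'dataset'
          cache: false            # do not cache collected facts
```

The remaining optional fields (initialDelay, batchSize, exclude) are omitted here and fall back to the defaults described in the sections that follow.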
Defining Data Registry Fact Collections
This section describes the data shape and semantics of Data Registry Fact Collection configurations.
Shape Of A Data Registry Fact Collection Configuration
The following is an example of a Data Registry Fact Collection Configuration in YAML:
type [required]
The type of the collector: dataset-schema.
frequency [optional]
The frequency at which the fact collection should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration.
If not provided, the fact will only be collected on demand.
Example:
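A sketch of the two accepted forms, placed inside a single entry under collects. The schedule values are illustrative, and the HumanDuration form assumes Backstage's object notation with unit keys such as hours or minutes.

```yaml
# Option 1: cron expression (here: minute 0 of every hour)
frequency:
  cron: '0 * * * *'

# Option 2 (use instead of Option 1): HumanDuration
# frequency:
#   hours: 1
```

Only one of the two forms should be present in a given entry, since a YAML mapping cannot repeat the frequency key.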
initialDelay [optional]
The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration.
Example:
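A sketch of a delayed first run, assuming the HumanDuration object notation with a seconds unit key; the 30-second value is illustrative only.

```yaml
# Wait 30 seconds after startup before the first collection run
initialDelay:
  seconds: 30
```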
batchSize [optional]
The number of entities to collect facts for at once. Defaults to 1.
Example:
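A sketch of a larger batch; the value 20 is an arbitrary illustration, not a recommended setting.

```yaml
# Collect facts for up to 20 entities per batch instead of the default 1
batchSize: 20
```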
filter [optional]
A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API.
The dataset-schema fact in particular is only relevant to datasets, and so by default the No-Code UI has filters for Kind:Api, Type:dataset.
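The default filter described above could be expressed in YAML roughly as follows. This is a sketch that assumes the Catalog filter keys kind and spec.type correspond to the Kind and Type labels shown in the No-Code UI.

```yaml
# Restrict collection to dataset entities (assumed Catalog filter keys)
filter:
  kind: 'Api'
  spec.type: 'dataset'
```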
exclude [optional]
Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.
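A sketch of combining filter and exclude. The excluded key and value (spec.lifecycle: deprecated) are hypothetical and chosen only to illustrate the idea; substitute whatever metadata your Catalog entities actually carry.

```yaml
# Collect for all dataset entities ...
filter:
  kind: 'Api'
  spec.type: 'dataset'
# ... except those matching this (hypothetical) condition
exclude:
  spec.lifecycle: 'deprecated'
```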
cache [optional]
Whether the collected facts should be cached and, if so, for how long. Possible values are either true or false or a nested { duration: HumanDuration } field.
If not provided, the fact will not be cached.
Example:
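A sketch of the accepted forms, assuming the HumanDuration object notation with an hours unit key; the two-hour duration is illustrative.

```yaml
# Cache collected facts for two hours
cache:
  duration:
    hours: 2

# Alternative (use instead of the above): a plain boolean
# cache: true
```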
Shape of A Data Registry Fact
The shape of a Data Registry Fact is based on the Fact Schema. The following is an example of the collected dataset-schema fact: