Source Control Management

The Source Control Management (SCM) integration plugin for Soundcheck enables integration with the following source control management providers:

azure
bitbucketCloud
bitbucketServer
gerrit
gitea
github
gitlab

Prerequisites

SCM Integrations - Connecting the SCM module to SCM providers

To connect to external providers, an 'integration' must be provided in the main app-config.yaml file as follows:

integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}

The above example provides a GitHub integration, with the host set to github.com. Authentication is provided via a token issued from github.com for the repository that you'd like to connect to.

Consult the Backstage GitHub integration instructions for full configuration details.
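
The other providers listed above are connected in the same way. As a minimal sketch, a GitLab integration on gitlab.com could look like the following. The environment variable name GITLAB_TOKEN is an assumption made for illustration; consult the Backstage integration documentation for your provider for the full set of fields.

```yaml
integrations:
  gitlab:
    # Assumed host; replace with your self-managed hostname if applicable.
    - host: gitlab.com
      # GITLAB_TOKEN is a hypothetical environment variable name.
      token: ${GITLAB_TOKEN}
```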

Add the ScmFactCollector to Soundcheck

Source Control Management integration for Soundcheck is not installed by default. It must be manually installed and configured.

First, add the @spotify/backstage-plugin-soundcheck-backend-module-scm package:

yarn workspace backend add @spotify/backstage-plugin-soundcheck-backend-module-scm

Then add the following to your packages/backend/src/index.ts file:

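// packages/backend/src/index.ts
import { createBackend } from '@backstage/backend-defaults';

const backend = createBackend();

// Register the Soundcheck backend and its SCM module.
backend.add(import('@spotify/backstage-plugin-soundcheck-backend'));
backend.add(import('@spotify/backstage-plugin-soundcheck-backend-module-scm'));
// ...

backend.start();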

Consult the Soundcheck Backend documentation for additional details on setting up the Soundcheck backend.

Adding SCM Entities

To use Source Control Management (SCM) integrations, an entity hosted by an SCM provider is needed. As an example, an entity could be added to the catalog with a type set to url and a target set to the entity's hosted location, like so:

catalog:
  locations:
    # Soundcheck external demo
    - type: url # Denotes SCM entities.
      target: https://github.com/your_repo/blob/main/all-components.yaml

The configuration above adds a component hosted by github.com and configured by the target yaml file.

Entity configuration

To determine the location of a file, the SCM integration uses the value of the backstage.io/source-location annotation as its base. In many cases this annotation is set for you; if it is not, add it to your catalog-info.yaml file. Here's a simple example:

metadata:
  annotations:
    backstage.io/source-location: url:https://github.com/my-org/my-service/
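
For context, the sketch below shows where that annotation sits in a complete catalog-info.yaml. The component name, owner, and spec values are hypothetical placeholders; only the annotation itself comes from the example above.

```yaml
# catalog-info.yaml (hypothetical component)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: my-service
  annotations:
    # Base location the SCM integration uses to resolve file paths.
    backstage.io/source-location: url:https://github.com/my-org/my-service/
spec:
  type: service
  lifecycle: production
  owner: group:default/owning_group
```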

Configuring the SCM Module

The SCM Fact Collector can be configured via YAML, URL, or the No-Code UI. If you configure it via both YAML and the No-Code UI, the configurations are merged. To avoid confusing merge results, choose a single source for the Fact Collector configuration (No-Code UI, YAML, or URL).

SCM Fact Collector: Default Configuration

To add the default initial configuration of SCM Fact Collector on startup, the following flag must be set to true in the app-config.yaml file:

soundcheck:
  addStartingConfigurations:
    collectors: true

This configuration is required to be able to collect the facts necessary for the pre-canned checks and tracks.

Note: The configuration will be stored in the database and will be configurable via No-Code UI. If you'd like to add this configuration via yaml, you should add the following config instead:

soundcheck:
  collectors:
    scm:
      collects:
        # Extracts if files exist at the given paths.
        - factName: required_files_exist
          type: exists
          data:
            - name: license
              path: LICENSE.md
            - name: code_of_conduct
              path: CODE_OF_CONDUCT.md
            - name: contributing
              path: CONTRIBUTING.md
            - name: readme
              path: README.md
            - name: api_report
              path: api-report.md
          frequency:
            cron: '7 9 * * *'
          filter:
            - kind: Component
            - spec.lifecycle: production

        # Checks that the api-report contains the correct 'do not edit' disclaimer.
        - factName: api-report-has-no-edit-warning
          type: regex
          path: api-report.md
          regex: .*Do not edit this file.*
          frequency:
            cron: '7 11 * * *'
          filter:
            - kind: Component
            - spec.lifecycle: production

SCM Fact Collector: No-Code UI Configuration Option

Make sure the prerequisite SCM Integrations - Connecting the SCM module to SCM providers is completed and SCM instance details are configured.
To configure the SCM Integration, go to Soundcheck > Integrations > Source Code Management and click the Configure button. To learn more about the No-Code UI config, see Configuring a fact collector (integration) via the no-code UI.

SCM Fact Collector: YAML Configuration Option

The facts to be collected by the Source Control Management (SCM) module must be defined in one or more yaml files, and then referenced in the Soundcheck configuration in the app-config.yaml file, like so:

soundcheck:
  collectors:
    scm:
      - $include: ./scm-fact-extraction-configurations.yaml
      - $include: ./more-scm-fact-extraction-configurations.yaml
      - $include: ./even-more-scm-fact-extraction-configurations.yaml

With an SCM entity in your catalog, an SCM integration set, and SCM configuration files added to the soundcheck.collectors.scm field, your Backstage instance is almost ready to extract facts from SCM providers.

The SCM Fact Extraction Configuration section below covers how to set up the fact extraction configuration files to extract facts from SCM.

SCM Fact Collector: URL Configuration Option

The SCM Fact Collector can be configured to fetch its configuration from a remote URL, allowing its configuration to be managed as code.

The remote configuration file follows the same rules and structure as outlined for collectors in this document.

Here is how the SCM Fact Collector can be configured to fetch its configuration from a remote URL:

app-config.yaml
soundcheck:
  collectors:
    scm:
      url: https://github.com/JasonSmithSpotify/soundcheck-external-demo/blob/main/scm-collector.yaml

Here, instead of the standard $include directive or local specification of the collector configuration, the scm collector is configured with a url field that points to the remote URL from which the configuration can be fetched.
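
For illustration, the remote file referenced by the url field might look like the sketch below. The fact name and paths are borrowed from the default configuration shown earlier; the file name, schedule, and filter values are assumptions rather than the contents of any real file.

```yaml
# scm-collector.yaml (hypothetical contents of the remote file)
frequency:
  cron: '0 * * * *'
initialDelay:
  seconds: 30
filter:
  kind: Component
  spec.lifecycle: production
collects:
  - factName: required_files_exist
    type: exists
    data:
      - name: readme
        path: README.md
      - name: license
        path: LICENSE.md
```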

Rate Limiting (Optional)

This fact collector can be rate limited in Soundcheck using the following configuration:

soundcheck:
  job:
    workers:
      scm:
        limiter:
          max: 4900
          duration: 3600000

The rate limits for SCM are dependent on the source control API used. For example, GitHub has a rate limit of 5000 requests per hour; the above configuration is set for 4900 executions per hour to account for this limit.

This fact collector handles 429 rate limit errors from SCM. Soundcheck will automatically wait and retry requests that are rate limited.

When supported by the underlying API (e.g. GitHub), ETags and conditional requests are used to reduce the total number of API calls.

Caching Full Etag Responses (Optional)

The SCM Fact Collector can be configured to cache the full response from the SCM provider when using ETags. This will cache not only the extracted fact data, but the full response from the SCM provider, including the file tree structure and file contents. This is especially useful when using GLOB paths that may return many files as it can save significantly on API calls to the SCM provider(s). If you're encountering API rate limits and are using GLOB patterns, consider enabling this option.

WARNING: Caching full responses can lead to high memory usage, particularly when using GLOB patterns that match many files or if the files themselves are large.

Enable this option with:

soundcheck:
  collectors:
    scm:
      cacheFullResultsForEtags: true

This option is disabled by default, and can currently only be enabled via YAML. If you are using the No-Code UI to configure your collectors, you will need to add this option to your app-config.yaml file. Enabling this option via YAML works alongside any No-Code UI configurations and does not otherwise affect them.

SCM Fact Extraction Configuration

The SCM Fact Collector configuration yaml files have the following structure:

frequency:
  cron: '0 * * * *' # Defines a schedule for when the facts defined in this file should be collected
  # This is optional and if omitted, facts will only be collected on demand.
initialDelay:
  seconds: 30
filter: # A filter specifying which entities to collect the specified facts for
  kind: 'Component'
cache: # Defines if the collected facts should be cached, and if so for how long
  duration:
    hours: 2
collects: # An array of fact extractor configuration describing how to collect SCM facts.
  - SCM Fact Extractor Configuration One
  - SCM Fact Extractor Configuration Two
  - ...
  - SCM Fact Extractor Configuration N

Below are the details for each field.

`frequency` [Optional]

The frequency at which the collector should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration. This is the default frequency for each extractor.

`initialDelay` [optional]

The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration.

`batchSize` [optional]

The number of entities to collect facts for at once. Optional, the default value is 1.

Note: The Soundcheck rate limiting engine still counts fact collection for a batch of entities as one hit towards the rate limits, while the actual number of hits will be equal to the batchSize.

Example:

batchSize: 100
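
Because a batch counts as a single hit in the limiter, one way to keep the real request volume under a provider limit is to scale the limiter's max down by the batch size. The sketch below assumes the budget of 4900 requests per hour used in the Rate Limiting section: with a batchSize of 100, 4900 / 100 = 49 batches per hour. It also assumes that each entity in a batch costs one request and that top-level collector fields sit directly under soundcheck.collectors.scm, as in the default configuration; extractors that read several files or use GLOB paths may need a lower value.

```yaml
soundcheck:
  collectors:
    scm:
      batchSize: 100
      # ... collects, filter, and other collector fields go here.
  job:
    workers:
      scm:
        limiter:
          # 49 batches x 100 entities = 4900 requests per hour (assumed budget).
          max: 49
          duration: 3600000 # One hour in milliseconds.
```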

`filter` [Optional]

A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. This is the default filter for each extractor.

See filters for more details.

`exclude` [optional]

Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.

filter:
  - kind: component
exclude:
  - spec.type: documentation

`cache` [Optional]

Whether the collected facts should be cached, and if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field. This is the default cache config for each extractor.

`collects`

An array of SCM Fact Extractor configurations describing how to collect SCM facts. See the section below for details on configuring the extractors.

SCM Fact Extractors

The Exists, RegEx, and JSON/YAML Source Control Management (SCM) fact extractor configurations are described in detail below. Before going into the detailed schemas of the individual fact extractors, the base schema that they all share will be covered first.

Common Fact Extractor Schema

All SCM Fact Extractors share a common base schema, the variables for which are defined below:

`factName`

The name of the fact to be extracted.

Minimum length of 1
Maximum length of 100
Alphanumeric with single separator instances of periods, dashes, underscores, or forward slashes

`filter` [Optional]

A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. If provided, it overrides the default filter provided at the top level. If not provided, it defaults to the filter provided at the top level. If neither the extractor's filter nor the default filter is provided, the fact will be collected for all entities.

`exclude` [optional]

Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.

filter:
  - kind: component
exclude:
  - spec.type: documentation

`cache` [Optional]

Whether the collected facts should be cached, and if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field. If provided, it overrides the default cache config provided at the top level. If not provided, it defaults to the cache config provided at the top level. If neither the extractor's cache nor the default cache config is provided, the fact will not be cached. Example:

cache:
  duration:
    hours: 24

`frequency` [optional]

The frequency at which the fact extraction should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration. If provided, it overrides the default frequency provided at the top level. If not provided, it defaults to the frequency provided at the top level. If neither the extractor's frequency nor the default frequency is provided, the fact will only be collected on demand.

Example:

frequency:
  minutes: 10

`batchSize` [optional]

The number of entities to collect facts for at once. Optional, the default value is 1. If provided, it overrides the default batchSize provided at the top level. If not provided, it defaults to the batchSize provided at the top level. If neither the extractor's batchSize nor the default batchSize is provided, the fact will be collected for one entity at a time.

Example:

batchSize: 100

`branch` [optional]

The branch to extract the fact from. If not provided, defaults to the repository's default branch.
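
The override rules above can be sketched with a collector file in which one extractor inherits the top-level defaults and another overrides them. The fact names and the branch name are hypothetical; the fields themselves are the ones described in this section.

```yaml
# Top-level defaults, applied to every extractor that does not override them.
frequency:
  cron: '0 * * * *'
filter:
  kind: Component
cache:
  duration:
    hours: 2
collects:
  # Inherits the top-level frequency, filter, and cache.
  - factName: readme_exists
    type: exists
    data:
      - name: readme
        path: README.md

  # Overrides the defaults for this extractor only.
  - factName: license_exists_on_release_branch
    type: exists
    branch: release # Hypothetical branch name.
    frequency:
      minutes: 30
    cache: false
    filter:
      kind: Component
      spec.lifecycle: production
    data:
      - name: license
        path: LICENSE.md
```

In this sketch the first fact is collected hourly for all components and cached for two hours, while the second is collected every 30 minutes from the release branch, only for production components, and is not cached.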

Exists Fact Extractor

The Exists Fact Extractor collects information on whether a given file exists in the SCM provider. The extensions to the base schema are as follows:

`type`

Must be exactly exists, like so:

type: exists

`data`

The data collected for this fact. This is an array of name and path pairs:

name: An identifier for the data element.
path: The path to the file. GLOB paths are supported.

Both name and path are subject to the naming restrictions of factName.

Sample Exists Configuration

Here's a sample yaml configuration for a fact that gets information on the existence of two files, README.md and catalog-info.yaml:

collects:
  # factName gives this fact an identifier, which is used to refer to the
  # fact in other configuration files.
  - factName: readme_and_catalog_info_files_exist_fact
    type: exists # This identifies the type of fact to collect.
    # data defines the data elements which will be returned in the fact
    # object when the fact is collected.
    data:
      - name: readme_exists # Label for the data element.
        # A GLOB path. If any file is found matching the GLOB path, the
        # value for the exists condition will be true.
        path: /*/README.md
      - name: catalog_info_exists # Label for the data element.
        path: /catalog-info.yaml # The file for which existence will be determined.
    # A filter to narrow the applicability of this fact, in this case to
    # the component named 'soundcheck-external-demo'.
    filter:
      metadata.name: soundcheck-external-demo

The checks that compare the data collected by this fact to the expected outcomes are specified in the app-config.yaml file. Since this fact collects two data elements, there are two checks, one for the value of each data element. The two checks would look like this:

soundcheck:
  checks:
    - id: has_readme_check # The name of the check
      rule: # How to evaluate this check
        factRef: scm:default/readme_and_catalog_info_files_exist_fact # The fact data to reference
        path: $.readme_exists # The path to the field to analyze
        operator: equal # Indicates the operation to apply
        value: true # The desired value of the field indicated in path, above.
    - id: has_catalog_info_file_check
      rule:
        factRef: scm:default/readme_and_catalog_info_files_exist_fact
        path: $.catalog_info_exists
        operator: equal
        value: true

Finally, these two checks need to be listed in a track level ordinal within the soundcheck-tracks.yaml file. Here's an example:

- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: >
    Demonstration of Soundcheck Exists Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's SCM Exists Fact Extractor
      checks:
        - id: has_catalog_info_file_check # The identifier for the check.
          name: Has catalog-info.yaml # A human-readable name for this check
          # The description to display on the Soundcheck page for this check.
          description: >
            Repositories should contain a catalog-info.yaml file.
        - id: has_readme_check
          name: Has README.md
          description: |
            Repositories should provide a README.md file at root.

RegEx Fact Extractor

The RegEx Fact Extractor collects information about the contents of a file. Two modes are supported:

True/False Mode
RegEx Capture Groups Mode

True/False Mode

True/False Mode uses a Regular Expression, or RegEx, to search for a match in a specified file. If a RegEx match is found, the resulting fact data will contain a value of true for a field named matches. If not, the matches field will contain a value of false.

RegEx Capture Groups Mode

Using RegEx Capture Groups Mode allows the extractor to associate capture groups within a RegEx to named values. This allows checks to verify that the captured values are correct.

RegEx Fact Extractor Schema

The extension schema for RegEx Fact Extractors is as follows:

`type`

Must be exactly regex, like so:

type: regex

`path`

The path to the file to analyze. GLOB paths are supported. When GLOB paths are used, the fact data will be an array, with each element corresponding to a file that matched the GLOB path. If True/False Mode is used, the array will contain objects with a matches field, which will be true if the file matched the RegEx, and false if it did not. If RegEx Capture Groups Mode is used, the array will contain objects with fields corresponding to the capture groups defined in the RegEx.

NOTE: When using GLOB paths, ensure your check is prepared to handle an array of results. See the example below.

`regex`

A valid RegEx string. This string is used on the file to collect data elements or to provide a true/false response corresponding to whether there is a match for the RegEx or not in the file.

`flags` [Optional]

Accepts an optional regex flag parameter that must match:

/^[gimsuy]+$/

g: Global search.
i: Case-insensitive search.
m: Allows ^ and $ to match next to newline characters.
s: Allows . to match newline characters.
u: "Unicode"; treat a pattern as a sequence of Unicode code points.
y: Perform a "sticky" search that matches starting at the current position in the target string.

A sample fact that uses the flags field is as follows:

- factName: api-report-has-no-edit-warning
  type: regex
  flags: mi
  path: api-report.md
  regex: .*do not edit this file.*

`data` [Optional]

Defines the data to collect for this fact. This is an array of name and type pairs:

name: An identifier for the data element. Subject to the naming restrictions of factName.
type: The expected type of data to be collected.

Each pair defined in the data field must correspond to a capture group in the given regex. A mismatch between the number of data element definitions and the number of RegEx capture groups is an error, and fact data will not be collected.

If the data element is not present, the mode of the RegEx Fact Extractor defaults to True/False Mode.

Sample RegEx Configuration

The yaml below defines both modes of the RegEx Extractor, True/False Mode and RegEx Capture Groups Mode.

Sample fact definitions are as follows:

collects:
  - factName: apache_license_fact # Name of the fact
    type: regex # Type of the fact
    path: /LICENSE.md # Path to the file whose contents will be searched
    regex: .*Apache License.*Version 2\.0.* # Regex to match.
    # Note lack of any 'data:' object definition, this implies this regex is a true/false type.

  - factName: api_version_fact
    type: regex
    path: /catalog-info.yaml
    # Note the capture group! Each capture group in a regex *must*
    # correspond to a named data element, see below.
    regex: '^apiVersion: backstage.io/(.+)'
    data: # Data describing each capture group
      - name: captured_api_version # The name of the first capture group
        type: string # The type of the first capture group.

  - factName: regex-glob
    type: regex
    path: /readmes/readme-[ab].md # GLOB path to the file whose contents will be searched
    regex: 'Version: (\d+\.?\d*)' # Search each file for this regex. The dot is escaped so it only matches a literal decimal point.
    data:
      - name: parsedVersion
        type: number

With fact collection specified, we must now define checks against the data that will be collected for each fact. We define the checks in the app-config.yaml file. Here are sample checks that correspond to the RegEx facts above:

soundcheck:
  checks:
    - id: uses_recommended_license_check # ID for this check
      rule: # How to evaluate this check
        factRef: scm:default/apache_license_fact # The fact data to reference
        # The path to the field to analyze. This is always 'matches' for a
        # true/false type regex.
        path: $.matches
        operator: equal # Indicates the operation to apply
        # The desired value of the path field, above. True here indicates
        # that we want to have found the 2.0 Apache license version in the
        # given file.
        value: true

    - id: api_version_check
      rule:
        factRef: scm:default/api_version_fact
        # This path refers to the name given to the capture group in the
        # fact definition.
        path: $.captured_api_version
        operator: equal
        # This is the value we expect to have captured via the regex for
        # this capture group. This can be any string.
        value: v1alpha1

    - id: regex_glob_check
      rule:
        factRef: scm:default/regex-glob
        # '..' indicates that the path is nested; this gets all array values.
        path: $..parsedVersion
        # 'all:' indicates that the operator should be applied to all values in the array.
        operator: all:greaterThan
        # The value to compare against. This check will pass if all values in the
        # array are greater than 0.5, which is to say that all files matched by
        # the GLOB path contain a version number greater than 0.5.
        value: 0.5

Finally, the checks defined above must be present in the soundcheck-tracks.yaml file under an appropriate track level ordinal of a track. Here's an example:

---
- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: >
    Demonstration of Soundcheck Regex Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's SCM Regex Fact Extractor
      checks:
        - id: uses_recommended_license_check # Check ID to include
          name: Uses Apache License 2.0 # Human-readable name for this check.
          description: |
            Use of the Apache License 2.0 is required.
        - id: api_version_check
          name: Has correct API version
          description: >
            Ensures that the component is using the correct api version, which is
            v1alpha1.
        - id: regex_glob_check
          name: Version number check
          description: >
            Ensures that all files matched by the GLOB path contain a version number greater than 0.5.

Let's look at the fact data that comes back for the regex-glob fact, which regex_glob_check evaluates. In this example, there are two readme files, readme-a.md and readme-b.md. The contents of these files are as follows:

readme-a.md:

Readme for A. Version: 1.0

readme-b.md:

Readme for b. Version: 42

The fact defines a single capture group, and stores the captured values under the specified name of parsedVersion. So, the fact data that gets generated for this fact will look like this based on the files above:

{
  "https://github.com/JasonSmithSpotify/soundcheck-external-demo/tree/main/readmes/readme-a.md": {
    "parsedVersion": "1.0"
  },
  "https://github.com/JasonSmithSpotify/soundcheck-external-demo/tree/main/readmes/readme-b.md": {
    "parsedVersion": "42"
  }
}

Our check then requires every parsedVersion value gathered by the $..parsedVersion path to be greater than 0.5, so the check passes in this example.

JSON/YAML Fact Extractor

The final fact extractor type supported by the Source Control Management (SCM) plugin is the JSON/YAML Fact Extractor. It works similarly to the RegEx Fact Extractor in that it extracts json/yaml values from a file for use in checks.

JSON/YAML Fact Extractor Schema

The extension schema for JSON/YAML Fact Extractors is as follows:

`type`

Must be one of json or yaml, like so:

type: json

`path`

The path to the file to analyze. GLOB paths are supported. When GLOB paths are used, the fact data will be an array, with each element corresponding to a file that matched the GLOB path.

`data`

Defines the data to collect for this fact. This is an array of the following fields:

name: An identifier for the data element. Subject to the naming restrictions of factName.
type: The expected type of data to be collected, either array or a primitive type (string, int, etc.)
jsonPath: A period delimited path to the desired json/yaml element.
items: An optional field with a single type property. If included, the data returned by the fact will be an array of all matching elements of the specified type. If omitted, the returned value will be a single element.
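
The samples in the following section use type: json, so here is a minimal sketch of the yaml variant. It assumes a catalog-info.yaml at the repository root that contains a spec.owner field; the fact name and data name are hypothetical.

```yaml
collects:
  - factName: catalog_info_owner_fact # Hypothetical fact name.
    type: yaml
    path: /catalog-info.yaml
    data:
      - name: owner
        jsonPath: spec.owner # Period delimited path into the YAML document.
        type: string
```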

Sample JSON/YAML Configuration

The yaml below defines both collection types performed by the JSON/YAML Extractor: single element capture and array capture.

Sample fact definitions are as follows:

collects:
  - factName: entity_metadata_fact # Name of the fact
    type: json # Type of the fact
    path: /catalog-info.yaml # Path to the file whose contents will be searched
    data: # Data describing the file contents collected at each jsonPath
      - name: tags # Name for this entry in the data element.
        jsonPath: metadata.tags # Path from which to pull data from the file.
        type: array # Type of element, array or primitive.
        items: # For the array type, this items specification and the type of the items is required.
          type: string
      - name: pager_duty_integration_key
        jsonPath: metadata.annotations.pagerduty_integration-key
        type: string # For non-array captures, just the type of the data is required.

  - factName: json_glob
    type: json
    path: '*-collector.yaml' # Glob path. Finds all files ending in '-collector.yaml' at the root of the entities' repositories.
    data:
      - name: initialDelay
        jsonPath: $.initialDelay
        type: string
      - name: productionFilter
        jsonPath: $.filter[spec.lifecycle]
        type: string

The data specifications in entity_metadata_fact above correspond to the two kinds of capture supported by this extractor, array and single element, respectively. The jsonPath of metadata.tags will be extracted into an array named tags of type string. The jsonPath of metadata.annotations.pagerduty_integration-key will be extracted into a variable called pager_duty_integration_key of type string.

The checks for the fact data extracted by the fact specification above could be as follows:

soundcheck:
  checks:
    - id: entity_metadata_tags_check # ID of this check
      rule: # How to evaluate this check
        factRef: scm:default/entity_metadata_fact # The fact data to reference
        path: $.tags # The path to the data in the collected fact's 'data' element
        operator: notEqual # The operation to apply
        value: undefined # The value to compare with the extracted value.
    - id: entity_metadata_key_check
      rule:
        factRef: scm:default/entity_metadata_fact
        path: $.pager_duty_integration_key
        operator: equal
        value: 12345
    - id: json_glob_production_filter_check
      rule:
        factRef: scm:default/json_glob
        # '..' indicates that the path is nested; this gets all array values.
        path: $..productionFilter
        # 'all:' indicates that the operator should be applied to all values in the array.
        operator: all:equal
        # The value to compare against. This check will pass if all values in the
        # array are equal to 'production', that is, if all files matched by the
        # GLOB path have a production filter set to 'production'.
        value: production

This defines three checks. The first check ensures that the tags array is not undefined in the file, meaning that there are tags present. The second check ensures that the pager_duty_integration_key is in the file and that it is equal to the given value. The third check ensures that every file matched by the GLOB path has a production filter set to 'production'.

Finally, these checks are added to the soundcheck-tracks.yaml file, under an appropriate track level ordinal. Here's an example:

- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: >
    Demonstration of Soundcheck JSON/YAML Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's SCM JSON/YAML Fact Extractor
      checks:
        - id: entity_metadata_tags_check
          name: Entity Metadata Tags Check
          description: Check that metadata tags are present.
        - id: entity_metadata_key_check
          name: Entity Metadata Key Check
          description: Check that the pager duty key is correct.
        - id: json_glob_production_filter_check
          name: Production Filter Check
          description: Check that all collector configurations have production filters set to 'production'.

Adding all checks to the soundcheck-tracks.yaml means that they must pass for the corresponding track and level to be considered as passing.

YAML Anchor Support

The SCM collector supports YAML anchor parsing and resolution within the same file. The !reference tag has some limitations: it can only resolve references within the same file.
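
As an illustration of same-file anchor resolution, consider the hypothetical repository file below. Under standard YAML semantics, both image fields resolve to node:20. Assuming this section applies to files read by the JSON/YAML Fact Extractor, an extractor with a jsonPath of build.image would be expected to see the resolved value rather than the alias; that expectation is an assumption, not a documented guarantee.

```yaml
# ci-config.yaml (hypothetical file in an entity's repository)
node_image: &node_image node:20 # Anchor defined once...

build:
  image: *node_image # ...and reused here via an alias.
test:
  image: *node_image
```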
