GitHub
Like the Source Control Management (SCM) integration plugin, the GitHub integration plugin for Soundcheck integrates with GitHub out of the box. It uses Backstage's GitHub integration to extract and collect facts from GitHub repositories.
The GitHub integration plugin provides GitHub-specific fact collection (such as branch protections), while the SCM integration plugin collects facts based on repository content.
The GitHub integration plugin supports the extraction of the following facts:
- BranchProtections
- BranchRules
- RepositoryDetails
- RepositoryLanguages
Prerequisites
Configure GitHub integration in Backstage
Integrations are configured at the root level of app-config.yaml. Here's an example configuration for GitHub:
integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
Consult the Backstage GitHub integration instructions for full configuration details.
Add the GitHubFactCollector to Soundcheck
GitHub integration for Soundcheck is not installed by default. It must be manually installed and configured for the GitHub Fact Collector to work.
First, add the @spotify/backstage-plugin-soundcheck-backend-module-github package:
yarn workspace backend add @spotify/backstage-plugin-soundcheck-backend-module-github
Then add the following to your packages/backend/src/index.ts file:
const backend = createBackend();
backend.add(import('@spotify/backstage-plugin-soundcheck-backend'));
backend.add(
  import('@spotify/backstage-plugin-soundcheck-backend-module-github'),
);
// ...
backend.start();
Consult the Soundcheck Backend documentation for additional details on setting up the Soundcheck backend.
Legacy Backend
If you are still using the Legacy Backend, you can follow these instructions, but we highly recommend migrating to the New Backend System.
First, add the package: yarn workspace backend add @spotify/backstage-plugin-soundcheck-backend-module-github
Then, in packages/backend/src/plugins/soundcheck.ts, add the GitHubFactCollector:
  import { SoundcheckBuilder } from '@spotify/backstage-plugin-soundcheck-backend';
  import { Router } from 'express';
  import { PluginEnvironment } from '../types';
+ import { GithubFactCollector } from '@spotify/backstage-plugin-soundcheck-backend-module-github';
  export default async function createPlugin(
    env: PluginEnvironment,
  ): Promise<Router> {
    return SoundcheckBuilder.create({ ...env })
+     .addFactCollectors(
+       GithubFactCollector.create(env.config, env.logger, env.cache),
+     )
      .build();
  }
Entity configuration
To determine which repository to use, the GitHub integration reads the value of the backstage.io/source-location annotation. In many cases this annotation is set for you; if it is not, add it to your catalog-info.yaml file. Here's a simple example:
metadata:
  annotations:
    backstage.io/source-location: url:https://github.com/my-org/my-service/
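To illustrate how a repository can be derived from this annotation, the following TypeScript sketch splits a source-location value into host, owner, and repository name. The helper parseSourceLocation and the RepoLocation shape are hypothetical names introduced for this example; they are not part of Soundcheck or Backstage, and the plugin's own parsing may differ.

```typescript
// Hypothetical helper for illustration; not part of Soundcheck or Backstage.
interface RepoLocation {
  host: string;
  owner: string;
  repo: string;
}

// Expects a value of the form "url:https://<host>/<owner>/<repo>/...".
function parseSourceLocation(annotation: string): RepoLocation | undefined {
  const match = /^url:https?:\/\/([^\/]+)\/([^\/]+)\/([^\/#?]+)/.exec(annotation);
  if (!match) {
    return undefined;
  }
  const host = match[1];
  const owner = match[2];
  // Tolerate clone-style URLs that end in ".git".
  const repo = match[3].replace(/\.git$/, '');
  return { host, owner, repo };
}
```

A value that does not start with url: (for example a dir: location) yields undefined in this sketch, which a caller could treat as "no repository can be determined for this entity".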
Plugin Configuration
The collection of facts is driven by configuration. To learn more about the configuration, jump to the Defining GitHub Fact Collectors section.
GitHub Fact Collector can be configured via YAML or No-Code UI. If you configure it via both YAML and No-Code UI, the configurations will be merged. It's preferable to choose a single source for the Fact Collectors configuration (either No-Code UI or YAML) to avoid confusing merge results.
Default Configuration
To add the default initial configuration of GitHub Fact Collector on startup, the following flag must be set to true in the app-config.yaml file:
soundcheck:
  addStartingConfigurations:
    collectors: true
This configuration is required to be able to collect the facts necessary for the pre-canned checks and tracks.
Note: The configuration will be stored in the database and will be configurable via No-Code UI. If you'd like to add this configuration via YAML, add the following config instead:
soundcheck:
  collectors:
    github:
      collects:
        - factName: branch_protections
          type: BranchProtections
          cache: false
          filter:
            - kind:
                - Component
              spec.lifecycle:
                - production
          frequency:
            cron: '7 3 * * *'
        - factName: branch_rules
          type: BranchRules
          cache: false
          filter:
            - kind:
                - Component
              spec.lifecycle:
                - production
          frequency:
            cron: '7 9 * * *'
        - factName: repository_details
          type: RepositoryDetails
          cache: false
          filter:
            - kind:
                - Component
              spec.lifecycle:
                - production
          frequency:
            cron: '7 5 * * *'
        - factName: repository_languages
          type: RepositoryLanguages
          cache: false
          filter:
            - kind:
                - Component
              spec.lifecycle:
                - production
          frequency:
            cron: '7 7 * * *'
No-Code UI Configuration Option
- Make sure the prerequisite Configure GitHub integration in Backstage is completed and GitHub instance details are configured.
- To enable the GitHub Integration, go to Soundcheck > Integrations > GitHub and click the Configure button. To learn more about the No-Code UI config, see Configuring a fact collector (integration) via the no-code UI.
YAML Configuration Option
- Create a github-facts-collectors.yaml file in the root of your Backstage repository and fill in all your GitHub Fact Collectors. A simple example GitHub Fact Collector is listed below.
  ---
  frequency:
    cron: '0 * * * *'
  collects:
    - factName: branch_protections
      type: BranchProtections
    - factName: branch_rules
      type: BranchRules
    - factName: repository_details
      type: RepositoryDetails
    - factName: repository_languages
      type: RepositoryLanguages
  Note: this file will be loaded at runtime along with the rest of your Backstage configuration files. Therefore, make sure that it's available in deployed environments in the same way as your app-config.yaml files are.
- Add a soundcheck collectors field to app-config.yaml and reference the newly created github-facts-collectors.yaml:
  # app-config.yaml
  soundcheck:
    collectors:
      github:
        $include: ./github-facts-collectors.yaml
Rate Limiting (Optional)
This fact collector can be rate limited in Soundcheck using the following configuration:
soundcheck:
  job:
    workers:
      github:
        limiter:
          max: 4900
          duration: 3600000
The GitHub API has a limit of 5000 requests per hour (15000 for Enterprise). We recommend setting your rate limit somewhat below this. The example above sets the rate limit to 4900 executions per hour (a duration of 3600000 milliseconds).
This fact collector handles rate limit errors per the recommendation from GitHub. Soundcheck will automatically wait and retry requests that are rate limited.
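As a rough illustration of the wait-and-retry behaviour, the sketch below computes how long to wait after a rate-limited response, following GitHub's published guidance: honour retry-after when present, otherwise wait until x-ratelimit-reset when x-ratelimit-remaining is 0, otherwise wait at least one minute and back off exponentially. The function retryDelayMs and the ResponseHeaderMap type are hypothetical; this is not Soundcheck's actual implementation.

```typescript
// Hypothetical sketch of GitHub's documented retry guidance; not Soundcheck code.
type ResponseHeaderMap = { [name: string]: string | undefined };

// nowMs is the current time in milliseconds since the epoch;
// attempt is the number of retries already made (0 for the first retry).
function retryDelayMs(headers: ResponseHeaderMap, nowMs: number, attempt: number): number {
  // 1. retry-after is expressed in seconds.
  const retryAfterSeconds = Number(headers['retry-after']);
  if (headers['retry-after'] !== undefined && !isNaN(retryAfterSeconds)) {
    return retryAfterSeconds * 1000;
  }
  // 2. x-ratelimit-reset is a UTC epoch timestamp in seconds.
  if (headers['x-ratelimit-remaining'] === '0') {
    const resetSeconds = Number(headers['x-ratelimit-reset']);
    if (headers['x-ratelimit-reset'] !== undefined && !isNaN(resetSeconds)) {
      return Math.max(0, resetSeconds * 1000 - nowMs);
    }
  }
  // 3. Otherwise wait at least one minute, doubling on each further attempt.
  return 60000 * Math.pow(2, attempt);
}
```

Keeping the delay calculation a pure function of the headers and the current time makes the policy easy to test without calling the GitHub API.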
Defining GitHub Fact Collectors
This section describes the data shape and semantics of GitHub Fact Collectors.
Overall Shape Of A GitHub Fact Collector
The following is an example of a descriptor file for a GitHub Fact Collector:
---
frequency:
  cron: '0 * * * *'
initialDelay:
  seconds: 30
filter:
  kind: 'Component'
cache:
  duration:
    hours: 2
collects:
  - factName: branch_protections
    type: BranchProtections
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'website'
    cache: false
  - factName: branch_rules
    type: BranchRules
  - factName: repository_details
    type: RepositoryDetails
    cache: true
    exclude:
      - spec.type: 'documentation'
  - factName: repository_languages
    type: RepositoryLanguages
Below are the details for each field.
frequency
[optional]
The frequency at which the collector should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration.
This is the default frequency for each extractor.
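For readers unfamiliar with duration objects such as { minutes: 10 } or { hours: 2 }, the sketch below converts one into milliseconds. SimpleDuration is a local stand-in defined only for this example and covers fixed-length units; it is not the HumanDuration type itself, and how Soundcheck interprets durations internally is not shown here.

```typescript
// Local stand-in type for illustration. Calendar units (months, years) are
// omitted because they have no fixed length in milliseconds.
interface SimpleDuration {
  weeks?: number;
  days?: number;
  hours?: number;
  minutes?: number;
  seconds?: number;
  milliseconds?: number;
}

// Sums every unit that is present; missing units count as zero.
function durationToMs(duration: SimpleDuration): number {
  return (
    (duration.weeks || 0) * 604800000 +
    (duration.days || 0) * 86400000 +
    (duration.hours || 0) * 3600000 +
    (duration.minutes || 0) * 60000 +
    (duration.seconds || 0) * 1000 +
    (duration.milliseconds || 0)
  );
}
```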
initialDelay
[optional]
The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration.
batchSize
[optional]
The number of entities to collect facts for at once. Optional, the default value is 1.
Note: The Soundcheck Rate Limiting engine counts fact collection for a batch of entities as a single hit towards the rate limits, while the actual number of hits will be equal to the batchSize.
Example:
batchSize: 100
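The interaction between batchSize and the rate limiter is easy to get wrong, so here is a small arithmetic sketch. It assumes, as the note above describes, that the limiter counts one hit per batch while each batch produces batchSize actual hits. The function names are hypothetical.

```typescript
// Hypothetical helpers for illustration only.

// Upper bound on actual hits per limiter window when the limiter
// counts each batch as a single hit.
function worstCaseHitsPerWindow(limiterMax: number, batchSize: number): number {
  if (batchSize < 1) {
    throw new Error('batchSize must be at least 1');
  }
  return limiterMax * batchSize;
}

// Largest limiter max that keeps actual hits within the given quota.
function limiterMaxForQuota(quotaPerWindow: number, batchSize: number): number {
  if (batchSize < 1) {
    throw new Error('batchSize must be at least 1');
  }
  return Math.floor(quotaPerWindow / batchSize);
}
```

Under these assumptions, a limiter max of 4900 combined with a batchSize of 100 could produce up to 490000 actual hits per window, so the limiter max would need to drop to 50 to stay within a quota of 5000.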
filter
[optional]
A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. This is the default filter for each extractor.
exclude
[optional]
Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.
filter:
- kind: component
exclude:
- spec.type: documentation
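The sketch below shows one way filter and exclude can combine: an entity is collected when it matches the filter (if any) and does not match the exclude filter (if any). It assumes the usual Catalog filter semantics, where entries in an array are alternatives, keys within one entry must all match, and values are compared case-insensitively. All names are hypothetical, and the simple dotted-path lookup does not handle keys that themselves contain dots (such as annotation names).

```typescript
// Illustrative matcher only; not the Catalog API or Soundcheck implementation.
type FilterClause = { [key: string]: string | string[] };
type EntityFilter = FilterClause | FilterClause[];

// Reads a dotted path such as "spec.type" from a plain object.
function readPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const part of path.split('.')) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as { [key: string]: unknown })[part];
  }
  return current;
}

// Every key in the clause must match one of its allowed values.
function matchesClause(entity: unknown, clause: FilterClause): boolean {
  return Object.keys(clause).every(key => {
    const actual = readPath(entity, key);
    if (actual === undefined || actual === null) {
      return false;
    }
    const expected = clause[key];
    const allowed = Array.isArray(expected) ? expected : [expected];
    return allowed.some(v => v.toLowerCase() === String(actual).toLowerCase());
  });
}

// An array of clauses matches when at least one clause matches.
function matchesFilter(entity: unknown, filter: EntityFilter): boolean {
  const clauses = Array.isArray(filter) ? filter : [filter];
  return clauses.some(clause => matchesClause(entity, clause));
}

function shouldCollect(entity: unknown, filter?: EntityFilter, exclude?: EntityFilter): boolean {
  if (filter !== undefined && !matchesFilter(entity, filter)) {
    return false;
  }
  if (exclude !== undefined && matchesFilter(entity, exclude)) {
    return false;
  }
  return true;
}
```

With the filter and exclude shown above, a Component of type documentation is skipped in this sketch, while a Component of any other type is collected.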
cache
[optional]
Whether the collected facts should be cached and, if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field.
This is the default cache config for each extractor.
collects
[required]
An array describing which facts to collect and how to extract them. See below for details about the overall shape of a fact extractor.
Overall Shape Of A Fact Extractor
Each extractor supports the fields described below.
factName
[required]
The name of the fact to be extracted.
- Minimum length of 1
- Maximum length of 100
- Alphanumeric with single separator instances of periods, dashes, underscores, or forward slashes
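The constraints above can be sketched as a validation function. The regular expression encodes one reading of "single separator instances": separators may not appear twice in a row, and (as an additional assumption of this sketch) may not start or end the name. The names FACT_NAME_PATTERN and isValidFactName are hypothetical; Soundcheck's own validation may differ.

```typescript
// One interpretation of the factName rules; not Soundcheck's actual validator.
// Alphanumeric runs joined by a single '.', '-', '_' or '/'.
const FACT_NAME_PATTERN = /^[A-Za-z0-9]+(?:[.\-_\/][A-Za-z0-9]+)*$/;

function isValidFactName(name: string): boolean {
  return name.length >= 1 && name.length <= 100 && FACT_NAME_PATTERN.test(name);
}
```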
type
[required]
The type of the extractor (e.g. BranchProtections, RepositoryDetails).
frequency
[optional]
The frequency at which the fact extraction should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration.
If provided, it overrides the default frequency provided at the top level. If not provided, it defaults to the frequency provided at the top level. If neither extractor's frequency, nor default frequency is provided, the fact will only be collected on demand.
Example:
frequency:
  minutes: 10
batchSize
[optional]
The number of entities to collect facts for at once. Optional, the default value is 1. If provided, it overrides the default batchSize provided at the top level. If not provided, it defaults to the batchSize provided at the top level. If neither the extractor's batchSize nor the default batchSize is provided, the fact will be collected for one entity at a time.
Note: The Soundcheck Rate Limiting engine counts fact collection for a batch of entities as a single hit towards the rate limits, while the actual number of hits will be equal to the batchSize.
Example:
batchSize: 100
branch
[optional]
The branch to extract the fact from. If not provided, defaults to the repository's default branch.
filter
[optional]
A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. If provided, it overrides the default filter provided at the top level. If not provided, it defaults to the filter provided at the top level. If neither extractor's filter, nor default filter is provided, the fact will be collected for all entities.
exclude
[optional]
Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.
filter:
- kind: component
exclude:
- spec.type: documentation
cache
[optional]
Whether the collected facts should be cached and, if so, for how long. Possible values are true, false, or a nested { duration: HumanDuration } field.
If provided, it overrides the default cache config provided at the top level. If not provided, it defaults to the cache config provided at the top level. If neither extractor's cache nor default cache config is provided, the fact will not be cached.
Example:
cache:
  duration:
    hours: 24
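The override rules described for frequency, batchSize, filter, and cache all follow the same pattern: the extractor's own value wins, then the top-level default, then a built-in fallback. The sketch below models that resolution. The type and function names are hypothetical, and the types are simplified stand-ins for the real configuration schema.

```typescript
// Simplified, hypothetical model of the override rules; not Soundcheck code.
type Frequency = { cron: string } | { [unit: string]: number };
type CacheSetting = boolean | { duration: { [unit: string]: number } };

interface Overridable {
  frequency?: Frequency;
  filter?: unknown;
  cache?: CacheSetting;
  batchSize?: number;
}

interface ResolvedSettings {
  frequency: Frequency | undefined; // undefined: collected on demand only
  filter: unknown; // undefined: collected for all entities
  cache: CacheSetting; // falls back to false (not cached)
  batchSize: number; // falls back to 1 (one entity at a time)
}

function firstDefined<T>(own: T | undefined, fallback: T | undefined): T | undefined {
  return own !== undefined ? own : fallback;
}

function resolveExtractor(topLevel: Overridable, extractor: Overridable): ResolvedSettings {
  const cache = firstDefined(extractor.cache, topLevel.cache);
  const batchSize = firstDefined(extractor.batchSize, topLevel.batchSize);
  return {
    frequency: firstDefined(extractor.frequency, topLevel.frequency),
    filter: firstDefined(extractor.filter, topLevel.filter),
    cache: cache !== undefined ? cache : false,
    batchSize: batchSize !== undefined ? batchSize : 1,
  };
}
```

Note that the comparison is against undefined rather than a truthiness test, so an explicit cache: false on an extractor still overrides a top-level cache duration.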
Collecting BranchProtections Fact
The BranchProtections fact contains information about configured branch protections for a given branch in a GitHub repository.
Shape of A BranchProtections Fact Collector
The shape of a BranchProtections Fact Collector matches the Overall Shape Of A GitHub Fact Collector (restriction: type: BranchProtections).
The following is an example of the BranchProtections Fact Collector configuration:
collects:
  - factName: branch_protections
    type: BranchProtections
    frequency:
      cron: '0 * * * *'
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'website'
    cache: false
Shape of A BranchProtections Fact
The shape of a BranchProtections Fact is based on the Fact Schema.
For a description of the data collected regarding branch protection, refer to the GitHub API documentation.
The following is an example of the collected BranchProtections fact:
factRef: github:master/branch_protections
entityRef: component:default/queue-proxy
scope: master
timestamp: 2023-02-24T15:50+00Z
data:
  url: 'https://api.github.com/repos/backstage/backstage/branches/master/protection'
  required_pull_request_reviews:
    url: 'https://api.github.com/repos/backstage/backstage/branches/master/protection/required_pull_request_reviews'
    dismiss_stale_reviews: false
    require_code_owner_reviews: true
    required_approving_review_count: 2
    require_last_push_approval: false
  required_signatures:
    url: 'https://api.github.com/repos/backstage/backstage/branches/master/protection/required_signatures'
    enabled: false
  enforce_admins:
    url: 'https://api.github.com/repos/backstage/backstage/branches/master/protection/enforce_admins'
    enabled: false
  required_linear_history:
    enabled: false
  allow_force_pushes:
    enabled: true
  allow_deletions:
    enabled: true
  block_creations:
    enabled: true
  required_conversation_resolution:
    enabled: false
  lock_branch:
    enabled: false
  allow_fork_syncing:
    enabled: true
Shape of A BranchProtections Fact Check
The shape of a BranchProtections Fact Check matches the Shape of a Fact Check.
The following is an example of the BranchProtections fact checks:
soundcheck:
  checks:
    - id: requires_code_owner_reviews
      rule:
        factRef: github:master/branch_protections
        path: $.required_pull_request_reviews.require_code_owner_reviews
        operator: equal
        value: true
    - id: requires_at_least_two_approving_reviews
      rule:
        factRef: github:master/branch_protections
        path: $.required_pull_request_reviews.required_approving_review_count
        operator: greaterThanInclusive
        value: 2
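To make the relationship between a check rule and the fact data concrete, the sketch below evaluates a rule against a fact's data object. It is deliberately minimal: resolveSimplePath only understands plain dotted paths such as $.a.b (not full JSONPath), and the operator semantics are inferred from the operator names. All names here are hypothetical and this is not how Soundcheck itself evaluates checks.

```typescript
// Minimal, hypothetical rule evaluator for illustration only.
type Operator = 'equal' | 'greaterThan' | 'greaterThanInclusive' | 'lessThan';

interface CheckRule {
  path: string;
  operator: Operator;
  value: unknown;
}

// Supports only simple dotted paths like "$.a.b.c"; no wildcards or filters.
function resolveSimplePath(data: unknown, path: string): unknown {
  const parts = path.replace(/^\$\.?/, '').split('.').filter(p => p.length > 0);
  let current: unknown = data;
  for (const part of parts) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as { [key: string]: unknown })[part];
  }
  return current;
}

function evaluateRule(data: unknown, rule: CheckRule): boolean {
  const actual = resolveSimplePath(data, rule.path);
  const expected = rule.value;
  if (rule.operator === 'equal') {
    return actual === expected;
  }
  // The remaining operators only make sense for numbers in this sketch.
  if (typeof actual !== 'number' || typeof expected !== 'number') {
    return false;
  }
  switch (rule.operator) {
    case 'greaterThan':
      return actual > expected;
    case 'greaterThanInclusive':
      return actual >= expected;
    case 'lessThan':
      return actual < expected;
    default:
      return false;
  }
}
```

Applied to the example fact above, both example checks pass in this sketch: require_code_owner_reviews equals true, and required_approving_review_count (2) is greater than or equal to 2.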
The following is an example of the Soundcheck program that utilizes these checks:
- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: Demonstration of Soundcheck BranchProtections Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's GitHub BranchProtections Fact Extractor
      checks:
        - id: requires_code_owner_reviews
          name: Requires code owner reviews
          description: PR requires code owner reviews
        - id: requires_at_least_two_approving_reviews
          name: Requires at least two approving reviews
          description: PR requires at least two approving reviews
Collecting BranchRules Fact
The BranchRules fact contains information about configured branch rules for a given branch in a GitHub repository.
Shape of A BranchRules Fact Collector
The shape of a BranchRules Fact Collector matches the Overall Shape Of A GitHub Fact Collector (restriction: type: BranchRules).
The following is an example of the BranchRules Fact Collector configuration:
collects:
  - factName: branch_rules
    type: BranchRules
    frequency:
      cron: '0 * * * *'
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'website'
    cache: false
Shape of A BranchRules Fact
The shape of a BranchRules Fact is based on the Fact Schema.
For a description of the data collected regarding branch rules, refer to the GitHub API documentation.
The following is an example of the collected BranchRules fact:
factRef: github:default/branch_rules
entityRef: component:default/queue-proxy
scope: default
timestamp: 2024-12-10T15:50+00Z
data:
  rules:
    - type: commit_message_pattern
      ruleset_source_type: Repository
      ruleset_source: monalisa/my-repo
      ruleset_id: 42
      parameters:
        operator: starts_with
        pattern: issue
    - type: commit_author_email_pattern
      ruleset_source_type: Organization
      ruleset_source: my-org
      ruleset_id: 73
      parameters:
        operator: contains
        pattern: github
Shape of A BranchRules Fact Check
The shape of a BranchRules Fact Check matches the Shape of a Fact Check.
The following is an example of the BranchRules fact checks:
soundcheck:
  checks:
    - id: commit_message_pattern_is_set
      rule:
        factRef: github:default/branch_rules
        path: $.rules[*].type
        operator: contains
        value: commit_message_pattern
    - id: commit_message_pattern_matches_issues
      rule:
        factRef: github:default/branch_rules
        path: $.rules[?(@.type === 'commit_message_pattern')].parameters.pattern
        operator: matches
        value: .*issues.*
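The two paths in these checks select different things from the rules array: $.rules[*].type yields every rule type, while the filter expression yields the pattern of each commit_message_pattern rule. The sketch below reproduces those selections in plain TypeScript. It assumes that matches means a regular-expression test and that every selected value must match; both are assumptions about operator semantics, and the helper names are hypothetical.

```typescript
// Hypothetical helpers mirroring the two JSONPath selections; illustration only.
interface BranchRule {
  type: string;
  parameters?: { operator?: string; pattern?: string };
}

// Equivalent of $.rules[*].type
function ruleTypes(rules: BranchRule[]): string[] {
  return rules.map(rule => rule.type);
}

// Equivalent of $.rules[?(@.type === '<type>')].parameters.pattern
function patternsForType(rules: BranchRule[], type: string): string[] {
  const patterns: string[] = [];
  for (const rule of rules) {
    if (rule.type === type && rule.parameters && typeof rule.parameters.pattern === 'string') {
      patterns.push(rule.parameters.pattern);
    }
  }
  return patterns;
}

// Assumed semantics for "matches": every selected value passes the regex test.
function allMatch(values: string[], pattern: string): boolean {
  const regex = new RegExp(pattern);
  return values.length > 0 && values.every(value => regex.test(value));
}
```

Under these assumptions, the example fact above satisfies commit_message_pattern_is_set, but its pattern value issue does not match .*issues.*, so the second check would only pass for a ruleset whose pattern contains the text issues.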
The following is an example of the Soundcheck program that utilizes these checks:
- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: Demonstration of Soundcheck BranchRules Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's GitHub BranchRules Fact Extractor
      checks:
        - id: commit_message_pattern_is_set
          name: Commit message pattern is set
          description: Commit message pattern is set
        - id: commit_message_pattern_matches_issues
          name: Commit message pattern matches .*issues.*
          description: Commit message pattern matches .*issues.*
Collecting RepositoryDetails Fact
The RepositoryDetails fact contains information about a GitHub repository.
Shape of A RepositoryDetails Fact Collector
The shape of a RepositoryDetails Fact Collector matches the Overall Shape Of A GitHub Fact Collector (restriction: type: RepositoryDetails).
The following is an example of the RepositoryDetails Fact Collector configuration:
collects:
  - factName: repository_details
    type: RepositoryDetails
    frequency:
      cron: '0 * * * *'
    filter:
      - spec.lifecycle: 'production'
    cache: true
Shape of A RepositoryDetails Fact
The shape of a RepositoryDetails Fact is based on the Fact Schema.
For a description of the data collected about the repository, refer to the GitHub API documentation.
The following is an example of the collected RepositoryDetails fact:
factRef: github:default/repository_details
entityRef: component:default/queue-proxy
scope: default
timestamp: 2023-02-24T15:50+00Z
data:
  name: backstage
  full_name: backstage/backstage
  private: true
  html_url: 'https://github.com/backstage/backstage'
  description: null
  fork: false
  url: 'https://api.github.com/repos/backstage/backstage'
  homepage: null
  size: 3
  stargazers_count: 0
  watchers_count: 0
  language: null
  has_issues: true
  has_projects: true
  has_downloads: true
  has_wiki: true
  has_pages: false
  has_discussions: false
  forks_count: 0
  mirror_url: null
  archived: false
  disabled: false
  open_issues_count: 0
  license: null
  allow_forking: true
  is_template: false
  web_commit_signoff_required: false
  visibility: 'private'
  forks: 0
  open_issues: 0
  watchers: 0
  default_branch: master
  permissions:
    admin: true
    maintain: true
    push: true
    triage: true
    pull: true
  allow_squash_merge: true
  allow_merge_commit: true
  allow_rebase_merge: true
  allow_auto_merge: false
  delete_branch_on_merge: false
  allow_update_branch: false
  use_squash_pr_title_as_default: false
  squash_merge_commit_message: 'COMMIT_MESSAGES'
  squash_merge_commit_title: 'COMMIT_OR_PR_TITLE'
  merge_commit_message: 'PR_TITLE'
  merge_commit_title: 'MERGE_MESSAGE'
  security_and_analysis:
    secret_scanning:
      status: 'disabled'
    secret_scanning_push_protection:
      status: 'disabled'
  network_count: 0
  subscribers_count: 1
Shape of A RepositoryDetails Fact Check
The shape of a RepositoryDetails Fact Check matches the Shape of a Fact Check.
The following is an example of the RepositoryDetails fact checks:
soundcheck:
  checks:
    - id: allows_rebase_merge
      rule:
        factRef: github:default/repository_details
        path: $.allow_rebase_merge
        operator: equal
        value: true
    - id: has_less_than_ten_open_issues
      rule:
        factRef: github:default/repository_details
        path: $.open_issues
        operator: lessThan
        value: 10
The following is an example of the Soundcheck program that utilizes these checks:
- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: Demonstration of Soundcheck RepositoryDetails Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's GitHub RepositoryDetails Fact Extractor
      checks:
        - id: allows_rebase_merge
          name: Allows Rebase Merge
          description: Repository allows rebase merge
        - id: has_less_than_ten_open_issues
          name: Has Less Than 10 Open Issues
          description: GitHub Repository Has Less Than 10 Open Issues
Collecting RepositoryLanguages Fact
The RepositoryLanguages fact contains information about languages used in a GitHub repository.
Shape of A RepositoryLanguages Fact Collector
The shape of a RepositoryLanguages Fact Collector matches the Overall Shape Of A GitHub Fact Collector (restriction: type: RepositoryLanguages).
The following is an example of the RepositoryLanguages Fact Collector configuration:
collects:
  - factName: repository_languages
    type: RepositoryLanguages
    frequency:
      cron: '0 * * * *'
    filter:
      - spec.lifecycle: 'production'
    cache: true
Shape of A RepositoryLanguages Fact
The shape of a RepositoryLanguages Fact is based on the Fact Schema.
For a description of the data collected about repository languages, refer to the GitHub API documentation.
The following is an example of the collected RepositoryLanguages fact:
factRef: github:default/repository_languages
entityRef: component:default/queue-proxy
scope: default
timestamp: 2023-02-24T15:50+00Z
data:
  C: 78769
  Python: 7769
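The values in this fact are the number of bytes of code per language, as reported by the GitHub languages API. If a relative share is more useful than a raw byte count, it can be derived as in the sketch below. The function languageShares is a hypothetical helper and is not something Soundcheck provides.

```typescript
// Hypothetical helper for illustration only.
// Converts bytes-per-language into a fraction of the total (0 to 1).
function languageShares(bytesByLanguage: { [language: string]: number }): { [language: string]: number } {
  const languages = Object.keys(bytesByLanguage);
  let total = 0;
  for (const language of languages) {
    total += bytesByLanguage[language];
  }
  const shares: { [language: string]: number } = {};
  if (total === 0) {
    // Avoid dividing by zero for empty repositories.
    return shares;
  }
  for (const language of languages) {
    shares[language] = bytesByLanguage[language] / total;
  }
  return shares;
}
```

A language that does not appear in the repository has no key in the fact data at all, so lookups such as a Java entry in the example above come back undefined rather than 0.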
Shape of A RepositoryLanguages Fact Check
The shape of a RepositoryLanguages Fact Check matches the Shape of a Fact Check.
The following is an example of the RepositoryLanguages fact checks:
soundcheck:
  checks:
    - id: uses_python
      rule:
        factRef: github:default/repository_languages
        path: $.Python
        operator: greaterThan
        value: 0
    - id: uses_java
      rule:
        factRef: github:default/repository_languages
        path: $.Java
        operator: greaterThan
        value: 0
The following is an example of the Soundcheck program that utilizes these checks:
- id: demo
  name: Demo
  ownerEntityRef: group:default/owning_group
  description: Demonstration of Soundcheck RepositoryLanguages Fact Extractor
  levels:
    - ordinal: 1
      name: First level
      description: Checks leveraging Soundcheck's GitHub RepositoryLanguages Fact Extractor
      checks:
        - id: uses_python
          name: Uses Python
          description: Uses Python
        - id: uses_java
          name: Uses Java
          description: Uses Java