GitHub
Similar to the Source Control Management (SCM) integration plugin, the GitHub integration plugin for Soundcheck provides out-of-the-box integration with GitHub. It builds on Backstage's GitHub integration to collect facts from GitHub repositories.
The purpose of the GitHub integration plugin is to provide GitHub-specific fact collection (like branch protections), while the SCM integration plugin provides the collection of facts based on repository content.
The GitHub integration plugin supports the collection of the following facts:
- BranchProtections
- BranchRules
- CodeScanningAlerts
- DependabotAlerts
- RepositoryDetails
- RepositoryLanguages
- SecretScanningAlerts
- SecurityAdvisories
Prerequisites
Configure GitHub integration in Backstage
Integrations are configured at the root level of app-config.yaml. Here's an example configuration for GitHub:
integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
Consult the Backstage GitHub integration instructions for full configuration details.
Add the GitHubFactCollector to Soundcheck
GitHub integration for Soundcheck is not installed by default. It must be manually installed and configured for the GitHub Fact Collector to work.
First, add the @spotify/backstage-plugin-soundcheck-backend-module-github package:
yarn workspace backend add @spotify/backstage-plugin-soundcheck-backend-module-github
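Then add the following to your packages/backend/src/index.ts file:
import { createBackend } from '@backstage/backend-defaults';

const backend = createBackend();
// Register the Soundcheck backend plugin, then the GitHub integration module.
backend.add(import('@spotify/backstage-plugin-soundcheck-backend'));
backend.add(
  import('@spotify/backstage-plugin-soundcheck-backend-module-github'),
);
// ...
backend.start();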
Consult the Soundcheck Backend documentation for additional details on setting up the Soundcheck backend.
Entity configuration
To determine which repository to use, the GitHub integration reads the value of the backstage.io/source-location annotation. In many cases this annotation is set for you. If it is not, add it to your catalog-info.yaml file. Here's a simple example:
metadata:
  annotations:
    backstage.io/source-location: url:https://github.com/my-org/my-service/tree/main
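To make the annotation format concrete, here is a minimal sketch of how a value such as `url:https://github.com/my-org/my-service/tree/main` could be split into host, owner, and repository name. The helper `parseSourceLocation` and the `RepoRef` type are hypothetical names for illustration only; this is not the plugin's actual parsing logic, which this page does not describe.

```typescript
// Illustrative only: hypothetical helper, not part of Soundcheck or Backstage.
interface RepoRef {
  host: string;
  owner: string;
  repo: string;
}

// Parses an annotation value such as
// "url:https://github.com/my-org/my-service/tree/main".
// Returns undefined when the value has no "url:" prefix, is not a valid URL,
// or does not contain both an owner and a repository segment.
function parseSourceLocation(annotation: string): RepoRef | undefined {
  const prefix = 'url:';
  if (!annotation.startsWith(prefix)) return undefined;

  let url: URL;
  try {
    url = new URL(annotation.slice(prefix.length));
  } catch {
    return undefined;
  }

  // The first two non-empty path segments are taken as owner and repository.
  const [owner, repo] = url.pathname.split('/').filter(Boolean);
  if (!owner || !repo) return undefined;
  return { host: url.host, owner, repo };
}

console.log(
  parseSourceLocation('url:https://github.com/my-org/my-service/tree/main'),
);
```

A quick check like this can help confirm that an entity's annotation points at a repository root or subtree rather than at an unrelated URL.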
Plugin Configuration
The collection of facts is driven by configuration. To learn more about the configuration, jump to the Defining GitHub Fact Collectors section.
GitHub Fact Collector can be configured via YAML or No-Code UI. If you configure it via both YAML and No-Code UI, the configurations will be merged. It's preferable to choose a single source for the Fact Collectors configuration (either No-Code UI or YAML) to avoid confusing merge results.
Default Configuration
To add the default initial configuration of GitHub Fact Collector on startup, the following flag must be set to true in the app-config.yaml file:
soundcheck:
  addStartingConfigurations:
    collectors: true
This configuration is required to be able to collect the facts necessary for the pre-canned checks and tracks.
Note: This configuration is stored in the database and can be edited via the No-Code UI. To supply it via YAML instead, add the following config:
soundcheck:
  collectors:
    github:
      collects:
        - factName: branch_protections
          type: BranchProtections
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 3 * * *'
        - factName: branch_rules
          type: BranchRules
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 9 * * *'
        - factName: code_scanning_alerts
          type: CodeScanningAlerts
          state: open
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 11 * * *'
        - factName: dependabot_alerts
          type: DependabotAlerts
          states:
            - open
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 13 * * *'
        - factName: repository_details
          type: RepositoryDetails
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 5 * * *'
        - factName: repository_languages
          type: RepositoryLanguages
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 7 * * *'
        - factName: secret_scanning_alerts
          type: SecretScanningAlerts
          state: open
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 15 * * *'
        - factName: security_advisories
          type: SecurityAdvisories
          state: published
          cache: false
          filter:
            - kind:
                - Component
          frequency:
            cron: '7 17 * * *'
No-Code UI Configuration Option
- Make sure the prerequisite Configure GitHub integration in Backstage is completed and GitHub instance details are configured.
- To enable the GitHub Integration, go to Soundcheck > Integrations > GitHub and click the Configure button. To learn more about the No-Code UI config, see Configuring a fact collector (integration) via the no-code UI.
YAML Configuration Option
- Create a github-facts-collectors.yaml file in the root of your Backstage repository and fill in all your GitHub Fact Collectors. A simple example GitHub Fact Collector is listed below.
---
frequency:
  cron: '0 * * * *'
collects:
  - factName: branch_protections
    type: BranchProtections
  - factName: branch_rules
    type: BranchRules
  - factName: code_scanning_alerts
    type: CodeScanningAlerts
    state: open
  - factName: dependabot_alerts
    type: DependabotAlerts
    states:
      - open
  - factName: repository_details
    type: RepositoryDetails
  - factName: repository_languages
    type: RepositoryLanguages
  - factName: secret_scanning_alerts
    type: SecretScanningAlerts
    state: open
  - factName: security_advisories
    type: SecurityAdvisories
    state: published
Note: this file will be loaded at runtime along with the rest of your Backstage configuration files. Therefore, make sure that it's available in deployed environments in the same way as your app-config.yaml files are.
- Add a soundcheck collectors field to app-config.yaml and reference the newly created github-facts-collectors.yaml:
# app-config.yaml
soundcheck:
  collectors:
    github:
      $include: ./github-facts-collectors.yaml
Rate Limiting (Optional)
This fact collector can be rate limited in Soundcheck using the following configuration:
soundcheck:
  job:
    workers:
      github:
        limiter:
          max: 4900
          duration: 3600000
The GitHub API has a limit of 5000 requests per hour (15000 for Enterprise). We recommend setting your rate limit somewhat below this. In the example above, the limit is 4900 executions per hour (a duration of 3600000 corresponds to one hour in milliseconds).
This fact collector handles rate limit errors per the recommendation from GitHub. Soundcheck will automatically wait and retry requests that are rate limited.
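For readers who want a feel for what "wait and retry" can look like, the following is a rough sketch based on GitHub's published REST API guidance as commonly summarized: honor `retry-after` if present, otherwise wait until `x-ratelimit-reset` when `x-ratelimit-remaining` is `0`, otherwise wait at least a minute and back off exponentially. The function `retryDelayMs` and the `ResponseHeaders` type are hypothetical; this is an assumption-laden illustration and not Soundcheck's actual implementation.

```typescript
// Hypothetical helper for illustration; not part of Soundcheck or a GitHub SDK.
// Header names are assumed to be lower-cased by the caller.
type ResponseHeaders = Record<string, string | undefined>;

// Returns how long to wait (in milliseconds) before retrying a rate-limited
// request. `nowMs` is the current time in epoch milliseconds and `attempt`
// is the zero-based retry attempt number.
function retryDelayMs(
  headers: ResponseHeaders,
  nowMs: number,
  attempt: number,
): number {
  // 1. An explicit retry-after header (in seconds) takes priority.
  const retryAfterRaw = headers['retry-after'];
  if (retryAfterRaw !== undefined) {
    const retryAfterSec = Number(retryAfterRaw);
    if (Number.isFinite(retryAfterSec)) {
      return Math.max(0, retryAfterSec * 1000);
    }
  }

  // 2. If the quota is exhausted, wait until the reset time (epoch seconds).
  if (headers['x-ratelimit-remaining'] === '0') {
    const resetSec = Number(headers['x-ratelimit-reset']);
    if (Number.isFinite(resetSec)) {
      return Math.max(0, resetSec * 1000 - nowMs);
    }
  }

  // 3. Otherwise wait at least one minute, doubling on each further attempt.
  return 60000 * Math.pow(2, Math.max(0, attempt));
}

console.log(retryDelayMs({ 'retry-after': '30' }, Date.now(), 0));
```

Passing the current time in as a parameter keeps the function pure, which makes the three branches easy to verify in isolation.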
Defining GitHub Fact Collectors
This section describes the data shape and semantics of GitHub Fact Collectors.
Overall Shape Of A GitHub Fact Collector
The following is an example of a descriptor file for a GitHub Fact Collector:
---
frequency:
  cron: '0 * * * *'
initialDelay:
  seconds: 30
filter:
  kind: 'Component'
cache:
  duration:
    hours: 2
collects:
  - factName: branch_protections
    type: BranchProtections
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'website'
    cache: false
  - factName: branch_rules
    type: BranchRules
  - factName: code_scanning_alerts
    type: CodeScanningAlerts
    state: open
  - factName: dependabot_alerts
    type: DependabotAlerts
    states:
      - open
  - factName: repository_details
    type: RepositoryDetails
    cache: true
    exclude:
      - spec.type: 'documentation'
  - factName: repository_languages
    type: RepositoryLanguages
  - factName: secret_scanning_alerts
    type: SecretScanningAlerts
    state: open
  - factName: security_advisories
    type: SecurityAdvisories
    state: published
Below are the details for each field.
frequency
[optional]
The frequency at which the collector should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration.
This is the default frequency for each collector.
initialDelay
[optional]
The amount of time that should pass before the first invocation happens. Possible values are either a cron expression { cron: ... } or HumanDuration.
batchSize
[optional]
The number of entities to collect facts for at once. The default value is 1.
Note: The Soundcheck Rate Limiting engine counts fact collection for a batch of entities as a single hit towards the rate limits, while the actual number of hits will be equal to the batchSize.
Example:
batchSize: 100
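Because a batch counts as one hit for the Soundcheck limiter but produces batchSize actual hits, the limiter's max may need to be scaled down when batching. The arithmetic can be sketched as follows. The function names are hypothetical, the duration is assumed to be in milliseconds (consistent with the 3600000 value used for one hour in the rate limiting example), and the sketch assumes one GitHub API request per entity, which may undercount for fact types that need several requests.

```typescript
// Illustrative arithmetic only; these helpers are not part of Soundcheck.
interface LimiterConfig {
  max: number; // executions allowed per window
  duration: number; // window length, assumed to be in milliseconds
}

const MS_PER_HOUR = 3600000;

// Worst-case number of actual hits per hour when every execution
// processes a full batch (assumes one request per entity).
function worstCaseHitsPerHour(
  limiter: LimiterConfig,
  batchSize: number,
): number {
  const windowsPerHour = MS_PER_HOUR / limiter.duration;
  return limiter.max * windowsPerHour * batchSize;
}

// Largest limiter max that keeps the worst case within an hourly budget.
function maxForHourlyBudget(
  hourlyBudget: number,
  durationMs: number,
  batchSize: number,
): number {
  const windowsPerHour = MS_PER_HOUR / durationMs;
  return Math.floor(hourlyBudget / (windowsPerHour * batchSize));
}

// With max 4900 per hour and batchSize 100, the worst case is far above
// a 5000-per-hour quota, so max would need to be lowered.
console.log(worstCaseHitsPerHour({ max: 4900, duration: MS_PER_HOUR }, 100));
console.log(maxForHourlyBudget(4900, MS_PER_HOUR, 100));
```

Under these assumptions, a batchSize of 100 combined with an hourly budget of 4900 hits suggests a limiter max of 49.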
filter
[optional]
A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. This is the default filter for each collector.
See filters for more details.
exclude
[optional]
Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.
filter:
  - kind: component
exclude:
  - spec.type: documentation
cache
[optional]
Whether the collected facts should be cached, and if so for how long. Possible values are either true or false or a nested { duration: HumanDuration } field.
This is the default cache config for each collector.
collects
[required]
An array describing which facts to collect and how to collect them. See below for details about the overall shape of a fact collector.
Overall Shape Of A Fact Collector
Each collector supports the fields described below.
factName
[required]
The name of the fact to be collected.
- Minimum length of 1
- Maximum length of 100
- Alphanumeric with single separator instances of periods, dashes, underscores, or forward slashes
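The factName constraints above can be approximated with a small validator. This is one reading of the rule (alphanumeric runs joined by exactly one separator at a time, with no leading or trailing separator); the pattern and the function `isValidFactName` are assumptions for illustration, and the validation Soundcheck actually performs may differ in edge cases.

```typescript
// One possible interpretation of the factName rules; not Soundcheck's own validator.
// Alphanumeric runs separated by a single '.', '-', '_' or '/'.
const FACT_NAME_PATTERN = /^[A-Za-z0-9]+(?:[._\/-][A-Za-z0-9]+)*$/;

function isValidFactName(name: string): boolean {
  return (
    name.length >= 1 &&
    name.length <= 100 &&
    FACT_NAME_PATTERN.test(name)
  );
}

console.log(isValidFactName('branch_protections'));
console.log(isValidFactName('branch__protections'));
```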
type
[required]
The type of the collector (e.g. BranchProtections, RepositoryDetails).
frequency
[optional]
The frequency at which the fact collection should be executed. Possible values are either a cron expression { cron: ... } or HumanDuration.
If provided, it overrides the default frequency provided at the top level. If not provided, it defaults to the frequency provided at the top level. If neither collector's frequency, nor default frequency is provided, the fact will only be collected on demand.
Example:
frequency:
  minutes: 10
batchSize
[optional]
The number of entities to collect facts for at once. If provided, it overrides the default batchSize provided at the top level. If not provided, it defaults to the batchSize provided at the top level. If neither the collector's batchSize nor a default batchSize is provided, the fact will be collected for one entity at a time.
Note: The Soundcheck Rate Limiting engine counts fact collection for a batch of entities as a single hit towards the rate limits, while the actual number of hits will be equal to the batchSize.
Example:
batchSize: 100
filter
[optional]
A filter specifying which entities to collect the specified facts for. Matches the filter format used by the Catalog API. If provided, it overrides the default filter provided at the top level. If not provided, it defaults to the filter provided at the top level. If neither collector's filter, nor default filter is provided, the fact will be collected for all entities.
exclude
[optional]
Entities matching this filter will be skipped during the fact collection process. Can be used in combination with filter. Matches the filter format used by the Catalog API.
filter:
  - kind: component
exclude:
  - spec.type: documentation
cache
[optional]
Whether the collected facts should be cached, and if so for how long. Possible values are either true or false or a nested { duration: HumanDuration } field.
If provided, it overrides the default cache config provided at the top level. If not provided, it defaults to the cache config provided at the top level. If neither collector's cache nor default cache config is provided, the fact will not be cached.
Example:
cache:
  duration:
    hours: 24
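The override rules described for frequency, batchSize, filter, and cache all follow the same pattern: the collector-level value wins, then the top-level default, then a built-in fallback. A minimal sketch of that resolution is shown below. All type and function names here are hypothetical and simplified (for instance, HumanDuration is modeled as a plain record of numbers); this illustrates the documented semantics and is not Soundcheck's code.

```typescript
// Simplified, hypothetical types for illustration only.
type Frequency = { cron: string } | Record<string, number>;
type CacheConfig = boolean | { duration: Record<string, number> };
type EntityFilter = Record<string, string | string[]>;

interface CollectorDefaults {
  frequency?: Frequency;
  batchSize?: number;
  filter?: EntityFilter | EntityFilter[];
  cache?: CacheConfig;
}

interface CollectorEntry extends CollectorDefaults {
  factName: string;
  type: string;
}

interface ResolvedCollector {
  factName: string;
  type: string;
  frequency: Frequency | undefined; // undefined: collected on demand only
  batchSize: number;
  filter: EntityFilter | EntityFilter[] | undefined; // undefined: all entities
  cache: CacheConfig;
}

// Collector-level values override top-level defaults. `??` is used instead
// of `||` so that an explicit `cache: false` is not replaced by a default.
function resolveCollector(
  defaults: CollectorDefaults,
  entry: CollectorEntry,
): ResolvedCollector {
  return {
    factName: entry.factName,
    type: entry.type,
    frequency: entry.frequency ?? defaults.frequency,
    batchSize: entry.batchSize ?? defaults.batchSize ?? 1,
    filter: entry.filter ?? defaults.filter,
    cache: entry.cache ?? defaults.cache ?? false,
  };
}

const topLevel: CollectorDefaults = {
  frequency: { cron: '0 * * * *' },
  filter: { kind: 'Component' },
  cache: { duration: { hours: 2 } },
};

console.log(
  resolveCollector(topLevel, {
    factName: 'branch_protections',
    type: 'BranchProtections',
    cache: false,
  }),
);
```

In this usage example the collector inherits the hourly cron and the Component filter from the top level, while its explicit `cache: false` takes precedence over the two-hour default.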
Collecting BranchProtections Fact
The BranchProtections fact contains information about configured branch protections for a default branch in a GitHub repository.
Prerequisites:
- Grant your GitHub app or access token the necessary permissions listed in the Get branch protection GitHub API documentation.
Shape of A BranchProtections Fact Collector
The shape of a BranchProtections Fact Collector matches the Overall Shape Of A Fact Collector (restriction: type: BranchProtections).
The following is an example of the BranchProtections Fact Collector configuration:
collects:
  - factName: branch_protections
    type: BranchProtections
    frequency:
      cron: '0 * * * *'
    filter:
      - spec.lifecycle: 'production'
        spec.type: 'website'
    cache: false
Shape of A BranchProtections Fact
The shape of a BranchProtections Fact is based on the Fact Schema.
For a description of the data collected regarding branch protection, refer to the GitHub API documentation.
The following is an example of the collected BranchProtections fact:
factRef: github:default/branch_protections
entityRef: component:default/queue-proxy
timestamp: 2023-02-24T15:50+00Z
data:
  url: 'https://api.github.com/repos/backstage/backstage/branches/main/protection'
  required_pull_request_reviews:
    url: 'https://api.github.com/repos/backstage/backstage/branches/main/protection/required_pull_request_reviews'
    dismiss_stale_reviews: false
    require_code_owner_reviews: true
    required_approving_review_count: 2
    require_last_push_approval: false
  required_signatures:
    url: 'https://api.github.com/repos/backstage/backstage/branches/main/protection/required_signatures'
    enabled: false
  enforce_admins:
    url: 'https://api.github.com/repos/backstage/backstage/branches/main/protection/enforce_admins'
    enabled: false
  required_linear_history:
    enabled: false
  allow_force_pushes:
    enabled: true
  allow_deletions:
    enabled: true
  block_creations:
    enabled: true
  required_conversation_resolution:
    enabled: false
  lock_branch:
    enabled: false
  allow_fork_syncing:
    enabled: true
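To illustrate how the fields of such a fact might be read, here is a small sketch that inspects a subset of the data payload and lists potential concerns. The `BranchProtectionsData` interface covers only a few of the fields from the example, and `branchProtectionFindings` with its threshold parameter is a hypothetical helper; Soundcheck checks are defined through their own configuration, so treat this purely as a way to reason about the fact's shape.

```typescript
// Partial, hypothetical typing of the fact's data payload; only the fields
// used below are modeled.
interface EnabledFlag {
  enabled: boolean;
}

interface BranchProtectionsData {
  required_pull_request_reviews?: {
    required_approving_review_count?: number;
    require_code_owner_reviews?: boolean;
  };
  allow_force_pushes?: EnabledFlag;
  allow_deletions?: EnabledFlag;
}

// Returns human-readable findings for settings that weaken branch protection.
// `minApprovals` is an illustrative threshold chosen by the caller.
function branchProtectionFindings(
  data: BranchProtectionsData,
  minApprovals: number,
): string[] {
  const findings: string[] = [];
  const approvals =
    data.required_pull_request_reviews?.required_approving_review_count ?? 0;

  if (approvals < minApprovals) {
    findings.push(`only ${approvals} approving review(s) required`);
  }
  if (data.allow_force_pushes?.enabled) {
    findings.push('force pushes are allowed');
  }
  if (data.allow_deletions?.enabled) {
    findings.push('branch deletion is allowed');
  }
  return findings;
}

// Values taken from the example fact above.
const exampleData: BranchProtectionsData = {
  required_pull_request_reviews: {
    required_approving_review_count: 2,
    require_code_owner_reviews: true,
  },
  allow_force_pushes: { enabled: true },
  allow_deletions: { enabled: true },
};

console.log(branchProtectionFindings(exampleData, 2));
```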