Overview
The Software Catalog in Backstage is intended to capture human mental models through entities and their relationships, rather than to be an exhaustive inventory of every possible thing. The focus is on attaching functionality and views to these entities. Deciding where the “edge” lies, that is, where the catalog ends and the external world begins, is crucial to keeping the catalog’s scope appropriate.
The catalog serves as a centralized hub for organizing and discovering software components and services. While it excels at providing a high-level overview of these concepts, it may not be the ideal tool for tracking dynamic relationships between components and services in real time. You can achieve real-time views by attaching appropriate tooling to the nodes in the graph through annotations and by developing custom front-end plugins that display deployment information and other real-time data.
The Backstage Software Catalog should not be considered the ultimate source of truth. Instead, treat it as a cache that serves information to the catalog UI and other Backstage plugins through a REST API. A GitOps approach is recommended for changing catalog data: treat the YAML files in repositories as the primary source of truth, and use the Scaffolder to make changes via the UI, which generates a pull request in the repository with the updated changes.
Descriptor Components used to build the Catalog Graph
Entities: An entity is a node in the graph that represents a distinct object, concept, or thing. Nodes are the fundamental building blocks of a graph database and are used to represent entities and their properties.
Use cases out of the box
The catalog builds a graph using descriptors as nodes and relations as edges. Out of the box you get the following use cases:
- Ownership tracking
- Inventory
- Search
- Lifecycle tracking
- Tracking of real-time information sources
- Dependency mapping
- API exposure
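The graph model and two of the use cases above (ownership tracking and dependency mapping) can be sketched in a few lines of TypeScript. This is a simplified illustration under stated assumptions, not Backstage's actual implementation: the `CatalogEntity` and `EntityRelation` shapes, the helper functions `buildGraph`, `targetsOf`, and `transitiveDependencies`, and all entity names are hypothetical and exist only for this example.

```typescript
// Hypothetical, simplified shapes; real catalog entities carry more fields.
type RelationType = "ownedBy" | "dependsOn";

interface EntityRelation {
  type: RelationType;
  targetRef: string; // e.g. "group:payments-team"
}

interface CatalogEntity {
  ref: string; // e.g. "component:checkout-service"
  relations: EntityRelation[];
}

// Index entities (nodes) by ref so that relations (edges) can be followed.
function buildGraph(entities: CatalogEntity[]): Map<string, CatalogEntity> {
  const index = new Map<string, CatalogEntity>();
  for (const entity of entities) {
    index.set(entity.ref, entity);
  }
  return index;
}

// Collect the targets of all edges of one type leaving a single node.
function targetsOf(
  index: Map<string, CatalogEntity>,
  ref: string,
  type: RelationType,
): string[] {
  const entity = index.get(ref);
  if (!entity) return []; // unknown refs simply have no outgoing edges
  return entity.relations.filter(r => r.type === type).map(r => r.targetRef);
}

// Dependency mapping: follow dependsOn edges transitively, guarding against cycles.
function transitiveDependencies(
  index: Map<string, CatalogEntity>,
  ref: string,
): string[] {
  const seen = new Set<string>();
  const stack = targetsOf(index, ref, "dependsOn");
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (next === ref || seen.has(next)) continue;
    seen.add(next);
    stack.push(...targetsOf(index, next, "dependsOn"));
  }
  return Array.from(seen).sort();
}

// Made-up sample data for the sketch.
const graph = buildGraph([
  {
    ref: "component:checkout-service",
    relations: [
      { type: "ownedBy", targetRef: "group:payments-team" },
      { type: "dependsOn", targetRef: "component:payment-library" },
      { type: "dependsOn", targetRef: "resource:orders-db" },
    ],
  },
  {
    ref: "component:payment-library",
    relations: [
      { type: "ownedBy", targetRef: "group:payments-team" },
      { type: "dependsOn", targetRef: "component:crypto-library" },
    ],
  },
  {
    ref: "resource:orders-db",
    relations: [{ type: "ownedBy", targetRef: "group:platform-team" }],
  },
]);

// Ownership tracking: who owns the checkout service?
console.log(targetsOf(graph, "component:checkout-service", "ownedBy"));
// Dependency mapping: everything the checkout service depends on, directly or indirectly.
console.log(transitiveDependencies(graph, "component:checkout-service"));
```

Keeping the relations on each node makes both queries a plain edge traversal. Note that a dependency target that is not itself registered (here `component:crypto-library`) still shows up in the result, which is one way the "edge" of the catalog becomes visible.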
Tracking Assets
The recommended approach is to represent information in catalog-info files that the users themselves manage. Automated classification based on repository contents can be helpful, but it is best used only to generate the initial file, which humans then maintain manually. Automation can fail, and the accuracy and reliability of this metadata is essential. In short, humans should govern this metadata to maintain its integrity.
Well-known trackable assets
Components
- Services
- Websites
- Libraries
- Data Pipelines
- Machine Learning Models
- Third-party software components: It is recommended to keep a separate repo for all third-party catalog-info files.
- Jira installation
- PagerDuty
- Physical resources: This is probably more useful for longer-lived ones (for example, servers).
- Cloud Infrastructure services
- Business units
- Team
- Product area
- LDAP: Use internal LDAP usernames as entity names, e.g., owner: user:my-user or user:my-team-name.
- OpenAPI
- AsyncAPI
- GraphQL
- gRPC
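To tie the asset kinds above back to the guidance in Tracking Assets, here is a minimal TypeScript sketch of an automated step that proposes a starter descriptor only when a repository has no catalog-info file yet, and otherwise leaves the human-maintained file alone. The `DetectedRepo` shape, the `ownerRef` and `proposeDescriptor` helpers, the `experimental` default lifecycle, and the sample names are all assumptions for illustration. The emitted fields follow the general layout of a Backstage Component descriptor, but check the descriptor format documentation before relying on it.

```typescript
// Hypothetical result of an automated repository scan.
interface DetectedRepo {
  repoName: string;
  looksLike: "service" | "website" | "library";
  ownerGuess: string; // e.g. an internal LDAP username or a team name
  ownerKind: "user" | "group";
}

// Simplified descriptor shape; real descriptors allow many more fields.
interface ComponentDescriptor {
  apiVersion: string;
  kind: "Component";
  metadata: { name: string };
  spec: { type: string; lifecycle: string; owner: string };
}

// Format an owner reference such as "user:my-user".
function ownerRef(kind: "user" | "group", ownerName: string): string {
  return `${kind}:${ownerName.trim()}`;
}

// Propose a starter descriptor only if no catalog-info file exists yet.
// Once a file exists, humans govern it and automation stays out of the way.
function proposeDescriptor(
  existingFile: string | undefined,
  repo: DetectedRepo,
): ComponentDescriptor | undefined {
  if (existingFile !== undefined) return undefined;
  return {
    apiVersion: "backstage.io/v1alpha1",
    kind: "Component",
    metadata: { name: repo.repoName },
    spec: {
      type: repo.looksLike,
      lifecycle: "experimental", // assumed placeholder for a human to review
      owner: ownerRef(repo.ownerKind, repo.ownerGuess),
    },
  };
}

// Made-up scan result for a repository without a catalog-info file.
const proposal = proposeDescriptor(undefined, {
  repoName: "docs-site",
  looksLike: "website",
  ownerGuess: "my-user",
  ownerKind: "user",
});

// Printed as JSON for brevity; a real setup would write YAML into a pull request.
console.log(JSON.stringify(proposal, null, 2));
```

Returning `undefined` for repositories that already have a file keeps the automation limited to the initial suggestion, which matches the recommendation that people, not scanners, remain responsible for this metadata.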