Introduction
You're a developer who just joined a new company. To deploy your first feature, you need to: provision a database, set up CI/CD, configure monitoring, request firewall rules, create Kubernetes manifests, and wait 3 days for ops to approve.
This is why platform engineering exists.
Platform engineering builds internal developer platforms (IDPs) that abstract infrastructure complexity behind self-service interfaces. Instead of opening 5 Jira tickets to deploy, developers click a button or run one CLI command.
What is Platform Engineering?
Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service for engineering organizations. The platform team treats the platform as a product, with developers as their customers.
| Traditional Ops | Platform Engineering |
|---|---|
| Tickets and manual approvals | Self-service APIs and UIs |
| "You're doing it wrong" | Golden paths that guide |
| Infrastructure as a service | Platform as a product |
| Siloed expertise | Shared capabilities |
The core deliverables:
- Golden paths — Pre-built, opinionated workflows for common tasks
- Self-service portal — A UI or CLI where developers provision resources
- Paved road documentation — Tutorials that follow the golden paths
- Automated compliance — Security, cost, and governance built into the platform
Building an IDP: The Minimum Viable Platform
Start with the most painful developer friction and solve that first:
Tier 1: Application scaffolding
A template that creates a repo with CI/CD, Dockerfile, and basic Kubernetes manifests. Developers run:
platform create service --name=checkout-api --language=go
This generates:
checkout-api/
  .github/workflows/ci.yml
  Dockerfile
  kubernetes/
    deployment.yaml
    service.yaml
  README.md (with deploy instructions)
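The scaffolding step can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the `platform` CLI: the `TEMPLATE` mapping, the placeholder file contents, and the `create_service` function are all hypothetical names chosen for the example.

```python
from pathlib import Path

# Hypothetical template: relative path -> file contents with a {name} placeholder.
TEMPLATE = {
    ".github/workflows/ci.yml": "name: ci-{name}\n",
    "Dockerfile": "# build image for {name}\n",
    "kubernetes/deployment.yaml": "# Deployment for {name}\n",
    "kubernetes/service.yaml": "# Service for {name}\n",
    "README.md": "# {name}\n\nDeploy instructions go here.\n",
}


def create_service(name: str, root: Path) -> list[str]:
    """Write the template files under root/name and return their relative paths."""
    service_dir = root / name
    if service_dir.exists():
        # Refuse to overwrite an existing service directory.
        raise FileExistsError(f"{service_dir} already exists")
    created = []
    for rel_path, content in TEMPLATE.items():
        target = service_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content.format(name=name))
        created.append(rel_path)
    return sorted(created)
```

Calling `create_service("checkout-api", Path("."))` would produce the layout shown above. A real scaffolder would also create the remote repository and wire up CI credentials.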
Tier 2: Environment provisioning
Instead of "create a staging database," provide:
# platform.yaml in the service repo
environments:
  staging:
    database: postgresql-16
    resources: small
    auto_deploy: true
  production:
    database: postgresql-16-ha
    resources: large
    auto_deploy: false
    requires_approval: true
A GitOps pipeline (ArgoCD or Flux) reconciles this declaration with the actual infrastructure.
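Conceptually, reconciliation means comparing the declared state with what actually exists and deriving the actions that close the gap. The sketch below illustrates only that idea. It is not how ArgoCD or Flux are implemented, and `plan_changes` and the sample dictionaries are hypothetical.

```python
def plan_changes(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """Return (action, environment) pairs that bring `actual` in line with `desired`."""
    actions = []
    for env, spec in desired.items():
        if env not in actual:
            actions.append(("create", env))
        elif actual[env] != spec:
            actions.append(("update", env))
    for env in actual:
        if env not in desired:
            # Anything not declared in platform.yaml is treated as drift.
            actions.append(("delete", env))
    return actions


# Declared state, mirroring part of platform.yaml.
desired = {
    "staging": {"database": "postgresql-16", "resources": "small"},
    "production": {"database": "postgresql-16-ha", "resources": "large"},
}
# Hypothetical current state of the infrastructure.
actual = {
    "staging": {"database": "postgresql-15", "resources": "small"},
    "preview": {"database": "postgresql-16", "resources": "small"},
}

plan = plan_changes(desired, actual)
```

Here `plan` contains an update for staging, a create for production, and a delete for the undeclared preview environment. A GitOps controller runs this kind of comparison continuously rather than once.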
Tier 3: Observability by default
Every service gets monitoring, logging, and alerting without configuration:
// No code needed — the platform injects:
// - OTel instrumentation via sidecar
// - Prometheus metrics endpoint
// - Log forwarding to Loki
// - Default dashboards in Grafana
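One way a platform can inject these defaults is to patch each workload manifest before it is applied, in the style of a mutating admission step. The following is a sketch under stated assumptions: the annotation keys follow a common Prometheus scraping convention rather than any Kubernetes built-in, and the sidecar name, image, and port are placeholders.

```python
import copy

# Hypothetical platform defaults. The annotation keys are a widely used
# convention for Prometheus scraping, not part of the Kubernetes API itself.
DEFAULT_ANNOTATIONS = {
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "9090",
}
# Placeholder sidecar definition; the image name is an assumption.
OTEL_SIDECAR = {"name": "otel-collector", "image": "example.com/otel-collector:latest"}


def inject_observability(deployment: dict) -> dict:
    """Return a copy of the manifest with default annotations and a collector sidecar."""
    patched = copy.deepcopy(deployment)
    template = patched["spec"]["template"]
    annotations = template.setdefault("metadata", {}).setdefault("annotations", {})
    for key, value in DEFAULT_ANNOTATIONS.items():
        # Keep any value the team set explicitly.
        annotations.setdefault(key, value)
    containers = template["spec"]["containers"]
    if not any(c["name"] == OTEL_SIDECAR["name"] for c in containers):
        containers.append(dict(OTEL_SIDECAR))
    return patched
```

The function is idempotent, so running it twice on the same manifest does not add a second sidecar, and it never overrides annotations a team has already set.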
Backstage: The Developer Portal
Backstage, created at Spotify and now a CNCF project, has become the de facto open-source developer portal. It provides a unified UI for all your tooling:
npx @backstage/create-app@latest
cd my-backstage-app
yarn dev
Key Backstage features:
- Software Catalog — Auto-discovery of services, APIs, and resources
- Software Templates — Scaffold new services from golden path templates
- TechDocs — Documentation that lives alongside code
- Plugins — Integrate Kubernetes, CI/CD, monitoring dashboards
A Backstage template:
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: go-microservice
spec:
  type: service
  parameters:
    - title: Service Details
      properties:
        name: {type: string, title: Service Name}
        owner: {type: string, title: Team}
  steps:
    - id: fetch
      action: fetch:template
      input:
        url: ./template
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    - id: publish
      action: publish:github
      input:
        repoUrl: github.com?owner=${{ parameters.owner }}&repo=${{ parameters.name }}
    - id: register
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
Measuring Success: DevEx Metrics
Use the SPACE framework and DORA metrics to measure platform impact. The figures below are illustrative of the kind of change to look for:
| Metric | Before Platform | After Platform |
|---|---|---|
| Time to first deploy | 3 days | 30 minutes |
| Deployment frequency | 2/week | 20/day |
| Change failure rate | 15% | 2% |
| Developer satisfaction | 3.2/5 | 4.5/5 |
Survey developers quarterly: "How easy is it to deploy a new service?" If the answer is longer than "one command," your platform needs work.
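Two of the DORA metrics in the table, deployment frequency and change failure rate, can be computed directly from a log of deployments. The sketch below assumes a hypothetical `Deployment` record with a `failed` flag; in practice that data would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deployment:
    day: date
    failed: bool  # True if the change led to an incident or rollback


def deployment_frequency(deployments: list[Deployment], days: int) -> float:
    """Average number of deployments per day over a window of `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return len(deployments) / days


def change_failure_rate(deployments: list[Deployment]) -> float:
    """Fraction of deployments that failed (0.0 when there were none)."""
    if not deployments:
        return 0.0
    return sum(d.failed for d in deployments) / len(deployments)
```

Comparing the same two numbers for a window before and after platform adoption gives the kind of before/after view shown in the table.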
What Makes a Platform an Internal Developer Platform?
An Internal Developer Platform (IDP) is more than a collection of tools. It is a cohesive layer that abstracts infrastructure complexity while providing self-service capabilities to development teams.
Core Capabilities of an IDP
| Capability | What It Provides | Example Tools |
|---|---|---|
| Self-service provisioning | Developers create environments without tickets | Backstage, Port, Humanitec |
| Standardized templates | Pre-approved stacks with compliance baked in | Cookiecutter, custom scaffolds |
| Environment management | Dev/staging/production with consistent config | Crossplane, Terraform, Pulumi |
| Deployment automation | CI/CD with approvals and rollbacks | ArgoCD, GitHub Actions, GitLab CI |
| Observability | Built-in monitoring, logging, and tracing | Grafana, Prometheus, Loki |
| Cost visibility | Showback of platform usage per team | Kubecost, AWS Cost Explorer |
Golden Paths
Golden paths are opinionated, pre-configured workflows that make the right thing the easy thing:
Developer wants to: Deploy a new microservice
Golden path:
  1. Backstage scaffold creates repo from template (Node.js + PostgreSQL)
  2. CI pipeline runs lint, test, build, security scan
  3. ArgoCD syncs to staging environment automatically
  4. Developer gets PR review link and staging URL in Slack
  5. Merge to main triggers production deployment with canary
The golden path handles 80% of use cases. The remaining 20% can use custom configurations via the platform's escape hatch (raw Kubernetes manifests, custom Terraform).
Building an IDP with Backstage
Backstage, the open-source developer portal framework created at Spotify, is one of the most widely used foundations for an IDP.
Scaffolding a New Service
# Create a new Backstage app (the command prompts for the app name)
npx @backstage/create-app@latest

# Add a software template
# templates/nodejs-service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: nodejs-service
  title: Node.js Microservice
  description: Create a new Node.js service with CI/CD
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Provide Service Details
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z0-9-]+$'
        owner:
          title: Owning Team
          type: string
          enum: ['backend', 'frontend', 'data']
  steps:
    - id: template
      name: Generate Service
      action: fetch:template
      input:
        url: ./skeleton
        values:
          serviceName: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        repoUrl: github.com?repo=${{ parameters.name }}&owner=${{ parameters.owner }}
    - id: register
      name: Register in Backstage
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
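The `pattern` and `enum` constraints in the template's `parameters` section are what keep bad input out of the golden path. The sketch below shows the equivalent checks in plain Python. It only illustrates the validation logic and is not how the Backstage scaffolder evaluates its schema; `validate_parameters` is a hypothetical helper.

```python
import re

# Mirrors the constraints declared in the template's parameters section.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")
ALLOWED_OWNERS = {"backend", "frontend", "data"}


def validate_parameters(name: str, owner: str) -> list[str]:
    """Return a list of human-readable errors; an empty list means the input is valid."""
    errors = []
    if not NAME_PATTERN.fullmatch(name):
        errors.append(f"name {name!r} must contain only lowercase letters, digits, and hyphens")
    if owner not in ALLOWED_OWNERS:
        errors.append(f"owner {owner!r} must be one of {sorted(ALLOWED_OWNERS)}")
    return errors
```

For example, `validate_parameters("payment-api", "backend")` returns an empty list, while a name such as `Payment_API` is rejected before any repository is created.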
Service Catalog
Backstage's catalog ingests a descriptor file from each repository and uses it to track services, APIs, and resources:
# catalog-info.yaml (in each service repo)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-api
  description: Payment processing service
  annotations:
    github.com/project-slug: company/payment-api
    backstage.io/techdocs-ref: url:https://github.com/company/payment-api
  tags:
    - nodejs
    - graphql
    - critical
spec:
  type: service
  lifecycle: production
  owner: backend-team
  dependsOn:
    - Component:postgres-db
    - Resource:stripe-api
The catalog becomes the single source of truth for what runs in production and who owns it. Because teams update their catalog-info.yaml alongside their code, it is far less likely to go stale than a separately maintained inventory.
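Entries such as `Component:postgres-db` are entity references of the form `[kind:][namespace/]name`. The sketch below shows how such a reference can be split into its parts, assuming a `default` namespace when none is given. It is a simplified illustration rather than Backstage's own parser, and `EntityRef` and `parse_entity_ref` are names invented for the example.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    kind: str
    namespace: str
    name: str


def parse_entity_ref(
    ref: str,
    default_kind: str = "component",
    default_namespace: str = "default",
) -> EntityRef:
    """Split a '[kind:][namespace/]name' reference, filling in defaults."""
    kind, sep, rest = ref.partition(":")
    if not sep:
        # No explicit kind, so the whole string is '[namespace/]name'.
        kind, rest = default_kind, ref
    namespace, sep, name = rest.partition("/")
    if not sep:
        namespace, name = default_namespace, rest
    # Lowercasing the kind is a simplification made for this sketch.
    return EntityRef(kind.lower(), namespace, name)
```

With this helper, the `dependsOn` list can be turned into a dependency graph, and the `owner` field can be resolved by passing `default_kind="group"`.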
Platform Engineering vs DevOps
| Aspect | DevOps | Platform Engineering |
|---|---|---|
| Focus | Culture, collaboration, automation | Product, abstraction, developer experience |
| Output | Shared responsibility for operations | Self-service platform with golden paths |
| Team structure | Embedded DevOps in product teams | Central platform team serving product teams |
| Abstraction level | Direct cloud/K8s access | Higher-level interfaces (Backstage, APIs) |
| Metrics | Deployment frequency, MTTR | Developer satisfaction, time-to-prod |
Platform engineering does not replace DevOps; it operationalizes DevOps at scale. When 20+ product teams each practice DevOps independently, tooling and practices fragment. An IDP standardizes the good patterns while still giving teams autonomy.
IDP Architecture: Reference Implementation
A production IDP combines several components into a cohesive experience:
                 +------------------+
                 |    Backstage     |  <-- Developer portal, catalog, scaffolder
                 +--------+---------+
                          |
         +----------------+----------------+
         |                |                |
  +------v-----+   +------v-----+   +------v-----+
  |   ArgoCD   |   | Crossplane |   |   Vault    |  <-- GitOps, IaC, secrets
  +------+-----+   +------+-----+   +------+-----+
         |                |                |
  +------v-----+   +------v-----+   +------v-----+
  | Kubernetes |   |  AWS/GCP   |   |  RDS, S3   |  <-- Infrastructure
  +------------+   +------------+   +------------+
CI/CD Integration
When a developer creates a service via Backstage, the platform automatically:
- Creates a GitHub repository with the template code
- Sets up GitHub Actions for CI (lint, test, security scan)
- Registers the service in ArgoCD for GitOps deployment
- Provisions a namespace and network policies in Kubernetes
- Configures monitoring (Prometheus ServiceMonitor) and logging (Loki label)
- Adds the service to the Backstage catalog with ownership metadata
- Sends a Slack notification with the staging URL
This end-to-end automation reduces the time to deploy a new service from days to minutes.
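The provisioning sequence above is essentially an ordered list of steps that must stop cleanly when one of them fails. A minimal sketch of such a runner follows. The step names and the `run_steps` function are hypothetical, and a real platform would call the GitHub, ArgoCD, and Kubernetes APIs inside each step.

```python
from typing import Callable, Optional

# A step is a (name, action) pair; the action receives the service name.
Step = tuple[str, Callable[[str], None]]


def run_steps(service: str, steps: list[Step]) -> tuple[list[str], Optional[str]]:
    """Run steps in order; stop at the first failure and report what completed."""
    completed = []
    for name, action in steps:
        try:
            action(service)
        except Exception as exc:
            # Surface which step failed so the developer knows where to look.
            return completed, f"{name}: {exc}"
        completed.append(name)
    return completed, None
```

Returning the list of completed steps makes it possible to resume or roll back a partially provisioned service instead of leaving it in an unknown state.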
Scorecards and Governance
Scorecards, which Backstage supports through plugins such as Tech Insights rather than in its core, track how well services follow platform standards:
| Check | Description | Weight |
|---|---|---|
| ci-passing | Tests pass in CI | Critical |
| has-owner | Service has an assigned owner | Critical |
| has-techdocs | Documentation exists | High |
| has-monitoring | Prometheus rules configured | High |
| has-backups | Database backups enabled | Medium |
| has-oncall | On-call rotation defined | Medium |
| has-cost-tags | Cloud resources tagged | Low |
Services failing critical checks are flagged in the developer portal. This gives the platform team visibility into compliance without manual audits.
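A scorecard like the one in the table can be reduced to a weighted score plus a list of failed critical checks. The sketch below assumes a numeric mapping for the weight labels (Critical = 4 down to Low = 1). That mapping is an illustration, not something defined by Backstage or any scorecard plugin.

```python
# Assumed numeric values for the weight labels used in the table.
WEIGHTS = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}

# Check name -> weight label, taken from the table.
CHECKS = {
    "ci-passing": "Critical",
    "has-owner": "Critical",
    "has-techdocs": "High",
    "has-monitoring": "High",
    "has-backups": "Medium",
    "has-oncall": "Medium",
    "has-cost-tags": "Low",
}


def score_service(results: dict[str, bool]) -> tuple[float, list[str]]:
    """Return (weighted score between 0 and 1, names of failed critical checks)."""
    total = sum(WEIGHTS[level] for level in CHECKS.values())
    # A check missing from `results` counts as failed.
    earned = sum(
        WEIGHTS[level] for check, level in CHECKS.items() if results.get(check, False)
    )
    failed_critical = [
        check
        for check, level in CHECKS.items()
        if level == "Critical" and not results.get(check, False)
    ]
    return earned / total, failed_critical
```

A service with a non-empty `failed_critical` list would be the one flagged in the portal, regardless of how high its overall score is.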
Measuring Platform Engineering Success
The ultimate metric is developer productivity. Track these indicators:
- Time to first deployment: How long from repo creation to live in staging? Target: <30 minutes
- Platform adoption rate: Percentage of services using the IDP. Target: >80%
- Infrastructure request tickets: Count of manual infrastructure requests. Target: trending to zero
- Developer NPS: Survey developers quarterly. Target: >50
- Deployment frequency: Are teams deploying more often with the platform? Target: 2x improvement
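Two of these indicators are simple to compute once the raw data is collected. The sketch below uses the standard Net Promoter Score definition (percentage of ratings of 9 or 10 minus percentage of ratings of 0 to 6, on a 0 to 10 scale) and a plain adoption percentage. The function names are chosen for this example.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """NPS on a 0-10 scale: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


def adoption_rate(services_on_platform: int, total_services: int) -> float:
    """Percentage of services that use the IDP."""
    if total_services <= 0:
        raise ValueError("total_services must be positive")
    return 100 * services_on_platform / total_services
```

With ten survey responses of which six are promoters and two are detractors, the NPS is 40, which would fall short of the >50 target listed above.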
Conclusion
Platform engineering isn't about building a perfect platform on day one. It's about identifying the most painful friction in your developers' workflow and solving it systematically. Start with scaffolding and environments. Add Backstage when you have 10+ services. Layer on DevEx metrics to prove you're making things better, not just different.
The goal: make the right thing the easy thing.