CI/CD is the skill that unlocks everything else in DevOps. Once you can ship code automatically — with confidence that broken code never reaches production — your entire engineering velocity changes.
This guide builds a production-ready pipeline from scratch using GitHub Actions. Not a toy example — the same patterns I use on systems with 50+ deployments per day.
What We're Building
A pipeline that:
- Triggers on every push to `main` and every PR
- Runs tests and code quality checks
- Builds and pushes a Docker image to a registry
- Deploys to production only when tests pass on `main`
- Rolls back automatically on failed health checks
The File Structure
```
.github/
└── workflows/
    ├── ci.yml      # Tests on every push/PR
    └── deploy.yml  # Deploy on main branch merge
```
Step 1: The CI Workflow (Tests + Build)
```yaml
# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: ['*']
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: testpass
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'

      - name: Install dependencies
        run: pip install -r requirements.txt -r requirements-dev.txt

      - name: Run linter
        run: ruff check .

      - name: Run type checker
        run: mypy app/

      - name: Run tests
        env:
          DATABASE_URL: postgresql://postgres:testpass@localhost:5432/testdb
          REDIS_URL: redis://localhost:6379
          SECRET_KEY: test-secret-key-not-for-production
        run: |
          pytest tests/ \
            --cov=app \
            --cov-report=xml \
            --cov-fail-under=80 \
            -v

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: false

  build:
    name: Build Image
    runs-on: ubuntu-latest
    needs: test
    permissions:
      contents: read
      packages: write   # required to push to GHCR with GITHUB_TOKEN
    outputs:
      image: ${{ steps.meta.outputs.tags }}
      digest: ${{ steps.build.outputs.digest }}

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=sha-
            type=ref,event=branch
            type=semver,pattern={{version}}

      - name: Build and push
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
```
Key details:
- Service containers (Postgres, Redis) spin up alongside your tests, so you get real integration tests, not mocks
- `cache-from: type=gha` uses the GitHub Actions cache for Docker layer caching; build times drop 60–80% after the first run
- The image is only pushed to the registry on non-PR events (saves registry costs and clutter)
- `needs: test` ensures the build only runs after tests pass
Step 2: The Deploy Workflow
```yaml
# .github/workflows/deploy.yml
name: Deploy

on:
  workflow_run:
    workflows: [CI]
    types: [completed]
    branches: [main]

env:
  REGISTRY: ghcr.io

jobs:
  deploy:
    name: Deploy to Production
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    environment:
      name: production
      url: https://yourapp.com

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Get image tag from CI run
        id: get-image
        run: |
          # Get the SHA of the triggering commit
          SHA="${{ github.event.workflow_run.head_sha }}"
          IMAGE="${{ env.REGISTRY }}/${{ github.repository }}:sha-${SHA:0:7}"
          echo "image=$IMAGE" >> "$GITHUB_OUTPUT"

      - name: Deploy to server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.PROD_USER }}
          key: ${{ secrets.PROD_SSH_KEY }}
          script: |
            # Keep the current container around for rollback
            docker rename app app-previous || true
            docker stop app-previous || true

            # Pull and start the new image
            docker pull ${{ steps.get-image.outputs.image }}
            docker run -d \
              --name app \
              --restart unless-stopped \
              -p 8000:8000 \
              -e DATABASE_URL="${{ secrets.DATABASE_URL }}" \
              -e SECRET_KEY="${{ secrets.SECRET_KEY }}" \
              ${{ steps.get-image.outputs.image }}

            # Health check with retry
            for i in {1..12}; do
              if curl -sf http://localhost:8000/health; then
                echo "Health check passed"
                docker rm app-previous || true
                exit 0
              fi
              echo "Attempt $i failed, waiting 5s..."
              sleep 5
            done

            echo "Health check failed, rolling back"
            docker stop app && docker rm app
            docker start app-previous && docker rename app-previous app
            exit 1

      - name: Notify on failure
        if: failure()
        uses: 8398a7/action-slack@v3
        with:
          status: failure
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```

Note two fixes that are easy to miss: `deploy.yml` must define `env.REGISTRY` itself (workflow env vars don't carry over from `ci.yml`), and the old container is renamed to `app-previous` before the new one starts, so the rollback branch actually has something to start.
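The `sha-${SHA:0:7}` tag computed in the deploy job has to match what `docker/metadata-action` produced in CI; its `type=sha` tag defaults to a short SHA, typically 7 characters. A quick sanity check of the bash substring expansion (the SHA value is illustrative):

```shell
# Bash substring expansion: first 7 characters of the full commit SHA
SHA="0123456789abcdef0123456789abcdef01234567"
TAG="sha-${SHA:0:7}"
echo "$TAG"   # sha-0123456
```

If the two sides ever disagree (for example, a custom short-SHA length in CI), the deploy job fails at `docker pull` with "manifest unknown" rather than deploying the wrong image, which is the safe failure mode.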
Step 3: Secrets Setup
In your GitHub repo → Settings → Secrets and variables → Actions, add:
```
PROD_HOST       # Production server IP/hostname
PROD_USER       # SSH username (e.g., ubuntu, deploy)
PROD_SSH_KEY    # Private SSH key (generate a deploy key)
DATABASE_URL    # Production database connection string
SECRET_KEY      # Application secret key
SLACK_WEBHOOK   # Optional: Slack notifications
CODECOV_TOKEN   # Optional: coverage reporting
```
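If you prefer the terminal to the web UI, the GitHub CLI can set these; a sketch assuming `gh` is installed and authenticated against the repo (all values are illustrative):

```shell
# Set deployment secrets from the command line
gh secret set PROD_HOST --body "203.0.113.10"
gh secret set PROD_USER --body "deploy"
gh secret set PROD_SSH_KEY < ~/.ssh/deploy_key    # reads the private key from stdin
gh secret set DATABASE_URL --body "postgresql://user:pass@db-host:5432/prod"
```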
Security best practice: Create a dedicated deploy user on your server with minimal permissions — only enough to run Docker commands. Never use root.
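Server-side, that deploy user takes only a few commands to set up; a sketch assuming an Ubuntu host with Docker already installed (the key filename `deploy_key` is illustrative):

```shell
# On your workstation: generate a dedicated deploy key pair
ssh-keygen -t ed25519 -f ~/.ssh/deploy_key -C "github-deploy" -N ""

# On the server (as an admin): create the deploy user with Docker access
sudo useradd -m -s /bin/bash deploy
sudo usermod -aG docker deploy        # docker group membership, no full sudo

# Authorize the public key for SSH logins
sudo -u deploy mkdir -p /home/deploy/.ssh
cat deploy_key.pub | sudo tee -a /home/deploy/.ssh/authorized_keys
sudo chmod 700 /home/deploy/.ssh
sudo chmod 600 /home/deploy/.ssh/authorized_keys
sudo chown -R deploy:deploy /home/deploy/.ssh
```

Be aware that membership in the `docker` group is effectively root-equivalent on most hosts, so treat the deploy key as highly sensitive even though the user has no sudo.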
Step 4: The Dockerfile That Works With This Pipeline
```dockerfile
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.12-slim

# curl is needed for the HEALTHCHECK below (slim images don't ship it)
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd -m -u 1000 app
WORKDIR /app

# Copy dependencies from builder
COPY --from=builder /root/.local /home/app/.local

# Copy application
COPY --chown=app:app . .

USER app
ENV PATH=/home/app/.local/bin:$PATH
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["gunicorn", "app.main:app", "-b", "0.0.0.0:8000", "-w", "4"]
```
The multi-stage build keeps the final image small. The non-root user is a security requirement, not optional. The HEALTHCHECK hits the same `/health` endpoint the deploy script polls over SSH, so Docker's own view of container health agrees with the rollback check.
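Both the HEALTHCHECK and the deploy script assume the app actually serves `/health`. As a minimal sketch, here is such an endpoint as a plain WSGI callable; the gunicorn command above serves any WSGI callable named `app`, though your real `app/main.py` will look different:

```python
# app/main.py (sketch): a WSGI callable with a /health endpoint
import json

def app(environ, start_response):
    """Respond 200 on /health so container and deploy checks pass."""
    if environ.get("PATH_INFO") == "/health":
        body = json.dumps({"status": "ok"}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

A production health check should go further and probe critical dependencies (a `SELECT 1` against the database, a Redis `PING`) so the rollback logic catches bad configuration, not just a dead process.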
Advanced: Matrix Testing
Test against multiple Python/Node versions:
```yaml
jobs:
  test:
    strategy:
      matrix:
        python-version: ['3.11', '3.12']
        os: [ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
```
Advanced: Reusable Workflows
Once you have multiple repos, extract common steps:
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      python-version:
        required: false
        type: string
        default: '3.12'

jobs:
  test:
    # ... same test steps
```
Call it from any repo:
```yaml
jobs:
  test:
    uses: your-org/shared-workflows/.github/workflows/reusable-test.yml@main
    with:
      python-version: '3.12'
    secrets: inherit
```
The Hidden Cost of Bad CI/CD
Companies without proper CI/CD typically experience:
- 2–5 production outages per month from manual deployment errors
- 30–60 min deployment process requiring engineer attention
- "Works on my machine" bugs reaching production
With this pipeline:
- Deployments are automatic and take 4–6 minutes unattended
- Broken code is caught before it reaches main
- Rollback is automatic if production health checks fail
That's SRE-level reliability from a weekend of setup. The engineering time saved compounds every sprint.
Monitoring Your Pipeline
GitHub gives you basic analytics, but also track:
- Build time trend: should stay under 8 minutes for most apps
- Test flakiness rate: tests that fail randomly destroy trust in CI
- Deployment frequency: healthy teams ship 1–10x/day to production
- Change failure rate: % of deployments causing incidents
These are the DORA metrics. Track them to measure engineering team health — not just individual productivity.
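Two of these metrics fall out of data you already have in your deployment history. A sketch, using hypothetical deployment records, of deployment frequency and change failure rate:

```python
# Sketch: compute two DORA metrics from a list of deployment records
from datetime import date

deployments = [  # hypothetical data: (deploy date, caused an incident?)
    (date(2024, 5, 1), False),
    (date(2024, 5, 1), True),
    (date(2024, 5, 2), False),
    (date(2024, 5, 3), False),
]

# Deploys per day over the observed window (inclusive of both endpoints)
days = (max(d for d, _ in deployments) - min(d for d, _ in deployments)).days + 1
frequency = len(deployments) / days

# Fraction of deployments that caused an incident
failure_rate = sum(failed for _, failed in deployments) / len(deployments)

print(f"{frequency:.1f} deploys/day, {failure_rate:.0%} change failure rate")
```

In practice you would pull the records from the GitHub deployments API or your incident tracker rather than a hardcoded list; the arithmetic stays the same.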
A working CI/CD pipeline is the foundation everything else in DevOps is built on. Get this right and every other automation becomes easier.