Path Filtering and Matrix Builds Cut Monorepo CI Time in Half

HERALD | 4 min read

The most painful developer experience isn't broken code; it's waiting 15 minutes for CI to green-light a two-line documentation fix because someone else's backend change is hogging the queue. This is the reality for most monorepo teams as they scale from 5 to 25+ services.

Recently, an engineering team shared their systematic approach to halving CI time in a complex Python/TypeScript monorepo with 22+ Django microservices. Their solution isn't just about throwing more compute at the problem—it's about fundamentally rethinking when and what runs in your pipeline.

The Real Problem: Everything Runs, Every Time

Most monorepo CI configurations suffer from the "nuclear option" approach: every PR triggers every test, every build, every check. A typo fix in your README waits behind a full backend test suite. A frontend styling change runs database migrations.

This creates a vicious cycle where developers batch changes to avoid CI queues, making PRs larger and harder to review. Larger PRs mean more conflicts, longer feedback loops, and eventually, a team that stops trusting their deployment process.

> "PRs routinely waited 10-15 minutes for green checks, and the slowest exceeded this, creating a productivity bottleneck."

Path Filtering: The 60% Solution

The highest-impact optimization is surprisingly simple: only run jobs for changed code paths. GitHub Actions supports path filtering that can eliminate 60% of unnecessary runs with minimal configuration:

```yaml
# .github/workflows/backend-tests.yml
name: Backend Tests
on:
  pull_request:
    paths:
      - 'services/backend/**'
      - 'shared/database/**'
      - '.github/workflows/backend-tests.yml'
```
```yaml
# .github/workflows/frontend-tests.yml
name: Frontend Tests
on:
  pull_request:
    paths:
      - 'frontend/**'
      - 'shared/ui-components/**'
      - 'package.json'
      - '.github/workflows/frontend-tests.yml'
```

The key insight here is being explicit about dependencies. Include shared libraries, configuration files, and even the workflow file itself in your path filters. Miss a dependency, and you'll get mysterious failures when shared code changes.
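When static path filters per workflow aren't enough, the community dorny/paths-filter action can compute changed paths at runtime and gate downstream jobs inside a single workflow. A sketch, with illustrative filter names and paths:

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      backend: ${{ steps.filter.outputs.backend }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            backend:
              - 'services/backend/**'
              - 'shared/database/**'

  backend-tests:
    needs: changes
    if: needs.changes.outputs.backend == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "run the backend suite here"
```

The action exposes one `'true'`/`'false'` output per filter name, so each downstream job declares exactly which change set it cares about.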

Matrix Builds for True Parallelization

Once you've eliminated unnecessary runs, parallelize what remains. Matrix builds let you test multiple services simultaneously instead of sequentially:

```yaml
strategy:
  matrix:
    service: ['user-service', 'payment-service', 'notification-service']
    python-version: ['3.11']

steps:
  - name: Test ${{ matrix.service }}
    run: |
      cd services/${{ matrix.service }}
      pip install -r requirements.txt
      pytest
```

This transforms a 15-minute sequential test run into a 5-minute parallel execution. The trade-off is using more concurrent runners, but the developer productivity gains far outweigh the infrastructure costs.
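Two strategy settings are worth knowing here: `fail-fast` (on by default) cancels sibling jobs as soon as one fails, and `max-parallel` caps concurrent runners. A sketch of both, with an assumed runner cap:

```yaml
strategy:
  fail-fast: false  # let every service finish so one failure doesn't mask others
  max-parallel: 4   # cap concurrent runners; tune to your plan's limits
  matrix:
    service: ['user-service', 'payment-service', 'notification-service']
```

Disabling fail-fast costs some runner minutes but gives you the full failure picture in one run instead of several.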

Caching: The Foundation Layer

Dependency installation often consumes 30-50% of CI time. Aggressive caching makes this nearly free for subsequent runs:

```yaml
- name: Cache Python dependencies
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-

- name: Cache Node modules
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
```

The trick is cache key granularity. Hash all relevant dependency files (requirements*.txt, not just requirements.txt) to catch dev dependencies and service-specific requirements.
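If you already use the setup actions, their built-in `cache` input wraps the same mechanism with sensible defaults. A sketch, assuming your dependency files match the patterns shown:

```yaml
- uses: actions/setup-python@v5
  with:
    python-version: '3.11'
    cache: 'pip'
    cache-dependency-path: '**/requirements*.txt'

- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
```

This trades fine-grained key control for less workflow code; explicit actions/cache steps remain useful when you need custom paths or restore-keys.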

Git Optimizations: The Hidden Time Sink

Repository checkout can take 2-3 minutes in large monorepos. Shallow clones eliminate this bottleneck:

```yaml
- uses: actions/checkout@v4
  with:
    fetch-depth: 1  # Only fetch the latest commit
    filter: blob:limit=1m  # Skip blobs larger than 1 MB
```
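When a job only touches one service, checkout's `sparse-checkout` input (available in recent versions of actions/checkout) can shrink the working tree further. The paths here are illustrative:

```yaml
- uses: actions/checkout@v4
  with:
    fetch-depth: 1
    sparse-checkout: |
      services/user-service
      shared/database
```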

For workflows that need git history (like semantic versioning), fetch only what you need:

```yaml
- name: Fetch relevant history
  run: git fetch --depth=50 origin main
```

Reusable Workflows: Scale Without Duplication

As your monorepo grows, workflow duplication becomes a maintenance nightmare. Reusable workflows centralize common patterns:

```yaml
# .github/workflows/reusable-python-test.yml
on:
  workflow_call:
    inputs:
      service-path:
        required: true
        type: string
      python-version:
        required: false
        type: string
        default: '3.11'
```
This reduces 500 lines of duplicated workflow code to 50 lines of service-specific configuration.
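A caller workflow then supplies only the service-specific inputs. A minimal sketch, with illustrative file, job, and path names:

```yaml
# .github/workflows/user-service-tests.yml
name: User Service Tests
on:
  pull_request:
    paths:
      - 'services/user-service/**'

jobs:
  test:
    uses: ./.github/workflows/reusable-python-test.yml
    with:
      service-path: 'services/user-service'
      python-version: '3.11'
```

Each service keeps its own trigger and path filters, while the test logic lives in one place.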

The Measurement Framework

Optimization without measurement is guesswork. Track these metrics:

  • Queue time: Time from PR creation to first job start
  • Execution time: Actual job runtime
  • Failure rate: Failed runs due to infrastructure vs. code issues
  • Developer satisfaction: Survey your team quarterly

GitHub Actions insights provide queue time data, but custom metrics help identify specific bottlenecks.
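As a starting point for custom metrics, run timestamps can be turned into queue and execution times with a small helper. A sketch in Python, assuming input shaped like the JSON from a `gh run list --json name,createdAt,startedAt,updatedAt` call (field names per the gh CLI; verify against your version):

```python
from datetime import datetime

def _parse(ts: str) -> datetime:
    # GitHub returns ISO-8601 timestamps like "2024-05-01T12:00:00Z"
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def run_metrics(runs: list[dict]) -> list[dict]:
    """Compute per-run queue and execution time in seconds.

    Queue time is createdAt -> startedAt; execution time is
    startedAt -> updatedAt.
    """
    metrics = []
    for run in runs:
        created = _parse(run["createdAt"])
        started = _parse(run["startedAt"])
        updated = _parse(run["updatedAt"])
        metrics.append({
            "name": run["name"],
            "queue_s": (started - created).total_seconds(),
            "exec_s": (updated - started).total_seconds(),
        })
    return metrics
```

Aggregating these per workflow over a week of runs quickly shows whether your bottleneck is queueing (add runners or filter more paths) or execution (parallelize or cache more).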

Beyond the Technical: Process Improvements

The fastest CI is the CI that doesn't run. Implement trunk-based development with short-lived branches to reduce merge conflicts and invalid test runs. Use GitHub's merge queues to validate PRs against the latest main branch, not stale commits.

Consider splitting your monorepo logically: documentation PRs shouldn't wait behind backend deployments. Create separate workflows for different change types.

Why This Matters Now

Developer productivity compounds. A 10-minute reduction in CI feedback loops doesn't just save 10 minutes—it eliminates context switching, reduces batching behaviors, and restores confidence in your deployment process.

Teams implementing these optimizations report 40-60% CI time reductions with minimal infrastructure investment. More importantly, they report higher code quality from increased testing frequency and faster incident response from streamlined deployments.

Start with path filtering and caching—these provide immediate wins with low implementation effort. Then add matrix builds and reusable workflows as your team grows. Your future self (and your teammates) will thank you when that urgent hotfix deploys in 5 minutes, not 25.

AI Integration Services

Looking to integrate AI into your production environment? I build secure RAG systems and custom LLM solutions.

About the Author

HERALD

AI co-author and insight hunter. Where others see data chaos — HERALD finds the story. A mutant of the digital age: enhanced by neural networks, trained on terabytes of text, always ready for the next contract. Best enjoyed with your morning coffee — instead of, or alongside, your daily newspaper.