GitHub Actions: Optimization and Performance

Slow CI is a productivity killer. Here's how to slash your GitHub Actions workflow times—with real examples, benchmarks, and the caching strategies that actually work.

Last month, a Node.js project I work on had a CI pipeline that took 15 minutes. Fifteen. Every push, every PR, fifteen minutes of waiting. Developers started batching changes to avoid triggering builds. Code review slowed down. People merged PRs without waiting for green checks because who has that kind of time?

Three weeks later, that same pipeline runs in under 3 minutes. Same tests, same linting, same everything—just smarter about how it does it.

This is part of my GitHub Actions series. Today we’re talking about making workflows fast and cheap. Because slow CI isn’t just annoying—it actively hurts your team.

Why Slow CI Kills Productivity

Before diving into optimizations, let’s talk about why this matters.

A 15-minute CI run sounds tolerable. It’s just 15 minutes, right? But here’s what actually happens:

  1. Developer pushes code, starts the build
  2. Switches context to something else while waiting
  3. Gets deep into that other task
  4. Build fails—maybe a flaky test, maybe a linting error
  5. Context switch back, fix the issue, push again
  6. Another 15 minutes
  7. Repeat

Research on interruptions (Gloria Mark's studies at UC Irvine are the usual citation) puts the cost of a context switch at roughly 23 minutes to regain focus. A slow CI pipeline doesn't just waste build time; it fragments your entire workday.

There’s also the money angle. GitHub Actions gives you 2,000 free minutes per month for private repos on the Free tier, 3,000 on Team. If your workflows are wasteful, you’ll burn through those minutes fast. And at $0.008 per minute for additional Linux runner time (more for macOS and Windows), costs add up.
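
To make that concrete, here's a back-of-envelope estimate. The team size and build counts below are illustrative assumptions, not figures from this project:

```shell
# Back-of-envelope: monthly Actions overage on the Team tier.
# All workload numbers below are assumptions for the sake of the example.
builds_per_day=50
minutes_per_build=15
workdays=22
free_minutes=3000          # Team tier allowance
minutes_used=$(( builds_per_day * minutes_per_build * workdays ))
overage=$(( minutes_used > free_minutes ? minutes_used - free_minutes : 0 ))
# Linux overage is billed at $0.008/minute; awk handles the fractional math
cost=$(awk -v o="$overage" 'BEGIN { printf "%.2f", o * 0.008 }')
echo "$minutes_used minutes used -> \$$cost in overage charges"
```

With these made-up numbers, the slow pipeline burns 16,500 minutes a month and costs about $108 in overage on Linux runners alone. Swap in your own build counts; the shape of the math is the point.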

The Node.js Pipeline That Went From 15 Minutes to 3

Let me walk through the actual optimizations I made. Here’s what the workflow looked like before:

name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm install
      - run: npm run lint
      - run: npm run typecheck
      - run: npm test
      - run: npm run build

Straightforward. Also slow. Let’s break down where the time went:

  • Checkout: 5 seconds (fine)
  • Setup Node: 3 seconds (fine)
  • npm install: 4 minutes 30 seconds (yikes)
  • Lint: 45 seconds
  • Typecheck: 1 minute 20 seconds
  • Tests: 6 minutes 15 seconds
  • Build: 2 minutes 10 seconds

Total: 15 minutes 8 seconds

Optimization 1: Dependency Caching

The npm install step was downloading 847 packages every single run. That’s insane. GitHub’s cache action fixes this.

- uses: actions/checkout@v4
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
- run: npm ci

Note two changes here. First, the cache: 'npm' option on setup-node. This caches the npm cache directory between runs, so subsequent installs only download packages that changed.

Second, I switched from npm install to npm ci. The ci command is built for CI environments: it installs exactly what’s in package-lock.json without re-resolving dependency versions, which makes it both faster and more reproducible. On a warm cache, this brought install time down from 4:30 to 28 seconds.

Time saved: ~4 minutes

Optimization 2: Parallel Jobs

Lint, typecheck, tests, and build were running sequentially. But they don’t depend on each other. They can run in parallel.

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint

  typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run typecheck

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run build

“But wait,” you might say, “now each job is installing dependencies separately. Isn’t that wasteful?”

Yes and no. Each job runs on a fresh runner, so yes, there’s duplication. But the cache is shared across jobs in the same workflow run, so after the first job warms the cache, the others pull from it. More importantly, these jobs now run at the same time. The total wall-clock time is determined by the slowest job, not the sum of all jobs.
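
If the duplicated installs still bother you, a variant some teams use is caching node_modules itself, keyed on the lockfile, so warm-cache jobs skip the install entirely. A sketch (hypothetical workflow fragment; note that caching node_modules can go stale if the Node version changes, which is why the key pins it):

```yaml
# Sketch: cache node_modules directly instead of the npm cache.
# Safe only if the key covers everything that affects the install --
# here the OS, the Node major version, and the lockfile hash.
- uses: actions/cache@v4
  id: node-modules
  with:
    path: node_modules
    key: ${{ runner.os }}-node20-modules-${{ hashFiles('package-lock.json') }}
# Skip the install entirely on an exact cache hit
- if: steps.node-modules.outputs.cache-hit != 'true'
  run: npm ci
```

The setup-node approach is the safer default; reach for this only if profiling shows the repeated npm ci runs are actually your bottleneck.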

Before: 15 minutes (sequential). After: 6:45, set by the test job, the longest of the four.

Time saved: ~8 minutes

Optimization 3: Fail Fast

If linting fails, there’s no point running the other jobs. They’re going to fail anyway once the developer fixes the lint errors and retries.

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      # ... lint steps

  typecheck:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      # ... typecheck steps

  test:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      # ... test steps

  build:
    needs: [typecheck, test]
    runs-on: ubuntu-latest
    steps:
      # ... build steps

Now lint runs first. If it passes, typecheck and test run in parallel. If both pass, build runs. A lint failure stops everything immediately—no wasted time running tests that will just be re-run after the fix.

Time saved: Variable, but meaningful when things fail

Optimization 4: Conditional Execution

Not every change needs every check. Documentation changes don’t need tests. Test changes don’t need to rebuild production bundles.

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      docs: ${{ steps.filter.outputs.docs }}
      tests: ${{ steps.filter.outputs.tests }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - 'src/**'
            docs:
              - 'docs/**'
              - '*.md'
            tests:
              - 'tests/**'
              - '**/*.test.ts'

  test:
    needs: [lint, changes]
    if: needs.changes.outputs.src == 'true' || needs.changes.outputs.tests == 'true'
    runs-on: ubuntu-latest
    steps:
      # ... test steps

  build:
    needs: [typecheck, changes]
    if: needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      # ... build steps

The paths-filter action inspects what files changed and sets outputs accordingly. Downstream jobs can check those outputs to decide whether to run.

This one’s situational—it helps most when you have a mix of code, docs, and config changes happening. For pure code changes, everything still runs.

The Final Result

After all optimizations, the workflow looks like this:

name: CI
on: [push, pull_request]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      tests: ${{ steps.filter.outputs.tests }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - 'src/**'
            tests:
              - 'tests/**'
              - '**/*.test.ts'

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint

  typecheck:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run typecheck

  test:
    needs: [lint, changes]
    if: needs.changes.outputs.src == 'true' || needs.changes.outputs.tests == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test

  build:
    needs: [typecheck, test, changes]
    if: always() && needs.typecheck.result == 'success' && (needs.test.result == 'success' || needs.test.result == 'skipped') && needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run build

Note the concurrency block at the top. This is huge for PR workflows—if you push a new commit while a previous run is still going, it cancels the old run. No more watching three builds in progress when you only care about the latest one.

Final time: 2 minutes 47 seconds (on a typical code change with warm caches)

That’s an 82% reduction. Same checks, same safety, fraction of the time.

Caching Strategies by Ecosystem

npm caching is just the beginning. Here’s how to cache effectively across different ecosystems.

Python (pip)

- uses: actions/setup-python@v5
  with:
    python-version: '3.12'
    cache: 'pip'
- run: pip install -r requirements.txt

Or for more control:

- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-

The restore-keys fallback is important. If your requirements.txt changes, you won’t have an exact cache match, but you’ll get the previous cache as a starting point. Partial cache hits are faster than cold starts.

Go Modules

- uses: actions/setup-go@v5
  with:
    go-version: '1.22'
    cache: true

Go’s module system is already pretty fast, but caching still helps. The setup-go action caches ~/go/pkg/mod automatically.

Rust (Cargo)

Rust builds are notoriously slow. Caching is essential.

- uses: actions/cache@v4
  with:
    path: |
      ~/.cargo/bin/
      ~/.cargo/registry/index/
      ~/.cargo/registry/cache/
      ~/.cargo/git/db/
      target/
    key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
    restore-keys: |
      ${{ runner.os }}-cargo-

The target/ directory is the big one—it contains compiled dependencies. On a large Rust project, this can save 10+ minutes.

There’s a dedicated action that handles this better:

- uses: Swatinem/rust-cache@v2

It’s smarter about what to cache and automatically handles cache invalidation.

Ruby (Bundler)

- uses: ruby/setup-ruby@v1
  with:
    ruby-version: '3.3'
    bundler-cache: true

One flag. That’s it. The setup-ruby action handles everything.

When Caching Hurts More Than It Helps

I need to be honest here—caching isn’t always the answer. There are cases where it backfires.

Cache invalidation bugs: If your cache key doesn’t include something it should, you might restore stale dependencies. I’ve seen builds pass in CI and fail locally because CI was using cached old packages. The fix is making sure your cache key hashes all relevant files.
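
A sketch of a more defensive key, hashing every file that can influence the installed tree rather than a single lockfile (the paths are illustrative; adjust for your repo layout):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    # Hash everything that affects the install: every workspace
    # lockfile plus the npm config, not just the root lockfile.
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json', '.npmrc') }}
    restore-keys: |
      ${{ runner.os }}-npm-
```

hashFiles accepts multiple patterns, so widening a key is usually a one-line change.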

Cache size limits: GitHub limits cache storage to 10GB per repository. If your caches are huge (looking at you, Rust projects with many dependencies), old caches get evicted. When the cache you need gets evicted, you’re back to cold starts—and you wasted time trying to restore a cache that doesn’t exist.

Overhead on small projects: Cache save and restore aren’t free. For a project with 20 small dependencies that install in 15 seconds, adding caching might actually slow things down. The save/restore overhead can exceed the time saved.

Rule of thumb: If your uncached install takes less than 30 seconds, caching probably isn’t worth it.

Docker Layer Caching

Docker builds in CI are often painfully slow because every build starts from scratch. Layer caching fixes this.

Here’s a multi-stage Dockerfile for a Node.js app:

FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]

The key insight: dependencies change less often than source code. By copying package*.json first and running npm ci in a separate stage, Docker can reuse that layer when only source code changes.

But in CI, there’s no layer cache by default. Enter BuildKit:

- uses: docker/setup-buildx-action@v3

- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max

The type=gha cache backend stores layers in GitHub Actions cache. Subsequent builds pull cached layers, only rebuilding what changed.

Real Numbers: Docker Build Optimization

A project I worked on had a complex multi-stage build:

Stage                Cold Build    With Layer Cache
Base dependencies    3:42          8s (cached)
App dependencies     2:15          12s (cached)
Build                1:48          1:48
Final image          0:24          0:24
Total                8:09          2:32

That’s a 69% improvement, and it gets better over time as your dependency layers stabilize.

One gotcha: the GitHub Actions cache has a 10GB limit per repo. Docker images can be big. If you’re caching multiple large images, you might hit that limit and lose older caches. Consider using a container registry for cache storage instead:

cache-from: type=registry,ref=ghcr.io/myorg/myapp:buildcache
cache-to: type=registry,ref=ghcr.io/myorg/myapp:buildcache,mode=max

Monorepo Optimization: Running Only Affected Tests

Monorepos present a unique challenge. You don’t want to run all tests for all packages when only one package changed.

The paths-filter approach I showed earlier works, but it requires manually maintaining filter definitions. For larger monorepos, consider tools like Nx or Turborepo that understand your dependency graph.

Here’s a simplified approach using Nx:

jobs:
  affected:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Nx needs git history
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx nx affected --target=test --base=origin/main
      - run: npx nx affected --target=build --base=origin/main

The affected command figures out what changed compared to main and only runs tests/builds for those packages and their dependents.

A monorepo I work with has 12 packages. Running all tests takes 18 minutes. Running affected tests on a typical PR takes 4 minutes. That’s the power of understanding your dependency graph.

Artifact Management

Artifacts let you pass data between jobs and persist build outputs. But they come with overhead.

Upload and Download Costs

Every artifact upload and download takes time. For small artifacts, the overhead might exceed the benefit. I’ve seen workflows where uploading a 2MB test report took 20 seconds—longer than running the tests.

Rule of thumb: Don’t upload artifacts smaller than 10MB unless you really need them. Log the information instead.

Retention Policies

By default, artifacts stick around for 90 days. That’s a lot of storage for build outputs nobody will ever look at.

- uses: actions/upload-artifact@v4
  with:
    name: build-output
    path: dist/
    retention-days: 7

For most use cases, 7 days is plenty. For release artifacts, you might keep them longer. Tune retention based on actual needs.

Compressing Before Upload

If you must upload large artifacts, compress them first:

- run: tar -czf coverage.tar.gz coverage/
- uses: actions/upload-artifact@v4
  with:
    name: coverage-report
    path: coverage.tar.gz

I’ve seen 40% size reductions on coverage reports. Upload time scales roughly linearly with size, so this translates directly to time saved.

Choosing the Right Runner

GitHub offers different runner sizes. The default ubuntu-latest has 2 cores and 7GB RAM. For CPU-intensive tasks, that might not be enough.

Larger runners cost more but can be faster:

Runner      vCPU    RAM     Cost/min (Linux)
Standard    2       7GB     $0.008
4-core      4       16GB    $0.016
8-core      8       32GB    $0.032
16-core     16      64GB    $0.064

Is it worth it? Depends on your workload.

For parallelizable tasks (like running tests with Jest or pytest), more cores help a lot. If your test suite takes 10 minutes on 2 cores but 3 minutes on 8 cores, the 8-core runner is actually cheaper:

  • 2-core: 10 min x $0.008 = $0.08
  • 8-core: 3 min x $0.032 = $0.096

Okay, slightly more expensive. But if developer time matters (and it does), the 7-minute savings per run adds up. If your team runs 50 builds a day, that’s 350 minutes saved daily. At even a modest hourly rate, the math is clear.

For single-threaded tasks, larger runners don’t help. Measure before spending.
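
If you do move tests to a bigger machine, make sure the test runner actually uses the extra cores. A sketch — note that larger-runner labels like ubuntu-latest-8-cores are defined per organization, so yours may differ:

```yaml
test:
  # Label is org-specific; check your organization's runner group settings
  runs-on: ubuntu-latest-8-cores
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: '20'
        cache: 'npm'
    - run: npm ci
    # Let Jest spawn one worker per available core
    - run: npx jest --maxWorkers=100%
```

Without a flag like --maxWorkers (or pytest-xdist's -n auto on the Python side), a bigger runner can sit mostly idle while you pay four times the rate.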

Concurrency Controls

I mentioned this earlier, but it’s worth expanding. The concurrency setting prevents wasted work:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

This groups runs by workflow and branch. If a new run starts for the same branch, the old one cancels.

For main branch builds, you might want different behavior:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

This cancels in-progress runs for feature branches but lets main branch runs complete. You probably want a clean history of main branch builds.

Incremental Builds and Test Selection

Some build tools support incremental compilation—only rebuilding what changed. TypeScript’s --incremental flag, for example:

- run: npx tsc --incremental
- uses: actions/cache@v4
  with:
    path: tsconfig.tsbuildinfo
    key: ${{ runner.os }}-tsc-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-tsc-

Caching the .tsbuildinfo file lets subsequent builds skip unchanged modules.

For tests, some frameworks support running only affected tests. Jest’s --changedSince flag:

- run: npx jest --changedSince=origin/main

This runs tests for files that changed compared to main. In a large codebase, this can cut test time dramatically.
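
One wrinkle: --changedSince needs enough git history to diff against main, and the default shallow checkout doesn't have it. A sketch of the surrounding steps, assuming origin/main is your comparison branch:

```yaml
- uses: actions/checkout@v4
  with:
    fetch-depth: 0   # full history, so origin/main exists to diff against
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
- run: npm ci
- run: npx jest --changedSince=origin/main
```

The full clone costs a few extra seconds on checkout; on a large repo, weigh that against the test time saved.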

Caveat: Changed-based test selection can miss failures caused by transitive dependencies. A change in module A might break tests in module B that depends on A. Smart test selection tools try to account for this, but nothing’s perfect. Consider running the full test suite on main branch merges even if PRs only run affected tests.

Monitoring and Analyzing Performance

You can’t optimize what you can’t measure. GitHub provides some built-in insights, but for serious analysis, you need more.

Workflow Run Logs

Every step shows its duration. Look for patterns:

  • Which steps are consistently slow?
  • Which steps have high variance?
  • Are cache restore times reasonable?

GitHub Actions Insights

Under your repo’s Actions tab, there’s an “Insights” section showing workflow run times over time. Look for trends—if your builds are getting slower, investigate before it becomes a problem.

Third-Party Tools

Tools like BuildPulse, Datadog CI Visibility, and Honeycomb can give deeper insights. They track metrics over time, identify flaky tests, and help you spot performance regressions.

For most teams, the built-in tools are enough. But if you’re running hundreds of builds a day, dedicated CI analytics might be worth the investment.

Custom Timing Instrumentation

Sometimes you need more granularity than step-level timing. You can add your own:

- name: Run tests with timing
  id: test   # id lets later steps read this as steps.test.outputs.*
  run: |
    start_time=$(date +%s)
    npm test
    end_time=$(date +%s)
    echo "Test duration: $((end_time - start_time)) seconds"
    echo "test_duration=$((end_time - start_time))" >> "$GITHUB_OUTPUT"

Combine this with job summaries to surface timing information:

- name: Report timing
  run: |
    echo "## Performance Summary" >> $GITHUB_STEP_SUMMARY
    echo "- Test duration: ${{ steps.test.outputs.test_duration }}s" >> $GITHUB_STEP_SUMMARY

Cost Optimization for Private Repos

Public repos get unlimited Actions minutes. Private repos don’t. Here’s how to keep costs down.

Use Linux When Possible

Runner prices vary dramatically by OS:

OS         Price/minute
Linux      $0.008
Windows    $0.016
macOS      $0.08

macOS is 10x more expensive than Linux. If you only need macOS for final testing (not for every lint and typecheck), structure your workflow accordingly:

jobs:
  lint:
    runs-on: ubuntu-latest  # Cheap
    # ...

  test-linux:
    runs-on: ubuntu-latest  # Cheap
    # ...

  test-macos:
    needs: [lint, test-linux]
    runs-on: macos-latest  # Expensive, but only runs after cheaper checks pass
    # ...

Self-Hosted Runners for Heavy Workloads

If you’re burning through Actions minutes, self-hosted runners might make sense. You pay for the compute directly (EC2, a Mac Mini in a closet, whatever) but don’t pay GitHub’s per-minute fees.

The trade-off: you’re responsible for maintenance, security, and availability. For small teams, it’s usually not worth the hassle. For larger organizations with predictable CI loads, it can save significant money.
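
If you do go that route, jobs target self-hosted machines through runner labels. A minimal sketch — the labels beyond self-hosted are whatever you assigned when registering the runner, and the make build entry point is just an assumed example:

```yaml
jobs:
  build:
    # Matches a registered runner carrying all three labels
    runs-on: [self-hosted, linux, x64]
    steps:
      - uses: actions/checkout@v4
      - run: make build   # hypothetical build command
```

Mixing is common: keep cheap lint jobs on GitHub-hosted runners and route only the heavy builds to your own hardware.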

Schedule Non-Urgent Work Off-Peak

This won’t save money, but it reduces load during peak development hours:

on:
  schedule:
    - cron: '0 3 * * *'  # 3 AM

Dependency updates, security scans, full regression suites—things that don’t need immediate feedback can run overnight.

Common Pitfalls

A few mistakes I’ve made (and seen others make):

Over-parallelization: Running 20 parallel jobs sounds fast, but if each job has 30 seconds of setup overhead, you’re wasting 10 minutes on overhead alone. Find the right balance.

Cache key collisions: Using ${{ runner.os }} in cache keys is common, but if you’re caching platform-specific binaries and also running on ARM vs x64, you need more specific keys.
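
The fix is cheap: fold the architecture into the key. A sketch using the built-in runner.arch context:

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cargo/registry
    # runner.arch distinguishes X64 from ARM64, so the two
    # architectures never share platform-specific binaries
    key: ${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('**/Cargo.lock') }}
    restore-keys: |
      ${{ runner.os }}-${{ runner.arch }}-cargo-
```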

Ignoring flaky tests: A test that fails 10% of the time means 10% of your builds take longer (because of retries) or fail entirely. Fix flaky tests—they’re a hidden performance drain.

Not measuring: “It feels faster” isn’t data. Track your p50 and p90 build times. Optimization without measurement is just guessing.

Quick Wins Checklist

If you take nothing else from this post, do these:

  1. Enable dependency caching for your package manager
  2. Add concurrency controls to cancel redundant runs
  3. Parallelize independent jobs (lint, test, build)
  4. Use npm ci instead of npm install (or equivalent for your ecosystem)
  5. Set artifact retention to something reasonable (not 90 days)
  6. Add path filters if your repo has distinct components

Each of these takes 5 minutes to implement and can shave minutes off every build.

The Bigger Picture

Workflow optimization is an ongoing process, not a one-time project. As your codebase grows, your CI will naturally slow down. Build in time for periodic reviews—maybe quarterly—to check if your workflows are still performing well.

The goal isn’t the fastest possible CI. It’s CI that’s fast enough that developers don’t work around it. If people are pushing without waiting for checks, or batching unrelated changes to avoid triggering builds, your CI is too slow. Fix it.

Fast feedback loops make better software. Every minute you shave off your CI is a minute your developers get back for actual work. And honestly? There’s something deeply satisfying about watching a build complete in 2 minutes that used to take 15.

Next up in this series: self-hosted runners—when the GitHub-provided runners aren’t enough, and how to run your own.