Last month, a Node.js project I work on had a CI pipeline that took 15 minutes. Fifteen. Every push, every PR, fifteen minutes of waiting. Developers started batching changes to avoid triggering builds. Code review slowed down. People merged PRs without waiting for green checks because who has that kind of time?
Three weeks later, that same pipeline runs in under 3 minutes. Same tests, same linting, same everything—just smarter about how it does it.
This is part of my GitHub Actions series. Today we’re talking about making workflows fast and cheap. Because slow CI isn’t just annoying—it actively hurts your team.
Why Slow CI Kills Productivity
Before diving into optimizations, let’s talk about why this matters.
A 15-minute CI run sounds tolerable. It’s just 15 minutes, right? But here’s what actually happens:
- Developer pushes code, starts the build
- Switches context to something else while waiting
- Gets deep into that other task
- Build fails—maybe a flaky test, maybe a linting error
- Context switch back, fix the issue, push again
- Another 15 minutes
- Repeat
Studies show context switching costs about 23 minutes to recover focus. A slow CI pipeline doesn’t just waste build time—it fragments your entire workday.
There’s also the money angle. GitHub Actions gives you 2,000 free minutes per month for private repos on the Free tier, 3,000 on Team. If your workflows are wasteful, you’ll burn through those minutes fast. And at $0.008 per minute for additional Linux runner time (more for macOS and Windows), costs add up.
The Node.js Pipeline That Went From 15 Minutes to 3
Let me walk through the actual optimizations I made. Here’s what the workflow looked like before:
```yaml
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm install
      - run: npm run lint
      - run: npm run typecheck
      - run: npm test
      - run: npm run build
```
Straightforward. Also slow. Let’s break down where the time went:
- Checkout: 5 seconds (fine)
- Setup Node: 3 seconds (fine)
- npm install: 4 minutes 30 seconds (yikes)
- Lint: 45 seconds
- Typecheck: 1 minute 20 seconds
- Tests: 6 minutes 15 seconds
- Build: 2 minutes 10 seconds
Total: 15 minutes 8 seconds
Optimization 1: Dependency Caching
The npm install step was downloading 847 packages every single run. That’s insane. GitHub’s cache action fixes this.
```yaml
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
- run: npm ci
```
Note two changes here. First, the cache: 'npm' option on setup-node. This caches the npm cache directory between runs, so subsequent installs only download packages that changed.
Second, I switched from npm install to npm ci. The ci command is faster for CI environments—it skips certain checks and installs exactly what’s in package-lock.json. On a warm cache, this brought install time down from 4:30 to 28 seconds.
Time saved: ~4 minutes
Optimization 2: Parallel Jobs
Lint, typecheck, tests, and build were running sequentially. But they don’t depend on each other. They can run in parallel.
```yaml
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
  typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run typecheck
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run build
```
“But wait,” you might say, “now each job is installing dependencies separately. Isn’t that wasteful?”
Yes and no. Each job runs on a fresh runner, so yes, there’s duplication. But the cache is shared across jobs in the same workflow run, so after the first job warms the cache, the others pull from it. More importantly, these jobs now run at the same time. The total wall-clock time is determined by the slowest job, not the sum of all jobs.
Before: 15 minutes (sequential). After: 6:45 (the test job, which is the longest).
Time saved: ~8 minutes
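If the per-job duplication bothers you, the repeated setup/install boilerplate can be factored into a local composite action. A sketch, with an assumed path of `.github/actions/setup` (the name and location are up to you):

```yaml
# .github/actions/setup/action.yml (hypothetical local composite action)
name: Setup Node
description: Checkout is done by the caller; this installs Node and dependencies
runs:
  using: composite
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: '20'
        cache: 'npm'
    - run: npm ci
      shell: bash  # run steps in composite actions must declare a shell
```

Each job's steps then shrink to `actions/checkout@v4` followed by `uses: ./.github/actions/setup`, and a change to the setup logic lives in one place.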
Optimization 3: Fail Fast
If linting fails, there’s no point running the other jobs. They’re going to fail anyway once the developer fixes the lint errors and retries.
```yaml
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      # ... lint steps
  typecheck:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      # ... typecheck steps
  test:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      # ... test steps
  build:
    needs: [typecheck, test]
    runs-on: ubuntu-latest
    steps:
      # ... build steps
```
Now lint runs first. If it passes, typecheck and test run in parallel. If both pass, build runs. A lint failure stops everything immediately—no wasted time running tests that will just be re-run after the fix.
Time saved: Variable, but meaningful when things fail
Optimization 4: Conditional Execution
Not every change needs every check. Documentation changes don’t need tests. Test changes don’t need to rebuild production bundles.
```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      docs: ${{ steps.filter.outputs.docs }}
      tests: ${{ steps.filter.outputs.tests }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - 'src/**'
            docs:
              - 'docs/**'
              - '*.md'
            tests:
              - 'tests/**'
              - '**/*.test.ts'
  test:
    needs: [lint, changes]
    if: needs.changes.outputs.src == 'true' || needs.changes.outputs.tests == 'true'
    runs-on: ubuntu-latest
    steps:
      # ... test steps
  build:
    needs: [typecheck, changes]
    if: needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      # ... build steps
```
The paths-filter action inspects what files changed and sets outputs accordingly. Downstream jobs can check those outputs to decide whether to run.
This one’s situational—it helps most when you have a mix of code, docs, and config changes happening. For pure code changes, everything still runs.
The Final Result
After all optimizations, the workflow looks like this:
```yaml
name: CI
on: [push, pull_request]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      tests: ${{ steps.filter.outputs.tests }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - 'src/**'
            tests:
              - 'tests/**'
              - '**/*.test.ts'
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
  typecheck:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run typecheck
  test:
    needs: [lint, changes]
    if: needs.changes.outputs.src == 'true' || needs.changes.outputs.tests == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test
  build:
    needs: [typecheck, test, changes]
    if: always() && needs.typecheck.result == 'success' && (needs.test.result == 'success' || needs.test.result == 'skipped') && needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run build
```
Note the concurrency block at the top. This is huge for PR workflows—if you push a new commit while a previous run is still going, it cancels the old run. No more watching three builds in progress when you only care about the latest one.
Final time: 2 minutes 47 seconds (on a typical code change with warm caches)
That’s an 82% reduction. Same checks, same safety, fraction of the time.
Caching Strategies by Ecosystem
npm caching is just the beginning. Here’s how to cache effectively across different ecosystems.
Python (pip)
```yaml
- uses: actions/setup-python@v5
  with:
    python-version: '3.12'
    cache: 'pip'
- run: pip install -r requirements.txt
```
Or for more control:
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
```
The restore-keys fallback is important. If your requirements.txt changes, you won’t have an exact cache match, but you’ll get the previous cache as a starting point. Partial cache hits are faster than cold starts.
Go Modules
```yaml
- uses: actions/setup-go@v5
  with:
    go-version: '1.22'
    cache: true
```
Go’s module system is already pretty fast, but caching still helps. The setup-go action caches ~/go/pkg/mod automatically.
Rust (Cargo)
Rust builds are notoriously slow. Caching is essential.
```yaml
- uses: actions/cache@v4
  with:
    path: |
      ~/.cargo/bin/
      ~/.cargo/registry/index/
      ~/.cargo/registry/cache/
      ~/.cargo/git/db/
      target/
    key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
    restore-keys: |
      ${{ runner.os }}-cargo-
```
The target/ directory is the big one—it contains compiled dependencies. On a large Rust project, this can save 10+ minutes.
There’s a dedicated action that handles this better:
```yaml
- uses: Swatinem/rust-cache@v2
```
It’s smarter about what to cache and automatically handles cache invalidation.
Ruby (Bundler)
```yaml
- uses: ruby/setup-ruby@v1
  with:
    ruby-version: '3.3'
    bundler-cache: true
```
One flag. That’s it. The setup-ruby action handles everything.
When Caching Hurts More Than It Helps
I need to be honest here—caching isn’t always the answer. There are cases where it backfires.
Cache invalidation bugs: If your cache key doesn’t include something it should, you might restore stale dependencies. I’ve seen builds pass in CI and fail locally because CI was using cached old packages. The fix is making sure your cache key hashes all relevant files.
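One concrete guard: hash the lockfile into the cache key, so any dependency change forces a fresh cache entry instead of silently reusing a stale one. A minimal sketch with `actions/cache` (the path assumes npm's default cache location on Linux):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
```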
Cache size limits: GitHub limits cache storage to 10GB per repository. If your caches are huge (looking at you, Rust projects with many dependencies), old caches get evicted. When the cache you need gets evicted, you’re back to cold starts—and you wasted time trying to restore a cache that doesn’t exist.
Overhead on small projects: Cache save and restore aren’t free. For a project with 20 small dependencies that install in 15 seconds, adding caching might actually slow things down. The save/restore overhead can exceed the time saved.
Rule of thumb: If your uncached install takes less than 30 seconds, caching probably isn’t worth it.
Docker Layer Caching
Docker builds in CI are often painfully slow because every build starts from scratch. Layer caching fixes this.
Here’s a multi-stage Dockerfile for a Node.js app:
```dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]
```
The key insight: dependencies change less often than source code. By copying package*.json first and running npm ci in a separate stage, Docker can reuse that layer when only source code changes.
But in CI, there’s no layer cache by default. Enter BuildKit:
```yaml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max
```
The type=gha cache backend stores layers in GitHub Actions cache. Subsequent builds pull cached layers, only rebuilding what changed.
Real Numbers: Docker Build Optimization
A project I worked on had a complex multi-stage build:
| Stage | Cold Build | With Layer Cache |
|---|---|---|
| Base dependencies | 3:42 | 8s (cached) |
| App dependencies | 2:15 | 12s (cached) |
| Build | 1:48 | 1:48 |
| Final image | 0:24 | 0:24 |
| Total | 8:09 | 2:32 |
That’s a 69% improvement, and it gets better over time as your dependency layers stabilize.
One gotcha: the GitHub Actions cache has a 10GB limit per repo. Docker images can be big. If you’re caching multiple large images, you might hit that limit and lose older caches. Consider using a container registry for cache storage instead:
```yaml
cache-from: type=registry,ref=ghcr.io/myorg/myapp:buildcache
cache-to: type=registry,ref=ghcr.io/myorg/myapp:buildcache,mode=max
```
Monorepo Optimization: Running Only Affected Tests
Monorepos present a unique challenge. You don’t want to run all tests for all packages when only one package changed.
The paths-filter approach I showed earlier works, but it requires manually maintaining filter definitions. For larger monorepos, consider tools like Nx or Turborepo that understand your dependency graph.
Here’s a simplified approach using Nx:
```yaml
jobs:
  affected:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Nx needs git history
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx nx affected --target=test --base=origin/main
      - run: npx nx affected --target=build --base=origin/main
```
The affected command figures out what changed compared to main and only runs tests/builds for those packages and their dependents.
A monorepo I work with has 12 packages. Running all tests takes 18 minutes. Running affected tests on a typical PR takes 4 minutes. That’s the power of understanding your dependency graph.
Artifact Management
Artifacts let you pass data between jobs and persist build outputs. But they come with overhead.
Upload and Download Costs
Every artifact upload and download takes time. For small artifacts, the overhead might exceed the benefit. I’ve seen workflows where uploading a 2MB test report took 20 seconds—longer than running the tests.
Rule of thumb: Don’t upload artifacts smaller than 10MB unless you really need them. Log the information instead.
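One way to apply that rule: gate the upload on failure, so the artifact only exists when someone will actually download it to debug. A sketch (the step names and coverage path are illustrative):

```yaml
- name: Run tests
  run: npm test -- --coverage
- name: Upload coverage report on failure only
  if: failure()  # skip the upload (and its overhead) on green runs
  uses: actions/upload-artifact@v4
  with:
    name: coverage-report
    path: coverage/
```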
Retention Policies
By default, artifacts stick around for 90 days. That’s a lot of storage for build outputs nobody will ever look at.
```yaml
- uses: actions/upload-artifact@v4
  with:
    name: build-output
    path: dist/
    retention-days: 7
```
For most use cases, 7 days is plenty. For release artifacts, you might keep them longer. Tune retention based on actual needs.
Compressing Before Upload
If you must upload large artifacts, compress them first:
```yaml
- run: tar -czf coverage.tar.gz coverage/
- uses: actions/upload-artifact@v4
  with:
    name: coverage-report
    path: coverage.tar.gz
```
I’ve seen 40% size reductions on coverage reports. Upload time scales roughly linearly with size, so this translates directly to time saved.
Choosing the Right Runner
GitHub offers different runner sizes. The default ubuntu-latest has 2 cores and 7GB RAM. For CPU-intensive tasks, that might not be enough.
Larger runners cost more but can be faster:
| Runner | vCPU | RAM | Cost/min (Linux) |
|---|---|---|---|
| Standard | 2 | 7GB | $0.008 |
| 4-core | 4 | 16GB | $0.016 |
| 8-core | 8 | 32GB | $0.032 |
| 16-core | 16 | 64GB | $0.064 |
Is it worth it? Depends on your workload.
For parallelizable tasks (like running tests with Jest or pytest), more cores help a lot. If your test suite takes 10 minutes on 2 cores but 3 minutes on 8 cores, the 8-core runner is actually cheaper:
- 2-core: 10 min x $0.008 = $0.08
- 8-core: 3 min x $0.032 = $0.096
Okay, slightly more expensive. But if developer time matters (and it does), the 7-minute savings per run adds up. If your team runs 50 builds a day, that’s 350 minutes saved daily. At even a modest hourly rate, the math is clear.
For single-threaded tasks, larger runners don’t help. Measure before spending.
Concurrency Controls
I mentioned this earlier, but it’s worth expanding. The concurrency setting prevents wasted work:
```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```
This groups runs by workflow and branch. If a new run starts for the same branch, the old one cancels.
For main branch builds, you might want different behavior:
```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
```
This cancels in-progress runs for feature branches but lets main branch runs complete. You probably want a clean history of main branch builds.
Incremental Builds and Test Selection
Some build tools support incremental compilation—only rebuilding what changed. TypeScript’s --incremental flag, for example:
```yaml
- uses: actions/cache@v4  # restore the build info before compiling
  with:
    path: tsconfig.tsbuildinfo
    key: ${{ runner.os }}-tsc-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-tsc-
- run: npx tsc --incremental
```
Caching the .tsbuildinfo file lets subsequent builds skip unchanged modules.
For tests, some frameworks support running only affected tests. Jest’s --changedSince flag:
```yaml
- run: npx jest --changedSince=origin/main
```
This runs tests for files that changed compared to main. In a large codebase, this can cut test time dramatically.
Caveat: Changed-based test selection can miss failures caused by transitive dependencies. A change in module A might break tests in module B that depends on A. Smart test selection tools try to account for this, but nothing’s perfect. Consider running the full test suite on main branch merges even if PRs only run affected tests.
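One way to implement that hedge, sketched here assuming Jest: run affected tests on branches but the full suite on pushes to main.

```yaml
- uses: actions/checkout@v4
  with:
    fetch-depth: 0  # --changedSince needs history to find the merge base
- name: Run tests
  run: |
    if [ "${{ github.ref }}" = "refs/heads/main" ]; then
      npx jest                             # full suite on main
    else
      npx jest --changedSince=origin/main  # affected tests elsewhere
    fi
```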
Monitoring and Analyzing Performance
You can’t optimize what you can’t measure. GitHub provides some built-in insights, but for serious analysis, you need more.
Workflow Run Logs
Every step shows its duration. Look for patterns:
- Which steps are consistently slow?
- Which steps have high variance?
- Are cache restore times reasonable?
GitHub Actions Insights
Under your repo’s Actions tab, there’s an “Insights” section showing workflow run times over time. Look for trends—if your builds are getting slower, investigate before it becomes a problem.
Third-Party Tools
Tools like BuildPulse, Datadog CI Visibility, and Honeycomb can give deeper insights. They track metrics over time, identify flaky tests, and help you spot performance regressions.
For most teams, the built-in tools are enough. But if you’re running hundreds of builds a day, dedicated CI analytics might be worth the investment.
Custom Timing Instrumentation
Sometimes you need more granularity than step-level timing. You can add your own:
```yaml
- name: Run tests with timing
  id: test  # referenced below as steps.test
  run: |
    start_time=$(date +%s)
    npm test
    end_time=$(date +%s)
    echo "Test duration: $((end_time - start_time)) seconds"
    echo "test_duration=$((end_time - start_time))" >> $GITHUB_OUTPUT
```
Combine this with job summaries to surface timing information:
```yaml
- name: Report timing
  run: |
    echo "## Performance Summary" >> $GITHUB_STEP_SUMMARY
    echo "- Test duration: ${{ steps.test.outputs.test_duration }}s" >> $GITHUB_STEP_SUMMARY
```
Cost Optimization for Private Repos
Public repos get unlimited Actions minutes. Private repos don’t. Here’s how to keep costs down.
Use Linux When Possible
Runner prices vary dramatically by OS:
| OS | Price/minute |
|---|---|
| Linux | $0.008 |
| Windows | $0.016 |
| macOS | $0.08 |
macOS is 10x more expensive than Linux. If you only need macOS for final testing (not for every lint and typecheck), structure your workflow accordingly:
```yaml
jobs:
  lint:
    runs-on: ubuntu-latest  # cheap
    # ...
  test-linux:
    runs-on: ubuntu-latest  # cheap
    # ...
  test-macos:
    needs: [lint, test-linux]
    runs-on: macos-latest  # expensive, but only runs after cheaper checks pass
    # ...
```
Self-Hosted Runners for Heavy Workloads
If you’re burning through Actions minutes, self-hosted runners might make sense. You pay for the compute directly (EC2, a Mac Mini in a closet, whatever) but don’t pay GitHub’s per-minute fees.
The trade-off: you’re responsible for maintenance, security, and availability. For small teams, it’s usually not worth the hassle. For larger organizations with predictable CI loads, it can save significant money.
Schedule Non-Urgent Work Off-Peak
This won’t save money, but it reduces load during peak development hours:
```yaml
on:
  schedule:
    - cron: '0 3 * * *'  # 3 AM UTC
```
Dependency updates, security scans, full regression suites—things that don’t need immediate feedback can run overnight.
Common Pitfalls
A few mistakes I’ve made (and seen others make):
Over-parallelization: Running 20 parallel jobs sounds fast, but if each job has 30 seconds of setup overhead, you’re wasting 10 minutes on overhead alone. Find the right balance.
Cache key collisions: Using ${{ runner.os }} in cache keys is common, but if you’re caching platform-specific binaries and also running on ARM vs x64, you need more specific keys.
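The runner.arch context covers this. Sketching it onto the Cargo cache key from earlier, so ARM and x64 builds never share compiled artifacts:

```yaml
key: ${{ runner.os }}-${{ runner.arch }}-cargo-${{ hashFiles('**/Cargo.lock') }}
```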
Ignoring flaky tests: A test that fails 10% of the time means 10% of your builds take longer (because of retries) or fail entirely. Fix flaky tests—they’re a hidden performance drain.
Not measuring: “It feels faster” isn’t data. Track your p50 and p90 build times. Optimization without measurement is just guessing.
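For the measuring itself, the GitHub CLI can pull run durations. Here's a rough sketch that computes approximate p50/p90 from a list of durations; the workflow file name is an assumption, and the commented pipeline requires gh and jq to be installed:

```shell
#!/bin/sh
# Rough percentiles over newline-separated durations (in seconds) on stdin.
percentiles() {
  sort -n | awk '{ a[NR] = $1 }
    END { print "p50:", a[int(NR * 0.50)]; print "p90:", a[int(NR * 0.90)] }'
}

# Feed it real run durations via the GitHub CLI, e.g.:
#   gh run list --workflow=ci.yml --limit 100 --json startedAt,updatedAt \
#     | jq -r '.[] | ((.updatedAt | fromdate) - (.startedAt | fromdate))' \
#     | percentiles
```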
Quick Wins Checklist
If you take nothing else from this post, do these:
- Enable dependency caching for your package manager
- Add concurrency controls to cancel redundant runs
- Parallelize independent jobs (lint, test, build)
- Use npm ci instead of npm install (or equivalent for your ecosystem)
- Set artifact retention to something reasonable (not 90 days)
- Add path filters if your repo has distinct components
Each of these takes 5 minutes to implement and can shave minutes off every build.
The Bigger Picture
Workflow optimization is an ongoing process, not a one-time project. As your codebase grows, your CI will naturally slow down. Build in time for periodic reviews—maybe quarterly—to check if your workflows are still performing well.
The goal isn’t the fastest possible CI. It’s CI that’s fast enough that developers don’t work around it. If people are pushing without waiting for checks, or batching unrelated changes to avoid triggering builds, your CI is too slow. Fix it.
Fast feedback loops make better software. Every minute you shave off your CI is a minute your developers get back for actual work. And honestly? There’s something deeply satisfying about watching a build complete in 2 minutes that used to take 15.
Next up in this series: self-hosted runners—when the GitHub-provided runners aren’t enough, and how to run your own.