Optimizing Rust CI Pipeline with GitHub Actions: A Deep Dive into Caching Strategies
Posted by Jungwoo Song.
As our Rust project grew, we faced increasing build times in our CI pipeline. This post shares my journey of optimizing the CI/CD process using GitHub Actions, focusing on caching strategies and Docker optimizations. I’ll walk you through the problems I encountered and how I solved them.
Initial Challenges
When I started, our CI pipeline had several issues:
- Long build times (14+ minutes)
- Repeated dependency downloads
- Inefficient Docker layer caching
Let’s look at how I addressed each of these issues.
Docker Layer Optimization
Our initial Dockerfile was simple but inefficient:
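A minimal sketch of that kind of "copy everything, then build" Dockerfile (the image tags and the binary name `app` are placeholders, not our actual values):

```dockerfile
# Sketch of the original layout: everything copied before the build,
# so any change anywhere invalidates the build layer
FROM rust:1.75 AS builder
WORKDIR /app

COPY . .
RUN cargo build --release

FROM debian:bookworm-slim
COPY --from=builder /app/target/release/app /usr/local/bin/app
CMD ["app"]
```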
This approach rebuilt everything on every change. I optimized it by separating the dependency build from the application code:
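A sketch of the separated version, again with placeholder image tags and binary name. The stub `main.rs` lets cargo compile all dependencies into their own layer before the real sources are copied in:

```dockerfile
FROM rust:1.75 AS builder
WORKDIR /app

# 1. Copy only the manifests and build a stub crate so the dependency
#    compilation lands in its own cacheable layer
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs \
    && cargo build --release \
    && rm -rf src

# 2. Copy the real sources last, so only this layer rebuilds on code changes
COPY src ./src
RUN touch src/main.rs && cargo build --release

FROM debian:bookworm-slim
COPY --from=builder /app/target/release/app /usr/local/bin/app
CMD ["app"]
```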
This separation ensures that dependencies are cached in a separate layer, rebuilding only when Cargo.toml or Cargo.lock changes. The key insight here is that Docker layer caching works best when frequently changing files are copied last.
GitHub Actions Cache Strategy
Initially, we didn’t have a cache strategy. Here’s how I improved it:
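A sketch of the improved job (the job name and build command are assumptions; the cache paths are the standard cargo locations):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Restore the cargo caches, keyed on the Cargo.lock hash
      - name: Restore cargo cache
        id: cargo-cache
        uses: actions/cache/restore@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-

      - name: Build and test
        run: cargo test --release

      # Save explicitly, reusing the key computed by the restore step
      - name: Save cargo cache
        uses: actions/cache/save@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ steps.cargo-cache.outputs.cache-primary-key }}
```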
The key improvements here are:
- Separate restore and save actions
- Use Cargo.lock hash for cache key
Docker Build Optimization with Cache Mounts
The game-changer in our optimization journey was implementing BuildKit’s cache mounts. This feature allows us to maintain a persistent cache across builds without increasing the final image size.
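In the builder stage this looks roughly like the sketch below (the binary name is a placeholder). Because a cache-mounted target directory is not part of any image layer, the built binary has to be copied out of the mount:

```dockerfile
FROM rust:1.75 AS builder
WORKDIR /app
COPY . .

# The registry, git, and target caches persist in BuildKit's cache across
# builds, but never end up in an image layer
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/app/target \
    cargo build --release \
    && cp target/release/app /app/app

FROM debian:bookworm-slim
COPY --from=builder /app/app /usr/local/bin/app
CMD ["app"]
```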
The cache mounts provide several benefits:
- Persistent caching of dependencies
- No impact on final image size
- Faster subsequent builds
To enable this in GitHub Actions, I needed to configure Buildx:
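A sketch of that setup using the official docker/setup-buildx-action and docker/build-push-action with the GitHub Actions cache backend; the image tag and push settings here are assumptions:

```yaml
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: my-app:ci
          # Reuse BuildKit layer cache between workflow runs
          cache-from: type=gha
          cache-to: type=gha,mode=max
```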
Performance Impact
Our optimizations led to significant improvements:
| Step | Without Cache | With Cache |
|---|---|---|
| Initial Build | 14m 43s | 6m 23s |
| Dependency Change Only | 14m 43s | 6m 39s |
| Source Code Change Only | 14m 43s | 6m 21s |
The most dramatic improvement was in subsequent builds, where I saw a 55% reduction in build time.
Handling Branch-Specific Caching
One challenge I faced was cache misses on new branches. In our project we use a trunk-based development strategy where feature branches are created from the main branch, and each feature branch would create its own isolated cache, leading to redundant cache entries and longer build times across branches. This isolation comes from GitHub Actions' cache access restrictions, which scope caches to the branch that created them for security reasons (GitHub Actions Cache Documentation). While this isolation can be beneficial for security, it wasn't optimal for our trunk-based development workflow.

I solved this by leveraging the fact that caches created on the default branch are accessible from all branches. I created a separate GitHub Actions workflow, dedicated to cache management, that runs only on the main branch:
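A sketch of such a workflow (the file name and build command are assumptions); the `lookup-only` restore step skips the warm-up build when an up-to-date cache already exists:

```yaml
name: Warm cargo cache (main)

on:
  push:
    branches: [main]

jobs:
  warm-cache:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Check whether a cache for this Cargo.lock already exists,
      # without downloading it
      - name: Check for existing cache
        id: cache-check
        uses: actions/cache/restore@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          lookup-only: true

      - name: Build to populate cache
        if: steps.cache-check.outputs.cache-hit != 'true'
        run: cargo build --release

      - name: Save cache
        if: steps.cache-check.outputs.cache-hit != 'true'
        uses: actions/cache/save@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          key: ${{ steps.cache-check.outputs.cache-primary-key }}
```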
This configuration ensures that:
- New feature branches first attempt to use their own cache
- If no branch-specific cache exists, it falls back to the main branch’s cache
- The cache key includes the runner OS and Cargo.lock hash for proper dependency tracking
Best Practices and Lessons Learned
1. Layer Organization
- Keep frequently changing files in later layers
- Separate dependency installation from application code
- Use multi-stage builds for smaller final images
2. Cache Strategy
- Create dedicated cache management workflow for main branch
- Use Cargo.lock hash for cache keys
- Implement cache existence check to avoid redundant builds
- Leverage default branch cache sharing for feature branches
3. BuildKit Optimization
- Use cache mounts for cargo registry and target directory
- Configure proper cache locations
- Implement platform-specific caching
Conclusion
Through these optimizations, I achieved:
- 55% faster CI pipeline
- More efficient resource usage
- Better developer experience
- Improved team development productivity
- Reduced CI costs
The key to success was understanding how the different caching mechanisms work together and implementing them so that they complement each other. While the initial setup required some effort, the long-term savings in build time and resource usage have made it worthwhile.