6 Critical Insights into GitHub's eBPF Strategy for Breaking Deployment Dependencies

From Moocchen, the free encyclopedia of technology

At GitHub, we run our own source code on github.com, a practice that makes us our own most demanding customer. But this setup creates a dangerous loop: if github.com goes down, we can't access the code needed to fix it. That's a classic circular dependency, and it isn't the only one. Deployment scripts can themselves introduce dependencies that fail during an outage. To solve this, we turned to eBPF, a Linux kernel technology that lets us monitor and block problematic calls in real time. In this article, we'll break down the core problem, explore three types of circular dependencies, explain why manual reviews aren't enough, show how eBPF blocks dependencies, outline a practical implementation, and look at the future of deployment safety.

1. The Circular Dependency Dilemma

GitHub's infrastructure is built on a simple truth: we must be able to deploy fixes even when our service is partially down. Yet every deployment script runs the risk of creating a circular dependency—a loop where the deployment requires the very service it's trying to repair. For example, if a script pulls a binary from GitHub's release service, and that service is down, the script fails. This isn't just a theoretical concern; it's a real threat to uptime. Understanding this dilemma is the first step toward building resilient deployment systems. The key is to ensure that deployment code never depends on the healthy state of the platform it's fixing.

Source: github.blog

2. Three Types of Circular Dependencies

During an outage, deployment scripts can run into three distinct types of circular dependencies. Direct dependencies occur when a script explicitly calls a service that's down, for instance downloading a tool from GitHub while GitHub is unavailable. Hidden dependencies are more subtle: a script uses a tool that is already installed locally but that, behind the scenes, checks for updates online. If the update check hangs or fails, the script stalls. Transient dependencies involve a chain: a script calls an internal API, which in turn tries to fetch something from GitHub, so the failure propagates back to the script. Each type calls for a different mitigation strategy, but all share the same root cause: the deployment relies, directly or indirectly, on the very service that is down.
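To make the three types concrete, here is a minimal Python sketch that models dependencies as a graph and lists every path from a deployment script to the platform being repaired. The graph contents and the names `DEPENDS_ON`, `risky_paths`, and `is_github` are hypothetical, chosen only for illustration; this is not GitHub's tooling. In this model a direct dependency is a path with a single hop, while hidden and transient dependencies both show up as longer paths.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the listed targets.
# None of these names are real GitHub services.
DEPENDS_ON = {
    "deploy.sh": ["github.com/releases", "local-cli", "internal-api"],
    "local-cli": ["github.com/update-check"],   # hidden: silent update check
    "internal-api": ["github.com/releases"],    # transient: via another service
}

def is_github(target):
    """Assumed rule: anything hosted on github.com is risky during an outage."""
    return target.startswith("github.com")

def risky_paths(start, graph, is_risky):
    """Breadth-first search returning every path from start to a risky target."""
    found = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for dep in graph.get(path[-1], []):
            if dep in path:          # ignore loops inside the map itself
                continue
            new_path = path + [dep]
            if is_risky(dep):
                found.append(new_path)
            else:
                queue.append(new_path)
    return found

for path in risky_paths("deploy.sh", DEPENDS_ON, is_github):
    kind = "direct" if len(path) == 2 else "indirect (hidden or transient)"
    print(kind, ":", " -> ".join(path))
```

A static map like this only finds the dependencies someone has already written down, which is the limitation the next section discusses.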

3. Why Manual Review Isn't Enough

Traditionally, the burden fell on individual teams to audit their deployment scripts for circular dependencies. They'd scan code, document assumptions, and test in staging. But this approach has fatal flaws. Dependencies evolve: a script that works today may break tomorrow when a background tool adds an update check. Hidden and transient dependencies are notoriously hard to spot manually, especially in large, complex codebases. Moreover, manual reviews are slow and prone to human error. When an outage strikes, you need a deployment system that enforces safety automatically, not one that relies on fallible human memory. That's where kernel-level enforcement becomes invaluable.

4. How eBPF Blocks Circular Dependencies

eBPF (extended Berkeley Packet Filter) allows us to run sandboxed programs in the Linux kernel without changing kernel code. For deployment safety, we use eBPF to intercept the system calls made by deployment scripts, specifically network calls. When a script attempts to connect to a service that we've deemed risky (like github.com itself), eBPF can log that call or block it outright. This is done by attaching eBPF programs to kernel network hooks, such as those that run when a process calls connect(). We can create allowlists of known-safe endpoints (e.g., package mirrors, internal artifact stores) and deny everything else. Because the check happens in the kernel, it also covers connections made by child processes and by the tools a script invokes, so hidden layers of indirection cannot bypass it.
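The allow-or-deny decision described above can be modeled in user space. The following Python sketch is not an eBPF program and is not GitHub's policy; it only illustrates the default-deny lookup that a kernel hook would perform for each connection attempt. The networks, ports, and the names `ALLOWLIST` and `verdict` are assumptions made for the example.

```python
import ipaddress

# Hypothetical allowlist of (network, port) pairs. The addresses are
# placeholders from private and documentation ranges, not real endpoints.
ALLOWLIST = [
    (ipaddress.ip_network("10.0.0.0/8"), 443),    # assumed internal artifact store
    (ipaddress.ip_network("192.0.2.0/24"), 443),  # assumed package mirror
]

def verdict(dest_ip, dest_port, allowlist=ALLOWLIST):
    """Return "allow" if the destination matches an entry, else "deny"."""
    addr = ipaddress.ip_address(dest_ip)
    for network, port in allowlist:
        if dest_port == port and addr in network:
            return "allow"
    return "deny"   # default-deny: anything not listed is blocked

print(verdict("10.1.2.3", 443))
print(verdict("203.0.113.10", 443))
```

Default-deny is the important design choice here: a new, unknown dependency is blocked until someone explicitly decides it is safe during an outage.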

5. Implementing eBPF in Your Deployment Pipeline

You can apply the same approach in your own pipeline in a few steps. First, identify the services your deployment scripts call during a run. Use a tool like strace or bpftrace to capture all outbound connections. Then, write a small eBPF program that runs on each connect() call and checks the destination IP/port against a policy map. Note that the choice of hook matters: a program attached to a syscall tracepoint can observe connections but is not designed to reject them, whereas a cgroup-attached connect hook (cgroup/connect4 and cgroup/connect6) can deny the call, which then fails with a permission error. With such a hook, you can allow traffic to your internal artifact repository but block anything else. GitHub's actual implementation uses a custom eBPF-based guard that runs on each deployment host. The program is loaded at boot time and applies to all deployment processes. You can start with open-source frameworks like bcc or libbpf. Test in a non-production environment first, ideally in a log-only mode before you enforce blocking.
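The first step, capturing outbound connections, can be sketched as follows. This Python example assumes a trace collected with something like `strace -f -e trace=connect ./deploy.sh` (where `./deploy.sh` stands in for your own script) and extracts the IPv4 destinations from it. The sample lines are hand-written to resemble typical strace output rather than taken from a real run, the exact format can differ between strace versions, and IPv6 and DNS names are not handled. The names `CONNECT_RE`, `outbound_endpoints`, and `SAMPLE` are hypothetical.

```python
import re

# Matches IPv4 connect() lines as strace commonly prints them (assumed format).
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, '
    r'sin_port=htons\((?P<port>\d+)\), '
    r'sin_addr=inet_addr\("(?P<ip>[\d.]+)"\)\}'
)

def outbound_endpoints(strace_text):
    """Return the sorted, de-duplicated (ip, port) pairs seen in connect() calls."""
    seen = set()
    for line in strace_text.splitlines():
        match = CONNECT_RE.search(line)
        if match:
            seen.add((match.group("ip"), int(match.group("port"))))
    return sorted(seen)

# Hand-written sample; addresses are placeholders, not real services.
SAMPLE = '''\
[pid 4121] connect(3, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("10.1.2.3")}, 16) = 0
[pid 4122] connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("203.0.113.10")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 4121] connect(5, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 4123] connect(3, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("10.1.2.3")}, 16) = 0
'''

for ip, port in outbound_endpoints(SAMPLE):
    print(ip, port)
```

The resulting list is a starting point for an allowlist, not the allowlist itself: each endpoint still needs a decision about whether it would remain reachable during an outage.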

6. The Future of Deployment Safety with eBPF

eBPF is rapidly evolving, and its role in deployment safety will only grow. Future possibilities include dynamic policy updates based on real-time incident status, integration with orchestration tools like Kubernetes, and even fine-grained control over file system calls to prevent access to stale caches. GitHub is already exploring ways to extend this system to other circular dependency patterns, such as database calls that might trigger cascading failures. The ultimate goal is a self-healing deployment pipeline that enforces safety without human intervention. By adopting eBPF now, you can stay ahead of these trends and build a more resilient infrastructure.

In conclusion, eBPF provides a robust, kernel-level solution to the age-old problem of circular dependency in deployments. It moves the safety check from manual reviews to automated enforcement, giving engineers confidence that their scripts won't make a bad situation worse. Whether you're running a massive platform like GitHub or a smaller service, the principles remain the same: identify risky dependencies, use eBPF to block them, and deploy with peace of mind.