<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="3.9.0">Jekyll</generator><link href="https://github.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://github.com/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-04-14T17:51:51+00:00</updated><id>https://github.com/feed.xml</id><title type="html">Colin’s ALM Corner</title><subtitle>All things DevOps and GitHub. Musings about DevOps tooling, culture and philosophy.
</subtitle><author><name>Colin Dembovsky</name></author><entry><title type="html">From Sprints to Swarms, Part 3: When Code Gets Cheaper, Judgment Gets More Valuable.</title><link href="https://github.com/from-sprints-to-swarms-part-3-judgment-gets-more-valuable/" rel="alternate" type="text/html" title="From Sprints to Swarms, Part 3: When Code Gets Cheaper, Judgment Gets More Valuable." /><published>2026-04-14T09:00:00+00:00</published><updated>2026-04-14T09:00:00+00:00</updated><id>https://github.com/from-sprints-to-swarms-part-3-judgment-gets-more-valuable</id><content type="html" xml:base="https://github.com/from-sprints-to-swarms-part-3-judgment-gets-more-valuable/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#cheap-code-changes-the-economics&quot; id=&quot;markdown-toc-cheap-code-changes-the-economics&quot;&gt;Cheap Code Changes the Economics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#two-lanes-different-purposes&quot; id=&quot;markdown-toc-two-lanes-different-purposes&quot;&gt;Two Lanes, Different Purposes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-developer-role-moves-up-the-stack&quot; id=&quot;markdown-toc-the-developer-role-moves-up-the-stack&quot;&gt;The Developer Role Moves Up the Stack&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#verification-becomes-a-competitive-advantage&quot; id=&quot;markdown-toc-verification-becomes-a-competitive-advantage&quot;&gt;Verification Becomes a Competitive Advantage&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-platform-matters&quot; id=&quot;markdown-toc-the-platform-matters&quot;&gt;The Platform Matters&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#better-judgment-beats-more-output&quot; id=&quot;markdown-toc-better-judgment-beats-more-output&quot;&gt;Better Judgment Beats More Output&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#metrics-that-actually-matter&quot; id=&quot;markdown-toc-metrics-that-actually-matter&quot;&gt;Metrics That Actually Matter&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#devops-becomes-more-strategic-not-less&quot; id=&quot;markdown-toc-devops-becomes-more-strategic-not-less&quot;&gt;DevOps Becomes More Strategic, Not Less&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In &lt;a href=&quot;/from-sprints-to-swarms-part-1-ai-made-code-cheap/&quot;&gt;Part 1&lt;/a&gt; of this series, I argued that AI made code cheap, not delivery easy. In &lt;a href=&quot;/from-sprints-to-swarms-part-2-context-is-infrastructure/&quot;&gt;Part 2&lt;/a&gt; I argued that context and policy form the control plane that keeps the speed gains from AI from turning into chaos.&lt;/p&gt;

&lt;p&gt;That leaves the last question: when implementation gets dramatically cheaper, what becomes more valuable?&lt;/p&gt;

&lt;p&gt;The answer is &lt;em&gt;judgment&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Bad decisions, weak proofs, brittle architecture, and slow recovery are still expensive. Agents lower the cost of generating options, but they do not lower the cost of choosing badly or shipping something you cannot safely operate.&lt;/p&gt;

&lt;p&gt;This is the shift many teams still underestimate. Cheap generation should not lower the quality bar. It should raise the proof bar. If an agent can produce three plausible implementations before lunch, the hard part is no longer typing one of them. The hard part is deciding which one belongs in your production system, and proving it.&lt;/p&gt;

&lt;h2 id=&quot;cheap-code-changes-the-economics&quot;&gt;Cheap Code Changes the Economics&lt;/h2&gt;

&lt;p&gt;Implementation cost shapes engineering behavior. Teams tolerate awkward abstractions, live with duplicated logic, and defer rewrites because changing the system is expensive. Even when everybody knows a design is wrong, replacement cost often keeps the wrong thing in place.&lt;/p&gt;

&lt;p&gt;Agents change that calculation.&lt;/p&gt;

&lt;p&gt;You can now run parallel solution probes against the same problem, compare trade-offs quickly, and discard the weaker paths without feeling like you wasted a sprint. Disposable prototypes become practical. Selective rewrites become easier to justify. Branch by abstraction becomes more attractive because trying two implementations behind a stable contract is less painful than it used to be.&lt;/p&gt;

&lt;p&gt;That is the upside. The caution is just as important: &lt;em&gt;the cost of experiments is falling, but the cost of incidents is not&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Production outages still hurt. Compliance failures still hurt. Data corruption still hurts. Reputational damage still hurts. So while implementation gets cheaper, operational mistakes do not. That means the winning move is not to generate more code. It is to generate more options, then apply better judgment.&lt;/p&gt;

&lt;p&gt;This is why I think teams should carry &lt;em&gt;ideas&lt;/em&gt; forward, not &lt;em&gt;baggage&lt;/em&gt;. If a prototype taught you something useful, keep the learning. Do not preserve every rushed implementation just because code already exists. Code volume is not an asset by itself. Clarity of architecture and intent is.&lt;/p&gt;

&lt;h2 id=&quot;two-lanes-different-purposes&quot;&gt;Two Lanes, Different Purposes&lt;/h2&gt;

&lt;p&gt;In the first post I talked about dual-lane execution: humans and agents should not own the same kinds of work in the same way. There is a related distinction inside the codebase itself. Teams increasingly need two lanes for software assets:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Innovation Lane&lt;/th&gt;
      &lt;th&gt;Production Lane&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Fast exploration&lt;/td&gt;
      &lt;td&gt;Hardened delivery&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Multiple competing implementations&lt;/td&gt;
      &lt;td&gt;One supported implementation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Loose edges are acceptable&lt;/td&gt;
      &lt;td&gt;Operational sharp edges are not&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Local proof may be enough&lt;/td&gt;
      &lt;td&gt;System proof is required&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Easy to discard&lt;/td&gt;
      &lt;td&gt;Expensive to keep&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The innovation lane can be a little messy on purpose. Prototype the workflow. Try two prompts. Compare two parsers. Generate scaffolding. Spike the interface. Throw most of it away if the idea does not hold up.&lt;/p&gt;

&lt;p&gt;The production lane is different. Once code graduates into the part of the system that other people depend on, the standards change. At that point, tests are not optional polish. Observability is not nice-to-have. Ownership cannot be fuzzy. Rollback cannot be an afterthought.&lt;/p&gt;

&lt;p&gt;That handoff matters more in the agentic era because the front half of the system can now move much faster than the back half. If you do not define graduation criteria, prototype code can easily leak into production simply because it exists and mostly works.&lt;/p&gt;

&lt;p&gt;A useful graduation gate asks a few plain questions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;does this code solve a real, validated problem&lt;/li&gt;
  &lt;li&gt;can someone explain the design and its trade-offs clearly&lt;/li&gt;
  &lt;li&gt;do tests describe the expected behavior at the right level&lt;/li&gt;
  &lt;li&gt;do we know what signals will tell us it is healthy in production&lt;/li&gt;
  &lt;li&gt;is there a safe rollback or containment path&lt;/li&gt;
  &lt;li&gt;is there a named human owner for the risk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the answers to those questions are weak, the solution is not ready, &lt;em&gt;no matter how quickly it was produced&lt;/em&gt;.&lt;/p&gt;
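
&lt;p&gt;One way to make that gate concrete is to record the answers as data, so a candidate cannot graduate by default. The sketch below is only an illustration: the class name, the field names, and the all-or-nothing rule are assumptions, not a prescribed implementation.&lt;/p&gt;

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GraduationGate:
    """One yes/no answer per graduation question (field names are hypothetical)."""
    solves_validated_problem: bool
    design_explained: bool
    tests_describe_behavior: bool
    health_signals_known: bool
    rollback_path_exists: bool
    named_owner: bool


def unmet_criteria(gate: GraduationGate) -> list[str]:
    """Return the names of the criteria that are not yet satisfied."""
    return [f.name for f in fields(gate) if not getattr(gate, f.name)]


def ready_for_production(gate: GraduationGate) -> bool:
    """Assumed rule: a candidate graduates only when every answer is yes."""
    return not unmet_criteria(gate)


# Hypothetical prototype that works but has no health signals or rollback path yet.
candidate = GraduationGate(
    solves_validated_problem=True,
    design_explained=True,
    tests_describe_behavior=True,
    health_signals_known=False,
    rollback_path_exists=False,
    named_owner=True,
)
print(unmet_criteria(candidate))
print(ready_for_production(candidate))
```

&lt;p&gt;Listing the unmet criteria, rather than returning a bare pass or fail, tells the team what work remains before the code can move lanes.&lt;/p&gt;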

&lt;h2 id=&quot;the-developer-role-moves-up-the-stack&quot;&gt;The Developer Role Moves Up the Stack&lt;/h2&gt;

&lt;p&gt;This is the part that tends to trigger anxiety, but I think the shift is more interesting than threatening. The developer role does not disappear; it shifts toward higher-leverage work.&lt;/p&gt;

&lt;p&gt;When code generation is abundant, the highest-value engineering work moves toward direction, selection, and proof. Developers become more like workflow directors than pure implementers. They frame the problem, shape the context packet, set boundaries for the agent, compare candidate solutions, and decide which path deserves deeper investment.&lt;/p&gt;

&lt;p&gt;That role has several facets:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;workflow director&lt;/strong&gt; - decomposes work, routes it well, and manages parallel execution&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;context architect&lt;/strong&gt; - makes sure the right domain, design, and operational context are available at the start&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;quality governor&lt;/strong&gt; - insists on evidence, not just plausible output&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;platform engineer&lt;/strong&gt; - improves the paved road so safe agentic delivery is the default&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;reviewer as verifier&lt;/strong&gt; - evaluates correctness, risk, and fitness, not just style&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of those are entirely new. But they move much closer to the center of the job.&lt;/p&gt;

&lt;p&gt;This is also why I think AI enablement becomes a continuous discipline, not a one-time rollout. Mature teams will keep refining their instructions, task templates, skills, ownership metadata, tests, rulesets, and observability because those things compound. Instructions, examples, docs, and tests are not administrative residue anymore. They are capital that helps humans and agents make better decisions.&lt;/p&gt;

&lt;p&gt;In other words, the developer of the next few years is not just writing code. They are designing the system that decides how code gets written, validated, and trusted.&lt;/p&gt;

&lt;h2 id=&quot;verification-becomes-a-competitive-advantage&quot;&gt;Verification Becomes a Competitive Advantage&lt;/h2&gt;

&lt;p&gt;If there is one idea I want to land in this series, it is this: &lt;em&gt;when generation gets cheap, verification becomes a differentiator&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That starts with tests, but it does not end there. A strong verification culture treats tests as contracts, observability as proof, rollout telemetry as feedback, and provenance as part of operational reality. It asks not only “did the code compile?” but also “what exactly did we change, why do we believe it is safe, and how quickly will we know if we were wrong?”&lt;/p&gt;

&lt;p&gt;That leads to a different kind of handoff. Good handoffs carry evidence:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;what the intended behavior is&lt;/li&gt;
  &lt;li&gt;what changed and why&lt;/li&gt;
  &lt;li&gt;what tests were run&lt;/li&gt;
  &lt;li&gt;what risks remain&lt;/li&gt;
  &lt;li&gt;what telemetry or dashboards should be watched&lt;/li&gt;
  &lt;li&gt;what the recovery path is if reality disagrees with the plan&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not bureaucracy. It is how you keep review and operations from collapsing under higher change volume.&lt;/p&gt;

&lt;p&gt;It also changes what good platforms optimize for. Faster pipelines matter, but fast wrong answers are not impressive. The real target is trustworthy speed: fast feedback, useful test signals, clean provenance, safe progressive rollout, and recovery paths that work under stress.&lt;/p&gt;

&lt;p&gt;That is why cheap code should raise the bar for proof. If your system can generate more change, your verification system has to turn more ambiguity into confidence. Teams that do this well will feel dramatically faster. Teams that do it poorly will feel busy, noisy, and fragile.&lt;/p&gt;

&lt;h2 id=&quot;the-platform-matters&quot;&gt;The Platform Matters&lt;/h2&gt;

&lt;p&gt;This is also why I think an integrated platform matters more in the agentic era.&lt;/p&gt;

&lt;p&gt;If agents work in one place, context lives in another, CI runs somewhere else, security signals arrive late, and audit trails have to be stitched together by hand, the system creates friction right where it needs clarity. Review slows down. Provenance gets fuzzy. Policy becomes easier to bypass by accident. The more autonomy you add, the more expensive that fragmentation becomes.&lt;/p&gt;

&lt;p&gt;An integrated platform does not remove the need for judgment. It makes judgment easier to apply at the right moment. When code, issues, pull requests, code review, Actions, security controls, environments, and audit history live in the same system, context travels with the work and evidence is easier to inspect.&lt;/p&gt;

&lt;p&gt;That is why I think GitHub’s Agentic Platform is strategically important here. It gives teams a shared surface for agent execution, human review, workflow automation, and governance rather than forcing them to assemble those capabilities from disconnected tools. Agents can work where developers already work, and the surrounding controls can stay close to the activity instead of being bolted on later.&lt;/p&gt;

&lt;p&gt;That matters because speed is not the hard part anymore. Coordinated, governable, evidence-bearing speed is.&lt;/p&gt;

&lt;h2 id=&quot;better-judgment-beats-more-output&quot;&gt;Better Judgment Beats More Output&lt;/h2&gt;

&lt;p&gt;Once agents can produce large amounts of software, the hard decisions become more visible.&lt;/p&gt;

&lt;p&gt;What is worth building at all? What should be tested as an experiment and then discarded? What should be hardened into a long-lived capability? What should never be automated because the downside of failure is too high? Where does human review add real value, and where is it just ritual?&lt;/p&gt;

&lt;p&gt;Those are judgment questions. They live at the boundary between engineering, product, operations, and risk.&lt;/p&gt;

&lt;p&gt;This is also where accountability stays stubbornly human. An agent can propose a migration plan, generate a feature, or suggest a test. But a person still owns the decision to ship, the acceptance of residual risk, and the responsibility for the outcome. That is not a limitation of the tools. It is a property of real systems and real organizations.&lt;/p&gt;

&lt;p&gt;I also think teams need better trust calibration here. Blind trust is dangerous. Reflexive distrust is wasteful. The useful posture is conditional trust, backed by evidence. Low-risk, well-bounded work with strong automated checks should flow quickly. High-risk work should keep much tighter human loops.&lt;/p&gt;

&lt;p&gt;Not every task deserves autonomous parallelization. Agent budget should be treated a bit like compute budget: spend it where speed, coverage, or toil reduction materially improve the business outcome, not where it just creates more artifacts to review.&lt;/p&gt;

&lt;p&gt;Autonomy is earned by evidence.&lt;/p&gt;

&lt;h2 id=&quot;metrics-that-actually-matter&quot;&gt;Metrics That Actually Matter&lt;/h2&gt;

&lt;p&gt;If you want to know whether judgment and verification are improving, the metrics need to reflect more than raw activity.&lt;/p&gt;

&lt;p&gt;A few measures I would watch are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Predicted versus observed productivity&lt;/strong&gt; - did the time you thought you saved turn into shipped value, or just more in-flight work?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Experiment hit rate&lt;/strong&gt; - how often do prototypes or parallel probes produce something worth hardening?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Escaped defects&lt;/strong&gt; - is faster change creation increasing failure downstream?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Rollback frequency and recovery speed&lt;/strong&gt; - how often do you need to back out changes, and how quickly can you recover?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Review burden&lt;/strong&gt; - are senior engineers spending more time triaging noise than evaluating important risk?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Time reinvested in architecture and platform&lt;/strong&gt; - is AI-created time being spent on higher-leverage work, or just absorbed by more throughput?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Trust signals&lt;/strong&gt; - do teams increasingly let low-risk work flow through automation because the evidence deserves trust?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those metrics tell a more useful story than counting prompts, tokens, or lines of generated code. The goal is not to maximize output. The goal is to improve decision quality, delivery quality, and business outcomes.&lt;/p&gt;
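
&lt;p&gt;Several of these measures reduce to simple ratios once the underlying counts are collected. The sketch below assumes a hypothetical per-period record; the field names and sample numbers are invented for illustration and do not come from any real tool or dataset.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryPeriod:
    """Counts for one reporting period (all field names are hypothetical)."""
    experiments_run: int
    experiments_hardened: int
    changes_shipped: int
    changes_rolled_back: int
    defects_found_in_production: int


def ratio(part: int, whole: int) -> float:
    """Return part / whole, or 0.0 when nothing was counted."""
    return part / whole if whole else 0.0


def summarize(period: DeliveryPeriod) -> dict[str, float]:
    """Compute three of the measures above as plain ratios."""
    return {
        "experiment_hit_rate": ratio(period.experiments_hardened, period.experiments_run),
        "rollback_frequency": ratio(period.changes_rolled_back, period.changes_shipped),
        "escaped_defects_per_change": ratio(
            period.defects_found_in_production, period.changes_shipped
        ),
    }


# Invented sample numbers, used only to show the calculation.
sample = DeliveryPeriod(
    experiments_run=20,
    experiments_hardened=5,
    changes_shipped=80,
    changes_rolled_back=4,
    defects_found_in_production=2,
)
print(summarize(sample))
```

&lt;p&gt;A single period says little on its own. The trend across periods is what shows whether faster change creation is pushing failure downstream.&lt;/p&gt;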

&lt;h2 id=&quot;devops-becomes-more-strategic-not-less&quot;&gt;DevOps Becomes More Strategic, Not Less&lt;/h2&gt;

&lt;p&gt;A lot of the market conversation still implies that AI compresses software delivery into a prompt-and-approve loop. I do not buy that. AI mostly exposes whether a team has real delivery discipline underneath the surface.&lt;/p&gt;

&lt;p&gt;If your flow is weak, more generation creates more queues. If your context is weak, more autonomy creates more guesswork. If your verification is weak, more output creates more false confidence. If your operational discipline is weak, incidents erase the gains quickly.&lt;/p&gt;

&lt;p&gt;That is why I do not think DevOps becomes less important in the agentic era. I think it becomes a strategic differentiator.&lt;/p&gt;

&lt;p&gt;The teams that win will not be the teams with the most agents. They will be the teams with the clearest protocols, the best verification culture, the strongest platform leverage, and the most disciplined judgment about where autonomy belongs.&lt;/p&gt;

&lt;p&gt;Swarms still need stewards.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Across this series, I have argued three things. In &lt;a href=&quot;/from-sprints-to-swarms-part-1-ai-made-code-cheap/&quot;&gt;Part 1&lt;/a&gt;, I argued that AI shifted the bottleneck from coding effort to delivery flow. In &lt;a href=&quot;/from-sprints-to-swarms-part-2-context-is-infrastructure/&quot;&gt;Part 2&lt;/a&gt;, I argued that context and policy became the control plane for safe speed. Part 3 is the consequence of both: when code gets cheaper, judgment becomes the scarce capability.&lt;/p&gt;

&lt;p&gt;Cheap code is useful. Cheap mistakes are not. The future belongs to teams that can generate options quickly, verify them rigorously, and make disciplined decisions about what deserves to survive. That is less a story about replacing developers than about raising the value of the best parts of engineering judgment.&lt;/p&gt;

&lt;p&gt;Happy shipping!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="devops" /><category term="development" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2026/04/indy.png" /><media:content medium="image" url="https://github.com/assets/images/2026/04/indy.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">From Sprints to Swarms, Part 2: Context Is Infrastructure, Policy Is the Runtime</title><link href="https://github.com/from-sprints-to-swarms-part-2-context-is-infrastructure/" rel="alternate" type="text/html" title="From Sprints to Swarms, Part 2: Context Is Infrastructure, Policy Is the Runtime" /><published>2026-04-07T09:00:00+00:00</published><updated>2026-04-07T09:00:00+00:00</updated><id>https://github.com/from-sprints-to-swarms-part-2-context-is-infrastructure</id><content type="html" xml:base="https://github.com/from-sprints-to-swarms-part-2-context-is-infrastructure/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#series&quot; id=&quot;markdown-toc-series&quot;&gt;Series&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-happy-path-needs-more-than-a-prompt&quot; id=&quot;markdown-toc-the-happy-path-needs-more-than-a-prompt&quot;&gt;The Happy Path Needs More Than a Prompt&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#context-is-infrastructure&quot; id=&quot;markdown-toc-context-is-infrastructure&quot;&gt;Context Is Infrastructure&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-context-packet-matters&quot; id=&quot;markdown-toc-the-context-packet-matters&quot;&gt;The Context Packet Matters&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#clean-codebases-are-not-just-a-nice-to-have&quot; id=&quot;markdown-toc-clean-codebases-are-not-just-a-nice-to-have&quot;&gt;Clean Codebases Are Not Just a Nice-to-Have&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#documentation-is-now-an-execution-input&quot; id=&quot;markdown-toc-documentation-is-now-an-execution-input&quot;&gt;Documentation Is Now an Execution Input&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#policy-is-the-runtime&quot; id=&quot;markdown-toc-policy-is-the-runtime&quot;&gt;Policy Is the Runtime&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#review-by-risk-not-ritual&quot; id=&quot;markdown-toc-review-by-risk-not-ritual&quot;&gt;Review by Risk, Not Ritual&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-control-plane-in-github-terms&quot; id=&quot;markdown-toc-the-control-plane-in-github-terms&quot;&gt;The Control Plane in GitHub Terms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#new-operating-rituals&quot; id=&quot;markdown-toc-new-operating-rituals&quot;&gt;New Operating Rituals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;series&quot;&gt;Series&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-1-ai-made-code-cheap/&quot;&gt;Part 1: AI Made Code Cheap But Delivery Hard&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-2-context-is-infrastructure/&quot;&gt;Part 2: Context Is Infrastructure, Policy Is the Runtime&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-3-judgment-gets-more-valuable/&quot;&gt;Part 3: When Code Gets Cheaper, Judgment Gets More Valuable&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In &lt;a href=&quot;/from-sprints-to-swarms-part-1-ai-made-code-cheap/&quot;&gt;Part 1&lt;/a&gt; of this series, I argued that AI made code cheap, not delivery easy. Once the pull request becomes the unit of flow, the constraint shifts into review, testing, merge speed, and all the coordination work teams used to treat as background noise.&lt;/p&gt;

&lt;p&gt;That raises the next question: what does the happy path look like when agents do a meaningful share of the implementation?&lt;/p&gt;

&lt;p&gt;It looks something like this: an agent receives a bounded task, finds the right code and docs, makes a small change, runs the right tests, and passes automated checks. A human then reviews the change in proportion to its risk and ships it quickly.&lt;/p&gt;

&lt;p&gt;That workflow sounds simple. It only works when two conditions are already in place: reliable context and policy that runs near the activity. Together, they form the control plane for agentic delivery. Without them, faster generation just creates faster review debt.&lt;/p&gt;

&lt;h2 id=&quot;the-happy-path-needs-more-than-a-prompt&quot;&gt;The Happy Path Needs More Than a Prompt&lt;/h2&gt;

&lt;p&gt;A lot of current AI adoption still treats prompting as the main control surface. That is too narrow. What matters more is everything upstream that shapes what the agent can see, infer, and safely do, and what happens downstream that shapes how the system responds to what the agent did.&lt;/p&gt;

&lt;p&gt;If you want agents to behave well, you need a codebase and documentation set that are easy to search, easy to trust, and hard to misread. You also need tasks that state intent, scope, constraints, and risk clearly. Otherwise you get the worst kind of acceleration: fast local output followed by slow, expensive review and cleanup.&lt;/p&gt;

&lt;p&gt;The happy path still depends on a few things that should be familiar to any team that has done DevOps well:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;clear task intent&lt;/li&gt;
  &lt;li&gt;reliable documentation and architecture context&lt;/li&gt;
  &lt;li&gt;small batch sizes&lt;/li&gt;
  &lt;li&gt;good tests&lt;/li&gt;
  &lt;li&gt;automated checks, scans, and release protections&lt;/li&gt;
  &lt;li&gt;clean ownership and review paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of that is new. What changes is the penalty for getting it wrong. Weak context used to slow people down. In an agentic system, it can create a larger volume of confident mistakes.&lt;/p&gt;

&lt;h2 id=&quot;context-is-infrastructure&quot;&gt;Context Is Infrastructure&lt;/h2&gt;

&lt;p&gt;Think about what infrastructure as code solved. Before IaC, admins clicked around consoles, produced inconsistent environments, and created brittle handoffs. Infrastructure as code made infrastructure versioned, testable, and reviewable. That was a major step forward for speed and reliability.&lt;/p&gt;

&lt;p&gt;We need the same shift for context. In agentic engineering, context is infrastructure. If it is thin, stale, or hard to find, autonomy rests on guesswork.&lt;/p&gt;

&lt;p&gt;When an engineer starts a task, they do not start from raw code alone. They pull in product intent, domain language, architecture decisions, service boundaries, conventions, and known sharp edges. If your agents are contributing real work, they need the same kind of grounding, delivered in a form they can actually use.&lt;/p&gt;

&lt;p&gt;That context usually lives across several layers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;product intent and acceptance criteria&lt;/li&gt;
  &lt;li&gt;domain language and business rules&lt;/li&gt;
  &lt;li&gt;ADRs and architecture diagrams&lt;/li&gt;
  &lt;li&gt;service boundaries and ownership metadata&lt;/li&gt;
  &lt;li&gt;repository instructions and coding conventions&lt;/li&gt;
  &lt;li&gt;tests that show expected behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Teams can go overboard here. The point is not to dump the entire universe into every task. That just creates noise. The point is to make the &lt;em&gt;right&lt;/em&gt; context easy to find and the critical context &lt;em&gt;impossible to miss&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This is where many teams discover that their documentation problem is really a delivery problem. If architectural patterns are not discernible from the code, notes are stale, ownership is ambiguous, or operational constraints only exist in people’s heads, the agent does what a human would do under the same conditions: it guesses.&lt;/p&gt;

&lt;h2 id=&quot;the-context-packet-matters&quot;&gt;The Context Packet Matters&lt;/h2&gt;

&lt;p&gt;One practical pattern I like is the &lt;em&gt;context packet&lt;/em&gt;. Before work starts, package the task as a small execution contract instead of a vague request.&lt;/p&gt;

&lt;p&gt;A good packet answers questions like:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;what outcome are we trying to achieve&lt;/li&gt;
  &lt;li&gt;what is explicitly in and out of scope&lt;/li&gt;
  &lt;li&gt;what proves the change is done&lt;/li&gt;
  &lt;li&gt;what dependencies or affected systems matter&lt;/li&gt;
  &lt;li&gt;what risks need special attention&lt;/li&gt;
  &lt;li&gt;what roll-forward path exists if this goes badly&lt;/li&gt;
  &lt;li&gt;what evidence should accompany the handoff&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sounds close to good issue writing, and it is. The difference is that the packet needs to be written for execution, not just prioritization. A backlog item can tolerate some ambiguity because a human will usually resolve it on the fly. Agents mostly turn that ambiguity into review work.&lt;/p&gt;
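
&lt;p&gt;A packet like this can be captured as a small structure and checked for gaps before an agent starts work. The following is a minimal sketch: the class, the field names, and the example task are hypothetical, and treating an empty value as unanswered is an assumption.&lt;/p&gt;

```python
from dataclasses import dataclass, field, fields


@dataclass
class ContextPacket:
    """A task written as an execution contract (field names are hypothetical)."""
    outcome: str = ""
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    done_when: list[str] = field(default_factory=list)
    affected_systems: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    roll_forward_path: str = ""
    handoff_evidence: list[str] = field(default_factory=list)


def missing_answers(packet: ContextPacket) -> list[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(packet) if not getattr(packet, f.name)]


# Invented example task: only three of the questions have been answered so far.
packet = ContextPacket(
    outcome="Reject expired API tokens with HTTP 401",
    in_scope=["token validation middleware"],
    done_when=["unit tests cover expired, valid, and malformed tokens"],
)
print(missing_answers(packet))
```

&lt;p&gt;A check like this does not judge the quality of the answers. It only makes missing ones visible before they turn into review work.&lt;/p&gt;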

&lt;p&gt;This is also where Copilot CLI &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/research&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/plan&lt;/code&gt; mode help. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/research&lt;/code&gt; helps gather the relevant code, docs, and constraints before implementation starts. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/plan&lt;/code&gt; forces that context into an explicit approach, which is often the fastest way to expose fuzzy intent before it turns into bad code and noisy review. For more formal requirements and plans, frameworks like &lt;a href=&quot;https://speckit.org/&quot;&gt;SpecKit&lt;/a&gt; can be used.&lt;/p&gt;

&lt;p&gt;This is one of the quieter shifts in the agentic era. Product thinking, architecture thinking, and delivery thinking have to meet earlier. Handoff quality at the start now does a lot to determine how expensive the review and recovery path will be at the end.&lt;/p&gt;

&lt;h2 id=&quot;clean-codebases-are-not-just-a-nice-to-have&quot;&gt;Clean Codebases Are Not Just a Nice-to-Have&lt;/h2&gt;

&lt;p&gt;This is the unglamorous part, but it matters. If your codebase is hard to understand, your agents will make hard-to-review changes. If it has weak tests, the evidence attached to those changes will be weak too. If ownership is fuzzy, review routing will be fuzzy as well.&lt;/p&gt;

&lt;p&gt;A lot of teams get a system to deployable shape, then spend years layering fixes and workarounds instead of improving the structure underneath. That is survivable in a system changing at human speed. It gets much more expensive when you add agents that can produce changes faster than the system can absorb them.&lt;/p&gt;

&lt;p&gt;For some teams, the first serious agentic investment should not be net-new feature work. It should be test expansion, modular refactoring, better ownership metadata, and clearer repository guidance. Those are not side quests. They are how you build the context infrastructure that makes autonomy safer.&lt;/p&gt;

&lt;p&gt;Once that foundation improves, the payoff compounds. Agents can navigate the system more accurately, reviewers can reason about changes faster, and the codebase becomes a source of truth instead of a source of confusion.&lt;/p&gt;

&lt;h2 id=&quot;documentation-is-now-an-execution-input&quot;&gt;Documentation Is Now an Execution Input&lt;/h2&gt;

&lt;p&gt;I think this is the point many teams still underestimate. Documentation gets treated as onboarding material or compliance residue. In an agentic system, it is also an execution input.&lt;/p&gt;

&lt;p&gt;That includes obvious things like API contracts and architecture notes, but it also includes all the little details that experienced teams accumulate over time: naming conventions, test expectations, release rules, migration patterns, and operational caveats. If those things are not discoverable, agents will either miss them or infer them poorly.&lt;/p&gt;

&lt;p&gt;DORA has written about &lt;a href=&quot;https://dora.dev/capabilities/documentation-quality/&quot;&gt;documentation quality&lt;/a&gt; as a capability amplifier, and that framing fits here. Good documentation does not just help people ramp up. It improves change quality because the system becomes easier to understand and safer to modify.&lt;/p&gt;

&lt;p&gt;The reverse is also true. Doc rot is not a cosmetic issue. It is a delivery failure mode. If the docs say one thing and the code or platform behavior says another, review slows down, false confidence rises, and both humans and agents learn not to trust the written system.&lt;/p&gt;

&lt;p&gt;That is why I think mature teams will maintain docs the way they maintain tests: continuously, as part of the normal flow of change.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Teams should leverage agents for this work! The &lt;a href=&quot;https://github.github.com/gh-aw/setup/creating-workflows/#github-web-interface&quot;&gt;Documentation Updater sample&lt;/a&gt; is a practical pattern for keeping docs aligned with code changes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;policy-is-the-runtime&quot;&gt;Policy Is the Runtime&lt;/h2&gt;

&lt;p&gt;Context alone is not enough. Even perfect documentation does not replace guardrails.&lt;/p&gt;

&lt;p&gt;In a human-only workflow, policy often lives too far from the moment of execution. You see it in tribal review norms, security checklists, end-of-release approvals, or an outdated wiki page nobody opens until something goes wrong. That model does not scale when change volume rises.&lt;/p&gt;

&lt;p&gt;Policy has to run where the work happens.&lt;/p&gt;

&lt;p&gt;In practice, that means using &lt;a href=&quot;https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-rulesets/about-rulesets&quot;&gt;repository rulesets&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches#require-status-checks-before-merging&quot;&gt;required checks&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners&quot;&gt;CODEOWNERS&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/code-security/concepts/secret-security/about-secret-scanning&quot;&gt;secret scanning&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/code-security/concepts/code-scanning/about-code-scanning&quot;&gt;code scanning&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/code-security/concepts/supply-chain-security/about-dependency-review&quot;&gt;dependency review&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/actions/reference/workflows-and-actions/deployments-and-environments#deployment-protection-rules&quot;&gt;environment protections&lt;/a&gt;, &lt;a href=&quot;https://docs.github.com/en/enterprise-cloud@latest/organizations/keeping-your-organization-secure/managing-security-settings-for-your-organization/audit-log-events-for-your-organization&quot;&gt;audit logs&lt;/a&gt;, and CI/CD automation that enforces the standards the team claims to care about.&lt;/p&gt;

&lt;p&gt;This is not about distrusting developers or agents. It is about reducing variance. Good policy removes low-value decisions from the critical path and reserves human judgment for the places where judgment actually matters.&lt;/p&gt;

&lt;p&gt;That distinction matters. If every PR gets the same human scrutiny regardless of risk, your reviewers become the bottleneck and the review signal gets diluted. If low-risk work can flow through strong automated controls, humans can spend their time where the blast radius is real.&lt;/p&gt;

&lt;h2 id=&quot;review-by-risk-not-ritual&quot;&gt;Review by Risk, Not Ritual&lt;/h2&gt;

&lt;p&gt;The old habit is ritual: every change gets the same ceremony. Agentic engineering calls for a new habit, calibration: review in proportion to risk.&lt;/p&gt;

&lt;p&gt;A low-risk documentation fix should not wait in the same queue or require the same depth of analysis as an auth change, a payment flow change, or infrastructure that can disrupt production. Agentic delivery makes that distinction more important because the system can generate a lot of small, safe changes alongside a smaller number of high-consequence ones.&lt;/p&gt;

&lt;p&gt;Teams need an explicit risk model:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;every PR gets Copilot Code Review plus baseline checks&lt;/li&gt;
  &lt;li&gt;low-risk work gets fast automated validation and lightweight review, or automatic approval where policy allows&lt;/li&gt;
  &lt;li&gt;medium-risk work gets normal peer review plus targeted integration tests&lt;/li&gt;
  &lt;li&gt;high-risk work gets deeper human review, stronger evidence, and tighter deployment controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is not bureaucracy. It is calibration. You want the amount of human attention to match the potential downside of the change.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Once again, Agentic Workflows can be used - create a workflow that categorizes PRs by risk level and labels or routes them accordingly. For example, a PR that changes infrastructure code might automatically get labeled as “high-risk” and require additional checks or approvals.&lt;/p&gt;
&lt;/blockquote&gt;
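
&lt;p&gt;As a rough illustration of that categorization step, here is a minimal Python sketch that maps the changed paths of a PR to a risk label. The path patterns and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;classify_pr&lt;/code&gt; helper are hypothetical and are not part of Agentic Workflows or any GitHub API. A real workflow would describe this intent in natural language and apply the label through the platform.&lt;/p&gt;

```python
from fnmatch import fnmatch

# Hypothetical path patterns; adjust these to your own repository layout.
HIGH_RISK = ["infra/*", "src/auth/*", "src/payments/*"]
LOW_RISK = ["docs/*", "*.md"]


def classify_pr(changed_files):
    """Return 'high-risk', 'low-risk', or 'medium-risk' for a list of changed paths."""
    # Any single high-risk file escalates the whole PR.
    if any(fnmatch(path, pattern) for path in changed_files for pattern in HIGH_RISK):
        return "high-risk"
    # A PR is low-risk only when every changed file matches a low-risk pattern.
    if changed_files and all(
        any(fnmatch(path, pattern) for pattern in LOW_RISK) for path in changed_files
    ):
        return "low-risk"
    # Everything else gets normal peer review.
    return "medium-risk"


if __name__ == "__main__":
    print(classify_pr(["docs/intro.md"]))
    print(classify_pr(["docs/intro.md", "infra/main.tf"]))
    print(classify_pr(["src/app/util.py"]))
```

&lt;p&gt;The design choice worth noting is that escalation is checked first: one high-risk file is enough to move the whole PR into the deeper review lane, while the low-risk lane requires every file to qualify.&lt;/p&gt;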

&lt;p&gt;This is also where evidence-bearing handoffs matter. A good handoff does not just say, “I changed some files, please review.” It says: here is the intent I believe I implemented, here are the files I changed, here are the tests I ran, here are the risks I still see, and here is what I did not do from the acceptance criteria. That turns review from guesswork into decision-making.&lt;/p&gt;

&lt;h2 id=&quot;the-control-plane-in-github-terms&quot;&gt;The Control Plane in GitHub Terms&lt;/h2&gt;

&lt;p&gt;Teams should think of this as a layered control plane.&lt;/p&gt;

&lt;p&gt;Copilot custom instructions, custom agents, and Skills shape how work gets interpreted and executed. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/research&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/plan&lt;/code&gt; and/or Issue/PR templates provide the context packet. Actions and required checks verify behavior continuously. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CODEOWNERS&lt;/code&gt; and ownership metadata route the work to the right humans. Repository rulesets enforce the non-negotiables. Environment protections and deployment approvals contain higher-risk changes. Audit logs and PR history preserve the evidence trail.&lt;/p&gt;

&lt;p&gt;Each of these features adds some value in isolation, but the real value comes from how they combine. The agent gets better context at the start, the system enforces better policy during execution, and the reviewer receives a smaller, clearer unit of work with better evidence attached.&lt;/p&gt;

&lt;p&gt;The more autonomous execution becomes, the more deliberate the surrounding system has to be. &lt;em&gt;Speed without a control plane is just faster risk accumulation&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id=&quot;new-operating-rituals&quot;&gt;New Operating Rituals&lt;/h2&gt;

&lt;p&gt;Once you accept that context and policy are part of the runtime, a few rituals start to change.&lt;/p&gt;

&lt;p&gt;Standups matter less as status broadcasts and more as exception handling. Review queues need active triage because review latency becomes a first-order constraint. Retrospectives should look at flow, false positives, noisy checks, stale docs, and policy gaps, not just missed estimates. Teams also need a visible backlog for context maintenance, because docs and instructions decay unless someone owns that work.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;In Part 1, I argued that AI changes where the bottlenecks live. Part 2 is the operational consequence of that shift. If agents are going to produce meaningful amounts of code, context stops being background material and policy stops being a late-stage gate. They become infrastructure and runtime.&lt;/p&gt;

&lt;p&gt;Better context improves autonomy. Stronger policy makes speed safer. Teams that build this control plane will scale agentic delivery far more effectively than teams that simply add more agents to a messy system.&lt;/p&gt;

&lt;p&gt;In Part 3, I’ll look at the next implication: when code gets cheaper, judgment becomes the scarce capability.&lt;/p&gt;

&lt;p&gt;Happy shipping!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="devops" /><category term="security" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2026/04/building.png" /><media:content medium="image" url="https://github.com/assets/images/2026/04/building.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">From Sprints to Swarms, Part 1: AI Made Code Cheap But Delivery Hard.</title><link href="https://github.com/from-sprints-to-swarms-part-1-ai-made-code-cheap/" rel="alternate" type="text/html" title="From Sprints to Swarms, Part 1: AI Made Code Cheap But Delivery Hard." /><published>2026-04-03T09:00:00+00:00</published><updated>2026-04-03T09:00:00+00:00</updated><id>https://github.com/from-sprints-to-swarms-part-1-ai-made-code-cheap</id><content type="html" xml:base="https://github.com/from-sprints-to-swarms-part-1-ai-made-code-cheap/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#series&quot; id=&quot;markdown-toc-series&quot;&gt;Series&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#part-1-ai-made-code-cheap-but-delivery-hard&quot; id=&quot;markdown-toc-part-1-ai-made-code-cheap-but-delivery-hard&quot;&gt;Part 1: AI Made Code Cheap But Delivery Hard&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-pull-request-is-the-unit-of-flow&quot; id=&quot;markdown-toc-the-pull-request-is-the-unit-of-flow&quot;&gt;The Pull Request Is the Unit of Flow&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#then-vs-now&quot; id=&quot;markdown-toc-then-vs-now&quot;&gt;Then vs Now&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#new-bottlenecks&quot; id=&quot;markdown-toc-new-bottlenecks&quot;&gt;New Bottlenecks&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#metrics-that-matter&quot; id=&quot;markdown-toc-metrics-that-matter&quot;&gt;Metrics That Matter&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#dual-lane-execution&quot; id=&quot;markdown-toc-dual-lane-execution&quot;&gt;Dual-Lane Execution&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;series&quot;&gt;Series&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-1-ai-made-code-cheap/&quot;&gt;Part 1: AI Made Code Cheap But Delivery Hard&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-2-context-is-infrastructure/&quot;&gt;Part 2: Context Is Infrastructure, Policy Is the Runtime&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/from-sprints-to-swarms-part-3-judgment-gets-more-valuable/&quot;&gt;Part 3: When Code Gets Cheaper, Judgment Gets More Valuable&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We all know that software engineering has been irrevocably changed by LLMs and agents. It is a mistake to think the main change is &lt;em&gt;faster typing&lt;/em&gt;. It isn’t. The bigger change is economic: when code gets cheaper, the delivery system becomes the constraint.&lt;/p&gt;

&lt;p&gt;This series is about what that means for software teams. Part 1 is about flow. Part 2 will cover context and policy. Part 3 will cover judgment and the changing role of the developer.&lt;/p&gt;

&lt;h2 id=&quot;part-1-ai-made-code-cheap-but-delivery-hard&quot;&gt;Part 1: AI Made Code Cheap But Delivery Hard&lt;/h2&gt;

&lt;p&gt;Copilot, Chat, and now agents have expanded the amount of software I can produce. They let me tackle work I would have avoided a few years ago because it was too laborious, too time consuming. But they have also exposed something uncomfortable: faster code generation does not automatically mean faster delivery.&lt;/p&gt;

&lt;p&gt;Before joining GitHub in 2021, I spent more than a decade in DevOps consulting. From that angle, most “agentic engineering” problems look familiar: they are still about process and culture more than about the tools. The labels may have changed, but most of the constraints have not. If code is cheaper, then review quality, test quality, merge flow, context management and architecture discipline matter even more.&lt;/p&gt;

&lt;h2 id=&quot;the-pull-request-is-the-unit-of-flow&quot;&gt;The Pull Request Is the Unit of Flow&lt;/h2&gt;

&lt;p&gt;Agile teams broke work into stories, estimated them with story points, and synchronized through sprints. That made sense when the system was shaped around human constraints: people are single-threaded, context is limited, and coordination is expensive.&lt;/p&gt;

&lt;p&gt;Agents weaken those assumptions. Work can now be decomposed and executed in parallel, often asynchronously. That shifts the real unit of flow from the story to the pull request.&lt;/p&gt;

&lt;p&gt;Teams should measure flow at the PR level: idea-to-PR (how fast can we code it), PR-to-merge (how fast can we validate it), and merge-to-production (how fast can we deliver it).&lt;/p&gt;
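
&lt;p&gt;To make those three intervals concrete, here is a minimal Python sketch that computes them for a single PR. The function name and the four timestamps are assumptions for illustration. Where each timestamp comes from (issue creation, PR open, merge, deployment) depends on your own tooling.&lt;/p&gt;

```python
from datetime import datetime


def flow_stages(idea_at, pr_opened_at, merged_at, deployed_at):
    """Return the three flow intervals, in hours, for a single PR."""

    def hours(start, end):
        return (end - start).total_seconds() / 3600

    return {
        "idea_to_pr": hours(idea_at, pr_opened_at),            # how fast can we code it
        "pr_to_merge": hours(pr_opened_at, merged_at),         # how fast can we validate it
        "merge_to_production": hours(merged_at, deployed_at),  # how fast can we deliver it
    }


if __name__ == "__main__":
    stages = flow_stages(
        datetime(2026, 4, 1, 9, 0),
        datetime(2026, 4, 1, 11, 0),
        datetime(2026, 4, 2, 11, 0),
        datetime(2026, 4, 2, 12, 0),
    )
    print(stages)
```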

&lt;h2 id=&quot;then-vs-now&quot;&gt;Then vs Now&lt;/h2&gt;

&lt;p&gt;Sprints, standups, and estimates are coordination tools. They help humans batch work, surface blockers, and synchronize periodically.&lt;/p&gt;

&lt;p&gt;Agentic delivery looks different. You can have a continuous stream of PRs from multiple agents working in parallel. That changes the role of the human. The highest-value work shifts toward framing problems, resolving ambiguity, setting policy, reviewing risk, and deciding what is worth shipping.&lt;/p&gt;

&lt;p&gt;It also breaks batch-oriented quality practices. QA cannot stay at the end of the sprint. Security cannot be a late gate. Governance cannot depend on tribal knowledge. Quality, policy, and context have to move earlier and run continuously.&lt;/p&gt;

&lt;p&gt;This is where many teams get stuck. They add AI at the edge of the system, but leave the delivery model untouched. The result is predictable: more code entering the pipe, with the same review capacity and the same weak automation.&lt;/p&gt;

&lt;h2 id=&quot;new-bottlenecks&quot;&gt;New Bottlenecks&lt;/h2&gt;

&lt;p&gt;The bottlenecks do not disappear. They move:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Review throughput&lt;/strong&gt; - can review capacity keep pace with PR volume?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Test quality&lt;/strong&gt; - do tests catch regressions without creating noise?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;CI quality&lt;/strong&gt; - is verification fast, trustworthy, and automated?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Merge latency&lt;/strong&gt; - how long do approved changes sit before merge?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Context quality&lt;/strong&gt; - do people and agents have the right information at the right time?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Ownership clarity&lt;/strong&gt; - who decides, reviews, and accepts risk?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;metrics-that-matter&quot;&gt;Metrics That Matter&lt;/h2&gt;

&lt;p&gt;Business outcomes matter more than agent benchmarks. If faster code does not produce better delivery, the optimization is irrelevant. Start with a few operational metrics that expose flow quality:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;PR size&lt;/strong&gt; - larger PRs increase review cost and defect risk&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Review latency&lt;/strong&gt; - how long it takes a PR to get meaningful attention&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Median time-to-merge&lt;/strong&gt; - the clearest signal of delivery friction&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;CI pass rate&lt;/strong&gt; - whether your verification system is doing its job&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Agent-authored PR ratio&lt;/strong&gt; - how much work is entering the system through agents&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Agent-authored PR outcomes&lt;/strong&gt; - whether that work is actually getting accepted and shipped&lt;/li&gt;
&lt;/ul&gt;
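
&lt;p&gt;A few of these metrics are simple to compute once you have PR records in hand. The sketch below is a minimal Python example under stated assumptions: the record shape (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hours_to_merge&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;agent_authored&lt;/code&gt;) and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;summarize&lt;/code&gt; helper are hypothetical, and an unmerged PR is represented by a missing merge time.&lt;/p&gt;

```python
from statistics import median


def summarize(prs):
    """Summarize a list of PR records into a few flow metrics.

    Each record is a dict with 'hours_to_merge' (float, or None when the PR
    has not merged) and 'agent_authored' (bool). This shape is an assumption.
    """
    merged = [pr for pr in prs if pr["hours_to_merge"] is not None]
    agent = [pr for pr in prs if pr["agent_authored"]]
    agent_merged = [pr for pr in agent if pr["hours_to_merge"] is not None]
    return {
        # Median time-to-merge over merged PRs only.
        "median_hours_to_merge": median([pr["hours_to_merge"] for pr in merged]) if merged else None,
        # Share of all PRs that entered the system through agents.
        "agent_authored_ratio": len(agent) / len(prs) if prs else 0.0,
        # Share of agent-authored PRs that were actually merged.
        "agent_merge_rate": len(agent_merged) / len(agent) if agent else 0.0,
    }


if __name__ == "__main__":
    sample = [
        {"hours_to_merge": 2.0, "agent_authored": True},
        {"hours_to_merge": 10.0, "agent_authored": False},
        {"hours_to_merge": None, "agent_authored": True},
        {"hours_to_merge": 4.0, "agent_authored": True},
    ]
    print(summarize(sample))
```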

&lt;h2 id=&quot;dual-lane-execution&quot;&gt;Dual-Lane Execution&lt;/h2&gt;

&lt;p&gt;Not all work should go to agents. A useful model is dual-lane execution: one lane for human judgment, one for agent execution.&lt;/p&gt;

&lt;p&gt;Humans are better at ambiguity, trade-offs, architecture, and accepting risk. Agents are better at bounded fixes, repetitive refactors, test generation, documentation updates, and parallel experiments.&lt;/p&gt;

&lt;p&gt;The point is not rigid separation, but intentional routing. Agents simply cannot do everything, despite the hype. But we can probably give more work than we realize to agents. Good teams decide which lane owns which kind of work, then revisit that split as the agents evolve. Here is a rough sketch of how that might look:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Human Lane&lt;/th&gt;
      &lt;th&gt;Agent Lane&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Ambiguity resolution&lt;/td&gt;
      &lt;td&gt;Bounded fixes&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vision&lt;/td&gt;
      &lt;td&gt;Research, planning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Architecture decisions&lt;/td&gt;
      &lt;td&gt;Test generation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Product trade-offs&lt;/td&gt;
      &lt;td&gt;Refactoring&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Risk acceptance&lt;/td&gt;
      &lt;td&gt;Documentation updates&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Escalation handling&lt;/td&gt;
      &lt;td&gt;Repetitive toil&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Assessing business impact&lt;/td&gt;
      &lt;td&gt;Parallel experiments&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;AI did not remove the need for DevOps. If anything, it increased it. When code gets cheaper, flow discipline, verification, and judgment become more important, not less.&lt;/p&gt;

&lt;p&gt;Treat the PR as the unit of flow. Design for continuous review and verification. Be explicit about which work belongs to humans and which belongs to agents. In Part 2, I’ll look at the next constraint: context and policy as operational infrastructure.&lt;/p&gt;

&lt;p&gt;Happy shipping!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="devops" /><category term="process" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2026/04/carriage.png" /><media:content medium="image" url="https://github.com/assets/images/2026/04/carriage.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Transform Your SDLC with Agentic Workflows</title><link href="https://github.com/transform-sdlc-with-agentic-workflows/" rel="alternate" type="text/html" title="Transform Your SDLC with Agentic Workflows" /><published>2026-02-12T09:00:00+00:00</published><updated>2026-02-12T09:00:00+00:00</updated><id>https://github.com/transform-sdlc-with-agentic-workflows</id><content type="html" xml:base="https://github.com/transform-sdlc-with-agentic-workflows/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#intent-driven-development&quot; id=&quot;markdown-toc-intent-driven-development&quot;&gt;Intent-driven Development&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-are-agentic-workflows&quot; id=&quot;markdown-toc-what-are-agentic-workflows&quot;&gt;What Are Agentic Workflows?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#benefits-of-agentic-workflows&quot; id=&quot;markdown-toc-benefits-of-agentic-workflows&quot;&gt;Benefits of Agentic Workflows&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-can-you-do-with-agentic-workflows&quot; id=&quot;markdown-toc-what-can-you-do-with-agentic-workflows&quot;&gt;What can you do with Agentic Workflows?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#anatomy-of-an-agentic-workflow&quot; id=&quot;markdown-toc-anatomy-of-an-agentic-workflow&quot;&gt;Anatomy of an Agentic Workflow&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#updating-a-project-when-a-dependency-changes-a-real-example&quot; id=&quot;markdown-toc-updating-a-project-when-a-dependency-changes-a-real-example&quot;&gt;Updating a project when a dependency changes: A real example&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#a-new-way-of-thinking-intent-over-implementation&quot; id=&quot;markdown-toc-a-new-way-of-thinking-intent-over-implementation&quot;&gt;A New Way of Thinking: Intent Over Implementation&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#failed-build-autofix&quot; id=&quot;markdown-toc-failed-build-autofix&quot;&gt;Failed Build Autofix&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#patterns-worth-exploring&quot; id=&quot;markdown-toc-patterns-worth-exploring&quot;&gt;Patterns worth exploring&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#security-considerations&quot; id=&quot;markdown-toc-security-considerations&quot;&gt;Security considerations&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#tips-and-gotchas&quot; id=&quot;markdown-toc-tips-and-gotchas&quot;&gt;Tips and Gotchas&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In this post, I’ll show you how &lt;a href=&quot;https://github.blog/ai-and-ml/automate-repository-tasks-with-github-agentic-workflows/&quot;&gt;GitHub Agentic Workflows&lt;/a&gt; fundamentally change the way you should think about automation, and why “Continuous AI” is the next frontier of &lt;a href=&quot;/agentic-software-delivery/&quot;&gt;agentic software delivery&lt;/a&gt;. I’ll cover what Agentic Workflows are and show a real example where a few sentences of intent replace hours of manual SDK tracking, issue creation, and implementation work.&lt;/p&gt;

&lt;h2 id=&quot;intent-driven-development&quot;&gt;Intent-driven Development&lt;/h2&gt;

&lt;p&gt;For years, teams have been scripting automation in declarative languages like YAML. If you needed a weekly automation, you had to learn the cron syntax, chain together actions, write scripts, wire up secrets, parse outputs, handle edge cases, and debug failures. The logic lives in rigid procedural steps, and every new workflow requires you to think like a build engineer.&lt;/p&gt;

&lt;p&gt;Most of the automation teams rely on today is a rigid set of &lt;em&gt;steps&lt;/em&gt;. We’re now at an inflection point where we need to start thinking in terms of &lt;em&gt;outcomes&lt;/em&gt; instead. GitHub Actions is fantastic for repeatable processes that check out, build and test the applications we work on day to day. But how can we add more intelligence into these workflows?&lt;/p&gt;

&lt;p&gt;In a previous post about &lt;a href=&quot;/self-healing-devops-with-copilot-and-actions/&quot;&gt;“self-healing DevOps”&lt;/a&gt;, I showed how to run build failure analysis through GitHub Models inference wrapped in Actions steps. The idea was solid: “Why should I debug failed builds - surely models are better at trawling build logs and diagnosing issues than I am?” - but the implementation was laborious and required writing procedural logic in YAML, even if there was some inference sandwiched in the middle.&lt;/p&gt;

&lt;p&gt;But what if there was an easier way to perform regular, intelligent tasks on a codebase? What if you could describe a goal in natural language, and have that intent transpiled into a GitHub Actions workflow? That’s exactly what Agentic Workflows do. They abstract workflows into natural language.&lt;/p&gt;

&lt;p&gt;When I first saw the demo of Agentic Workflows at Universe 2025 I was intrigued, but unconvinced. The barrier to entry was still fairly steep - there was a GitHub CLI extension that you had to install and you had to author markdown with special frontmatter before you got to a workflow. However, the amazing team at GitHub Next has done a fantastic job iterating on the experience and reducing a lot of friction. And the mind shift of “use agentic workflows to generate agentic workflows” is a game changer - and definitely carries tones of Inception!&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://gh.io/gh-aw/&quot;&gt;Agentic Workflows&lt;/a&gt; is an open source repo in technical preview and you can use it today. The framework is the enabler of some of the ideas I’ve been writing about for a while now like &lt;a href=&quot;/agentic-software-delivery/&quot;&gt;Agentic Software Delivery&lt;/a&gt;, &lt;a href=&quot;/eight-principles-agentic-software-delivery/&quot;&gt;eight principles for ASD&lt;/a&gt; and &lt;a href=&quot;/teaching-async-thinking-with-copilot/&quot;&gt;teaching async thinking&lt;/a&gt;. This is the next chapter that makes all of these “Continuous AI” concepts practical.&lt;/p&gt;

&lt;h2 id=&quot;what-are-agentic-workflows&quot;&gt;What Are Agentic Workflows?&lt;/h2&gt;

&lt;p&gt;Agentic Workflows are not just another way to write GitHub Actions. They represent a fundamental shift in how we think about automation. They empower “intent driven development”. Here is the process:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Use a simple prompt to instruct Copilot Coding Agent (CCA) to bootstrap the agentic workflow prerequisites into your repo. This adds steps to install the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh&lt;/code&gt; CLI and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw&lt;/code&gt; extension, and creates a custom agent that helps Copilot understand how to make new agentic workflows.&lt;/li&gt;
  &lt;li&gt;Use a simple prompt (and the custom agent from the bootstrap) to describe a workflow you want - Copilot writes a markdown file for you.&lt;/li&gt;
  &lt;li&gt;You can edit the markdown file if you want to - but it’s better to &lt;em&gt;use the custom agent to refine the workflow for you&lt;/em&gt;. You should never edit markdown or YAML directly - just talk to your agent in natural language and let it do the work of creating the markdown and YAML for you.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;benefits-of-agentic-workflows&quot;&gt;Benefits of Agentic Workflows&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Natural language intent and iteration&lt;/strong&gt;: Describe what you want to achieve, instead of how to make it happen. The agent figures out the implementation details. This lets you iterate in natural language rather than having to program scripts and workflows by hand.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Scale and familiar framework&lt;/strong&gt;: Agentic Workflows are really just GitHub Actions workflows that invoke the Copilot CLI under the hood, so they run on the same infrastructure, with the same reliability, scale and performance you already have.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Built-in security&lt;/strong&gt;: The framework includes structural guardrails like a sandboxed read-only environment to ensure that your agent can’t do anything you don’t explicitly allow. Safe outputs, network controls, and AI-powered threat detection make it safe to give agents more responsibility.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Enhanced frontmatter&lt;/strong&gt;: The frontmatter that Agentic Workflows introduces is more expressive than Actions metadata, leading to richer specifications, triggers, permissions, and engine options. But you don’t need to know all the details, since you can just tell your agent what you want and it will generate the correct markdown and frontmatter for you. For example, telling the agent to execute a workflow “daily” will lead to a cron expression that randomizes the &lt;em&gt;time&lt;/em&gt; of day so that you spread out your runs and avoid hitting rate limits.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Traceability&lt;/strong&gt;: Each Actions run that contains an Agentic Workflow gets a unique ID - that ID is added to any issues or PRs that the workflow creates, so you can easily trace outputs back to the specific workflow run and its associated markdown instructions. This is extremely helpful for debugging, auditing and searching (the repo semantic search will find markdown, YAML and issues created by the workflow, and you can correlate all of those together via the unique ID).&lt;/li&gt;
&lt;/ul&gt;
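
&lt;p&gt;To illustrate the idea behind spreading out “daily” runs, here is a minimal Python sketch that derives a stable time of day from a workflow name. This is an assumption-laden illustration, not how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw&lt;/code&gt; actually picks its schedule: it swaps true randomization for a deterministic hash so the same workflow always lands on the same slot, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scattered_daily_cron&lt;/code&gt; helper is hypothetical.&lt;/p&gt;

```python
import hashlib


def scattered_daily_cron(workflow_name):
    """Derive a stable 'minute hour * * *' cron expression from a workflow name."""
    # Hash the name so different workflows land on different times of day,
    # while the same workflow keeps the same slot between regenerations.
    digest = hashlib.sha256(workflow_name.encode("utf-8")).digest()
    minute = digest[0] % 60
    hour = digest[1] % 24
    return f"{minute} {hour} * * *"


if __name__ == "__main__":
    for name in ["docs-updater", "dependency-watch", "issue-triage"]:
        print(name, "runs at cron:", scattered_daily_cron(name))
```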

&lt;h2 id=&quot;what-can-you-do-with-agentic-workflows&quot;&gt;What can you do with Agentic Workflows?&lt;/h2&gt;

&lt;p&gt;The truth is that the possibilities are really endless - you’re truly only limited by your imagination. There are a couple of canonical scenarios that are a great fit for Agentic Workflows:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;keeping documentation up to date&lt;/li&gt;
  &lt;li&gt;monitoring dependencies for updates and vulnerabilities&lt;/li&gt;
  &lt;li&gt;triaging issues and PRs&lt;/li&gt;
  &lt;li&gt;generating release notes&lt;/li&gt;
  &lt;li&gt;optimizing code and detecting duplicate code&lt;/li&gt;
  &lt;li&gt;analyzing CI failures and creating remediation issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the sky is the limit. If you can describe it, you can probably build it. The best way to get a sense of the possibilities is to check out the &lt;a href=&quot;https://github.github.com/gh-aw/&quot;&gt;Agentic Workflows gallery&lt;/a&gt; and see the examples that the GitHub team has built - and then start thinking about how you can build your own.&lt;/p&gt;

&lt;h2 id=&quot;anatomy-of-an-agentic-workflow&quot;&gt;Anatomy of an Agentic Workflow&lt;/h2&gt;

&lt;p&gt;At this point, I would normally show you a snippet of a workflow markdown or YAML, but I won’t do that here. You shouldn’t ever have to see or edit these files, since you can just talk to your agent in natural language and let it do the work of creating and refining these files for you. Instead, I’ll show you the actual prompt I used to create a workflow, and then I’ll break down the resulting markdown file so you can understand how the intent maps to the implementation.&lt;/p&gt;

&lt;p&gt;The bootstrap process creates a couple of artifacts in your repo, and then every workflow consists of two files. The bootstrap artifacts are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.github/agents/agentic-workflows.agent.md&lt;/code&gt; - this is the custom agent file that helps Copilot understand how to create and refine agentic workflows. You can tell your agent “Make me a new workflow that does X” and it will use the instructions in this file to generate the correct markdown and YAML for you.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.github/workflows/copilot-setup-steps.yml&lt;/code&gt; - if this does not exist, the agent will create it for you, otherwise the bootstrap will just add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw&lt;/code&gt; CLI setup steps to the existing file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, for each workflow you create via a prompt, you get two files:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.github/workflows/agentic-workflow.md&lt;/code&gt; - this file has frontmatter that defines triggers, permissions, and other metadata for your workflow, and a markdown body that describes the intent in natural language.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.github/workflows/agentic-workflow.lock.yml&lt;/code&gt; - this is the file that gets transpiled by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw&lt;/code&gt; CLI and executed by GitHub Actions. It has all the security hardening, sandboxing, and threat detection baked in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once again: never ever edit the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.lock.yml&lt;/code&gt; file. You can edit the markdown body if you want to refine the intent, but it’s better to talk to your agent and let it update the markdown for you.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: While the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.lock.yml&lt;/code&gt; file is committed to the repo, you should consider it a build artifact rather than source code. When a workflow executes, it fetches the intent from the markdown body of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;workflow.md&lt;/code&gt; file. You only need to update the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.lock.yml&lt;/code&gt; file when you change the frontmatter in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md&lt;/code&gt; file - and if you use Copilot to perform your updates with the custom agent, this is done for you.&lt;/p&gt;
&lt;/blockquote&gt;
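
&lt;p&gt;To make the “recompile only when the frontmatter changes” rule concrete, here is a minimal Python sketch. It is not part of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw&lt;/code&gt;: the helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;split_frontmatter&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needs_recompile&lt;/code&gt; are hypothetical, and it assumes the frontmatter is delimited by two &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;---&lt;/code&gt; lines at the top of the file.&lt;/p&gt;

```python
def split_frontmatter(text):
    """Split a workflow markdown file into (frontmatter, body).

    Assumes the file starts with a '---' line and that the next '---' line
    closes the frontmatter. Returns ("", text) if no frontmatter is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    return "", text


def needs_recompile(old_text, new_text):
    """Return True when the frontmatter differs between two versions."""
    old_frontmatter, _ = split_frontmatter(old_text)
    new_frontmatter, _ = split_frontmatter(new_text)
    return old_frontmatter.strip() != new_frontmatter.strip()


# Illustrative workflow text, not a real file from the repo.
before = "---\non:\n  schedule: weekly on wednesday\n---\n# Weekly Suggestions\nSuggest 3 ideas.\n"
body_edit = before.replace("Suggest 3 ideas.", "Suggest 5 ideas.")
trigger_edit = before.replace("wednesday", "friday")

print(needs_recompile(before, body_edit))     # body-only change
print(needs_recompile(before, trigger_edit))  # frontmatter change
```

&lt;p&gt;A body-only edit leaves the frontmatter untouched, so the check reports that no recompile is needed; changing the schedule does trigger it.&lt;/p&gt;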

&lt;h2 id=&quot;updating-a-project-when-a-dependency-changes-a-real-example&quot;&gt;Updating a project when a dependency changes: A real example&lt;/h2&gt;

&lt;p&gt;I was recently playing with the Copilot SDK and wanted to create a simple TUI (text-based UI) to experiment with the SDK and show off some of its capabilities. I built &lt;a href=&quot;https://github.com/colindembovsky/planeteer&quot;&gt;Planeteer&lt;/a&gt; as an experiment in work breakdown and orchestration using Copilot. But even as I was building Planeteer, I realized that the SDK was changing rapidly - new features, API changes, and improvements were landing on a daily basis. I wanted to stay up to date with those changes and incorporate them into Planeteer, but it was a lot of manual effort to track the SDK repo for updates, read changelogs, analyze relevance, create issues and implement changes. This is a perfect scenario for an Agentic Workflow.&lt;/p&gt;

&lt;p&gt;I headed to the Planeteer repo and clicked on the “Agents” tab. I bootstrapped Agentic Workflows by typing in this prompt (copied from the &lt;a href=&quot;https://github.github.com/gh-aw/setup/creating-workflows/#creating-agentic-workflows-using-a-coding-agent&quot;&gt;Agentic Workflows getting started docs&lt;/a&gt;):&lt;/p&gt;

&lt;div class=&quot;language-markdown highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
Initialize this repository for GitHub Agentic Workflows using https://raw.githubusercontent.com/github/gh-aw/main/create.md

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Copilot Coding Agent (CCA) got to work and created the necessary files to set up the framework in my repo and submitted a PR with the changes. I merged that PR, and then I was ready to create my first workflow.&lt;/p&gt;

&lt;p&gt;I also added two secrets (tokens) to the repo since these are required for the workflows I had in mind: one for Copilot inference (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GH_COPILOT_TOKEN&lt;/code&gt;) and one for assigning Copilot to issues and PRs (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GH_AW_AGENT_TOKEN&lt;/code&gt;). The &lt;a href=&quot;https://github.github.com/gh-aw/reference/auth/&quot;&gt;auth page&lt;/a&gt; has detailed instructions on how to create these tokens and what permissions they need.&lt;/p&gt;

&lt;p&gt;Now I was ready to create a workflow. I went back to the “Agents” tab, ensured that I was using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;agentic-workflows&lt;/code&gt; custom agent and typed in this prompt:&lt;/p&gt;

&lt;div class=&quot;language-markdown highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
Check the release notes and recent commits in &lt;span class=&quot;sb&quot;&gt;`github/copilot-sdk`&lt;/span&gt;.
Identify new features or enhancements from the last 7 days.
Suggest 3 ways to use these updates to improve my app.
For each suggestion, create a GitHub Issue with implementation details and assign it to Copilot for implementation.
Run this workflow once a week on Wednesdays.

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p class=&quot;figcaption&quot;&gt;&lt;img src=&quot;/assets/images/2026/02/aw/custom-agent-screenshot.png&quot; alt=&quot;Selecting the agentic-workflows custom agent in the Agents tab&quot; class=&quot;center-image&quot; /&gt;
Selecting the agentic-workflows custom agent to create a new workflow.&lt;/p&gt;

&lt;p&gt;CCA got to work and in a few minutes I had a PR with the new workflow. I checked through the markdown quickly, and it looked good - the frontmatter had the correct triggers and permissions, and the body had a clear description of the intent. I merged the PR, and now every Wednesday, this workflow runs, checks the SDK for updates, creates issues with enhancement suggestions, and assigns them to Copilot for implementation. I review the PRs that Copilot creates when I’m ready, provide feedback, and merge what makes sense.&lt;/p&gt;

&lt;p&gt;Let’s take a quick look at the markdown file - but remember, you don’t need to mess around with the markdown or YAML - just talk to your agent in natural language and let it do the work for you!&lt;/p&gt;

&lt;div class=&quot;language-markdown highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;

---&lt;/span&gt;
description: Weekly analysis of Copilot SDK releases and commits to suggest project enhancements
on:
  schedule: weekly on wednesday
permissions:
  contents: read
  issues: read
  pull-requests: read
tools:
  github:
    toolsets: [default]
safe-outputs:
  create-issue:
    title-prefix: &quot;[enhancement] &quot;
    labels: [enhancement, ai-suggestion]
    assignees: [copilot]
    max: 3
  assign-to-agent:
    name: copilot
&lt;span class=&quot;gh&quot;&gt;    max: 3
---
&lt;/span&gt;
&lt;span class=&quot;gh&quot;&gt;# Weekly Enhancement Suggestions&lt;/span&gt;

You are an AI agent that monitors the GitHub Copilot SDK (&lt;span class=&quot;sb&quot;&gt;`github/copilot-sdk`&lt;/span&gt;) for new releases, features, and changes, then suggests how they can be leveraged in the Planeteer project.

&lt;span class=&quot;gu&quot;&gt;## Context&lt;/span&gt;

This repository contains &lt;span class=&quot;gs&quot;&gt;**Planeteer**&lt;/span&gt;, an AI-powered work breakdown and parallel execution TUI built with Ink (React for terminals) and TypeScript. Planeteer depends on &lt;span class=&quot;gs&quot;&gt;**`@github/copilot-sdk`**&lt;/span&gt; for AI-powered project planning and execution. All Copilot SDK interactions are isolated in &lt;span class=&quot;sb&quot;&gt;`src/services/copilot.ts`&lt;/span&gt;.

&lt;span class=&quot;gu&quot;&gt;## Your Task&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
1.&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Gather recent activity**&lt;/span&gt; from the last 7 days in the &lt;span class=&quot;gs&quot;&gt;**`github/copilot-sdk`**&lt;/span&gt; repository (https://github.com/github/copilot-sdk):
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; List recent releases and release notes
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; List recent commits to the &lt;span class=&quot;sb&quot;&gt;`main`&lt;/span&gt; branch from the past week
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Review any notable changes, new features, bug fixes, API updates, or deprecations
&lt;span class=&quot;p&quot;&gt;
2.&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Review this project**&lt;/span&gt; (&lt;span class=&quot;sb&quot;&gt;`${{ github.repository }}`&lt;/span&gt;) to understand how the Copilot SDK is currently used:
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Read &lt;span class=&quot;sb&quot;&gt;`src/services/copilot.ts`&lt;/span&gt; to understand the current SDK integration points
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Check &lt;span class=&quot;sb&quot;&gt;`package.json`&lt;/span&gt; for the current SDK version
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Understand the project architecture to identify where SDK updates could have impact
&lt;span class=&quot;p&quot;&gt;
3.&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Analyze the Copilot SDK changes**&lt;/span&gt; and identify opportunities for this project:
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Determine which new SDK features or API changes could benefit Planeteer
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Consider new capabilities that could improve the clarification, breakdown, refinement, or execution flows
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Identify any deprecations or breaking changes that require attention
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Think about how new SDK features could unlock better UX, performance, or reliability
&lt;span class=&quot;p&quot;&gt;
4.&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Create exactly 3 enhancement suggestions**&lt;/span&gt; as GitHub issues in this repo. Each issue should:
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Have a clear, descriptive title summarizing the enhancement
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Include a detailed body with:
&lt;span class=&quot;p&quot;&gt;     -&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Background**&lt;/span&gt;: What recent Copilot SDK commit(s) or release(s) inspired this suggestion, with links to the relevant changes in &lt;span class=&quot;sb&quot;&gt;`github/copilot-sdk`&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;     -&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Proposal**&lt;/span&gt;: A clear description of how to leverage this SDK update in Planeteer
&lt;span class=&quot;p&quot;&gt;     -&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Benefit**&lt;/span&gt;: Why this enhancement would improve the project
&lt;span class=&quot;p&quot;&gt;     -&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Acceptance Criteria**&lt;/span&gt;: Specific, measurable criteria for completion
&lt;span class=&quot;p&quot;&gt;   -&lt;/span&gt; Be actionable and scoped appropriately for a single task
&lt;span class=&quot;p&quot;&gt;
5.&lt;/span&gt; &lt;span class=&quot;gs&quot;&gt;**Assign each issue to Copilot**&lt;/span&gt; for implementation using the &lt;span class=&quot;sb&quot;&gt;`assign-to-agent`&lt;/span&gt; safe output.

&lt;span class=&quot;gu&quot;&gt;## Guidelines&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
-&lt;/span&gt; Focus on practical, high-value enhancements that take advantage of new Copilot SDK capabilities.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; Each suggestion should be independent and self-contained.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; Ensure suggestions are diverse — cover different aspects of the project (e.g., one for new SDK features, one for performance or reliability improvements, one for UX enhancements enabled by SDK updates).
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; If there are no releases or commits in the Copilot SDK repo in the last week, base your suggestions on the current SDK capabilities that Planeteer is not yet using.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; When referencing recent activity, attribute changes to the humans who authored them, not to bots or automation tools.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; Use GitHub-flavored markdown for issue bodies.

&lt;span class=&quot;gu&quot;&gt;## Safe Outputs&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
-&lt;/span&gt; Use &lt;span class=&quot;sb&quot;&gt;`create-issue`&lt;/span&gt; to create each of the 3 enhancement issues.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; Use &lt;span class=&quot;sb&quot;&gt;`assign-to-agent`&lt;/span&gt; to assign Copilot to each created issue.
&lt;span class=&quot;p&quot;&gt;-&lt;/span&gt; If for any reason you cannot identify meaningful enhancements, use the &lt;span class=&quot;sb&quot;&gt;`noop`&lt;/span&gt; safe output with a message explaining why.&lt;span class=&quot;sb&quot;&gt;


&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see how the frontmatter defines the triggers, permissions, tools and safe outputs, while the body describes the intent in natural language.&lt;/p&gt;

&lt;p&gt;Here’s an &lt;a href=&quot;https://github.com/colindembovsky/planeteer/issues/9&quot;&gt;example&lt;/a&gt; of one of the issues that gets created by the workflow. And since the workflow assigns the issue to Copilot, you can see the PR that Copilot creates to implement the issue as well: &lt;a href=&quot;https://github.com/colindembovsky/planeteer/pull/12&quot;&gt;example PR&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If I click “Edit” on the Issue body, I also see the following hidden HTML metadata:&lt;/p&gt;

&lt;div class=&quot;language-html highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;

&lt;span class=&quot;c&quot;&gt;&amp;lt;!-- gh-aw-agentic-workflow: Weekly Enhancement Suggestions, engine: copilot, run: https://github.com/colindembovsky/planeteer/actions/runs/21970499900 --&amp;gt;&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;&amp;lt;!-- gh-aw-workflow-id: weekly-enhancement-suggestions --&amp;gt;&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Searching &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;weekly-enhancement-suggestions&lt;/code&gt; in the repo yields code references (to the workflow files) as well as any issues created by the workflow, and you can easily correlate these together to trace outputs back to the specific workflow run and its associated markdown instructions.&lt;/p&gt;
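
&lt;p&gt;If you want to do this correlation in a script, a rough Python sketch might look like the following. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;extract_trace&lt;/code&gt; is hypothetical, and the patterns assume the metadata keeps the shape shown above (a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh-aw-workflow-id:&lt;/code&gt; marker and a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run:&lt;/code&gt; URL), which may change as the framework evolves. The sample body omits the HTML comment delimiters for brevity.&lt;/p&gt;

```python
import re

# Patterns based on the metadata shown in the example issue; the exact
# format is an assumption and is not guaranteed to stay stable.
WORKFLOW_ID = re.compile(r"gh-aw-workflow-id:\s*([\w-]+)")
RUN_URL = re.compile(r"run:\s*(https://\S+)")


def extract_trace(issue_body):
    """Return the workflow id and run URL found in an issue body, if any."""
    id_match = WORKFLOW_ID.search(issue_body)
    run_match = RUN_URL.search(issue_body)
    return {
        "workflow_id": id_match.group(1) if id_match else None,
        "run_url": run_match.group(1) if run_match else None,
    }


sample = "\n".join([
    "## Background",
    "gh-aw-agentic-workflow: Weekly Enhancement Suggestions, engine: copilot, "
    "run: https://github.com/colindembovsky/planeteer/actions/runs/21970499900",
    "gh-aw-workflow-id: weekly-enhancement-suggestions",
])

trace = extract_trace(sample)
print(trace["workflow_id"])
print(trace["run_url"])
```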

&lt;h2 id=&quot;a-new-way-of-thinking-intent-over-implementation&quot;&gt;A New Way of Thinking: Intent Over Implementation&lt;/h2&gt;

&lt;p&gt;This is the mindset shift that matters most. For over a decade, “automation” has meant “write the steps.” Need to check an API? Write a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;curl&lt;/code&gt; command, parse the JSON, handle errors, format the output. Need to create an issue? Construct the body string, call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh issue create&lt;/code&gt;, capture the URL.&lt;/p&gt;

&lt;p&gt;Agentic Workflows ask a different question: &lt;strong&gt;what do you want to happen?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You don’t script the API call. You don’t build the JSON parser. You don’t format the issue body. You describe the outcome, and the agent handles the implementation. The framework provides the guardrails that make this safe: read-only execution by default, writes only through scoped “safe output” jobs, network egress controls via a firewall, and AI-powered threat detection that scans all outputs before they’re externalized.&lt;/p&gt;

&lt;p&gt;This isn’t “vibe coding your CI.” The structure is there. The security is there. The deterministic parts of your pipeline stay deterministic. But the intelligent, context-dependent tasks that you’ve been putting off (or doing manually) can now be expressed as intent.&lt;/p&gt;

&lt;p&gt;Think of it this way: you wouldn’t write YAML to tell a teammate how to review a PR. You’d say “check if the tests pass, flag any security concerns, and make sure the docs are updated.” Agentic Workflows let you talk to your automation the same way.&lt;/p&gt;

&lt;h3 id=&quot;failed-build-autofix&quot;&gt;Failed Build Autofix&lt;/h3&gt;

&lt;p&gt;In a past post I outlined the “manual way” to create &lt;a href=&quot;/self-healing-devops-with-copilot-and-actions/&quot;&gt;“self-healing DevOps”&lt;/a&gt;. This can now be replaced by a prompt like this to CCA using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;agentic-workflows&lt;/code&gt; custom agent:&lt;/p&gt;

&lt;div class=&quot;language-markdown highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
Create a workflow that runs on every failed build. The workflow should analyze the build logs, identify the root cause of the failure, and if it&apos;s a common issue with a known fix, automatically create a PR with the fix and assign it to Copilot for implementation.

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Much easier!&lt;/p&gt;

&lt;h3 id=&quot;patterns-worth-exploring&quot;&gt;Patterns worth exploring&lt;/h3&gt;

&lt;p&gt;The Planeteer SDK monitor is just one pattern. The &lt;a href=&quot;https://github.github.com/gh-aw/&quot;&gt;Agentic Workflows gallery&lt;/a&gt; showcases several others:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;DailyOps&lt;/strong&gt;: Generate a daily repo status report - open PRs, stale issues, CI health, contributor activity&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;IssueOps&lt;/strong&gt;: Auto-triage incoming issues, add labels, request clarification, and route to the right team&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;ChatOps&lt;/strong&gt;: Respond to PR comments with agent-driven code analysis, suggestions, or documentation&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Continuous Documentation&lt;/strong&gt;: Keep READMEs, API docs, and architecture diagrams in sync with code changes&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Failure Analysis&lt;/strong&gt;: Analyze CI failures and create remediation issues (the structured evolution of my self-healing DevOps approach)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Multi-Repo Orchestration&lt;/strong&gt;: Coordinate changes across multiple repositories when a shared dependency updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The common thread is that these are all tasks where you know &lt;em&gt;what&lt;/em&gt; you want but the &lt;em&gt;how&lt;/em&gt; requires judgment and context. That’s exactly the sweet spot for Agentic Workflows.&lt;/p&gt;

&lt;h2 id=&quot;security-considerations&quot;&gt;Security considerations&lt;/h2&gt;

&lt;p&gt;Giving an AI agent write access to a repo sounds terrifying. But Agentic Workflows are designed with security in mind. Here’s why the security model matters:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Read-only by default&lt;/strong&gt;: The agent runs in a sandboxed container with read-only access. It can browse code, read issues, and fetch data, but it can’t write anything during execution.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Safe Outputs&lt;/strong&gt;: Writes happen in &lt;em&gt;separate&lt;/em&gt; jobs with explicitly scoped permissions. If your workflow only declares &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;issues&lt;/code&gt; as a safe output, the agent can’t push code or merge PRs, even if you accidentally instruct it to.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Agent Workflow Firewall&lt;/strong&gt;: The agent container uses iptables-based network egress controls via a Squid proxy. You can allowlist specific domains and block everything else.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Threat Detection&lt;/strong&gt;: Before any safe output is externalized, an AI-powered pipeline scans for secret leaks, malicious patches, injection attempts, and policy violations.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Content Sanitization&lt;/strong&gt;: Inputs are scrubbed of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@mentions&lt;/code&gt;, bot triggers, HTML/XML tags, and untrusted URIs to prevent injection attacks.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When creating or updating workflows, the compilation step (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh aw compile&lt;/code&gt;) bakes all of this into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.lock.yml&lt;/code&gt; file. You can audit it, review it, and version-control it just like any other Actions workflow.&lt;/p&gt;

&lt;h2 id=&quot;tips-and-gotchas&quot;&gt;Tips and Gotchas&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Start small&lt;/strong&gt;: Try one workflow in one repo. A daily status report or issue triage is a great first candidate. Get comfortable with the model before scaling up.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Iterate on the body, not the frontmatter&lt;/strong&gt;: Edits to the markdown instructions take effect on the next run without recompilation. Frontmatter changes (triggers, permissions, engine) require &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh aw compile&lt;/code&gt;. Even better - don’t edit the markdown at all. Just instruct CCA (using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;agentic-workflows&lt;/code&gt; custom agent) to update the workflow for you.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Cost awareness&lt;/strong&gt;: The Copilot engine uses 1-2 premium requests per run. Track usage with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh aw logs&lt;/code&gt;. If you’re running daily across many repos, the costs add up.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;workflow_dispatch&lt;/code&gt; for testing&lt;/strong&gt;: By default, agentic workflows include a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;workflow_dispatch&lt;/code&gt; trigger so that you can manually trigger workflows without waiting for the schedule.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Be specific in your instructions&lt;/strong&gt;: Agents perform better with clear, structured prompts. List steps, define priorities, and handle edge cases explicitly. Think of the markdown body as a detailed brief for a capable but literal colleague. There’s no limit to the number of workflows, so you can even break complex processes into multiple smaller workflows that call each other via issue creation or repository dispatch.&lt;/li&gt;
&lt;/ul&gt;
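
&lt;p&gt;To get a feel for how the costs add up, here is a back-of-the-envelope Python sketch. The fleet size (25 repos, one daily run each) and the 30-day month are made-up assumptions for illustration; only the 1-2 premium requests per run figure comes from the tip above, and the function name is hypothetical.&lt;/p&gt;

```python
def premium_requests_per_month(repos, runs_per_day, requests_per_run, days=30):
    """Estimate monthly premium requests for scheduled agentic workflows."""
    return repos * runs_per_day * requests_per_run * days


# Hypothetical fleet: 25 repos, each running one workflow once a day.
low = premium_requests_per_month(repos=25, runs_per_day=1, requests_per_run=1)
high = premium_requests_per_month(repos=25, runs_per_day=1, requests_per_run=2)
print(f"Estimated premium requests per month: {low} to {high}")
```

&lt;p&gt;With these assumptions the estimate lands between 750 and 1500 premium requests per month, which is why a weekly schedule is often a sensible starting point.&lt;/p&gt;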

&lt;blockquote&gt;
  &lt;p&gt;Note: Agentic Workflows are actively evolving. Expect changes to the CLI, engine options, and security features. Pin to specific versions and monitor the &lt;a href=&quot;https://github.github.com/gh-aw/&quot;&gt;documentation&lt;/a&gt; for updates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Agentic Workflows represent a fundamental shift in how we think about automation. Instead of scripting steps in YAML, you declare intent. Instead of building one-off integrations, you describe outcomes and let AI agents handle execution. The combination of Agentic Workflows, GitHub Actions, and Copilot creates a natural language continuous AI loop where your codebase improves itself - with you in the decision seat, not the execution treadmill.&lt;/p&gt;

&lt;p&gt;The Planeteer example shows what this looks like in practice: a few sentences of intent replace hours of manual SDK tracking, issue creation, and implementation work. Every Wednesday, the loop runs. Issues appear. Copilot submits PRs. I review and merge. The app evolves.&lt;/p&gt;

&lt;p&gt;This is one example of why the GitHub platform is so powerful. We provide the building blocks - Actions for automation, Copilot for intelligence, and now Agentic Workflows for intent-driven development. We’re just scratching the surface of what’s possible when you combine these capabilities.&lt;/p&gt;

&lt;p&gt;Happy automating!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="actions" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2026/02/aw/agentic-workflows.png" /><media:content medium="image" url="https://github.com/assets/images/2026/02/aw/agentic-workflows.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Extract and visualize PR counts using the Copilot Enterprise Metrics API</title><link href="https://github.com/visualize-copilot-pr-metrics-with-github-app/" rel="alternate" type="text/html" title="Extract and visualize PR counts using the Copilot Enterprise Metrics API" /><published>2026-02-02T09:00:00+00:00</published><updated>2026-02-02T09:00:00+00:00</updated><id>https://github.com/visualize-copilot-pr-metrics-with-github-app</id><content type="html" xml:base="https://github.com/visualize-copilot-pr-metrics-with-github-app/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#prerequisites&quot; id=&quot;markdown-toc-prerequisites&quot;&gt;Prerequisites&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#step-1--create-the-github-app-and-capture-ids&quot; id=&quot;markdown-toc-step-1--create-the-github-app-and-capture-ids&quot;&gt;Step 1 — Create the GitHub App and capture IDs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#step-2--clone-the-repo-and-install-dependencies&quot; id=&quot;markdown-toc-step-2--clone-the-repo-and-install-dependencies&quot;&gt;Step 2 — Clone the repo and install dependencies&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#step-3--run-the-script&quot; id=&quot;markdown-toc-step-3--run-the-script&quot;&gt;Step 3 — Run the script&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#step-4--review-the-outputs&quot; id=&quot;markdown-toc-step-4--review-the-outputs&quot;&gt;Step 4 — Review the outputs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#alternatives-and-tips&quot; id=&quot;markdown-toc-alternatives-and-tips&quot;&gt;Alternatives and tips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In this post, I’ll show you how to download and visualize Copilot PR counts (for Coding Agent and Code Review Agent) from your GitHub Enterprise metrics using a Python script. You’ll end up with the JSON usage report plus a PR summary chart that visualizes PRs created by humans and Copilot Coding Agent (CCA), and the number of code reviews from humans and Copilot Code Review (CCR). This will help you keep tabs on how much usage you’re getting from CCR and CCA.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This visualization isn’t very sophisticated given the limits of the PR metrics available. It would be great to see lead times for PR merges (time from open to merge) and compare PRs with CCR reviews to those without - or even to split out by org or team! But this visualization at least gives you an idea, at a coarse Enterprise level, of how many PRs are created by CCA and how many PRs are reviewed by CCR.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The code for this post is in &lt;a href=&quot;https://github.com/colindembovsky/copilot-pr-metrics&quot;&gt;this repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h2&gt;

&lt;p&gt;You’ll need:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Enterprise owner access in GitHub.&lt;/li&gt;
  &lt;li&gt;A GitHub App installed in your enterprise (instructions in the sample repo).&lt;/li&gt;
  &lt;li&gt;Python 3 and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pip&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Assumptions:&lt;/strong&gt; You have permission to create and install GitHub Apps in your GitHub enterprise. If you do not, ask your enterprise admin to help. The “app” is more like a service account - there’s no code at all!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;step-1--create-the-github-app-and-capture-ids&quot;&gt;Step 1 — Create the GitHub App and capture IDs&lt;/h2&gt;

&lt;p&gt;Create a GitHub App at the enterprise level, capture the IDs and download the private key &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pem&lt;/code&gt; file. The full instructions are detailed in the &lt;a href=&quot;https://github.com/colindembovsky/copilot-pr-metrics&quot;&gt;README file&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;step-2--clone-the-repo-and-install-dependencies&quot;&gt;Step 2 — Clone the repo and install dependencies&lt;/h2&gt;

&lt;p&gt;Use the following commands to clone the repo, create and activate a Python virtual environment, and install the dependencies.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
git clone https://github.com/colindembovsky/copilot-pr-metrics.git
&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;copilot-pr-metrics
python3 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; venv .venv
&lt;span class=&quot;nb&quot;&gt;source&lt;/span&gt; .venv/bin/activate
pip &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; requirements.txt

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;step-3--run-the-script&quot;&gt;Step 3 — Run the script&lt;/h2&gt;

&lt;p&gt;Run the script with your app details to fetch the latest 28-day enterprise usage report and generate PR metrics. You can enter the values as args or create a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test.env&lt;/code&gt; file in the repo root like this:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
&lt;span class=&quot;nv&quot;&gt;APP_ID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;123456
&lt;span class=&quot;nv&quot;&gt;INSTALLATION_ID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;987654321
&lt;span class=&quot;nv&quot;&gt;PRIVATE_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;path/to/app-key.pem
&lt;span class=&quot;nv&quot;&gt;ENTERPRISE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;your-enterprise-slug
&lt;span class=&quot;nv&quot;&gt;API_BASE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;https://api.github.com
&lt;span class=&quot;nv&quot;&gt;OUTPUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;metrics-YYYY-MM-DD.json

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
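
&lt;p&gt;For illustration, here is a minimal Python sketch of how a file like this could be read and sanity-checked before a run. This is not the code from the repo: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parse_env_text&lt;/code&gt; is a hypothetical helper, and treating the four values that match the command-line flags as required is my assumption.&lt;/p&gt;

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Assumed to be required because they mirror the script's flags.
REQUIRED = ("APP_ID", "INSTALLATION_ID", "PRIVATE_KEY", "ENTERPRISE")

sample = """
APP_ID=123456
INSTALLATION_ID=987654321
PRIVATE_KEY=path/to/app-key.pem
ENTERPRISE=your-enterprise-slug
API_BASE=https://api.github.com
"""

config = parse_env_text(sample)
missing = [name for name in REQUIRED if not config.get(name)]
print("missing:", missing)
```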

&lt;p&gt;Run the Python script to download all the usage data and generate the chart of PR counts.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
python copilot_metrics.py &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--app-id&lt;/span&gt; &amp;lt;GITHUB_APP_ID&amp;gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--private-key&lt;/span&gt; &amp;lt;PATH_TO_PRIVATE_KEY_PEM&amp;gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--installation-id&lt;/span&gt; &amp;lt;APP_INSTALLATION_ID&amp;gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--enterprise&lt;/span&gt; &amp;lt;ENTERPRISE_SLUG&amp;gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;step-4--review-the-outputs&quot;&gt;Step 4 — Review the outputs&lt;/h2&gt;

&lt;p&gt;The script downloads the enterprise Copilot usage report (last 28 days) and produces a PR summary chart showing human PRs and Copilot PR activity for Copilot Coding Agent and Copilot Code Review. For an example chart, see the repository’s &lt;a href=&quot;https://github.com/colindembovsky/copilot-pr-metrics/blob/main/sample-chart.png&quot;&gt;sample-chart.png&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;alternatives-and-tips&quot;&gt;Alternatives and tips&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;For automation, schedule the script in a CI workflow or a cron job and version the JSON output in a metrics repository or create an Issue with the chart image.&lt;/li&gt;
  &lt;li&gt;Use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--output&lt;/code&gt; flag or the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OUTPUT&lt;/code&gt; variable to keep a dated archive of metrics files.&lt;/li&gt;
&lt;/ul&gt;
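&lt;p&gt;As a minimal sketch of the dated-archive tip, the helper below builds a date-stamped file path that a scheduled job could pass to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--output&lt;/code&gt; flag. The directory name, file prefix and function name are hypothetical, and the value format the flag expects is an assumption, so check the repository’s README before relying on it.&lt;/p&gt;

```python
from datetime import date
from pathlib import Path


def dated_output_path(archive_dir, run_date, prefix="copilot-metrics"):
    """Return a date-stamped JSON path inside archive_dir.

    archive_dir and prefix are hypothetical names chosen for illustration.
    """
    return Path(archive_dir) / f"{prefix}-{run_date.isoformat()}.json"


if __name__ == "__main__":
    # Print today's archive path so a cron job or CI step can pass it on.
    print(dated_output_path("metrics", date.today()))
```

&lt;p&gt;Using ISO dates in the file name keeps the archive sorted chronologically when listed alphabetically.&lt;/p&gt;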

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;You now have a repeatable way to download Copilot usage data and visualize PR activity using a GitHub App. From here, you can automate collection and build trend reporting.&lt;/p&gt;

&lt;p&gt;Happy measuring!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="github" /><category term="ai" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2026/02/copilot-pr-metrics.jpg" /><media:content medium="image" url="https://github.com/assets/images/2026/02/copilot-pr-metrics.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Agentic Future of Software Delivery: My AWS re:Invent 2025 Session</title><link href="https://github.com/agentic-future-aws-reinvent/" rel="alternate" type="text/html" title="The Agentic Future of Software Delivery: My AWS re:Invent 2025 Session" /><published>2025-12-09T10:00:00+00:00</published><updated>2025-12-09T10:00:00+00:00</updated><id>https://github.com/agentic-future-aws-reinvent</id><content type="html" xml:base="https://github.com/agentic-future-aws-reinvent/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#watch-the-session&quot; id=&quot;markdown-toc-watch-the-session&quot;&gt;Watch the Session&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#session-overview&quot; id=&quot;markdown-toc-session-overview&quot;&gt;Session Overview&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#whats-next&quot; id=&quot;markdown-toc-whats-next&quot;&gt;What’s Next?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I recently had the privilege of presenting at AWS re:Invent 2025 in Las Vegas. In this session, I traced the history of developer productivity tools from punch cards through IDEs and into the era of AI pair programming and autonomous agents. If you missed it, you can watch the full recording below.&lt;/p&gt;

&lt;h2 id=&quot;watch-the-session&quot;&gt;Watch the Session&lt;/h2&gt;

&lt;div style=&quot;position:relative;padding-bottom:56.25%;height:0;overflow:hidden;max-width:100%;&quot;&gt;
  &lt;iframe class=&quot;center-image&quot; src=&quot;https://www.youtube-nocookie.com/embed/Quit3lN2igY?rel=0&amp;amp;modestbranding=1&quot; title=&quot;The Agentic Future of Software Delivery - AWS re:Invent 2025&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot; style=&quot;position:absolute;top:0;left:0;width:100%;height:100%;&quot; loading=&quot;lazy&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p class=&quot;figcaption&quot;&gt;The Agentic Future of Software Delivery - AWS re:Invent 2025&lt;/p&gt;

&lt;h2 id=&quot;session-overview&quot;&gt;Session Overview&lt;/h2&gt;

&lt;p&gt;The session covers the evolution of tools designed to decrease the time it takes to get from idea to deployment (also sometimes called “developer productivity”):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;The Early Days&lt;/strong&gt;: From punch cards and batch processing to interactive terminals&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The IDE Revolution&lt;/strong&gt;: How integrated development environments transformed coding&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;AI Pair Programming&lt;/strong&gt;: GitHub Copilot’s code completions started a revolution in 2021&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The Agentic Future&lt;/strong&gt;: Where autonomous agents are taking software delivery&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Industry Themes&lt;/strong&gt;: What trends are we seeing in the industry - and how is GitHub building for those?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GitHub Universe Ships&lt;/strong&gt;: What announcements and ships GitHub made at Universe (in Oct 2025)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next?&lt;/h2&gt;

&lt;p&gt;The agentic future isn’t coming - it’s here. Even if you are faster at “coding”, you can dilute those gains if you spend more time planning and validating! GitHub’s vision is to bring agentic capabilities to all parts of the SDLC so that you can scale your acceleration beyond the IDE.&lt;/p&gt;

&lt;p&gt;Agentic workflows are not just about new shiny tools - they are about a new way of thinking and working. For more on these topics, see my posts about principles of &lt;a href=&quot;/eight-principles-agentic-software-delivery/&quot;&gt;agentic software delivery&lt;/a&gt; and &lt;a href=&quot;/teaching-async-thinking-with-copilot/&quot;&gt;teaching async thinking&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Thanks to everyone who attended the session at re:Invent. The energy in Las Vegas was incredible, and the questions from the audience showed just how much interest there is in this space. We’re at an inflection point in software development, and I’m excited to see how teams embrace these new capabilities.&lt;/p&gt;

&lt;p&gt;Happy building!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="devops" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2025/12/aws-session.png" /><media:content medium="image" url="https://github.com/assets/images/2025/12/aws-session.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Teaching Your Team to Think Async-First with GitHub Copilot</title><link href="https://github.com/teaching-async-thinking-with-copilot/" rel="alternate" type="text/html" title="Teaching Your Team to Think Async-First with GitHub Copilot" /><published>2025-11-25T09:00:00+00:00</published><updated>2025-11-25T09:00:00+00:00</updated><id>https://github.com/teaching-async-thinking-with-copilot</id><content type="html" xml:base="https://github.com/teaching-async-thinking-with-copilot/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#the-fundamental-question-should-copilot-do-this&quot; id=&quot;markdown-toc-the-fundamental-question-should-copilot-do-this&quot;&gt;The Fundamental Question: “Should Copilot Do This?”&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#practical-async-patterns-examples-that-work&quot; id=&quot;markdown-toc-practical-async-patterns-examples-that-work&quot;&gt;Practical Async Patterns: Examples That Work&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#test-generation&quot; id=&quot;markdown-toc-test-generation&quot;&gt;Test Generation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#documentation&quot; id=&quot;markdown-toc-documentation&quot;&gt;Documentation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#prototyping-multiple-solutions&quot; id=&quot;markdown-toc-prototyping-multiple-solutions&quot;&gt;Prototyping Multiple Solutions&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#build-failure-analysis&quot; id=&quot;markdown-toc-build-failure-analysis&quot;&gt;Build Failure Analysis&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#code-optimization&quot; id=&quot;markdown-toc-code-optimization&quot;&gt;Code Optimization&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#redesigning-workflows-for-parallel-experimentation&quot; id=&quot;markdown-toc-redesigning-workflows-for-parallel-experimentation&quot;&gt;Redesigning Workflows for Parallel Experimentation&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#what-breaks-in-async-workflows&quot; id=&quot;markdown-toc-what-breaks-in-async-workflows&quot;&gt;What Breaks in Async Workflows&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#new-coordination-patterns&quot; id=&quot;markdown-toc-new-coordination-patterns&quot;&gt;New Coordination Patterns&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#teaching-the-mindset-what-to-delegate-what-to-keep&quot; id=&quot;markdown-toc-teaching-the-mindset-what-to-delegate-what-to-keep&quot;&gt;Teaching the Mindset: What to Delegate, What to Keep&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#copilot-excels-at&quot; id=&quot;markdown-toc-copilot-excels-at&quot;&gt;Copilot Excels At&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#humans-excel-at&quot; id=&quot;markdown-toc-humans-excel-at&quot;&gt;Humans Excel At&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#overcoming-resistance-this-feels-like-cheating&quot; id=&quot;markdown-toc-overcoming-resistance-this-feels-like-cheating&quot;&gt;Overcoming Resistance: “This Feels Like Cheating”&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;the-fundamental-question-should-copilot-do-this&quot;&gt;The Fundamental Question: “Should Copilot Do This?”&lt;/h2&gt;

&lt;p&gt;Once you’ve built an enablement program (see my companion post on &lt;a href=&quot;/building-copilot-enablement-program/&quot;&gt;Building a GitHub Copilot Enablement Program That Actually Works&lt;/a&gt;), the next challenge is teaching your team a new way of thinking.&lt;/p&gt;

&lt;p&gt;The fundamental shift is asking a new question before starting any task: &lt;strong&gt;“Should Copilot do this instead of me?”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn’t about laziness - rather, it’s about &lt;em&gt;leverage&lt;/em&gt;. When Copilot handles routine tasks, developers have more cognitive bandwidth for architecture, business logic, and creative problem-solving. But most developers have years of muscle memory telling them “I must do this myself.”&lt;/p&gt;

&lt;p&gt;The core shift your team needs to make is recognizing which work should be delegated to Copilot versus done manually. In this post, I’ll show you practical patterns for async, multi-threaded development where AI handles routine work in parallel while humans orchestrate and make decisions.&lt;/p&gt;

&lt;h2 id=&quot;practical-async-patterns-examples-that-work&quot;&gt;Practical Async Patterns: Examples That Work&lt;/h2&gt;

&lt;p&gt;Let’s look at specific tasks where async thinking delivers massive wins.&lt;/p&gt;

&lt;h3 id=&quot;test-generation&quot;&gt;Test Generation&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: Developers write feature code then spend hours crafting test cases, debugging failures, and iterating until coverage is acceptable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async approach with Copilot&lt;/strong&gt;: You describe behavior in comments as you write features, then ask Copilot to generate comprehensive tests including edge cases. Review the generated tests to ensure assertions meaningfully verify the contract (not just pass), adjust as needed, and move to your next task while Copilot handles integration tests in the background.&lt;/p&gt;

&lt;p&gt;This can deliver substantial time savings while often improving coverage, since AI frequently surfaces edge cases that humans miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key technique&lt;/strong&gt;: Create custom agents for testing to encapsulate your team’s patterns and ensure consistent output.&lt;/p&gt;

&lt;h3 id=&quot;documentation&quot;&gt;Documentation&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: Docs fall out of date because writing them is tedious. Developers ship features, promise to “update docs later,” and never do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async approach with Copilot&lt;/strong&gt;: Copilot generates README sections, API documentation, and code comments alongside your implementation, making it fast enough that developers actually keep docs current.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key technique&lt;/strong&gt;: Use Copilot Chat to generate documentation as you go. Prompt: “Generate API documentation for this function including parameters, return values, and usage examples.”&lt;/p&gt;

&lt;h3 id=&quot;prototyping-multiple-solutions&quot;&gt;Prototyping Multiple Solutions&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: Teams debate approaches theoretically in design meetings. They pick one design, discover limitations during implementation, then either live with them or go back to design. This cycle wastes days.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async approach with Copilot&lt;/strong&gt;: Create multiple GitHub issues describing different approaches and assign them all to Copilot coding agent. Within hours, you’re reviewing three actual implementations side by side, making decisions based on real code rather than speculation.&lt;/p&gt;

&lt;p&gt;This saves days compared to sequential implementation and dramatically improves decision quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key technique&lt;/strong&gt;: Write clear, detailed issue descriptions. Copilot coding agent works best when you specify requirements, constraints, and success criteria upfront.&lt;/p&gt;

&lt;h3 id=&quot;build-failure-analysis&quot;&gt;Build Failure Analysis&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: Developers scroll through hundreds of lines of logs guessing at root causes. They ping team members for help. They search Stack Overflow. It takes hours to diagnose complex failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async approach with Copilot&lt;/strong&gt;: GitHub Actions can automatically invoke Copilot to analyze failures, categorize them (code, config, test, infrastructure, transient), and generate remediation plans with the appropriate team member tagged.&lt;/p&gt;

&lt;p&gt;You can implement this today with the &lt;a href=&quot;https://github.com/actions/ai-inference&quot;&gt;actions/ai-inference&lt;/a&gt; action and GitHub Models (see my post on &lt;a href=&quot;/self-healing-devops-with-copilot-and-actions/&quot;&gt;self-healing devops&lt;/a&gt; for details).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key technique&lt;/strong&gt;: Set up automated workflows that capture build/test output and feed it to Copilot with context about your repo structure and conventions.&lt;/p&gt;
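&lt;p&gt;To make the technique concrete, here is a rough sketch of the “capture output and add context” step: it trims a build log to its tail and wraps it in a prompt that asks for one of the failure categories listed above. The function name, the prompt wording and the 200-line cutoff are my own assumptions for illustration; sending the prompt to a model (for example via actions/ai-inference) is deliberately left out.&lt;/p&gt;

```python
# Failure categories taken from the post; everything else is illustrative.
CATEGORIES = ("code", "config", "test", "infrastructure", "transient")


def build_failure_prompt(log_text, repo_notes, max_lines=200):
    """Build a prompt from the tail of a build log plus repo context.

    Only the last max_lines lines are kept, on the assumption that the
    failure shows up near the end of the log and model context is limited.
    """
    tail = log_text.splitlines()[-max_lines:]
    return "\n".join([
        "Classify this build failure as one of: " + ", ".join(CATEGORIES) + ".",
        "Then propose a remediation plan.",
        "Repository conventions: " + repo_notes,
        "Log tail:",
        *tail,
    ])


if __name__ == "__main__":
    sample_log = "\n".join(f"step {i} ok" for i in range(300)) + "\nERROR: tests failed"
    print(build_failure_prompt(sample_log, "Python service, pytest, ruff"))
```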

&lt;h3 id=&quot;code-optimization&quot;&gt;Code Optimization&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: Developers manually review code for dead imports, unused variables, inefficient algorithms, and memory leaks. They profile the application, identify hotspots, then spend hours refactoring. Code reviews often catch optimization opportunities too late, after the feature ships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async approach with Copilot&lt;/strong&gt;: Ask Copilot to analyze your codebase for optimization opportunities while you work on new features. Copilot can identify dead code, suggest more efficient algorithms, highlight memory-intensive operations, and recommend performance improvements.&lt;/p&gt;

&lt;p&gt;For example, prompt Copilot Chat: “Analyze this module for optimization opportunities including dead code, inefficient loops, and memory usage.”&lt;/p&gt;

&lt;p&gt;This is particularly powerful for refactoring legacy code. Create GitHub issues describing different optimization strategies (memory reduction, CPU optimization, code simplification) and assign them to Copilot coding agent. Compare the results to choose the best approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key technique&lt;/strong&gt;: Combine Copilot analysis with profiling data. Share performance metrics in your prompt: “This function takes 500ms on average with 10k records. Optimize for speed while maintaining correctness.” Copilot can suggest targeted optimizations based on actual bottlenecks rather than premature optimization.&lt;/p&gt;
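&lt;p&gt;If you don’t have profiling data to hand, a quick timing harness is often enough to produce the kind of number quoted in that prompt. The sketch below is one possible way to measure it; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;average_ms&lt;/code&gt; is a hypothetical helper, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sorted&lt;/code&gt; simply stands in for the function you want to optimize.&lt;/p&gt;

```python
import time


def average_ms(func, records, runs=5):
    """Return the mean wall-clock time of func(records) in milliseconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        func(records)
        total += time.perf_counter() - start
    return total / runs * 1000


if __name__ == "__main__":
    records = list(range(10_000))
    ms = average_ms(sorted, records)  # sorted is a stand-in for your function
    print(f"This function takes {ms:.1f}ms on average with {len(records)} records.")
```

&lt;p&gt;Paste the printed sentence into your prompt so Copilot is working from a measured baseline rather than a guess.&lt;/p&gt;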

&lt;p&gt;These patterns represent the essence of async, multi-threaded development: AI agents work in parallel while humans orchestrate and make decisions.&lt;/p&gt;

&lt;h2 id=&quot;redesigning-workflows-for-parallel-experimentation&quot;&gt;Redesigning Workflows for Parallel Experimentation&lt;/h2&gt;

&lt;p&gt;Async development requires rethinking how your team coordinates work. Traditional rituals assume synchronous, sequential workflows. They break down when developers work on multiple features simultaneously while AI handles background tasks.&lt;/p&gt;

&lt;h3 id=&quot;what-breaks-in-async-workflows&quot;&gt;What Breaks in Async Workflows&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Daily standups&lt;/strong&gt;: “What did you do yesterday?” becomes less meaningful when you’re orchestrating multiple parallel Copilot coding agent tasks rather than completing one task yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review queues&lt;/strong&gt;: Reviewers expect PRs to arrive sequentially. In async workflows, developers might open multiple PRs for the same feature (testing different approaches) simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment schedules&lt;/strong&gt;: Batch deployments assume teams synchronize to a release train. Async teams ship when features are ready, not on a schedule.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design reviews&lt;/strong&gt;: Traditional design reviews debate approaches theoretically. Async teams prototype multiple approaches with AI and compare actual implementations and working prototypes.&lt;/p&gt;

&lt;h3 id=&quot;new-coordination-patterns&quot;&gt;New Coordination Patterns&lt;/h3&gt;

&lt;p&gt;Here’s what works better for async teams:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repurpose daily standups&lt;/strong&gt;: Rather than “what I did yesterday, what I’m doing today and what’s blocking me”, change the format to cover what Copilot is working on, what needs review and what prompts are yielding the best results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Move to continuous code review&lt;/strong&gt;: Create a policy that requires Copilot Code Review on all PRs, and continuously iterate on Copilot instructions that help guide the review to your standards and conventions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enable on-demand deployments&lt;/strong&gt;: Shift from “we deploy every Friday” to “we deploy when features pass quality gates.” Use GitHub Actions to automate deployments on merge to main. This requires strong automated testing, but it eliminates batching delays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Invest in good quality gates and policies&lt;/strong&gt;: These include Code Quality and GitHub Advanced Security scans, linting and test coverage. You’re aiming to have high confidence that when all automated gates pass, the PR can be shipped as soon as human review is completed. No waiting for Friday.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shift design reviews to “build and compare”&lt;/strong&gt;: Instead of debating approaches in a meeting, create multiple issues describing each approach. Assign to Copilot coding agent or have developers prototype with Copilot assistance. Review actual code, not theoretical designs.&lt;/p&gt;

&lt;h2 id=&quot;teaching-the-mindset-what-to-delegate-what-to-keep&quot;&gt;Teaching the Mindset: What to Delegate, What to Keep&lt;/h2&gt;

&lt;p&gt;The hardest part of async thinking isn’t the mechanics - it’s the judgment. Developers need to learn what work Copilot handles well versus what requires human expertise.&lt;/p&gt;

&lt;h3 id=&quot;copilot-excels-at&quot;&gt;Copilot Excels At&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Repetitive code (CRUD operations, boilerplate, type conversions)&lt;/li&gt;
  &lt;li&gt;Test generation (unit tests, integration tests, edge cases)&lt;/li&gt;
  &lt;li&gt;Documentation (README, API docs, code comments)&lt;/li&gt;
  &lt;li&gt;Code transformations (refactoring, format changes, migrations)&lt;/li&gt;
  &lt;li&gt;Pattern matching (finding similar code, applying conventions)&lt;/li&gt;
  &lt;li&gt;Performance optimization (is this efficient enough?)&lt;/li&gt;
  &lt;li&gt;First-draft implementations of well-specified features&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;humans-excel-at&quot;&gt;Humans Excel At&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Architecture decisions (system design, tech stack choices)&lt;/li&gt;
  &lt;li&gt;Business logic validation (does this match requirements?)&lt;/li&gt;
  &lt;li&gt;Context synthesis (how does this fit the broader system?)&lt;/li&gt;
  &lt;li&gt;Ambiguity resolution (what did the stakeholder really mean?)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key principle&lt;/strong&gt;: Use Copilot to accelerate execution, but keep humans in the decision loop for validation and high-level thinking.&lt;/p&gt;

&lt;h2 id=&quot;overcoming-resistance-this-feels-like-cheating&quot;&gt;Overcoming Resistance: “This Feels Like Cheating”&lt;/h2&gt;

&lt;p&gt;Some developers resist async thinking because it feels like they’re not “really” coding. They worry about skill atrophy or whether they deserve credit for AI-assisted work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reframe the conversation&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;“Using Copilot isn’t cheating. It’s like using a compiler, debugger, or IDE - it’s a tool that makes you more effective.”&lt;/li&gt;
  &lt;li&gt;“Your skills shift from typing code to refining specs and reviewing code, which is actually more valuable. Senior engineers spend more time reviewing than typing.”&lt;/li&gt;
  &lt;li&gt;“You’re orchestrating complexity. That’s higher-level thinking than implementing details manually.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Address skill atrophy concerns&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;“You’re still coding. You’re just moving faster on routine work and spending more time on hard problems.”&lt;/li&gt;
  &lt;li&gt;“Review every suggestion critically. You’ll learn from seeing multiple approaches to problems.”&lt;/li&gt;
  &lt;li&gt;“Experiment with turning Copilot off periodically to ensure you retain fundamentals.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Celebrate the shift&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;“Think about what you can build now that you couldn’t before. More features? Better quality? Time to learn new skills?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most resistance fades once developers experience the productivity boost firsthand.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Teaching async-first thinking is the key to unlocking GitHub Copilot’s full potential. It’s not just about using AI tools - it’s about fundamentally rethinking how work gets done.&lt;/p&gt;

&lt;p&gt;Start by teaching your team to ask “Should Copilot do this?” before starting tasks. Show them practical patterns for test generation, documentation, prototyping, and build analysis. Redesign coordination patterns (standups, code review, deployments) to support parallel work. And help them develop judgment about what to delegate versus what requires human expertise.&lt;/p&gt;

&lt;p&gt;The shift from sequential, single-threaded development to async, multi-threaded workflows takes practice. But once your team internalizes the mindset, they’ll wonder how they ever worked any other way.&lt;/p&gt;

&lt;p&gt;For the leadership and enablement foundation that makes this possible, see my companion post on &lt;a href=&quot;/building-copilot-enablement-program/&quot;&gt;Building a GitHub Copilot Enablement Program That Actually Works&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy orchestrating!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="development" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2025/11/async-copilot.png" /><media:content medium="image" url="https://github.com/assets/images/2025/11/async-copilot.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Building a GitHub Copilot Enablement Program That Actually Works</title><link href="https://github.com/building-copilot-enablement-program/" rel="alternate" type="text/html" title="Building a GitHub Copilot Enablement Program That Actually Works" /><published>2025-11-25T08:00:00+00:00</published><updated>2025-11-25T08:00:00+00:00</updated><id>https://github.com/building-copilot-enablement-program</id><content type="html" xml:base="https://github.com/building-copilot-enablement-program/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#license-distribution-isnt-enough&quot; id=&quot;markdown-toc-license-distribution-isnt-enough&quot;&gt;License Distribution Isn’t Enough&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-innovation-curve-understanding-adoption-patterns&quot; id=&quot;markdown-toc-the-innovation-curve-understanding-adoption-patterns&quot;&gt;The Innovation Curve: Understanding Adoption Patterns&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-leaders-must-build&quot; id=&quot;markdown-toc-what-leaders-must-build&quot;&gt;What Leaders Must Build&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#building-a-continuous-enablement-program&quot; id=&quot;markdown-toc-building-a-continuous-enablement-program&quot;&gt;Building a Continuous Enablement Program&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#weekly-copilot-office-hours&quot; id=&quot;markdown-toc-weekly-copilot-office-hours&quot;&gt;Weekly Copilot Office Hours&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#monthly-team-showcases&quot; id=&quot;markdown-toc-monthly-team-showcases&quot;&gt;Monthly Team Showcases&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#quarterly-skills-refreshers&quot; id=&quot;markdown-toc-quarterly-skills-refreshers&quot;&gt;Quarterly Skills Refreshers&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#the-junior-developer-challenge&quot; id=&quot;markdown-toc-the-junior-developer-challenge&quot;&gt;The Junior Developer Challenge&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#creating-the-right-culture-celebrate-measure-share&quot; id=&quot;markdown-toc-creating-the-right-culture-celebrate-measure-share&quot;&gt;Creating the Right Culture: Celebrate, Measure, Share&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#celebrate-wins-publicly&quot; id=&quot;markdown-toc-celebrate-wins-publicly&quot;&gt;Celebrate Wins Publicly&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#measure-what-matters&quot; id=&quot;markdown-toc-measure-what-matters&quot;&gt;Measure What Matters&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#share-success-stories-internally&quot; id=&quot;markdown-toc-share-success-stories-internally&quot;&gt;Share Success Stories Internally&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#leadership-visibility-matters&quot; id=&quot;markdown-toc-leadership-visibility-matters&quot;&gt;Leadership Visibility Matters&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-sharpen-the-saw-principle-continuous-learning&quot; id=&quot;markdown-toc-the-sharpen-the-saw-principle-continuous-learning&quot;&gt;The “Sharpen the Saw” Principle: Continuous Learning&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#building-learning-into-the-cadence&quot; id=&quot;markdown-toc-building-learning-into-the-cadence&quot;&gt;Building Learning Into the Cadence&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#common-pitfalls-and-how-to-avoid-them&quot; id=&quot;markdown-toc-common-pitfalls-and-how-to-avoid-them&quot;&gt;Common Pitfalls and How to Avoid Them&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#mandating-copilot-without-committing-to-cultural-change&quot; id=&quot;markdown-toc-mandating-copilot-without-committing-to-cultural-change&quot;&gt;Mandating Copilot without Committing to Cultural Change&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#start-with-early-adopters-not-everyone&quot; id=&quot;markdown-toc-start-with-early-adopters-not-everyone&quot;&gt;Start with Early Adopters, Not Everyone&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#measure-beyond-productivity&quot; id=&quot;markdown-toc-measure-beyond-productivity&quot;&gt;Measure Beyond Productivity&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#budget-time-for-learning&quot; id=&quot;markdown-toc-budget-time-for-learning&quot;&gt;Budget Time for Learning&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#address-security-concerns-upfront&quot; id=&quot;markdown-toc-address-security-concerns-upfront&quot;&gt;Address Security Concerns Upfront&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#be-patient-with-adoption&quot; id=&quot;markdown-toc-be-patient-with-adoption&quot;&gt;Be Patient with Adoption&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;license-distribution-isnt-enough&quot;&gt;License Distribution Isn’t Enough&lt;/h2&gt;

&lt;p&gt;Many teams are struggling with “AI-hype”: they’ve purchased GitHub Copilot licenses for their teams, but are not immediately seeing massive productivity boosts across the board. Is agentic software delivery just a marketing term, or can real gains be realized?&lt;/p&gt;

&lt;p&gt;Within weeks of provisioning licenses, most teams will notice a pattern: a handful of high performers start achieving remarkable results, slashing time spent on routine tasks, experimenting with new approaches, shipping features faster. Meanwhile, the majority of developers continue working exactly as before, with Copilot sitting idle or producing mediocre suggestions they ignore.&lt;/p&gt;

&lt;p&gt;This isn’t a tool problem. &lt;strong&gt;It’s a leadership problem&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;GitHub Copilot boosts task productivity - most developers instinctively know this. But &lt;em&gt;systemic&lt;/em&gt; gains don’t materialize automatically. They require intentional investment in training, cultural change, and workflow redesign. In this post, I’ll share what leaders must build to drive real Copilot adoption, and why productivity improvement is a change management issue rather than a technology issue.&lt;/p&gt;

&lt;p&gt;In my companion post, &lt;a href=&quot;/teaching-async-thinking-with-copilot/&quot;&gt;Teaching Your Team to Think Async-First with GitHub Copilot&lt;/a&gt;, I cover the specific mindset shifts and workflow changes teams need to make. This post focuses on the leadership and enablement foundation that makes those changes possible.&lt;/p&gt;

&lt;h2 id=&quot;the-innovation-curve-understanding-adoption-patterns&quot;&gt;The Innovation Curve: Understanding Adoption Patterns&lt;/h2&gt;

&lt;p&gt;When you roll out GitHub Copilot, you’ll encounter a classic innovation adoption curve:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;10% Innovators&lt;/strong&gt;: High performers who immediately “get it”, experiment aggressively, and achieve noticeable productivity gains within weeks&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;70% Early/Late Majority&lt;/strong&gt;: Developers who need structured guidance, coaching, and proof points before they change their workflows&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;20% Laggards&lt;/strong&gt;: Skeptics who resist AI, worry about job security, or simply prefer their existing muscle memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The innovators succeed because they have four traits:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;they understand prompt context engineering (even if they don’t call it that)&lt;/li&gt;
  &lt;li&gt;they have an experimentation mindset&lt;/li&gt;
  &lt;li&gt;they’re comfortable with ambiguity&lt;/li&gt;
  &lt;li&gt;they continually learn&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Many developers don’t start with these skills.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/unleashing-developer-productivity-with-generative-ai&quot;&gt;McKinsey research&lt;/a&gt; confirms this pattern. Junior developers (less than 1 year of experience) were actually 7-10% slower with AI tools when left to figure things out on their own. But organizations that implement coaching programs, use case frameworks, and skills development see consistent gains across all experience levels.&lt;/p&gt;

&lt;p&gt;The risk of uneven adoption is real: team friction emerges when some developers are much faster than others. Code quality concerns arise when inexperienced developers trust AI output without verification, and knowledge gaps widen as innovators pull ahead while others stagnate.&lt;/p&gt;

&lt;h2 id=&quot;what-leaders-must-build&quot;&gt;What Leaders Must Build&lt;/h2&gt;

&lt;p&gt;Leaders must build a structured enablement program, not just a rollout announcement. Here’s what that looks like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Training curriculum&lt;/strong&gt; with three levels:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Beginner: Copilot basics, prompt patterns, acceptance vs rejection criteria&lt;/li&gt;
  &lt;li&gt;Intermediate: Context refinement, code review with AI, debugging techniques&lt;/li&gt;
  &lt;li&gt;Advanced: Custom instructions and agents, agentic workflows, custom MCP servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Ongoing support&lt;/strong&gt; beyond one-time workshops:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Weekly “Copilot office hours” where developers bring real problems&lt;/li&gt;
  &lt;li&gt;Monthly team showcase of wins and techniques&lt;/li&gt;
  &lt;li&gt;Quarterly skills assessments and refresher training&lt;/li&gt;
  &lt;li&gt;Peer coaching network with internal champions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Metrics to track effectiveness&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Copilot suggestion acceptance rate&lt;/li&gt;
  &lt;li&gt;Time saved on routine tasks (measure before/after on sample tasks)&lt;/li&gt;
  &lt;li&gt;Developer satisfaction scores (survey quarterly)&lt;/li&gt;
  &lt;li&gt;Feature velocity and cycle time improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Remember that there is no “single metric” - measure with the entire system in mind. Often Copilot gains are diluted by inefficient DevOps practices.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;building-a-continuous-enablement-program&quot;&gt;Building a Continuous Enablement Program&lt;/h2&gt;

&lt;p&gt;The reality is that AI and Copilot evolve rapidly in meaningful ways. Consider the timeline: autocomplete in 2022, Chat in 2023, Workspace in 2024, Agents in 2025. One-time “Copilot 101” training becomes obsolete quickly.&lt;/p&gt;

&lt;p&gt;You need continuous enablement, not one-and-done workshops.&lt;/p&gt;

&lt;h3 id=&quot;weekly-copilot-office-hours&quot;&gt;Weekly Copilot Office Hours&lt;/h3&gt;

&lt;p&gt;Set up a recurring 30-minute session where developers bring real problems. Format:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Developer shares screen, shows a task they’re working on&lt;/li&gt;
  &lt;li&gt;Facilitator demonstrates Copilot techniques live&lt;/li&gt;
  &lt;li&gt;Team discusses when to use AI vs when to code manually&lt;/li&gt;
  &lt;li&gt;Capture learnings in &lt;a href=&quot;https://docs.github.com/en/copilot/how-tos/provide-context/use-copilot-spaces/use-copilot-spaces&quot;&gt;Copilot Spaces&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve seen teams run this as a Zoom or Teams call with recordings posted internally. The key is making it low-stakes and practical: real problems, real solutions, real-time.&lt;/p&gt;

&lt;h3 id=&quot;monthly-team-showcases&quot;&gt;Monthly Team Showcases&lt;/h3&gt;

&lt;p&gt;Once a month, dedicate 15 minutes of your team meeting to “Copilot wins.” Developers volunteer to share:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;A task where Copilot saved significant time&lt;/li&gt;
  &lt;li&gt;A new technique they learned&lt;/li&gt;
  &lt;li&gt;A prompt pattern that works well&lt;/li&gt;
  &lt;li&gt;A mistake they made and what they learned&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This serves three purposes: it spreads knowledge horizontally across the team, it normalizes talking about AI assistance (reducing stigma), and it creates positive reinforcement for experimentation.&lt;/p&gt;

&lt;h3 id=&quot;quarterly-skills-refreshers&quot;&gt;Quarterly Skills Refreshers&lt;/h3&gt;

&lt;p&gt;Every quarter, run a 1-hour training session on what’s new:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;New Copilot features released in the last 3 months&lt;/li&gt;
  &lt;li&gt;Advanced techniques for experienced users&lt;/li&gt;
  &lt;li&gt;Updated best practices based on team learnings&lt;/li&gt;
  &lt;li&gt;Industry case studies and benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rotate who leads these sessions. Don’t always make it the same “Copilot champion.” Distributed ownership drives distributed adoption.&lt;/p&gt;

&lt;h3 id=&quot;the-junior-developer-challenge&quot;&gt;The Junior Developer Challenge&lt;/h3&gt;

&lt;p&gt;Junior developers need extra support. Without foundational knowledge, they can’t evaluate whether Copilot’s suggestions are good or bad. They accept code blindly, leading to bugs, security issues, and learning gaps.&lt;/p&gt;

&lt;p&gt;Best practices for supporting juniors:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Pair them with senior developers for the first 2-3 months of Copilot use&lt;/li&gt;
  &lt;li&gt;Teach code review skills first, Copilot usage second (review is the forcing function for learning)&lt;/li&gt;
  &lt;li&gt;Encourage asking “Why does Copilot suggest this?” rather than accepting blindly&lt;/li&gt;
  &lt;li&gt;Set up automated quality gates (linting, security scanning, tests) that catch bad AI suggestions&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Remember&lt;/strong&gt;: Copilot is “like an excitable junior engineer who types really fast” (Kent Quirk). Juniors need to learn to be the senior engineer reviewing that excitable junior.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;creating-the-right-culture-celebrate-measure-share&quot;&gt;Creating the Right Culture: Celebrate, Measure, Share&lt;/h2&gt;

&lt;p&gt;Technology changes are easy. Cultural changes are hard. Here’s how to make Copilot adoption stick:&lt;/p&gt;

&lt;h3 id=&quot;celebrate-wins-publicly&quot;&gt;Celebrate Wins Publicly&lt;/h3&gt;

&lt;p&gt;Create a dedicated internal channel (e.g., &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#copilot-wins&lt;/code&gt;) where anyone can share successes. Keep it genuine. Forced enthusiasm backfires, but genuine celebration creates momentum.&lt;/p&gt;

&lt;h3 id=&quot;measure-what-matters&quot;&gt;Measure What Matters&lt;/h3&gt;

&lt;p&gt;Track these metrics as you progress:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adoption metrics&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Active Copilot users (at least 1 suggestion accepted per week)&lt;/li&gt;
  &lt;li&gt;Suggestion acceptance rate (team average)&lt;/li&gt;
  &lt;li&gt;Chat interactions and agent mode per developer per week&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Impact metrics&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Cycle time (issue open to PR merged)&lt;/li&gt;
  &lt;li&gt;Developer satisfaction scores (include questions about AI tools)&lt;/li&gt;
  &lt;li&gt;Time spent on routine tasks (survey-based estimation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Quality metrics&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Defect escape rate (bugs found in production)&lt;/li&gt;
  &lt;li&gt;Test coverage trends&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Don’t expect instant DORA metric improvements. Copilot primarily improves &lt;em&gt;task-level&lt;/em&gt; productivity. Team-level metrics like cycle time also depend on code review speed, deployment frequency, coordination overhead and overall DevOps practices.&lt;/p&gt;
&lt;/blockquote&gt;
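
&lt;p&gt;The adoption, impact and quality metrics above can be sketched in a few lines of Python. This is a minimal illustration, not a reporting tool: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DeveloperWeek&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WorkItem&lt;/code&gt; records are hypothetical shapes for data you would export from your own tooling, the acceptance rate is pooled across the team rather than averaged per developer, and the defect escape rate assumes the common definition of production defects divided by all defects found.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class DeveloperWeek:
    """One developer's Copilot activity for one week (hypothetical export format)."""
    developer: str
    suggestions_shown: int
    suggestions_accepted: int


@dataclass
class WorkItem:
    """An issue and the pull request that closed it (hypothetical record)."""
    issue_opened: datetime
    pr_merged: datetime


def active_users(weeks: list[DeveloperWeek]) -> int:
    # "Active" follows the definition above: at least 1 suggestion accepted in the week.
    return sum(1 for w in weeks if w.suggestions_accepted >= 1)


def acceptance_rate(weeks: list[DeveloperWeek]) -> float:
    # Pooled team rate: all accepted suggestions divided by all suggestions shown.
    shown = sum(w.suggestions_shown for w in weeks)
    accepted = sum(w.suggestions_accepted for w in weeks)
    return accepted / shown if shown else 0.0


def median_cycle_time_days(items: list[WorkItem]) -> float:
    # Cycle time: issue opened to PR merged, in days. The median resists outliers.
    return median((i.pr_merged - i.issue_opened).total_seconds() / 86400 for i in items)


def defect_escape_rate(found_in_production: int, found_before_release: int) -> float:
    # Assumed definition: share of all known defects that reached production.
    total = found_in_production + found_before_release
    return found_in_production / total if total else 0.0
```

&lt;p&gt;Track these numbers as trends over several months rather than as targets. A rising acceptance rate next to a rising defect escape rate, for example, is a signal to invest in review skills and quality gates.&lt;/p&gt;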

&lt;h3 id=&quot;share-success-stories-internally&quot;&gt;Share Success Stories Internally&lt;/h3&gt;

&lt;p&gt;Once a month, publish an internal blog post or email highlighting a Copilot success story. Interview a developer, show before/after workflows, quantify impact. Highlight wins (or losses with analysis), what stood out and what could be done next time.&lt;/p&gt;

&lt;h3 id=&quot;leadership-visibility-matters&quot;&gt;Leadership Visibility Matters&lt;/h3&gt;

&lt;p&gt;Executives need to understand and talk about Copilot. When VPs and directors understand what Copilot can and can’t do, they are more likely to give their teams the freedom to get over the “learning hump”. When teams see that leadership is invested in their success with Copilot, it goes a long way toward building confidence and encouraging innovation.&lt;/p&gt;

&lt;h2 id=&quot;the-sharpen-the-saw-principle-continuous-learning&quot;&gt;The “Sharpen the Saw” Principle: Continuous Learning&lt;/h2&gt;

&lt;p&gt;GitHub Copilot is evolving rapidly. What worked 6 months ago may not be optimal today. Your team needs continuous learning baked into their cadence.&lt;/p&gt;

&lt;h3 id=&quot;building-learning-into-the-cadence&quot;&gt;Building Learning Into the Cadence&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dedicated exploration time&lt;/strong&gt;: Allocate 2-4 hours per month per developer for exploring new Copilot features. Make it explicit, not “whenever you have time.” Treat it like tech debt work that needs to be scheduled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Reserve one Friday afternoon a month. Developers experiment with a new Copilot feature, document their findings, and share them with the team at the next showcase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Internal Copilot Space&lt;/strong&gt;: Maintain a markdown repo (and link it to a Copilot Space) with:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Prompt patterns that work well for your codebase&lt;/li&gt;
  &lt;li&gt;Common pitfalls and how to avoid them&lt;/li&gt;
  &lt;li&gt;Use case examples with before/after&lt;/li&gt;
  &lt;li&gt;Links to official documentation and external resources&lt;/li&gt;
  &lt;li&gt;Change log of major Copilot updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Update at a regular cadence (monthly or quarterly). Assign a rotating “knowledge curator” to moderate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;External training refreshers&lt;/strong&gt;: Budget for annual training from GitHub or third-party providers. New major features justify bringing in experts to train your team. Internal champions can handle ongoing enablement, but external experts bring fresh perspectives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monthly “What’s New” lunch &amp;amp; learns&lt;/strong&gt;: When GitHub ships major updates, dedicate a lunch session to demoing new capabilities and use cases.&lt;/p&gt;

&lt;h2 id=&quot;common-pitfalls-and-how-to-avoid-them&quot;&gt;Common Pitfalls and How to Avoid Them&lt;/h2&gt;

&lt;p&gt;After helping dozens of organizations adopt Copilot, here are the mistakes I see repeatedly:&lt;/p&gt;

&lt;h3 id=&quot;mandating-copilot-without-committing-to-cultural-change&quot;&gt;Mandating Copilot without Committing to Cultural Change&lt;/h3&gt;

&lt;p&gt;Teams that say “everyone must use Copilot” but don’t provide resources and training create resentment. New tools require investment, and few tools deliver as large an ROI as Copilot when that investment is made.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better approach&lt;/strong&gt;: Create training programs and commit to ongoing enablement. Create incentives for experimentation. Recognize early adopters. Share success stories. Let peer pressure and genuine value augment training to drive adoption.&lt;/p&gt;

&lt;h3 id=&quot;start-with-early-adopters-not-everyone&quot;&gt;Start with Early Adopters, Not Everyone&lt;/h3&gt;

&lt;p&gt;Rolling out to hundreds of developers at once creates chaos. You typically can’t support that many people learning simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better approach&lt;/strong&gt;: Start with 20-30 innovators. Learn what works. Document patterns. Then expand to more teams. Iterate. Then expand to everyone. Each wave teaches the next wave. This makes Copilot a lot “stickier”.&lt;/p&gt;

&lt;h3 id=&quot;measure-beyond-productivity&quot;&gt;Measure Beyond Productivity&lt;/h3&gt;

&lt;p&gt;If you only track “lines of code written” or “PRs merged,” you’ll optimize for the wrong things. Developers will ship more code, not better code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better approach&lt;/strong&gt;: Track developer satisfaction (“Does Copilot make your job better?”), code quality (defect rates, security issues), and learning (“What new skills have you developed?”). Balanced metrics drive balanced outcomes.&lt;/p&gt;

&lt;h3 id=&quot;budget-time-for-learning&quot;&gt;Budget Time for Learning&lt;/h3&gt;

&lt;p&gt;Teams that don’t explicitly allocate time for Copilot learning see minimal adoption. Developers are busy shipping features. Learning gets deprioritized.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better approach&lt;/strong&gt;: Schedule 2-4 hours per month per developer. Make it non-negotiable, like sprint planning or retrospectives. Track it in your sprint capacity planning.&lt;/p&gt;
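
&lt;p&gt;To make that allocation visible in sprint capacity planning, the arithmetic can be sketched as follows. The function names, the default of four weeks per month, and the focus-hours figure in the usage note are illustrative assumptions, not a prescribed model.&lt;/p&gt;

```python
def sprint_learning_hours(hours_per_month: float, sprint_weeks: int,
                          weeks_per_month: float = 4.0) -> float:
    """Pro-rate a monthly learning budget to a single sprint.

    weeks_per_month defaults to 4.0 as a simplifying assumption.
    """
    return hours_per_month * sprint_weeks / weeks_per_month


def delivery_capacity(developers: int, focus_hours_per_sprint: float,
                      learning_hours_per_sprint: float) -> float:
    """Team hours left for delivery once learning time is reserved up front."""
    per_developer = focus_hours_per_sprint - learning_hours_per_sprint
    return developers * per_developer
```

&lt;p&gt;For example, a budget of 3 hours per month on a two-week sprint reserves 1.5 hours per developer per sprint. For a team of 8 with an assumed 60 focus hours each, that leaves 468 of 480 hours for delivery, a cost of 2.5% of capacity.&lt;/p&gt;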

&lt;h3 id=&quot;address-security-concerns-upfront&quot;&gt;Address Security Concerns Upfront&lt;/h3&gt;

&lt;p&gt;Developers worry about what Copilot can see, whether their code trains models, and whether AI might leak sensitive data.&lt;/p&gt;

&lt;p&gt;Be transparent:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;GitHub Copilot Business/Enterprise does not train on your code&lt;/li&gt;
  &lt;li&gt;Code snippets are not retained after generating suggestions&lt;/li&gt;
  &lt;li&gt;Use content exclusions to prevent Copilot from accessing secrets or sensitive files&lt;/li&gt;
  &lt;li&gt;Enable Copilot audit logs to track usage and investigate issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You should also implement GitHub Advanced Security so that AppSec is baked into the daily agentic workflow - this will reduce concern about shipping vulnerable code.&lt;/p&gt;

&lt;h3 id=&quot;be-patient-with-adoption&quot;&gt;Be Patient with Adoption&lt;/h3&gt;

&lt;p&gt;Meaningful adoption takes 6-12 months, not 6-12 weeks. Don’t expect instant transformation.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Month 1-3: Early adopters experiment, find use cases, beginner training programs launch&lt;/li&gt;
  &lt;li&gt;Month 4-6: Patterns emerge, intermediate/advanced training programs launch&lt;/li&gt;
  &lt;li&gt;Month 7-9: Majority of team adopts, workflows change&lt;/li&gt;
  &lt;li&gt;Month 10-12: New workflows become muscle memory, metrics improve&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you expect results in 30 days, you’re likely to be disappointed. If you invest systematically, you’ll see transformation.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Distributing GitHub Copilot licenses is the easy part. Building organizational capability to leverage AI is the real work.&lt;/p&gt;

&lt;p&gt;The shift to AI-assisted development requires intentional leadership. You must invest in training programs, drive cultural change by celebrating wins and sharing success stories, and commit to continuous learning as AI capabilities evolve.&lt;/p&gt;

&lt;p&gt;Start by understanding the innovation curve and building enablement programs that serve all adopter types. Create weekly office hours, monthly showcases, and quarterly refreshers. Measure what matters with balanced metrics across adoption, impact, and quality. And be patient - transformation takes 6-12 months.&lt;/p&gt;

&lt;p&gt;The organizations that win in the age of AI won’t be the ones who simply buy the best tools. They’ll be the ones who build the best enablement programs, foster the right culture, and empower their developers to think differently about how work gets done.&lt;/p&gt;

&lt;p&gt;In my companion post, &lt;a href=&quot;/teaching-async-thinking-with-copilot/&quot;&gt;Teaching Your Team to Think Async-First with GitHub Copilot&lt;/a&gt;, I dive into the specific mindset shifts and workflow redesigns that make async, multi-threaded development possible.&lt;/p&gt;

&lt;p&gt;Happy leading!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="process" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2025/11/copilot-enablement.png" /><media:content medium="image" url="https://github.com/assets/images/2025/11/copilot-enablement.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Eight Principles for Agentic Software Delivery (ASD)</title><link href="https://github.com/eight-principles-agentic-software-delivery/" rel="alternate" type="text/html" title="Eight Principles for Agentic Software Delivery (ASD)" /><published>2025-08-12T00:30:00+00:00</published><updated>2025-08-12T00:30:00+00:00</updated><id>https://github.com/eight-principles-agentic-software-delivery</id><content type="html" xml:base="https://github.com/eight-principles-agentic-software-delivery/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#the-evolution-from-continuous-delivery-to-asd&quot; id=&quot;markdown-toc-the-evolution-from-continuous-delivery-to-asd&quot;&gt;The Evolution from Continuous Delivery to ASD&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-1-outcome-focused-delivery&quot; id=&quot;markdown-toc-principle-1-outcome-focused-delivery&quot;&gt;Principle 1: Outcome-Focused Delivery&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application&quot; id=&quot;markdown-toc-github-application&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification&quot; id=&quot;markdown-toc-impact-classification&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-2-human-ai-collaboration-by-design&quot; id=&quot;markdown-toc-principle-2-human-ai-collaboration-by-design&quot;&gt;Principle 2: Human-AI Collaboration by Design&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-1&quot; id=&quot;markdown-toc-github-application-1&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-1&quot; id=&quot;markdown-toc-impact-classification-1&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-3-intelligent-automation-across-the-sdlc&quot; id=&quot;markdown-toc-principle-3-intelligent-automation-across-the-sdlc&quot;&gt;Principle 3: Intelligent Automation Across the SDLC&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-2&quot; id=&quot;markdown-toc-github-application-2&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-2&quot; id=&quot;markdown-toc-impact-classification-2&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-4-single-source-of-truth-and-platform-integration&quot; id=&quot;markdown-toc-principle-4-single-source-of-truth-and-platform-integration&quot;&gt;Principle 4: Single Source of Truth and Platform Integration&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-3&quot; id=&quot;markdown-toc-github-application-3&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-3&quot; id=&quot;markdown-toc-impact-classification-3&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-5-built-in-quality-and-security&quot; id=&quot;markdown-toc-principle-5-built-in-quality-and-security&quot;&gt;Principle 5: Built-In Quality and Security&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-4&quot; id=&quot;markdown-toc-github-application-4&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-4&quot; id=&quot;markdown-toc-impact-classification-4&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-6-shared-responsibility-and-ai-governance&quot; id=&quot;markdown-toc-principle-6-shared-responsibility-and-ai-governance&quot;&gt;Principle 6: Shared Responsibility and AI Governance&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-5&quot; id=&quot;markdown-toc-github-application-5&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-5&quot; id=&quot;markdown-toc-impact-classification-5&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-7-parallel-experimentation&quot; id=&quot;markdown-toc-principle-7-parallel-experimentation&quot;&gt;Principle 7: Parallel Experimentation&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-6&quot; id=&quot;markdown-toc-github-application-6&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-6&quot; id=&quot;markdown-toc-impact-classification-6&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#principle-8-continuous-learning-and-adaptation&quot; id=&quot;markdown-toc-principle-8-continuous-learning-and-adaptation&quot;&gt;Principle 8: Continuous Learning and Adaptation&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#github-application-7&quot; id=&quot;markdown-toc-github-application-7&quot;&gt;GitHub Application&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#impact-classification-7&quot; id=&quot;markdown-toc-impact-classification-7&quot;&gt;Impact Classification&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#putting-it-all-together&quot; id=&quot;markdown-toc-putting-it-all-together&quot;&gt;Putting It All Together&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#getting-started-with-asd&quot; id=&quot;markdown-toc-getting-started-with-asd&quot;&gt;Getting Started with ASD&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In this post, I’ll show you eight principles for implementing Agentic Software Delivery (ASD) and why they matter for accelerating value delivery while maintaining quality and security. You’ll learn how to blend human expertise with AI capabilities throughout your SDLC, evolving a delivery system that’s faster, smarter, and more focused on business outcomes.&lt;/p&gt;

&lt;h2 id=&quot;the-evolution-from-continuous-delivery-to-asd&quot;&gt;The Evolution from Continuous Delivery to ASD&lt;/h2&gt;

&lt;p&gt;Right as DevOps was becoming an industry standard, Continuous Delivery gave us some good &lt;a href=&quot;https://devopsnet.com/2011/08/04/continuous-delivery/&quot;&gt;foundational principles&lt;/a&gt; to put into practice: automate everything, maintain quality, and deliver frequently, to name three. In one of my previous posts, I define &lt;a href=&quot;/agentic-software-delivery/&quot;&gt;Agentic Software Delivery&lt;/a&gt; (ASD), but in this post I want to start making it more practical by providing a set of principles. For each principle I want to assess its impact on the three pillars of ASD and show some practical GitHub implementation tips.&lt;/p&gt;

&lt;h2 id=&quot;principle-1-outcome-focused-delivery&quot;&gt;Principle 1: Outcome-Focused Delivery&lt;/h2&gt;

&lt;p&gt;Prioritize business outcomes over mere output. Every feature should tie back to customer value, and “done” means value is delivered in production, not just code completed. This may seem obvious, but there is a huge focus on which models write better code or what percentage of your codebase is written by AI - all of which are meaningless numbers if you are not achieving business outcomes. The goal isn’t more AI - it’s better results, improved productivity and happier developers.&lt;/p&gt;

&lt;h3 id=&quot;github-application&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Tie your work item tracking to Business Objectives so that the business outcome is clear throughout the SDLC&lt;/li&gt;
  &lt;li&gt;Configure GitHub Advanced Security to track vulnerability fixes as business outcomes - security is business value&lt;/li&gt;
  &lt;li&gt;Don’t over-index on a single measure - think system-wide (read the GitHub &lt;a href=&quot;https://resources.github.com/engineering-system-success-playbook/&quot;&gt;Engineering System Success Playbook&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;On an architectural level, you should validate business value during rollout. One way could be to use feature flags and partial deployments to validate business metrics before full rollout.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: HIGH - Humans define business value and success metrics&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: LOW - Agents execute but don’t determine business priorities&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: MEDIUM - Providing business value goals in agent-readable (and human-readable) formats will improve agentic results&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-2-human-ai-collaboration-by-design&quot;&gt;Principle 2: Human-AI Collaboration by Design&lt;/h2&gt;

&lt;p&gt;Integrate human expertise with AI at every stage. Design processes where routine tasks are handled by AI while complex decisions remain human-guided. The goal is not to replace humans, but to replace mundane and low-level toil tasks with AI so that humans can work at more abstract layers and do higher-order work.&lt;/p&gt;

&lt;h3 id=&quot;github-application-1&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Enable GitHub Copilot for AI pair programming across your organization. This requires more than just assigning a license - you need to create enablement programs (and ideally, teams) that will continually train and enable teams. AI is moving fast, and just like we need Continuous Delivery, AI development requires Continuous enablement.&lt;/li&gt;
  &lt;li&gt;Use Copilot Code Review to assist with the increase in review burden&lt;/li&gt;
  &lt;li&gt;Use Copilot to generate READMEs, architectural documentation, style guides, coding patterns and other “tribal knowledge”. Source control these documents alongside code so that Copilot has access in the IDE or through Coding Agent.&lt;/li&gt;
  &lt;li&gt;Use Copilot to generate &lt;a href=&quot;https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions?versionId=free-pro-team%40latest&amp;amp;productId=copilot&amp;amp;restPage=tutorials%2Ccopilot-chat-cookbook&quot;&gt;custom instructions&lt;/a&gt; and custom &lt;a href=&quot;https://code.visualstudio.com/docs/copilot/chat/chat-modes&quot;&gt;Chat Modes&lt;/a&gt; that tailor and personalize Copilot&lt;/li&gt;
  &lt;li&gt;Invest in good automated quality gates, applied at scale using &lt;a href=&quot;https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-rulesets/about-rulesets&quot;&gt;Rulesets&lt;/a&gt;. Make the gates thorough so that you have high trust in code that passes the gauntlet. Apply these gates irrespective of the &lt;em&gt;source&lt;/em&gt; of changes (human, AI or a mix).&lt;/li&gt;
  &lt;li&gt;Create &lt;a href=&quot;https://github.blog/changelog/2025-05-29-introducing-copilot-spaces-a-new-way-to-work-with-code-and-context/&quot;&gt;Copilot Spaces&lt;/a&gt; that encapsulate softer skills, general coding guidelines and standards, and other “tribal knowledge” so that you can build a library that is searchable and personalized&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification-1&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: HIGH - Humans provide context and custom guidelines, make decisions, and validate AI output&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: HIGH - Agents handle repetitive coding and testing tasks; automated gates do much of the validation heavy-lifting&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: HIGH - AI is personalized to your codebase and team patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-3-intelligent-automation-across-the-sdlc&quot;&gt;Principle 3: Intelligent Automation Across the SDLC&lt;/h2&gt;

&lt;p&gt;Automate everything feasible and use AI to extend automation into complex, context-driven tasks. Go beyond simple CI/CD to intelligent pipeline optimization. Teams that use build automation always outperform teams that build manually: the same will be said of teams that leverage AI in their pipelines to keep pipelines running continuously - they will outperform teams that rely on “traditional”, static pipelines.&lt;/p&gt;

&lt;h3 id=&quot;github-application-2&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Implement GitHub Actions workflows that &lt;a href=&quot;/self-healing-devops-with-copilot-and-actions/&quot;&gt;self-heal&lt;/a&gt; when they fail&lt;/li&gt;
  &lt;li&gt;Use Dependabot for automated dependency &lt;a href=&quot;https://docs.github.com/en/code-security/dependabot/dependabot-security-updates/about-dependabot-security-updates&quot;&gt;updates&lt;/a&gt; with intelligent grouping&lt;/li&gt;
  &lt;li&gt;In high-activity repositories, deploy GitHub’s &lt;a href=&quot;https://github.blog/engineering/engineering-principles/how-github-uses-merge-queue-to-ship-hundreds-of-changes-every-day/&quot;&gt;merge queue&lt;/a&gt; to prevent failing PRs from blocking the entire pipeline&lt;/li&gt;
  &lt;li&gt;Use &lt;a href=&quot;https://github.blog/enterprise-software/ci-cd/when-to-choose-github-hosted-runners-or-self-hosted-runners-with-github-actions/&quot;&gt;GitHub Hosted Runners&lt;/a&gt; so that you can concentrate on software delivery rather than trying to scale and manage build farms&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification-2&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: LOW - Humans set policies but don’t manage execution&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: HIGH - Agents handle most automation tasks independently&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: HIGH - AI is leveraged to self-heal tests, pipelines and other automated processes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-4-single-source-of-truth-and-platform-integration&quot;&gt;Principle 4: Single Source of Truth and Platform Integration&lt;/h2&gt;

&lt;p&gt;Keep all code, configurations, and documents in a single system for consistency and traceability. This enables both humans and AI to work from the same context. There may be exceptions that are role-dependent: designers typically work in tools like Figma rather than in source control systems like GitHub. For external systems, leverage Model Context Protocol (MCP) to provide context or extend model capabilities (tools). However, the backbone of your development system should be a single, AI-powered Source Control Management (SCM) system.&lt;/p&gt;

&lt;h3 id=&quot;github-application-3&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Choose a single SCM tool - make sure this platform is capable of running intelligent pipelines and integrating to other systems when necessary. Make sure this is a tool that your developers will enjoy working with!&lt;/li&gt;
  &lt;li&gt;Store infrastructure as code alongside application code in GitHub repos&lt;/li&gt;
  &lt;li&gt;Implement GitHub’s CODEOWNERS for clear ownership of code&lt;/li&gt;
  &lt;li&gt;Reduce the number of tools in your ecosystem, and use platform-native tools where possible. This not only reduces cognitive load and the cost of integration, but the fewer external systems you need, the faster you can get context from those external systems to agents when needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification-3&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: MEDIUM - Humans establish structure and governance&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: HIGH - Agents need unified access to operate effectively&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: HIGH - Centralized data and a reduced integration footprint enable better AI understanding&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: A fragmented toolchain limits AI effectiveness. Consolidation isn’t just about efficiency; it’s about how quickly you can enable AI agents to understand your entire system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;principle-5-built-in-quality-and-security&quot;&gt;Principle 5: Built-In Quality and Security&lt;/h2&gt;

&lt;p&gt;Embed quality and security from the start. AI tools should help generate tests, detect vulnerabilities, and enforce standards continuously.&lt;/p&gt;

&lt;h3 id=&quot;github-application-4&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Enable &lt;a href=&quot;https://docs.github.com/en/enterprise-cloud@latest/get-started/learning-about-github/about-github-advanced-security&quot;&gt;GitHub Advanced Security&lt;/a&gt; for automatic vulnerability scanning and remediation at scale using &lt;a href=&quot;https://docs.github.com/en/code-security/securing-your-organization/fixing-security-alerts-at-scale/about-security-campaigns?versionId=free-pro-team%40latest&amp;amp;productId=copilot&amp;amp;restPage=tutorials%2Ccopilot-chat-cookbook&quot;&gt;Campaigns&lt;/a&gt; and &lt;a href=&quot;https://docs.github.com/en/code-security/code-scanning/managing-code-scanning-alerts/responsible-use-autofix-code-scanning&quot;&gt;Autofix&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Configure &lt;a href=&quot;https://docs.github.com/en/code-security/secret-scanning/introduction/about-secret-scanning&quot;&gt;secret scanning&lt;/a&gt; with custom patterns for your organization&lt;/li&gt;
  &lt;li&gt;Implement Actions workflows that block deployments on security issues&lt;/li&gt;
  &lt;li&gt;Leverage GitHub Copilot to create unit and integration tests&lt;/li&gt;
&lt;/ul&gt;
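
&lt;p&gt;As a rough sketch of the deployment gate described above, a workflow could refuse to deploy while critical code scanning alerts are open. This assumes code scanning is already enabled on the repository; the workflow name, job names, severity threshold and the placeholder deploy step are illustrative assumptions rather than a prescribed setup.&lt;/p&gt;

&lt;div class=&quot;language-yml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
name: Gated Deploy

on:
  push:
    branches: [main]

permissions:
  contents: read
  security-events: read

jobs:
  security-gate:
    runs-on: ubuntu-latest
    steps:
    - name: Fail if critical code scanning alerts are open
      run: |
        # Count open alerts with severity &quot;critical&quot;; an API error also fails this step
        open_alerts=$(gh api --method GET &quot;repos/$GITHUB_REPOSITORY/code-scanning/alerts&quot; \
          -f state=open -f severity=critical --jq &apos;length&apos;)
        if [ &quot;$open_alerts&quot; -gt 0 ]; then
          echo &quot;Blocking deployment: $open_alerts open critical alert(s)&quot;
          exit 1
        fi
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  deploy:
    # Runs only if the gate job succeeded
    needs: security-gate
    runs-on: ubuntu-latest
    steps:
    - name: Deploy (placeholder)
      run: echo &quot;Replace with your deployment steps&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy&lt;/code&gt; declares &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needs: security-gate&lt;/code&gt;, a failed gate stops the deployment without any extra conditions. The threshold is a policy decision: a team might also want to block on high-severity alerts, or on secret scanning alerts.&lt;/p&gt;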

&lt;h3 id=&quot;impact-classification-4&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: HIGH - Humans define quality standards and security policies&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: HIGH - Platform continuously scans, while Copilot continuously improves test coverage&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: MEDIUM - AI uses common test patterns and systems effectively&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-6-shared-responsibility-and-ai-governance&quot;&gt;Principle 6: Shared Responsibility and AI Governance&lt;/h2&gt;

&lt;p&gt;Everyone shares responsibility for success, including AI systems. Establish clear governance for what AI can do autonomously versus what requires human oversight.&lt;/p&gt;

&lt;h3 id=&quot;github-application-5&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Set up CODEOWNERS to require human review for critical paths&lt;/li&gt;
  &lt;li&gt;Focus on creating reusable documentation that can guide agents effectively so that they can work more independently&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification-5&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: HIGH - Humans maintain accountability and set boundaries&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: MEDIUM - Agents operate with clear instructions&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: MEDIUM - Common practices and guidance are shared between humans and agents&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-7-parallel-experimentation&quot;&gt;Principle 7: Parallel Experimentation&lt;/h2&gt;

&lt;p&gt;Leverage the scale of autonomous agents to experiment widely. Without AI, creating several solutions to a problem means using several teams, or having the same team solve the same problem several times, which is cost-prohibitive. With agents, you can instruct several agents to work on different flavors of a solution in parallel and pick the best one, since cost is no longer a limiting factor.&lt;/p&gt;

&lt;h3 id=&quot;github-application-6&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Automated pipelines are critical for parallel experimentation, since the goal is for each solution to be created independently and autonomously&lt;/li&gt;
  &lt;li&gt;Leverage Coding Agent: instead of requesting one solution, create three to five Issues, each with a different “flavor” of solution (using Agent Mode to brainstorm), and assign Coding Agent to each Issue. Pick the best result and discard the others.&lt;/li&gt;
&lt;/ul&gt;
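
&lt;p&gt;One way to script the fan-out above is a manually triggered workflow that opens one Issue per solution flavor. This is a minimal sketch: the workflow name, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;problem&lt;/code&gt; input and the three flavor names are hypothetical, and assigning each Issue to Coding Agent is left as a separate step.&lt;/p&gt;

&lt;div class=&quot;language-yml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
name: Parallel Experiment

on:
  workflow_dispatch:
    inputs:
      problem:
        description: Problem statement shared by every variant
        required: true

permissions:
  issues: write

jobs:
  open-variant-issues:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Hypothetical flavors; replace with the variants you brainstormed
        flavor: [minimal-change, performance-first, readability-first]
    steps:
    - name: Create one issue per solution flavor
      run: |
        gh issue create \
          --repo &quot;$GITHUB_REPOSITORY&quot; \
          --title &quot;Experiment ($FLAVOR): $PROBLEM&quot; \
          --body &quot;Implement a $FLAVOR solution to: $PROBLEM&quot;
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        FLAVOR: ${{ matrix.flavor }}
        PROBLEM: ${{ inputs.problem }}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Passing the input and the matrix value through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;env&lt;/code&gt; rather than interpolating them into the script keeps user-supplied text out of the shell command itself. Each resulting Issue can then be assigned to Coding Agent and the outcomes compared side by side.&lt;/p&gt;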

&lt;h3 id=&quot;impact-classification-6&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: MEDIUM - Humans interpret results and make decisions&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: HIGH - Agents brainstorm, plan and implement multiple solutions in parallel&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: MEDIUM - AI leverages business outcomes, custom instructions and other artifacts to produce solutions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;principle-8-continuous-learning-and-adaptation&quot;&gt;Principle 8: Continuous Learning and Adaptation&lt;/h2&gt;

&lt;p&gt;Commit to evolving both your process and your AI tools and integrations. Feed learnings back into the system so it gets smarter over time.&lt;/p&gt;

&lt;h3 id=&quot;github-application-7&quot;&gt;GitHub Application&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Use GitHub Discussions to capture retrospectives and learnings. Distill key insights into improved documentation, instructions and Spaces&lt;/li&gt;
  &lt;li&gt;Track metrics in GitHub Insights to identify enablement opportunities&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;impact-classification-7&quot;&gt;Impact Classification&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Human Expertise&lt;/strong&gt;: HIGH - Humans drive improvement initiatives and learning&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Autonomous Agents&lt;/strong&gt;: MEDIUM - Agents improve as guidelines are improved&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Intelligent Context&lt;/strong&gt;: HIGH - AI models improve through continuous improvement and assessment&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: Your delivery system should be treated as a product that you continuously refine. Regular retrospectives should include evaluating AI performance alongside human processes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting It All Together&lt;/h2&gt;

&lt;p&gt;These eight principles work together to create a delivery system that is more than the sum of its parts. When you combine outcome focus with AI collaboration, a unified platform with built-in quality, and experimentation at scale, you get an SDLC that delivers value consistently and adapts to changing needs. I believe that GitHub is uniquely positioned to empower this transformation.&lt;/p&gt;

&lt;h2 id=&quot;getting-started-with-asd&quot;&gt;Getting Started with ASD&lt;/h2&gt;

&lt;p&gt;Start small. Pick one or two principles that address your biggest pain points:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;If you’re drowning in repetitive tasks, focus on Intelligent Automation&lt;/li&gt;
  &lt;li&gt;If quality issues slip through, prioritize Built-In Quality and Security&lt;/li&gt;
  &lt;li&gt;If you’re not sure you’re building the right thing, start with Outcome-Focused Delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Remember, ASD isn’t about replacing your existing practices overnight. It’s about gradually evolving them to leverage AI capabilities and augment the human expertise that makes your team unique.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Agentic Software Delivery represents the next evolution in how we build and deliver software. By following these eight principles and leveraging platforms like GitHub that support both human and AI collaboration, you can create a delivery system that’s faster, smarter, and more focused on what really matters: delivering value to your business and users.&lt;/p&gt;

&lt;p&gt;Happy delivering!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="devops" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2025/08/agents-humans.png" /><media:content medium="image" url="https://github.com/assets/images/2025/08/agents-humans.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Self-Healing DevOps with Copilot and Actions</title><link href="https://github.com/self-healing-devops-with-copilot-and-actions/" rel="alternate" type="text/html" title="Self-Healing DevOps with Copilot and Actions" /><published>2025-08-08T09:00:00+00:00</published><updated>2025-08-08T09:00:00+00:00</updated><id>https://github.com/self-healing-devops-with-copilot-and-actions</id><content type="html" xml:base="https://github.com/self-healing-devops-with-copilot-and-actions/">&lt;ol id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#watch-it-in-action&quot; id=&quot;markdown-toc-watch-it-in-action&quot;&gt;Watch it in action&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#why-this-matters&quot; id=&quot;markdown-toc-why-this-matters&quot;&gt;Why this matters&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#prerequisites&quot; id=&quot;markdown-toc-prerequisites&quot;&gt;Prerequisites&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#how-it-works&quot; id=&quot;markdown-toc-how-it-works&quot;&gt;How it works&lt;/a&gt;    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#categories-and-decisions&quot; id=&quot;markdown-toc-categories-and-decisions&quot;&gt;Categories and decisions&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#auto-analyze-build-failures-workflow&quot; id=&quot;markdown-toc-auto-analyze-build-failures-workflow&quot;&gt;Auto Analyze Build Failures Workflow&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-analysis-prompt&quot; id=&quot;markdown-toc-the-analysis-prompt&quot;&gt;The analysis prompt&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
  &lt;p&gt;Photo generated in ChatGPT&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;watch-it-in-action&quot;&gt;Watch it in action&lt;/h2&gt;

&lt;p&gt;Here’s a short video showing the self-healing loop analyzing a failed run, classifying the issue, and opening a remediation issue which Copilot Coding Agent immediately fixes!&lt;/p&gt;

&lt;div style=&quot;position:relative;padding-bottom:56.25%;height:0;overflow:hidden;max-width:100%;&quot;&gt;
  &lt;iframe class=&quot;center-image&quot; src=&quot;https://www.youtube-nocookie.com/embed/N53ddmqmABg?rel=0&amp;amp;modestbranding=1&quot; title=&quot;Self-healing DevOps demo&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot; style=&quot;position:absolute;top:0;left:0;width:100%;height:100%;&quot; loading=&quot;lazy&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;
&lt;p class=&quot;figcaption&quot;&gt;Watch the self-healing loop analyze a failed run, open a remediation issue and fix it.&lt;/p&gt;

&lt;h2 id=&quot;why-this-matters&quot;&gt;Why this matters&lt;/h2&gt;

&lt;p&gt;Build failures can be expensive. They interrupt flow, burn time on triage, and often hide what actually needs fixing. Most failures fall into a few categories: transient network issues, dependency problems, broken code, configuration issues, or test failures.&lt;/p&gt;

&lt;p&gt;What if your pipeline could diagnose itself - and even propose a fix?&lt;/p&gt;

&lt;p&gt;In this post, I’ll show you how to wire up a self-healing loop in GitHub Actions using GitHub Copilot (via the GitHub Models API, wrapped by the &lt;a href=&quot;https://github.com/actions/ai-inference&quot;&gt;actions/ai-inference&lt;/a&gt; action) to:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Analyze failed workflow runs automatically&lt;/li&gt;
  &lt;li&gt;Classify the failure with a short summary and a clear plan&lt;/li&gt;
  &lt;li&gt;Skip transient failures&lt;/li&gt;
  &lt;li&gt;Open a labeled remediation issue for non-transient failures&lt;/li&gt;
  &lt;li&gt;Assign the issue to Copilot Coding Agent (CCA) if code work is needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is simple: shrink time-to-understanding and time-to-action automatically.&lt;/p&gt;

&lt;h2 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Workflows built with GitHub Actions&lt;/li&gt;
  &lt;li&gt;A Personal Access Token (PAT) with the following permissions:
    &lt;ul&gt;
      &lt;li&gt;Account-level: Models (read)&lt;/li&gt;
      &lt;li&gt;Org-level: Issues (read/write), Actions (read)&lt;/li&gt;
      &lt;li&gt;This PAT should be saved as a repo secret like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AUTO_REMEDIATION_PAT&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: The workflow requires the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;models: read&lt;/code&gt; permission and uses the repo &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GITHUB_TOKEN&lt;/code&gt; for model inference, plus a PAT for any assignment actions that the default token cannot perform.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;how-it-works&quot;&gt;How it works&lt;/h2&gt;

&lt;p&gt;At the heart of the solution are two pieces:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;A reusable prompt that tells the model exactly how to analyze logs and specifies a JSON schema for a structured output.&lt;/li&gt;
  &lt;li&gt;A workflow that triggers on failed runs, calls the inference task, parses the JSON, and then creates or updates a remediation issue with labels. For certain categories, it assigns the issue to Copilot.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;categories-and-decisions&quot;&gt;Categories and decisions&lt;/h3&gt;

&lt;p&gt;The prompt (outlined below) constrains responses to a strict JSON schema and a small set of categories like code, test, config, dependency, infrastructure or quality. The category can be used to guide the next steps.&lt;/p&gt;

&lt;p&gt;If the analysis marks &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transient: true&lt;/code&gt;, we log and stop. If &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt;, we open or update a remediation issue with labels like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;auto-remediation&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;workflow:&amp;lt;name&amp;gt;&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;category:&amp;lt;type&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;auto-analyze-build-failures-workflow&quot;&gt;Auto Analyze Build Failures Workflow&lt;/h2&gt;

&lt;p&gt;This workflow triggers on every completed run and filters to failed conclusions (excluding itself). It then:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;actions/ai-inference@v1&lt;/code&gt; with the prompt file and GitHub MCP enabled so the model can fetch failed job logs using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;github-mcp-server&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Parses the JSON result (category, summary, plan, transient).&lt;/li&gt;
  &lt;li&gt;Stops if the failure is deemed to be transient. Otherwise, ensures labels exist, creates a remediation issue, and assigns Copilot when appropriate.&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&quot;language-yml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
&lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Auto Analyze Build Failures&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;workflow_run&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;workflows&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;*&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;completed&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;permissions&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;contents&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;read&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;actions&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;write&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;issues&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;write&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;pull-requests&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;read&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;models&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;read&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;jobs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;analyze-failure&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;runs-on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ github.event.workflow_run.conclusion == &apos;failure&apos; &amp;amp;&amp;amp; github.event.workflow_run.name != &apos;Auto Analyze Build Failures&apos; }}&lt;/span&gt;
    
    &lt;span class=&quot;na&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Checkout repository&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;actions/checkout@v4&lt;/span&gt;

    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Analyze build failure&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;analyze&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;actions/ai-inference@v1&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;prompt-file&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.github/models/failed-run-analyze.prompt.yml&apos;&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;enable-github-mcp&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;token&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;github-mcp-token&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.AUTO_REMEDIATION_PAT }}&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;max-tokens&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10000&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;repo: ${{ github.event.repository.name }}&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;owner: ${{ github.event.repository.owner.login }}&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;workflow_run_id: ${{ github.event.workflow_run.id }}&lt;/span&gt;

    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Parse results&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;parse&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;actions/github-script@v7&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;RESPONSE_JSON&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ steps.analyze.outputs.response }}&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;const responseString = process.env.RESPONSE_JSON;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;if (!responseString) {&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;core.setFailed(&apos;No response received from analysis step&apos;);&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;return;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;const responseJSON = JSON.parse(responseString);&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;core.setOutput(&apos;category&apos;, responseJSON.category || &apos;&apos;);&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;core.setOutput(&apos;summary&apos;, responseJSON.summary || &apos;&apos;);&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;core.setOutput(&apos;plan&apos;, responseJSON.plan || &apos;&apos;);&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;core.setOutput(&apos;transient&apos;, responseJSON.transient || &apos;false&apos;);&lt;/span&gt;

    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Check for existing remediation issue&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ steps.parse.outputs.transient == &apos;false&apos; }}&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;check-issue&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;workflow_name=&quot;${{ github.event.workflow_run.name }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;existing_issue=$(gh issue list \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--repo &quot;${{ github.repository }}&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--state open \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--label &quot;workflow:$workflow_name&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--label &quot;auto-remediation&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--json number \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--jq &apos;.[0].number&apos;)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;echo &quot;existing_issue=$existing_issue&quot; &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;GH_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;

    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Create remediation issue&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;create-issue&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ steps.parse.outputs.transient == &apos;false&apos; }}&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;workflow_name=&quot;${{ github.event.workflow_run.name }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;workflow_url=&quot;${{ github.event.workflow_run.html_url }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;category=&quot;${{ steps.parse.outputs.category }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;existing_issue=&quot;${{ steps.check-issue.outputs.existing_issue }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;if [[ -n &quot;$existing_issue&quot; ]]; then&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;echo &quot;Skipping issue creation - existing issue #$existing_issue found&quot;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;exit 0&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;fi&lt;/span&gt;

        &lt;span class=&quot;s&quot;&gt;issue_body=$(cat &amp;lt;&amp;lt; EOF&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;## Build Failure Analysis&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;**Workflow:** [$workflow_name]($workflow_url)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;**Run ID:** ${{ github.event.workflow_run.id }}&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;**Category:** $category&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;**Branch:** ${{ github.event.workflow_run.head_branch }}&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;**Commit:** ${{ github.event.workflow_run.head_sha }}&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;### Summary&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;${{ steps.parse.outputs.summary }}&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;### Remediation Plan&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;${{ steps.parse.outputs.plan }}&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;### Links&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;- [Failed Workflow Run]($workflow_url)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;- [Repository](${{ github.event.repository.html_url }})&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;---&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;*This issue was automatically created by the build failure analysis system.*&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;EOF&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;)&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;# Ensure required labels exist (idempotent)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;gh label create &quot;auto-remediation&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--description &quot;Issues automatically created by build failure analysis&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--color &quot;FF6B6B&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--repo &quot;${{ github.repository }}&quot; || true&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;gh label create &quot;workflow:$workflow_name&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--description &quot;Issues related to $workflow_name workflow&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--color &quot;0052CC&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--repo &quot;${{ github.repository }}&quot; || true&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;gh label create &quot;category:$category&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--description &quot;Issues categorized as $category&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--color &quot;7057ff&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--repo &quot;${{ github.repository }}&quot; || true&lt;/span&gt;
        
        &lt;span class=&quot;s&quot;&gt;issue_url=$(gh issue create \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--repo &quot;${{ github.repository }}&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--title &quot;🔧 Auto-Remediation: $workflow_name Build Failure&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--body &quot;$issue_body&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--label &quot;auto-remediation&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--label &quot;workflow:$workflow_name&quot; \&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;--label &quot;category:$category&quot;)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;issue_number=$(echo &quot;$issue_url&quot; | sed &apos;s/.*\/issues\///&apos;)&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;echo &quot;issue_number=$issue_number&quot; &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;GH_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;

    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Assign issue to Copilot (optional)&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ steps.parse.outputs.transient == &apos;false&apos; &amp;amp;&amp;amp; steps.create-issue.outputs.issue_number != &apos;&apos; }}&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;category=&quot;${{ steps.parse.outputs.category }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;issue_number=&quot;${{ steps.create-issue.outputs.issue_number }}&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;if [[ &quot;$category&quot; == &quot;code&quot; || &quot;$category&quot; == &quot;test&quot; || &quot;$category&quot; == &quot;config&quot; ]]; then&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;echo &quot;Assigning issue #$issue_number to Copilot&quot;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;# Query suggested actors to find Copilot agent id&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;copilot_query=&apos;query {&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;repository(owner: &quot;${{ github.event.repository.owner.login }}&quot;, name: &quot;${{ github.event.repository.name }}&quot;) {&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;suggestedActors(capabilities: [CAN_BE_ASSIGNED], first: 100) {&lt;/span&gt;
                &lt;span class=&quot;s&quot;&gt;nodes { login __typename ... on Bot { id } }&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;}&apos;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;copilot_response=$(gh api graphql -f query=&quot;$copilot_query&quot;)&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;copilot_id=$(echo &quot;$copilot_response&quot; | jq -r &apos;.data.repository.suggestedActors.nodes[] | select(.login == &quot;copilot-swe-agent&quot;) | .id&apos;)&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;if [[ -n &quot;$copilot_id&quot; &amp;amp;&amp;amp; &quot;$copilot_id&quot; != &quot;null&quot; ]]; then&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;issue_query=&apos;query { repository(owner: &quot;${{ github.event.repository.owner.login }}&quot;, name: &quot;${{ github.event.repository.name }}&quot;) { issue(number: &apos;$issue_number&apos;) { id } } }&apos;&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;issue_response=$(gh api graphql -f query=&quot;$issue_query&quot;)&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;issue_id=$(echo &quot;$issue_response&quot; | jq -r &apos;.data.repository.issue.id&apos;)&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;assign_mutation=&apos;mutation { replaceActorsForAssignable(input: {assignableId: &quot;&apos;$issue_id&apos;&quot;, actorIds: [&quot;&apos;$copilot_id&apos;&quot;]}) { assignable { ... on Issue { id } } } }&apos;&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;gh api graphql -f query=&quot;$assign_mutation&quot; &amp;gt;/dev/null 2&amp;gt;&amp;amp;1 || echo &quot;Copilot assignment failed&quot;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;else&lt;/span&gt;
            &lt;span class=&quot;s&quot;&gt;echo &quot;Copilot agent not available in this repository&quot;&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;fi&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;else&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;echo &quot;Category &apos;$category&apos; does not require Copilot assignment&quot;&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;fi&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;GH_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.AUTO_REMEDIATION_PAT }}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Line 1: Sets a clear workflow name to reference in labels and filters.&lt;/li&gt;
  &lt;li&gt;Lines 4-7: Triggers on all completed workflow runs; we’ll filter to failures in the job-level condition.&lt;/li&gt;
  &lt;li&gt;Lines 9-14: Grants least-privilege permissions; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;models: read&lt;/code&gt; is required for inference, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;issues: write&lt;/code&gt; for creating remediation issues.&lt;/li&gt;
  &lt;li&gt;Lines 17-18: Uses an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt; to run only when a workflow failed and to avoid analyzing itself.&lt;/li&gt;
  &lt;li&gt;Lines 21-22: Checks out the repository so the prompt file is available in the workspace.&lt;/li&gt;
  &lt;li&gt;Lines 25-35: Calls actions/ai-inference with the prompt and enables GitHub MCP so the model can fetch the failed job logs itself; passes both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GITHUB_TOKEN&lt;/code&gt; and a PAT for MCP actions that need expanded scopes, and supplies &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;repo&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;owner&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;workflow_run_id&lt;/code&gt; as prompt inputs.&lt;/li&gt;
  &lt;li&gt;Lines 38-51: Parses the JSON response into outputs (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;category&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;summary&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plan&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transient&lt;/code&gt;) for downstream steps; fails early if there’s no response.&lt;/li&gt;
  &lt;li&gt;Lines 54-66: Uses the gh CLI to detect an existing open remediation issue by labels to prevent duplicates; gated on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transient == &apos;false&apos;&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Lines 69-115: Creates labels idempotently, opens a structured remediation issue, and captures the new issue number.&lt;/li&gt;
  &lt;li&gt;Lines 118-146: Optionally assigns the issue to the Copilot Coding Agent for code-like categories by querying GraphQL for the agent id and replacing assignees.&lt;/li&gt;
&lt;/ul&gt;
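
&lt;p&gt;The assignment step builds its GraphQL queries by splicing shell variables into the query string. As a possible alternative, here is a minimal sketch of the issue-id lookup that passes the values as GraphQL variables instead, using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-f&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-F&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--jq&lt;/code&gt; flags of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gh api&lt;/code&gt;. The step name, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;issue-node&lt;/code&gt; id and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OWNER&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REPO&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ISSUE_NUMBER&lt;/code&gt; variables are hypothetical names chosen for illustration; they are not part of the workflow above.&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
    # Illustrative sketch only: step name, id, and env variable names are hypothetical.
    - name: Look up issue node id (GraphQL variables)
      id: issue-node
      run: |
        # -f sends string fields; -F sends typed fields, so the issue number arrives as an Int
        issue_id=$(gh api graphql \
          -f owner=&quot;$OWNER&quot; \
          -f name=&quot;$REPO&quot; \
          -F number=&quot;$ISSUE_NUMBER&quot; \
          -f query=&apos;
            query($owner: String!, $name: String!, $number: Int!) {
              repository(owner: $owner, name: $name) {
                issue(number: $number) { id }
              }
            }&apos; \
          --jq &apos;.data.repository.issue.id&apos;)
        echo &quot;issue_id=$issue_id&quot; &amp;gt;&amp;gt; &quot;$GITHUB_OUTPUT&quot;
      env:
        GH_TOKEN: ${{ secrets.AUTO_REMEDIATION_PAT }}
        OWNER: ${{ github.event.repository.owner.login }}
        REPO: ${{ github.event.repository.name }}
        ISSUE_NUMBER: ${{ steps.create-issue.outputs.issue_number }}

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Passing values as variables keeps the query text constant and avoids the nested quoting, at the cost of a few extra flags. The same pattern should apply to the mutation, but treat this as a starting point rather than a drop-in replacement.&lt;/p&gt;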

&lt;h2 id=&quot;the-analysis-prompt&quot;&gt;The analysis prompt&lt;/h2&gt;

&lt;p&gt;The prompt is a small contract: constrain the task, provide examples, and require a strict JSON response with a known schema. That keeps downstream steps simple and reliable:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
&lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Failed Run Autofix&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;openai/gpt-4.1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;messages&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;role&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;system&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;You are an expert DevOps engineer and build master with deep knowledge of&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;CI/CD pipelines, build systems, and software development workflows. You&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;specialize in analyzing build failures and providing actionable&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;remediation plans for the OctoCAT Supply Chain Management application.&lt;/span&gt;

      &lt;span class=&quot;s&quot;&gt;Your task is to analyze failed CI/CD pipeline runs and determine:&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;1. Whether the failure is transient (network issues, temporary service&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;outages, etc.)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;2. The root cause and category of the failure&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;3. A detailed plan for fixing the issue&lt;/span&gt;

      &lt;span class=&quot;s&quot;&gt;When analyzing logs, look for:&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Compilation errors (TypeScript, build tool issues)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Test failures (unit tests, integration tests)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Dependency issues (missing packages, version conflicts)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Configuration problems (environment variables, config files)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Infrastructure issues (network timeouts, service unavailability)&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;- Code quality issues (linting, formatting)&lt;/span&gt;

      &lt;span class=&quot;s&quot;&gt;Provide your analysis in the specified JSON format with clear, actionable&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;recommendations.&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;role&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;user&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;## CI/CD Pipeline Failure Logs&lt;/span&gt;

      &lt;span class=&quot;s&quot;&gt;Use the GitHub MCP `get_job_logs` tool to retrieve failed job logs from&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;the specified workflow run.&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;- Repo: {{repo}}&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;- Owner: {{owner}}&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;- Workflow Run ID: {{workflow_run_id}}&lt;/span&gt;

      &lt;span class=&quot;s&quot;&gt;Analyze the failure logs and provide a comprehensive assessment including&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;whether this is a transient failure, or if it requires code changes.&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;If code changes are needed, provide a detailed remediation plan. Keep the&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;summary short and DO NOT include the entire build log.&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;responseFormat&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;json_schema&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;jsonSchema&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;name&quot;: &quot;failure_analysis&quot;,&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;strict&quot;: true,&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;schema&quot;: {&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;type&quot;: &quot;object&quot;,&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;properties&quot;: {&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;transient&quot;: { &quot;type&quot;: &quot;boolean&quot; },&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;summary&quot;: { &quot;type&quot;: &quot;string&quot; },&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;plan&quot;: { &quot;type&quot;: &quot;string&quot; },&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;category&quot;: {&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;&quot;type&quot;: &quot;string&quot;,&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;&quot;enum&quot;: [&quot;code&quot;,&quot;test&quot;,&quot;config&quot;,&quot;dependency&quot;,&quot;infrastructure&quot;,&quot;quality&quot;,&quot;repeat-transient&quot;]&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;},&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;required&quot;: [&quot;transient&quot;,&quot;summary&quot;,&quot;plan&quot;,&quot;category&quot;],&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;additionalProperties&quot;: false&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Line 1: Human-friendly prompt name for traceability in logs.&lt;/li&gt;
  &lt;li&gt;Line 2: Sets the target model; pick a capable model that supports JSON-schema response formats.&lt;/li&gt;
  &lt;li&gt;Lines 3-26: System message establishes expertise, scope, and expected behavior, focusing on CI/CD analysis.&lt;/li&gt;
  &lt;li&gt;Lines 27-40: User message instructs the model to use GitHub MCP to fetch logs and defines the inputs the workflow passes.&lt;/li&gt;
  &lt;li&gt;Line 41: Forces a strict JSON response to simplify parsing downstream.&lt;/li&gt;
  &lt;li&gt;Lines 42-60: Defines a compact JSON schema with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transient&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;summary&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plan&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;category&lt;/code&gt;, with an enum to keep categories consistent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In general, you’ll want to keep both the prompt and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jsonSchema&lt;/code&gt; small and strict: your downstream logic gets much simpler when the response is tightly constrained. The ai-inference action can also error out if the run involves too many steps or too much content, so watch for this.&lt;/p&gt;
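
&lt;p&gt;To make the “tightly constrained” point concrete, here is a minimal sketch of a guard step that re-checks the response shape with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jq&lt;/code&gt; before anything acts on it. The strict schema should already guarantee this shape, so treat it as a belt-and-braces check. It assumes the inference step has the id &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;analyze&lt;/code&gt; and exposes the raw JSON as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;response&lt;/code&gt; output, and that the runner has jq 1.6 or later (for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IN&lt;/code&gt;); adjust those names to match your workflow.&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
    # Illustrative sketch only: the step id &quot;analyze&quot; and the RESPONSE variable are assumptions.
    - name: Validate analysis response
      run: |
        # jq -e exits non-zero when the expression is false or the input is not valid JSON
        if ! printf &apos;%s&apos; &quot;$RESPONSE&quot; | jq -e &apos;
          (.transient | type == &quot;boolean&quot;) and
          (.summary | type == &quot;string&quot;) and
          (.plan | type == &quot;string&quot;) and
          (.category | IN(&quot;code&quot;, &quot;test&quot;, &quot;config&quot;, &quot;dependency&quot;,
                          &quot;infrastructure&quot;, &quot;quality&quot;, &quot;repeat-transient&quot;))
        &apos; &amp;gt; /dev/null; then
          echo &quot;Response does not match the expected schema&quot;
          exit 1
        fi
      env:
        RESPONSE: ${{ steps.analyze.outputs.response }}

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because the category list mirrors the enum in the prompt file, the two need to be updated together if you add or rename a category.&lt;/p&gt;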

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;You can turn noisy failures into crisp, actionable work in minutes. The combination of GitHub Actions, Copilot API, and Copilot Coding Agent gives you a lightweight, reliable self-healing loop: automatic analysis, clear categorization, and immediate remediation tracking. Happy healing!&lt;/p&gt;</content><author><name>Colin Dembovsky</name></author><category term="ai" /><category term="actions" /><summary type="html"></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://github.com/assets/images/2025/08/failed-image.png" /><media:content medium="image" url="https://github.com/assets/images/2025/08/failed-image.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>