GitHub Copilot Developer Training — Advanced Topics

Duration: ~3 hours 40 minutes (220 min)
Format: Presentation + Live Demo + Hands-On
Audience: Developers extending GitHub Copilot with integrations, evaluation frameworks, and diagnostic tools
Focus: Extensions, MCP, evaluating AI output, and troubleshooting Copilot
Repo: microsoft/GitHubCopilot_Customized (OctoCAT Supply)

Part of the Copilot Developer Training curriculum (Foundations · Agentic Patterns · Advanced Topics). This module can be delivered standalone.


Workshop Overview

This module covers the advanced topics that round out a developer's Copilot proficiency: extending Copilot with third-party integrations and MCP servers, building evaluation frameworks to measure AI output quality, and mastering the diagnostic tools needed when things go wrong.

Learning Objectives

Prerequisites / What You Should Know

If you're taking this module standalone (without Modules 1–2), you should be comfortable with these concepts:

Concept | Quick Summary
Chat Modes | Ask (read-only), Agent (edits files), Plan (proposes changes)
Custom Instructions | .github/copilot-instructions.md (always-on) + file-targeted .github/instructions/*.instructions.md
Custom Agents | .github/agents/*.md: specialized personas with tool and model selection
Agentic Loops | Agent mode iterates: plan → act → observe → reflect
Agent Patterns | Single-agent/multi-skill (Copilot Agent mode), multi-agent (specialized agents for different domains)
Context Window | Finite token budget; instructions, context, history, and output all compete for space

If you attended Modules 1–2, skip this section — you've covered all of the above.

Prerequisites

Requirement | Details
GitHub Account | With Copilot Pro, Business, or Enterprise license
VS Code | Latest stable (or Insiders for preview features)
Copilot Extension | GitHub Copilot + GitHub Copilot Chat extensions installed
Node.js | Version 18 or higher
Git | For cloning the demo repository

Session Agenda

Section | Topic | Time
Session 6 | Extensions & MCP | 60 min
6.1 | Opening & AI Safety: "Third-Party Trust" | 5 min
6.2 | VS Code Chat Participants | 10 min
6.3 | GitHub Copilot Extensions | 15 min
6.4 | MCP Architecture | 15 min
6.5 | MCP Configuration | 10 min
6.6 | Session 6 Summary & Discussion | 5 min
☕ | Break | 10 min
Session 7 | Evaluating Agentic Output | 80 min
7.1 | Opening & AI Safety: "When to Trust, When to Verify" | 5 min
7.2 | Defining Success Criteria | 15 min
7.3 | Output Quality Rubrics | 15 min
7.4 | Evaluation Methods | 15 min
7.5 | Tracking & Improvement | 15 min
7.6 | Usage, Billing & Cost Strategies | 10 min
7.7 | Session 7 Summary & Discussion | 5 min
☕ | Break | 10 min
Session 8 | Troubleshooting & Diagnostics | 60 min
8.1 | Opening & AI Safety: "Debugging the Black Box" | 5 min
8.2 | Output Log Channels | 15 min
8.3 | Chat Debug Mode | 15 min
8.4 | Agent Debug Logs | 15 min
8.5 | Diagnostics Collection & Curriculum Wrap-Up | 10 min

Total: ~220 min (~3h 40min) including breaks


6.1. Opening & AI Safety: "Third-Party Trust" (5 min)

Key Points

Trust Evaluation Checklist

Question | What to Check
Who built it? | Verified publisher, GitHub organization, or known vendor
What does it access? | File system, network, terminal, API tokens
What can it do? | Read-only vs. write access; can it execute commands?
Is it maintained? | Recent updates, active issue tracker, responsive maintainer
Is it scoped? | Does it request minimum necessary permissions?

Discussion Points


6.2. VS Code Chat Participants (10 min)

Key Points

Built-in Participants

Participant | What It Accesses | Capabilities
@workspace | File tree, file contents, symbols, dependencies | Project structure, cross-file search, symbol lookup
@vscode | Settings, extensions, keybindings, commands | Editor configuration, extension recommendations
@terminal | Recent terminal output, command history | Error diagnosis, command suggestions

How Participants Work Under the Hood

  1. You type @workspace How is auth implemented?
  2. VS Code routes the message to the @workspace participant
  3. The participant searches your project (file tree, symbols, content)
  4. Relevant code snippets and file references are injected into the context
  5. The augmented context is sent to the model for response
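
The routing in steps 1 and 2 can be pictured with a small sketch. This is an illustration only, not VS Code's implementation: parseMention and KNOWN_PARTICIPANTS are hypothetical names, and the real routing logic is internal to the editor.

```typescript
// Hypothetical sketch: split a chat message into a participant and a prompt.
// It only mirrors steps 1-2 above; it is not how VS Code is implemented.
const KNOWN_PARTICIPANTS = ["workspace", "vscode", "terminal"] as const;
type Participant = (typeof KNOWN_PARTICIPANTS)[number];

interface RoutedMessage {
  participant: Participant | null; // null means "no participant, default chat"
  prompt: string;
}

function parseMention(message: string): RoutedMessage {
  const trimmed = message.trim();
  const match = /^@(\w+)\s+([\s\S]*)$/.exec(trimmed);
  if (match === null) {
    return { participant: null, prompt: trimmed };
  }
  const mentioned = match[1] ?? "";
  const rest = match[2] ?? "";
  const known = KNOWN_PARTICIPANTS.find((p) => p === mentioned);
  // Unknown @names fall through to default chat with the full text preserved.
  return known !== undefined
    ? { participant: known, prompt: rest }
    : { participant: null, prompt: trimmed };
}

const routed = parseMention("@workspace How is auth implemented?");
console.log(routed.participant, "|", routed.prompt);
```

In this sketch the participant would then gather its own context (steps 3 and 4) before anything is sent to the model.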

🖥️ Demo: Participant Capabilities

  1. @workspace — Ask "What is the database schema for orders?" — show it finding the schema across files
  2. @vscode — Ask "How do I configure format on save?" — show it referencing VS Code settings
  3. @terminal — After a failed npm test, ask "Why did the test fail?" — show it reading terminal output

Discussion Points


6.3. GitHub Copilot Extensions (15 min)

Key Points

Extension Capabilities

Capability | Description | Example
Chat responses | Answer questions with domain knowledge | @docker How do I optimize this Dockerfile?
Code actions | Generate or modify code based on external context | @azure Generate a Bicep template for this architecture
Tool invocation | Call external APIs and return results | @sentry What are the top errors this week?

Marketplace Discovery

🖥️ Demo: Using a Copilot Extension

  1. Show the Copilot Extensions marketplace
  2. Install an extension (e.g., Docker, GitHub Models, or Azure)
  3. Use it in chat: @extension-name [question]
  4. Show how the response includes domain-specific knowledge beyond Copilot's base training

Discussion Points


6.4. MCP Architecture (15 min)

Key Points

MCP Architecture

┌─────────────────┐
│     VS Code     │  ← HOST: manages connections
│    (Host)       │
├─────────────────┤
│   Copilot       │  ← CLIENT: sends requests to servers
│   (Client)      │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌────────┐ ┌────────┐
│ MCP    │ │ MCP    │  ← SERVERS: provide tools, resources, prompts
│Server A│ │Server B│
│(GitHub)│ │(File   │
│        │ │System) │
└────────┘ └────────┘

MCP Capabilities

Capability | Description | Example
Tools | Functions the model can call | create_issue, run_query, deploy_app
Resources | Data the model can read | Database schemas, API specs, documentation
Prompts | Reusable prompt templates | "Summarize this PR", "Review for security"
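
MCP messages are JSON-RPC 2.0. As a rough illustration of how a client asks a server to run a tool, the sketch below builds a tools/call request. The tool name create_issue comes from the table above; the argument names (repo, title), the sample values, and the helper buildToolCall are assumptions for illustration, since each real server publishes its own input schema.

```typescript
// Minimal sketch of an MCP "tools/call" request envelope (JSON-RPC 2.0).
// buildToolCall is a hypothetical helper, not part of any SDK.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++, // each request gets a fresh id so responses can be matched
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Argument names here are assumed; a real server defines them in its tool schema.
const issueCall = buildToolCall("create_issue", { repo: "example/repo", title: "Fix cart total" });
console.log(JSON.stringify(issueCall, null, 2));
```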

Transport Types

Transport | How It Works | Best For
stdio | Runs as a local process, communicates via stdin/stdout | Local tools, file system, CLIs
SSE | Server-sent events over HTTP (legacy; superseded by Streamable HTTP in the current MCP specification) | Remote servers, web services
Streamable HTTP | HTTP with streaming responses | Modern remote servers

MCP vs. Copilot Extensions

Aspect | MCP Servers | Copilot Extensions
Standard | Open protocol (any client) | GitHub-specific
Installation | Configure in .vscode/mcp.json | Install from GitHub Marketplace
Runs where | Locally or on a remote server | The publisher's servers, reached through GitHub
Access scope | What you configure | What the app requests
Best for | Custom internal tools | Polished third-party integrations

Discussion Points


6.5. MCP Configuration (10 min)

Key Points

Configuration Syntax

{
  "inputs": [
    {
      "type": "promptString",
      "id": "github-token",
      "description": "GitHub Personal Access Token",
      "password": true
    }
  ],
  "servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${input:github-token}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./src"]
    }
  }
}

Security Considerations

Concern | Mitigation
Token exposure | Use ${input:name} for secrets; VS Code prompts at runtime, so the value is never stored in the file
File access scope | Limit filesystem servers to specific directories
Network access | Review what remote servers the MCP server contacts
Command execution | Only configure servers from trusted sources
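
The token-exposure row lends itself to an automated check. The sketch below is one possible approach, not an existing tool: findHardCodedSecrets, the key-name pattern, and the sample "internal" server are all hypothetical. It flags env entries whose name looks secret-like but whose value is a literal rather than a ${input:...} placeholder.

```typescript
// Hypothetical review helper: flag env values in an MCP config that look like
// literal secrets instead of ${input:...} placeholders.
interface McpServer {
  command?: string;
  args?: string[];
  env?: Record<string, string>;
}
interface McpConfig {
  servers: Record<string, McpServer>;
}

const SECRET_KEY_PATTERN = /(TOKEN|SECRET|PASSWORD|KEY)/i; // assumed naming heuristic
const INPUT_PLACEHOLDER = /^\$\{input:[\w-]+\}$/;

function findHardCodedSecrets(config: McpConfig): string[] {
  const findings: string[] = [];
  for (const [serverName, server] of Object.entries(config.servers)) {
    const env: Record<string, string> = server.env ?? {};
    for (const [key, value] of Object.entries(env)) {
      if (SECRET_KEY_PATTERN.test(key) && !INPUT_PLACEHOLDER.test(value)) {
        findings.push(`${serverName}.env.${key}`);
      }
    }
  }
  return findings;
}

const sampleConfig: McpConfig = {
  servers: {
    github: { command: "npx", env: { GITHUB_PERSONAL_ACCESS_TOKEN: "${input:github-token}" } },
    internal: { command: "node", env: { API_KEY: "abc123" } }, // made-up bad example
  },
};
console.log(findHardCodedSecrets(sampleConfig)); // flags only internal.env.API_KEY
```

A check like this could run in code review or CI so that a pasted token never reaches the repository.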

🖥️ Demo: MCP Server Setup

  1. Create .vscode/mcp.json with a filesystem server pointing to the api/ directory
  2. Open chat — show the MCP tools appearing in the tool list
  3. Ask Copilot to use the filesystem tool: "List all TypeScript files in the API directory"
  4. Add a GitHub MCP server — show it pulling issue data into the chat context

Discussion Points


6.6. Session 6 Summary & Discussion (5 min)

Key Takeaways

Discussion Points


☕ Break — 10 Minutes


7.1. Opening & AI Safety: "When to Trust, When to Verify" (5 min)

Key Points

Risk-Based Verification

Risk Level | Code Type | Verification Level
🟢 Low | Boilerplate, scaffolding, docs | Quick scan
🟡 Medium | Business logic, API routes | Code review + tests
🔴 High | Auth, security, financial, infra | Deep review + security scan + tests

Discussion Points


7.2. Defining Success Criteria (15 min)

Key Points

Success Criteria by Task Type

Task Type | Correctness | Style | Performance | Security
Bug fix | Fix resolves the issue; no regressions | Matches existing code style | No performance degradation | No new vulnerabilities
New feature | Meets requirements; handles edge cases | Follows project patterns | Acceptable response time | Input validation, auth checks
Refactoring | Same behavior; tests still pass | Improved readability | Equal or better performance | No security regression
Test generation | Tests are meaningful; cover edge cases | Consistent test structure | Tests run in < 30s | No test data leaks

Defining Criteria Before Coding

Template:

Task: [what you're asking Copilot to do]
Success looks like:
  - Functional: [what the code should do]
  - Quality: [readability, patterns, style]
  - Constraints: [performance, security, compatibility]
  - Not acceptable: [what the code should NOT do]
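
One way to keep criteria consistent across a team is to hold them as data and render the prompt text from it. The sketch below assumes a hypothetical SuccessCriteria shape and renderPrompt helper that mirror the template above; the sample values reuse the pagination task from this section's demo.

```typescript
// Sketch: turn the criteria template into prompt text.
// SuccessCriteria and renderPrompt are hypothetical names for illustration.
interface SuccessCriteria {
  task: string;
  functional: string[];
  quality: string[];
  constraints: string[];
  notAcceptable: string[];
}

function renderPrompt(c: SuccessCriteria): string {
  // Empty sections are skipped so the prompt carries no blank bullets.
  const section = (label: string, items: string[]): string =>
    items.length > 0 ? `  - ${label}: ${items.join("; ")}` : "";
  return [
    `Task: ${c.task}`,
    "Success looks like:",
    section("Functional", c.functional),
    section("Quality", c.quality),
    section("Constraints", c.constraints),
    section("Not acceptable", c.notAcceptable),
  ]
    .filter((line) => line !== "")
    .join("\n");
}

const paginationCriteria: SuccessCriteria = {
  task: "Add pagination to the products API endpoint",
  functional: ["cursor-based", "max 100 items per page"],
  quality: ["match existing Express route patterns"],
  constraints: ["include Link headers"],
  notAcceptable: ["offset-based paging"],
};
console.log(renderPrompt(paginationCriteria));
```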

🖥️ Demo: Criteria-Driven Prompting

  1. Define success criteria for a task: "Add pagination to the products API endpoint"
  2. Include criteria in the prompt: "Must be cursor-based, max 100 items, no offset, include Link headers"
  3. Evaluate the output against the criteria — does it meet each one?
  4. Compare to asking without criteria: "Add pagination to products"
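
For reference when evaluating the demo output, here is a minimal sketch of what "cursor-based, max 100 items, Link headers" can look like. It is not the OctoCAT Supply implementation: Product, paginate, the in-memory list, and the /api/products path are assumptions, and the list is assumed to be sorted by ascending id.

```typescript
// Sketch of cursor-based pagination over an in-memory list sorted by id.
// All names and the URL path are illustrative assumptions.
interface Product { id: number; label: string }
interface Page { items: Product[]; nextCursor: number | null; linkHeader: string | null }

const MAX_PAGE_SIZE = 100;

function paginate(all: Product[], cursor: number | null, limit: number): Page {
  // Clamp the requested size to [1, MAX_PAGE_SIZE]; use the max on non-integer input.
  const size = Number.isInteger(limit) ? Math.min(Math.max(limit, 1), MAX_PAGE_SIZE) : MAX_PAGE_SIZE;
  // The cursor is the id of the last item already seen (null for the first page).
  const remaining = cursor === null ? all : all.filter((p) => p.id > cursor);
  const items = remaining.slice(0, size);
  const last = items[items.length - 1];
  const hasMore = remaining.length > items.length;
  const nextCursor = hasMore && last !== undefined ? last.id : null;
  // Link header in RFC 8288 style, emitted only when another page exists.
  const linkHeader =
    nextCursor !== null ? `</api/products?cursor=${nextCursor}&limit=${size}>; rel="next"` : null;
  return { items, nextCursor, linkHeader };
}

const catalog: Product[] = Array.from({ length: 250 }, (_, i) => ({ id: i + 1, label: `item-${i + 1}` }));
const firstPage = paginate(catalog, null, 500); // request above the cap
console.log(firstPage.items.length, firstPage.nextCursor, firstPage.linkHeader);
```

Unlike an offset, the cursor stays valid when rows are inserted ahead of the current position, which is one reason the criteria rule out offset paging.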

Discussion Points


7.3. Output Quality Rubrics (15 min)

Key Points

Rubric Template

Dimension | Score 1 (Poor) | Score 2 (Acceptable) | Score 3 (Good) | Score 4 (Excellent)
Correctness | Doesn't work | Works for happy path | Handles common edge cases | Handles all edge cases, robust
Completeness | Missing major parts | Core functionality present | Fully implements requirements | Exceeds requirements with thoughtful additions
Code Style | Inconsistent, unreadable | Mostly consistent | Follows project patterns | Clean, idiomatic, well-structured
Security | Has vulnerabilities | No obvious issues | Validates input, handles errors | Defense in depth, follows OWASP
Performance | Unacceptable | Adequate | Efficient | Optimized with appropriate trade-offs

Applying a Rubric

Example — Agent-generated pagination code:

Dimension | Score | Notes
Correctness | 3 | Works, handles empty results and large pages
Completeness | 2 | Missing Link header, no total count
Code Style | 3 | Matches existing Express route patterns
Security | 2 | No input validation on page size parameter
Performance | 3 | Uses cursor-based approach, efficient
Overall | 2.6 | Needs: input validation + Link headers
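
The overall score in the example is the unweighted mean of the five dimensions: (3 + 2 + 3 + 2 + 3) / 5 = 2.6. A small sketch of that arithmetic follows, assuming equal weights; the type and function names are hypothetical, and a team could substitute weighted scoring.

```typescript
// Sketch: compute the unweighted overall score from the example above
// and list the lowest-scoring dimensions to fix first.
type Dimension = "correctness" | "completeness" | "codeStyle" | "security" | "performance";
type RubricScores = Record<Dimension, 1 | 2 | 3 | 4>;

function overallScore(scores: RubricScores): number {
  const values = Object.values(scores) as number[];
  const total = values.reduce((sum, v) => sum + v, 0);
  return Math.round((total / values.length) * 10) / 10; // one decimal place
}

function weakestDimensions(scores: RubricScores): Dimension[] {
  const lowest = Math.min(...(Object.values(scores) as number[]));
  return (Object.keys(scores) as Dimension[]).filter((d) => scores[d] === lowest);
}

const paginationScores: RubricScores = {
  correctness: 3,
  completeness: 2,
  codeStyle: 3,
  security: 2,
  performance: 3,
};
console.log(overallScore(paginationScores)); // 2.6
console.log(weakestDimensions(paginationScores)); // completeness and security
```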

🖥️ Demo: Rubric Evaluation

  1. Ask Agent mode to add a search feature to the products API
  2. Score the output against the rubric dimensions
  3. Show where it falls short — ask Copilot to fix the specific weaknesses
  4. Re-score — show the improvement

Discussion Points


7.4. Evaluation Methods (15 min)

Key Points

Automated Evaluation

Check | What It Validates | Tool
Lint | Code style, syntax, unused variables | ESLint, Prettier
Type check | Type safety, interface compliance | TypeScript compiler
Tests | Functional correctness | Vitest, Jest
Build | Compilation, bundling | Vite, tsc
Security scan | Known vulnerabilities, SAST | CodeQL, Semgrep

Human-in-the-Loop Evaluation

Review Focus | What to Look For
Logic | Does the code actually solve the problem?
Architecture | Does it fit the existing patterns?
Edge cases | What happens with unexpected input?
Maintainability | Will another developer understand this in 6 months?
Over-engineering | Did the AI add unnecessary complexity?

The Verification Pipeline

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Agent   │───►│  Lint    │───►│  Type    │───►│  Test    │───►│  Human   │
│  Output  │    │  Check   │    │  Check   │    │  Suite   │    │  Review  │
└──────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
                   ▼ Fail          ▼ Fail          ▼ Fail          ▼ Reject
              Auto-reject     Auto-reject     Auto-reject     Send back
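
The gate sequence in the diagram can be sketched as a loop that stops at the first failure, so human review only sees output that already passed every automated check. The check functions below are deliberately naive string stubs standing in for ESLint, the TypeScript compiler, and a test runner; runPipeline and the Gate shape are hypothetical.

```typescript
// Sketch of the gate sequence in the diagram: run checks in order and stop at
// the first failure. The checks are stubs, not real lint/type/test runs.
interface GateResult { gate: string; passed: boolean }
interface Gate { name: string; check: (code: string) => boolean }

function runPipeline(code: string, gateList: Gate[]): GateResult[] {
  const results: GateResult[] = [];
  for (const gate of gateList) {
    const passed = gate.check(code);
    results.push({ gate: gate.name, passed });
    if (!passed) break; // auto-reject: later gates never run
  }
  return results;
}

// Stub checks; a real pipeline would shell out to the project's tooling.
const gates: Gate[] = [
  { name: "lint", check: (code) => !code.includes("var ") },
  { name: "typecheck", check: (code) => !code.includes(": any") },
  { name: "tests", check: (code) => code.includes("export") },
];

// Passes lint, fails the stub type check, so the tests gate is never reached.
console.log(runPipeline("export const total = (a: any) => a;", gates));
```

Ordering the cheapest checks first keeps feedback fast when the agent is iterating.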

Batch Evaluation

For systematic quality assessment across many tasks:

  1. Define a set of representative prompts (10-20 tasks)
  2. Run each through Copilot
  3. Score outputs against the rubric
  4. Track aggregate scores over time
  5. Identify patterns: which task types does Copilot handle well vs. poorly?

🖥️ Demo: Automated Evaluation Pipeline

  1. Generate code with Agent mode
  2. Run the lint + type check + test pipeline
  3. Show a failure — observe how the automated gate catches the issue
  4. Fix the issue and re-run — show the pipeline passing

Discussion Points


7.5. Tracking & Improvement (15 min)

Key Points

Metrics to Track

Metric | How to Measure | What It Tells You
First-pass acceptance rate | % of agent output accepted without changes | How well your instructions/prompts work
Iteration count | Number of follow-up prompts needed | Prompt quality and task complexity alignment
Rubric scores | Average scores per dimension over time | Where Copilot consistently under- or over-performs
Time savings | Time with Copilot vs. estimated manual time | ROI of AI-assisted development
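
The first two metrics are simple ratios once each task is logged. The sketch below assumes a hypothetical TaskRecord shape and uses made-up sample records purely to show the arithmetic; how the records get captured (spreadsheet, PR labels, a script) is left open.

```typescript
// Sketch: derive first-pass acceptance rate and mean iteration count from a
// task log. TaskRecord and the sample data are hypothetical.
interface TaskRecord { taskType: string; followUpPrompts: number; acceptedUnchanged: boolean }
interface Summary { firstPassRate: number; meanIterations: number }

function summarize(records: TaskRecord[]): Summary {
  if (records.length === 0) return { firstPassRate: 0, meanIterations: 0 };
  const acceptedCount = records.filter((r) => r.acceptedUnchanged).length;
  const followUps = records.reduce((sum, r) => sum + r.followUpPrompts, 0);
  return {
    firstPassRate: acceptedCount / records.length,
    meanIterations: followUps / records.length,
  };
}

// Illustrative records only; not measured data.
const taskLog: TaskRecord[] = [
  { taskType: "bug fix", followUpPrompts: 0, acceptedUnchanged: true },
  { taskType: "new feature", followUpPrompts: 3, acceptedUnchanged: false },
  { taskType: "refactoring", followUpPrompts: 1, acceptedUnchanged: false },
  { taskType: "test generation", followUpPrompts: 0, acceptedUnchanged: true },
];
console.log(summarize(taskLog)); // firstPassRate 0.5, meanIterations 1
```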

The Feedback Loop

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Generate │───►│ Evaluate │───►│ Identify │───►│ Improve  │
│ code     │    │ with     │    │ weakness │    │ prompts/ │
│          │    │ rubric   │    │          │    │ instruct │
└──────────┘    └──────────┘    └──────────┘    └────┬─────┘
     ▲                                               │
     └───────────────────────────────────────────────┘

Trust Calibration by Task Type

Task Type | Trust Level | Verification Effort
Boilerplate / scaffolding | High | Quick scan
CRUD endpoints | High | Test coverage check
Business logic | Medium | Code review + tests
Algorithm implementation | Medium-Low | Deep review + benchmarks
Security-critical code | Low | Full security review
Infrastructure / IaC | Low | Plan review + dry run

🖥️ Demo: Feedback Loop in Action

  1. Generate a utility function — score it (e.g., correctness: 3, style: 2)
  2. Identify the style weakness — add a specific instruction to copilot-instructions.md
  3. Regenerate the same function — show the improved style score
  4. Discuss how this scales across a team

Discussion Points


7.6. Usage, Billing & Cost Strategies (10 min)

Key Points

The Billing Transition

Aspect | Current (Premium Requests) | New (AI Credits, June 2026)
Unit | 1 request × model multiplier | Tokens × model price → AI credits
Free models | GPT-4.1, GPT-4o, GPT-5 mini (0x) | Completions + next edit unlimited; 1,900–3,900 credits/mo included
Pooling | Per-user monthly allowance | Pooled at billing entity level

Model Cost Tiers — Know What You're Spending

Tier | Models | Multiplier
Free (0x) | GPT-4.1, GPT-4o, GPT-5 mini, Raptor mini | No premium requests consumed
Budget (0.25–0.33x) | GPT-5.4 nano, Grok Code Fast, Claude Haiku 4.5, Gemini 3 Flash | Fraction of a request
Standard (1x) | Claude Sonnet 4/4.5/4.6, Gemini 2.5 Pro, GPT-5.2/5.4 | 1 request per prompt
Premium (3–7.5x) | Claude Opus 4.5/4.6 (3x), Claude Opus 4.7 (7.5x), GPT-5.5 (7.5x) | Multiple requests per prompt
Ultra (30x) | Claude Opus 4.6 fast mode | 30 requests per prompt
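
Under the current model, consumption is prompts multiplied by the model's multiplier, summed across models. The sketch below works one example using multipliers copied from this section; the prompt counts are made up for illustration, and premiumRequests is a hypothetical helper rather than anything GitHub provides.

```typescript
// Sketch: premium requests consumed = prompts x model multiplier.
// Multipliers come from this section; the usage figures are invented.
const MULTIPLIERS: Record<string, number> = {
  "GPT-4.1": 0,
  "Claude Haiku 4.5": 0.33,
  "Claude Sonnet 4.5": 1,
  "Claude Opus 4.5": 3,
};

function premiumRequests(promptsByModel: Record<string, number>): number {
  let total = 0;
  for (const [model, prompts] of Object.entries(promptsByModel)) {
    const multiplier = MULTIPLIERS[model];
    if (multiplier === undefined) throw new Error(`Unknown model: ${model}`);
    total += prompts * multiplier;
  }
  return Math.round(total * 100) / 100; // round away floating-point noise
}

// 200 x 0 + 100 x 0.33 + 40 x 1 + 10 x 3 = 103 premium requests
const monthlyUsage = { "GPT-4.1": 200, "Claude Haiku 4.5": 100, "Claude Sonnet 4.5": 40, "Claude Opus 4.5": 10 };
console.log(premiumRequests(monthlyUsage));
```

In this example the 10 Opus prompts cost almost as much as the 100 Haiku prompts, which is the trade-off the tier table is meant to surface.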

10 Developer Cost Strategies

# | Strategy | Impact
1 | Use included models (GPT-4.1, GPT-4o, GPT-5 mini) for daily work | Zero cost (0x multiplier)
2 | Auto model selection in Copilot Chat | 10% discount on multipliers
3 | Budget models for routine tasks (Haiku 0.33x, GPT-5.4 nano 0.25x) | 67–75% cheaper than standard
4 | Reserve premium models (Opus, GPT-5.5) for complex problems only | Avoid 3–7.5x multiplier on simple tasks
5 | Lean copilot-instructions.md (keep under 100 lines) | Fewer input tokens per interaction
6 | Targeted context: use #file, not #codebase | Dramatically fewer input tokens
7 | Fresh sessions for new tasks | Prevents history bloat consuming tokens
8 | File-targeted instructions with applyTo | Only loads when relevant files are active
9 | Set per-user budgets | Prevents runaway spending
10 | Monitor usage reports regularly | Catch high-consumption patterns early

Admin Budget Controls

Budgets can be set at four levels — each can trigger alerts or enforce hard stops:

  1. Enterprise-level — caps all orgs, repos, cost centers
  2. Organization-level — caps all repos in the org
  3. Cost-center-level — caps a single cost center
  4. User-level — caps individual users ($0 budget = no access)

Note: There is no automatic fallback to cheaper models when a budget is exhausted. Usage is simply blocked until the next billing cycle.

Discussion Points


7.7. Session 7 Summary & Discussion (5 min)

Key Takeaways

Discussion Points


☕ Break — 10 Minutes


8.1. Opening & AI Safety: "Debugging the Black Box" (5 min)

Key Points

Discussion Points


8.2. Output Log Channels (15 min)

Key Points

Output Channels

Channel | What It Logs | When to Check
GitHub Copilot | Completion events, model selection, errors | Completions not appearing or wrong
GitHub Copilot Chat | Chat request/response cycles, tool calls | Chat responses are wrong or slow
Language Server | Language-specific analysis, symbols | Code navigation or analysis issues

How to Access

  1. Open the Output panel: Ctrl+Shift+U (Windows) / Cmd+Shift+U (Mac)
  2. Select the channel from the dropdown
  3. Look for errors (red), warnings (yellow), and info messages

Common Log Patterns

Log Pattern | Meaning | Action
Request failed: 401 | Authentication expired | Re-sign in to GitHub
Request failed: 429 | Rate limit exceeded | Wait or switch to a lower-tier model
Context truncated | Context window overflow | Reduce attached context or start fresh session
Model not available | Selected model is down or restricted | Switch to a different model
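
The table above is effectively a lookup from pattern to action, which can be sketched as a small triage helper. The patterns are copied from the table; the exact wording in real Copilot logs may differ, and suggestAction and the sample log line are hypothetical.

```typescript
// Sketch: map a log line to the suggested action from the table above.
// Real log wording may differ from these patterns.
const LOG_ACTIONS: Array<{ pattern: RegExp; action: string }> = [
  { pattern: /Request failed: 401/, action: "Re-sign in to GitHub" },
  { pattern: /Request failed: 429/, action: "Wait or switch to a lower-tier model" },
  { pattern: /Context truncated/i, action: "Reduce attached context or start a fresh session" },
  { pattern: /Model not available/i, action: "Switch to a different model" },
];

function suggestAction(logLine: string): string | null {
  const hit = LOG_ACTIONS.find((entry) => entry.pattern.test(logLine));
  return hit !== undefined ? hit.action : null; // null: no known pattern matched
}

// Invented sample line for illustration.
console.log(suggestAction("[error] Request failed: 429 Too Many Requests"));
```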

🖥️ Demo: Reading Output Logs

  1. Open the Copilot output channel — show the log stream
  2. Trigger a completion — identify the log entry for that event
  3. Show an error scenario (e.g., disconnect network briefly) — identify the error in logs
  4. Show the Chat output channel — trace a chat request through the logs

Discussion Points


8.3. Chat Debug Mode (15 min)

Key Points

What Debug Mode Reveals

Information | Why It Matters
Full context sent | See exactly what the model received: instructions, files, history
Token counts | Input tokens, output tokens, and total; use them to identify bloat
Model used | Which model actually handled the request
Timing | How long each phase took (context assembly, model call, rendering)
Tool calls | Which tools (search, file read, terminal) were invoked and their results

How to Enable

{
  "github.copilot.chat.debugMode": true
}

After enabling, debug information appears in the chat output channel and optionally inline with responses.

Debug Output Anatomy

A typical debug output shows:

[DEBUG] Context assembly:
  System prompt: 1,247 tokens
  Repository instructions: 312 tokens
  File-targeted instructions: 0 tokens
  Attached context (#file): 3,891 tokens
  Conversation history: 2,104 tokens
  Current message: 87 tokens
  ────────────────────────
  Total input: 7,641 tokens

[DEBUG] Model: gpt-4o
[DEBUG] Response: 342 tokens (1.2s)
[DEBUG] Tool calls: workspace.search (284ms), file.read (12ms)
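
When a dump like this is long, a few lines of parsing make the largest contributor obvious. The sketch below assumes the "label: N tokens" layout of the sample above, which may not match the actual debug output format; parseTokenLines is a hypothetical helper.

```typescript
// Sketch: parse "label: N tokens" lines from a dump shaped like the sample
// above, then report the total and the largest contributor.
function parseTokenLines(dump: string): Map<string, number> {
  const parsed = new Map<string, number>();
  for (const line of dump.split("\n")) {
    const match = /^\s*(.+?):\s*([\d,]+) tokens\s*$/.exec(line);
    if (match === null) continue;
    const label = (match[1] ?? "").trim();
    if (label.startsWith("Total")) continue; // skip the summary row
    parsed.set(label, Number((match[2] ?? "0").replace(/,/g, "")));
  }
  return parsed;
}

const sampleDump = [
  "  System prompt: 1,247 tokens",
  "  Repository instructions: 312 tokens",
  "  File-targeted instructions: 0 tokens",
  "  Attached context (#file): 3,891 tokens",
  "  Conversation history: 2,104 tokens",
  "  Current message: 87 tokens",
  "  Total input: 7,641 tokens",
].join("\n");

const tokenCounts = parseTokenLines(sampleDump);
let tokenTotal = 0;
let largestLabel = "";
tokenCounts.forEach((value, label) => {
  tokenTotal += value;
  if (value > (tokenCounts.get(largestLabel) ?? -1)) largestLabel = label;
});
console.log(tokenTotal, largestLabel); // 7641 Attached context (#file)
```

Here the attached file accounts for about half of the input, so trimming it is the first thing to try.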

🖥️ Demo: Chat Debug Mode Walkthrough

  1. Enable github.copilot.chat.debugMode in VS Code settings
  2. Send a chat message — open the output channel to see the debug dump
  3. Point out: instructions loaded, context composition, token counts
  4. Send a message with a large #file reference — show the token count jump
  5. Show how different models appear in the debug output

Discussion Points


8.4. Agent Debug Logs (15 min)

Key Points

Agent Log Location

Agent debug logs are written to a session-specific log file. Access them via:

  1. Command Palette → "GitHub Copilot: Open Agent Debug Log"
  2. Or find them in the workspace storage directory

Understanding Agent Iteration Logs

Each iteration in the agent loop creates a log entry:

[Iteration 1] Plan: Add validation middleware to orders route
  Tool: file.read → api/routes/orders.ts (success, 2,341 tokens)
  Tool: file.read → api/routes/products.ts (success, 1,892 tokens)
  Tool: file.write → api/middleware/validate-order.ts (created)
  Tool: file.edit → api/routes/orders.ts (modified lines 12-18)
  Tool: terminal.run → npm run lint (exit code: 1)
  Result: FAIL — lint error on line 5: 'Request' is not defined

[Iteration 2] Fix: Add missing import
  Tool: file.edit → api/middleware/validate-order.ts (modified line 1)
  Tool: terminal.run → npm run lint (exit code: 0)
  Tool: terminal.run → npm test (exit code: 0)
  Result: PASS — all validation gates passed
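
A trace like this can be scanned mechanically for the commands that failed. The sketch below assumes the log layout shown in the sample above, which is illustrative and may not match the real file format; extractCommandRuns is a hypothetical helper.

```typescript
// Sketch: pull terminal commands and exit codes out of an agent log shaped
// like the sample above. The log layout itself is an assumption.
interface CommandRun { iteration: number; command: string; exitCode: number }

function extractCommandRuns(log: string): CommandRun[] {
  const runs: CommandRun[] = [];
  let iteration = 0;
  for (const line of log.split("\n")) {
    const header = /^\[Iteration (\d+)\]/.exec(line);
    if (header !== null) {
      iteration = Number(header[1]); // later tool lines belong to this iteration
      continue;
    }
    const run = /Tool: terminal\.run → (.+) \(exit code: (\d+)\)/.exec(line);
    if (run !== null) {
      runs.push({ iteration, command: run[1] ?? "", exitCode: Number(run[2]) });
    }
  }
  return runs;
}

const sampleAgentLog = [
  "[Iteration 1] Plan: Add validation middleware to orders route",
  "  Tool: file.read → api/routes/orders.ts (success, 2,341 tokens)",
  "  Tool: terminal.run → npm run lint (exit code: 1)",
  "[Iteration 2] Fix: Add missing import",
  "  Tool: terminal.run → npm run lint (exit code: 0)",
  "  Tool: terminal.run → npm test (exit code: 0)",
].join("\n");

const failedRuns = extractCommandRuns(sampleAgentLog).filter((r) => r.exitCode !== 0);
console.log(failedRuns); // one entry: iteration 1, "npm run lint", exit code 1
```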

Tool Call Traces

Field | Description
Tool name | Which tool was called (file.read, terminal.run, etc.)
Arguments | What was passed to the tool (file path, command)
Result | Success/failure, output content, token count
Duration | How long the tool call took
Decision | What the agent decided to do next based on the result

🖥️ Demo: Tracing an Agent Failure

  1. Give Agent mode a task that will fail on first attempt (e.g., "Add a feature that requires a missing dependency")
  2. Open the agent debug log — trace through the iterations
  3. Show where the agent detected the failure (observe phase) and how it adjusted (reflect phase)
  4. Point out the tool call trace: what was read, what was written, what commands ran

Discussion Points


8.5. Diagnostics Collection & Curriculum Wrap-Up (10 min)

Key Points

Diagnostics Collection

  1. Command Palette → "GitHub Copilot: Collect Diagnostics"
  2. Review the generated file — redact any sensitive content
  3. Attach to GitHub issues or support requests

Diagnostic Toolkit Cheat Sheet

Tool | Command / Location | What You Get
Output logs | Ctrl+Shift+U → select channel | Real-time log stream
Debug mode | github.copilot.chat.debugMode: true | Context composition, token counts
Agent debug log | Command Palette → "Open Agent Debug Log" | Iteration traces, tool calls
Diagnostics export | Command Palette → "Collect Diagnostics" | Full diagnostic bundle
Extension version | Extensions panel → GitHub Copilot → version | Confirm you're on latest
Network check | Output logs → look for 401/429/timeout | Connection and auth issues

Curriculum Wrap-Up — All 8 Sessions Complete

Module | Sessions | Core Themes
Foundations (Module 1) | 1–3 | Chat interface, context management, models & tokens
Agentic Patterns (Module 2) | 4–5 | Agentic loops, self-correction, rubber duck, patterns & antipatterns
Advanced Topics (Module 3) | 6–8 | Extensions, MCP, evaluation, troubleshooting

What to Do Next

Further Learning

Resource | URL
Copilot Documentation | https://docs.github.com/en/copilot
MCP Specification | https://modelcontextprotocol.io
OctoCAT Supply Demo Repo | https://github.com/microsoft/GitHubCopilot_Customized
VS Code Copilot Features | https://code.visualstudio.com/docs/copilot/overview
Discussion Points


Workshop guide for GitHub Copilot Developer Training — Advanced Topics (Module 3 of 3)