GitHub Copilot Developer Training — Advanced Topics

Duration: ~3 hours 20 minutes (200 min)
Format: Presentation + Live Demo + Hands-On
Audience: Developers extending GitHub Copilot with integrations, evaluation frameworks, and diagnostic tools
Focus: Extensions, MCP, evaluating AI output, and troubleshooting Copilot
Repo: microsoft/GitHubCopilot_Customized (OctoCAT Supply)

Part of the Copilot Developer Training curriculum (Foundations · Agentic Patterns · Advanced Topics). This module can be delivered standalone.

Workshop Overview

This module covers the advanced topics that round out a developer's Copilot proficiency: extending Copilot with third-party integrations and MCP servers, building evaluation frameworks to measure AI output quality, and mastering the diagnostic tools needed when things go wrong.

Learning Objectives

Understand VS Code chat participants and GitHub Copilot Extensions
Configure and use MCP (Model Context Protocol) servers for external tool integration
Define success criteria and quality rubrics for AI-generated code
Build automated and human-in-the-loop evaluation workflows
Navigate Copilot's output logs, debug mode, and agent debug traces
Collect and export diagnostics for troubleshooting

Prerequisites / What You Should Know

If you're taking this module standalone (without Modules 1–2), you should be comfortable with these concepts:

Concept	Quick Summary
Chat Modes	Ask (read-only), Agent (edits files), Plan (proposes changes)
Custom Instructions	`.github/copilot-instructions.md` (always-on) + file-targeted `.github/instructions/*.instructions.md`
Custom Agents	`.github/agents/*.md` — specialized personas with tool and model selection
Agentic Loops	Agent mode iterates: plan → act → observe → reflect
Agent Patterns	Single-agent/multi-skill (Copilot Agent mode), multi-agent (specialized agents for different domains)
Context Window	Finite token budget; instructions, context, history, and output all compete for space

If you attended Modules 1–2, skip this section — you've covered all of the above.

Prerequisites

Requirement	Details
GitHub Account	With Copilot Pro, Business, or Enterprise license
VS Code	Latest stable (or Insiders for preview features)
Copilot Extension	GitHub Copilot + GitHub Copilot Chat extensions installed
Node.js	Version 18 or higher
Git	For cloning the demo repository

Session Agenda

Section	Topic	Time
Session 6	Extensions & MCP	60 min
6.1	Opening & AI Safety: "Third-Party Trust"	5 min
6.2	VS Code Chat Participants	10 min
6.3	GitHub Copilot Extensions	15 min
6.4	MCP Architecture	15 min
6.5	MCP Configuration	10 min
6.6	Session 6 Summary & Discussion	5 min
	☕ Break	10 min
Session 7	Evaluating Agentic Output	70 min
7.1	Opening & AI Safety: "When to Trust, When to Verify"	5 min
7.2	Defining Success Criteria	15 min
7.3	Output Quality Rubrics	15 min
7.4	Evaluation Methods	15 min
7.5	Tracking & Improvement	15 min
7.6	Usage, Billing & Cost Strategies	10 min
7.7	Session 7 Summary & Discussion	5 min
	☕ Break	10 min
Session 8	Troubleshooting & Diagnostics	60 min
8.1	Opening & AI Safety: "Debugging the Black Box"	5 min
8.2	Output Log Channels	15 min
8.3	Chat Debug Mode	15 min
8.4	Agent Debug Logs	15 min
8.5	Diagnostics Collection & Curriculum Wrap-Up	10 min

Total: ~200 min (~3h 20min) including breaks

6.1. Opening & AI Safety: "Third-Party Trust" (5 min)

Key Points

Extensions and MCP servers are third-party code that runs alongside Copilot — they can access your codebase and execute actions
Trust decisions matter: who built it? What data does it access? What actions can it take?
Evaluate before installing: read permissions, check the publisher, review community feedback
Organization admins can control which extensions and MCP servers are allowed

Trust Evaluation Checklist

Question	What to Check
Who built it?	Verified publisher, GitHub organization, or known vendor
What does it access?	File system, network, terminal, API tokens
What can it do?	Read-only vs. write access; can it execute commands?
Is it maintained?	Recent updates, active issue tracker, responsive maintainer
Is it scoped?	Does it request minimum necessary permissions?

Discussion Points

How does your team currently vet VS Code extensions?
What's your organization's policy on third-party AI integrations?

6.2. VS Code Chat Participants (10 min)

Key Points

Chat participants are specialized handlers built into VS Code that route your questions to domain-specific logic
They're the @ mentions you use in chat: @workspace, @vscode, @terminal
Each participant has access to different data and capabilities

Built-in Participants

Participant	What It Accesses	Capabilities
`@workspace`	File tree, file contents, symbols, dependencies	Project structure, cross-file search, symbol lookup
`@vscode`	Settings, extensions, keybindings, commands	Editor configuration, extension recommendations
`@terminal`	Recent terminal output, command history	Error diagnosis, command suggestions

How Participants Work Under the Hood

You type @workspace How is auth implemented?
VS Code routes the message to the @workspace participant
The participant searches your project (file tree, symbols, content)
Relevant code snippets and file references are injected into the context
The augmented context is sent to the model for response

🖥️ Demo: Participant Capabilities

@workspace — Ask "What is the database schema for orders?" — show it finding the schema across files
@vscode — Ask "How do I configure format on save?" — show it referencing VS Code settings
@terminal — After a failed npm test, ask "Why did the test fail?" — show it reading terminal output

Discussion Points

Which participant do you use most? Which do you underuse?
How does @workspace decide which files are relevant to your question?

6.3. GitHub Copilot Extensions (15 min)

Key Points

Copilot Extensions are GitHub Apps that add chat capabilities to Copilot
They appear as @extension-name participants in chat — you interact with them through natural language
Extensions can access external services (Docker, Azure, Sentry, etc.) and bring domain-specific knowledge into Copilot

Extension Capabilities

Capability	Description	Example
Chat responses	Answer questions with domain knowledge	`@docker How do I optimize this Dockerfile?`
Code actions	Generate or modify code based on external context	`@azure Generate a Bicep template for this architecture`
Tool invocation	Call external APIs and return results	`@sentry What are the top errors this week?`

Marketplace Discovery

Browse at https://github.com/marketplace?type=apps&copilot_app=true
Filter by category: development tools, security, cloud, productivity
Check: publisher verification, install count, recent activity

🖥️ Demo: Using a Copilot Extension

Show the Copilot Extensions marketplace
Install an extension (e.g., Docker, GitHub Models, or Azure)
Use it in chat: @extension-name [question]
Show how the response includes domain-specific knowledge beyond Copilot's base training

Discussion Points

What external services would benefit from a Copilot Extension in your workflow?
How do extensions differ from MCP servers (covered next)?
What security considerations apply to extensions that access external services?

6.4. MCP Architecture (15 min)

Key Points

MCP (Model Context Protocol) is an open standard for connecting AI models to external tools and data sources
Think of it as "USB for AI" — a standard interface that any tool can implement
MCP separates three roles: Host (VS Code), Client (Copilot), and Server (the tool provider)

MCP Architecture

┌─────────────────┐
│     VS Code     │  ← HOST: manages connections
│    (Host)       │
├─────────────────┤
│   Copilot       │  ← CLIENT: sends requests to servers
│   (Client)      │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌────────┐ ┌────────┐
│ MCP    │ │ MCP    │  ← SERVERS: provide tools, resources, prompts
│Server A│ │Server B│
│(GitHub)│ │(File   │
│        │ │System) │
└────────┘ └────────┘

MCP Capabilities

Capability	Description	Example
Tools	Functions the model can call	`create_issue`, `run_query`, `deploy_app`
Resources	Data the model can read	Database schemas, API specs, documentation
Prompts	Reusable prompt templates	"Summarize this PR", "Review for security"

Transport Types

Transport	How It Works	Best For
stdio	Runs as a local process, communicates via stdin/stdout	Local tools, file system, CLIs
SSE	Server-sent events over HTTP	Remote servers, web services
Streamable HTTP	HTTP with streaming responses	Modern remote servers

MCP vs. Copilot Extensions

Aspect	MCP Servers	Copilot Extensions
Standard	Open protocol (any client)	GitHub-specific
Installation	Configure in `.vscode/mcp.json`	Install from GitHub Marketplace
Runs where	Locally or remote server	GitHub's infrastructure
Access scope	What you configure	What the app requests
Best for	Custom internal tools	Polished third-party integrations

Discussion Points

What internal tools or services would you expose via MCP?
How does the host/client/server separation improve security?
When would you choose MCP over a Copilot Extension (or vice versa)?

6.5. MCP Configuration (10 min)

Key Points

MCP servers are configured in .vscode/mcp.json in your workspace
Each entry specifies a server name, command/transport, and optional environment variables
VS Code discovers the servers on startup and makes their tools available in chat

Configuration Syntax

{
  "servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${input:github-token}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./src"]
    }
  }
}

Security Considerations

Concern	Mitigation
Token exposure	Use `${input:name}` for secrets — VS Code prompts at runtime, never stored in file
File access scope	Limit filesystem servers to specific directories
Network access	Review what remote servers the MCP server contacts
Command execution	Only configure servers from trusted sources

🖥️ Demo: MCP Server Setup

Create .vscode/mcp.json with a filesystem server pointing to the api/ directory
Open chat — show the MCP tools appearing in the tool list
Ask Copilot to use the filesystem tool: "List all TypeScript files in the API directory"
Add a GitHub MCP server — show it pulling issue data into the chat context

Discussion Points

What .vscode/mcp.json configuration would you set up first for your project?
How would you share MCP configurations across your team?
What's the security review process for adding a new MCP server?

6.6. Session 6 Summary & Discussion (5 min)

Key Takeaways

VS Code chat participants (@workspace, @vscode, @terminal) route questions to specialized handlers
Copilot Extensions bring third-party domain knowledge into chat via GitHub Apps
MCP is the open standard for tool integration — configure in .vscode/mcp.json
Always evaluate trust, permissions, and security before adding extensions or MCP servers

Discussion Points

What's the first extension or MCP server you'll set up after this session?
How does MCP change what's possible with Copilot in your workflow?
What governance process would you recommend for your org's MCP server list?

☕ Break — 10 Minutes

7.1. Opening & AI Safety: "When to Trust, When to Verify" (5 min)

Key Points

Not all AI output deserves the same level of scrutiny — risk should drive verification effort
Low risk: Boilerplate code, test scaffolding, documentation → quick glance
Medium risk: Business logic, API routes, data transformations → careful review
High risk: Security code, authentication, financial calculations, infrastructure → thorough verification + testing
The goal is a trust calibration framework — not "trust everything" or "verify everything"

Risk-Based Verification

Risk Level	Code Type	Verification Level
🟢 Low	Boilerplate, scaffolding, docs	Quick scan
🟡 Medium	Business logic, API routes	Code review + tests
🔴 High	Auth, security, financial, infra	Deep review + security scan + tests

Discussion Points

Where does your team currently draw the trust/verify line?
Has AI-generated code ever introduced a bug that made it to production?

7.2. Defining Success Criteria (15 min)

Key Points

Before using Copilot for a task, define what "good output" looks like
Success criteria prevent the trap of accepting the first thing that compiles
Different task types need different criteria

Success Criteria by Task Type

Task Type	Correctness	Style	Performance	Security
Bug fix	Fix resolves the issue; no regressions	Matches existing code style	No performance degradation	No new vulnerabilities
New feature	Meets requirements; handles edge cases	Follows project patterns	Acceptable response time	Input validation, auth checks
Refactoring	Same behavior; tests still pass	Improved readability	Equal or better performance	No security regression
Test generation	Tests are meaningful; cover edge cases	Consistent test structure	Tests run in < 30s	No test data leaks

Defining Criteria Before Coding

Template:

Task: [what you're asking Copilot to do]
Success looks like:
  - Functional: [what the code should do]
  - Quality: [readability, patterns, style]
  - Constraints: [performance, security, compatibility]
  - Not acceptable: [what the code should NOT do]

🖥️ Demo: Criteria-Driven Prompting

Define success criteria for a task: "Add pagination to the products API endpoint"
Include criteria in the prompt: "Must be cursor-based, max 100 items, no offset, include Link headers"
Evaluate the output against the criteria — does it meet each one?
Compare to asking without criteria: "Add pagination to products"

Discussion Points

How often do you define success criteria before writing code (with or without AI)?
What's the minimum criteria you'd set for every Copilot interaction?
How do success criteria change for different team roles (junior vs. senior dev)?

7.3. Output Quality Rubrics (15 min)

Key Points

A rubric is a structured scoring system for evaluating AI output quality
Rubrics make evaluation consistent, repeatable, and shareable across a team
Score each dimension independently — a fix that's correct but unreadable still needs work

Rubric Template

Dimension	Score 1 (Poor)	Score 2 (Acceptable)	Score 3 (Good)	Score 4 (Excellent)
Correctness	Doesn't work	Works for happy path	Handles common edge cases	Handles all edge cases, robust
Completeness	Missing major parts	Core functionality present	Fully implements requirements	Exceeds requirements with thoughtful additions
Code Style	Inconsistent, unreadable	Mostly consistent	Follows project patterns	Clean, idiomatic, well-structured
Security	Has vulnerabilities	No obvious issues	Validates input, handles errors	Defense in depth, follows OWASP
Performance	Unacceptable	Adequate	Efficient	Optimized with appropriate trade-offs

Applying a Rubric

Example — Agent-generated pagination code:

Dimension	Score	Notes
Correctness	3	Works, handles empty results and large pages
Completeness	2	Missing Link header, no total count
Code Style	3	Matches existing Express route patterns
Security	2	No input validation on page size parameter
Performance	3	Uses cursor-based approach, efficient
Overall	2.6	Needs: input validation + Link headers

🖥️ Demo: Rubric Evaluation

Ask Agent mode to add a search feature to the products API
Score the output against the rubric dimensions
Show where it falls short — ask Copilot to fix the specific weaknesses
Re-score — show the improvement

Discussion Points

What dimensions matter most for your team's code quality standards?
How would you automate parts of the rubric (e.g., lint = style, tests = correctness)?
Would you share rubrics across the team, or let each developer customize theirs?

7.4. Evaluation Methods (15 min)

Key Points

Evaluation happens at two levels: automated (machines check) and human-in-the-loop (developers review)
The best approach combines both in a verification pipeline
Automated checks catch the obvious; human review catches the subtle

Automated Evaluation

Check	What It Validates	Tool
Lint	Code style, syntax, unused variables	ESLint, Prettier
Type check	Type safety, interface compliance	TypeScript compiler
Tests	Functional correctness	Vitest, Jest
Build	Compilation, bundling	Vite, tsc
Security scan	Known vulnerabilities, SAST	CodeQL, Semgrep

Human-in-the-Loop Evaluation

Review Focus	What to Look For
Logic	Does the code actually solve the problem?
Architecture	Does it fit the existing patterns?
Edge cases	What happens with unexpected input?
Maintainability	Will another developer understand this in 6 months?
Over-engineering	Did the AI add unnecessary complexity?

The Verification Pipeline

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Agent   │───►│  Lint    │───►│  Type    │───►│  Test    │───►│  Human   │
│  Output  │    │  Check   │    │  Check   │    │  Suite   │    │  Review  │
└──────────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘
                   ▼ Fail          ▼ Fail          ▼ Fail          ▼ Reject
              Auto-reject     Auto-reject     Auto-reject     Send back

Batch Evaluation

For systematic quality assessment across many tasks:

Define a set of representative prompts (10-20 tasks)
Run each through Copilot
Score outputs against the rubric
Track aggregate scores over time
Identify patterns: which task types does Copilot handle well vs. poorly?

🖥️ Demo: Automated Evaluation Pipeline

Generate code with Agent mode
Run the lint + type check + test pipeline
Show a failure — observe how the automated gate catches the issue
Fix the issue and re-run — show the pipeline passing

Discussion Points

What automated checks does your team already have in CI?
How would you adapt your PR review process for AI-generated code?
What's the right balance between automated and human evaluation?

7.5. Tracking & Improvement (15 min)

Key Points

Evaluation without tracking is a one-time exercise — tracking creates a feedback loop
Monitor quality trends over time: are your prompts/instructions improving?
The feedback loop: evaluate → identify weakness → improve instructions → re-evaluate

Metrics to Track

Metric	How to Measure	What It Tells You
First-pass acceptance rate	% of agent output accepted without changes	How well your instructions/prompts work
Iteration count	Number of follow-up prompts needed	Prompt quality and task complexity alignment
Rubric scores	Average scores per dimension over time	Where Copilot consistently under- or over-performs
Time savings	Time with Copilot vs. estimated manual time	ROI of AI-assisted development

The Feedback Loop

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Generate │───►│ Evaluate │───►│ Identify │───►│ Improve  │
│ code     │    │ with     │    │ weakness │    │ prompts/ │
│          │    │ rubric   │    │          │    │ instruct │
└──────────┘    └──────────┘    └──────────┘    └────┬─────┘
     ▲                                               │
     └───────────────────────────────────────────────┘

Trust Calibration by Task Type

Task Type	Trust Level	Verification Effort
Boilerplate / scaffolding	High	Quick scan
CRUD endpoints	High	Test coverage check
Business logic	Medium	Code review + tests
Algorithm implementation	Medium-Low	Deep review + benchmarks
Security-critical code	Low	Full security review
Infrastructure / IaC	Low	Plan review + dry run

🖥️ Demo: Feedback Loop in Action

Generate a utility function — score it (e.g., correctness: 3, style: 2)
Identify the style weakness — add a specific instruction to copilot-instructions.md
Regenerate the same function — show the improved style score
Discuss how this scales across a team

Discussion Points

What metrics would be most valuable for your team to track?
How do you currently measure developer productivity? How would AI change that?
What feedback loops already exist in your development process?

7.6. Usage, Billing & Cost Strategies (10 min)

Key Points

GitHub is transitioning from premium-request-based billing to usage-based billing (AI Credits) starting June 1, 2026
GitHub AI Credits are the new billing unit: 1 AI credit = $0.01 USD
Billing is based on actual token consumption: input tokens + output tokens + cached tokens
Code completions and next edit suggestions remain unlimited — they are NOT billed in AI credits
Credits are pooled at the billing entity level — power users are offset by lighter users

The Billing Transition

Aspect	Current (Premium Requests)	New (AI Credits — June 2026)
Unit	1 request × model multiplier	Tokens × model price → AI credits
Free models	GPT-4.1, GPT-4o, GPT-5 mini (0x)	Completions + next edit unlimited; 1,900–3,900 credits/mo included
Pooling	Per-user monthly allowance	Pooled at billing entity level

Model Cost Tiers — Know What You're Spending

Tier	Models	Multiplier
Free (0x)	GPT-4.1, GPT-4o, GPT-5 mini, Raptor mini	No premium requests consumed
Budget (0.25–0.33x)	GPT-5.4 nano, Grok Code Fast, Claude Haiku 4.5, Gemini 3 Flash	Fraction of a request
Standard (1x)	Claude Sonnet 4/4.5/4.6, Gemini 2.5 Pro, GPT-5.2/5.4	1 request per prompt
Premium (3–7.5x)	Claude Opus 4.5/4.6 (3x), Claude Opus 4.7 (7.5x), GPT-5.5 (7.5x)	Multiple requests per prompt
Ultra (30x)	Claude Opus 4.6 fast mode	30 requests per prompt

10 Developer Cost Strategies

#	Strategy	Impact
1	Use included models (GPT-4.1, GPT-4o, GPT-5 mini) for daily work	Zero cost — 0x multiplier
2	Auto model selection in Copilot Chat	10% discount on multipliers
3	Budget models for routine tasks (Haiku 0.33x, GPT-5.4 nano 0.25x)	67–75% cheaper than standard
4	Reserve premium models (Opus, GPT-5.5) for complex problems only	Avoid 3–7.5x multiplier on simple tasks
5	Lean `copilot-instructions.md` — keep under 100 lines	Fewer input tokens per interaction
6	Targeted context — `#file` not `#codebase`	Dramatically fewer input tokens
7	Fresh sessions for new tasks	Prevents history bloat consuming tokens
8	File-targeted instructions with `applyTo`	Only loads when relevant files are active
9	Set per-user budgets	Prevents runaway spending
10	Monitor usage reports regularly	Catch high-consumption patterns early

Admin Budget Controls

Budgets can be set at four levels — each can trigger alerts or enforce hard stops:

Enterprise-level — caps all orgs, repos, cost centers
Organization-level — caps all repos in the org
Cost-center-level — caps a single cost center
User-level — caps individual users ($0 budget = no access)

Note: There is no automatic fallback to cheaper models when a budget is exhausted. Usage is simply blocked until the next billing cycle.

Discussion Points

Which cost tier do most of your daily tasks fall into?
How would you implement per-user budgets in your organization?
What's the ROI calculation for using a 7.5x model vs. a 1x model?

7.7. Session 7 Summary & Discussion (5 min)

Key Takeaways

Define success criteria before asking Copilot to generate code
Use rubrics to make evaluation consistent and repeatable
Combine automated checks with human review in a verification pipeline
Track metrics over time — use the feedback loop to continuously improve prompts and instructions

Discussion Points

What's the first quality metric you'll start tracking?
How would you introduce rubric-based evaluation to your team?
What's the biggest quality challenge with AI-generated code in your experience?

☕ Break — 10 Minutes

8.1. Opening & AI Safety: "Debugging the Black Box" (5 min)

Key Points

AI models are often treated as black boxes — input goes in, output comes out, and it's unclear why
Copilot provides transparency tools: logs, debug mode, and traces that show what the model received and how it reasoned
Transparency builds trust — when you can see the context, you can understand the output
Debugging AI is different from debugging code: you're debugging the context and instructions, not an algorithm

Discussion Points

Have you ever been confused by a Copilot response? How did you investigate?
How does transparency change your trust in AI-generated output?

8.2. Output Log Channels (15 min)

Key Points

VS Code has multiple output channels related to Copilot — each logs different information
The main channels: GitHub Copilot, GitHub Copilot Chat, and language-specific log channels
Output logs are your first diagnostic tool — they show connection status, errors, and feature availability

Output Channels

Channel	What It Logs	When to Check
GitHub Copilot	Completion events, model selection, errors	Completions not appearing or wrong
GitHub Copilot Chat	Chat request/response cycles, tool calls	Chat responses are wrong or slow
Language Server	Language-specific analysis, symbols	Code navigation or analysis issues

How to Access

Open the Output panel: Ctrl+Shift+U (Windows) / Cmd+Shift+U (Mac)
Select the channel from the dropdown
Look for errors (red), warnings (yellow), and info messages

Common Log Patterns

Log Pattern	Meaning	Action
`Request failed: 401`	Authentication expired	Re-sign in to GitHub
`Request failed: 429`	Rate limit exceeded	Wait or switch to a lower-tier model
`Context truncated`	Context window overflow	Reduce attached context or start fresh session
`Model not available`	Selected model is down or restricted	Switch to a different model

🖥️ Demo: Reading Output Logs

Open the Copilot output channel — show the log stream
Trigger a completion — identify the log entry for that event
Show an error scenario (e.g., disconnect network briefly) — identify the error in logs
Show the Chat output channel — trace a chat request through the logs

Discussion Points

How often do you check output logs when Copilot behaves unexpectedly?
What's the most common error you've seen in Copilot logs?

8.3. Chat Debug Mode (15 min)

Key Points

Chat debug mode reveals what Copilot sees: the full context sent to the model, token counts, timing, and model selection
Enable via VS Code setting: github.copilot.chat.debugMode
This is the single most useful diagnostic tool for understanding "why did Copilot say that?"

What Debug Mode Reveals

Information	Why It Matters
Full context sent	See exactly what the model received — instructions, files, history
Token counts	Input tokens, output tokens, total — identify bloat
Model used	Which model actually handled the request
Timing	How long each phase took (context assembly, model call, rendering)
Tool calls	Which tools (search, file read, terminal) were invoked and their results

How to Enable

{
  "github.copilot.chat.debugMode": true
}

After enabling, debug information appears in the chat output channel and optionally inline with responses.

Debug Output Anatomy

A typical debug output shows:

[DEBUG] Context assembly:
  System prompt: 1,247 tokens
  Repository instructions: 312 tokens
  File-targeted instructions: 0 tokens
  Attached context (#file): 3,891 tokens
  Conversation history: 2,104 tokens
  Current message: 87 tokens
  ────────────────────────
  Total input: 7,641 tokens

[DEBUG] Model: gpt-4o
[DEBUG] Response: 342 tokens (1.2s)
[DEBUG] Tool calls: workspace.search (284ms), file.read (12ms)

🖥️ Demo: Chat Debug Mode Walkthrough

Enable github.copilot.chat.debugMode in VS Code settings
Send a chat message — open the output channel to see the debug dump
Point out: instructions loaded, context composition, token counts
Send a message with a large #file reference — show the token count jump
Show how different models appear in the debug output

Discussion Points

What surprised you about the context composition in debug mode?
How would you use token count information to optimize your prompts?
When should you enable debug mode vs. leave it off?

8.4. Agent Debug Logs (15 min)

Key Points

Agent mode creates detailed logs of every iteration in its agentic loop
These logs show: plan, tool calls, results, decisions, and error recovery
Agent debug logs are essential for understanding why the agent took a particular approach

Agent Log Location

Agent debug logs are written to a session-specific log file. Access them via:

Command Palette → "GitHub Copilot: Open Agent Debug Log"
Or find them in the workspace storage directory

Understanding Agent Iteration Logs

Each iteration in the agent loop creates a log entry:

[Iteration 1] Plan: Add validation middleware to orders route
  Tool: file.read → api/routes/orders.ts (success, 2,341 tokens)
  Tool: file.read → api/routes/products.ts (success, 1,892 tokens)
  Tool: file.write → api/middleware/validate-order.ts (created)
  Tool: file.edit → api/routes/orders.ts (modified lines 12-18)
  Tool: terminal.run → npm run lint (exit code: 1)
  Result: FAIL — lint error on line 5: 'Request' is not defined

[Iteration 2] Fix: Add missing import
  Tool: file.edit → api/middleware/validate-order.ts (modified line 1)
  Tool: terminal.run → npm run lint (exit code: 0)
  Tool: terminal.run → npm test (exit code: 0)
  Result: PASS — all validation gates passed

Tool Call Traces

Field	Description
Tool name	Which tool was called (`file.read`, `terminal.run`, etc.)
Arguments	What was passed to the tool (file path, command)
Result	Success/failure, output content, token count
Duration	How long the tool call took
Decision	What the agent decided to do next based on the result

🖥️ Demo: Tracing an Agent Failure

Give Agent mode a task that will fail on first attempt (e.g., "Add a feature that requires a missing dependency")
Open the agent debug log — trace through the iterations
Show where the agent detected the failure (observe phase) and how it adjusted (reflect phase)
Point out the tool call trace: what was read, what was written, what commands ran

Discussion Points

How would you use agent debug logs to improve your agent instructions?
What patterns in the logs indicate the agent is stuck?
How do agent debug logs compare to traditional application debugging?

8.5. Diagnostics Collection & Curriculum Wrap-Up (10 min)

Key Points

When filing issues or requesting support, a diagnostics bundle provides all the information needed
VS Code can export Copilot diagnostics including: extension version, log files, configuration, and system info
Always sanitize diagnostics before sharing — remove tokens, secrets, and proprietary code

Diagnostics Collection

Command Palette → "GitHub Copilot: Collect Diagnostics"
Review the generated file — redact any sensitive content
Attach to GitHub issues or support requests

Diagnostic Toolkit Cheat Sheet

Tool	Command / Location	What You Get
Output logs	`Ctrl+Shift+U` → select channel	Real-time log stream
Debug mode	`github.copilot.chat.debugMode: true`	Context composition, token counts
Agent debug log	Command Palette → "Open Agent Debug Log"	Iteration traces, tool calls
Diagnostics export	Command Palette → "Collect Diagnostics"	Full diagnostic bundle
Extension version	Extensions panel → GitHub Copilot → version	Confirm you're on latest
Network check	Output logs → look for 401/429/timeout	Connection and auth issues

Curriculum Wrap-Up — All 8 Sessions Complete

Module	Sessions	Core Themes
Foundations (Module 1)	1–3	Chat interface, context management, models & tokens
Agentic Patterns (Module 2)	4–5	Agentic loops, self-correction, rubber duck, patterns & antipatterns
Advanced Topics (Module 3)	6–8	Extensions, MCP, evaluation, troubleshooting

What to Do Next

Set up .github/copilot-instructions.md in your primary repository
Create 1-2 custom agents for your team's common workflows
Configure MCP servers for your internal tools
Establish an evaluation rubric and start tracking quality metrics
Share learnings with your team — the best way to learn is to teach

Further Learning

Resource	URL
Copilot Documentation	https://docs.github.com/en/copilot
MCP Specification	https://modelcontextprotocol.io
OctoCAT Supply Demo Repo	https://github.com/microsoft/GitHubCopilot_Customized
VS Code Copilot Features	https://code.visualstudio.com/docs/copilot/overview

Discussion Points

Across all 8 sessions, what's the single most impactful concept for your daily work?
What's the first change you'll make to your team's Copilot setup?
What topics would you want to explore in a follow-up session?

Workshop guide for GitHub Copilot Developer Training — Advanced Topics (Module 3 of 3)