NO.08

🛡️ Harness Engineering

At a Glance

Harness Engineering is the discipline of designing the scaffold that gets the best results from AI.

It's not just about imposing restrictions — it's about defining the goal, context, roles, and verification methods so AI can move toward outcomes safely and without hesitation.

The Good Old Days

Old-school LLM chat was simple. You threw a prompt at the LLM, and the LLM returned an answer.

flowchart LR
  You[You] -->|Prompt| LLM[The LLM]
  LLM -->|Answer| You

  classDef human fill:#102033,stroke:#00f0ff,color:#e8f4ff,stroke-width:2px;
  classDef llm fill:#302500,stroke:#ffb000,color:#fff4d6,stroke-width:2px;
  class You human;
  class LLM llm;

In that world, context was almost entirely assembled by hand inside the prompt.

Current

Today, an agent / harness with project context and tools stands in front of the LLM.

flowchart LR
  Project[You + Project] -->|prompts / instructions / skills / MCP| Agent[The Agent<br/>aka Harness<br/><br/>Copilot Chat<br/>Copilot CLI<br/>Cloud Agent<br/>Claude Code<br/>Codex]
  Agent -->|answer / PR / edit| Project
  Agent -->|context| LLM[The LLM]
  LLM -->|next step| Agent
  Agent -->|tool call| Tools[Tools<br/>read / edit / run]
  Tools -->|result| Agent

  classDef human fill:#102033,stroke:#00f0ff,color:#e8f4ff,stroke-width:2px;
  classDef llm fill:#302500,stroke:#ffb000,color:#fff4d6,stroke-width:2px;
  classDef agent fill:#132812,stroke:#9bbc0f,color:#f4ffd8,stroke-width:2px;
  classDef context fill:#20242a,stroke:#8b949e,color:#d0d7de,stroke-width:2px;
  class Project human;
  class LLM llm;
  class Agent agent;
  class Tools context;

No magic. Instead of you calling the LLM directly, the agent is a layer that manages what to read, which tools to use, and how to return the result.

Under the Hood: Agent / Harness (Simplified)

  • Execution Loop: The LLM decides the next move; the agent executes the requested tool, appends the result to the context, and repeats until the LLM signals it is done.
  • Context Management: Organizes the system prompt, available tools, user task, and tool results, passing them as context with each LLM call.
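# --- Setup ---
system_prompt = "You are a helpful coding assistant..."
# Names of the tools the agent is allowed to run on the LLM's behalf.
available_tools = ["search_web", "read_file", "edit_file", "run_terminal"]

# --- Agent Loop ---
# `llm` and `execute_tool` are supplied by the harness; they are not defined here.
async def run_agent(llm, execute_tool):
    user_task = input("How can I help you? ")
    # Everything the LLM sees on each call lives in the context.
    context = [system_prompt, available_tools, user_task]

    while True:
        # The LLM only decides the next step; it does not run anything itself.
        next_step = await llm.determine_next_step(context)
        context.append(next_step)

        if next_step.intent == "done":
            return next_step.final_answer

        # The agent executes the requested tool and feeds the result back.
        result = await execute_tool(next_step.tool, next_step.args)
        context.append(result)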

What to Harness With?

No single tool makes AI powerful. Separate what should always be loaded from what should be called only when needed.

| Tool | Location / Config | When to Use |
|---|---|---|
| Repository-wide custom instructions | .github/copilot-instructions.md | Repo-wide conventions, prohibitions, and verification commands |
| Path-specific custom instructions | .github/instructions/*.instructions.md + applyTo | Area-specific rules for tests/**, api/**, etc. |
| Agent skills | .github/skills/*/SKILL.md / ~/.copilot/skills/ | Specialized procedures like PR descriptions, frontend design |
| Custom agents | .github/agents/*.agent.md / ~/.copilot/agents/ | Switch roles, models, and available tools |
| Hooks | .github/hooks/*.json | Inject scripts before/after tool execution to deny, log, or notify |
| MCP servers | MCP config file | Connect to GitHub, Figma, Playwright, Jira, Salesforce |
| Tool permissions | Agent host permission settings | Control read/search only, allow edits, allow command execution, etc. |
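
The Hooks and Tool permissions rows share one idea: code that runs around a tool call and can say no. The sketch below shows that idea in plain Python. It is not the Copilot hook file format; `ALLOWED_TOOLS`, `DENIED_FRAGMENTS`, `guarded_call`, and `fake_terminal` are hypothetical names invented for this example.

```python
# Hypothetical policy: tools the agent may call, and command fragments to refuse.
ALLOWED_TOOLS = {"read_file", "search", "run_terminal"}
DENIED_FRAGMENTS = ("rm -rf", "git push --force")

audit_log = []


def pre_tool_hook(tool, args):
    """Runs before a tool call; returns (allowed, reason)."""
    if tool not in ALLOWED_TOOLS:
        return False, f"tool '{tool}' is not permitted"
    command = args.get("command", "")
    for fragment in DENIED_FRAGMENTS:
        if fragment in command:
            return False, f"command contains '{fragment}'"
    return True, "ok"


def post_tool_hook(tool, args, result):
    """Runs after a tool call; here it only records what happened."""
    audit_log.append({"tool": tool, "args": args, "result": result})


def guarded_call(tool, args, run):
    """Wrap a tool call with the pre and post hooks."""
    allowed, reason = pre_tool_hook(tool, args)
    if not allowed:
        return f"DENIED: {reason}"
    result = run(**args)
    post_tool_hook(tool, args, result)
    return result


def fake_terminal(command):
    # Stand-in for a real shell; nothing is executed.
    return f"ran: {command}"


print(guarded_call("run_terminal", {"command": "pytest -q"}, fake_terminal))
print(guarded_call("run_terminal", {"command": "rm -rf build"}, fake_terminal))
print(guarded_call("edit_file", {"path": "a.py"}, fake_terminal))
```

The first call runs and is logged; the second is refused by the deny list, the third by the permission set. Denials return a message instead of raising, so the refusal can go back into the agent's context as an ordinary tool result.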

The GitHub Docs names are Repository-wide custom instructions and Path-specific custom instructions. On the VS Code side, the latter is also called file-based instructions.
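
A rough way to picture "always load" versus "load when it applies": repo-wide text goes into every request, while a path-specific rule is added only when its applyTo glob matches the file being worked on. The sketch below assumes `fnmatch`-style glob matching, which may not be exactly how a given agent host evaluates applyTo; the rule texts and the `instructions_for` helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Always loaded, regardless of which file is being edited.
REPO_WIDE = "Run the test suite before proposing a change."

# Hypothetical path-specific rules: (applyTo glob, instruction text).
PATH_SPECIFIC = [
    ("tests/**", "Use the existing fixtures; do not hit the network."),
    ("api/**", "Every endpoint change needs a matching schema update."),
]


def instructions_for(path):
    """Return the repo-wide instruction plus any rule whose glob matches path."""
    selected = [REPO_WIDE]
    for pattern, text in PATH_SPECIFIC:
        if fnmatchcase(path, pattern):
            selected.append(text)
    return selected


print(instructions_for("tests/unit/test_login.py"))  # repo-wide + tests rule
print(instructions_for("src/main.py"))               # repo-wide only
```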

Ecosystem Comparison

The same “AI scaffold” concept exists across ecosystems, but file locations and names differ slightly.

| Layer | GitHub / Copilot | Open Ecosystem |
|---|---|---|
| Global instructions | .github/copilot-instructions.md | AGENTS.md |
| Path-specific rules | .github/instructions/*.instructions.md | nested AGENTS.md |
| Skills (project) | .github/skills/*/SKILL.md | .agents/skills/*/SKILL.md |
| Skills (personal) | ~/.copilot/skills/ | ~/.agents/skills/ |
| Custom agents | Copilot custom agents | agent definitions / plugins |
| MCP / tools | mcp.config | mcp.config |

Copilot’s strength is native support for the formats of major vendors. In the CLI, type /help to see available formats and commands.
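
The "nested AGENTS.md" row deserves a picture. Assuming the nearest file wins (the usual convention, though individual tools may merge or order files differently), resolution is a walk up the directory tree. The repo layout and the `nearest_agents_md` helper below are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical repo layout: directories that contain an AGENTS.md.
DIRS_WITH_AGENTS_MD = {PurePosixPath("."), PurePosixPath("packages/api")}


def nearest_agents_md(file_path):
    """Walk up from the file's directory and return the first AGENTS.md found."""
    for directory in PurePosixPath(file_path).parents:
        if directory in DIRS_WITH_AGENTS_MD:
            return directory / "AGENTS.md"
    return None


print(nearest_agents_md("packages/api/routes/users.py"))  # packages/api/AGENTS.md
print(nearest_agents_md("packages/web/index.ts"))         # AGENTS.md
```

A file under packages/api picks up the nested rules, while everything else falls back to the root file. That gives the same effect as a path-specific instruction without any glob.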

Common Concepts

A good harness is not just a collection of tools — it defines how AI should proceed without getting lost.

| Pattern | What does it do? | What improves? |
|---|---|---|
| Spec-to-code / Spec-driven | Write the what / why as a spec first, then break it into plan → tasks → implement | The spec becomes the source of truth: predictable implementation instead of vibe coding |
| Multi-phase coding plan | The orchestrator decomposes implementation into multiple phases, each with a clear purpose, order, and completion criteria | Even large changes proceed step by step, without AI rushing ahead all at once |
| File assignment | The Planner explicitly lists files to touch; the orchestrator checks for file overlap before parallelizing | Multiple agents don't overwrite each other; Coder / Designer can run in parallel safely |
| Prompt engineering | When writing a Skill / Agent, clearly specify role · objective · deliverable | Keeps the agent consistent on who it is, what to achieve, and what to output |
| Context engineering | Deliver only the context needed for the task, structured appropriately | Avoids distraction from noise; answers stay aligned with the codebase, spec, and constraints |
| Approval gates | Humans review at key checkpoints: spec / plan / PR / release | Preserves automation speed while letting humans stop only the dangerous decisions |

Designing the spec, phases, file ownership, role/objective/deliverable, context, and approvals upfront not only makes AI faster but also reduces rework.
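
The File assignment pattern comes down to a set intersection. Here is a minimal sketch, assuming the planner emits a mapping from task name to the files that task will touch; the task names, file paths, and both helper functions are hypothetical, and the greedy batching is one simple choice among many.

```python
from itertools import combinations

# Hypothetical planner output: each task lists the files it intends to touch.
plan = {
    "coder-auth": {"src/auth.py", "tests/test_auth.py"},
    "designer-login": {"web/login.css", "web/login.html"},
    "coder-session": {"src/auth.py", "src/session.py"},
}


def find_overlaps(assignments):
    """Return {(task_a, task_b): shared_files} for every conflicting pair."""
    overlaps = {}
    for a, b in combinations(sorted(assignments), 2):
        shared = assignments[a] & assignments[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


def parallel_batches(assignments):
    """Greedily group tasks so no two tasks in a batch share a file."""
    batches = []
    for task in sorted(assignments):
        for batch in batches:
            if all(assignments[task].isdisjoint(assignments[other]) for other in batch):
                batch.append(task)
                break
        else:
            # No existing batch is conflict-free for this task; start a new one.
            batches.append([task])
    return batches


print(find_overlaps(plan))
print(parallel_batches(plan))
```

Here the two coder tasks both claim src/auth.py, so they land in separate batches, while the designer task can run alongside the first coder task.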

Example: Ultralight

Ultralight is a multi-agent orchestration example by Burke Holland, Developer Advocate at Microsoft.
It creates a multi-phase execution plan, detects file overlaps, and acts as a harness that distributes work in parallel to Planner / Coder / Designer.

flowchart LR
  User[User prompt] --> O[Orchestrator<br/>Claude Sonnet 4.6<br/>multi-phase plan]

  O --> P[Planner<br/>Claude Opus 4.6<br/>research + docs]
  O --> C[Coder<br/>GPT-5.3-Codex<br/>scoped code changes]
  O --> D[Designer<br/>Claude Opus 4.6<br/>UI / UX owner]

  D -.-> S[Frontend Design Skill<br/>used by Designer<br/>brand / layout / CSS]
  C -.-> M[MCP Server<br/>used by Coder<br/>GitHub / Playwright<br/>docs]

  P --> O
  C --> O
  D --> O
  O --> R[Pull Request<br/>human review]

  classDef host fill:#102033,stroke:#00f0ff,color:#e8f4ff,stroke-width:2px;
  classDef agent fill:#132812,stroke:#9bbc0f,color:#f4ffd8,stroke-width:2px;
  classDef harness fill:#2a1020,stroke:#ff2e88,color:#ffe8f4,stroke-width:2px;
  classDef ship fill:#302500,stroke:#ffb000,color:#fff4d6,stroke-width:2px;
  class O host;
  class P,C,D agent;
  class S,M harness;
  class R ship;
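
The shape in the diagram, phases in order with parallel workers inside each phase, can be sketched with asyncio. This is not Ultralight's implementation: `PLAN`, `run_subagent`, and `orchestrate` are hypothetical, and the sub-agent is a stub that returns a string instead of delegating to a model.

```python
import asyncio

# Hypothetical multi-phase plan: phases run in order, tasks inside a phase run in parallel.
PLAN = [
    {"phase": "research", "tasks": [("planner", "document the login flow")]},
    {"phase": "build", "tasks": [("coder", "implement login API"),
                                 ("designer", "style the login page")]},
]


async def run_subagent(role, task):
    # Stand-in for delegating to a real sub-agent.
    await asyncio.sleep(0)
    return f"{role}: {task} [done]"


async def orchestrate(plan):
    report = []
    for phase in plan:  # a phase starts only after the previous one has finished
        results = await asyncio.gather(
            *(run_subagent(role, task) for role, task in phase["tasks"])
        )
        report.append((phase["phase"], list(results)))
    return report


report = asyncio.run(orchestrate(PLAN))
for name, results in report:
    print(name, results)
```

`asyncio.gather` returns results in the order the tasks were submitted, so the report stays stable even when the workers finish out of order. A real orchestrator would run a file-overlap check before the gather and put a human approval step before the pull request.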

🚀 I made a Codespace-ready template repo so you can try it in a few clicks: theomonfort/ultralight-template