🧬 Copilot-LD

An intelligent agent leveraging GitHub Copilot and Linked Data

Core Concepts

Understanding the foundational concepts behind Copilot-LD helps you make the most of the platform. This guide explains the "why" behind key architectural decisions and how they work together.

What is Copilot-LD?

Copilot-LD is an intelligent agent that combines GitHub Copilot's language models with linked data and retrieval-augmented generation (RAG) to provide accurate, context-aware assistance. Unlike simple chatbots, it understands semantic relationships in your knowledge base and provides responses grounded in your actual data.

Core Technologies

Linked Data

Linked data provides the semantic structure that makes Copilot-LD uniquely accurate. Instead of treating content as plain text, the system understands relationships and context through HTML microdata with Schema.org vocabularies.
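For example, a page marked up with Schema.org microdata might look like the following. This is an illustrative sketch; the actual item types and properties depend on your knowledge base:

```html
<!-- Illustrative only: any Schema.org type can be used -->
<article itemscope itemtype="https://schema.org/TechArticle">
  <h1 itemprop="headline">Deploying the Agent service</h1>
  <span itemprop="author" itemscope itemtype="https://schema.org/Person">
    <span itemprop="name">Jane Doe</span>
  </span>
  <div itemprop="articleBody">
    The Agent service is deployed behind the web extension...
  </div>
</article>
```

Because each item carries an explicit type and properties, the system can extract self-contained resources with known relationships rather than undifferentiated text.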

Why Linked Data?

Retrieval-Augmented Generation (RAG)

RAG enhances language model responses by retrieving relevant context from your knowledge base before generating answers. This grounds responses in factual information rather than relying solely on the model's training data.

The RAG Process (a code sketch follows the steps):

  1. Query: User asks a question or makes a request
  2. Retrieve: System finds relevant content using vector similarity search
  3. Augment: Retrieved content is added to the conversation context
  4. Generate: Language model produces a response informed by the retrieved context
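In code, the loop might look roughly like this. It is a minimal sketch; `embed`, `vectorIndex`, and `llm` are hypothetical stand-ins, not Copilot-LD's actual APIs:

```javascript
// Minimal RAG sketch: retrieve, augment, generate.
async function answer(question, { embed, vectorIndex, llm }) {
  // 1. Query → embedding
  const queryVector = await embed(question);
  // 2. Retrieve the most similar indexed content
  const hits = await vectorIndex.search(queryVector, { limit: 5 });
  // 3. Augment the conversation with the retrieved content
  const messages = [
    {
      role: "system",
      content: "Ground your answer in:\n" + hits.map((h) => h.text).join("\n---\n"),
    },
    { role: "user", content: question },
  ];
  // 4. Generate a response informed by the retrieved context
  return llm.complete(messages);
}
```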

Why RAG?

Microservices Architecture

Copilot-LD is built as a collection of specialized microservices that communicate via gRPC. Each service has a single, well-defined responsibility.

Why Microservices?

gRPC Communication

Services communicate using gRPC, a high-performance RPC framework with Protocol Buffers for message serialization.
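As an illustration, calling another service with the widely used @grpc/grpc-js and @grpc/proto-loader packages might look like this. The service name, method, address, and proto file are assumptions, and the project's own client wiring may differ:

```javascript
import * as grpc from "@grpc/grpc-js";
import * as protoLoader from "@grpc/proto-loader";

// Load the Protocol Buffers schema and build a typed client
const definition = protoLoader.loadSync("vector.proto");
const proto = grpc.loadPackageDefinition(definition);
const client = new proto.vector.Vector(
  "localhost:3001",
  grpc.credentials.createInsecure(),
);

// Unary RPC: Protocol Buffers serialize the request and response
client.Search({ query: "linked data" }, (err, response) => {
  if (err) throw err;
  console.log(response.results);
});
```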

Why gRPC?

Distributed Tracing

Copilot-LD implements comprehensive distributed tracing to make the agent's decision-making process observable. Each request creates a trace—a complete record of all service calls, tool executions, and timing information as the agent processes the request.

Why Tracing for Agentic Systems?

Agentic AI systems present unique observability challenges that make tracing essential.

The Tracing Model:

Each trace consists of spans—individual units of work with start/end times, attributes, and relationships to other spans (see the example after the list):

  1. Trace ID: Unique identifier shared by all spans in a request
  2. Span ID: Unique identifier for each operation
  3. Parent Span ID: Links spans into a hierarchical tree showing call relationships
  4. Span Kind: Classifies operations as SERVER (incoming), CLIENT (outgoing), or INTERNAL
  5. Attributes: Structured metadata (service name, method, resource IDs, message counts)
  6. Events: Point-in-time markers (request.sent, response.received) with additional context
  7. Status: Success or error state with optional error messages
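Put together, a single span might look like this. Field names and values are illustrative, not the exact format Copilot-LD emits:

```javascript
const span = {
  traceId: "4bf92f3577b34da6a3ce929d0e0e4736", // shared by every span in the request
  spanId: "00f067aa0ba902b7",
  parentSpanId: "a1b2c3d4e5f60718", // links this span into the call tree
  kind: "CLIENT", // SERVER | CLIENT | INTERNAL
  name: "vector.Vector/Search",
  attributes: { "service.name": "agent", "rpc.method": "Search" },
  events: [{ name: "request.sent", time: 1700000000001 }],
  status: { ok: true },
};
```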

Why Distributed Tracing?

System Capabilities

Intelligent Request Processing

The Agent service orchestrates request processing, making autonomous decisions about which tools to call and when. It doesn't follow a rigid workflow but adapts based on the conversation context and available tools.

Contextual Memory

The Memory service maintains conversation history with intelligent budgeting. It allocates token budgets between tools, context, and history to maximize relevance while respecting model limits.
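As a simplified illustration, budgeting could look like the sketch below. The 20/50/30 split and the `countTokens` helper are assumptions, not the service's actual policy:

```javascript
// Assemble a memory window without exceeding the model's token limit.
function buildWindow({ tools, context, history }, countTokens, limit = 8192) {
  const shares = { tools: 0.2, context: 0.5, history: 0.3 }; // assumed split

  // Keep items in order until the part's allowance is spent
  const take = (items, allowance) => {
    const kept = [];
    for (const item of items) {
      const cost = countTokens(item);
      if (cost > allowance) break;
      allowance -= cost;
      kept.push(item);
    }
    return kept;
  };

  return {
    tools: take(tools, Math.floor(limit * shares.tools)),
    context: take(context, Math.floor(limit * shares.context)),
    // Budget history from the most recent turn backwards
    history: take([...history].reverse(), Math.floor(limit * shares.history)).reverse(),
  };
}
```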

Semantic Search

The Vector service provides content-based semantic search, finding documents by their actual content using vector embeddings. It generates an embedding from the text query and searches it against the indexed document content.
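At its core, similarity search compares the query embedding against every indexed embedding. A brute-force sketch follows; the real index is likely more sophisticated:

```javascript
// Cosine similarity between two embedding vectors
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `limit` most similar documents from an in-memory index
function search(queryVector, index, limit = 5) {
  return index
    .map(({ id, vector }) => ({ id, score: cosine(queryVector, vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```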

Policy-Based Access Control

The Graph service enforces policy-based filtering, ensuring users only access resources they're authorized to see. Policies are defined declaratively and applied consistently across all resource access.
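For illustration, a declarative policy and its enforcement might look like this. The policy format and matching rules here are hypothetical:

```javascript
// Hypothetical declarative policies: which groups may see which resources
const policies = [
  { pattern: /^docs\/internal\//, allow: ["engineering"] },
  { pattern: /^docs\/public\//, allow: ["*"] },
];

function authorized(resourceId, userGroups) {
  const rule = policies.find((p) => p.pattern.test(resourceId));
  if (!rule) return false; // deny by default
  return rule.allow.includes("*") || rule.allow.some((g) => userGroups.includes(g));
}

// Applied consistently: every result set is filtered before it is returned
const visible = (results, user) => results.filter((r) => authorized(r.id, user.groups));
```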

Extensible Tool System

The Tool service enables the agent to execute external functions. Tools are defined using Protocol Buffers and can be added without modifying core services. The agent autonomously decides when to call tools based on conversation context.
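A sketch of how such a registry could work, with plain objects standing in for the Protocol Buffers definitions; all names are hypothetical:

```javascript
// Tools register themselves; core services never need to change
const tools = new Map();

function registerTool(name, { description, handler }) {
  tools.set(name, { description, handler });
}

registerTool("search_docs", {
  description: "Semantic search over the knowledge base",
  handler: async ({ query }) => `results for: ${query}`, // placeholder body
});

// The agent dispatches by name when the LLM requests a tool call
async function executeToolCall({ name, arguments: args }) {
  const tool = tools.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.handler(args);
}
```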

Request Flow

Understanding how a request flows through the system helps clarify how the components work together.

Online Processing (Runtime)

  1. Client Request: User sends a message through an extension (web interface, Teams bot, etc.)
  2. Agent Orchestration: Agent service receives the request and validates authentication
  3. Memory Assembly: Agent requests a memory window with conversation history and available tools
  4. Context Retrieval: Agent resolves resource identifiers to actual content, with policy filtering applied
  5. Completion Generation: Agent sends assembled context to LLM service for response generation
  6. Tool Execution: If the LLM decides to call tools, Agent executes them and continues the loop
  7. Response: Final completion is saved to memory and returned to the client

This flow is sequential per request, but multiple requests can be processed concurrently. The agent makes intelligent decisions at each step rather than following a rigid pipeline, as the sketch below illustrates.
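This is a condensed sketch of the loop; the service clients and message shapes are hypothetical stand-ins for the real gRPC clients:

```javascript
async function handleRequest(message, { memory, llm, tool }) {
  // Steps 3 and 4: assemble the memory window (history, tools, context)
  const window = await memory.getWindow(message.sessionId);
  const messages = [...window.history, message];

  for (;;) {
    // Step 5: ask the LLM for a completion
    const completion = await llm.complete({ messages, tools: window.tools });

    // Step 7: no tool calls means we have the final answer
    if (!completion.toolCalls?.length) {
      await memory.append(message.sessionId, completion);
      return completion;
    }

    // Step 6: execute requested tools, then loop for another completion
    for (const call of completion.toolCalls) {
      const result = await tool.execute(call);
      messages.push({ role: "tool", name: call.name, content: result });
    }
  }
}
```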

Offline Processing (Build Time)

Before the system can answer questions, knowledge must be processed into searchable formats:

  1. Resource Extraction: HTML files with microdata are scanned and converted to individual resource documents
  2. Embedding Creation: Content is converted to vector embeddings
  3. Index Building: Vector database is created for fast similarity search

This offline pipeline ensures runtime queries are fast—no external API calls needed during search, just in-memory vector operations.
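A sketch of the three offline steps, where `extractResources` and `embed` are hypothetical helpers for microdata parsing and embedding generation:

```javascript
import { readFile, writeFile } from "node:fs/promises";

async function buildIndex(htmlFiles, { extractResources, embed }) {
  const index = [];
  for (const file of htmlFiles) {
    const html = await readFile(file, "utf8");
    // 1. Resource extraction: one document per microdata item
    for (const resource of extractResources(html)) {
      // 2. Embedding creation: the only step that needs an external API
      const vector = await embed(resource.text);
      index.push({ id: resource.id, vector, text: resource.text });
    }
  }
  // 3. Index building: persisted so runtime search is pure in-memory work
  await writeFile("index.json", JSON.stringify(index));
}
```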

Why Separate Online and Offline Processing?

Copilot-LD deliberately separates build-time processing from runtime operations for several important reasons:

Performance

Cost Efficiency

Reliability

Architectural Principles

Radical Simplicity

Copilot-LD is built with plain JavaScript and no external dependencies beyond Node.js built-ins, a deliberate choice in favor of simplicity.

Business Logic First

Core logic lives in framework-agnostic packages (@copilot-ld/lib*) that can be imported and tested independently. Services are thin adapters that wire packages together with gRPC communication.

Benefits:

Type Safety Without TypeScript

Protocol Buffers provide type safety and schema validation without requiring TypeScript compilation. Generated JavaScript includes JSDoc types for IDE support while remaining simple JavaScript at runtime.
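For example, generated-style code can stay plain JavaScript while JSDoc gives editors and type checkers full information. The message shape below is hypothetical:

```javascript
/**
 * @typedef {object} SearchRequest
 * @property {string} query
 * @property {number} [limit] Maximum number of results
 */

/**
 * Search indexed resources.
 * @param {SearchRequest} request
 * @returns {Promise<string[]>} Matching resource identifiers
 */
async function searchResources(request) {
  const limit = request.limit ?? 5;
  // ...perform the search, returning at most `limit` identifiers...
  return [];
}
```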

Security by Design

Security is built into the architecture from the start.

Next Steps

Now that you understand the core concepts, you can explore the rest of the documentation.