AI Agents¶
This page documents the AI agent implementations in the AI Ops App.
Overview¶
The AI Ops App uses LangGraph to create stateful AI agents that can interact with users and external systems. The app provides two agent implementations:
- Multi-MCP Agent: Production agent supporting multiple MCP servers (recommended)
- Single-MCP Agent: Simplified agent for single MCP server scenarios
Multi-MCP Agent¶
The Multi-MCP Agent is the production-ready implementation that supports connecting to multiple Model Context Protocol servers simultaneously.
Key Features¶
- Multiple MCP Server Support: Connect to any number of MCP servers
- Application-Level Caching: Caches MCP client and tools for performance
- Health-Based Server Selection: Only uses servers with "Healthy" status
- Automatic Tool Discovery: Discovers tools from all healthy MCP servers
- Checkpointing: Maintains conversation history using Redis
- Graceful Degradation: Continues working even if some MCP servers fail
Architecture¶
User Message → Multi-MCP Agent → LangGraph State Graph
                                           ↓
                                  Azure OpenAI Model
                                           ↓
                               ┌───────────┴───────────┐
                               ↓                       ↓
                         MCP Server 1            MCP Server 2
                       (Tools A, B, C)         (Tools D, E, F)
                               ↓                       ↓
                               └───────────┬───────────┘
                                           ↓
                                       Response
Core Functions¶
get_or_create_mcp_client¶
async def get_or_create_mcp_client(
    force_refresh: bool = False
) -> Tuple[Optional[MultiServerMCPClient], List]:
    """Get or create MCP client with application-level caching.

    Args:
        force_refresh: Force cache refresh even if not expired

    Returns:
        Tuple of (client, tools) or (None, []) if no healthy servers
    """
Cache Behavior (a minimal sketch follows the server-selection list below):
- Cache TTL: 5 minutes (300 seconds)
- Thread-safe with an asyncio lock
- Invalidated on server status changes
- Force refresh available when needed
Server Selection:
- Queries for servers with status="Healthy"
- Protocol must be "http"
- Failed servers automatically excluded
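At a high level, the cache-and-select behavior works like this. The sketch below is illustrative only, assuming a module-level cache dict and lock; _fetch_healthy_servers_and_tools is a hypothetical helper standing in for the real server query and connection logic:
import asyncio
import time

CACHE_TTL_SECONDS = 300  # matches the Configuration section below
_cache = {"client": None, "tools": [], "timestamp": None}  # illustrative cache dict
_cache_lock = asyncio.Lock()

async def get_or_create_mcp_client_sketch(force_refresh: bool = False):
    """Sketch of the TTL-plus-lock caching pattern described above."""
    async with _cache_lock:
        now = time.time()
        fresh = (
            _cache["client"] is not None
            and _cache["timestamp"] is not None
            and now - _cache["timestamp"] < CACHE_TTL_SECONDS
        )
        if fresh and not force_refresh:
            return _cache["client"], _cache["tools"]

        # Rebuild: query servers with status="Healthy" and protocol="http",
        # connect, and discover their tools (helper name is hypothetical).
        client, tools = await _fetch_healthy_servers_and_tools()
        if client is None:
            return None, []  # no healthy servers -> agent runs without MCP tools
        _cache.update({"client": client, "tools": tools, "timestamp": now})
        return client, tools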
warm_mcp_cache¶
Called during app initialization to pre-populate the MCP cache, reducing first-request latency; a minimal startup sketch follows.
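A minimal startup sketch, assuming warm_mcp_cache is an async callable that takes no arguments (check the actual signature in ai_ops/agents/multi_mcp_agent.py):
from ai_ops.agents.multi_mcp_agent import warm_mcp_cache

async def on_app_startup():
    # Pre-populate the MCP client/tool cache so the first chat request is fast.
    # Assumption: warm_mcp_cache is async and takes no arguments.
    await warm_mcp_cache()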
process_message¶
async def process_message(
    user_message: str,
    thread_id: str,
    checkpointer=None
) -> str:
    """Process a user message through the multi-MCP agent.

    Args:
        user_message: The user's input message
        thread_id: Unique identifier for conversation thread
        checkpointer: LangGraph checkpointer for state persistence

    Returns:
        The agent's response as a string
    """
Message Processing Flow (sketched below):
1. Get or create the cached MCP client
2. Retrieve the LLM model configuration
3. Create the LangGraph state graph
4. Process the message with conversation history
5. Return the agent's response
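The same flow, sketched with the helpers documented on this page (illustrative only; build_agent is described under "Agent Integration" below and its import path is omitted here):
from langchain_core.messages import HumanMessage

from ai_ops.agents.multi_mcp_agent import get_or_create_mcp_client

async def process_message_sketch(user_message: str, thread_id: str, checkpointer=None) -> str:
    # 1. Cached MCP client and tools ((None, []) when no servers are healthy)
    _client, tools = await get_or_create_mcp_client()
    # 2-3. Model configuration and LangGraph state graph; build_agent() is shown
    #      later under "Agent Integration" (how it receives the tools is internal).
    graph = await build_agent(checkpointer=checkpointer)
    # 4. Run the graph with the conversation history for this thread_id
    result = await graph.ainvoke(
        {"messages": [HumanMessage(content=user_message)]},
        config={"configurable": {"thread_id": thread_id}},
    )
    # 5. The last message in the returned state is the agent's reply
    return result["messages"][-1].content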
State Management¶
The agent uses MessagesState for conversation tracking:
class MessagesState(TypedDict):
    """State for the agent graph."""
    messages: Annotated[List[BaseMessage], add_messages]
The add_messages reducer:
- Properly accumulates messages
- Works with checkpointers for persistence
- Maintains conversation context
Configuration¶
Cache Settings¶
# Cache TTL: 5 minutes
CACHE_TTL_SECONDS = 300

# Cache structure
_mcp_client_cache = {
    "client": None,
    "tools": None,
    "timestamp": None,
    "server_count": 0,
}
HTTP Client Configuration¶
import httpx

def httpx_client_factory(**kwargs):
    """Factory for httpx client with SSL verification disabled.

    Note: verify=False is intentional for internal MCP servers
    with self-signed certificates.
    """
    return httpx.AsyncClient(verify=False, timeout=30.0, **kwargs)
Usage Example¶
from ai_ops.agents.multi_mcp_agent import process_message
from ai_ops.checkpointer import get_checkpointer

# Process a message with conversation history
async with get_checkpointer() as checkpointer:
    response = await process_message(
        user_message="What is the status of my network?",
        thread_id="user-session-123",
        checkpointer=checkpointer,
    )
    print(response)
Error Handling¶
The agent handles various error scenarios:
No Healthy MCP Servers:
- Returns None for the client
- The agent continues without MCP tools
- A warning is logged

MCP Server Connection Failures:
- Failed servers are excluded from operations
- The cache is updated to reflect the failures
- The agent uses the remaining healthy servers

LLM API Errors:
- Errors are propagated to the caller
- Consider implementing retry logic (see the sketch below)
- Check Azure OpenAI rate limits
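A simple, hedged retry wrapper around process_message (illustrative; narrow the exception type and tune the backoff to your Azure OpenAI limits):
import asyncio

from ai_ops.agents.multi_mcp_agent import process_message

async def process_with_retry(user_message, thread_id, checkpointer=None, attempts=3):
    """Retry transient failures with exponential backoff (illustrative only)."""
    for attempt in range(1, attempts + 1):
        try:
            return await process_message(user_message, thread_id, checkpointer=checkpointer)
        except Exception:  # narrow this to your provider's transient/rate-limit errors
            if attempt == attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # back off before the next attempt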
Single-MCP Agent¶
The Single-MCP Agent is a simplified implementation for scenarios with only one MCP server.
Key Features¶
- Single Server Focus: Designed for one MCP server
- Simpler Configuration: Less complex than multi-server setup
- Same LangGraph Architecture: Uses LangGraph state management
- Production-Ready: Suitable for focused use cases
When to Use¶
Use the Single-MCP Agent when:
- You have only one MCP server
- A simpler architecture is preferred
- You want explicit server selection
- You are in a testing or development scenario

Use the Multi-MCP Agent when:
- You have multiple MCP servers
- Dynamic server management is needed
- You are running a production deployment that needs to scale
- Automatic failover is desired
System Prompts¶
The agents use system prompts to define their behavior. Prompts can be managed via the UI or stored in code files.
Dynamic System Prompt Loading¶
The get_active_prompt() helper function loads prompts using a fallback hierarchy:
from ai_ops.helpers.get_prompt import get_active_prompt
# Load prompt for a specific model
system_prompt = get_active_prompt(llm_model)
Fallback Hierarchy (sketched below):
1. Model-Assigned Prompt: used if the LLM Model has a system_prompt FK with "Approved" status
2. Global File-Based Prompt: the first approved prompt with is_file_based=True
3. Code Fallback: the built-in get_multi_mcp_system_prompt() function
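A rough sketch of this hierarchy, using the helper names from the table in the next subsection; the ORM query shape and field names (system_prompt, status__name) are assumptions, not the actual implementation:
def get_active_prompt_sketch(llm_model):
    # 1. Model-assigned prompt (field and status names are assumptions)
    assigned = getattr(llm_model, "system_prompt", None)
    if assigned and assigned.status.name == "Approved":
        return _load_prompt_content(assigned, llm_model.name)

    # 2. First approved, file-based global prompt (query shape is an assumption)
    global_prompt = SystemPrompt.objects.filter(
        is_file_based=True, status__name="Approved"
    ).first()
    if global_prompt:
        return _load_prompt_content(global_prompt, llm_model.name)

    # 3. Code-based fallback
    return _get_fallback_prompt(llm_model.name)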
Prompt Helper Functions¶
The ai_ops/helpers/get_prompt.py module provides:
| Function | Description |
|---|---|
| get_active_prompt(llm_model) | Main entry point - loads prompt with fallback hierarchy |
| _load_prompt_content(prompt_obj, model_name) | Loads content from file or database |
| _render_prompt_variables(prompt_text, model_name) | Substitutes runtime variables |
| _get_fallback_prompt(model_name) | Returns code-based fallback prompt |
Template Variables¶
Runtime variables are substituted in prompt text:
| Variable | Description | Example |
|---|---|---|
| {current_date} | Current date in "Month DD, YYYY" format | January 13, 2026 |
| {current_month} | Current month in "Month YYYY" format | January 2026 |
| {model_name} | Name of the LLM model | gpt-4o |
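A minimal sketch of the substitution, assuming plain str.format-style placeholders as shown above (the real logic lives in _render_prompt_variables):
from datetime import date

def render_prompt_variables_sketch(prompt_text: str, model_name: str) -> str:
    today = date.today()
    return prompt_text.format(
        current_date=today.strftime("%B %d, %Y"),   # e.g. "January 13, 2026"
        current_month=today.strftime("%B %Y"),      # e.g. "January 2026"
        model_name=model_name,                      # e.g. "gpt-4o"
    )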
Multi-MCP System Prompt¶
For file-based prompts, the default implementation is in:
# ai_ops/prompts/multi_mcp_system_prompt.py
SYSTEM_PROMPT = """
You are a helpful AI assistant powered by Azure OpenAI...
"""
The multi-MCP prompt:
- Explains multi-server capabilities
- Provides guidance on tool usage
- Sets expectations for responses
- Defines the assistant personality
Single System Prompt¶
A simpler prompt intended for single-server scenarios.
Customizing Prompts¶
Prompts can be customized in two ways:
Option 1: Via Nautobot UI (Recommended)
- Navigate to AI Platform > LLM > System Prompts
- Create a new prompt with your custom instructions
- Set status to "Approved"
- Optionally assign to a specific model
Option 2: Code-Based (for version control)
- Create a Python file in ai_ops/prompts/
- Define a get_<filename>() function
- Create a SystemPrompt record with is_file_based=True
Example file-based prompt:
# ai_ops/prompts/network_specialist.py

def get_network_specialist(model_name: str = "AI Assistant") -> str:
    """Return the network specialist system prompt."""
    return f"""You are {model_name}, a network operations AI assistant.

Your capabilities include:
- Analyzing network configurations
- Troubleshooting connectivity issues
- Suggesting automation improvements

Always follow RFC standards when applicable.
"""
Agent Integration¶
The build_agent() function automatically loads the appropriate prompt:
async def build_agent(llm_model=None, checkpointer=None, provider=None):
    # ... setup code ...

    # Get system prompt from database or fall back to the code-based prompt
    system_prompt = await sync_to_async(get_active_prompt)(llm_model)

    # Create agent with the loaded prompt
    graph = create_agent(
        model=llm,
        tools=tools,
        system_prompt=system_prompt,
        middleware=middleware,
        checkpointer=checkpointer,
    )
    return graph
LangGraph Integration¶
State Graph Structure¶
Both agents use LangGraph's StateGraph:
from langgraph.graph import StateGraph, START

# Create graph
workflow = StateGraph(MessagesState)

# Add nodes
workflow.add_node("agent", agent_node)
workflow.add_node("tools", tool_node)

# Add edges
workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue)
workflow.add_edge("tools", "agent")

# Compile with checkpointer
graph = workflow.compile(checkpointer=checkpointer)
Message Flow¶
- START → agent: Initial message routing
- agent → tools (conditional): If tool calls needed
- tools → agent: Tool results fed back
- agent → END: Final response
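The conditional edge above is typically implemented with a small router function following the standard LangGraph pattern (not necessarily this app's exact code):
from langgraph.graph import END

def should_continue(state: MessagesState):
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "tools"  # the model requested tool calls -> route to the tool node
    return END          # no tool calls -> end the graph with the model's answer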
Checkpointing¶
Conversation state is persisted using Redis:
from ai_ops.checkpointer import get_checkpointer
from langchain_core.messages import HumanMessage

async with get_checkpointer() as checkpointer:
    result = await graph.ainvoke(
        {"messages": [HumanMessage(content=user_message)]},
        config={"configurable": {"thread_id": thread_id}},
    )
Thread IDs:
- Unique identifier per conversation
- Typically uses the session ID
- Enables multi-user support
- Isolates conversations
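For example, in a web context the thread_id can be derived from the user's session (the session attribute shown is Django's and is illustrative here):
# Derive a per-user thread_id from the web session so conversations stay isolated.
thread_id = f"chat-{request.session.session_key}"
response = await process_message(user_message, thread_id, checkpointer=checkpointer)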
MCP Client Integration¶
MultiServerMCPClient¶
The langchain-mcp-adapters library provides MCP integration:
from langchain_mcp_adapters.client import MultiServerMCPClient

# Create client
client = MultiServerMCPClient(
    connections={
        "server1": {"url": "https://mcp1.example.com"},
        "server2": {"url": "https://mcp2.example.com"},
    },
    httpx_client_factory=httpx_client_factory,
)

# Discover tools from all configured servers
tools = await client.get_tools()
Tool Discovery¶
Tools are automatically discovered from MCP servers:
# Tools include metadata
for tool in tools:
    print(f"Tool: {tool.name}")
    print(f"Description: {tool.description}")
    print(f"Schema: {tool.args_schema}")
Tool Execution¶
LangGraph automatically handles tool execution:
- Agent decides to call tool
- ToolNode executes tool call
- Results returned to agent
- Agent incorporates results in response
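The tool node is typically LangGraph's prebuilt ToolNode wrapped around the discovered MCP tools (a standard pattern, shown here as a sketch):
from langgraph.prebuilt import ToolNode

# Wrap the discovered MCP tools so the graph can execute tool calls.
tool_node = ToolNode(tools)
workflow.add_node("tools", tool_node)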
Performance Considerations¶
Caching Strategy¶
Why Cache?
- MCP client initialization is expensive
- Tool discovery requires network calls
- Multiple users share the same servers
Cache Invalidation:
- Time-based (5 minute TTL)
- Manual refresh via force_refresh=True
- Server status changes (handled by health checks)
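For example, to rebuild the cache immediately after changing MCP server records:
# Bypass the TTL and rebuild the client/tool cache right away.
client, tools = await get_or_create_mcp_client(force_refresh=True)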
Async Architecture¶
All agent operations are async:
# Good - async/await
async def handle_message(message):
    response = await process_message(message, thread_id)
    return response

# Bad - calling the coroutine without await
def handle_message(message):
    # This won't work: process_message is async, so calling it without await
    # only returns a coroutine object that never executes.
    response = process_message(message, thread_id)
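If the agent must be called from synchronous code (for example, a management command), bridge explicitly with asyncio.run instead of calling the coroutine directly; note this only works when no event loop is already running:
import asyncio

def handle_message_sync(message, thread_id):
    # Run the async agent call to completion from synchronous code.
    return asyncio.run(process_message(message, thread_id))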
Rate Limiting¶
Consider Azure OpenAI rate limits:
- Monitor API usage
- Implement retry logic
- Use appropriate models for workload
- Request quota increases if needed
Testing Agents¶
Unit Testing¶
import pytest

from ai_ops.agents.multi_mcp_agent import get_or_create_mcp_client

@pytest.mark.asyncio
async def test_mcp_client_cache():
    # First call - cache miss
    client1, tools1 = await get_or_create_mcp_client()

    # Second call - cache hit
    client2, tools2 = await get_or_create_mcp_client()

    # Should return the same cached client
    assert client1 is client2
Integration Testing¶
@pytest.mark.asyncio
async def test_process_message():
    from ai_ops.agents.multi_mcp_agent import process_message
    from ai_ops.checkpointer import get_checkpointer

    async with get_checkpointer() as checkpointer:
        response = await process_message(
            user_message="Hello",
            thread_id="test-thread",
            checkpointer=checkpointer,
        )

    assert isinstance(response, str)
    assert len(response) > 0
Troubleshooting¶
Agent Not Responding¶
Check these items:
- LLM Model Configuration: Verify default model exists
- Azure Connectivity: Test API endpoint access
- Logs: Review for error messages
- Permissions: Ensure proper API key permissions
MCP Tools Not Available¶
Verify:
- Server Health: Check MCP server status
- Cache State: Try force refresh
- Network: Test server URL accessibility
- Protocol: Ensure HTTP protocol selected
Conversation History Lost¶
Check:
- Redis Connection: Verify Redis is running
- Thread IDs: Ensure consistent thread_id usage
- Checkpointer: Confirm checkpointer passed correctly
- Cleanup Job: Check if cleanup removed history
Best Practices¶
Agent Usage¶
- Use Multi-MCP Agent: For production deployments
- Implement Error Handling: Wrap agent calls in try/except
- Monitor Performance: Track response times and errors
- Cache Awareness: Understand caching behavior
Prompt Engineering¶
- Be Specific: Clear instructions in system prompts
- Provide Context: Include relevant background
- Set Boundaries: Define what agent should/shouldn't do
- Test Thoroughly: Validate prompt changes
Production Deployment¶
- Scale Redis: Ensure adequate Redis capacity
- Monitor Rate Limits: Watch Azure OpenAI usage
- Health Checks: Regular MCP server monitoring
- Logging: Comprehensive logging for debugging