Using the App¶

This document describes common use-cases and scenarios for the AI Ops App.

General Usage¶

The AI Ops App provides an AI-powered chat interface that can assist with various operational tasks. The chat assistant uses configurable LLM models from multiple providers (Ollama, OpenAI, Azure AI, Anthropic, HuggingFace, or custom) and can be extended with MCP servers to provide additional capabilities.

Basic Chat Interaction¶

Open the AI Chat Assistant from the navigation menu
Type a natural language question or request
The AI agent processes your message through configured middleware and LLM
The agent provides a response, with conversation context maintained
Continue the conversation - the agent maintains context from previous messages
Start a new conversation by refreshing the page or clearing your session

Use-cases and common workflows¶

Use Case 1: Information Retrieval¶

Scenario: You need to understand how a particular feature works in Nautobot or your infrastructure.

Workflow: 1. Open the AI Chat Assistant 2. Ask questions like: - "How do I configure a new device in Nautobot?" - "What are the best practices for IP address management?" - "Explain the difference between sites and locations in Nautobot" 3. The AI provides detailed explanations based on its training

Example Interaction:

User: What is the purpose of the LLMModel in this plugin?
AI: The LLMModel is a database model that stores configurations for 
    Large Language Models from any supported provider (Ollama, OpenAI,
    Azure AI, Anthropic, HuggingFace, or custom). It includes fields 
    for the model name, provider relationship, API endpoint, API keys 
    (via Secrets), temperature settings, cache TTL, and allows you to 
    designate a default model. The model can also have middleware 
    configurations applied for caching, logging, and other processing...

Use Case 2: Troubleshooting Assistance¶

Scenario: You encounter an issue and need guidance on debugging or resolving it.

Workflow: 1. Describe the problem to the AI Chat Assistant 2. Provide error messages or symptoms 3. Follow the AI's step-by-step troubleshooting suggestions 4. Ask follow-up questions for clarification

Use Case 3: Configuration Guidance¶

Scenario: You need help configuring the AI Ops App or understanding configuration options.

Workflow: 1. Ask about configuration parameters 2. Request examples of proper configuration 3. Get recommendations for your specific use case

Use Case 4: MCP Server Integration¶

Scenario: You want to extend the AI agent's capabilities with custom tools via MCP servers.

Workflow: 1. Configure MCP servers in the app 2. The agent automatically discovers available tools from healthy MCP servers 3. Ask the agent to perform tasks that require those tools 4. The agent uses MCP tools transparently to accomplish the task

Use Case 5: Multi-Turn Conversations¶

Scenario: You have a complex task that requires multiple steps and ongoing dialogue.

Workflow: 1. Start a conversation with the initial request 2. The AI provides information or asks clarifying questions 3. Continue the conversation with follow-ups and refinements 4. The agent maintains context throughout the session

Use Case 6: Multi-Provider LLM Management¶

Scenario: Managing multiple LLM providers and models for different purposes.

Workflow: 1. Configure LLM providers (e.g., Ollama for development, Azure AI for production) 2. Create multiple LLM models under different providers 3. Designate one model as the default for general use 4. Use provider-specific configurations for specialized tasks

Use Cases for Multiple Providers: - Ollama (llama2): Local development and testing without API costs - OpenAI (gpt-4o): Fast, production-quality responses - Azure AI (gpt-4-turbo): Enterprise deployment with Azure compliance - Anthropic (claude-3-opus): Complex reasoning tasks requiring deep analysis

Use Case 7: Middleware Configuration¶

Scenario: Applying middleware to models for caching, logging, validation, and retry logic.

Workflow: 1. Create middleware types (built-in or custom) 2. Configure middleware instances for specific models 3. Set execution priorities (lower numbers execute first) 4. Enable/disable middleware as needed without deletion

Common Middleware Chains:

Production Model (gpt-4o): - Priority 10: LoggingMiddleware (request/response tracking) - Priority 20: CacheMiddleware (reduce API calls, 1-hour TTL) - Priority 30: RetryMiddleware (3 retries with exponential backoff) - Priority 40: ValidationMiddleware (input/output validation)

Development Model (ollama:llama2): - Priority 10: LoggingMiddleware (verbose debugging) - Priority 20: ValidationMiddleware (input validation only)

Use Case 8: Provider-Specific Deployments¶

Scenario: Different environments use different LLM providers.

Deployment Examples:

Local Development:

# Ollama provider for cost-free local testing
Provider: Ollama
Model: llama2
Endpoint: http://localhost:11434
Middleware: LoggingMiddleware only

Non-Production:

# Azure AI with caching for cost control
Provider: Azure AI
Model: gpt-3.5-turbo
Endpoint: https://nonprod.openai.azure.com/
Middleware: LoggingMiddleware, CacheMiddleware (4-hour TTL)

Production:

# Azure AI with full middleware stack
Provider: Azure AI
Model: gpt-4o
Endpoint: https://prod.openai.azure.com/
Middleware: LoggingMiddleware, CacheMiddleware, RetryMiddleware, ValidationMiddleware

Tips for Effective Use¶

Getting Better Responses¶

Be Specific: Provide clear, detailed questions
Provide Context: Include relevant information about your environment
Use Follow-ups: Don't hesitate to ask for clarification or more details
Try Different Phrasings: If you don't get the answer you need, rephrase your question

Leveraging Conversation History¶

The agent remembers previous messages in the same session
Reference earlier parts of the conversation naturally
Build on previous responses to dig deeper into topics
Start fresh by clearing your session when changing topics

Provider Selection Guide¶

Different providers excel at different tasks:

Development & Testing¶

Ollama (Local) - ✓ Best for: Local development, testing without API costs - ✓ No API keys needed, completely free - ✓ Privacy - data never leaves your network - ✓ Good for: Development, experimentation, learning - ✗ Performance depends on local hardware - ✗ Limited to available open-source models

Example Use Cases:

- Testing new prompts and configurations
- Learning how the AI assistant works
- Development without internet connectivity
- Privacy-sensitive environments

General Production Use¶

OpenAI - ✓ Best for: Fast responses, general-purpose tasks - ✓ Excellent for code generation and explanations - ✓ Regular model updates with improvements - ✓ Good documentation and community support - $ Pay-per-use pricing - ⚠️ May have availability issues during high demand

Example Use Cases:

- General network queries and troubleshooting
- Configuration generation and validation
- Documentation assistance
- Quick operational questions

Enterprise Production¶

Azure AI - ✓ Best for: Enterprise deployments with SLAs - ✓ Microsoft compliance standards (HIPAA, SOC 2, etc.) - ✓ Regional deployments for data residency - ✓ Integration with Azure ecosystem - ✓ Private endpoints and VNet integration - $ Higher cost than direct OpenAI - ⚠️ Requires Azure subscription and setup

Example Use Cases:

- Regulated industry deployments
- Enterprise-scale production workloads
- Integration with Azure services
- Multi-region high-availability deployments

Complex Reasoning¶

Anthropic (Claude) - ✓ Best for: Complex analysis and reasoning tasks - ✓ Exceptional context understanding - ✓ Largest context window (200K tokens) - ✓ Strong analytical capabilities - $ Higher cost per token - ⚠️ May be slower than other providers

Example Use Cases:

- Complex troubleshooting scenarios
- Multi-step operational procedures
- Analyzing large configuration files
- Strategic planning and recommendations

Open Source & Specialized¶

HuggingFace - ✓ Best for: Access to specialized open-source models - ✓ Self-hosting options for data control - ✓ Wide variety of models for specific tasks - ✓ Cost-effective with dedicated endpoints - ⚠️ Variable quality across models - ⚠️ May require more setup

Example Use Cases:

- Domain-specific fine-tuned models
- Self-hosted deployments for compliance
- Experimentation with cutting-edge models
- Cost optimization with smaller models

Middleware Best Practices¶

Logging: Always enable for production debugging
Caching: Use for models with API costs to reduce expenses
Set longer TTL (4-24 hours) for stable data
Set shorter TTL (5-60 minutes) for dynamic data
Retry: Critical for production reliability
Enable for cloud providers (transient failures)
Not needed for local Ollama
Validation: Essential for security and data integrity
PII Redaction: Required for sensitive data handling
Priority Order: Logging (10) → Validation (15) → Cache (20) → Retry (30)

Cost Optimization Strategies¶

For OpenAI and Azure AI¶

1. Enable CacheMiddleware with appropriate TTL
2. Use lower temperature for deterministic responses (fewer retries)
3. Implement request validation to prevent malformed queries
4. Monitor token usage via middleware
5. Consider GPT-3.5-turbo for simpler queries
6. Set up billing alerts in provider console

For Anthropic¶

1. Leverage large context window to avoid multiple requests
2. Use Claude Haiku for simple, fast responses
3. Reserve Claude Opus for complex reasoning tasks
4. Enable caching for repeated analysis

Mixed Strategy¶

Development: Ollama (free)
Staging: OpenAI GPT-3.5-turbo (cost-effective)
Production: Azure AI GPT-4o (enterprise)
Complex Analysis: Anthropic Claude 3 Opus (specialized)

Limitations and Considerations¶

Model Knowledge Cutoff: LLM models have a training data cutoff date
Context Window: Very long conversations may exceed context limits
Rate Limits: LLM provider APIs have rate limits that may affect response times
Provider Costs: OpenAI, Azure AI, and Anthropic have per-token costs
Ollama Performance: Local models may be slower than cloud providers
Accuracy: Always verify critical information from AI responses
MCP Server Dependency: Some capabilities require healthy MCP servers
Middleware Overhead: Each middleware adds processing time
Network Latency: Cloud providers require internet connectivity

Provider-Specific Notes¶

Ollama¶

✓ Free and private
✓ No rate limits
✗ Requires local installation and sufficient hardware
✗ Performance depends on CPU/GPU
✗ Limited to available open-source models
Best Models: llama2 (general), mistral (efficient), codellama (code)

OpenAI¶

✓ Fast response times
✓ Latest models (GPT-4o, GPT-4-turbo)
✗ Pay-per-use pricing ($0.03-0.30/1K tokens)
✗ May have availability issues during peak times
✗ Regular model deprecations
Best Models: gpt-4o (balanced), gpt-4-turbo (speed), gpt-3.5-turbo (cost)

Azure AI¶

✓ Enterprise SLAs (99.9% uptime)
✓ Regional deployments
✓ Microsoft compliance standards
✓ Private Link support
✗ More expensive than OpenAI direct
✗ Requires Azure subscription
✗ Model availability varies by region
Best Models: gpt-4o (production), gpt-4-turbo (balanced)

Anthropic¶

✓ Strong analytical capabilities
✓ Longer context windows (200K tokens)
✓ Good for complex reasoning
✗ Higher cost per token ($0.015-0.225/1K tokens)
✗ Smaller model selection
✗ May be slower for simple queries
Best Models: claude-3-opus (complex), claude-3-sonnet (balanced), claude-3-haiku (fast)

HuggingFace¶

✓ Wide model selection
✓ Self-hosting options
✓ Free tier available
✗ Variable quality across models
✗ May require technical expertise
✗ Inference API rate limits
Best Models: Llama 2 (general), Mistral (efficient), CodeLlama (code)

Real-World Usage Examples¶

Example 1: Daily Operations (Development)¶

Environment: Development
Provider: Ollama
Model: llama2
Middleware: LoggingMiddleware only
Cost: $0/month

Usage:
- Testing new configurations
- Learning the system
- Experimenting with prompts
- No API costs

Example 2: Production Operations (Enterprise)¶

Environment: Production
Provider: Azure AI
Model: gpt-4o
Middleware: LoggingMiddleware, CacheMiddleware, RetryMiddleware, ValidationMiddleware
Cost: ~$500/month (10K requests, avg 1K tokens)

Usage:
- Troubleshooting network issues
- Configuration assistance
- Documentation queries
- Compliance requirements met

Example 3: Complex Analysis (Specialized)¶

Environment: Production
Provider: Anthropic
Model: claude-3-opus
Middleware: LoggingMiddleware, RetryMiddleware
Cost: ~$200/month (2K requests, avg 2K tokens)

Usage:
- Root cause analysis
- Strategic planning
- Policy recommendations
- Large document analysis

Example 4: Multi-Provider Strategy¶

Development: Ollama llama2 (free)
Testing: OpenAI gpt-3.5-turbo (cost-effective)
Production Simple: OpenAI gpt-4o (fast)
Production Complex: Anthropic claude-3-opus (reasoning)
Total Cost: ~$600/month

Benefit: Right tool for each job, cost optimization

For detailed configuration instructions, refer to the Provider Configuration Guide, Middleware Configuration Guide, and MCP Server Configuration Guide.