Agentor provides comprehensive observability through automatic tracing, allowing you to monitor agent behavior, debug issues, and optimize performance in production.

Why Observability?

  • Debug Faster: See exactly what your agent did and why
  • Optimize Performance: Identify bottlenecks and slow operations
  • Monitor Costs: Track token usage and API calls
  • Improve Quality: Analyze successful vs failed interactions
  • Trace Multi-Agent Workflows: Follow complex flows across agents

Quick Start

Observability is automatically enabled when you have a Celesto API key:
import os
from agentor import Agentor

# Set your API key
os.environ["CELESTO_API_KEY"] = "your-api-key"

# Tracing is automatically enabled
agent = Agentor(
    name="My Agent",
    model="gpt-5-mini",
    tools=["get_weather"]
)

result = agent.run("What's the weather in Paris?")

# View traces at: https://celesto.ai/observe
Get your API key from the Celesto Dashboard.

Automatic Tracing

Agentor automatically captures:
  • LLM Calls: Model name, tokens, latency, cost
  • Tool Calls: Which tools were called and their results
  • Agent Handoffs: Multi-agent communication flows
  • Errors: Exception details and stack traces
  • Timing: Duration of each operation
  • Input/Output: Messages and responses
from agentor import Agentor

agent = Agentor(
    name="Research Agent",
    model="gpt-5-mini",
    tools=["web_search"]
)

# This run is automatically traced
result = agent.run("Research the latest developments in quantum computing")

# All operations are captured:
# - Initial LLM call
# - Tool calls (web_search)
# - Follow-up LLM calls
# - Final response

Manual Tracing Setup

For more control, explicitly enable tracing:
from agentor import Agentor

agent = Agentor(
    name="Production Agent",
    model="gpt-5-mini",
    enable_tracing=True  # Explicitly enable
)
Disable automatic tracing:
import os

# Disable even with CELESTO_API_KEY set
os.environ["CELESTO_DISABLE_AUTO_TRACING"] = "true"

agent = Agentor(name="Agent", model="gpt-5-mini")
# No automatic tracing

Custom Tracing

For advanced use cases, configure tracing manually:
from agentor.tracer import setup_celesto_tracing, get_run_config
from agentor import Agentor
from agents import Runner
import os

# Setup tracing
processor = setup_celesto_tracing(
    endpoint="https://api.celesto.ai/traces/ingest",
    token=os.environ.get("CELESTO_API_KEY"),
    batch_delay=1.0,        # Seconds before flushing batch
    max_batch_size=256,     # Max traces per batch
    replace_default=True    # Replace OpenAI's default tracing
)

# Create agent
agent = Agentor(name="Agent", model="gpt-5-mini")

# Run with custom config (inside an async function)
result = await Runner.run(
    agent.agent,
    "Your query",
    context=get_run_config(
        group_id="session-123",  # Group related traces
        metadata={               # Custom metadata
            "user_id": "user-456",
            "environment": "production"
        }
    )
)

# Ensure traces are sent before exit
processor.force_flush()
processor.shutdown()

Grouping Traces

Group related operations (conversations, sessions):
from agentor import Agentor
from agentor.tracer import get_run_config
from agents import Runner
import uuid

# Create a session ID
session_id = str(uuid.uuid4())

agent = Agentor(name="Agent", model="gpt-5-mini")

# Example conversation
conversation = ["Hello!", "What's the weather in Paris?"]

# All runs with the same group_id are grouped together
for turn, user_message in enumerate(conversation):
    result = await Runner.run(
        agent.agent,
        user_message,
        context=get_run_config(
            group_id=session_id,
            metadata={"turn": turn}
        )
    )

Adding Metadata

Enrich traces with custom metadata:
from agentor.tracer import get_run_config
from agents import Runner

result = await Runner.run(
    agent.agent,
    user_input,
    context=get_run_config(
        metadata={
            "user_id": "user-123",
            "session_id": "session-456",
            "environment": "production",
            "version": "v2.1.0",
            "feature_flags": ["new-ui", "beta-features"],
            "user_tier": "premium"
        }
    )
)

Viewing Traces

Access your traces in the Celesto dashboard:
  1. Visit https://celesto.ai/observe
  2. Log in with your account
  3. View traces in real-time

Trace Details

Each trace shows:
  • Timeline: Visual representation of operations
  • Spans: Individual operations (LLM calls, tool calls)
  • Tokens: Input/output tokens per call
  • Cost: Estimated cost per operation
  • Latency: Time spent in each operation
  • Errors: Any exceptions or failures
  • Metadata: Custom metadata you added

Filtering Traces

Filter by:
  • Agent name
  • Time range
  • Status (success/failure)
  • Group ID (session)
  • Custom metadata
  • Token usage
  • Cost

Monitoring Patterns

Track Token Usage

import asyncio
from agentor import Agentor
from agentor.tracer import get_run_config
from agents import Runner

agent = Agentor(name="Agent", model="gpt-5-mini")

# Example batch of prompts
batch_prompts = ["Summarize topic A", "Summarize topic B"]

async def track_usage():
    results = []

    for i, prompt in enumerate(batch_prompts):
        result = await Runner.run(
            agent.agent,
            prompt,
            context=get_run_config(
                group_id="batch-job-001",
                metadata={"batch_index": i}
            )
        )
        results.append(result)
    
    # View token usage in dashboard by group_id
    return results

asyncio.run(track_usage())

Monitor Error Rates

from agentor import Agentor
from agentor.tracer import get_run_config
from agents import Runner
import logging

logger = logging.getLogger(__name__)

agent = Agentor(name="Agent", model="gpt-5-mini")

async def monitored_run(user_input, user_id):
    try:
        result = await Runner.run(
            agent.agent,
            user_input,
            context=get_run_config(
                metadata={
                    "user_id": user_id,
                    "input_length": len(user_input)
                }
            )
        )
        return result
    except Exception as e:
        logger.error(f"Agent error for user {user_id}: {e}")
        # Error is automatically captured in traces
        raise
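
The dashboard surfaces error rates directly, but a lightweight in-process counter can complement it for local alerting. A minimal sketch using only the standard library (the window size and threshold behavior here are illustrative assumptions, not part of Agentor):

```python
from collections import deque

class ErrorRateMonitor:
    """Tracks the failure rate over the last `window` runs."""

    def __init__(self, window: int = 100):
        self.outcomes = deque(maxlen=window)  # True = failed run

    def record(self, failed: bool) -> None:
        self.outcomes.append(failed)

    @property
    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

monitor = ErrorRateMonitor(window=4)
for failed in [False, False, True, False]:
    monitor.record(failed)

print(monitor.error_rate)  # 1 failure out of 4 runs -> 0.25
```

You could call `monitor.record(True)` in the `except` branch of `monitored_run` and `monitor.record(False)` on success, then log or page when `error_rate` crosses a threshold.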

A/B Testing

import random
from agentor import Agentor
from agentor.tracer import get_run_config
from agents import Runner

async def run_ab_test(user_input, user_id):
    variant = "A" if random.random() < 0.5 else "B"
    
    # Different instructions for each variant
    instructions = {
        "A": "You are a concise assistant.",
        "B": "You are a detailed assistant."
    }
    
    agent = Agentor(
        name=f"Agent-{variant}",
        model="gpt-5-mini",
        instructions=instructions[variant]
    )
    
    result = await Runner.run(
        agent.agent,
        user_input,
        context=get_run_config(
            metadata={
                "variant": variant,
                "user_id": user_id
            }
        )
    )
    
    # Compare variants in dashboard
    return result

Multi-Agent Tracing

import asyncio
from agentor import Agentor
from agentor.tracer import get_run_config
from agents import Runner
import uuid

research_agent = Agentor(name="Research", model="gpt-5-mini")
writing_agent = Agentor(name="Writing", model="gpt-5-mini")
review_agent = Agentor(name="Review", model="gpt-5-mini")

async def traced_workflow(topic):
    workflow_id = str(uuid.uuid4())
    
    # All operations share the same group_id
    config = get_run_config(
        group_id=workflow_id,
        metadata={"workflow": "content-creation", "topic": topic}
    )
    
    # Step 1: Research (traced)
    research = await Runner.run(
        research_agent.agent,
        f"Research {topic}",
        context=config
    )
    
    # Step 2: Write (traced)
    draft = await Runner.run(
        writing_agent.agent,
        f"Write about {topic} using: {research.final_output}",
        context=config
    )
    
    # Step 3: Review (traced)
    final = await Runner.run(
        review_agent.agent,
        f"Review and improve: {draft.final_output}",
        context=config
    )
    
    # View complete workflow in dashboard by workflow_id
    return final.final_output

Performance Optimization

Use traces to identify bottlenecks:
1. Identify Slow Operations

View the timeline in the dashboard to find:
  • Slow LLM calls (switch to a faster model?)
  • Slow tool calls (optimize tool code)
  • Unnecessary tool calls (improve instructions)

2. Optimize Token Usage

Check token counts:
  • High input tokens → reduce prompt length
  • High output tokens → add a max_tokens limit
  • Many calls → better instructions to reduce iterations

3. Reduce Costs

Analyze cost per operation:
  • Use cheaper models for simple tasks
  • Cache tool results when possible
  • Batch operations to reduce overhead

4. Fix Errors

Find common failure patterns:
  • Which prompts fail most?
  • Which tools have errors?
  • What error messages appear?
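
Of these, caching tool results is easy to sketch in plain Python. A minimal example using `functools.lru_cache` (the `get_weather` function here is a hypothetical tool, not part of Agentor):

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=128)
def get_weather(city: str) -> str:
    """Hypothetical tool: an expensive lookup, cached per city."""
    global call_count
    call_count += 1
    return f"Sunny in {city}"

get_weather("Paris")
get_weather("Paris")   # served from the cache, no second lookup
get_weather("London")

print(call_count)  # only 2 underlying lookups for 3 calls
```

Cache keys must be hashable, and results that change over time (like weather) should only be cached with a short TTL in a real deployment.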
Best Practices

1. Always Use Group IDs

Group related operations:
# Good - trackable session
config = get_run_config(group_id=session_id)

# Less useful - isolated traces
config = get_run_config()  # No group_id

2. Add Meaningful Metadata

# Good - rich context
metadata = {
    "user_id": user_id,
    "user_tier": "premium",
    "feature": "research",
    "version": "v2"
}

# Less useful - minimal context
metadata = {"timestamp": time.time()}

3. Flush Before Exit

For scripts and batch jobs:
from agentor.tracer import setup_celesto_tracing

processor = setup_celesto_tracing(
    endpoint="https://api.celesto.ai/traces/ingest",
    token=api_key
)

try:
    # Your agent code
    pass
finally:
    processor.force_flush()  # Ensure traces are sent
    processor.shutdown()

4. Monitor Production Continuously

Set up alerts in the dashboard for:
  • Error rate thresholds
  • High-latency operations
  • Unusual token usage
  • Cost spikes

5. Disable Tracing in Tests

import os

# In test setup
os.environ["CELESTO_DISABLE_AUTO_TRACING"] = "true"

Privacy and Security

Sensitive Data

Tracing includes input/output by default. For sensitive data:
from agents import RunConfig

# Disable sensitive data capture
config = RunConfig(
    trace_include_sensitive_data=False  # Don't trace messages
)

result = await Runner.run(agent.agent, user_input, context=config)

Data Retention

Traces are stored according to your Celesto plan:
  • Free tier: 7 days
  • Pro tier: 30 days
  • Enterprise: Custom retention

Troubleshooting

Traces Not Appearing

Check:
  1. API key is set correctly
  2. Network connectivity to Celesto
  3. No firewall blocking outbound requests
  4. Traces are flushed (for scripts)
# Debug tracing
import logging
logging.basicConfig(level=logging.DEBUG)

from agentor.tracer import setup_celesto_tracing
processor = setup_celesto_tracing(
    endpoint="https://api.celesto.ai/traces/ingest",
    token=api_key
)

High Latency

Tracing adds minimal overhead (typically under 10 ms). If you experience issues:
# Increase batch delay to reduce send frequency
processor = setup_celesto_tracing(
    endpoint="https://api.celesto.ai/traces/ingest",
    token=api_key,
    batch_delay=5.0  # Send every 5 seconds instead of every 1
)

Missing Metadata

Ensure you're using get_run_config:
# Correct
from agentor.tracer import get_run_config
config = get_run_config(metadata={"key": "value"})

# Won't include metadata
from agentor.config import CelestoConfig
config = CelestoConfig()  # No metadata support

Last modified on March 4, 2026