5  Adding Memory to Our Agent

5.1 Learning Objectives

  • Implement conversation memory to maintain context across interactions
  • Create a command-line interface for our agent
  • Add error handling and debugging capabilities
  • Understand the concept of context in the Agent SDK

In this chapter, we’ll enhance our basic agent with advanced features that make it more powerful and user-friendly.

5.2 Add Memory to Our Agent

5.2.1 Adding Conversational Memory with Response ID

OpenAI’s Agent SDK is built on top of lower-level APIs. By default, it uses the Responses API, which introduces several improvements over the Chat Completions API. In particular, it provides built-in state management: the API maintains conversation state server-side through response IDs, which has several advantages for keeping conversation history. The model no longer needs to be sent the entire conversation history with each request, which reduces token usage and improves response times, and our application only needs to track a single response ID. This is especially efficient for long conversations.

That is one convenient way to implement memory in our agent. However, while our default model gpt-4.1-nano supports the Responses API, not all models do, especially non-OpenAI models.
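
To make response-ID chaining concrete, here is a minimal sketch using the openai package directly, outside the Agent SDK (the model name and prompts are placeholders, and it assumes OPENAI_API_KEY is set in your environment):

# Sketch: response-ID chaining with the Responses API (not part of our project code)
from openai import OpenAI

client = OpenAI()

# First turn: no prior state to reference
first = client.responses.create(
    model="gpt-4.1-nano",
    input="What's the current price of Tesla stock?",
)

# Second turn: pass the previous response ID instead of resending the history
second = client.responses.create(
    model="gpt-4.1-nano",
    input="How about Apple?",
    previous_response_id=first.id,
)
print(second.output_text)

The server resolves first.id to the stored conversation state, so the second request carries only the new user turn.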

5.2.2 Adding Custom Memory Module to the Agent Context

Another way to implement memory is through the agent’s context. This requires more work, but allows for more flexibility.

First, let’s create a memory module:

touch src/openai_agent_sdk/memory.py

Now, let’s implement a simple memory system:

# src/openai_agent_sdk/memory.py
from dataclasses import dataclass, field
from typing import List, Dict
import logging

# Configure logger for memory system
logger = logging.getLogger(__name__)

@dataclass
class ConversationMemory:
    """Simple memory store for conversation history."""
    conversation_history: List[Dict[str, str]] = field(default_factory=list)
    max_history_size: int = 10  # Configurable parameter for memory size
    
    def __post_init__(self):
        """Initialize the memory system."""
        logger.info(f"Initializing ConversationMemory with max_history_size={self.max_history_size}")
    
    def add_interaction(self, user_query: str, agent_response: str):
        """Add a user-agent interaction to memory."""
        logger.info(f"Adding interaction to memory (history size: {len(self.conversation_history)})")
        logger.debug(f"User query: {user_query[:50]}...")
        
        # Just store the conversation messages
        self.conversation_history.append({
            "user_query": user_query,
            "agent_response": agent_response
        })
        
        # Debug log to show the full memory content
        logger.debug(f"Current memory content: {self.conversation_history}")
        
        # Trim history if needed
        if len(self.conversation_history) > self.max_history_size:
            logger.info(f"Trimming conversation history to max size {self.max_history_size}")
            self.conversation_history = self.conversation_history[-self.max_history_size:]
    
    def get_conversation_summary(self) -> str:
        """Format the conversation history for inclusion in agent context."""
        if not self.conversation_history:
            logger.debug("No conversation history available")
            return "No conversation history yet."
            
        logger.debug(f"Generating conversation summary for {len(self.conversation_history)} interactions")
        summary = f"Previous conversation history:\n\n"
        
        # Include all stored exchanges, which will be limited by max_history_size
        for interaction in self.conversation_history:
            summary += f"User: {interaction['user_query']}\n"
            summary += f"AI: {interaction['agent_response']}\n\n"
        
        return summary
    
    def clear(self):
        """Clear all conversation history."""
        logger.info("Clearing conversation history")
        self.conversation_history = []

This simple memory class:

  • Stores conversations as a list of user-agent exchanges
  • Limits the history to a configurable number of exchanges
  • Formats the history as a readable conversation summary

We simply track and store the previous conversation history, then provide this memory to the agent each time we process a query.
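
Here is a quick sanity check of the class in isolation (a throwaway snippet, not part of the project code):

from src.openai_agent_sdk.memory import ConversationMemory

memory = ConversationMemory(max_history_size=2)
memory.add_interaction("What's TSLA trading at?", "TSLA is at $237.97.")
memory.add_interaction("How about AAPL?", "AAPL is at $209.28.")
memory.add_interaction("Compare them.", "Apple is larger and more profitable.")

# Only the two most recent exchanges survive trimming
print(memory.get_conversation_summary())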

Note

We used the @dataclass decorator for the conversational memory class. For a better understanding of structured data models such as @dataclass and Pydantic, refer to this section.
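
For comparison, here is a hedged sketch of the same idea as a Pydantic model (illustration only, not used in our project); Pydantic adds runtime validation on top of what @dataclass gives us:

# Hypothetical Pydantic equivalent of ConversationMemory (illustration only)
from typing import Dict, List

from pydantic import BaseModel, Field

class ConversationMemoryModel(BaseModel):
    conversation_history: List[Dict[str, str]] = Field(default_factory=list)
    max_history_size: int = 10  # validated at construction time

memory = ConversationMemoryModel()
memory.conversation_history.append(
    {"user_query": "What's TSLA at?", "agent_response": "$237.97"}
)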

There are many ways to enhance it for more complex applications:

  1. Summarization: Instead of storing full conversations, summarize them to save tokens:

    def summarize_conversation(self):
        """Generate a condensed summary of key points discussed."""
        summary = "Key topics discussed: "
        # Extract companies, metrics, and other important entities
        return summary
  2. Entity Tracking: Explicitly track important entities mentioned in the conversation:

    def extract_entities(self, text):
        """Extract key entities like company names from text."""
        entities = {"companies": [], "metrics": [], "time_periods": []}
        # Use regex or NLP to extract entities
        return entities
  3. Semantic Search: For longer conversations, use embeddings to find relevant information:

    def retrieve_similar(self, query):
        """Retrieve semantically similar information."""
        query_embedding = self.embedding_model.embed(query)
        return self.vector_db.search(query_embedding)

These techniques can be combined to create more sophisticated memory systems that maintain context in complex, lengthy conversations, or that capture contextual understanding beyond the conversation history itself. As a concrete illustration, the entity-tracking idea is sketched below.
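
Here is a minimal, self-contained sketch of regex-based entity tracking (the patterns are naive assumptions; real extraction would need an NLP library):

# Naive entity-tracking sketch: pull ticker-like symbols and periods out of text
import re
from typing import Dict, List

TICKER_PATTERN = re.compile(r"\b[A-Z]{1,5}\b")        # e.g. TSLA, AAPL
PERIOD_PATTERN = re.compile(r"\b(?:\d{4}|Q[1-4])\b")  # e.g. 2024, Q3

def extract_entities(text: str) -> Dict[str, List[str]]:
    """Very rough extraction of tickers and time periods from a query."""
    return {
        "tickers": TICKER_PATTERN.findall(text),
        "time_periods": PERIOD_PATTERN.findall(text),
    }

print(extract_entities("Compare TSLA and AAPL revenue for Q3 2024"))
# {'tickers': ['TSLA', 'AAPL'], 'time_periods': ['Q3', '2024']}

A memory class could run this on every interaction and keep a running set of mentioned tickers, so a follow-up like “compare their metrics” can be resolved even after the raw history is trimmed.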

5.3 Injecting Memory into Our Agent

In both cases, we need to update our agent to use the memory system by modifying the agent.py file:

# src/openai_agent_sdk/agent.py (updated)
from agents import Agent, Runner, RunContextWrapper, function_tool
import logging
from dataclasses import dataclass 
from typing import Optional
from functools import wraps
from src.openai_agent_sdk.memory import ConversationMemory
from src.common.config import SYSTEM_PROMPT, DEFAULT_MODEL
from src.common.tools_yf import (
    get_stock_price as original_get_stock_price,
    get_stock_history as original_get_stock_history,
    get_company_info as original_get_company_info,
    get_financial_metrics as original_get_financial_metrics
)

# Configure logger for the agent
logger = logging.getLogger(__name__)

# Wrap our tools to use with the Agent SDK
@function_tool
@wraps(original_get_stock_price)
def get_stock_price(ticker: str) -> str:
    return original_get_stock_price(ticker)

@function_tool
@wraps(original_get_stock_history)
def get_stock_history(ticker: str, days: int) -> str:
    return original_get_stock_history(ticker, days)

@function_tool
@wraps(original_get_company_info)
def get_company_info(ticker: str) -> str:
    return original_get_company_info(ticker)

@function_tool
@wraps(original_get_financial_metrics)
def get_financial_metrics(ticker: str) -> str:
    return original_get_financial_metrics(ticker)

@dataclass
class MarketMindContext:
    """Context object for the MarketMind agent."""
    memory: ConversationMemory
    previous_response_id: Optional[str] = None
    use_explicit_memory: bool = True
    use_response_id_memory: bool = True
    
    def add_to_memory(self, user_query: str, agent_response: str) -> None:
        """Add an interaction to the conversation memory."""
        if self.use_explicit_memory:
            self.memory.add_interaction(user_query, agent_response)
        
    def get_memory_summary(self) -> str:
        """Get a summary of the conversation memory."""
        if self.use_explicit_memory:
            return self.memory.get_conversation_summary()
        return ""  # Return empty string if explicit memory is disabled
    
    def set_response_id(self, response_id: Optional[str]) -> None:
        """Store the response ID from the last interaction."""
        if self.use_response_id_memory and response_id:
            logger.debug(f"Storing response ID: {response_id}")
            self.previous_response_id = response_id
        else:
            logger.debug("No response ID to store or response ID memory disabled")
            self.previous_response_id = None

class MarketMindOpenAIAgent:
    def __init__(self, 
                 model=DEFAULT_MODEL, 
                 use_explicit_memory=True, 
                 use_response_id_memory=True):
        """Initialize the MarketMind agent.
        
        Args:
            model: The OpenAI model to use.
            use_explicit_memory: Whether to use the explicit conversation memory.
            use_response_id_memory: Whether to use the response ID for conversation continuity.
        """
        logger.info(f"Initializing MarketMindOpenAIAgent with model={model}, " +
                   f"use_explicit_memory={use_explicit_memory}, " +
                   f"use_response_id_memory={use_response_id_memory}")
        
        # Initialize context with memory
        self.context = MarketMindContext(
            memory=ConversationMemory(),
            use_explicit_memory=use_explicit_memory,
            use_response_id_memory=use_response_id_memory
        )
        
        # Initialize the agent with proper typing
        self.agent = Agent[MarketMindContext](
            name="MarketMind",
            model=model,
            instructions=self._get_dynamic_instructions,
            tools=[
                get_stock_price,
                get_stock_history,
                get_company_info,
                get_financial_metrics,
            ],
        )
        logger.info("Agent initialization complete")

    def _get_dynamic_instructions(self, context: RunContextWrapper[MarketMindContext], agent=None) -> str:
        """Dynamic instructions that include conversation memory."""
        logger.debug("Generating dynamic instructions with conversation memory")
        
        # Get the actual context object
        market_mind_context = context.context
        
        if market_mind_context.use_explicit_memory:
            logger.debug(f"Context received in instructions: memory_size={len(market_mind_context.memory.conversation_history) if market_mind_context.memory.conversation_history else 0}")
        else:
            logger.debug("Explicit memory disabled")
        
        base_instructions = SYSTEM_PROMPT            

        # Add memory context if conversation history exists and explicit memory is enabled
        if (market_mind_context.use_explicit_memory and 
            market_mind_context.memory and 
            market_mind_context.memory.conversation_history):
            memory_context = f"\n\nCONVERSATION MEMORY:\n{market_mind_context.get_memory_summary()}"
            full_instructions = base_instructions + memory_context
            logger.debug(f"Added conversation memory context ({len(market_mind_context.memory.conversation_history)} interactions)")
            return full_instructions
        
        logger.debug("No conversation memory to add or explicit memory disabled")
        return base_instructions

    async def process_query(self, query: str, *,
                     max_turns: int = 10,
                     hooks=None,
                     run_config=None,
                     previous_response_id: Optional[str] = None) -> str:
        """Process a user query using the agent and update memory.
        
        Args:
            query: The user's query string
            max_turns: The maximum number of turns to run the agent for
            hooks: An object that receives callbacks on various lifecycle events
            run_config: Global settings for the entire agent run
            previous_response_id: The ID of the previous response. If using OpenAI
                models via the Responses API, this allows you to skip passing in
                input from the previous turn.
        """
        logger.info(f"Processing query: {query[:50]}...")
        
        # Only use stored response_id if explicitly enabled and not provided
        if (self.context.use_response_id_memory and 
            previous_response_id is None and 
            self.context.previous_response_id):
            logger.debug(f"Using stored response ID: {self.context.previous_response_id}")
            previous_response_id = self.context.previous_response_id
        elif not self.context.use_response_id_memory:
            # If response ID memory is disabled, explicitly set to None
            previous_response_id = None
            logger.debug("Response ID memory disabled, ignoring any previous response ID")
        
        # Process the query using the agent with proper context
        logger.debug(f"Sending query to OpenAI agent with context type: {type(self.context)}")
        
        # Pass all the parameters to the Runner.run method
        result = await Runner.run(
            self.agent, 
            query,
            context=self.context,  # Pass the context to the runner
            max_turns=max_turns,
            hooks=hooks,
            run_config=run_config,
            previous_response_id=previous_response_id
        )
        
        # Log usage information if available
        if hasattr(result, 'usage'):
            logger.debug(f"Usage stats: {result.usage}")
        
        # Store the latest response_id for future interactions if feature is enabled
        if self.context.use_response_id_memory:
            latest_response_id = None
            if result.raw_responses and hasattr(result.raw_responses[-1], 'response_id'):
                latest_response_id = result.raw_responses[-1].response_id
                logger.debug(f"Got response_id: {latest_response_id}")
                
            # Update the stored response_id in the context
            self.context.set_response_id(latest_response_id)
        else:
            logger.debug("Response ID tracking disabled, not storing response ID")
            
        final_output = result.final_output
        logger.debug(f"Received response: {final_output[:50]}...")
        
        # Store the conversation history using the context if explicit memory is enabled
        logger.info("Updating conversation memory")
        self.context.add_to_memory(query, final_output)
        
        return final_output

Let’s break down the key changes:

  1. Configurable Behavior: Users can enable or disable either approach through the use_response_id_memory and use_explicit_memory parameters.
  2. Response ID Tracking: We’ve added support for OpenAI’s response ID feature, which maintains conversation continuity on its own.
  3. Explicit Memory Injection: The _get_dynamic_instructions method appends the conversation history to the agent’s instructions when explicit memory is enabled.
  4. Explicit Memory Updates: After processing a query, we update the explicit memory with the new interaction.
  5. Context Class: We’ve created a MarketMindContext class to hold our explicit conversation memory and other state.
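
With these pieces in place, the agent can be exercised directly in a short script before we build the CLI (this assumes a valid OPENAI_API_KEY and the project layout above):

# Quick manual test of the agent outside the CLI
import asyncio
from src.openai_agent_sdk.agent import MarketMindOpenAIAgent

async def main():
    agent = MarketMindOpenAIAgent(use_explicit_memory=True,
                                  use_response_id_memory=False)
    print(await agent.process_query("What's the current price of Tesla stock?"))
    # The follow-up relies on the explicit memory to resolve "it"
    print(await agent.process_query("What sector is it in?"))

asyncio.run(main())
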
Note

The Agent SDK’s context system, which we used for our memory management, is a powerful feature. Here is how it works:

  1. Context Class: We define a custom class (MarketMindContext) that holds our state.
  2. Type Parameter: We use the Agent[MarketMindContext] syntax to tell the SDK about our context type.
  3. Context Injection: We pass our context object to Runner.run() when processing a query.
  4. Context Wrapper: The SDK wraps our context in a RunContextWrapper and passes it to our dynamic instructions function.
  5. State Persistence: Our context object persists between interactions, allowing us to maintain conversation history.

The context system provides a clean way to manage state without cluttering our agent implementation. It’s particularly useful for memory management, user preferences, and other stateful features.

For a deeper dive into the context system, refer to this section and this section.
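
The same context is also available inside tools. As a hedged illustration, here is a hypothetical tool (not part of our MarketMind code) following the SDK’s pattern of taking RunContextWrapper as the first parameter, which is never exposed to the model:

# Hypothetical tool, placed in agent.py where MarketMindContext is defined
from agents import RunContextWrapper, function_tool

@function_tool
def recall_last_question(ctx: RunContextWrapper[MarketMindContext]) -> str:
    """Return the user's most recent question from conversation memory."""
    history = ctx.context.memory.conversation_history
    if not history:
        return "No previous question recorded."
    return history[-1]["user_query"]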

5.4 Creating a Command-Line Interface

Now, let’s create a command-line interface for our agent using the Click library:

mkdir -p src/cli
touch src/cli/__init__.py
touch src/cli/main.py

Let’s implement the CLI by updating our main.py:

# src/cli/main.py (updated)
import asyncio
import click
import logging
import os
from datetime import datetime
from src.openai_agent_sdk.agent import MarketMindOpenAIAgent
from src.common.config import setup_logging, DEFAULT_MODEL, DEFAULT_DEBUG_MODULES

# Get logger for this module
logger = logging.getLogger(__name__)

@click.group(context_settings=dict(help_option_names=['-h', '--help']))
def cli():
    """MarketMind: Your AI-powered financial assistant."""
    pass

@cli.command()
@click.option('--model', default=DEFAULT_MODEL, help='The model to use for the agent')
@click.option('--debug', is_flag=True, help='Enable debug logging')
@click.option('--use-explicit-memory/--no-explicit-memory', default=True, 
              help='Use explicit conversation memory in the system prompt')
@click.option('--use-response-id/--no-response-id', default=True, 
              help='Use OpenAI response IDs for conversation continuity')
def openai_agent_sdk(model, debug, use_explicit_memory, use_response_id):
    """Start MarketMind using OpenAI Agent SDK."""
    
    # Set up logging - always log to file if debug is enabled, never to console for CLI
    log_filename = setup_logging(
        debug=debug,
        module_loggers=DEFAULT_DEBUG_MODULES,
        log_to_file=debug,
        console_output=False  # Don't output logs to console for CLI apps
    )
    
    logger.info(f"Starting MarketMind with model={model}, explicit_memory={use_explicit_memory}, response_id={use_response_id}")
    
    # Initialize the agent with memory options
    agent = MarketMindOpenAIAgent(
        model=model,
        use_explicit_memory=use_explicit_memory,
        use_response_id_memory=use_response_id
    )
    
    click.echo(click.style("\n🤖 MarketMind Financial Assistant powered by OpenAI Agent SDK", fg='blue', bold=True))
    click.echo(click.style("Ask me about stocks, companies, or financial metrics. Type 'exit' to quit.\n", fg='blue'))
    
    # Display active memory settings
    memory_settings = []
    if use_explicit_memory:
        memory_settings.append("conversation history in system prompt")
    if use_response_id:
        memory_settings.append("OpenAI response ID continuity")
    
    if memory_settings:
        click.echo(click.style(f"Memory enabled: {', '.join(memory_settings)}", fg='yellow'))
    else:
        click.echo(click.style("Memory disabled: Agent has no conversational context", fg='yellow'))
    
    if log_filename:
        click.echo(click.style(f"Log file: {log_filename}", fg='yellow'))
    
    # Use this function to create the event loop and run the conversation
    async def run_conversation():
        # Main conversation loop
        while True:
            # Get user input
            user_input = click.prompt(click.style("You", fg='green', bold=True))
            
            # Check for exit command
            if user_input.lower() in ('exit', 'quit', 'q'):
                logger.info("User requested exit")
                click.echo(click.style("\nThank you for using MarketMind! Goodbye.", fg='blue'))
                break

            # Process the query
            click.echo(click.style("MarketMind", fg='blue', bold=True) + " is thinking...")
            
            click.echo(click.style("  🤔 Processing query and deciding on actions...", fg="yellow"))

            try:
                # Process the query using the agent - it now automatically handles response IDs
                response = await agent.process_query(user_input)
                click.echo(click.style("  ✅ Analysis complete, generating response...", fg="green"))
                
                # Display the response
                click.echo(click.style("MarketMind", fg='blue', bold=True) + f": {response}\n")
                
                # Log memory stats for debugging
                if use_explicit_memory:
                    memory_size = len(agent.context.memory.conversation_history) if agent.context.memory.conversation_history else 0
                    logger.debug(f"Conversation memory size: {memory_size} interactions")
                
                if use_response_id and agent.context.previous_response_id:
                    logger.debug(f"Response ID captured for conversation continuity")
                
            except Exception as e:
                logger.error(f"Error processing query: {str(e)}", exc_info=True)
                click.echo(click.style("  ❌ Error processing query", fg="red"))
                click.echo(click.style("MarketMind", fg='blue', bold=True) + 
                          f": I encountered an error while processing your request. Please try again.\n")
    
    # Run the async conversation loop
    asyncio.run(run_conversation())

def main():
    """Entry point for the CLI."""
    try:
        cli()
    except Exception as e:
        logger.error(f"Unhandled exception: {str(e)}", exc_info=True)
        click.echo(click.style("An unexpected error occurred. Please check the logs.", fg="red"))

if __name__ == "__main__":
    main()

Note that in our CLI, we’ve made these options configurable:

@cli.command()
@click.option('--use-explicit-memory/--no-explicit-memory', default=True, 
              help='Use explicit conversation memory in the system prompt')
@click.option('--use-response-id/--no-response-id', default=True, 
              help='Use OpenAI response IDs for conversation continuity')
def openai_agent_sdk(model, debug, use_explicit_memory, use_response_id):
    # ...
    agent = MarketMindOpenAIAgent(
        model=model,
        use_explicit_memory=use_explicit_memory,
        use_response_id_memory=use_response_id
    )
    # ...

This allows users to experiment with different memory approaches and understand their trade-offs.
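
For example, to compare behaviors (the flag names come straight from the options defined above):

# Explicit memory off, response-ID continuity on
market-mind openai-agent-sdk --no-explicit-memory

# Both memory mechanisms off: each query is answered in isolation
market-mind openai-agent-sdk --no-explicit-memory --no-response-id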

Recall that we set the CLI entry point in our pyproject.toml:

[project.scripts]
market-mind = "src.cli.main:main"

5.5 Testing the Advanced Agent

Now that we’ve implemented memory and a CLI, let’s test our advanced agent:

# Install the package in development mode
uv pip install -e .

# Run the CLI
market-mind openai-agent-sdk

You should see an interactive CLI where you can chat with the agent. Try asking a series of related questions to test the memory capabilities:

You: What's the current price of Tesla stock?
MarketMind is thinking...
  🤔 Processing query and deciding on actions...
  🔍 Getting stock price for TSLA
  ✅ Analysis complete, generating response...
MarketMind: Tesla, Inc. (TSLA) is currently trading at $237.97, up 4.60% today.

You: How about Apple?
MarketMind is thinking...
  🤔 Processing query and deciding on actions...
  🔍 Getting stock price for AAPL
  ✅ Analysis complete, generating response...
MarketMind: Apple Inc. (AAPL) is currently trading at $209.28, up 0.44% today.

You: Compare their financial metrics
MarketMind is thinking...
  🤔 Processing query and deciding on actions...
  💰 Getting financial metrics for TSLA
  💰 Getting financial metrics for AAPL
  ✅ Analysis complete, generating response...
MarketMind: Here's a comparison of Tesla and Apple's key financial metrics:

| Metric | Tesla (TSLA) | Apple (AAPL) |
|--------|--------------|--------------|
| Market Cap | $917.81 billion | $3.14 trillion |
| P/E Ratio | 163.76 | 33.22 |
| Revenue | $95.72 billion | $395.76 billion |
| Profit Margin | 6.38% | 24.30% |
| Return on Equity | 8.77% | 136.52% |
| Dividend Yield | N/A | 48.00% |

Apple has a significantly larger market cap and revenue compared to Tesla. Apple is also more profitable with a higher profit margin and an exceptionally high return on equity. Additionally, Apple offers a dividend yield while Tesla does not. Tesla's high P/E ratio suggests investors expect higher future growth compared to Apple.

Notice how the agent remembered that we were talking about Tesla and Apple when we asked to compare their financial metrics. This demonstrates the power of our memory implementation.

5.5.1 Trace the Agent

The OpenAI Agent SDK provides a tracing feature that lets us see what’s going on under the hood of our agent at the tracing page.

The common way to track how a program runs is logging, and you may have noticed that we use logging extensively in our code. For an introduction to logging in Python, refer to this section.
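
As a minimal standalone sketch of the idea (not our project’s setup_logging helper, just the standard-library equivalent), the following sends DEBUG messages to a file while keeping the console at INFO and above:

# Minimal logging setup: everything to a file, INFO and above to the console
import logging

console = logging.StreamHandler()  # defaults to stderr
console.setLevel(logging.INFO)     # keep the console quieter

logging.basicConfig(
    level=logging.DEBUG,  # the root logger passes everything on to handlers
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    handlers=[logging.FileHandler("marketmind_debug.log"), console],
)

logging.getLogger(__name__).debug("goes to the file only")
logging.getLogger(__name__).info("goes to the file and the console")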

5.6 Key Takeaways

In this chapter, we’ve:

  • Implemented a conversation memory system to maintain context across interactions
  • Created a context class to manage state in a type-safe manner
  • Built a command-line interface for our agent
  • Added comprehensive error handling and logging
  • Explored the context system in the Agent SDK

These enhancements represent the “Context” component of our ABC Framework. They allow our agent to maintain state across interactions, remember previous conversations, and provide a more natural user experience.

5.7 Summary of the OpenAI Agent SDK Implementation

We’ve now completed our implementation of the MarketMind financial assistant using the OpenAI Agent SDK. Let’s summarize what we’ve built:

  1. Financial Tools: We created four tools that retrieve real-time financial data from Yahoo Finance.
  2. Agent Implementation: We built an agent that can understand financial queries and use the appropriate tools to answer them.
  3. Memory System: We implemented a conversation memory system that allows the agent to maintain context across interactions.
  4. Command-Line Interface: We created a user-friendly CLI for interacting with the agent.

This implementation demonstrates the power of the ABC Framework:

  • Action: Our financial tools allow the agent to interact with the external world.
  • Brain: The Agent SDK provides a powerful reasoning engine for understanding queries and selecting tools.
  • Context: Our memory system maintains state across interactions, allowing for more natural conversations.

In the next chapter, we’ll explore a different approach to building our agent using the Chat Completion API. This will give us a deeper understanding of how agents work behind the scenes.