A FastAPI-based backend demonstrating the OpenAI Agents SDK with streaming endpoints for multiple AI agents. This project features dedicated routers per agent with real-time streaming events including agent updates, raw responses, and run items, plus persistent conversation memory for multi-turn interactions.
This project is structured with separate packages for each agent:
```
src/
├── api/                      # FastAPI application
│   ├── routers/              # Agent-specific endpoints
│   └── utils/                # Shared utilities
├── chat_agent/               # General chat agent package
├── research_bot/             # Basic research agent package
│   ├── agents/               # Planner, Search, and Writer agents
│   └── manager.py            # Research orchestration
└── deep_research_agent/      # Advanced multi-agent research system
    ├── agents/               # Hierarchical specialized agents
    ├── models.py             # Comprehensive data models
    ├── orchestrator.py       # Multi-agent coordination
    ├── tools.py              # Research function tools
    └── config.py             # Advanced configuration system
```
Each agent package can be imported and used independently, making the system modular and scalable.
- 🚀 Per-agent dedicated endpoints with standardized patterns
- 📡 Real-time streaming with Server-Sent Events (SSE)
- 🔄 Event types: Raw LLM responses, semantic agent events, handoffs
- 💾 Session Memory & Conversation History - Persistent multi-turn conversations
- 🧩 Modular architecture - each agent as separate package
- 📚 Auto-generated OpenAPI docs at `/docs`
- 🔧 Development-ready with hot reload and comprehensive logging
Built-in conversational memory using OpenAI Agents SDK's SQLiteSession:
- Multi-turn conversations: Agents remember context across requests
- Session isolation: Each user/conversation maintains separate history
- Persistent storage: Conversation history survives server restarts
- Environment-based config: Enable with a simple `ENABLE_SESSIONS=true`
- Zero code changes: Existing endpoints automatically support sessions
- Production-ready: SQLite-based storage with proper error handling
Perfect for: Chatbots, virtual assistants, customer support, educational apps, and any conversational AI that needs context awareness.
- uv - Fast Python package manager
- OpenAI API Key - Set in environment variables
```bash
# On macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Via pip
pip install uv
```
```bash
git clone https://github.com/ahmad2b/openai-agents-streaming-api.git
cd openai-agents-streaming-api
```

```bash
# Create virtual environment with correct Python version
uv venv

# Activate virtual environment
# On Unix/macOS:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install project in development mode with all dependencies
uv pip install -e .
```
Create a `.env` file in the project root:
```bash
# .env
OPENAI_API_KEY=your_openai_api_key_here

# Session Memory Configuration (Optional)
ENABLE_SESSIONS=true               # Enable conversation memory
SESSION_DB_PATH=./conversations.db # SQLite database path

# Optional: Logging level
LOG_LEVEL=INFO

# Optional: Custom port (default is 8000)
PORT=8000
```
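For reference, these variables can be parsed into a settings dict along these lines — a minimal sketch of equivalent parsing, not the project's actual settings loader, which may differ:

```python
import os

def load_settings() -> dict:
    """Read the environment variables above, using the same defaults.
    Illustrative only -- the app's real configuration code may differ."""
    return {
        "openai_api_key": os.environ["OPENAI_API_KEY"],  # required, no default
        "enable_sessions": os.getenv("ENABLE_SESSIONS", "false").lower() == "true",
        "session_db_path": os.getenv("SESSION_DB_PATH", "./conversations.db"),
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
        "port": int(os.getenv("PORT", "8000")),
    }
```

Note that `ENABLE_SESSIONS` is compared case-insensitively, so `true`, `True`, and `TRUE` all enable memory.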
```bash
# Using uvicorn directly (with hot reload)
uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000

# Or using the module directly
python -m src.api.main
```

```bash
# Install production dependencies (if different)
uv pip install -e .

# Run with production settings
uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --workers 4
```

```bash
# Run directly with uv (manages the virtual environment automatically)
uv run uvicorn src.api.main:app --reload
```
- Application: http://127.0.0.1:8000
- Interactive Docs: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
Each agent has standardized endpoints with automatic session support:
```
POST   /assistant/run                    # Synchronous execution
POST   /assistant/stream                 # Real-time streaming
GET    /assistant/info                   # Agent information & session config
GET    /assistant/session/{session_id}   # Get all messages for a session
DELETE /assistant/session/{session_id}   # Clear conversation history
```

```
POST   /chat/run                    # Synchronous execution
POST   /chat/stream                 # Real-time streaming
GET    /chat/info                   # Agent information & session config
GET    /chat/session/{session_id}   # Get all messages for a session
DELETE /chat/session/{session_id}   # Clear conversation history
```

```
POST   /research   # Full research pipeline
```
```bash
# Test the chat agent without session memory
curl -X POST "http://127.0.0.1:8000/chat/run" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, how can you help me?"}'
```

```bash
# First message with session_id - agent establishes context
curl -X POST "http://127.0.0.1:8000/chat/run" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hi, my name is Sarah and I work as a software engineer", "session_id": "user_sarah_123"}'

# Second message - agent remembers Sarah and her profession
curl -X POST "http://127.0.0.1:8000/chat/run" \
  -H "Content-Type: application/json" \
  -d '{"input": "What kind of work do I do?", "session_id": "user_sarah_123"}'

# Stream responses with conversation context
curl -X POST "http://127.0.0.1:8000/chat/stream" \
  -H "Content-Type: application/json" \
  -d '{"input": "Give me some programming tips for my field", "session_id": "user_sarah_123"}' \
  --no-buffer
```

```bash
# Get all messages for a session
curl -X GET "http://127.0.0.1:8000/chat/session/user_sarah_123"

# Get a limited number of recent messages (e.g., last 10)
curl -X GET "http://127.0.0.1:8000/chat/session/user_sarah_123?limit=10"

# Clear conversation history for a user
curl -X DELETE "http://127.0.0.1:8000/chat/session/user_sarah_123"

# Check agent info and session configuration
curl -X GET "http://127.0.0.1:8000/chat/info"
```
Environment Variables:
- `ENABLE_SESSIONS=true` - Enable conversation memory globally
- `SESSION_DB_PATH=./conversations.db` - SQLite database location (optional)
Automatic Session Handling:
- No session_id: Traditional stateless interaction
- With session_id: Automatic conversation history using OpenAI Agents SDK's SQLiteSession
- Session isolation: Each session_id maintains separate conversation memory
- Persistent storage: History survives server restarts and deployments
Built on OpenAI Agents SDK patterns:
- Uses `SQLiteSession` for reliable conversation storage
- Integrates seamlessly with `Runner.run()` and `Runner.run_streamed()`
- Maintains conversation context across agent handoffs
- Supports both synchronous and streaming interactions with memory
- Environment-driven: 12-factor app configuration
- Zero code changes: Works with existing agent implementations
- Scalable storage: SQLite for single instance, easily extensible to PostgreSQL
- Error handling: Graceful degradation when sessions unavailable
- Security: Session isolation prevents cross-user data leakage
```bash
# Add new dependencies
uv add package-name

# Add development dependencies
uv add --dev pytest black isort

# Update all dependencies
uv lock --upgrade

# Sync the environment from the lock file (for consistent environments)
uv sync
```
```bash
# Install development tools
uv add --dev black isort flake8 pytest

# Format code
black src/
isort src/

# Run linting
flake8 src/

# Run tests
pytest
```
Each agent can be imported and used independently, with or without session memory:
```python
# Using the chat agent directly (no session)
from src.chat_agent.main import chat_agent
from agents import Runner

result = await Runner.run(chat_agent, "Hello!")

# Using with session memory
from agents import SQLiteSession

session = SQLiteSession("user_123", db_path="conversations.db")
result = await Runner.run(chat_agent, "Hello!", session=session)

# Using the research bot
from src.research_bot.manager import ResearchManager

manager = ResearchManager()
report = await manager.run("AI trends 2024")
```
Each agent uses the standardized `create_agent_router()` utility that provides:
- POST `/run` - Synchronous execution with complete response
- POST `/stream` - Real-time streaming with formatted events
- DELETE `/session/{session_id}` - Clear conversation history for a specific session
- GET `/info` - Agent metadata, configuration, and session status
The streaming endpoints emit structured events:
- `raw_response` - Direct from OpenAI (text deltas, function calls, etc.)
- `run_item` - Semantic agent events (tool usage, handoffs, reasoning)
- `agent_updated` - Agent handoff notifications
- `stream_complete` - Final results with usage statistics and session info
- `error` - Error handling with details
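A client consuming the stream just splits the `text/event-stream` payload on blank lines and decodes each `data:` field. A minimal parser sketch — the exact event payload shape here is an assumption, beyond each event carrying a `type` field:

```python
import json

def parse_sse_events(raw: str) -> list[dict]:
    """Parse a text/event-stream payload into a list of event dicts.
    Assumes each event body is JSON with a `type` field like those listed above;
    the rest of the payload shape is illustrative."""
    events = []
    for block in raw.strip().split("\n\n"):       # SSE events are blank-line separated
        for line in block.splitlines():
            if line.startswith("data: "):          # ignore comments, event:/id: fields
                events.append(json.loads(line[len("data: "):]))
    return events
```

In practice you would feed this incrementally from the HTTP response body rather than buffering the whole stream, but the framing logic is the same.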
To add a new agent with automatic session support:
- Create `src/your_agent/main.py` with the agent definition
- Create `src/api/routers/your_agent.py` using `create_agent_router()`
- Include the router in `src/api/main.py`
- Sessions work automatically - no additional code needed!
- Import Errors: Ensure you're running from the project root and the virtual environment is activated
- OpenAI API Key: Check that `OPENAI_API_KEY` is set in your `.env` file
- Port Conflicts: Change the port in the uvicorn command if 8000 is occupied
- Python Version: Ensure Python 3.13+ is installed and selected
- Session Memory: Set `ENABLE_SESSIONS=true` and restart the server to enable conversation memory
```bash
# Run with debug logging
LOG_LEVEL=DEBUG uvicorn src.api.main:app --reload

# Check agent information and session configuration
curl http://127.0.0.1:8000/chat/info

# Test session functionality
curl -X POST http://127.0.0.1:8000/chat/run \
  -H "Content-Type: application/json" \
  -d '{"input": "Test message", "session_id": "debug_session"}'
```
- Use `--workers N` for production deployment
- Configure an appropriate `--timeout-keep-alive` for long streaming sessions
- Monitor memory usage with longer conversations
- Session cleanup: Implement periodic cleanup of old conversation sessions
- Database maintenance: Regular SQLite VACUUM for optimal performance
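A periodic cleanup job can combine both maintenance points above: delete stale rows, then `VACUUM` to reclaim space. A sketch using only the standard library — note the table and column names here are hypothetical placeholders; adapt them to the schema your `SQLiteSession` database actually uses:

```python
import sqlite3
import time

def cleanup_old_sessions(db_path: str, max_age_days: int = 30) -> int:
    """Delete conversation rows older than max_age_days, then VACUUM.
    NOTE: `messages` / `created_at` are hypothetical names -- inspect your
    SQLite file and substitute the real table and timestamp column."""
    cutoff = time.time() - max_age_days * 86400
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute("DELETE FROM messages WHERE created_at < ?", (cutoff,))
        conn.commit()
        conn.execute("VACUUM")  # reclaim disk space freed by the deletes
        return cur.rowcount     # number of rows removed
    finally:
        conn.close()
```

Run it from cron or a background task; `VACUUM` must run outside an open transaction, which is why the deletes are committed first.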
- Customer support chatbots with conversation context
- Virtual assistants that remember user preferences
- Educational tutors tracking learning progress
- Healthcare assistants maintaining patient interaction history
- Employee helpdesk with persistent conversation threads
- Sales assistants remembering customer interactions
- Internal knowledge bots with user-specific context
- Multi-turn research assistants building on previous queries
- Code assistants with project context memory
- Documentation bots maintaining conversation flow
- API testing tools with session-based request history
- Interactive debugging assistants with state persistence
- Follow the established package structure
- Use the standardized agent router pattern
- Session memory works automatically - no special handling needed
- Add comprehensive logging
- Update documentation for new agents
- Test both sync and streaming endpoints with and without sessions
Built with FastAPI, OpenAI Agents SDK (with SQLiteSession), and uv for modern Python development.