System Overview
Praxos is built as a modular, event-driven AI assistant platform that integrates with multiple communication channels and external services.
Core Components
Ingress Layer
The ingress layer handles incoming messages from various platforms:
- FastAPI Server (src/ingress/api.py) - Main HTTP/WebSocket server
- Platform Adapters - Telegram, Discord, Slack, WhatsApp adapters
- Message Validation - Input sanitization and validation
- Queue Publishing - Pushes messages to processing queue
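In production this logic sits behind the FastAPI server in src/ingress/api.py; the stdlib-only sketch below (the `queue.Queue` stands in for the service bus, and the field names are assumptions) shows the validate-normalize-publish shape of an ingress handler:

```python
import html
import json
import queue

task_queue = queue.Queue()  # stand-in for Azure Service Bus / Redis

def normalize(platform: str, payload: dict) -> dict:
    """Validate and normalize a platform webhook payload (fields are illustrative)."""
    text = payload.get("text", "")
    if not text or len(text) > 10_000:
        raise ValueError("empty or oversized message")
    return {
        "platform": platform,
        "user_id": str(payload["user_id"]),
        "text": html.escape(text.strip()),  # basic input sanitization
    }

def ingest(platform: str, payload: dict) -> None:
    """Ingress handler body: validate, normalize, publish to the queue."""
    task_queue.put(json.dumps(normalize(platform, payload)))

ingest("telegram", {"user_id": 42, "text": "  hello <world>  "})
```

The real adapters differ per platform, but each funnels into the same normalized shape before queue publishing.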
Message Queue
Uses Azure Service Bus (or Redis in local mode) for:
- Asynchronous message processing
- Load balancing across workers
- Retry mechanisms for failed tasks
- Priority-based task scheduling
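The priority and retry behaviors above can be illustrated with an in-memory stand-in (the class and its method names are assumptions, not the real service-bus client):

```python
import heapq
import itertools

class TaskQueue:
    """In-memory stand-in for Azure Service Bus (Redis in local mode)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO tiebreaker within a priority

    def publish(self, task: dict, priority: int = 5) -> None:
        # lower number = higher priority
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def consume(self):
        return heapq.heappop(self._heap)[2] if self._heap else None

    def retry(self, task: dict, max_attempts: int = 3) -> bool:
        """Re-publish a failed task at lower priority until attempts run out."""
        task["attempts"] = task.get("attempts", 0) + 1
        if task["attempts"] >= max_attempts:
            return False  # in the real system this would go to a dead-letter queue
        self.publish(task, priority=5 + task["attempts"])
        return True
```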
Worker Pool
Background workers (src/workers/) process tasks:
- Agent Workers - Run LangGraph agent loops
- Scheduled Task Workers - Execute recurring tasks
- Ingestion Workers - Process files and documents
The pool auto-scales based on queue depth.
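A generic worker loop, sketched with the stdlib (the task kinds mirror the worker types above, but the handler names and task schema are assumptions):

```python
import queue
import threading

def worker_loop(tasks: "queue.Queue", results: list, stop: threading.Event) -> None:
    """Pull a task, dispatch by kind, record the outcome."""
    handlers = {
        "agent": lambda t: f"agent ran: {t['text']}",
        "scheduled": lambda t: f"scheduled task {t['name']} executed",
        "ingestion": lambda t: f"ingested {t['file']}",
    }
    while not stop.is_set():
        try:
            task = tasks.get(timeout=0.1)
        except queue.Empty:
            continue  # idle; a real pool would scale down here
        results.append(handlers[task["kind"]](task))
        tasks.task_done()
```

Because workers hold no state between tasks, any number of these loops can run against the same queue.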
LangGraph Agent
The core AI orchestration engine:
- Multi-step reasoning with LangGraph
- Conversational memory and context
- Dynamic tool selection
- Interrupt handling for long operations
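The real engine uses LangGraph's state-graph API; the hand-rolled loop below is only an illustration of the reason-act cycle it implements (the `llm` decision schema and tool signatures are assumptions):

```python
def agent_loop(task: str, llm, tools: dict, max_steps: int = 5) -> str:
    """Illustrative reason-act cycle: decide, call a tool, feed the result back."""
    memory = [{"role": "user", "content": task}]          # conversational memory
    for _ in range(max_steps):
        decision = llm(memory)                            # reason about the task
        if decision["type"] == "final":
            return decision["content"]
        result = tools[decision["tool"]](**decision["args"])  # dynamic tool selection
        memory.append({"role": "tool", "content": result})
    return "step limit reached"                           # interrupt long runs
```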
Tool Factory
Dynamically creates tools based on user context:
- Integration Tools - Google, Microsoft, Notion, etc.
- Communication Tools - Send messages, intermediate updates
- Database Tools - User preferences, context storage
- Web Tools - Browsing, search, content extraction
- Utility Tools - File processing, scheduling
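One way to sketch a context-driven tool factory: builders register under a category, and only the categories enabled for the user get built (the registry, builder names, and context schema are assumptions):

```python
from typing import Callable, Dict

TOOL_BUILDERS: Dict[str, Callable] = {}

def register(category: str):
    """Decorator that registers a tool builder under a category."""
    def deco(fn):
        TOOL_BUILDERS[category] = fn
        return fn
    return deco

@register("communication")
def build_send_message(user_ctx: dict) -> Callable:
    platform = user_ctx["platform"]
    return lambda text: f"[{platform}] -> {text}"

@register("integration")
def build_calendar(user_ctx: dict) -> Callable:
    # only meaningful when the user has connected the integration
    return lambda: f"events for {user_ctx['user_id']}"

def make_tools(user_ctx: dict) -> dict:
    """Build the tool set for this user's context (enabled categories only)."""
    return {name: build(user_ctx)
            for name, build in TOOL_BUILDERS.items()
            if name in user_ctx["enabled"]}
```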
Data Flow
Message Processing Flow
1. Receive Message
   - User sends message via platform (Telegram, Discord, etc.)
   - Platform webhook hits ingress endpoint
   - Message validated and normalized
2. Queue & Route
   - Message published to Azure Service Bus
   - Worker picks up message from queue
   - Worker loads user context from database
3. Agent Processing
   - LangGraph agent initialized with tools
   - Agent reasons about task
   - Agent executes tools as needed
   - Intermediate updates sent to user
4. Response Delivery
   - Final response formatted for platform
   - Egress layer sends message
   - State saved to database
Tool Execution Pattern
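One plausible execution pattern, sketched in plain Python (the `notify` callback and the structured result shape are assumptions): send an intermediate update, run the tool under a timeout, and return errors as data instead of letting them escape into the agent loop.

```python
import concurrent.futures

def execute_tool(tool, args: dict, notify, timeout: float = 10.0) -> dict:
    """Run one tool call: update the user, enforce a timeout, capture errors."""
    notify(f"Running {tool.__name__}...")  # intermediate update to the user
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(tool, **args)
        try:
            return {"ok": True, "result": future.result(timeout=timeout)}
        except concurrent.futures.TimeoutError:
            return {"ok": False, "error": "timeout"}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}
```

Returning `{"ok": False, ...}` lets the agent reason about a failed tool call and retry or pick another tool.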
Technology Stack
Backend
- Python 3.11+ - Core runtime
- FastAPI - Web framework
- LangGraph - Agent orchestration
- Pydantic - Data validation
- Motor - Async MongoDB driver
AI & LLM
- Portkey - LLM gateway for routing
- OpenAI - Primary LLM provider
- Google Gemini - Alternative LLM provider
- LangChain - LLM abstractions and utilities
Infrastructure
- Docker - Containerization
- Kubernetes - Orchestration
- Azure Service Bus - Message queue
- Azure Cosmos DB - Document database
- Azure Key Vault - Secrets management
Web Automation
- Playwright - Browser automation
- browser-use - AI-powered browsing library
- BeautifulSoup - HTML parsing
Design Principles
Modularity
Each integration and tool is self-contained.
Extensibility
Adding new integrations is straightforward:
- Create new integration module
- Implement base adapter interface
- Register with tool factory
- Configure credentials
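The four steps above might look like this in code (the base-class name, its methods, and the registry are assumptions about the actual adapter interface):

```python
from abc import ABC, abstractmethod

class BaseIntegration(ABC):
    """Illustrative base adapter interface."""
    name: str

    @abstractmethod
    def authenticate(self, credentials: dict) -> None: ...

    @abstractmethod
    def tools(self) -> dict: ...

REGISTRY: dict = {}

def register_integration(cls):
    REGISTRY[cls.name] = cls          # step 3: register with the tool factory
    return cls

@register_integration
class NotionIntegration(BaseIntegration):  # steps 1-2: new module implementing the interface
    name = "notion"

    def authenticate(self, credentials: dict) -> None:
        self.token = credentials["token"]  # step 4: configure credentials

    def tools(self) -> dict:
        return {"notion_search": lambda q: f"searching notion for {q}"}
```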
Scalability
- Stateless workers for horizontal scaling
- Queue-based architecture for load distribution
- Async operations throughout
- Efficient caching strategies
Reliability
- Automatic retry logic for transient failures
- Dead letter queues for failed messages
- Health checks and monitoring
- Graceful degradation
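Retry logic with exponential backoff and a dead-letter hook, as a minimal sketch (the function name and signature are assumptions):

```python
import time

def with_retries(task, attempts: int = 3, base_delay: float = 0.01, dead_letter=None):
    """Retry transient failures with exponential backoff; exhausted tasks
    go to a dead-letter handler instead of being silently lost."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception as exc:
            if attempt == attempts - 1:
                if dead_letter:
                    dead_letter(exc)
                raise
            time.sleep(base_delay * 2 ** attempt)  # back off: 1x, 2x, 4x...
```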
Configuration Management
Environment-Based Config
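The real settings layer likely builds on Pydantic (listed in the stack above); the stdlib sketch below just shows the environment-driven shape, with variable names that are assumptions:

```python
import os
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Settings:
    """Env-based config: defaults for local dev, overridden in production."""
    environment: str = field(
        default_factory=lambda: os.getenv("PRAXOS_ENV", "local"))
    queue_url: str = field(
        default_factory=lambda: os.getenv("SERVICE_BUS_URL", "redis://localhost:6379"))

    @property
    def use_local_queue(self) -> bool:
        # local mode swaps Azure Service Bus for Redis
        return self.environment == "local"
```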
Secrets Management
Production secrets stored in Azure Key Vault:
- API tokens
- Database credentials
- Integration credentials
Feature Flags
Enable/disable features per user:
- Beta feature access
- Tool availability
- Rate limiting
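Per-user flag checks reduce to a lookup with a global fallback; a minimal sketch (flag names and the user schema are illustrative):

```python
DEFAULT_FLAGS = {"web_browsing": True, "beta_voice": False}

def feature_enabled(user: dict, flag: str, defaults: dict = DEFAULT_FLAGS) -> bool:
    """Per-user flags override global defaults; unknown flags are off."""
    return user.get("flags", {}).get(flag, defaults.get(flag, False))
```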
Monitoring & Observability
Logging
Structured logging throughout.
Metrics
Key metrics tracked:
- Message processing time
- Tool execution success rate
- Queue depth
- Error rates by type
Tracing
Distributed tracing for request flows:
- End-to-end message journey
- Tool execution times
- External API calls
Security
Authentication
- Platform-specific auth (bot tokens, OAuth)
- User identity verification
- Session management
Authorization
- User-level permissions
- Integration access control
- Tool usage policies
Data Protection
- Encryption at rest (Cosmos DB)
- Encryption in transit (TLS)
- Credential isolation per user
- PII handling compliance
Performance Considerations
Caching Strategy
- User context cached in-memory (TTL: 5 min)
- Integration credentials cached (TTL: 1 hour)
- Common responses cached
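The TTLs above can be captured with a small in-memory cache; this is a sketch, not the actual cache implementation:

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def set(self, key, value) -> None:
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        value, expires = self._store.get(key, (default, 0.0))
        if time.monotonic() >= expires:
            self._store.pop(key, None)  # expired: evict and report a miss
            return default
        return value

user_context_cache = TTLCache(ttl_seconds=300)   # 5 min, per the strategy above
credentials_cache = TTLCache(ttl_seconds=3600)   # 1 hour
```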
Resource Management
- Connection pooling for databases
- Rate limiting on external APIs
- Memory limits per worker
- Timeout handling for long operations
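Timeout handling for long async operations typically wraps the awaited call, so a stuck external API cannot hold a worker indefinitely (the helper and its fallback parameter are assumptions):

```python
import asyncio

async def run_with_timeout(coro, seconds: float, fallback):
    """Bound a long-running coroutine; return a fallback instead of hanging."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        return fallback

async def slow_api():
    await asyncio.sleep(10)  # stands in for a stuck external call
    return "done"
```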
Optimization
- Lazy loading of integrations
- Async/await throughout
- Bulk operations where possible
- Efficient serialization
Future Enhancements
Planned architectural improvements:
- Multi-tenant Support - Isolated environments per organization
- Plugin System - User-installable tools and integrations
- Event Streaming - Real-time analytics with Kafka/Event Hub
- Edge Deployment - Regional workers for lower latency
- GraphQL API - Flexible data querying