Complete observability stack with Grafana, Prometheus, Tempo, and Loki on AWS ECS, featuring a sample Flask application with automated testing.
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Load Balancer  │────│  Data Processor  │────│    S3 Bucket    │
│                 │    │   (Flask App)    │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                    ┌──────────────────────┐
                    │    Observability     │
                    │                      │
                    │ ┌─────────────────┐  │
                    │ │   AWS Managed   │  │ ◄── Metrics
                    │ │   Prometheus    │  │
                    │ └─────────────────┘  │
                    │                      │
                    │ ┌─────────────────┐  │
                    │ │   Tempo (ECS)   │  │ ◄── Traces
                    │ └─────────────────┘  │
                    │                      │
                    │ ┌─────────────────┐  │
                    │ │   Loki (ECS)    │  │ ◄── Logs
                    │ └─────────────────┘  │
                    │                      │
                    │ ┌─────────────────┐  │
                    │ │  Grafana (ECS)  │  │ ◄── Visualization
                    │ └─────────────────┘  │
                    └──────────────────────┘
- Self-hosted Grafana (ECS): Visualization with automated data source configuration
- AWS Managed Prometheus: Metrics storage
- Tempo (ECS): Distributed tracing
- Loki (ECS): Log aggregation
- Data Processor: Flask API with OpenTelemetry instrumentation
- Lambda Testing: Automated API calls every minute
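
The Lambda testing component can be pictured roughly as follows. This is a minimal sketch, not the function shipped with the stack: the `API_URL` value, the `/process` path, and the payload shape are all hypothetical, and the one-minute schedule would come from whatever trigger the deployment attaches to the function rather than from this code.

```python
import json
import urllib.error
import urllib.request

# Hypothetical endpoint; replace with the load balancer URL of the deployed stack.
API_URL = "http://example-alb.us-west-2.elb.amazonaws.com/process"


def call_api(url, payload, opener=urllib.request.urlopen, timeout=10):
    """POST a JSON payload and return (status_code, body_text)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth reporting.
        return err.code, err.read().decode("utf-8")


def handler(event, context, opener=urllib.request.urlopen):
    """Lambda entry point: send one test request and summarize the outcome."""
    status, body = call_api(API_URL, {"records": [1, 2, 3]}, opener=opener)
    return {"status": status, "ok": 200 <= status < 300, "body": body[:200]}
```

The `opener` parameter exists only so the handler can be exercised locally with a stub instead of a live endpoint.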
- AWS CLI configured
- AWS CDK installed:
  npm install -g aws-cdk
- Docker running
scripts/complete-setup.sh
Deployment is fully automated and creates:
- ECS Fargate cluster with 4 services (Data Processor, Tempo, Loki, Grafana)
- Application Load Balancers
- AWS Managed Prometheus workspace
- S3 bucket for data storage
- Lambda function for automated testing
- Grafana data sources (Prometheus, Loki, Tempo) with trace/log correlation
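
To illustrate what trace/log correlation between the three data sources can look like, here is a hedged sketch that builds data source definitions in the shape Grafana's data source API accepts. The UIDs, URLs, and the `trace_id=(\w+)` log pattern are assumptions made for the example; the stack's actual configuration may differ, and a YAML provisioning file would need `$${__value.raw}` instead of `${__value.raw}`.

```python
import json


def build_datasources(prometheus_url, loki_url, tempo_url, region="us-west-2"):
    """Return three Grafana data source definitions linked for correlation."""
    prometheus = {
        "name": "Prometheus", "type": "prometheus", "uid": "prometheus",
        "access": "proxy", "url": prometheus_url,
        # AWS Managed Prometheus requires SigV4-signed requests.
        "jsonData": {"sigV4Auth": True, "sigV4AuthType": "default",
                     "sigV4Region": region},
    }
    loki = {
        "name": "Loki", "type": "loki", "uid": "loki",
        "access": "proxy", "url": loki_url,
        "jsonData": {
            # Turn a trace ID found in a log line into a link to Tempo.
            # The regex assumes logs contain "trace_id=<id>" (hypothetical).
            "derivedFields": [{
                "name": "TraceID",
                "matcherRegex": r"trace_id=(\w+)",
                "url": "${__value.raw}",
                "datasourceUid": "tempo",
            }],
        },
    }
    tempo = {
        "name": "Tempo", "type": "tempo", "uid": "tempo",
        "access": "proxy", "url": tempo_url,
        # Link from a trace back to the matching Loki logs.
        "jsonData": {"tracesToLogsV2": {"datasourceUid": "loki",
                                        "filterByTraceID": True}},
    }
    return [prometheus, loki, tempo]


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    sources = build_datasources("http://prometheus.example",
                                "http://loki.example:3100",
                                "http://tempo.example:3200")
    print(json.dumps(sources, indent=2))
```

The key design point is that each side references the other by `uid`, so the Loki derived field opens Tempo and the Tempo trace view can jump back to Loki.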
aws cloudformation describe-stacks \
--stack-name GrafanaObservabilityStackStack \
--region us-west-2 \
--query 'Stacks[0].Outputs[?OutputKey==`GrafanaURL`].OutputValue' \
--output text
Login: Username admin, Password admin
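
The `--query` expression above selects the `GrafanaURL` entry from the stack outputs. The same lookup can be sketched in Python against the JSON that `describe-stacks` returns without `--query`; the sample document and its URL below are placeholders, not real output from this stack.

```python
import json


def get_stack_output(describe_stacks_json, output_key):
    """Return the OutputValue for output_key from describe-stacks JSON, or None."""
    data = json.loads(describe_stacks_json)
    for output in data["Stacks"][0].get("Outputs", []):
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    return None


# Placeholder response with the same structure as `describe-stacks` output.
SAMPLE = json.dumps({
    "Stacks": [{
        "StackName": "GrafanaObservabilityStackStack",
        "Outputs": [
            {"OutputKey": "GrafanaURL", "OutputValue": "http://grafana.example"},
        ],
    }]
})

if __name__ == "__main__":
    print(get_stack_output(SAMPLE, "GrafanaURL"))
```

Returning `None` for a missing key makes it easy to detect a stack that has not finished deploying or that uses a different output name.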
tests/test.sh
tests/test-scenario1-invalid-json.sh
This generates errors and failures for testing observability dashboards and alerts.
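
The contents of the scenario script are not shown here, but an invalid-JSON scenario generally means sending a body that claims to be JSON and fails to parse. A minimal sketch of such a request, assuming a hypothetical `/process` endpoint on a placeholder host:

```python
import json
import urllib.request


def build_invalid_json_request(url):
    """Build a POST whose body is declared as JSON but is deliberately truncated."""
    body = b'{"records": [1, 2, 3'  # missing closing bracket and brace
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_valid_json(raw):
    """Return True if raw parses as JSON, False otherwise."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    # Placeholder URL; the request is built but not sent.
    request = build_invalid_json_request("http://localhost:8080/process")
    print(request.get_method(), is_valid_json(request.data))
```

Sending the request with `urllib.request.urlopen(request)` against the deployed service should produce an error response, which is the kind of failure that then shows up in the dashboards.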
This stack works with the Grafana MCP Server to provide LLM-powered observability analysis. The MCP server enables AI agents to:
- Query Grafana dashboards and metrics
- Analyze traces and logs
- Investigate incidents and anomalies
- Provide intelligent troubleshooting recommendations
Deploy both stacks together for a complete agentic observability solution.
cdk destroy
This library is licensed under the MIT-0 License. See the LICENSE file.