Configuration Reference¶
This page documents all configuration variables for the terraform-aws-ecs module.
Required Variables¶
These variables must be provided - they have no defaults.
service_name¶
Name of the ECS service. Used for naming resources and CloudWatch log groups.
docker_image¶
Container image to run. Can be from Docker Hub, ECR, or any registry.
```hcl
# Docker Hub
docker_image = "nginx:latest"

# Amazon ECR
docker_image = "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-app:v1.2.3"

# GitHub Container Registry
docker_image = "ghcr.io/myorg/myapp:main"
```
alarm_emails¶
Email addresses for CloudWatch alarm notifications. At least one required.
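For example (addresses are placeholders):

```hcl
alarm_emails = ["oncall@example.com", "platform-team@example.com"]
```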
load_balancer_subnets¶
Subnet IDs for the load balancer. Use public subnets for internet-facing ALBs.
asg_subnets¶
Subnet IDs for EC2 instances. Use private subnets for security.
dns_names¶
Hostnames to create in Route53. Creates DNS records and SSL certificate.
```hcl
# Single hostname: api.example.com
dns_names = ["api"]

# Multiple hostnames with apex domain
dns_names = ["", "www"] # example.com and www.example.com
```
zone_id¶
Route53 hosted zone ID for DNS records.
Container Configuration¶
container_port¶
TCP port the container listens on.
| Default | Validation |
|---|---|
| 8080 | 1-65535 |
container_cpu¶
CPU units for the container. 1 vCPU = 1024 units.
| Default |
|---|
| 200 |
container_memory¶
Hard memory limit in MB. Container is killed if exceeded.
| Default |
|---|
| 128 |
container_memory_reservation¶
Soft memory limit in MB. Used for task placement decisions.
| Default |
|---|
| null (uses container_memory) |
container_command¶
Override the container's default command.
container_healthcheck_command¶
Container health check command. Exit 0 = healthy.
Default: `curl -f http://localhost/ || exit 1`
Auto Scaling Group (ASG) Configuration¶
asg_instance_type¶
EC2 instance type for the ASG.
| Default |
|---|
| "t3.micro" |
asg_min_size¶
Minimum number of EC2 instances.
| Default | Validation |
|---|---|
| Number of subnets | 1-1000 |
asg_max_size¶
Maximum number of EC2 instances.
| Default | Validation |
|---|---|
| Calculated from task requirements | 1-1000 |
Default Behavior (Recommended):
When not specified, the module automatically calculates the optimal max size based on:
- Memory capacity needed to run `task_max_count` tasks
- CPU capacity needed to run `task_max_count` tasks
- A minimum of `asg_min_size + 1` for scaling headroom
When to Override:
- Cost control: Cap maximum spend
- Capacity planning: Match infrastructure budget
- Testing: Smaller values in non-production
Warning: Setting this value too low can cause:
- ECS tasks failing to place
- Service degradation during traffic spikes
- Deployment failures
```hcl
# Cost control - cap at 10 instances
asg_max_size = 10

# Let the module calculate (recommended)
# asg_max_size = null
```
asg_health_check_grace_period¶
Seconds to wait for instance health check after launch.
| Default |
|---|
| 300 |
on_demand_base_capacity¶
Minimum on-demand instances when using spot instances.
| Default |
|---|
| null (all on-demand) |
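A sketch of a mixed on-demand/spot setup; the count below is illustrative:

```hcl
# Keep two always-on on-demand instances; additional capacity may use spot.
on_demand_base_capacity = 2
```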
Task Scaling Configuration¶
task_desired_count¶
Initial number of tasks to run.
| Default |
|---|
| 1 |
task_min_count¶
Minimum tasks for autoscaling.
| Default | Validation |
|---|---|
| 1 | >= 1 |
task_max_count¶
Maximum tasks for autoscaling.
| Default |
|---|
| 10 |
autoscaling_metric¶
Metric for task autoscaling.
| Default | Valid Values |
|---|---|
| "ECSServiceAverageCPUUtilization" | ECSServiceAverageCPUUtilization, ECSServiceAverageMemoryUtilization, ALBRequestCountPerTarget |
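For example, to scale on memory utilization instead of CPU (the value is one of the valid values listed above):

```hcl
# Scale tasks on average memory utilization instead of CPU.
autoscaling_metric = "ECSServiceAverageMemoryUtilization"
```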
autoscaling_target_cpu_usage¶
Target CPU percentage for scaling (when using CPU metric).
| Default | Validation |
|---|---|
| 60 | 1-100 |
Deployment Strategy¶
By default ECS performs rolling deployments: it starts a new task, waits for it to become healthy, then stops the old one. This is safe for stateless services but breaks when two copies of a service cannot coexist — for example, when a container holds an exclusive file lock on an EFS volume.
Three variables control this behaviour:
| Variable | Default | Purpose |
|---|---|---|
| deployment_minimum_healthy_percent | 100 | Minimum running tasks during deploy (% of desired_count) |
| deployment_maximum_percent | 200 | Maximum running tasks during deploy (% of desired_count) |
| enable_deployment_circuit_breaker | true | Auto-rollback on repeated failures |
When to change the defaults¶
Single-writer / EFS-backed services (e.g. VictoriaLogs, Loki in single-tenant mode, any service that calls flock):
```hcl
# Stop the old task first, then start the new one.
# Brief downtime during deployment — acceptable for background services.
deployment_minimum_healthy_percent = 0
deployment_maximum_percent         = 100
```
The 0 / 100 pair tells ECS: "you may have zero running tasks temporarily, and never more than one." ECS will drain and stop the existing task, release the file lock, and only then launch the replacement.
Stateless web services (default — no change needed):
```hcl
# The defaults keep at least one task running at all times.
# deployment_minimum_healthy_percent = 100
# deployment_maximum_percent         = 200
```
ECS launches a second task alongside the first, shifts traffic once the new task is healthy, then removes the old one. Zero downtime.
Aggressive rolling deploy for large task counts:
```hcl
# Replace up to half the fleet at a time for faster deploys.
deployment_minimum_healthy_percent = 50
deployment_maximum_percent         = 200
```
Tip: Combine `deployment_minimum_healthy_percent = 0` with `enable_deployment_circuit_breaker = true` (the default) so that a bad image is automatically rolled back instead of leaving the service down.
Load Balancer Configuration¶
lb_type¶
Load balancer type.
| Default | Valid Values |
|---|---|
| "alb" | alb, nlb |
When to use ALB (default):
- HTTP/HTTPS services (REST APIs, web apps)
- Need path-based or host-based routing
- Need HTTP-level health checks (`healthcheck_path`)

When to use NLB:
- Raw TCP/UDP services (databases, gRPC, custom protocols)
- Need ultra-low latency or static IPs
- Health check is TCP connection only (no HTTP path)
Note: With NLB, the `healthcheck_path` variable is ignored. Health checks verify only that a TCP connection can be established on `container_port`.
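A minimal NLB sketch; the port value is illustrative:

```hcl
# Raw TCP service behind a Network Load Balancer.
# healthcheck_path is ignored; health checks are TCP-only on container_port.
lb_type        = "nlb"
container_port = 5432 # illustrative: a TCP service such as PostgreSQL
```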
load_balancing_algorithm_type¶
ALB target group routing algorithm.
| Default | Valid Values |
|---|---|
| "round_robin" | round_robin, least_outstanding_requests |
idle_timeout¶
Connection idle timeout in seconds.
| Default |
|---|
| 60 |
healthcheck_path¶
HTTP path for ALB health checks.
| Default |
|---|
| "/index.html" |
healthcheck_interval¶
Seconds between health checks.
| Default |
|---|
| 10 |
healthcheck_timeout¶
Health check timeout in seconds.
| Default | Validation |
|---|---|
| 5 | > 0, must be < healthcheck_interval |
healthcheck_response_code_matcher¶
HTTP status codes considered healthy.
| Default |
|---|
| "200-299" |
extra_target_groups¶
Extra target groups for multi-port containers. Each entry creates an ALB listener and target group, adds a port mapping to the task definition, and registers the ECS service with the target group.
| Default |
|---|
| {} |
```hcl
extra_target_groups = {
  grpc = {
    listener_port  = 4317
    container_port = 4317
    protocol       = "HTTP" # optional, default: "HTTP"
    health_check = { # optional, all fields have defaults
      path     = "/health"      # default: "/"
      port     = "traffic-port" # default: "traffic-port"
      matcher  = "200"          # default: "200-299"
      interval = 30             # default: 30
      timeout  = 5              # default: 5
    }
  }
}
```
Note: Adding or removing entries forces ECS service replacement (an AWS API limitation on `load_balancer` blocks).
alb_access_log_athena_enabled¶
Enable Athena querying for ALB access logs. Only effective with lb_type = "alb".
| Default |
|---|
| false |
When enabled, creates:
- Glue catalog database and table (schema over the ALB access log S3 bucket)
- S3 results bucket (encrypted, 30-day expiry)
- Athena workgroup pre-configured with the results bucket
Once enabled, you can query ALB access logs with SQL in the Athena console. For query examples and detailed usage, see the website-pod documentation.
CloudWatch Configuration¶
enable_cloudwatch_logs¶
Enable CloudWatch logging for containers.
| Default |
|---|
| true |
cloudwatch_log_group_retention¶
Log retention in days.
| Default | Valid Values |
|---|---|
| 365 | 0, 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653 |
cloudwatch_log_kms_key_id¶
KMS key ARN for log encryption.
| Default |
|---|
| null (AWS managed encryption) |
enable_container_insights¶
Enable ECS Container Insights.
| Default |
|---|
| false |
When enabled, the module proactively creates the /aws/ecs/containerinsights/<service_name>/performance log group with the configured cloudwatch_log_group_retention (365 days by default). This ensures ISO/SOC-compliant retention instead of the 1-day default that ECS would set if it created the group itself.
Existing deployments: If ECS already created the log group, import it into Terraform state before applying.
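A sketch using a Terraform 1.5+ `import` block. The resource address below is an assumption, not taken from the module source; confirm the actual address via `terraform state list` or the module's code before applying:

```hcl
# Illustrative only: the resource address inside the module may differ.
import {
  to = module.ecs.aws_cloudwatch_log_group.container_insights
  id = "/aws/ecs/containerinsights/my-service/performance"
}
```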
Environment and Secrets¶
environment¶
Environment name for tagging.
| Default |
|---|
| "development" |
task_environment_variables¶
Environment variables for the container.
```hcl
task_environment_variables = [
  { name = "LOG_LEVEL", value = "info" },
  { name = "API_URL", value = "https://api.example.com" }
]
```
task_secrets¶
Secrets from AWS Secrets Manager or Parameter Store.
```hcl
task_secrets = [
  {
    name      = "DATABASE_PASSWORD"
    valueFrom = "arn:aws:secretsmanager:us-west-2:123456789012:secret:db-password"
  },
  {
    name      = "API_KEY"
    valueFrom = "arn:aws:ssm:us-west-2:123456789012:parameter/api-key"
  }
]
```
Storage¶
task_efs_volumes¶
EFS volumes to mount in containers. Transit encryption is enabled automatically.
task_local_volumes¶
Host volumes to mount in containers.
Advanced Configuration¶
task_role_arn¶
IAM role for containers to assume (for AWS API calls).
execution_extra_policy¶
Additional IAM policies for the task execution role.
```hcl
execution_extra_policy = {
  "secrets" = "arn:aws:iam::123456789012:policy/SecretsAccess"
  "ecr"     = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}
```
ssh_key_name¶
SSH key for EC2 instance access (debugging).
ssh_cidr_block¶
CIDR allowed for SSH access.
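These two variables are typically set together; both values below are illustrative:

```hcl
# Enable SSH for debugging, restricted to an internal network.
ssh_key_name   = "ops-debug-key" # illustrative key pair name
ssh_cidr_block = "10.0.0.0/8"    # illustrative internal CIDR
```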
enable_deployment_circuit_breaker¶
Enable ECS deployment circuit breaker.
| Default |
|---|
| true |
deployment_minimum_healthy_percent¶
Lower limit on the number of running tasks during a deployment, as a percentage of desired_count. Set to 0 for single-task EFS-backed services that cannot run two copies simultaneously (e.g. services using flock).
| Default | Validation |
|---|---|
| 100 | 0-100 |
deployment_maximum_percent¶
Upper limit on the number of running tasks during a deployment, as a percentage of desired_count.
| Default | Validation |
|---|---|
| 200 | 100-400 |
service_health_check_grace_period_seconds¶
Grace period before health checks start for new tasks.
Vector Agent¶
The module can deploy a Vector Agent daemon on every EC2 instance in the cluster. It collects container logs and host metrics, then forwards them to a Vector Aggregator.
enable_vector_agent¶
Enable the Vector Agent daemon.
| Default |
|---|
| false |
vector_aggregator_endpoint¶
Address of the Vector Aggregator (host:port). Required when using the default config template.
vector_agent_image¶
Container image for the Vector Agent.
| Default |
|---|
| timberio/vector:0.43.1-alpine |
vector_agent_exclude_containers¶
Container names to exclude from Vector Agent log collection. The agent always excludes itself (vector-agent) regardless of this list. Only used by the default config template — ignored if vector_agent_config is set.
| Default |
|---|
| ["ecs-agent"] |
```hcl
# Exclude additional sidecars
vector_agent_exclude_containers = ["ecs-agent", "nginx-sidecar", "envoy"]
```
vector_agent_config¶
Custom Vector Agent config (YAML string). Replaces the built-in template.
```hcl
vector_agent_config = templatefile("files/vector.yaml.tftpl", {
  endpoint = "aggregator.example.com:6000"
})
```
ECS Agent Configuration¶
ecs_log_level¶
Log level for the ECS agent running on EC2 instances.
| Default | Valid Values |
|---|---|
| "info" | debug, info, warn, error, crit |
Use debug only for troubleshooting — it generates a high volume of logs that can overwhelm observability pipelines.
Validation Rules¶
The module includes built-in validation to catch errors early:
| Variable | Validation |
|---|---|
| asg_min_size | 1-1000 when set |
| asg_max_size | 1-1000 when set, must be >= asg_min_size |
| container_port | 1-65535 |
| lb_type | "alb" or "nlb" |
| autoscaling_metric | Valid ECS/ALB metric |
| autoscaling_target_cpu_usage | 1-100 |
| extra_target_groups[*].container_port | 1-65535 |
| extra_target_groups[*].listener_port | 1-65535 |
| healthcheck_interval | Must be > healthcheck_timeout |
| vector_aggregator_endpoint | Required when enable_vector_agent = true and no custom config |
| deployment_minimum_healthy_percent | 0-100 |
| deployment_maximum_percent | 100-400 |
| extra_target_groups | Only supported with lb_type = "alb" |
| ecs_log_level | One of: debug, info, warn, error, crit |
Full Example¶
```hcl
module "production_api" {
  source  = "registry.infrahouse.com/infrahouse/ecs/aws"
  version = "7.13.1"

  providers = {
    aws     = aws
    aws.dns = aws
  }

  # Service
  service_name = "api"
  environment  = "production"

  # Container
  docker_image                  = "123456789012.dkr.ecr.us-west-2.amazonaws.com/api:v2.0.0"
  container_port                = 8080
  container_cpu                 = 512
  container_memory              = 1024
  container_healthcheck_command = "curl -f http://localhost:8080/health || exit 1"

  # Environment
  task_environment_variables = [
    { name = "LOG_LEVEL", value = "info" }
  ]
  task_secrets = [
    { name = "DB_PASSWORD", valueFrom = "arn:aws:secretsmanager:..." }
  ]

  # Scaling
  task_desired_count           = 3
  task_min_count               = 2
  task_max_count               = 20
  autoscaling_target_cpu_usage = 60

  # Infrastructure
  asg_instance_type     = "t3.medium"
  load_balancer_subnets = var.public_subnet_ids
  asg_subnets           = var.private_subnet_ids

  # Load Balancer
  healthcheck_path     = "/health"
  healthcheck_interval = 15
  healthcheck_timeout  = 10
  idle_timeout         = 120

  # DNS
  zone_id   = var.zone_id
  dns_names = ["api"]

  # Monitoring
  alarm_emails                   = ["oncall@example.com"]
  enable_cloudwatch_logs         = true
  cloudwatch_log_group_retention = 365
  enable_container_insights      = true

  tags = {
    team    = "platform"
    project = "api"
  }
}
```
Outputs¶
The module exports these outputs for use in downstream configurations.
Service Outputs¶
| Output | Description |
|---|---|
| service_arn | ECS service ARN |
| service_name | ECS service name (for CloudWatch Container Insights metrics) |
| cluster_name | ECS cluster name (for CloudWatch Container Insights metrics) |
DNS and Load Balancer¶
| Output | Description |
|---|---|
| dns_hostnames | List of DNS hostnames where the service is available |
| load_balancer_arn | Load balancer ARN |
| load_balancer_dns_name | Load balancer DNS name |
| load_balancer_arn_suffix | ARN suffix for CloudWatch ALB metrics |
| target_group_arn | Primary target group ARN |
| extra_target_group_arns | Map of extra target group ARNs |
| target_group_arn_suffix | Target group ARN suffix for CloudWatch metrics |
| ssl_listener_arn | SSL listener ARN (ALB only) |
| acm_certificate_arn | ACM certificate ARN used by the load balancer (ALB only) |
| load_balancer_security_groups | Security groups associated with the load balancer (ALB only) |
Auto Scaling Group¶
| Output | Description |
|---|---|
| asg_arn | Auto Scaling Group ARN |
| asg_name | Auto Scaling Group name |
IAM and Security¶
| Output | Description |
|---|---|
| task_execution_role_arn | Task execution role ARN (used by ECS agent) |
| task_execution_role_name | Task execution role name |
| backend_security_group | Security group ID of backend instances |
Athena (ALB only)¶
| Output | Description |
|---|---|
| alb_access_log_glue_database | Glue catalog database name for ALB access logs |
| alb_access_log_glue_table | Glue catalog table name for ALB access logs |
| athena_workgroup | Athena workgroup name for querying ALB access logs |
| athena_results_bucket | S3 bucket for Athena query results |
CloudWatch¶
| Output | Description |
|---|---|
| cloudwatch_log_group_name | Main CloudWatch log group name for ECS tasks |
| cloudwatch_log_group_names | Map of all log group names: ecs, syslog, dmesg |
Usage Examples¶
Reference DNS hostnames:
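A minimal sketch that surfaces the hostnames from the `dns_hostnames` output; the output name `service_urls` is ours, not part of the module:

```hcl
# Build HTTPS URLs from the hostnames the module created in Route53.
output "service_urls" {
  value = [for host in module.ecs.dns_hostnames : "https://${host}"]
}
```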
Create CloudWatch dashboard:
```hcl
resource "aws_cloudwatch_dashboard" "ecs" {
  dashboard_name = "ecs-${module.ecs.service_name}"
  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric"
        properties = {
          metrics = [
            ["AWS/ECS", "CPUUtilization", "ClusterName", module.ecs.cluster_name, "ServiceName", module.ecs.service_name]
          ]
        }
      }
    ]
  })
}
```
Access log groups:
```hcl
# Main ECS task logs
locals {
  ecs_log_group = module.ecs.cloudwatch_log_group_name
}

# All log groups (v7.0+ returns a map)
locals {
  ecs_logs    = module.ecs.cloudwatch_log_group_names["ecs"]
  syslog_logs = module.ecs.cloudwatch_log_group_names["syslog"]
  dmesg_logs  = module.ecs.cloudwatch_log_group_names["dmesg"]
}
```