From 7bfee9d3ac10eaa9ef06ccf101e87e76278564d7 Mon Sep 17 00:00:00 2001
From: Kevin
Date: Thu, 27 Mar 2025 13:06:15 -0400
Subject: [PATCH 1/4] add windows_service service monitor

---
 host/windows/README.md    | 69 +++++++++++++++++++++++++++++++++++++++
 host/windows/common.tf    |  1 +
 host/windows/main.tf      | 35 ++++++++++++++++++++
 host/windows/variables.tf | 47 ++++++++++++++++++++++++++
 host/windows/versions.tf  |  1 +
 5 files changed, 153 insertions(+)
 create mode 100644 host/windows/README.md
 create mode 120000 host/windows/common.tf
 create mode 100644 host/windows/main.tf
 create mode 100644 host/windows/variables.tf
 create mode 120000 host/windows/versions.tf

diff --git a/host/windows/README.md b/host/windows/README.md
new file mode 100644
index 0000000..91e385b
--- /dev/null
+++ b/host/windows/README.md
@@ -0,0 +1,69 @@
+
+## Requirements
+
+| Name | Version |
+|------|---------|
+| [terraform](#requirement\_terraform) | ~> 1.5 |
+| [datadog](#requirement\_datadog) | >= 3.37 |
+| [null](#requirement\_null) | >= 3.1.0 |
+
+## Providers
+
+| Name | Version |
+|------|---------|
+| [datadog](#provider\_datadog) | >= 3.37 |
+
+## Modules
+
+No modules.
+
+## Resources
+
+| Name | Type |
+|------|------|
+| [datadog_monitor.windows_service](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| [additional\_tags](#input\_additional\_tags) | Additional tags to apply to all monitors | `list(string)` | `[]` | no |
+| [alert\_critical\_priority](#input\_alert\_critical\_priority) | Priority for alerts within critical threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
+| [alert\_message](#input\_alert\_message) | Message to prepend to alert notifications | `string` | `"Alert"` | no |
+| [alert\_nodata\_priority](#input\_alert\_nodata\_priority) | Priority for alerts within warning threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
+| [base\_tags](#input\_base\_tags) | Base tags to apply to all monitors | `list(string)` | `[]` | no |
+| [cost\_center](#input\_cost\_center) | Cost Center of the monitored resource (leave blank to omit tag) | `string` | `null` | no |
+| [dashboard\_link](#input\_dashboard\_link) | Dashboard link to include in message | `string` | `null` | no |
+| [env](#input\_env) | Environment the monitored resource is in (leave blank to omit tag) | `string` | `null` | no |
+| [evaluation\_delay](#input\_evaluation\_delay) | Monitor evaluation delay (see [https://docs.datadoghq.com/monitors/configuration/?tab=thresholdalert#set-alert-conditions](Datadog Docs)) | `number` | `900` | no |
+| [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
+| [monitor\_include\_tags](#input\_monitor\_include\_tags) | Tags to be included in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
+| [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before generating alerts for a new resource | `number` | `300` | no |
+| [notify\_alert\_override](#input\_notify\_alert\_override) | List of notifications for alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_crit\_override](#input\_notify\_crit\_override) | List of notifications for 24x7 alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_default](#input\_notify\_default) | List of alert notifications (can be overridden based on alert type) | `list(string)` | n/a | yes |
+| [notify\_no\_data](#input\_notify\_no\_data) | Alert if no matching data is found | `bool` | `false` | no |
+| [notify\_nodata\_override](#input\_notify\_nodata\_override) | List of notifications for no data (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_nonprod\_override](#input\_notify\_nonprod\_override) | List of notifications for non-prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
+| [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
+| [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
+| [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
+| [timeout\_h](#input\_timeout\_h) | Auto-resolve alert in specified hours if condition no longer matches | `number` | `0` | no |
+| [title\_prefix](#input\_title\_prefix) | Prefix all alerts with specified value in brackets | `string` | `null` | no |
+| [title\_suffix](#input\_title\_suffix) | Suffix all alerts with specified value in parenthesis | `string` | `null` | no |
+| [warn\_priority](#input\_warn\_priority) | Priority for alerts with no data (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
+| [windows\_service\_alert\_enabled](#input\_windows\_service\_alert\_enabled) | Enable or disable the Windows service alert monitor | `bool` | `true` | no |
+| [windows\_service\_alert\_operator](#input\_windows\_service\_alert\_operator) | Operator for the Windows service alert threshold comparison | `string` | `"<"` | no |
+| [windows\_service\_alert\_threshold\_critical](#input\_windows\_service\_alert\_threshold\_critical) | Critical threshold for the Windows service alert | `number` | `1` | no |
+| [windows\_service\_alert\_threshold\_warning](#input\_windows\_service\_alert\_threshold\_warning) | Warning threshold for the Windows service alert | `number` | `2` | no |
+| [windows\_service\_alert\_timeframe](#input\_windows\_service\_alert\_timeframe) | Timeframe for the Windows service alert evaluation | `string` | `"5m"` | no |
+| [windows\_service\_alert\_use\_message](#input\_windows\_service\_alert\_use\_message) | Whether to use the base message for the Windows service alert | `bool` | `true` | no |
+
+## Outputs
+
+No outputs.
+
\ No newline at end of file
diff --git a/host/windows/common.tf b/host/windows/common.tf
new file mode 120000
index 0000000..47c0063
--- /dev/null
+++ b/host/windows/common.tf
@@ -0,0 +1 @@
+../../common/common.tf
\ No newline at end of file
diff --git a/host/windows/main.tf b/host/windows/main.tf
new file mode 100644
index 0000000..07b0573
--- /dev/null
+++ b/host/windows/main.tf
@@ -0,0 +1,35 @@
+locals {
+  # these must be defined but do not need to be overridden
+  monitor_alert_default_priority  = null
+  monitor_warn_default_priority   = null
+  monitor_nodata_default_priority = null
+
+  title_prefix = var.title_prefix == null ? "" : "[${var.title_prefix}]"
+  title_suffix = var.title_suffix == null ? "" : " (${var.title_suffix})"
+}
+
+resource "datadog_monitor" "windows_service" {
+  count = var.windows_service_alert_enabled ? 1 : 0
+
+  name    = join("", [local.title_prefix, "Windows Service Alert - {{host.name}}", local.title_suffix])
+  message = var.windows_service_alert_use_message ? local.query_alert_base_message : ""
+  tags    = concat(local.common_tags, var.base_tags, var.additional_tags)
+  type    = "service check"
+
+  evaluation_delay    = var.evaluation_delay
+  notify_no_data      = false
+  renotify_interval   = 0
+  notify_audit        = false
+  timeout_h           = var.timeout_h
+  include_tags        = false
+  require_full_window = true
+
+  query = < Date: Wed, 9 Apr 2025 12:55:47 -0400
Subject: [PATCH 2/4] update windows service query

---
 host/windows/main.tf | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/host/windows/main.tf b/host/windows/main.tf
index 07b0573..73ce626 100644
--- a/host/windows/main.tf
+++ b/host/windows/main.tf
@@ -25,7 +25,7 @@ resource "datadog_monitor" "windows_service" {
   require_full_window = true
 
   query = < Date: Mon, 14 Apr 2025 12:26:32 -0400
Subject: [PATCH 3/4] add elasticache memory monitor

---
 aws/elasticache/README.md    |  9 ++++++++-
 aws/elasticache/main.tf      | 29 +++++++++++++++++++++++++++
 aws/elasticache/variables.tf | 39 ++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/aws/elasticache/README.md b/aws/elasticache/README.md
index 67890f6..e55b60c 100644
--- a/aws/elasticache/README.md
+++ b/aws/elasticache/README.md
@@ -40,6 +40,7 @@ No modules.
 | [datadog_monitor.hit_rate](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.hit_rate_anomaly](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.max_connections](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
+| [datadog_monitor.memory_utilization](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 | [datadog_monitor.swap_usage](https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/monitor) | resource |
 
 ## Inputs
@@ -97,6 +98,12 @@ No modules.
 | [max\_connections\_threshold\_critical](#input\_max\_connections\_threshold\_critical) | Critical threshold (connections) | `number` | `64000` | no |
 | [max\_connections\_threshold\_warning](#input\_max\_connections\_threshold\_warning) | Warning threshold (connections) | `number` | `60000` | no |
 | [max\_connections\_use\_message](#input\_max\_connections\_use\_message) | Whether to use the query alert base message for max connections monitor | `bool` | `false` | no |
+| [memory\_utilization\_enabled](#input\_memory\_utilization\_enabled) | Enable memory utilization monitor | `bool` | `false` | no |
+| [memory\_utilization\_evaluation\_window](#input\_memory\_utilization\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`) | `string` | `"last_1h"` | no |
+| [memory\_utilization\_no\_data\_window](#input\_memory\_utilization\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `15` | no |
+| [memory\_utilization\_threshold\_critical](#input\_memory\_utilization\_threshold\_critical) | Critical threshold (percentage) | `number` | `80` | no |
+| [memory\_utilization\_threshold\_warning](#input\_memory\_utilization\_threshold\_warning) | Warning threshold (percentage) | `number` | `70` | no |
+| [memory\_utilization\_use\_message](#input\_memory\_utilization\_use\_message) | Whether to use the query alert base message for memory utilization monitor | `bool` | `false` | no |
 | [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
 | [monitor\_include\_tags](#input\_monitor\_include\_tags) | Tags to be included in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
 | [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before generating alerts for a new resource | `number` | `300` | no |
@@ -109,7 +116,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [swap\_usage\_enabled](#input\_swap\_usage\_enabled) | Enable swap usage monitor | `bool` | `false` | no |
diff --git a/aws/elasticache/main.tf b/aws/elasticache/main.tf
index 2ad69b1..b214f3d 100644
--- a/aws/elasticache/main.tf
+++ b/aws/elasticache/main.tf
@@ -212,3 +212,32 @@ END
     warning = var.swap_usage_threshold_warning
   }
 }
+
+resource "datadog_monitor" "memory_utilization" {
+  count = var.memory_utilization_enabled ? 1 : 0
+
+  name         = join("", [local.title_prefix, "Elasticache Memory Utilization - {{replication_group.name}} - {{value}}%", local.title_suffix])
+  include_tags = false
+  message      = var.memory_utilization_use_message ? local.query_alert_base_message : ""
+  tags         = concat(local.common_tags, var.base_tags, var.additional_tags)
+  type         = "query alert"
+
+  evaluation_delay    = var.evaluation_delay
+  new_group_delay     = var.new_group_delay
+  notify_no_data      = var.notify_no_data
+  no_data_timeframe   = var.memory_utilization_no_data_window
+  renotify_interval   = var.renotify_interval
+  require_full_window = true
+  timeout_h           = var.timeout_h
+
+  query = <= ${var.memory_utilization_threshold_critical}
+END
+
+  monitor_thresholds {
+    critical = var.memory_utilization_threshold_critical
+    warning  = var.memory_utilization_threshold_warning
+  }
+}
diff --git a/aws/elasticache/variables.tf b/aws/elasticache/variables.tf
index da5dd70..f4e163d 100644
--- a/aws/elasticache/variables.tf
+++ b/aws/elasticache/variables.tf
@@ -321,3 +321,42 @@ variable "swap_usage_use_message" {
   type    = bool
   default = false
 }
+
+########################################
+# Memory Utilization
+########################################
+variable "memory_utilization_enabled" {
+  default     = false
+  description = "Enable memory utilization monitor"
+  type        = bool
+}
+
+variable "memory_utilization_evaluation_window" {
+  default     = "last_1h"
+  description = "Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`)"
+  type        = string
+}
+
+variable "memory_utilization_no_data_window" {
+  default     = 15
+  description = "No data threshold (in minutes, 0 to disable)"
+  type        = number
+}
+
+variable "memory_utilization_threshold_critical" {
+  default     = 80
+  description = "Critical threshold (percentage)"
+  type        = number
+}
+
+variable "memory_utilization_threshold_warning" {
+  default     = 70
+  description = "Warning threshold (percentage)"
+  type        = number
+}
+
+variable "memory_utilization_use_message" {
+  description = "Whether to use the query alert base message for memory utilization monitor"
+  type        = bool
+  default     = false
+}
From b9149c2d9b72c5a65e6c217b62a88d66e0751087 Mon Sep 17 00:00:00 2001
From: Kevin
Date: Mon, 14 Apr 2025 12:26:44 -0400
Subject: [PATCH 4/4] update READMEs

---
 aws/alb/README.md           | 10 +++++-----
 aws/apigateway/README.md    |  2 +-
 aws/beanstalk/README.md     |  2 +-
 aws/ec2/README.md           |  2 +-
 aws/ecs-cluster/README.md   |  2 +-
 aws/ecs-fargate/README.md   |  2 +-
 aws/ecs-service/README.md   |  2 +-
 aws/elasticsearch/README.md |  8 ++++----
 aws/elb/README.md           |  2 +-
 aws/lambda/README.md        |  2 +-
 aws/rds/README.md           |  2 +-
 aws/sqs/README.md           |  2 +-
 aws/vpn/README.md           |  2 +-
 13 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/aws/alb/README.md b/aws/alb/README.md
index 039a2bb..44f2709 100644
--- a/aws/alb/README.md
+++ b/aws/alb/README.md
@@ -48,22 +48,22 @@ No modules.
 | [dashboard\_link](#input\_dashboard\_link) | Dashboard link to include in message | `string` | `null` | no |
 | [env](#input\_env) | Environment the monitored resource is in (leave blank to omit tag) | `string` | `null` | no |
 | [evaluation\_delay](#input\_evaluation\_delay) | Monitor evaluation delay (see [https://docs.datadoghq.com/monitors/configuration/?tab=thresholdalert#set-alert-conditions](Datadog Docs)) | `number` | `900` | no |
-| [http\_5xx\_responses\_enabled](#input\_http\_5xx\_responses\_enabled) | Enable HTTP 5xx response monitor | `bool` | `false` | no |
+| [http\_5xx\_responses\_enabled](#input\_http\_5xx\_responses\_enabled) | Enable HTTP 5xx response monitor | `bool` | `true` | no |
 | [http\_5xx\_responses\_evaluation\_window](#input\_http\_5xx\_responses\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
 | [http\_5xx\_responses\_no\_data\_window](#input\_http\_5xx\_responses\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
 | [http\_5xx\_responses\_threshold\_critical](#input\_http\_5xx\_responses\_threshold\_critical) | Critical threshold (percentage, 0-100) | `number` | `75` | no |
 | [http\_5xx\_responses\_threshold\_warning](#input\_http\_5xx\_responses\_threshold\_warning) | Warning threshold (percentage, 0-100) | `number` | `25` | no |
 | [http\_5xx\_responses\_use\_message](#input\_http\_5xx\_responses\_use\_message) | Whether to use the query alert base message | `bool` | `false` | no |
-| [http\_5xx\_tg\_responses\_enabled](#input\_http\_5xx\_tg\_responses\_enabled) | Enable HTTP 5xx response monitor (target group) | `bool` | `false` | no |
+| [http\_5xx\_tg\_responses\_enabled](#input\_http\_5xx\_tg\_responses\_enabled) | Enable HTTP 5xx response monitor (target group) | `bool` | `true` | no |
 | [http\_5xx\_tg\_responses\_evaluation\_window](#input\_http\_5xx\_tg\_responses\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
 | [http\_5xx\_tg\_responses\_no\_data\_window](#input\_http\_5xx\_tg\_responses\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
 | [http\_5xx\_tg\_responses\_threshold\_critical](#input\_http\_5xx\_tg\_responses\_threshold\_critical) | Critical threshold (percentage, 0-100) | `number` | `75` | no |
 | [http\_5xx\_tg\_responses\_threshold\_warning](#input\_http\_5xx\_tg\_responses\_threshold\_warning) | Warning threshold (percentage, 0-100) | `number` | `25` | no |
 | [http\_5xx\_tg\_responses\_use\_message](#input\_http\_5xx\_tg\_responses\_use\_message) | Whether to use the query alert base message | `bool` | `false` | no |
-| [latency\_enabled](#input\_latency\_enabled) | Enable latency monitor | `bool` | `false` | no |
+| [latency\_enabled](#input\_latency\_enabled) | Enable latency monitor | `bool` | `true` | no |
 | [latency\_evaluation\_window](#input\_latency\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
 | [latency\_no\_data\_window](#input\_latency\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
-| [latency\_threshold\_critical](#input\_latency\_threshold\_critical) | Critical threshold (seconds) | `number` | `null` | no |
+| [latency\_threshold\_critical](#input\_latency\_threshold\_critical) | Critical threshold (seconds) | `number` | `3` | no |
 | [latency\_threshold\_warning](#input\_latency\_threshold\_warning) | Warning threshold (seconds) | `number` | `null` | no |
 | [latency\_use\_message](#input\_latency\_use\_message) | Whether to use the query alert base message | `bool` | `false` | no |
 | [monitor\_exclude\_tags](#input\_monitor\_exclude\_tags) | Tags to be excluded in the monitoring query. Specify in key:value format | `list(string)` | `[]` | no |
@@ -84,7 +84,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/apigateway/README.md b/aws/apigateway/README.md
index 52cd15d..f4069f2 100644
--- a/aws/apigateway/README.md
+++ b/aws/apigateway/README.md
@@ -68,7 +68,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/beanstalk/README.md b/aws/beanstalk/README.md
index 84f314b..403c541 100644
--- a/aws/beanstalk/README.md
+++ b/aws/beanstalk/README.md
@@ -79,7 +79,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [root\_disk\_usage\_enabled](#input\_root\_disk\_usage\_enabled) | Enable root disk usage monitor | `bool` | `false` | no |
 | [root\_disk\_usage\_evaluation\_window](#input\_root\_disk\_usage\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
 | [root\_disk\_usage\_no\_data\_window](#input\_root\_disk\_usage\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
diff --git a/aws/ec2/README.md b/aws/ec2/README.md
index 7679e19..312a5fa 100644
--- a/aws/ec2/README.md
+++ b/aws/ec2/README.md
@@ -57,7 +57,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [status\_failed\_check\_enabled](#input\_status\_failed\_check\_enabled) | Enable ec2 instance status check monitor | `bool` | `true` | no |
diff --git a/aws/ecs-cluster/README.md b/aws/ecs-cluster/README.md
index cdbab68..4477ed6 100644
--- a/aws/ecs-cluster/README.md
+++ b/aws/ecs-cluster/README.md
@@ -88,7 +88,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/ecs-fargate/README.md b/aws/ecs-fargate/README.md
index 9977961..f62b479 100644
--- a/aws/ecs-fargate/README.md
+++ b/aws/ecs-fargate/README.md
@@ -89,7 +89,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/ecs-service/README.md b/aws/ecs-service/README.md
index c7db7ba..12f42b3 100644
--- a/aws/ecs-service/README.md
+++ b/aws/ecs-service/README.md
@@ -82,7 +82,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [running\_tasks\_enabled](#input\_running\_tasks\_enabled) | Enable running tasks monitor | `bool` | `true` | no |
 | [running\_tasks\_evaluation\_window](#input\_running\_tasks\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
diff --git a/aws/elasticsearch/README.md b/aws/elasticsearch/README.md
index 20ad716..8153ec2 100644
--- a/aws/elasticsearch/README.md
+++ b/aws/elasticsearch/README.md
@@ -65,11 +65,11 @@ No modules.
 | [cpu\_utilization\_anomaly\_threshold\_warning](#input\_cpu\_utilization\_anomaly\_threshold\_warning) | Warning threshold (percent) | `number` | `null` | no |
 | [cpu\_utilization\_anomaly\_trigger\_window](#input\_cpu\_utilization\_anomaly\_trigger\_window) | Trigger window for anomaly monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_1h"` | no |
 | [cpu\_utilization\_anomaly\_use\_message](#input\_cpu\_utilization\_anomaly\_use\_message) | Whether to use the query alert base message for CPU utilization anomaly monitor | `bool` | `false` | no |
-| [cpu\_utilization\_enabled](#input\_cpu\_utilization\_enabled) | Enable CPU utilization monitor | `bool` | `false` | no |
+| [cpu\_utilization\_enabled](#input\_cpu\_utilization\_enabled) | Enable CPU utilization monitor | `bool` | `true` | no |
 | [cpu\_utilization\_evaluation\_window](#input\_cpu\_utilization\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_5m"` | no |
 | [cpu\_utilization\_no\_data\_window](#input\_cpu\_utilization\_no\_data\_window) | No data threshold (in minutes, 0 to disable) | `number` | `10` | no |
-| [cpu\_utilization\_threshold\_critical](#input\_cpu\_utilization\_threshold\_critical) | Critical threshold (percent) | `number` | `0.9` | no |
-| [cpu\_utilization\_threshold\_warning](#input\_cpu\_utilization\_threshold\_warning) | Warning threshold (percent) | `number` | `0.8` | no |
+| [cpu\_utilization\_threshold\_critical](#input\_cpu\_utilization\_threshold\_critical) | Critical threshold (percent) | `number` | `90` | no |
+| [cpu\_utilization\_threshold\_warning](#input\_cpu\_utilization\_threshold\_warning) | Warning threshold (percent) | `number` | `80` | no |
 | [cpu\_utilization\_use\_message](#input\_cpu\_utilization\_use\_message) | Whether to use the query alert base message for CPU utilization monitor | `bool` | `false` | no |
 | [dashboard\_link](#input\_dashboard\_link) | Dashboard link to include in message | `string` | `null` | no |
 | [env](#input\_env) | Environment the monitored resource is in (leave blank to omit tag) | `string` | `null` | no |
@@ -92,7 +92,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/elb/README.md b/aws/elb/README.md
index a0edca2..cae1ad0 100644
--- a/aws/elb/README.md
+++ b/aws/elb/README.md
@@ -84,7 +84,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/lambda/README.md b/aws/lambda/README.md
index 77489b5..2fe5b4a 100644
--- a/aws/lambda/README.md
+++ b/aws/lambda/README.md
@@ -94,7 +94,7 @@ No modules.
 | [out\_of\_memory\_threshold\_critical](#input\_out\_of\_memory\_threshold\_critical) | Critical threshold (count) | `number` | `5` | no |
 | [out\_of\_memory\_threshold\_warning](#input\_out\_of\_memory\_threshold\_warning) | Warning threshold (count) | `number` | `null` | no |
 | [out\_of\_memory\_use\_message](#input\_out\_of\_memory\_use\_message) | Whether to use the query alert base message for out of memory monitor | `bool` | `false` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/rds/README.md b/aws/rds/README.md
index 4cdccc4..498cca1 100644
--- a/aws/rds/README.md
+++ b/aws/rds/README.md
@@ -44,7 +44,7 @@ No modules.
 | [alert\_critical\_priority](#input\_alert\_critical\_priority) | Priority for alerts within critical threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
 | [alert\_message](#input\_alert\_message) | Message to prepend to alert notifications | `string` | `"Alert"` | no |
 | [alert\_nodata\_priority](#input\_alert\_nodata\_priority) | Priority for alerts within warning threshold (P1-P5, uses monitor defaults if not specified) | `string` | `null` | no |
-| [base\_tags](#input\_base\_tags) | Base tags (key:value format) to add to this type of check (combined with `local.tags` and `var.additional_tags`, generally you should not change this) | `list(string)` | `[ "resource:rds" ]` | no |
+| [base\_tags](#input\_base\_tags) | Base tags (key:value format) to add to this type of check (combined with `local.tags` and `var.additional_tags`, generally you should not change this) | `list(string)` | `[ "resource:rds" ]` | no |
 | [connection\_count\_anomaly\_deviations](#input\_connection\_count\_anomaly\_deviations) | Standard deviations | `number` | `3` | no |
 | [connection\_count\_anomaly\_enabled](#input\_connection\_count\_anomaly\_enabled) | Enable CPU utilization anomaly monitor | `bool` | `true` | no |
 | [connection\_count\_anomaly\_evaluation\_window](#input\_connection\_count\_anomaly\_evaluation\_window) | Evaluation window for monitor (`last_?m` (1, 5, 10, 15, or 30), `last_?h` (1, 2, or 4), or `last_1d`] | `string` | `"last_4h"` | no |
diff --git a/aws/sqs/README.md b/aws/sqs/README.md
index 2d27fa4..0b81d28 100644
--- a/aws/sqs/README.md
+++ b/aws/sqs/README.md
@@ -68,7 +68,7 @@ No modules.
 | [queue\_depth\_threshold\_critical](#input\_queue\_depth\_threshold\_critical) | Critical threshold (count) | `number` | `null` | no |
 | [queue\_depth\_threshold\_warning](#input\_queue\_depth\_threshold\_warning) | Warning threshold (count) | `number` | `null` | no |
 | [queue\_depth\_use\_message](#input\_queue\_depth\_use\_message) | Whether to use the query alert base message for queue depth monitor | `bool` | `false` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |
diff --git a/aws/vpn/README.md b/aws/vpn/README.md
index 662a44a..d5b9978 100644
--- a/aws/vpn/README.md
+++ b/aws/vpn/README.md
@@ -52,7 +52,7 @@ No modules.
 | [notify\_prod\_override](#input\_notify\_prod\_override) | List of notifications for 12x5 prod alerts in critical threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_recovery\_override](#input\_notify\_recovery\_override) | List of notifications for alert recovery (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
 | [notify\_warn\_override](#input\_notify\_warn\_override) | List of notifications for alerts in warning threshold (uses `notify_default` otherwise) | `list(string)` | `[]` | no |
-| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `0` | no |
+| [renotify\_interval](#input\_renotify\_interval) | Interval in minutes to re-send notifications about an alert | `number` | `60` | no |
 | [runbook\_link](#input\_runbook\_link) | Runbook link to include in message | `string` | `null` | no |
 | [service](#input\_service) | Service associated with the monitored resource (leave blank to omit tag) | `string` | `null` | no |
 | [team](#input\_team) | Team supporting the monitored resource (leave blank to omit tag) | `string` | `null` | no |