kraig-mcfadden/terraform-datadog-apm-monitors

terraform-datadog-apm-monitors

Opinionated set of monitors and SLOs for your Datadog trace metrics.

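For the given operation, the module creates one service time monitor and one error rate monitor per resource name. The monitors ship with default critical thresholds (an error rate of 0.005 and a service time of 0.5 seconds), which you can override through the critical_error_rate and critical_service_time inputs.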

It also creates two SLOs: one covering all of the service time monitors and one covering all of the error rate monitors. Note that an SLO can currently reference at most 20 monitors, so if you have more than 20 resources you will need to use this module more than once: ceil(num_resources / 20) times, with at most 20 resource names per instance. For example, 45 resources need ceil(45 / 20) = 3 instances.
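One way to handle more than 20 resources is to split the list into chunks and create one module instance per chunk. The following is a minimal sketch, not a tested configuration: it assumes the module can be sourced directly from this GitHub repository, and the variable name, module name, and all input values are hypothetical placeholders. It has also not been checked whether the module's monitor and SLO names stay unique across several instances.

```hcl
# Hypothetical variable holding every resource name you want monitored.
variable "all_resource_names" {
  type = list(string)
}

locals {
  # chunklist splits the list into sublists of at most 20 elements,
  # which yields ceil(length / 20) chunks.
  resource_name_chunks = chunklist(var.all_resource_names, 20)
}

module "apm_monitors" {
  # Assumption: the module is consumed straight from its GitHub repository.
  source = "github.com/kraig-mcfadden/terraform-datadog-apm-monitors"

  # One module instance per chunk, keyed by the chunk index.
  for_each = { for idx, names in local.resource_name_chunks : tostring(idx) => names }

  # Placeholder values; replace with your own tags and notification handle.
  env            = "prod"
  service        = "my-service"
  operation      = "http.request"
  team           = "my-team"
  notify         = "@slack-my-channel"
  resource_names = each.value
}
```

Because the instances are keyed by chunk index, inserting or removing a name in the middle of the list shifts later names into different chunks, and Terraform will then move the affected monitors between instances. Appending new names to the end of the list avoids most of that churn.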

The module also creates monitors on the SLO error budgets, so you can be alerted when an error budget is exhausted. Each SLO spans three time horizons (7 days, 30 days, and 90 days), and there is a separate error budget monitor for each horizon.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0.0 |

Providers

| Name | Version |
|------|---------|
| datadog | n/a |
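
Since the module only needs Terraform >= 1.0.0 and a configured datadog provider, the calling configuration might look roughly like the sketch below. The provider source address DataDog/datadog is the provider's registry address; the two variable names are hypothetical, and how you supply credentials (variables, environment variables, or a secrets manager) is up to you.

```hcl
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    datadog = {
      source = "DataDog/datadog"
    }
  }
}

# Hypothetical variable names for the Datadog credentials.
variable "datadog_api_key" {
  type      = string
  sensitive = true
}

variable "datadog_app_key" {
  type      = string
  sensitive = true
}

provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}
```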

Modules

No modules.

Resources

| Name | Type |
|------|------|
| datadog_monitor.error_rate_slo_monitors | resource |
| datadog_monitor.high_error_rate_monitors | resource |
| datadog_monitor.high_service_time_monitors | resource |
| datadog_monitor.service_time_slo_monitors | resource |
| datadog_service_level_objective.error_rate_slo | resource |
| datadog_service_level_objective.service_time_slo | resource |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| critical_error_rate | Threshold error rate we want to alert at | number | 0.005 | no |
| critical_service_time | Threshold service time (amount of time request takes on server) we want to alert at in seconds | number | 0.5 | no |
| env | Environment your traces are tagged with | string | n/a | yes |
| notify | Notification handle for alerts. Must be of the form @pagerduty-{service} or @slack-{channel} etc. depending on the integration you're using | string | n/a | yes |
| operation | Trace metric the queries will look at. Called 'operation' in the APM dashboard | string | n/a | yes |
| resource_names | The resources you want to monitor by name. Check APM dashboard to see what your service has | list(string) | n/a | yes |
| service | Service name your traces are tagged with | string | n/a | yes |
| team | Team that owns these monitors | string | n/a | yes |
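
A basic call to the module, using the inputs listed above, could look like the following sketch. It assumes the module is sourced directly from this GitHub repository; the module name and every value shown are hypothetical placeholders, and the exact operation string to pass should be checked against what the APM dashboard shows for your service.

```hcl
module "checkout_apm_monitors" {
  # Assumption: the module is consumed straight from its GitHub repository.
  source = "github.com/kraig-mcfadden/terraform-datadog-apm-monitors"

  # Required inputs (placeholder values).
  env       = "prod"
  service   = "checkout"
  operation = "http.request"
  team      = "payments"
  notify    = "@pagerduty-checkout"

  # At most 20 names per module instance because of the SLO monitor limit.
  resource_names = [
    "GET /cart",
    "POST /orders",
  ]

  # Optional overrides; the defaults are 0.005 and 0.5 seconds.
  critical_error_rate   = 0.01
  critical_service_time = 1
}
```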

Outputs

No outputs.
