[Doc] Missing and outdated Airflow metrics >=2.7.0 #18584

Open
Casara360 opened this issue Sep 13, 2024 · 2 comments
Casara360 commented Sep 13, 2024

The DD_DOGSTATSD_MAPPER_PROFILES config used by the Airflow integration has no mapping for the queued_duration metric, and its task_instance_created mapping no longer matches on Airflow versions after 2.6.3.

Queued duration

The dag.<dag_id>.<task_id>.queued_duration metric appears to have been introduced in Airflow 2.7.0, and the DogStatsD mapper profile provided on the Integrations page has no entry for it.
As a result, every DAG/task pair is reported under its own metric name, which explodes the number of metrics (+3k in my case).
I suggest adding the following entry:

{
    "match": "airflow\\.dag\\.(.*)\\.([^.]*)\\.queued_duration",
    "match_type": "regex",
    "name": "airflow.dag.task.queued_duration",
    "tags": {"dag_id": "$1", "task_id": "$2"}
},
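
To sanity-check the pattern outside the Agent, here is a minimal Python sketch of what the entry above is expected to do. It assumes the mapper anchors the regex to the full metric name and that Python's `re` behaves like the Agent's regex engine for this pattern; the helper `map_queued_duration` and the sample metric names are hypothetical.

```python
import re

# Same pattern as the proposed entry, after JSON unescaping ("\\." becomes "\.").
QUEUED_DURATION = re.compile(r"airflow\.dag\.(.*)\.([^.]*)\.queued_duration")


def map_queued_duration(metric_name):
    """Return (mapped_name, tags) if the metric matches, else None.

    Hypothetical helper: it only mimics the mapping entry for local testing.
    """
    # fullmatch reflects the assumption that the mapper anchors the pattern.
    m = QUEUED_DURATION.fullmatch(metric_name)
    if m is None:
        return None
    tags = {"dag_id": m.group(1), "task_id": m.group(2)}
    return "airflow.dag.task.queued_duration", tags


# Hypothetical metric names, for illustration only.
print(map_queued_duration("airflow.dag.my_dag.my_task.queued_duration"))
# The greedy (.*) keeps dots in the dag_id; the task_id is the last dot-free segment.
print(map_queued_duration("airflow.dag.team.etl.load.queued_duration"))
# A metric with a different suffix is left unmapped.
print(map_queued_duration("airflow.dag.my_dag.my_task.duration"))
```

With this entry, all per-task metrics collapse into a single airflow.dag.task.queued_duration metric tagged by dag_id and task_id.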

Task instance created

Between versions 2.6.3 and 2.7.0, the name format of the task_instance_created counter changed:

# < 2.7.0
task_instance_created-<operator_name>
# >= 2.7.0
task_instance_created_<operator_name>

Since the current mapping only matches the old format, this results in one metric per operator, which makes it hard to use.
The current mapping is

{
    "match": "airflow.task_instance_created-*",
    "name": "airflow.task.instance_created",
    "tags": {"task_class": "$1"}
}

which could perhaps be

{
    "match": "airflow.task_instance_created*",
    "name": "airflow.task.instance_created",
    "tags": {"task_class": "$1"}
}

to handle both versions?
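
A rough way to compare the two mappings is the Python sketch below. It rests on an assumption about the mapper that I have not verified against the Agent source: that in wildcard mode "." is treated literally, "*" captures a run of non-dot characters, and the whole pattern is anchored. The helper names and the `EXPLICIT` regex are hypothetical; the latter is only one possible alternative that would need "match_type": "regex".

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard match into a regex (assumed mapper behavior)."""
    # Assumption: "." is literal and "*" captures any run of non-dot characters.
    escaped = pattern.replace(".", r"\.").replace("*", "([^.]*)")
    return re.compile("^" + escaped + "$")


CURRENT = wildcard_to_regex("airflow.task_instance_created-*")
PROPOSED = wildcard_to_regex("airflow.task_instance_created*")
# Hypothetical alternative: consume the separator explicitly.
EXPLICIT = re.compile(r"^airflow\.task_instance_created[-_]([^.]*)$")


def task_class(regex, metric_name):
    """Return the value that would be used for the task_class tag, or None."""
    m = regex.match(metric_name)
    return m.group(1) if m else None


# Example operator name, for illustration only.
old_name = "airflow.task_instance_created-BashOperator"  # < 2.7.0
new_name = "airflow.task_instance_created_BashOperator"  # >= 2.7.0

for label, regex in [("current", CURRENT), ("proposed", PROPOSED), ("explicit", EXPLICIT)]:
    print(label, task_class(regex, old_name), task_class(regex, new_name))
```

If the assumption holds, the proposed wildcard does match both formats, but the captured task_class keeps the leading "-" or "_". A regex entry that consumes the separator would give the same tag value on both versions.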


Would it be worth adding a disclaimer warning users that the mapping was created for version X?

@Casara360
Author

The same applies to scheduled_duration, which needs the following entry:

{
    "match": "airflow\\.dag\\.(.*)\\.([^.]*)\\.scheduled_duration",
    "match_type": "regex",
    "name": "airflow.dag.task.scheduled_duration",
    "tags": {"dag_id": "$1", "task_id": "$2"}
},

iliakur self-assigned this Nov 14, 2024
Contributor

iliakur commented Nov 14, 2024

@Casara360 thanks for reporting!

We've recently added a couple more Airflow metrics. I'll take a look at your case soon.
