Updates to ECS Executor Docs #37125

Merged
merged 1 commit on Feb 1, 2024
6 changes: 3 additions & 3 deletions airflow/providers/amazon/aws/executors/ecs/Dockerfile
@@ -37,7 +37,7 @@ COPY <<"EOF" /install_dags_entrypoint.sh
#!/bin/bash

echo "Downloading DAGs from S3 bucket"
aws s3 sync "$S3_URL" "$CONTAINER_DAG_PATH"
aws s3 sync "$S3_URI" "$CONTAINER_DAG_PATH"

/home/airflow/entrypoint.sh "$@"
EOF
@@ -98,8 +98,8 @@ ENV CONTAINER_DAG_PATH=$container_dag_path


# Use these arguments to load DAGs onto the container from S3
-ARG s3_url
-ENV S3_URL=$s3_url
+ARG s3_uri
+ENV S3_URI=$s3_uri
# If using S3 bucket as source of DAGs, uncommenting the next ENTRYPOINT command will overwrite this one.
ENTRYPOINT ["/usr/bin/dumb-init", "--", "/home/airflow/entrypoint.sh"]
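
After building, a quick way to confirm that the renamed variable was baked into the image is to print its environment (a sketch; ``my-airflow-image`` is a hypothetical tag, used again in the docs below):

.. code-block:: bash

    # Print the image's environment; S3_URI should appear if the
    # s3_uri build argument was supplied at build time.
    docker run --rm --entrypoint env my-airflow-image | grep S3_URI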

102 changes: 58 additions & 44 deletions docs/apache-airflow-providers-amazon/executors/ecs-executor.rst
@@ -80,7 +80,7 @@ Required config options:
- CONTAINER_NAME - Name of the container that will be used to execute
Airflow tasks via the ECS executor. The container should be specified
in the ECS Task Definition. Required.
-- REGION - The name of the AWS Region where Amazon ECS is configured.
+- REGION_NAME - The name of the AWS Region where Amazon ECS is configured.
Required.

Optional config options:
@@ -125,51 +125,22 @@ integration, allowing you to interact with AWS services within your
Airflow environment. It also includes options to load DAGs (Directed
Acyclic Graphs) from either an S3 bucket or a local folder.

-Base Image
-~~~~~~~~~~
-
-The Docker image is built upon the ``apache/airflow:latest`` image. See
-`here <https://hub.docker.com/r/apache/airflow>`__ for more information
-about the image.
-
-Important note: The Airflow and python versions in this image must align
-with the Airflow and python versions on the host/container which is
-running the Airflow scheduler process (which in turn runs the executor).
-The Airflow version of the image can be verified by running the
-container locally with the following command:
-
-.. code-block:: bash
-
-    docker run <image_name> version
-
-Similarly, the python version of the image can be verified the following
-command:
-
-.. code-block:: bash
-
-    docker run <image_name> python --version
-
-Ensure that these versions match the versions on the host/container
-which is running the Airflow scheduler process (and thus, the ECS
-executor.) Apache Airflow images with specific python versions can be
-downloaded from the Dockerhub registry, and filtering tags by the
-`python
-version <https://hub.docker.com/r/apache/airflow/tags?page=1&name=3.8>`__.
-For example, the tag ``latest-python3.8`` specifies that the image will
-have python 3.8 installed.
-Download this image to use for the docker build commands below or create
-your own image if you prefer.

Prerequisites
~~~~~~~~~~~~~

Docker must be installed on your system. Instructions for installing
Docker can be found `here <https://docs.docker.com/get-docker/>`__.

-AWS Credentials
-~~~~~~~~~~~~~~~
+Building an Image
+~~~~~~~~~~~~~~~~~

-The `AWS CLI <https://aws.amazon.com/cli/>`__ is installed within the
-container, and there are multiple ways to pass AWS authentication
-information to the container. This guide will cover 2 methods.
+The `AWS CLI <https://aws.amazon.com/cli/>`__ will be installed within the
+image, and there are multiple ways to pass AWS authentication
+information to the container and thus multiple ways to build the image.
+This guide will cover 2 methods.

The most secure method is to use IAM roles. When creating an ECS Task
Definition, you are able to select a Task Role and a Task Execution
@@ -203,6 +174,14 @@ below:
When creating the Task Definition for the ECS cluster (see the :ref:`setup guide <setup_guide>` for more details), select the appropriate
newly created Task Role and Task Execution role for the Task Definition.

+Then you can build your image by ``cd``-ing to the directory with the Dockerfile and running:
+
+.. code-block:: bash
+
+    docker build -t my-airflow-image \
+      --build-arg aws_default_region=YOUR_DEFAULT_REGION .


The second method is to use the build-time arguments
(``aws_access_key_id``, ``aws_secret_access_key``,
``aws_default_region``, and ``aws_session_token``).
@@ -234,6 +213,41 @@ your host machine to the container's ``/home/airflow/.aws`` directory.
Keep in mind the Docker build context when copying the ``.aws`` folder
to the container.
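
Whichever method you use, it can be worth verifying that credentials actually resolve inside the image before deploying. A minimal check, assuming the ``my-airflow-image`` tag from the build command above:

.. code-block:: bash

    # Ask STS which identity the container's credentials resolve to;
    # this fails quickly if no credentials were baked in or mounted.
    docker run --rm --entrypoint aws my-airflow-image sts get-caller-identity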

+Base Image
+~~~~~~~~~~
+
+The Docker image created above is built upon the ``apache/airflow:latest`` image. See
+`here <https://hub.docker.com/r/apache/airflow>`__ for more information
+about the image.
+
+Important note: The Airflow and python versions in this image must align
+with the Airflow and python versions on the host/container which is
+running the Airflow scheduler process (which in turn runs the executor).
+The Airflow version of the image can be verified by running the
+container locally with the following command:
+
+.. code-block:: bash
+
+    docker run my-airflow-image version
+
+Similarly, the python version of the image can be verified with the
+following command:
+
+.. code-block:: bash
+
+    docker run my-airflow-image python --version
+
+Ensure that these versions match the versions on the host/container
+which is running the Airflow scheduler process (and thus, the ECS
+executor). Apache Airflow images with specific python versions can be
+downloaded from the Dockerhub registry by filtering tags by the
+`python
+version <https://hub.docker.com/r/apache/airflow/tags?page=1&name=3.8>`__.
+For example, the tag ``latest-python3.8`` specifies that the image will
+have python 3.8 installed. Update your Dockerfile to use the correct Airflow
+image for your Python version.
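
To compare from the scheduler side, you can print the host's own versions and check them against the ``docker run`` output above (a quick sketch, assuming the ``airflow`` CLI is on the scheduler host's PATH):

.. code-block:: bash

    # Run these where the Airflow scheduler process runs, then compare
    # with the image's output from the commands above.
    airflow version
    python --version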


Loading DAGs
~~~~~~~~~~~~

@@ -251,8 +265,8 @@ Dockerfile to synchronize the DAGs from the specified S3 bucket to the
provide ``container_dag_path`` as a build argument if you want to store
the DAGs in a directory other than ``/opt/airflow/dags``.
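
For example, a build that stores DAGs in a custom directory might look like the following sketch (the path here is illustrative):

.. code-block:: bash

    # container_dag_path overrides the default /opt/airflow/dags location.
    docker build -t my-airflow-image \
      --build-arg container_dag_path=/opt/airflow/my_dags .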

-Add ``--build-arg s3_url=YOUR_S3_URL`` in the docker build command.
-Replace ``YOUR_S3_URL`` with the URL of your S3 bucket. Make sure you
+Add ``--build-arg s3_uri=YOUR_S3_URI`` in the docker build command.
+Replace ``YOUR_S3_URI`` with the URI of your S3 bucket. Make sure you
have the appropriate permissions to read from the bucket.
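
One way to confirm that access before building is to list the bucket with the same credentials the image will use (a sketch; ``YOUR_S3_URI`` is the placeholder from above):

.. code-block:: bash

    # An AccessDenied error here means the sync inside the container
    # would fail in the same way.
    aws s3 ls YOUR_S3_URI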

Note that the following command is also passing in AWS credentials as
@@ -265,7 +279,7 @@ build arguments.
      --build-arg aws_secret_access_key=YOUR_SECRET_KEY \
      --build-arg aws_default_region=YOUR_DEFAULT_REGION \
      --build-arg aws_session_token=YOUR_SESSION_TOKEN \
-      --build-arg s3_url=YOUR_S3_URL .
+      --build-arg s3_uri=YOUR_S3_URI .

From Local Folder
^^^^^^^^^^^^^^^^^
@@ -559,11 +573,11 @@ To configure Airflow to utilize the ECS Executor and leverage the resources we've

.. code-block:: bash

-    export AIRFLOW**CORE**EXECUTOR='airflow.providers.amazon.aws.executors.ecs.ecs_executor.AwsEcsExecutor'
+    export AIRFLOW__CORE__EXECUTOR='airflow.providers.amazon.aws.executors.ecs.ecs_executor.AwsEcsExecutor'

-    export AIRFLOW**DATABASE**SQL*ALCHEMY*CONN=<postgres-connection-string>
+    export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=<postgres-connection-string>

-    export AIRFLOW__AWS_ECS_EXECUTOR__REGION=<executor-region>
+    export AIRFLOW__AWS_ECS_EXECUTOR__REGION_NAME=<executor-region>

    export AIRFLOW__AWS_ECS_EXECUTOR__CLUSTER=<ecs-cluster-name>
