
Wrong notebook being run in pipeline when creating NotebookJobStep with shared environment_variables dict #4856

Open

fakio opened this issue Aug 29, 2024 · 1 comment

Labels: component: pipelines (Relates to the SageMaker Pipeline Platform), type: bug

fakio commented Aug 29, 2024

Describe the bug
When I create a Pipeline with two NotebookJobStep steps, and both steps are created using the same dict for the environment_variables parameter, the first step runs with the second step's input notebook instead of its own.

To reproduce

# Imports assumed for this snippet; `role` is a SageMaker execution role ARN defined elsewhere.
import json

from sagemaker.workflow.notebook_job_step import NotebookJobStep
from sagemaker.workflow.pipeline import Pipeline

env_vars = {
    'test': 'test',
}
steps = [
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job1.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job2.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
]
pipeline = Pipeline(
    name="pipeline",
    steps=steps,
)
pipeline.upsert(role_arn=role)
execution = pipeline.start()

The problem appears to be in the environment variables generated for each step:

print(json.loads(pipeline.definition())["Steps"][0]["Arguments"]["Environment"])
{
    'test': 'test',
    'AWS_DEFAULT_REGION': 'us-east-1',
    'SM_JOB_DEF_VERSION': '1.0',
    'SM_ENV_NAME': 'sagemaker-default-env',
    'SM_SKIP_EFS_SIMULATION': 'true',
    'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
    'SM_KERNEL_NAME': 'python3',
    'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb', <<==== wrong input
    'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
    'SM_INIT_SCRIPT': 'setup.sh'
}

print(json.loads(pipeline.definition())["Steps"][1]["Arguments"]["Environment"])
{
    'test': 'test',
    'AWS_DEFAULT_REGION': 'us-east-1',
    'SM_JOB_DEF_VERSION': '1.0',
    'SM_ENV_NAME': 'sagemaker-default-env',
    'SM_SKIP_EFS_SIMULATION': 'true',
    'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
    'SM_KERNEL_NAME': 'python3',
    'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb',
    'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
    'SM_INIT_SCRIPT': 'setup.sh'
}
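
As a further sign that the shared dict is being mutated in place (this extra check is mine, not part of the original run), printing env_vars itself after the definition has been rendered shows the injected system keys, with the last step's values winning:

print(env_vars.get('SM_INPUT_NOTEBOOK_NAME'))  # expected: 'job2.ipynb' -- the shared dict now carries system keys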

Expected behavior
Each step should run its own notebook: job1.ipynb in the first step and job2.ipynb in the second.

Screenshots or logs

Screenshot of notebook jobs in Studio UI:

[image]

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.226.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):
  • Framework version:
  • Python version: 3.8.18
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N
@svia3 added the component: pipelines (Relates to the SageMaker Pipeline Platform) label on Sep 3, 2024
qidewenwhen (Member) commented

Hi @fakio, thanks for the great find and for the investigation!

This is a bug for sure.

The issue is in this code:

job_envs = self.environment_variables if self.environment_variables else {}
system_envs = {
    "AWS_DEFAULT_REGION": self._region_from_session,
    "SM_JOB_DEF_VERSION": "1.0",
    "SM_ENV_NAME": "sagemaker-default-env",
    "SM_SKIP_EFS_SIMULATION": "true",
    "SM_EXECUTION_INPUT_PATH": "/opt/ml/input/data/"
    "sagemaker_headless_execution_pipelinestep",
    "SM_KERNEL_NAME": self.kernel_name,
    "SM_INPUT_NOTEBOOK_NAME": os.path.basename(self.input_notebook),
    "SM_OUTPUT_NOTEBOOK_NAME": f"{self._underlying_job_prefix}.ipynb",
}
if self.initialization_script:
    system_envs["SM_INIT_SCRIPT"] = os.path.basename(self.initialization_script)
job_envs.update(system_envs)

The user-provided environment_variables dict is updated in place with system_envs. Thus, if the same environment_variables dict is shared across multiple notebook job steps, the steps override each other's entries and the last step's values win.
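
In plain Python terms this is ordinary dict aliasing; a minimal standalone illustration of the mechanism (not SDK code):

shared = {'test': 'test'}
step1_envs = shared  # both names reference the same dict object; no copy is made
step2_envs = shared
step1_envs.update({'SM_INPUT_NOTEBOOK_NAME': 'job1.ipynb'})
step2_envs.update({'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb'})
print(step1_envs['SM_INPUT_NOTEBOOK_NAME'])  # 'job2.ipynb' -- the last update is visible through every reference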

The fix is simply to copy the user-provided environment_variables and add system_envs to the copy.
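
A minimal, self-contained sketch of that fix pattern (the _build_job_envs helper below is illustrative only, not the SDK's actual code):

def _build_job_envs(user_envs, system_envs):
    # Copy the caller's dict (or start fresh) so the system keys are never
    # written back into a dict that may be shared across steps.
    job_envs = dict(user_envs) if user_envs else {}
    job_envs.update(system_envs)
    return job_envs

shared = {'test': 'test'}
step1 = _build_job_envs(shared, {'SM_INPUT_NOTEBOOK_NAME': 'job1.ipynb'})
step2 = _build_job_envs(shared, {'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb'})
assert shared == {'test': 'test'}                       # caller's dict is untouched
assert step1['SM_INPUT_NOTEBOOK_NAME'] == 'job1.ipynb'  # each step keeps its own notebook
assert step2['SM_INPUT_NOTEBOOK_NAME'] == 'job2.ipynb'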

However, as a workaround until we release a fix, you can pass a separate environment_variables dict to each step.
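
For example, in the reproduction snippet above, passing a fresh copy per step avoids the cross-contamination:

steps = [
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job1.ipynb",
        initialization_script="setup.sh",
        environment_variables=dict(env_vars),  # fresh copy, not the shared dict
    ),
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job2.ipynb",
        initialization_script="setup.sh",
        environment_variables=dict(env_vars),  # another independent copy
    ),
]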

I'll create a backlog item for this bug fix in our service queue and will update this issue once the fix PR is published.
