Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Use standard python logging facilities instead of printing to stdout #1657

Closed
saulbein opened this issue Jul 1, 2020 · 5 comments
Closed

Comments

@saulbein
Copy link

saulbein commented Jul 1, 2020

Describe the feature you'd like
Currently, the library uses print() statements to print most logs from the framework itself, and also the ones it receives from models. If the library is used in, for example, a web service, it would be preferable to be able to use the standard python logging library to configure log levels, and be able to send the logs somewhere else than stdout.

How would this feature be used? Please describe.
This would allow the logging to be configured in a standard pythonic way. For example, if I wanted to redirect the logs to a file, I could attach a logging.FileHandler to the sagemaker logger, or alternatively sagemaker.session logger if I wanted to redirect only the session logs. It would also allow customizing the log format, and other features of the python logging module.

Describe alternatives you've considered
The current workaround I am using is to redirect all stdout text to a custom stream handler, which then re-prints them via logging. The problem with this is that I can not control log levels and formats, and if any other library prints to stdout it's output will also end up in the logs.

@metrizable
Copy link
Contributor

@saulbein Thanks for the great suggestion!

@saulbein
Copy link
Author

saulbein commented Jul 2, 2020

I've looked up if I would be able to do a PR myself, and it seems most of the print culprits are in session.py. The difficulty comes from the progress bar like behaviour of some actions, for example when creating a training job:

2020-07-02 10:00:41 Starting - Starting the training job...

The dots here are printed each poll, so they show up like this in logs:

2020-07-02 10:00:41 Starting - Starting the training job
.
.
.

The question then becomes either:

  1. is this behaviour really necessary?
  2. can we preserve this behaviour only when printing to the console, but ignore it when logging?

@metrizable
Copy link
Contributor

off the top of my head, we have a few options:

  1. define a logger as is typical elsewhere in the package and unilaterally convert every print() to a call to the logger
  2. define a logger as is typical elsewhere in the package and judiciously refactor some print() statements to calls to the logger
  3. define a new logger and attach a stream handler targeting sys.stdout, convert print() to calls to this logger

there are probably others that we can consider.

@saulbein
Copy link
Author

saulbein commented Jul 3, 2020

Well, I would strongly prefer the first option - print is essentially the same as logging to stdout (which can be easily set up in cli, and it seems like jupyter sets up logging automatically, so not much work to do there as well). The only real difference is with polling / progress bars, since they would create separate messages for each 'tick'.

In my opinion the third option is a no go, since a stream handler will capture stdout output from other stuff, for example a user's print() statements.

I will try to get some PR done next week, to see how difficult the changes would be. As for the polling 'tick' text, maybe adding an option to disable it would be best? I would not want my logs to end up like this:

2020-07-02 10:00:41 Starting - Starting the training job
2020-07-02 10:00:46 .
2020-07-02 10:00:51 .
2020-07-02 10:00:56 .

@akrishna1995
Copy link
Contributor

Thanks for opening this issue. The fix PR has been merged : #4133. Feel free to reopen if you have any issues - Thank you

Narrohag added a commit to Narrohag/sagemaker-python-sdk that referenced this issue Mar 3, 2025
pintaoz-aws pushed a commit that referenced this issue Mar 5, 2025
…urated Hub Phase 2 (#5070)

* change: update image_uri_configs  01-27-2025 06:18:13 PST

* fix: skip TF tests for unsupported versions (#5007)

* fix: skip TF tests for unsupported versions

* flake8

* change: update image_uri_configs  01-29-2025 06:18:08 PST

* chore: add new images for HF TGI (#5005)

* feat: add pytorch-tgi-inference 2.4.0

* add tgi 3.0.1 image

* skip faulty test

* formatting

* formatting

* add hf pytorch training 4.46

* update version alias

* add py311 to training version

* update tests with pyversion 311

* formatting

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: use jumpstart deployment config image as default optimization image (#4992)

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* prepare release v2.238.0

* update development version to v2.238.1.dev0

* Fix ssh host policy (#4966)

* Fix ssh host policy

* Filter policy by algo-

* Add docstring

* Fix pylint

* Fix docstyle summary

* Unit test

* Fix unit test

* Change to unit test

* Fix unit tests

* Test comment out flaky tests

* Readd the flaky tests

* Remove flaky asserts

* Remove flaky asserts

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* change: Allow telemetry only in supported regions (#5009)

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

---------

Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>

* mpirun protocol - distributed training with @Remote decorator (#4998)

* implemented multi-node distribution with @Remote function

* completed unit tests

* added distributed training with CPU and torchrun

* backwards compatibility nproc_per_node

* fixing code: permissions for non-root users, integration tests

* fixed docstyle

* refactor nproc_per_node for backwards compatibility

* refactor nproc_per_node for backwards compatibility

* pylint fix, newlines

* added unit tests for bootstrap_environment remote

* added  mpirun protocol for distributed training with @Remote decorator

* aligned mpi_utils_remote.py to mpi_utils.py for estimator

* updated docstring for sagemaker sdk doc

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: Add support for deepseek recipes (#5011)

* feat: Add support for deeepseek recipes

* pylint

* add unit test

* feat: [JumpStart] Add access configs and training instance type variants artifact uri handling for Curated Hub Phase 2 training integration (#1653)

* Add access config to training input for Curated Hub Training Integration

* Add support to retrieve instance specific training artifact keys

* Fix some typos and naming issues

* Fix more typos

* fix formatting issues with black

* modify access config logic so accept_eula is passed into fit

* update black formatting

* Add more unit tests for passing access configs

* fix style errors

* fix for failing integ test

* fix styles and integ test error

* skip blocking integ test

* fix formatting

* remove env vars when access configs are being used

* fix docstyle issue

* update usage of access configs, remove conversion of training artifact key to uri

* fix styling issues

* fix styling issues

* fix unit tests

* fix adding hubaccessconfig only if hubcontentarn exists

* move logic to JumpStartEstimator from Job

* Fix styling issues

* Remove unused code

* fix styling issues

* fix unit test failure

* fix some formatting, add comments

* remove typing for estimator in get_access_configs function

* fix circular import dependency

* fix styling issues

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* Always add code channel, regardless of network isolation (#1657)

* fix formatting issue

* fix formatting issue

* fix formatting issue

* fix tensorflow file

---------

Co-authored-by: sagemaker-bot <sagemaker-bot@amazon.com>
Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>
Co-authored-by: varunmoris <176621270+varunmoris@users.noreply.github.com>
Co-authored-by: Gary Wang <38331932+gwang111@users.noreply.github.com>
Co-authored-by: ci <ci>
Co-authored-by: parknate@ <parknate@amazon.com>
Co-authored-by: rsareddy0329 <rsareddy0329@gmail.com>
Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
Co-authored-by: Bruno Pistone <brn.pistone@gmail.com>
mollyheamazon pushed a commit to mollyheamazon/sagemaker-python-sdk that referenced this issue Mar 14, 2025
…urated Hub Phase 2 (aws#5070)

* change: update image_uri_configs  01-27-2025 06:18:13 PST

* fix: skip TF tests for unsupported versions (aws#5007)

* fix: skip TF tests for unsupported versions

* flake8

* change: update image_uri_configs  01-29-2025 06:18:08 PST

* chore: add new images for HF TGI (aws#5005)

* feat: add pytorch-tgi-inference 2.4.0

* add tgi 3.0.1 image

* skip faulty test

* formatting

* formatting

* add hf pytorch training 4.46

* update version alias

* add py311 to training version

* update tests with pyversion 311

* formatting

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: use jumpstart deployment config image as default optimization image (aws#4992)

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* prepare release v2.238.0

* update development version to v2.238.1.dev0

* Fix ssh host policy (aws#4966)

* Fix ssh host policy

* Filter policy by algo-

* Add docstring

* Fix pylint

* Fix docstyle summary

* Unit test

* Fix unit test

* Change to unit test

* Fix unit tests

* Test comment out flaky tests

* Readd the flaky tests

* Remove flaky asserts

* Remove flaky asserts

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* change: Allow telemetry only in supported regions (aws#5009)

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

---------

Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>

* mpirun protocol - distributed training with @Remote decorator (aws#4998)

* implemented multi-node distribution with @Remote function

* completed unit tests

* added distributed training with CPU and torchrun

* backwards compatibility nproc_per_node

* fixing code: permissions for non-root users, integration tests

* fixed docstyle

* refactor nproc_per_node for backwards compatibility

* refactor nproc_per_node for backwards compatibility

* pylint fix, newlines

* added unit tests for bootstrap_environment remote

* added  mpirun protocol for distributed training with @Remote decorator

* aligned mpi_utils_remote.py to mpi_utils.py for estimator

* updated docstring for sagemaker sdk doc

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: Add support for deepseek recipes (aws#5011)

* feat: Add support for deeepseek recipes

* pylint

* add unit test

* feat: [JumpStart] Add access configs and training instance type variants artifact uri handling for Curated Hub Phase 2 training integration (aws#1653)

* Add access config to training input for Curated Hub Training Integration

* Add support to retrieve instance specific training artifact keys

* Fix some typos and naming issues

* Fix more typos

* fix formatting issues with black

* modify access config logic so accept_eula is passed into fit

* update black formatting

* Add more unit tests for passing access configs

* fix style errors

* fix for failing integ test

* fix styles and integ test error

* skip blocking integ test

* fix formatting

* remove env vars when access configs are being used

* fix docstyle issue

* update usage of access configs, remove conversion of training artifact key to uri

* fix styling issues

* fix styling issues

* fix unit tests

* fix adding hubaccessconfig only if hubcontentarn exists

* move logic to JumpStartEstimator from Job

* Fix styling issues

* Remove unused code

* fix styling issues

* fix unit test failure

* fix some formatting, add comments

* remove typing for estimator in get_access_configs function

* fix circular import dependency

* fix styling issues

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* Always add code channel, regardless of network isolation (aws#1657)

* fix formatting issue

* fix formatting issue

* fix formatting issue

* fix tensorflow file

---------

Co-authored-by: sagemaker-bot <sagemaker-bot@amazon.com>
Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>
Co-authored-by: varunmoris <176621270+varunmoris@users.noreply.github.com>
Co-authored-by: Gary Wang <38331932+gwang111@users.noreply.github.com>
Co-authored-by: ci <ci>
Co-authored-by: parknate@ <parknate@amazon.com>
Co-authored-by: rsareddy0329 <rsareddy0329@gmail.com>
Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
Co-authored-by: Bruno Pistone <brn.pistone@gmail.com>
evakravi pushed a commit to evakravi/sagemaker-python-sdk that referenced this issue Mar 20, 2025
…urated Hub Phase 2 (aws#5070)

* change: update image_uri_configs  01-27-2025 06:18:13 PST

* fix: skip TF tests for unsupported versions (aws#5007)

* fix: skip TF tests for unsupported versions

* flake8

* change: update image_uri_configs  01-29-2025 06:18:08 PST

* chore: add new images for HF TGI (aws#5005)

* feat: add pytorch-tgi-inference 2.4.0

* add tgi 3.0.1 image

* skip faulty test

* formatting

* formatting

* add hf pytorch training 4.46

* update version alias

* add py311 to training version

* update tests with pyversion 311

* formatting

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: use jumpstart deployment config image as default optimization image (aws#4992)

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* prepare release v2.238.0

* update development version to v2.238.1.dev0

* Fix ssh host policy (aws#4966)

* Fix ssh host policy

* Filter policy by algo-

* Add docstring

* Fix pylint

* Fix docstyle summary

* Unit test

* Fix unit test

* Change to unit test

* Fix unit tests

* Test comment out flaky tests

* Readd the flaky tests

* Remove flaky asserts

* Remove flaky asserts

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* change: Allow telemetry only in supported regions (aws#5009)

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

* change: Allow telemetry only in supported regions

---------

Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>

* mpirun protocol - distributed training with @Remote decorator (aws#4998)

* implemented multi-node distribution with @Remote function

* completed unit tests

* added distributed training with CPU and torchrun

* backwards compatibility nproc_per_node

* fixing code: permissions for non-root users, integration tests

* fixed docstyle

* refactor nproc_per_node for backwards compatibility

* refactor nproc_per_node for backwards compatibility

* pylint fix, newlines

* added unit tests for bootstrap_environment remote

* added  mpirun protocol for distributed training with @Remote decorator

* aligned mpi_utils_remote.py to mpi_utils.py for estimator

* updated docstring for sagemaker sdk doc

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* feat: Add support for deepseek recipes (aws#5011)

* feat: Add support for deeepseek recipes

* pylint

* add unit test

* feat: [JumpStart] Add access configs and training instance type variants artifact uri handling for Curated Hub Phase 2 training integration (aws#1653)

* Add access config to training input for Curated Hub Training Integration

* Add support to retrieve instance specific training artifact keys

* Fix some typos and naming issues

* Fix more typos

* fix formatting issues with black

* modify access config logic so accept_eula is passed into fit

* update black formatting

* Add more unit tests for passing access configs

* fix style errors

* fix for failing integ test

* fix styles and integ test error

* skip blocking integ test

* fix formatting

* remove env vars when access configs are being used

* fix docstyle issue

* update usage of access configs, remove conversion of training artifact key to uri

* fix styling issues

* fix styling issues

* fix unit tests

* fix adding hubaccessconfig only if hubcontentarn exists

* move logic to JumpStartEstimator from Job

* Fix styling issues

* Remove unused code

* fix styling issues

* fix unit test failure

* fix some formatting, add comments

* remove typing for estimator in get_access_configs function

* fix circular import dependency

* fix styling issues

---------

Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

* Always add code channel, regardless of network isolation (aws#1657)

* fix formatting issue

* fix formatting issue

* fix formatting issue

* fix tensorflow file

---------

Co-authored-by: sagemaker-bot <sagemaker-bot@amazon.com>
Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>
Co-authored-by: varunmoris <176621270+varunmoris@users.noreply.github.com>
Co-authored-by: Gary Wang <38331932+gwang111@users.noreply.github.com>
Co-authored-by: ci <ci>
Co-authored-by: parknate@ <parknate@amazon.com>
Co-authored-by: rsareddy0329 <rsareddy0329@gmail.com>
Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
Co-authored-by: Bruno Pistone <brn.pistone@gmail.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

4 participants