Use standard python logging facilities instead of printing to stdout #1657

saulbein · 2020-07-01T09:52:03Z

Describe the feature you'd like
Currently, the library uses print() statements to print most logs from the framework itself, and also the ones it receives from models. If the library is used in, for example, a web service, it would be preferable to be able to use the standard python logging library to configure log levels, and be able to send the logs somewhere else than stdout.

How would this feature be used? Please describe.
This would allow the logging to be configured in a standard pythonic way. For example, if I wanted to redirect the logs to a file, I could attach a logging.FileHandler to the sagemaker logger, or alternatively sagemaker.session logger if I wanted to redirect only the session logs. It would also allow customizing the log format, and other features of the python logging module.

Describe alternatives you've considered
The current workaround I am using is to redirect all stdout text to a custom stream handler, which then re-prints them via logging. The problem with this is that I can not control log levels and formats, and if any other library prints to stdout it's output will also end up in the logs.

The text was updated successfully, but these errors were encountered:

metrizable · 2020-07-02T00:06:47Z

@saulbein Thanks for the great suggestion!

saulbein · 2020-07-02T10:09:07Z

I've looked up if I would be able to do a PR myself, and it seems most of the print culprits are in session.py. The difficulty comes from the progress bar like behaviour of some actions, for example when creating a training job:

2020-07-02 10:00:41 Starting - Starting the training job...

The dots here are printed each poll, so they show up like this in logs:

2020-07-02 10:00:41 Starting - Starting the training job
.
.
.

The question then becomes either:

is this behaviour really necessary?
can we preserve this behaviour only when printing to the console, but ignore it when logging?

metrizable · 2020-07-02T18:02:42Z

off the top of my head, we have a few options:

define a logger as is typical elsewhere in the package and unilaterally convert every print() to a call to the logger
define a logger as is typical elsewhere in the package and judiciously refactor some print() statements to calls to the logger
define a new logger and attach a stream handler targeting sys.stdout, convert print() to calls to this logger

there are probably others that we can consider.

saulbein · 2020-07-03T10:20:11Z

Well, I would strongly prefer the first option - print is essentially the same as logging to stdout (which can be easily set up in cli, and it seems like jupyter sets up logging automatically, so not much work to do there as well). The only real difference is with polling / progress bars, since they would create separate messages for each 'tick'.

In my opinion the third option is a no go, since a stream handler will capture stdout output from other stuff, for example a user's print() statements.

I will try to get some PR done next week, to see how difficult the changes would be. As for the polling 'tick' text, maybe adding an option to disable it would be best? I would not want my logs to end up like this:

2020-07-02 10:00:41 Starting - Starting the training job
2020-07-02 10:00:46 .
2020-07-02 10:00:51 .
2020-07-02 10:00:56 .

akrishna1995 · 2023-12-21T21:51:19Z

Thanks for opening this issue. The fix PR has been merged : #4133. Feel free to reopen if you have any issues - Thank you

…urated Hub Phase 2 (#5070) * change: update image_uri_configs 01-27-2025 06:18:13 PST * fix: skip TF tests for unsupported versions (#5007) * fix: skip TF tests for unsupported versions * flake8 * change: update image_uri_configs 01-29-2025 06:18:08 PST * chore: add new images for HF TGI (#5005) * feat: add pytorch-tgi-inference 2.4.0 * add tgi 3.0.1 image * skip faulty test * formatting * formatting * add hf pytorch training 4.46 * update version alias * add py311 to training version * update tests with pyversion 311 * formatting --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * feat: use jumpstart deployment config image as default optimization image (#4992) Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * prepare release v2.238.0 * update development version to v2.238.1.dev0 * Fix ssh host policy (#4966) * Fix ssh host policy * Filter policy by algo- * Add docstring * Fix pylint * Fix docstyle summary * Unit test * Fix unit test * Change to unit test * Fix unit tests * Test comment out flaky tests * Readd the flaky tests * Remove flaky asserts * Remove flaky asserts --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * change: Allow telemetry only in supported regions (#5009) * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> * mpirun protocol - distributed training with @Remote decorator (#4998) * implemented multi-node distribution with @Remote function * completed unit tests * added distributed training with CPU and torchrun * backwards compatibility nproc_per_node * fixing code: permissions for non-root users, integration tests * fixed docstyle * refactor nproc_per_node for backwards compatibility * refactor nproc_per_node for backwards compatibility * pylint fix, newlines * added unit tests for bootstrap_environment remote * added mpirun protocol for distributed training with @Remote decorator * aligned mpi_utils_remote.py to mpi_utils.py for estimator * updated docstring for sagemaker sdk doc --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * feat: Add support for deepseek recipes (#5011) * feat: Add support for deeepseek recipes * pylint * add unit test * feat: [JumpStart] Add access configs and training instance type variants artifact uri handling for Curated Hub Phase 2 training integration (#1653) * Add access config to training input for Curated Hub Training Integration * Add support to retrieve instance specific training artifact keys * Fix some typos and naming issues * Fix more typos * fix formatting issues with black * modify access config logic so accept_eula is passed into fit * update black formatting * Add more unit tests for passing access configs * fix style errors * fix for failing integ test * fix styles and integ test error * skip blocking integ test * fix formatting * remove env vars when access configs are being used * fix docstyle issue * update usage of access configs, remove conversion of training artifact key to uri * fix styling issues * fix styling issues * fix unit tests * fix adding hubaccessconfig only if hubcontentarn exists * move logic to JumpStartEstimator from Job * Fix styling issues * Remove unused code * fix styling issues * fix unit test failure * fix some formatting, add comments * remove typing for estimator in get_access_configs function * fix circular import dependency * fix styling issues --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * Always add code channel, regardless of network isolation (#1657) * fix formatting issue * fix formatting issue * fix formatting issue * fix tensorflow file --------- Co-authored-by: sagemaker-bot <sagemaker-bot@amazon.com> Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> Co-authored-by: varunmoris <176621270+varunmoris@users.noreply.github.com> Co-authored-by: Gary Wang <38331932+gwang111@users.noreply.github.com> Co-authored-by: ci <ci> Co-authored-by: parknate@ <parknate@amazon.com> Co-authored-by: rsareddy0329 <rsareddy0329@gmail.com> Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> Co-authored-by: Bruno Pistone <brn.pistone@gmail.com>

…urated Hub Phase 2 (aws#5070) * change: update image_uri_configs 01-27-2025 06:18:13 PST * fix: skip TF tests for unsupported versions (aws#5007) * fix: skip TF tests for unsupported versions * flake8 * change: update image_uri_configs 01-29-2025 06:18:08 PST * chore: add new images for HF TGI (aws#5005) * feat: add pytorch-tgi-inference 2.4.0 * add tgi 3.0.1 image * skip faulty test * formatting * formatting * add hf pytorch training 4.46 * update version alias * add py311 to training version * update tests with pyversion 311 * formatting --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * feat: use jumpstart deployment config image as default optimization image (aws#4992) Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * prepare release v2.238.0 * update development version to v2.238.1.dev0 * Fix ssh host policy (aws#4966) * Fix ssh host policy * Filter policy by algo- * Add docstring * Fix pylint * Fix docstyle summary * Unit test * Fix unit test * Change to unit test * Fix unit tests * Test comment out flaky tests * Readd the flaky tests * Remove flaky asserts * Remove flaky asserts --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * change: Allow telemetry only in supported regions (aws#5009) * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions * change: Allow telemetry only in supported regions --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> * mpirun protocol - distributed training with @Remote decorator (aws#4998) * implemented multi-node distribution with @Remote function * completed unit tests * added distributed training with CPU and torchrun * backwards compatibility nproc_per_node * fixing code: permissions for non-root users, integration tests * fixed docstyle * refactor nproc_per_node for backwards compatibility * refactor nproc_per_node for backwards compatibility * pylint fix, newlines * added unit tests for bootstrap_environment remote * added mpirun protocol for distributed training with @Remote decorator * aligned mpi_utils_remote.py to mpi_utils.py for estimator * updated docstring for sagemaker sdk doc --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * feat: Add support for deepseek recipes (aws#5011) * feat: Add support for deeepseek recipes * pylint * add unit test * feat: [JumpStart] Add access configs and training instance type variants artifact uri handling for Curated Hub Phase 2 training integration (aws#1653) * Add access config to training input for Curated Hub Training Integration * Add support to retrieve instance specific training artifact keys * Fix some typos and naming issues * Fix more typos * fix formatting issues with black * modify access config logic so accept_eula is passed into fit * update black formatting * Add more unit tests for passing access configs * fix style errors * fix for failing integ test * fix styles and integ test error * skip blocking integ test * fix formatting * remove env vars when access configs are being used * fix docstyle issue * update usage of access configs, remove conversion of training artifact key to uri * fix styling issues * fix styling issues * fix unit tests * fix adding hubaccessconfig only if hubcontentarn exists * move logic to JumpStartEstimator from Job * Fix styling issues * Remove unused code * fix styling issues * fix unit test failure * fix some formatting, add comments * remove typing for estimator in get_access_configs function * fix circular import dependency * fix styling issues --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> * Always add code channel, regardless of network isolation (aws#1657) * fix formatting issue * fix formatting issue * fix formatting issue * fix tensorflow file --------- Co-authored-by: sagemaker-bot <sagemaker-bot@amazon.com> Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> Co-authored-by: varunmoris <176621270+varunmoris@users.noreply.github.com> Co-authored-by: Gary Wang <38331932+gwang111@users.noreply.github.com> Co-authored-by: ci <ci> Co-authored-by: parknate@ <parknate@amazon.com> Co-authored-by: rsareddy0329 <rsareddy0329@gmail.com> Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> Co-authored-by: Bruno Pistone <brn.pistone@gmail.com>

metrizable added the type: feature request label Jul 2, 2020

metrizable added the contributions welcome label Jul 2, 2020

martinRenou mentioned this issue Sep 27, 2023

change Using logging instead of prints #4133

Merged

9 tasks

martinRenou self-assigned this Oct 3, 2023

akrishna1995 closed this as completed Dec 21, 2023

Narrohag added a commit to Narrohag/sagemaker-python-sdk that referenced this issue Mar 3, 2025

Always add code channel, regardless of network isolation (aws#1657)

85e8286

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use standard python logging facilities instead of printing to stdout #1657

Use standard python logging facilities instead of printing to stdout #1657

saulbein commented Jul 1, 2020

metrizable commented Jul 2, 2020

saulbein commented Jul 2, 2020

metrizable commented Jul 2, 2020

saulbein commented Jul 3, 2020

akrishna1995 commented Dec 21, 2023

Use standard python logging facilities instead of printing to stdout #1657

Use standard python logging facilities instead of printing to stdout #1657

Comments

saulbein commented Jul 1, 2020

metrizable commented Jul 2, 2020

saulbein commented Jul 2, 2020

metrizable commented Jul 2, 2020

saulbein commented Jul 3, 2020

akrishna1995 commented Dec 21, 2023