Quality of life `smart validate` Improvements #458

MattToast · 2024-01-19T18:04:36Z

Quality of life `smart validate` improvements:

- Set the `CUDA_VISIBLE_DEVICES` environment variable within `smart validate` prior to importing any ML deps to prevent false negatives on multi-GPU systems
- Move SmartRedis logs from standard out to a dedicated log file in the validation temporary directory
- Suppress the `sklearn` deprecation warning by pinning the `KMeans` constructor argument
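The first item can be illustrated with a small sketch. This is not the code from this PR: `visible_devices` is a hypothetical helper, and the default device id `"0"` is an assumption. It only shows the general idea of pinning `CUDA_VISIBLE_DEVICES` before any ML framework is imported, then restoring the caller's environment afterwards.

```python
import contextlib
import os
import typing as t


@contextlib.contextmanager
def visible_devices(device: str, ids: str = "0") -> t.Iterator[None]:
    """Temporarily pin CUDA_VISIBLE_DEVICES when validating on a GPU.

    Hypothetical helper for illustration only; not part of SmartSim.
    """
    key = "CUDA_VISIBLE_DEVICES"
    previous = os.environ.get(key)
    if device.lower() == "gpu":
        # Must happen before torch/tensorflow are imported and touch CUDA
        os.environ[key] = ids
    try:
        yield
    finally:
        # Restore whatever the caller had set (or unset) before
        if previous is None:
            os.environ.pop(key, None)
        else:
            os.environ[key] = previous
```

In use, the framework imports and the device checks would sit inside the `with visible_devices("gpu"):` block, so that they only ever see the pinned device.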

al-rigazzi

LGTM, I love quality of life improvements!

codecov · 2024-01-19T18:15:09Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c38f73f) 90.42% compared to head (53fa04f) 90.80%.
Report is 3 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #458      +/-   ##
===========================================
+ Coverage    90.42%   90.80%   +0.38%     
===========================================
  Files           60       60              
  Lines         3738     3830      +92     
===========================================
+ Hits          3380     3478      +98     
+ Misses         358      352       -6

see 13 files with indirect coverage changes

ashao

Looks good! Thanks for finding this

smartsim/_core/_cli/validate.py

ashao · 2024-01-19T21:30:48Z

        if with_pt:
            logger.info("Verifying Torch Backend")
            _test_torch_install(client, device)
+        if with_tf:


If this fixes the problem we found, could you move the TensorFlow test to the very end?

absolutely!!

@MattToast

Quality of life `smart validate` improvements:

- Set `CUDA_VISIBLE_DEVICES` environment variable within `smart validate` prior to importing any ML deps to prevent false negatives on multi-GPU systems
- Move SmartRedis logs from standard out to dedicated log file in the validation temporary directory
- Suppress `sklearn` deprecation warning by pinning `KMeans` constructor argument
- Move TF test to last as TF may reserve the GPUs it uses

[ committed by @MattToast ]
[ reviewed by @al-rigazzi @ashao ]
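The "TF last" ordering from the commit message can be sketched generically. In the PR itself the order is simply the order of the `if with_pt:` / `if with_tf:` blocks in `validate.py`. The function below is a hypothetical illustration of the same idea, and the backend names are assumptions.

```python
import typing as t


def order_backend_checks(
    requested: t.Iterable[str], last: str = "tensorflow"
) -> t.List[str]:
    """Return backend names with `last` moved to the end.

    Hypothetical helper for illustration only; not part of SmartSim.
    """
    # Drop duplicates while keeping the requested order
    names = list(dict.fromkeys(requested))
    # sorted() is stable and False < True, so only `last` moves to the end
    return sorted(names, key=lambda name: name == last)
```

Running the backend that may reserve GPU memory last means the earlier checks are not starved of devices.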

MattToast added the type: usability label (Issues related to ease of use) Jan 19, 2024

MattToast requested review from ashao and al-rigazzi January 19, 2024 18:04

MattToast self-assigned this Jan 19, 2024

al-rigazzi approved these changes Jan 19, 2024

ashao approved these changes Jan 19, 2024

reviewer feedback, change order of tests

cdc4646

ashao reviewed Jan 19, 2024

run TF last

53fa04f

MattToast merged commit 35973b5 into CrayLabs:develop Jan 20, 2024
26 checks passed

MattToast deleted the val-w-many-gpu branch January 22, 2024 23:10
