Various fixes to work with large datasets in better way #1019

Merged
merged 4 commits into from
Jan 24, 2023

nicl-nno
Collaborator

@nicl-nno nicl-nno commented Jan 11, 2023

Several small fixes are applied:

  • Stopping if even one fold failed
  • Presets bug fixed
  • More stable work with short timeouts
  • Various minor changes
  • Sequential mode without Joblib parallelization
  • Tuner timeouts processing
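
The first item in the list above (stopping when even one fold fails) can be pictured with a minimal sketch. The names `evaluate_folds` and `fit_and_score` are hypothetical and are not taken from the FEDOT code base; the sketch only assumes that a candidate is scored fold by fold and that a single failure should invalidate the whole candidate.

```python
from statistics import mean
from typing import Callable, Iterable, Optional


def evaluate_folds(folds: Iterable, fit_and_score: Callable[[object], float]) -> Optional[float]:
    """Return the mean fold score, or None as soon as any fold fails.

    Hypothetical helper: `fit_and_score` stands in for fitting and
    scoring a candidate on one cross-validation fold.
    """
    scores = []
    for fold in folds:
        try:
            scores.append(fit_and_score(fold))
        except Exception:
            # One failed fold invalidates the candidate, so the remaining
            # folds are skipped instead of being fitted for nothing.
            return None
    return mean(scores) if scores else None
```

On a large dataset each fold fit is expensive, so returning early avoids spending the time budget on a candidate that will be discarded anyway.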

@nicl-nno nicl-nno added the in progress task in progress label Jan 11, 2023

codecov bot commented Jan 11, 2023

Codecov Report

Merging #1019 (3fff7d4) into master (03ae732) will decrease coverage by 0.13%.
The diff coverage is 72.22%.

❗ Current head 3fff7d4 differs from pull request most recent head 4c6ebb4. Consider uploading reports for the commit 4c6ebb4 to get more accurate results

@@            Coverage Diff             @@
##           master    #1019      +/-   ##
==========================================
- Coverage   87.88%   87.75%   -0.13%     
==========================================
  Files         206      206              
  Lines       13805    13849      +44     
==========================================
+ Hits        12132    12153      +21     
- Misses       1673     1696      +23     
Impacted Files Coverage Δ
fedot/core/pipelines/tuning/tuner_builder.py 97.18% <ø> (ø)
.../core/optimisers/opt_history_objects/individual.py 81.41% <20.00%> (-2.85%) ⬇️
fedot/core/optimisers/gp_comp/evaluation.py 94.01% <33.33%> (-3.48%) ⬇️
fedot/core/composer/composer_builder.py 90.74% <50.00%> (-0.77%) ⬇️
.../core/optimisers/archive/individuals_containers.py 90.00% <50.00%> (-1.18%) ⬇️
fedot/core/composer/gp_composer/gp_composer.py 86.95% <55.55%> (-7.92%) ⬇️
fedot/core/pipelines/tuning/unified.py 90.00% <72.41%> (-10.00%) ⬇️
fedot/api/main.py 81.12% <75.00%> (-0.23%) ⬇️
fedot/api/api_utils/api_composer.py 97.79% <100.00%> (+0.03%) ⬆️
...t/api/api_utils/assumptions/assumptions_handler.py 86.00% <100.00%> (+0.58%) ⬆️
... and 15 more

@@ -171,7 +171,10 @@ def fit(self,
self.data_processor.accept_and_apply_recommendations(self.train_data, recommendations)
self.params.accept_and_apply_recommendations(self.train_data, recommendations)
self._init_remote_if_necessary()
self.params.update_available_operations_by_preset(self.train_data)

if self.params.api_params['preset'] != 'auto':
Collaborator

Why was it necessary to add this condition?

Collaborator Author

Because currently, if the available_operations list is set at the start, it is no longer changed later in the auto variant. I did not find a more elegant solution; it seems a larger refactoring is needed.
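
The guard under discussion can be pictured with a toy parameters object. Only the `preset != 'auto'` check and the `api_params` / `update_available_operations_by_preset` names come from the diff; `ToyParams`, `prepare`, the preset name, and the operation names are hypothetical, and the real method also takes the training data.

```python
class ToyParams:
    # Hypothetical preset-to-operations mapping, for illustration only
    PRESET_OPERATIONS = {'preset_a': ['op_1', 'op_2']}

    def __init__(self, preset, available_operations=None):
        self.api_params = {
            'preset': preset,
            'available_operations': available_operations,
        }

    def update_available_operations_by_preset(self):
        # Overwrite the operation list with the one implied by the preset
        preset = self.api_params['preset']
        self.api_params['available_operations'] = self.PRESET_OPERATIONS[preset]


def prepare(params):
    # Mirrors the guarded call from the diff: the preset-driven update
    # is skipped when the preset is 'auto'.
    if params.api_params['preset'] != 'auto':
        params.update_available_operations_by_preset()
    return params.api_params['available_operations']
```

Under these assumptions, `prepare(ToyParams('auto', ['op_x']))` leaves the list untouched at this point, while `prepare(ToyParams('preset_a'))` fills it from the preset.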

Collaborator

[image]

Collaborator

It turns out that available_operations is now handled both here and in this piece of code. Is that how it should be?

trials = Trials()

remaining_time = self.max_seconds - global_tuner_timer.minutes_from_start * 60
Collaborator

Why can't we count in seconds instead of rounding to minutes?

Collaborator Author

Changed.
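
One way to keep the remaining budget in seconds throughout is sketched below. `ElapsedTimer`, `seconds_from_start`, and `remaining_seconds` are hypothetical names, not the timer class used in FEDOT, and the sketch does not claim to reproduce the change that was actually committed.

```python
import time


class ElapsedTimer:
    """Hypothetical stand-in for the tuner timer; tracks elapsed seconds."""

    def __init__(self):
        # monotonic() is unaffected by system clock adjustments
        self._start = time.monotonic()

    @property
    def seconds_from_start(self) -> float:
        return time.monotonic() - self._start


def remaining_seconds(max_seconds: float, timer: ElapsedTimer) -> float:
    # Clamp at zero so an exhausted budget never yields a negative timeout
    return max(0.0, max_seconds - timer.seconds_from_start)
```

Working in one unit removes the minutes-to-seconds conversion from the call site, which matters most for short timeouts.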

@@ -183,6 +190,9 @@ def fit(self,
self.current_pipeline, self.best_models, self.history = \
self.api_composer.obtain_model(**self.params.api_params)

if self.current_pipeline is None:
raise ValueError('No any models were found')
Collaborator

Grammar: "No models were found" is enough; with "any" it is a bit redundant.

Collaborator Author

Changed.

Comment on lines +49 to +50
if not population:
return
Collaborator

Can None arrive here?

Collaborator Author

An empty list [] can arrive if no individual in the population was processed in time.
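
The `if not population: return` guard covers both cases raised in this thread, because `None` and an empty list are both falsy in Python. A minimal sketch, with a hypothetical `update_archive` function standing in for the real container method:

```python
def update_archive(archive: list, population) -> None:
    """Append individuals to the archive, ignoring empty input.

    Hypothetical helper for illustration; not the FEDOT container API.
    """
    # `not population` is True for None and for [], so a generation in
    # which no individual finished evaluation leaves the archive as is.
    if not population:
        return
    archive.extend(population)
```

Note that the same check would also skip other falsy values, such as an empty tuple, which is usually the desired behaviour for a population container.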

@@ -183,6 +190,9 @@ def fit(self,
self.current_pipeline, self.best_models, self.history = \
self.api_composer.obtain_model(**self.params.api_params)

if self.current_pipeline is None:
raise ValueError('No any models were found')
Collaborator

Here it can simply be 'No models were found'.

Collaborator Author

Changed.

@nicl-nno
Collaborator Author

Updated the branch to master.

@nicl-nno
Collaborator Author

> It turns out that available_operations is now handled both here and in this piece of code. Is that how it should be?

If they are not set explicitly, then it seems so, yes.

@YamLyubov YamLyubov mentioned this pull request Jan 24, 2023
@nicl-nno nicl-nno merged commit 3deb3e3 into master Jan 24, 2023
@nicl-nno nicl-nno deleted the logger-imp branch January 27, 2023 21:25
Labels
in progress task in progress
4 participants