Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Improve execution speed of BBM stress update #682

Merged
merged 7 commits into from
Oct 15, 2024
Merged

Conversation

einola
Copy link
Member

@einola einola commented Sep 4, 2024

Improve execution speed of BBM stress update

Fixes #539

Task List

  • Defined the tests that specify a complete and functioning change (It may help to create a design specification & test specification)
  • Implemented the source code change that satisfies the tests
  • Documented the feature by providing worked example
  • Updated the README or other documentation
  • Completed the pre-Request checklist below

Change Description

Replaced on-the-fly computing of quadrature in BBMStressUpdate by precomputed values from ParametricMomentumMap.

Required changes:

  • added the vector iJMwPSI_dam[...] to ParametricMomentumMap. This is (inverse Mass(x_q))^{-1} * J(x_q) * w(q) * PSI[i] (xq) where x_q is the quadrature point and i is the basis function. iJMwPSI is the same in the DGstress-space, and _dam is in the DGadvection space.

This requires adding DGadvection as a template parameter, and hence, several classes with an interface to ParametricMomentumMap had to be touched. We broke compatibility with the old cgParametricMomentum.cpp class. But I think this should be removed anyway as it is replaced by the Kernels.


Test Description

A one-day BBM computation (config_benchmark.cfg with BBM and shortened to 1 day) gives (on 4 threads)

before:
../build/nextsim --config-file config_benchmark-bbm-short.cfg 145,99s user 11,47s system 276% cpu 57,046 total

now:
../build/nextsim --config-file config_benchmark-bbm-short.cfg 56,99s user 9,79s system 195% cpu 34,141 total

34 instead of 57 seconds.


Documentation Impact

N/A


Other Details

The poor parallelization (below 200% CPU util.) is likely due to some inefficiency in passing data. (too many pointers?). @winzerle will investigate.


Pre-Request Checklist

  • The requirements of this pull request are fully captured in an issue or design specification and are linked and summarised in the description of this PR
  • No new warnings are generated
  • The documentation has been updated (or an issue has been created to track the corresponding change)
  • Methods and Tests are commented such that they can be understood without having to obtain additional context
  • This PR/Issue is labelled as a bug/feature/enhancement/breaking change
  • File dates have been updated to reflect modification date
  • This change conforms to the conventions described in the README

…computed values from ParametricMomentumMap.

Required changes:

- added the vector iJMwPSI_dam[...] to ParametricMomentumMap. This is

(inverse Mass(x_q))^{-1} * J(x_q) * w(q) * PSI[i] (xq)

where x_q is the quadrature point and i the basis function. iJMwPSI is the same in the DGstress-space and _dam is in DGadvection space.

- This requires to add DGadvection as template parameter and hence, several classes with interface to ParametricMomentumMap had to be touched. We broke compatibility to the old cgParametricMomentum.cpp class. But I think this should anyway be removed as it is replaced by the Kernels.

A one-day BBM computation (config_benchmark.cfg with BBM and shortened to 1 day) gives (on 4 threads)

before:
../build/nextsim --config-file config_benchmark-bbm-short.cfg  145,99s user 11,47s system 276% cpu 57,046 total

now:
../build/nextsim --config-file config_benchmark-bbm-short.cfg  56,99s user 9,79s system 195% cpu 34,141 total

34 instead of 57 seconds. I assume that the bad parallelization (below 200% CPU util.) is due to some inefficiency in passing data. (too many pointers?) I'll investigate.
@einola einola mentioned this pull request Sep 4, 2024
27 tasks
I just ran clang-format on all the files. Only formatting changes here.
Oh, yes, and update header file dates.
Trying again with version 14 of clang (instead of 17).
@einola einola requested a review from winzerle October 14, 2024 09:38
Copy link
Collaborator

@winzerle winzerle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all reasonable :-)

@einola einola added the enhancement New feature or request label Oct 15, 2024
@einola einola merged commit f047363 into develop Oct 15, 2024
5 checks passed
@einola einola deleted the issue539_optimize_bbm branch October 15, 2024 08:28
einola added a commit that referenced this pull request Oct 29, 2024
The performance improvements in BBM from PR #682 unfortunately don't
work on a parametric mesh (such as the 25 km Arctic test). This commit
just reverses the changes, leaving the "improved" code in comments to
help with an eventual reimplementation.
@einola einola mentioned this pull request Dec 5, 2024
12 tasks
einola added a commit that referenced this pull request Dec 6, 2024
# Improve execution speed of BBM stress update

Fixes #539 

### Task List
- [x] Defined the tests that specify a complete and functioning change
(*It may help to create a [design specification & test
specification](../../../wiki/Specification-Template)*)
- [x] Implemented the source code change that satisfies the tests
- [x] Documented the feature by providing worked example
- [ ] Updated the README or other documentation
- [x] Completed the pre-Request checklist below

---
# Change Description

Replaced on-the-fly computing of quadrature in BBMStressUpdate by
precomputed values from ParametricMomentumMap.

Required changes:

- added the vector iJMwPSI_dam[...] to ParametricMomentumMap. This is
(inverse Mass(x_q))^{-1} * J(x_q) * w(q) * PSI[i] (xq) where x_q is the
quadrature point and i is the basis function. iJMwPSI is the same in the
DGstress-space, and _dam is in the DGadvection space.

This requires adding DGadvection as a template parameter, and hence,
several classes with an interface to ParametricMomentumMap had to be
touched. We broke compatibility with the old cgParametricMomentum.cpp
class. But I think this should be removed anyway as it is replaced by
the Kernels.

This PR also includes a fix to PR #682 by including dependence on
`cos(lat)`.

---
# Test Description

A one-day BBM computation (config_benchmark.cfg with BBM and shortened
to 1 day) gives (on 4 threads)

before:
../build/nextsim --config-file config_benchmark-bbm-short.cfg 145,99s
user 11,47s system 276% cpu 57,046 total

now:
../build/nextsim --config-file config_benchmark-bbm-short.cfg 56,99s
user 9,79s system 195% cpu 34,141 total

34 instead of 57 seconds. 

---
# Documentation Impact

N/A

---
# Other Details

---
### Pre-Request Checklist

- [x] The requirements of this pull request are fully captured in an
issue or design specification and are linked and summarised in the
description of this PR
- [x] No new warnings are generated
- [x] The documentation has been updated (or an issue has been created
to track the corresponding change)
- [x] Methods and Tests are commented such that they can be understood
without having to obtain additional context
- [x] This PR/Issue is labelled as a bug/feature/enhancement/breaking
change
- [x] File dates have been updated to reflect modification date
- [x] This change conforms to the conventions described in the README
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimisation of the BBM rheology
2 participants