Clean up `identify_variables` / `identify_mutable_parameters`; deprecate `SimpleExpressionVisitor` #3436

jsiirola · 2024-11-26T05:41:34Z

Fixes # .

Summary/Motivation:

This PR cleans up identify_variables to simplify the implementation, improve efficiency, and make the named expression cache more robust:

Reduce redundancy in the implementation
Avoid allocating data structures for caching named_expressions unless the cache is defined and a named expression is actually encountered
Improve the cache robustness, including robustly handling cache invalidation if the named expressions have changed.

In addition:

identify_mutable_parameters is moved to build on identify_variables (by deriving from the IdentifyVariableVisitor)
switch identify_components to use the StreamBasedExpressionVisitor

This allows us to (finally) deprecate the SimpleExpressionVisitor and remove it from the documentation

Changes proposed in this PR:

(see above)

Legal Acknowledgement

By contributing to this software project, I have read the contribution guide and agree to the following terms and conditions for my contribution:

I agree my contributions are submitted under the BSD license.
I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.


mrmundt

Tiniest of changes

doc/OnlineDocs/explanation/philosophy/expressions/managing.rst

Co-authored-by: Miranda Mundt <55767766+mrmundt@users.noreply.github.com>

Robbybp

Performance comparison on my motivating example for the previous identify_variables rewrite:

Main

Identifier                                       ncalls   cumtime   percall      %
----------------------------------------------------------------------------------
root                                                  1     2.905     2.905  100.0
     -----------------------------------------------------------------------------
     full model post-solve                            1     0.484     0.484   16.7
     solve-scc                                        1     2.421     2.421   83.3
                          --------------------------------------------------------
                          igraph                      1     0.359     0.359   14.8
                          scc-subsolver             546     1.610     0.003   66.5
                          vars-from-components      546     0.187     0.000    7.7
                          other                     n/a     0.264       n/a   10.9
                          ========================================================
     other                                          n/a     0.000       n/a    0.0
     =============================================================================
==================================================================================

This branch

Identifier                                       ncalls   cumtime   percall      %
----------------------------------------------------------------------------------
root                                                  1     2.960     2.960  100.0
     -----------------------------------------------------------------------------
     full model post-solve                            1     0.482     0.482   16.3
     solve-scc                                        1     2.478     2.478   83.7
                          --------------------------------------------------------
                          igraph                      1     0.358     0.358   14.4
                          scc-subsolver             546     1.536     0.003   62.0
                          vars-from-components      546     0.326     0.001   13.1
                          other                     n/a     0.259       n/a   10.4
                          ========================================================
     other                                          n/a     0.000       n/a    0.0
     =============================================================================
==================================================================================

Less than a factor of 2 overhead (see the vars-from-components category) and still much better than the 2s this was taking before exploiting named expressions. Thanks for fixing this.

Robbybp · 2024-12-17T17:43:09Z

pyomo/core/expr/visitor.py

+        # The following attributes will be added by initializeWalker:
+        # self._objs: the list of found objects
+        # self._seen: set(self._objs)
+        # self._exprs: list of (e, e.expr) for any (nested) named expressions


Is there any reason we need to store e.expr in addition to e?

I see that we need to store the immutable expression object in case e changes what expression it contains.
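The reason for recording the pair can be sketched in plain Python. This is only an illustration, not the code from visitor.py: `NamedExpr` and `cache_entry_is_valid` are hypothetical names, and tuples stand in for immutable expression trees.

```python
class NamedExpr:
    """Hypothetical stand-in for a named expression: a mutable container
    whose .expr attribute points at an immutable expression object."""

    def __init__(self, expr):
        self.expr = expr


def cache_entry_is_valid(snapshot):
    """snapshot is a list of (e, e.expr) pairs recorded when the entry was cached.

    The entry is stale if any container now holds a different expression
    object than the one recorded in the snapshot.
    """
    return all(e.expr is recorded for e, recorded in snapshot)


inner = NamedExpr(("x", "+", "y"))
snapshot = [(inner, inner.expr)]  # what was seen when the cache entry was built

valid_before = cache_entry_is_valid(snapshot)
inner.expr = ("x", "+", "z")  # the named expression now contains something else
valid_after = cache_entry_is_valid(snapshot)
```

Storing only `e` would not be enough here: `inner` is the same object before and after the reassignment, so the change is only detectable by comparing against the recorded `e.expr`.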

Robbybp · 2024-12-17T18:05:37Z

pyomo/core/expr/visitor.py

+        if self._cache is None:
+            return True, None


To check my understanding: This means that if named_expression_cache was not provided in __init__, we won't exploit repeated named expressions within this expression?

Correct. My assumption is that the same named expression rarely appears twice in a single expression. So, if you don't provide a cache, then there isn't a big win for defining a cache -- there probably won't be a cache hit (and there is the overhead of creating the cache and throwing it away).

That's probably a good assumption in general, but I know that for some IDAES models (at least the autothermal reformer), there is a significant benefit to exploiting named expressions within a single constraint. (Of course the user can always do this by explicitly passing the named expression cache.)
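A toy version of that tradeoff, assuming nothing about the real visitor: `NamedExpr` and `collect_vars` are hypothetical, strings stand in for variables, tuples stand in for operator nodes, and the sketch does no cache invalidation.

```python
class NamedExpr:
    """Hypothetical named expression wrapping a sub-expression."""

    def __init__(self, name, expr):
        self.name = name
        self.expr = expr


def collect_vars(node, cache=None, stats=None):
    """Return variable names under node in first-seen order.

    If cache is None, named expressions are re-walked every time they appear.
    stats["walked"] counts how many times a named expression body is walked.
    """
    if stats is None:
        stats = {"walked": 0}
    if isinstance(node, str):
        return [node]
    if isinstance(node, NamedExpr):
        if cache is not None and id(node) in cache:
            return cache[id(node)]  # cache hit: skip the walk
        stats["walked"] += 1
        found = collect_vars(node.expr, cache, stats)
        if cache is not None:
            cache[id(node)] = found
        return found
    seen, out = set(), []
    for child in node:  # tuple: operator node with children
        for var in collect_vars(child, cache, stats):
            if var not in seen:
                seen.add(var)
                out.append(var)
    return out


shared = NamedExpr("e", ("x", "y"))
expr = (shared, "z", shared)  # the same named expression appears twice

no_cache = {"walked": 0}
collect_vars(expr, cache=None, stats=no_cache)

with_cache = {"walked": 0}
result = collect_vars(expr, cache={}, stats=with_cache)
```

Without a cache the shared named expression is walked twice; with one it is walked once. Whether that saving outweighs the cost of allocating the cache depends on how often named expressions repeat, which is the point being debated above.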

Robbybp · 2024-12-17T18:45:43Z

pyomo/core/expr/visitor.py

+    v = identify_variables.visitor
+    save = v._include_fixed, v._cache
+    try:
+        v._include_fixed = include_fixed
+        v._cache = named_expression_cache
+        yield from v.walk_expression(expr)
+    finally:
+        v._include_fixed, v._cache = save
+
+
+identify_variables.visitor = IdentifyVariableVisitor()


Why use a global visitor instead of a new one every time identify_variables is called? Is there a significant overhead to __init__?

There is a bit of overhead in object creation and disposal. In other cases, I have seen performance benefits by keeping the visitor around between calls. I did not profile the impact here, so maybe this is a red herring?

I have seen this in AMPLRepnVisitor, so I'm not too surprised, but it hasn't shown up in any of my profiles.
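For anyone who wants to experiment with the pattern outside Pyomo, here is a self-contained sketch of the same save/restore idiom. `Visitor`, `walk`, and `find_names` are hypothetical stand-ins, not the Pyomo classes.

```python
class Visitor:
    """Hypothetical minimal visitor; only the configuration attributes matter."""

    def __init__(self):
        self._include_fixed = False
        self._cache = None

    def walk(self, items):
        # items is a list of (name, is_fixed) pairs
        for name, fixed in items:
            if self._include_fixed or not fixed:
                yield name


def find_names(items, include_fixed=True, cache=None):
    # Reuse one shared visitor instead of constructing a new one per call
    v = find_names.visitor
    save = v._include_fixed, v._cache
    try:
        v._include_fixed = include_fixed
        v._cache = cache
        yield from v.walk(items)
    finally:
        # Restore the shared visitor's state even if iteration stops early
        v._include_fixed, v._cache = save


find_names.visitor = Visitor()

items = [("x", False), ("y", True)]
all_names = list(find_names(items, include_fixed=True))
free_names = list(find_names(items, include_fixed=False))
restored = (find_names.visitor._include_fixed, find_names.visitor._cache)
```

Because `find_names` is a generator, the `try` body only starts running on the first `next()`, and the `finally` clause runs when the generator is exhausted or closed.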

jsiirola · 2024-12-17T19:47:45Z

@Robbybp: I am surprised that the overhead went up. Can you share your test with me (off-line is fine)? I wonder if we are measuring something else (like the GC)?

Robbybp · 2024-12-17T20:08:19Z

@jsiirola Run this script: https://github.com/Robbybp/surrogate-vs-implicit/blob/main/svi/auto_thermal_reformer/fullspace_flowsheet.py

I insert the timing calls into incidence_analysis.scc_solver and util.subsystems manually when I profile this, so this won't give you the detailed profile, but you should still see the runtime jump.

blnicho · 2025-01-14T20:06:44Z

We are waiting to merge this until we double check the performance degradation @Robbybp noted

Robbybp · 2025-01-15T18:04:17Z

FWIW, the evidence I presented for a performance degradation is pretty flimsy (~0.15s diff), so I wouldn't be opposed to just merging this as-is, given the bug fixes.

jsiirola · 2025-01-31T05:59:07Z

@Robbybp I finally got a chance to look at this, and there was an issue with the implementation. The original approach to validating the named expression cache involved storing a list of all expressions seen in the current expression or any sub-expression. Because of how interconnected named expressions are in IDAES models, this would occasionally get to be a very long list (>1k!). The implementation has been updated to store this as a dict, which prunes redundant instances of the same subexpression (so it stays under 20 entries).

For your case, this PR's implementation is now somewhere between equivalent to the old implementation and possibly slightly faster (the timing noise in Python is now larger than the difference between the two implementations).
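The effect of switching containers can be illustrated with a small sketch. Keying the dict by `id(e)` is an assumption made for this example; the comment above only states that a dict is used. `NamedExpr` is a hypothetical stand-in.

```python
class NamedExpr:
    """Hypothetical named expression; strings stand in for expression bodies."""

    def __init__(self, expr):
        self.expr = expr


a = NamedExpr("x + y")
b = NamedExpr("a * 2")

# Order in which named expressions are encountered while walking a
# heavily interconnected model: the same objects show up repeatedly.
encountered = [a, b, a, a, b]

# A list records every encounter, so it grows with the number of visits.
as_list = [(e, e.expr) for e in encountered]

# A dict keyed on object identity keeps one entry per distinct named expression.
as_dict = {id(e): (e, e.expr) for e in encountered}
```

Here the list holds five entries and the dict holds two, and the gap widens as the same named expressions are reached along more paths.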

codecov · 2025-01-31T09:29:19Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.62%. Comparing base (61e28af) to head (db9c375).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3436      +/-   ##
==========================================
+ Coverage   88.61%   88.62%   +0.01%     
==========================================
  Files         880      880              
  Lines      100639   100646       +7     
==========================================
+ Hits        89181    89198      +17     
+ Misses      11458    11448      -10

Flag	Coverage Δ
linux	`86.20% <100.00%> (+<0.01%)`	⬆️
osx	`76.20% <100.00%> (+0.01%)`	⬆️
other	`86.72% <100.00%> (+<0.01%)`	⬆️
win	`84.67% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.


Robbybp

The recent commits look good to me, thanks for looking into this.


blnicho

I found a minor issue in the docs that needs to be fixed but otherwise this looks fine

doc/OnlineDocs/explanation/philosophy/expressions/managing.rst

pyomo/core/expr/visitor.py

jsiirola added 9 commits November 25, 2024 22:15

Simplify IdentifyVariableVisitor; improve cache robustness

b68a632

Move identify_mutable_parameters to build on identify_variables

0f0f115

NonNumericValue should support the is_constant()/is_fixed() API

7a7d9e5

Move identify_components to use StreamBasedExpressionVisitor

ecad6ac

Fix bug in SimpleExpressionVisitor when passing a native_type as the expression

81ee01f

Deprecate SImpleExpressionVisitor

c5fceea

Add identify_variables tests for nested named expressions

46394f9

Add basic tests for (deprecated) SimpleExpressionVisitor

105fbca

Improve efficiency / caching in get_vars_from_components()

7de8b98

jsiirola changed the title ~~Clean up identify_variables / identify_mutable_parameters; deprecate SImpleExpressionVisitor~~ Clean up identify_variables / identify_mutable_parameters; deprecate SimpleExpressionVisitor Nov 26, 2024

jsiirola added 3 commits November 26, 2024 09:38

Merge branch 'main' into identify-vars

e05cba5

NFC: fix typo

2dd0f5e

Merge branch 'main' into identify-vars

ff03fe0

mrmundt requested changes Dec 9, 2024


doc/OnlineDocs/explanation/philosophy/expressions/managing.rst (outdated)

NFC: fix doc typo

ac92c1e

Co-authored-by: Miranda Mundt <55767766+mrmundt@users.noreply.github.com>

mrmundt approved these changes Dec 16, 2024


Robbybp approved these changes Dec 17, 2024


pyomo-autotest added the AT: STALE label Dec 30, 2024

jsiirola added 4 commits January 30, 2025 11:58

Merge branch 'main' into identify-vars

b9aee5b

Merge branch 'main' into identify-vars

ea4fc58

Use dict not list to store nested named expressions

a3aa203

Improve efficiency of identify_vars data structures

f592eae

pyomo-autotest removed the AT: STALE label Jan 31, 2025

Robbybp approved these changes Jan 31, 2025


Merge remote-tracking branch 'refs/remotes/me/identify-vars' into identify-vars

2cddb8a

blnicho requested changes Feb 4, 2025


doc/OnlineDocs/explanation/philosophy/expressions/managing.rst

doc/OnlineDocs/explanation/philosophy/expressions/managing.rst

pyomo/core/expr/visitor.py (outdated)

jsiirola added 2 commits February 4, 2025 13:20

Update example to match documentation text

d75bc92

NFC: fix typo

2e85c53

jsiirola requested a review from blnicho February 4, 2025 20:21

NFC: apply black

258a112

blnicho approved these changes Feb 4, 2025


blnicho and others added 2 commits February 4, 2025 17:39

Merge branch 'main' into identify-vars

db9c375

Merge branch 'main' into identify-vars

8d1fff7
