Skip to content

meta: Allow global disabling of AutoParsing during TClass::GetClass #18402

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

pcanal
Copy link
Member

@pcanal pcanal commented Apr 14, 2025

This PR helped debug the issue related to #18373

This is an improvement in the detection of unwanted auto-parsing.

The roottest companion PR is root-project/roottest#1305

To ease debugging of unwanted auto-parsing triggered by TClass::GetClass, 2 new features are introduced.

  1. Give access to the list of classes that triggered an autoparsing:
// Print the list
gInterpreter->Print("autoparsed"); 
// Get the list/set:
((TCling*)gInterpreter)->GetAutoParseClasses();
  1. Add interface (to be further refined) to completely disallow auto-parsing during TClass::GetClass
Build ROOT (actually just TClass.cxx) with -DROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING

We "could" allow further way to customize it:

   //   - environment variable ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING
   //   - rootrc key Root.TClass.GetClass.AutoParsing
   //   - TClass::SetGetClassAutoParsing

In addition, gDebug >=1 now print an additional line:

TCling::AutoParse: parsed 1 headers for reco::PFRecHitSoALayout<128,false>

@pcanal
Copy link
Member Author

pcanal commented Apr 14, 2025

@makortel @Dr15Jones Any opinions on the interface to see the auto-parsed classes and to disable auto-parsing in TClass::GetClass.

Note: this was extracted/separated from #18373

@Dr15Jones
Copy link
Contributor

I'm concerned that keeping the full list of what was auto parsed to lead to a noticeable memory increase. In my opinion, the additional gDebug enabled printout is probably sufficient to allow CMS to do what we want.

Copy link

github-actions bot commented Apr 14, 2025

Test Results

    18 files      18 suites   4d 4h 40m 50s ⏱️
 2 742 tests  2 742 ✅ 0 💤 0 ❌
47 673 runs  47 673 ✅ 0 💤 0 ❌

Results for commit 214f513.

♻️ This comment has been updated with latest results.

@pcanal pcanal force-pushed the master-18363-TClass branch 2 times, most recently from a3a40ee to 143831c Compare April 14, 2025 22:37
@makortel
Copy link

I'm also concerned of the memory use of the added fAutoParseClasses and fAutoLoadedLibraries members that seem to be filled unconditionally (i.e. even if we don't want a printout). On a quick look on a random recent CMSSW release, we have O(11k) classes declared in the DataFormats .rootmap files. I'd expect not all of those classes to be auto-loaded (or auto-parsed), but nevertheless the potential cost of these two sets could be in the MB range.

I'm wondering about the difference of TInterpreter::SuspendAutoParsing guard (that seems to call TInterpreter::SetSuspendAutoParsing()) and TInterpreter::SetClassAutoparsing(). By quick look those two functions set different flags in TCling, and figuring out their behavior from the code alone seemed complicated.

From usability standpoint I suspect a global setting would not seem to necessarily help CMS. We'd need the header parsing enabled for the following cases

  • dictionary is not necessary for the type
    • types like std::pair (although we still define std::pair dictionaries); or at least we assume header parsing would be needed for types that ROOT recommends us to not declare a dictionary for
  • cut parser (or other similar things that need the member functions)

To be useful within cmsRun, we'd like a system where the auto-parsing could be disabled for specific ROOT calls (but I understand that approach would not be easy to implement).

@pcanal
Copy link
Member Author

pcanal commented Apr 16, 2025

The memory cost is:

  • one library name per library that is actually autoloaded
  • one normalized class name for (only) each class that actually trigger auto-parsing. Apriori this cost is negligible compared to the cost of the header being parsed (i.e. at the very least it contains that same name somewhere when that class is declared :) ). Indeed the worst case scenario is one class name per header file for all the classes but that is extremely unlikely since it would required all the class being used and needing auto-parsing and more importantly the order being such that the later class' header are not loaded indirectly by the earlier class. In addition, if I remember correctly, the header are actually bundle by dictionary so in reality the worse case scenario would be one class name per dictionary. (And in the case of well behave I/O job, of course, no overhead :) ).

@pcanal
Copy link
Member Author

pcanal commented Apr 16, 2025

From usability standpoint I suspect a global setting would not seem to necessarily help CMS.

The idea of the setting was solely for debugging purposes .... Although ....

Reading your list, it actually sounds like disabling the auto-parsing solely during TClass::GetClass may still do the trick. For the cut parser, it might/should do the auto-parsing later when information about functions are needed.

@pcanal
Copy link
Member Author

pcanal commented Apr 16, 2025

difference between TInterpreter::SetSuspendAutoParsing() and TInterpreter::SetClassAutoparsing()

It is not clear as SetClassAutoparsing started its life with a similar purpose as the one served by SetSuspendAutoParsing today but seem to nowadays serve a purpose related to the support for C++ modules.

@pcanal
Copy link
Member Author

pcanal commented Apr 16, 2025

@makortel Should we test CMSSW with ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING turned on before deciding which version of this to merge?

@makortel
Copy link

difference between TInterpreter::SetSuspendAutoParsing() and TInterpreter::SetClassAutoparsing()

It is not clear as SetClassAutoparsing started its life with a similar purpose as the one served by SetSuspendAutoParsing today but seem to nowadays serve a purpose related to the support for C++ modules.

Just to clarify, do you mean the SetClassAutoparsing is nowadays related to the support for C++ modules, and SetSuspendAutoParsing is for preventing header parsing in, umm, dictionary-using code?

@makortel
Copy link

@makortel Should we test CMSSW with ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING turned on before deciding which version of this to merge?

With "which version" do you mean whether ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING is disabled or enabled by default at build time?

In any case I'd be fine with testing ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING turned on. @smuzaffar What do you think?

@pcanal
Copy link
Member Author

pcanal commented Apr 16, 2025

With "which version"

Choices that we have:

  1. Drop the code related to suspending auto-parsing
  2. Keep the code related to suspending auto-parsing only with a #define
  3. Keep the code related to suspending auto-parsing only with rootrc flag
  4. Keep both
    In addition we have:
    a. Always record auto-parsed library and auto-parsed enducing class name
    b. Record one or both only on demand.

Having the build with the suspending auto-parsing force on was to answer the question on whether it lead to any of:

  1. No failures at all in CMSSW
  2. Failures that shows actually missing dictionary
  3. Failures due to classes that are not meant to have dictionary
  4. Failures in the cut parser

SetSuspendAutoParsing is for preventing header parsing

It suspends/prevent any auto-parsing.

SetClassAutoparsing is nowadays related to the support for C++ modules

It is used in connection to code related to C++ modules ... I did not yet dig deeper than that :)

@makortel
Copy link

Thanks @pcanal for the clarifications. I think a test with CMSSW with this PR and ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING enabled at build time could be useful, but I'm also a bit concerned the first test would reveal many failures (any mixture of 2-4 in your list), and some of the failures hiding other failures and we'd end up iterating with the test after fixing things (although getting things fixed should be good thing).

@smuzaffar What do you think?

@smuzaffar
Copy link
Contributor

@pcanal , @makortel we can run cmssw PR tests with ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING .

pcanal added 4 commits April 17, 2025 13:52
If TClass.cxx is build with the cpp macro:

   ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING

defined, it will no longer do any auto-parsing during the
execution of `TClass::GetClass`.  This will result in not
being able to find TClass-es when the name requires not-already
loaded interpreted information (eg. a typedef to be resolved).

Comments include additional possible interfaces to turn on this
feature.
Use `gInterpreter->Print("autoparsed");` to print a list
of the class names that directly lead to auto-parsing.

Use `gCling->GetAutoParseClasses()` to programatically get a set
of the class names that directly lead to auto-parsing.
This allows to disable auto-parsing during `TClass::GetClass` for debugging purposes.
Use `gInterpreter->Print(autoloaded);` to print a list
of the libraries that have been automaticaly loaded during
TClass::GetClass and due to a symbol requested during code
interpretation.
@pcanal pcanal force-pushed the master-18363-TClass branch from 143831c to 214f513 Compare April 17, 2025 18:53
@ferdymercury

This comment was marked as outdated.

@pcanal
Copy link
Member Author

pcanal commented Apr 23, 2025

we can run cmssw PR tests with ROOT_DISABLE_TCLASS_GET_CLASS_AUTOPARSING .

Was there any noticeable difference?

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants