[io] don't miss writing a histogram that is only in a last file with option -n 2 #18679

Open
wants to merge 5 commits into
base: master
from

Conversation

ferdymercury
Collaborator

@ferdymercury ferdymercury commented May 9, 2025

This Pull request:

Changes or fixes:

Fixes #9022

Explores the solution suggested by @jblomer

fyi @will-cern

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

@ferdymercury ferdymercury requested review from jblomer and silverweed May 9, 2025 16:59
@ferdymercury ferdymercury marked this pull request as ready for review May 9, 2025 17:02
@ferdymercury ferdymercury requested a review from pcanal as a code owner May 9, 2025 17:02

github-actions bot commented May 9, 2025

Test Results

    19 files      19 suites   3d 18h 31m 46s ⏱️
 2 745 tests  2 744 ✅ 0 💤 1 ❌
50 721 runs  50 720 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit f312f3b.

♻️ This comment has been updated with latest results.

Contributor

@silverweed silverweed left a comment

Thank you for tackling this long-standing issue!
I left a small comment.

@ferdymercury ferdymercury requested a review from silverweed May 12, 2025 09:37
Contributor

@silverweed silverweed left a comment

LGTM, but I'd wait for @pcanal's approval as well.

@@ -542,7 +542,8 @@ Bool_t TFileMerger::MergeOne(TDirectory *target, TList *sourcelist, Int_t type,
 keyname, keytitle);
 return kTRUE;
 }
-Bool_t canBeFound = (type & kIncremental) && (current_sourcedir->GetList()->FindObject(keyname) != nullptr);
+Bool_t canBeFound = (type & kIncremental) && (current_sourcedir->GetList()->FindObject(keyname) != nullptr) &&

Member

Why is it && and not || ?

Member

Or, more precisely, I don't yet understand why the fact that the histogram can be found means that it is not written in the end.

Collaborator Author

Check out:

https://github.com/ferdymercury/root/blob/1531153ea11a7b54f4eb3c170bbd28e9bc46447f/io/io/src/TFileMerger.cxx#L808-L817

So if canBeFound is true, there is an optimization that spares some write cycles.

We use && to keep it from being true, i.e. to force writing to the file. Using || would go in the opposite direction.

Do you want me to rename canBeFound to skipPartialWriting?
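
The && versus || argument can be checked with a small truth table. This is only a toy sketch, not ROOT code: `inSourceList` stands for the existing `FindObject(keyname) != nullptr` test, and `extraCondition` is a hypothetical placeholder for the operand added by the PR, which is cut off in the diff shown here. All function names are made up for illustration.

```cpp
#include <cassert>

// Toy model of the flag discussed above; none of this is ROOT code.
// incremental    : stands for (type & kIncremental)
// inSourceList   : stands for FindObject(keyname) != nullptr
// extraCondition : hypothetical placeholder for the operand added by the PR
struct Inputs {
    bool incremental;
    bool inSourceList;
    bool extraCondition;
};

// Decodes one of the 8 possible input combinations from a 3-bit number.
inline Inputs fromBits(int bits) {
    return Inputs{(bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0};
}

// Flag as computed before the change.
inline bool canBeFoundOld(const Inputs &in) {
    return in.incremental && in.inSourceList;
}

// Extra operand joined with &&: true in fewer cases, so the
// "skip the partial write" optimization is taken less often.
inline bool canBeFoundAnd(const Inputs &in) {
    return canBeFoundOld(in) && in.extraCondition;
}

// Same operand joined with ||: true in more cases, so more writes are skipped.
inline bool canBeFoundOr(const Inputs &in) {
    return canBeFoundOld(in) || in.extraCondition;
}

// Counts in how many of the 8 input combinations a flag variant is true.
template <typename F>
int countTrue(F flag) {
    int n = 0;
    for (int bits = 0; bits < 8; ++bits)
        if (flag(fromBits(bits))) ++n;
    return n;
}

// Checks the ordering for every input: And implies Old, and Old implies Or.
inline void checkNarrowing() {
    for (int bits = 0; bits < 8; ++bits) {
        const Inputs in = fromBits(bits);
        if (canBeFoundAnd(in)) assert(canBeFoundOld(in));
        if (canBeFoundOld(in)) assert(canBeFoundOr(in));
    }
}
```

Whatever the real second operand is, joining it with && can only turn the flag from true to false, never the other way round, which is the direction needed to force a write.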

Member

> Do you want me to rename canBeFound to skipPartialWriting?

Not yet. I am still confused.

The original canBeFound meant (in the context of an incremental merge) 'the histogram can be found in the source', while the new version means 'the histogram can be found in the source and in the target'.

The optimization is indeed 'skip partial writing if we can find the histogram again'.

So I don't (yet) understand the semantics of the change, i.e. why are the new criteria the right choice? Or are the new criteria 'just' making canBeFound always false?

Another avenue of inquiry: the original code assumes that if canBeFound is true then there will be another chance to write the histogram. Why is that no longer true (it does not seem to be related to 'cannot be found in target')? Are there other variations of the example that also fail (how does the -n X value relate to the number of files in the input list, and to how many of them are 'missing' the histograms)?

Related: is it possible that the alternative is that, at the refresh boundary, there needs to be a flush/write as if we were at the end?

Collaborator Author

Thanks for the review. I am not sure about these questions; I just followed jblomer's suggestion. My (limited) understanding is that this change just forces an extra partial write the first time that a new histogram appears in any file. So it does no real harm, but it is suboptimal: if all files have exactly the same histograms, we could have waited until the end. But it makes the merge work when some files have the histogram and some do not, independently of the chosen N.

Member

> this change just forces an extra partial write the first time that a new histogram appears in any file.

That would be fair enough. Can we verify that it is just one and not number_of_input_files? (Related, but probably unavoidable: for the case of just 2 files with all the same histograms, the number of writes is doubled ... actually, the number is doubled for each histogram that is in more than one file, i.e. the usual case.)
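
The two outcomes being contrasted here ("just one" versus "number of input files") can be written down as simple bookkeeping. The sketch below is a toy model under stated assumptions, not TFileMerger logic: each input file is reduced to a list of histogram names, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy bookkeeping only; this does not call or reproduce TFileMerger.
// Each input file is modeled as the list of histogram names it contains.
using FileContent = std::vector<std::string>;

// Expectation stated in the thread: one extra partial write the first
// time a histogram name shows up, regardless of how many files hold it.
std::size_t writesIfOncePerHistogram(const std::vector<FileContent> &files) {
    std::set<std::string> seen;
    std::size_t writes = 0;
    for (const auto &file : files)
        for (const auto &name : file)
            if (seen.insert(name).second) ++writes;  // first appearance only
    return writes;
}

// The outcome to rule out: one partial write for every (file, histogram)
// pair, i.e. a count that grows with the number of input files.
std::size_t writesIfOncePerFile(const std::vector<FileContent> &files) {
    std::size_t writes = 0;
    for (const auto &file : files) {
        const std::set<std::string> inThisFile(file.begin(), file.end());
        writes += inThisFile.size();
    }
    return writes;
}

// Sanity relation: the first count can never exceed the second.
void checkRelation(const std::vector<FileContent> &files) {
    assert(writesIfOncePerHistogram(files) <= writesIfOncePerFile(files));
}
```

For a hypothetical input of four files where h1 is in all of them and h2 only in the last two, the first function gives 2 and the second gives 6, so counting the "Writing partial result" messages in a real run is enough to tell the two behaviours apart.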

Collaborator Author

See the attached test with many more files, mtest.cpp.txt. Run

root -l mtest.cpp.txt+ -b -q 2>&1 | grep MergeOne
Info in TFileMerger::MergeOne: Writing partial result of h1 into target
Info in TFileMerger::MergeOne: Writing partial result of h2 into target

to see how often the partial result is written, as pcanal suggested.
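
For longer logs, the same check could be automated by counting the messages per histogram. This is a minimal sketch that relies only on the message format quoted above ("Writing partial result of <name> into target"); the function names are hypothetical and the log is passed in as a plain string.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Counts, per histogram name, the log lines of the form
//   "... Writing partial result of <name> into target".
// Lines that do not match this pattern are ignored.
std::map<std::string, int> countPartialWrites(const std::string &log) {
    const std::string prefix = "Writing partial result of ";
    const std::string suffix = " into target";
    std::map<std::string, int> counts;
    std::istringstream lines(log);
    std::string line;
    while (std::getline(lines, line)) {
        const auto start = line.find(prefix);
        if (start == std::string::npos) continue;
        const auto nameBegin = start + prefix.size();
        const auto nameEnd = line.find(suffix, nameBegin);
        if (nameEnd == std::string::npos) continue;
        ++counts[line.substr(nameBegin, nameEnd - nameBegin)];
    }
    return counts;
}

// True when every histogram seen in the log was partially written exactly once.
bool eachWrittenOnce(const std::map<std::string, int> &counts) {
    for (const auto &entry : counts) {
        assert(entry.second > 0);  // entries are only created by an increment
        if (entry.second != 1) return false;
    }
    return true;
}
```

Feeding the two lines quoted above into `countPartialWrites` yields one entry each for h1 and h2, which is the "just one per histogram" outcome asked about in the review.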
@ferdymercury ferdymercury added this to the 6.38.00 milestone May 15, 2025

Successfully merging this pull request may close these issues.

hadd with -n option misses some content
3 participants