-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add datatypes for storing generator (meta)data in a structured and defined way #310
Conversation
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think overall this looks like a reasonable structure (at least to me). But @dirkzerwas should probably have a look to make sure we don't miss anything obvious yet.
Can you also add tests to read back the things that you currently write. At least the ToolInfo parts have enough additional logic wrt podio to warrant a test, I think.
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Hi @hegner and @tmadlener, looks very nice, a couple of thoughts:
Cheers, |
Co-authored-by: Andre Sailer <andre.philippe.sailer@cern.ch>
Co-authored-by: Juan Miguel Carceller <22276694+jmcarcell@users.noreply.github.com>
Co-authored-by: Andre Sailer <andre.philippe.sailer@cern.ch>
Yes. I think that's a fair point. I will do that rename.
For this we do not need a new data type at all. My idea was to just have an example for how to do it in a to-be-written doc page. With possibly a constant defined for the name of the weight names.
I left it in one piece for simplicity. No problem to split it in two. |
@dirkzerwas - I have a question for clarification. We have event weights stored in our standard |
If downstream algs already access the weights, I understand the difficulty, it was more esthetics than "need" :)
However I would move the signal_vertex (oneToManyRelation) out of the PDF structure if possible? |
@dirkzerwas - no problem. It would then be a standalone collection. I was hoping to keep the number of collections small however. If that's your only comment, I will implement that and finish this feature with writing a simple example. |
Yes, exactly. In principle an EventHeader should only only contain things that are available on both simulation and real data. From a "historical point of view" we can only store multiple weights in there since #254, so there is not too much historical baggage, and this has not yet been part of a released version of EDM4hep. We will probably not get rid of the single weight in the EventHeader as easily, as that has been there since the beginning. So we could in principle move the weights vector closer to the cross sections without boo much breakage. |
@hegner |
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
include/edm4hep/GenToolInfo.h
Outdated
const auto names = frame.getParameter<std::vector<std::string>>(generatorToolName).value(); | ||
const auto versions = frame.getParameter<std::vector<std::string>>(generatorToolVersion).value(); | ||
const auto descriptions = frame.getParameter<std::vector<std::string>>(generatorToolDescription).value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case any of these parameters is not available this implementation will throw a std::bad_optional_access
, which might be a bit cryptic, but would at least be in line with "fail early in case of problems".
If we wanted a slightly more relaxed approach without exceptions escaping, we could do:
const auto names = frame.getParameter<std::vector<std::string>>(generatorToolName).value(); | |
const auto versions = frame.getParameter<std::vector<std::string>>(generatorToolVersion).value(); | |
const auto descriptions = frame.getParameter<std::vector<std::string>>(generatorToolDescription).value(); | |
const auto names = frame.getParameter<std::vector<std::string>>(generatorToolName).value_or({}); | |
const auto versions = frame.getParameter<std::vector<std::string>>(generatorToolVersion).value_or({}); | |
const auto descriptions = frame.getParameter<std::vector<std::string>>(generatorToolDescription).value_or({}); |
In that case the worst that can happen is an empty vector, which probably also shows that something went wrong.
I am in favor of the current implementation as that makes errors harder to ignore and easier to diagnose (even if the std::bad_optional_access
might be a bit cryptic).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some more discussion with @dirkzerwas about how this will be used, I think we can do the "error" handling internally, and simply return an empty vector instead of letting the exception propagate. The main reasons are
vector::empty
still let's users check whether information was available- Information on the tools rather falls into a "nice to have" category, rather than crucially important information for event processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value_or
should do then the right thing then
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more things triggered by the clashing variable and class names.
include/edm4hep/Constants.h
Outdated
static constexpr const char* GeneratorEventParameters = "GeneratorEventParameters"; | ||
static constexpr const char* GeneratorPdfInfo = "GeneratorPdfInfo"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are now clashing with the generated classes. Maybe all of these should just be prefixed with Gen
instead of Generator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will do that indeed. The label
helped as well before ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we add a namespace for the constants.
And we copy the existing constants into the namespace and deprecate the namespaceless ones (we can even align spelling when we do that without breaking anything)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be a major change and a bit cumbersome if dealing e.g. with the matrices end the corresponding enums. Do you really want to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC the enums were supposed to stay where they are, since they aren't constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like a edm4hep::label
namespace for things that are a label. the enums for covariance matrix access can / should stay in the edm4hep
namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #315 for a proposal with a new labels
namespace.
include/edm4hep/GenToolInfo.h
Outdated
|
||
namespace edm4hep { | ||
|
||
struct GenToolInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The classes we define in the yaml file are all prefixed Generator
, but this one is Gen
. For consistency reasons, I think they should both be the same. The choice is also related with how we want to name the constants.
Resolved the merge conflicts, added a bit of documentation and renamed |
Perfect. Thank you |
Unless there are complaints by tomorrows meeting, I will merge this shortly before that. |
@tmadlener |
BEGINRELEASENOTES
GeneratorEventParameters
andGeneratorPdfInfo
datatypes to store generator related data in a structured and well defined way.GeneratorToolInfo
struct and related utility functionality to store some high level metadata on the generator into Frame parameters.ENDRELEASENOTES
Pull request to collect all the changes required for #309