-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Filtering logs re-design #34794
Comments
@peterfpeterson Nevermind, I was ignorant that the |
I appreciate that you're looking into how the |
In addition, some unit, docs, or system tests might fail. We should keep track of these, and see if the changes make sense (don't just use new numbers for the output). |
Review Notes:
The name |
I generally like the concept here. The current filtering is quite confusing to understand. A question about the high-level changes:
It feels like the filtered log should just be a view onto the original data. Do we need to copy or can we just reference to save even more memory? |
That is a good idea. It would be facilitated by the handles on the logs being copy-on-write pointers. |
Fixed by #34938 |
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of mantidproject#34794
This also does various bits of modernization focussing on reporting issues with inputs as early as possible. This is the reason why some of the unit tests were changed. This is part of #34794
Is your feature request related to a problem? Please describe.
Example of the problem (assume orange parts are the parts to keep)
data:image/s3,"s3://crabby-images/7c4c4/7c4c479e100a9ade55425e82cd6b9c670b5b6c6c" alt="image"
data:image/s3,"s3://crabby-images/ddc37/ddc37761dd39cfb23431ceea2666b439ee9ef64c" alt="image"
The current filtering of logs inserts new values for the logs at the left and right edges of the times. This changes the simple mean, and expands the amount of memory needed for a single log incredibly.
The solution will give the correct results for (being python-centric) the
mantid.api.Run
to give back correct results forgetProtonCharge()
(which should be a sum of theproton_charge
log) andgetPropertyAsSingleValueWithTimeAveragedMean()
andgetTimeAveragedStd()
. It will also simplify the code for filtering the events themselves so it is easier to understand the intent and refactor for better performance.High level changes
TimeROI
will be added to theRun
object to allow for calculating statistics and showing logs in the log viewerTimeROI
will be used directly in the event filtering codeTimeSplitter
will be used directly in the event splitting codeDescribe the solution you'd like
TimeROI object
The solution is to introduce a new object,
TimeROI
which contains information about when the workspace was active. This is in contrast to the current implementation which copies parts of the logs and adds artificial log values at the boundaries of when the workspace is active and not active.The new object,
TimeROI
will useTimeSeriesPropertyBool
as a private member for much of its implementation details.false
will mean that the time period is to not be included andtrue
means that the time period should be included. Events or log values outside of the time-range of theTimeROI
are not included, but may need to be inspected when calculating simple mean values for things that log on change.Since a user may filter/split a workspace multiple times, the
TimeROI
object needs to know how to be merged with anotherTimeROI
. At least internally, it will have union and intersection methods.Changes to the Run object
The
Run
object will get aTimeROI
added to it so it can return the the correctly integrated time average mean and stddev. TheTimeROI
can be set or updated which will clear theLogManager.m_singleValueCache
.Run
will also get its owngetStatistics(str)
to facilitate returning the correct values there as well which should also updateLogManager.m_singleValueCache
. Setting theTimeROI
on aRun
will require recalculating thegd_prtn_chrg
viaRun.integrateProtonCharge()
. TheRun
object will have to take on some amount of responsibility forTimeSeriesProperty.filteredValuesAsVector
so the traditional mean, stddev, median, min, and max get the correct values. The main point is that often the last value before theTimeROI
includes data will need to be returned additionally.LogManager::getPropertyAsSingleValue
(resolves asRun
's method) eventually callsLogManager::convertTimeSeriesToDouble
needs to be modified to calculate the time average value with theTimeROI
taken into account.TimeSeriesProperty::timeAverageValueAndStdDev
should move into theRun
orLogManager
and account for theTimeROI
.TimeSeriesProperty.getStatistics()
and related methods will get an optionalTimeROI
argument for allowing for getting a "correct" calculation rather than the default which will calculate as it currently does.PropertyManager
andTimeSeriesProperty
will lose their methodsfilterByTime
,splitByTime
andfilterByProperty
The
FilteredTimeSeriesProperty
will go away.It appears that
TimeROI
is a variation on the concept ofTimeSplitterType
which is actuallystd::vector<SplittingInterval>
.SplittingInterval
is a start and stop type with an index of the destination workspace. IfTimeROI
is a tweaked version ofTimeSeriesProperty<int>
this behavior could be preserved with the exception of not allowing for a single event to be filtered into multiple outputs. That behavior would require creating a secondTimeROI
.The "log viewer" in mantid will plot the full log of a workspace, with an extra option (i.e. checkbox) to show the regions where the
TimeROI
is not selecting the data with shading.Changes to the Statistics object
Kernel::Statistics
should get extra items for time-average-mean and time-average-stdard-deviation. The values for these if they are not calculated (like all the others) isnan
.Goiniometer object
Goniometer
uses logs valuesRun::calculateAverageGoniometerMatrix
callsRun::getLogAsSingleValue
several times and should be modified to call the newRun::getStatistics
. Otherwise this is already using theKernel::Math::TimeAveragedMean
which is correct.Instrument object
Instrument
uses log valuesThe instrument uses log values via
InstrumentDefinitionParser
. The log is used inParameterValue::operator double()
which callsXMLInstrumentParameter.createParameterValue(Kernel::Property *)
and has its own implementation of determining the value. This should be changed toXMLInstrumentParameter.createParameterValue(const API::Run &, const std:string &)
which would useRun::getLogAsSingleValue(const string &, const StatisticType)
to get the value then use the rest of the structure to use it appropriately.TimeSplitter object
While
TimeROI
is used in filtering events,TimeSplitter
is a new object that uses aTimeSeriesPropertyInt
for much of its implementation and is used for splitting workspaces. A value of-1
is a special value for not including data into any output. It will have methods for converting itself into aTimeROI
object for a given output workspace index. TheTimeSplitter
will also have a field(s) that captures the string from theInformationWorkspace
that is passed betweenGenerateEventsFilter
andFilterEvents
. This is to simplify logic inFilterEvents
. TheTimeSplitter
will generate an error if, at the end of construction, it has any times that are intended to go to multiple output workspaces.Changes to
EventList
EventList
does much of the heavy lifting of filtering and splitting via its "helper" functions:filterByPulseTimeHelper
filterByTimeAtSampleHelper
filterInPlaceHelper
splitByTimeHelper
splitByFullTimeHelper
splitByFullTimeVectorSplitterHelper
splitByFullTimeSparseVectorSplitterHelper
splitByPulseTimeHelper
splitByPulseTimeWithMatrixHelper
Many of these functions can be combined into three functions:
std::copy_if
ExtractSpectra
and should be moved toEventList
.EventList::maskTof()
should also be refactored to use this new method.Changes to algorithms
This does not affect all algorithms in the events category, only those in event filtering category.
CalculateCountRate
- does not current take into account time average. Maybe better to leave for later.ConvertToMD
- does not need to be changed because MDWSDescription already usesRun.getLogAsSingleValue()
CreateLogTimeCorrection
- does not need to be modifiedExportTimeSeriesLog
- should get additional option to determine if the filtered or unfiltered values are exported. Can wait for later.FilterBadPulses
- callsFilterByLogValue
and does not need to be modifiedFilterByLogValue
- this creates aTimeSplitterType
then for in-place operation usesfilterInPlace
and out-of-place usessplitByTime
. These branches should be combined to create aTimeROI
and call the new filter method. The run object will be updated usingRun.updateTimeROI()
FilterByTime
- should be updated to create aTimeROI
, call the new filter method, thenRun.updateTimeROI()
FilterByTime2
- is not a registered algorithm (i.e. can't be called by users) so remove itFilterByXValue
- does not need to be changed. It already callsEventList.maskTof()
and has a note aboutCropWorkspace
being a more appropriate algorithmFilterEvents
- see section belowFilterLogByTime
- does not need to be changed. It gets the values of a log and returns them as an array after operating on them. No changes occur to the workspace.GenerateEventsFilter
- does not need to be changed. It creates a workspace describing how to filter or split an input workspaceGetTimeSeriesLogInformation
- does not need to be changed. It generates outputs workspace based on information from a selected log.RebinByTimeAtSample
- does not need to be changed. It histograms the data based on absolute time of the neutron at the sample.FilterEvents algorithm
This is the heart of where the other changes come together. link to docs
All 3 types of inputs (
MatrixWorkspace
,SplittersWorkspace
, andTableWorkspace
) will be converted into aTimeSplitter
. This will also consume the information fromInformationWorkspace
(sets the title and comment of outputs),FilterByPulseTime
(which will require using the proton charge log),OutputWorkspaceIndexedFrom1
,OutputWorkspaceNames
,RelativeTime
, andFilterStartTime
.The events themselves will be split using the
TimeSplitter
and the new functions inEventList
.The
Run
object will be copied into all of the output workspaces, then theTimeSplitter
will be asked to create aTimeROI
for each output workspace index which will be added to each of the associated output workspace'sRun
object.Implementation notes
The implementation should be in the following order
TimeROI
objectTimeROI
.Statistics
objectRun
object to allow forTimeROI
to be attachedTimeSeriesProperty
toRun
. At this stage previous implementation can stay in-place with deprecation annotationsTimeROI
and add deprecation annotations to old methods (copy if, and erase)TimeSplitter
objectTimeSplitter
. This will throw an exception if a single time period can go to more than one output workspace index.TimeSplitter
and add deprecation annotations to old methodsThe functionality to add
TimeROI
and addTimeSplitter
can happen in parallel, but must be before modifying the algorithms.Things that can be done out of order
Instrument
Goniometer
Additional context
Since the changes will be made to algorithms incrementally the deprecated annotation can be used with
-Wdeprecated-declarations
to track down places that are using the "old" functionality.The issue discovered in #27556 should be fixed as part of this work
The equations for calculating the time-average mean with a
TimeROI
are in section IV of this paper withM(t)
representing theTimeROI
with appendix A containing an example. The challenge is to discretize the equations.This is associated with:
The text was updated successfully, but these errors were encountered: