Explain in section 5 the relation and difference between an Event Log and other technologies #7

Open
phochste opened this issue Oct 27, 2021 · 6 comments
@phochste
Contributor

In MellonScholarlyCommunication/spec-notifications#20 and in our bi-weekly technical Mellon meeting we discussed how an Event Log relates to, and differs from, other types of technologies (e.g. the as:outbox of ActivityPub). These differences would best be reflected in the Event Log spec (e.g. in section 5) to better explain the rationale for an Artefact Event Log.

A recap of the observations that resulted from our discussions in these channels

--

  • The Artefact Event Log is artefact-centric: it says what has happened to an artefact
  • An as:outbox (as in ActivityPub) is activity-centric: it says what activities an actor has done

--

  • The Artefact Event Log (tries to be) symmetric between the Actor and the Service Hub. Both should (when everything works well) contain the same event about an artefact
  • An as:outbox is not symmetric: it contains only the activities an actor (e.g. Alice) does, but not the activities of other actors (e.g. Service Hubs)

--

  • The Artefact Event Log is about events that add value in a Scholarly (or heritage, "Erfgoed") value chain: the values are registration, certification, endorsement, awareness and archiving
    • E.g. the artefact event log contains the facts that artefacts were indeed registered, certified...
  • An as:outbox could include some of these activities, but also other things that Alice does in her network, e.g. 'liking a collection item', 'registering for a workshop', 'annotating a (scholarly) resource'
  • There could also be logs that are not shared at all, e.g. housekeeping events of what happened on a pod
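
A minimal sketch of the artefact-centric versus actor-centric contrast in the recap above. The activity tuples and the `group_by` helper are hypothetical and only illustrate the idea; they are not part of the Event Log spec or of ActivityPub.

```python
from collections import defaultdict

# Hypothetical, simplified activities: (actor, AS2-style type, artefact).
activities = [
    ("alice", "Offer", "paper-1"),
    ("hub", "Announce", "paper-1"),
    ("alice", "Like", "item-9"),
]


def group_by(index, items):
    """Group activity tuples by the field at position `index`."""
    groups = defaultdict(list)
    for item in items:
        groups[item[index]].append(item)
    return dict(groups)


# Actor-centric view, comparable to an as:outbox: what did each actor do?
outbox_view = group_by(0, activities)

# Artefact-centric view, comparable to an Artefact Event Log:
# what happened to each artefact, regardless of who did it?
event_log_view = group_by(2, activities)
```

In this sketch Alice's view holds her own two activities only, while the view for `paper-1` holds both Alice's offer and the hub's announcement.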
@mielvds mielvds self-assigned this Oct 27, 2021
@hvdsomp

hvdsomp commented Oct 27, 2021

I was thinking more about something @mielvds said during the discussion regarding scope of the scholcomm event log, namely that maybe only the events in which value was effectively added to an artifact go into that event log. Meaning that Alice's offer would not go in. But a message from a service hub that states that e.g. registration happened for the artifact would be turned into an event in the event log. Although, during the discussion, I responded that I felt that Alice's offers and - for example - rejections thereof also provide information that supports transparency re open science, I could also embrace the perspective of @mielvds . Doing so would mean that events in the event log would always be based on notifications coming in from third parties, never based on Alice's own actions:

  • Notifications re value added by service hubs (e.g. registration) are turned into events
  • Notifications re value added through interactions with an artifact by peers (interaction events in the Mellon proposal) are turned into events
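
The perspective in the two bullets above could be sketched roughly as follows, assuming a hypothetical `Notification` record and a pod owner name; the field names are illustrative and not taken from any spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    sender: str    # who sent the notification (hypothetical identifier)
    activity: str  # AS2-style activity type
    artifact: str  # artifact the notification is about


def to_events(notifications, owner):
    """Keep only notifications that come from third parties, not from the owner."""
    return [n for n in notifications if n.sender != owner]


inbox = [
    Notification("alice", "Offer", "paper-1"),   # Alice's own action: not logged
    Notification("hub", "Announce", "paper-1"),  # value added by a service hub
    Notification("bob", "Like", "paper-1"),      # interaction by a peer
]
events = to_events(inbox, owner="alice")
```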

This is not to say that the fact that Alice made an offer that was rejected should not be saved somehow. Just like many other things that happen around the pod, it could be. The question is more whether that information should be public, which the scholcomm event log is.

A lot of this points to the potential existence of multiple logs, with the scholcomm event log being one that is supposed to be public and helps with transparency of schol comm. Maybe this kind of perspective is also helpful for the NDE case that does not deal with registration/certification/etc and hence does not need a scholcomm event log. But it needs another event log that serves another purpose ...

@mielvds
Contributor

mielvds commented Oct 27, 2021

> Doing so would mean that events in the event log would always be based on notifications coming in from third parties, never based on Alice's own actions:

Mostly, yes. But perhaps "registration was requested from" is an event worth logging, i.e. Alice requested it and is currently waiting... But the more I think about it, the more I feel that only "service responses" should end up in the log, which is what you conclude.

We should try listing some collector use cases: why does the collector reconstruct artefact lifecycles? An obvious one is: "I found a paper and I want to know whether it's legit." What do you need to know in order to come to that conclusion?

> * Notifications re value added by service hubs (e.g. registration) are turned into events
> * Notifications re value added through interactions with an artifact by peers (interaction events in the Mellon proposal) are turned into events
>
> This is not to say that the fact that Alice made an offer that was rejected should not be saved somehow. Just like many other things that happen around the pod, it could be. The question is more whether that information should be public, which the scholcomm event log is.

Yes! Without going too much "if a tree falls in the forest": should an artefact have at least a "creation" event with basic metadata for cases where no services were involved yet? For example: Bob wrote a paper, but hasn't submitted it yet. However, he does want collectors to be able to discover it.
What makes an artefact discoverable by the collector?

> A lot of this points to the potential existence of multiple logs, with the scholcomm event log being one that is supposed to be public and helps with transparency of schol comm. Maybe this kind of perspective is also helpful for the NDE case that does not deal with registration/certification/etc and hence does not need a scholcomm event log. But it needs another event log that serves another purpose ...

+1

@hvdsomp

hvdsomp commented Oct 27, 2021

I provided my perspective regarding the "creation" event: the fact that Alice created a document may be of interest to some, e.g. her close collaborators. But from my perspective this is out of the scope of the scholcomm event log. For that event log, IMO, it all starts with Registration, which is the entrance of the document into the scholarly record. Again, that is not saying that the creation of the document, edits thereof, etc. should not be saved somewhere. As a matter of fact, they might be of interest to generate provenance information regarding a registered artifact, by which I mean some technical metadata detailing the creation/evolution of the artifact prior to being registered in the scholarly record. But, IMO, this is not necessarily part of the scholcomm event log.

@hvdsomp

hvdsomp commented Oct 27, 2021

I keep thinking about all this. In this comment, I want to think in general terms rather than in terms of the Mellon scholarly use case and the related scholcomm event log. In this regard, I keep thinking about things @mielvds said (e.g. the remove case in NDE) and what @Dexagod said (about the potential role of the Orchestrator in deciding which events are considered worthwhile saving in an event log and which are not). In this, I assume that the Orchestrator is always a "machine in the middle" that is aware of the kind of events that this discussion is about, both events that are considered worthwhile to save in an event log and those that are not. That is, events for which the Orchestrator isn't in the loop are outside of this discussion.

  1. This leads me to the notion of a registry of event types that are considered worthwhile and that could be characterized (e.g. in rules invoked on notifications) by means of (among others?):
    • what is the as2 activity
    • what is the more specific, community-related activity (cf the COAR vocab)
    • who is the sender of the notification regarding the activity (e.g. Alice herself, a service hub, which service hub, which type of service hub, ...)
    • what is the type of artifact

  Based on this, some notifications would be evaluated as pertaining to events that are considered worthwhile and others would not.
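
A rough sketch of such a registry, assuming notifications have already been reduced to a flat dictionary. The `Rule` class, its field names, and the example values are hypothetical; the actual demonstrators express rules differently.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One registry entry; a field left as None matches any value."""
    as2_type: Optional[str] = None        # the as2 activity
    community_type: Optional[str] = None  # community-specific activity
    sender_type: Optional[str] = None     # e.g. "owner" or "service-hub"
    artifact_type: Optional[str] = None   # type of artifact

    def matches(self, notification: dict) -> bool:
        return all(
            expected is None or notification.get(key) == expected
            for key, expected in asdict(self).items()
        )


# Hypothetical registry: announcements from service hubs are worthwhile.
REGISTRY = [Rule(as2_type="Announce", sender_type="service-hub")]


def is_worthwhile(notification: dict) -> bool:
    """A notification is worthwhile if at least one registry rule matches it."""
    return any(rule.matches(notification) for rule in REGISTRY)
```

With this registry, a hub's `Announce` is evaluated as worthwhile while Alice's own `Offer` is not.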

Another aspect of all this is the question "event considered worthwhile for which purpose?" We've already touched upon this in the discussion: worthwhile for transparency of open science scholcomm, worthwhile when it comes to recording a creation/update provenance trail about an artifact in a pod, worthwhile regarding workflows in NDE collection registration, etc.

  2. This leads me to the notion of event logging to serve different purposes. This in turn probably leads to the notion of multiple separate event logs (one per purpose), because different applications/users will likely consume them and will or will not be allowed to access them (i.e. different access rights for different event logs). The aforementioned registry of event types could also contain the information concerning which worthwhile event goes into which event log, and the Orchestrator would then write an event to the appropriate log.
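
To illustrate the idea of one log per purpose, here is a small sketch in which the registry also names the target log. The `ROUTES` table, the log names, and the notification fields are assumptions made for the example only.

```python
from collections import defaultdict

# Hypothetical registry: (activity type, sender type) -> name of the target log.
ROUTES = {
    ("Announce", "service-hub"): "scholcomm",  # public, transparency-oriented
    ("Create", "owner"): "provenance",         # artifact evolution trail
    ("Update", "owner"): "provenance",
}

# In-memory stand-ins for the separate event logs.
logs = defaultdict(list)


def record(notification):
    """Append the notification to the log its route names.

    Returns the log name, or None when no route considers it worthwhile.
    """
    target = ROUTES.get((notification["type"], notification["sender_type"]))
    if target is not None:
        logs[target].append(notification)
    return target


record({"type": "Announce", "sender_type": "service-hub", "artifact": "paper-1"})
record({"type": "Update", "sender_type": "owner", "artifact": "paper-1"})
```

Access rights would then be set per log rather than per event, which is one reason to keep the logs separate.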

I am not saying that we need to go in this direction immediately. But merely considering the scholcomm registration/certification/awareness/archiving events, the scholcomm interaction events, the artifact evolution events (create, update, delete), and the NDE events, it's becoming kind of obvious that one size will not fit all and that event logs will end up having a certain profile related to the purpose they serve.

@phochste
Contributor Author

phochste commented Oct 27, 2021

Technically this is what is currently already possible with the rule language and current orchestrator demonstrators.

The registry consists of N3 policy files. As far as the orchestrator is concerned, they can be anywhere in the world. The orchestrator only needs to know where to get these policies.

In these policies, exactly as you describe @hvdsomp, it is stated who the activity is from, which log you want to write it to, and in what form. For now I assume that all event logs are LDP Containers. What we put in them is our choice.
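
Under the assumption above that an event log is an LDP Container, adding an event amounts to an HTTP POST to the container URL. The sketch below only builds such a request and does not send it; the pod URL, the event payload, and the `build_append_request` helper are hypothetical, and authentication is left out.

```python
import json
import urllib.request


def build_append_request(container_url, event):
    """Build (but do not send) a POST that would add `event` to an LDP Container."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        container_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/ld+json"},
    )


# Hypothetical container URL and event payload.
req = build_append_request(
    "https://pod.example/alice/eventlog/",
    {"type": "Announce", "object": "https://pod.example/alice/paper-1"},
)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, against a server that accepts the write.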

In general, I think that Alice could work in different communities, where the same artefact can have a different "worthwhile"-ness in communities X and Y. And maybe a combined "worthwhile"-ness for a community Z that does both X and Y.

E.g. Alice could do research on an old manuscript in Digital Heritage land and in Schol Communication land (a different value chain), but also be a public speaker on this subject for a third community. Are these 3 different logs?

@hvdsomp

hvdsomp commented Oct 27, 2021

So, that's totally great. In which case, IMO, the whole discussion boils down to the need to express what is technically already possible in more architectural terms. Which is - I think - what I've tried to do.
