Remove URL fragment in lineage IDs #6011
This PR removes a lot of unnecessary complexity around the LID filesystem that I feel is getting in the way of core use cases. There are a few tests still failing, mainly because the
Before we move this forward, I think we should decide whether we actually want to embed the workflow output value in the metadata. Like I mentioned, a pipeline with a

Alternatively, the user can save the output to an index file (i.e. samplesheet) and just reference that index file in a downstream pipeline, like they already do. We could simply reference this index file in the metadata instead of the contents. I intend to explore this as part of the TraceObserverV2 proposal.

In that case, maybe we could drop the use of URI fragments entirely. Users already have the index file as a drop-in replacement for samplesheets, and traversing the metadata can already be done more effectively via

So I think my main goal for the 25.04 release is to remove the URI syntax overloading (fragment, query params) in favor of something more familiar and flexible.
modules/nf-lineage/src/main/nextflow/lineage/model/TaskRun.groovy
modules/nf-lineage/src/main/nextflow/lineage/LinObserver.groovy
AFAIK, the index file is not mandatory; users need to indicate that they want to write it, right? If they do not write it, what should we put as the output? Maybe we should write the index file in all cases, and either publish it to a file if indicated or store it as lineage metadata.
I think when |
Updated this PR based on the latest changes, focusing on removing the use of URL fragments entirely in the LID filesystem. This makes the user experience simpler (don't need to remember the

Every workflow execution now has a "launch LID" and a "run LID". The former is added to the history log when the workflow begins; the latter is added when the workflow completes.

Here's a simple test you can run to get started:

    rm -rf .lineage/
    nextflow run rnaseq-nf -r lineage -profile conda -resume --labels foo,bar
    nextflow lineage list

@jorgee if these changes make sense to you, can you help me fix the tests? I think they are the same ones as before.
Expected output:

    $ nextflow lineage list
    TIMESTAMP                RUN NAME    SESSION ID                            LAUNCH LID                              RUN LID
    2025-05-02 19:06:15 CDT  stoic_shaw  bc79451f-c573-4b7d-8e7c-697be8d9cefc  lid://cd7197c02ab1250eafc2bf7499715e5f  lid://304c57e48ab6b324715ad2c5ba55b25e

    $ nextflow lineage view lid://304c57e48ab6b324715ad2c5ba55b25e
    {
      "type": "WorkflowRun",
      "createdAt": "2025-05-02T19:06:16.160559118-05:00",
      "workflowLaunch": "lid://cd7197c02ab1250eafc2bf7499715e5f",
      "output": [
        {
          "type": "Path",
          "name": "summary",
          "value": "lid://cd7197c02ab1250eafc2bf7499715e5f/summary/multiqc_report.html"
        },
        {
          "type": "Collection",
          "name": "samples",
          "value": "lid://cd7197c02ab1250eafc2bf7499715e5f/samples.json"
        }
      ]
    }
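As a rough illustration of how the record above can be consumed without any fragment syntax, here is a minimal Python sketch that loads the same JSON and looks up outputs by name, much as one might do with `jq` on the command line. The helper names `outputs_by_name` and `split_lid` are hypothetical and not part of nf-lineage; the sketch only assumes that a LID has the shape `lid://<hash>/<relative path>` as seen in the example output.

```python
import json
from urllib.parse import urlsplit

# WorkflowRun record copied from the example output in this thread.
RECORD = """
{
  "type": "WorkflowRun",
  "createdAt": "2025-05-02T19:06:16.160559118-05:00",
  "workflowLaunch": "lid://cd7197c02ab1250eafc2bf7499715e5f",
  "output": [
    {"type": "Path", "name": "summary",
     "value": "lid://cd7197c02ab1250eafc2bf7499715e5f/summary/multiqc_report.html"},
    {"type": "Collection", "name": "samples",
     "value": "lid://cd7197c02ab1250eafc2bf7499715e5f/samples.json"}
  ]
}
"""


def outputs_by_name(record: dict) -> dict:
    """Map each workflow output name to its LID value (hypothetical helper)."""
    return {item["name"]: item["value"] for item in record["output"]}


def split_lid(lid: str) -> tuple:
    """Split a LID into (hash, relative path); hypothetical helper, not nf-lineage API."""
    parts = urlsplit(lid)
    if parts.scheme != "lid":
        raise ValueError(f"not a LID: {lid}")
    return parts.netloc, parts.path.lstrip("/")


record = json.loads(RECORD)
outputs = outputs_by_name(record)
launch_hash, _ = split_lid(record["workflowLaunch"])

for name, value in outputs.items():
    lid_hash, rel_path = split_lid(value)
    # In the example record, every output path sits under the launch LID.
    print(name, lid_hash == launch_hash, rel_path)
```

In this example both output values share the hash of the launch LID rather than the run LID, which is the point raised later in the review about which of the two LIDs the output paths are based on.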
modules/nf-lineage/src/main/nextflow/lineage/model/WorkflowRun.groovy
I added the completion status to the WorkflowRun. It can be SUCCEEDED, FAILED, or CANCELLED, consistent with platform terminology.
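For readers consuming these records downstream, a small Python sketch of validating the three status values might look like the following. The names `RunStatus` and `parse_status` are hypothetical, and the thread does not say which JSON field carries the status, so the sketch only validates a raw string.

```python
from enum import Enum


class RunStatus(Enum):
    """The three completion states named in this thread."""
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


def parse_status(raw: str) -> RunStatus:
    """Convert a raw status string into RunStatus, rejecting unknown values."""
    try:
        return RunStatus(raw)
    except ValueError:
        raise ValueError(f"unknown completion status: {raw!r}") from None


print(parse_status("SUCCEEDED").name)
```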
I think these changes are valuable enough to merge now. @jorgee, do you have any remaining concerns? You mentioned something about the workflow outputs not having LIDs, but I don't remember exactly. Otherwise, if these changes make sense to you, please approve and I will merge later.
My main concern was about the two LIDs and the path of the workflow output files. They are based on the
Could it be done the other way around? I would expect the final workflow run hash to include the workflow outputs as components, so there would be no way for a workflow output to refer to the final workflow run that refers to it.
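To make the suggestion concrete, here is a minimal Python sketch of a run hash that takes the output hashes as components. This is not how nf-lineage computes LIDs; the functions `digest`, `launch_id`, `output_id`, and `run_id`, and the choice of SHA-256, are all assumptions made for illustration.

```python
import hashlib


def digest(*components: str) -> str:
    """Hash an ordered sequence of string components into a hex digest."""
    h = hashlib.sha256()
    for component in components:
        h.update(component.encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def launch_id(params: dict) -> str:
    """Hypothetical launch ID: depends only on what is known at launch time."""
    items = [f"{key}={value}" for key, value in sorted(params.items())]
    return "lid://" + digest("launch", *items)


def output_id(name: str, content: bytes) -> str:
    """Hypothetical output ID: depends on the output name and its content."""
    return digest("output", name, hashlib.sha256(content).hexdigest())


def run_id(launch: str, output_ids: list) -> str:
    """Hypothetical run ID: computed last, from the launch ID and all output IDs."""
    return "lid://" + digest("run", launch, *sorted(output_ids))


launch = launch_id({"labels": "foo,bar"})
outputs_v1 = [output_id("samples", b"v1")]
outputs_v2 = [output_id("samples", b"v2")]

# The run ID changes when an output changes; the launch ID does not.
print(run_id(launch, outputs_v1) != run_id(launch, outputs_v2))
```

Because `run_id` is derived from the output IDs, an output in this scheme cannot embed the run ID it contributes to; it could only refer back to something known earlier, such as the launch ID.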
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
This PR removes the use of URL fragments in LIDs, since jq can be used on the command line. The #output shortcut is replaced by adding both WorkflowLaunch and WorkflowRun to the history log.
WorkflowRun -> WorkflowLaunch
WorkflowOutput -> WorkflowRun
nextflow lineage list
TaskOutput since it is not used