-
Notifications
You must be signed in to change notification settings - Fork 194
nextflow format hello-nextflow/
#565
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
base: master
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for nextflow-training ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Please let me know if you have opinions about formatting. I will hear you out. Get your views in before things become set in stone! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some questions/comments that may be more for @bentsherman
|
||
script: | ||
""" | ||
echo '$greeting' > output.txt | ||
echo '${greeting}' > output.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the curly braces going to be mandatory? why do we need them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are optional in certain cases like simple variable names. The formatter doesn't try to be smart here, it just always uses braces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for whether we should put them everywhere, there is a trade-off between consistency and shortcuts
Using curly braces everywhere requires a bit more typing but makes the code more consistent, therefore easier to read
Omitting the curly braces where possible saves a bit on typing but makes the code less consistent, thus the reader has to work a bit more to understand the different syntax forms
So I think it comes down to how much time we spend reading vs writing code. At least in software engineering, we spend way more time reading code, so I tend to favor consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easier to teach consistency as well. So I like this change.
@@ -1,4 +1,10 @@ | |||
#!/usr/bin/env nextflow | |||
// Include modules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea here that module includes should go before anything else? Is there a functional reason or is it purely stylistic? Any interactions wrt setting any variable values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle, it's just a convention as it shouldn't matter where you put the includes
In practice, users sometimes put the params first so that they are propagated to the included modules as a side effect. We are trying to move away from that pattern anyway
greeting_ch = Channel.fromPath(params.greeting) | ||
.splitCsv() | ||
.map { line -> line[0] } | ||
greeting_ch = Channel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like having to put the .fromPath()
bit on the next line. I would keep that like it was before tbh. That way you have the first line do a full thing, then the next lines modify that thing. If you have just greeting_ch = Channel
on one line that feels... interrupted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I agree. I would just need to add a special case in the formatter. Though I wonder if we would feel the same about any old foo.bar()
or if it's just with channel factories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me I would do this for any old foo.bar()
as well I think.. The first line should contain the first action. Whether it's Channel
or foo
, neither of these are actions, only the starting points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I imagine something like this:
ch_samples
.filter { sample -> sample.n_reads >= 1_000_000 }
.map { sample -> sample.id }
I think I prefer both calls to be on their own line. I think I treat channel factories differently in my head because the Channel
and the factory call are sort of a unit, but in general it seems like the "starting point" should be separate from any transformations performed on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the format
command will now put Channel.fromPath()
on a single line, as well as any other channel factory
@@ -30,7 +31,7 @@ workflow { | |||
collectGreetings(convertToUpper.out.collect(), params.batch) | |||
|
|||
// emit a message about the size of the batch | |||
collectGreetings.out.count.view { "There were $it greetings in this batch" } | |||
collectGreetings.out.count.view { "There were ${it} greetings in this batch" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the extra curlies necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also should we be explicitly naming the variable like we do later with line -> line[0]
etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my point above about curly braces. As for the implicit it
, yes I would recommend as a best practice that you declare an explicit name such as line
. Right now it's only a "paranoid" warning but it will eventually become an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I got all of the $it
s in #538 - must have missed this one.
@ewels Before this can go in we're also going to have to update a bunch of line numbers, check the explanatory text for any code that is quoted and needs to be updated to match etc., so this will still need some substantial effort. Also, do you have any thoughts on how this will make the code different in a lot of places compared to what you show in the videos? I know we can tell people to expect minor changes but a few of these aren't really minor, at least for newcomers (eg chunks of code that move to a different part of the file) and in general it's going to add 'noise' in terms of people trying to follow what's going on on screen and on their own machine. So I'm a little reluctant to push this out right away if there's not a driving need for it. At some level I'd rather sit on it until we feel we may need to update the recordings themselves (hopefully not for some number of months as they are so good and it takes so much effort to make), OR there's a strong reason for why we need the formatting to be strictly aligned. |
Yup, totally agree - the material will need some significant changes. Also reordering some sections, eg. introducing workflows first then processes. I wanted to get feedback on the formatter's style choices first really, to see how many changes it would make and see if we agree with them. Then we can adjust it before it goes live in Nextflow. Agree that we shouldn't rush this, and happy to leave it for quite a while before doing. That said, maybe we can progressively implement it, using it for any new courses / material from now on. And maybe applying to the courses that don't have videos. Does that sound ok? |
workflow { | ||
|
||
// emit a greeting | ||
sayHello() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike having workflow above processes. I use this order:
- include
- params
- functions
- processes
- workflows
- anonymous workflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm debating whether to add an option to control the order (e.g. "Formatting > Entry workflow position")
Or maybe just not try to re-order the definitions at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've always done processes first then workflows, but I'm coming around to it.
I don't mind making @adamrtalbot feel uncomfortable, auto-formatters have that effect on everyone 😅 Though maybe this goes beyond what most formatters do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll survive if it gets reordered, but do we have a particular order that is preferred? It feels weird to be opinionated about something that isn't consequential (not that it's stopped me before).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that what auto-formatters do? 😆 Basically no formatting is consequential. It's mostly for convention and consistency, so that folks get used to navigating pipeline code in the same way everywhere.
Any colour, as long as it's black.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most languages don't have this problem because they just have functions, no processes / workflows.
I do think that most people put their workflows at the bottom. This is also the convention that Paolo used when he introduced DSL2 in the docs, which could explain it.
I'm starting to think the formatter should just preserve the user's order for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - or at least make it opt in, off by default for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
..Maybe with includes as an exception? I do like having them at the top and alphabetically sorted 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, feature flags / includes / params should go first no matter what
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the format
command now will not try to re-order your declarations unless you enable it, should make the formatting less intrusive overall
Ran the new upcoming
nextflow format
command on thehello-nextflow
directory.Once this is out, we'll start to run this formatter on a lot of nf-core code and it should quickly become the established formatting style for Nextflow code. I figure it's good to get ahead of the game and get the training to follow this style too.
No manual changes here at all, purely the automated formatting changes. We'll need to update the markdown code to reflect these changes.
Note that in other docs I've used mkdocs material "snippets" to directly include code files into the markdown without copying + pasting, which makes these kinds of changes easier, not sure if it's worth considering here as we update or not worth the effort.. See docs, docs and example:
training/docs/basic_training/containers.md
Lines 447 to 449 in ce85956
Once we have sign off on the use of this formatting tool and the code style it produces, I can update this PR to also adjust the markdown.
Once that's done we should apply it to all other material, then set up pre-commit to automatically check (/ apply on comment) this auto-formatting.
Phil