Skip to content

nextflow format hello-nextflow/ #565

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ewels
Copy link
Member

@ewels ewels commented Mar 13, 2025

Ran the new upcoming nextflow format command on the hello-nextflow directory.
Once this is out, we'll start to run this formatter on a lot of nf-core code and it should quickly become the established formatting style for Nextflow code. I figure it's good to get ahead of the game and get the training to follow this style too.

No manual changes here at all, purely the automated formatting changes. We'll need to update the markdown code to reflect these changes.

Note that in other docs I've used mkdocs material "snippets" to directly include code files into the markdown without copying + pasting, which makes these kinds of changes easier, not sure if it's worth considering here as we update or not worth the effort.. See docs, docs and example:

```yaml title="nf-training/env.yml" linenums="1"
--8<-- "nf-training/env.yml"
```

Once we have sign off on the use of this formatting tool and the code style it produces, I can update this PR to also adjust the markdown.

Once that's done we should apply it to all other material, then set up pre-commit to automatically check (/ apply on comment) this auto-formatting.

Phil

Copy link

netlify bot commented Mar 13, 2025

Deploy Preview for nextflow-training ready!

Name Link
🔨 Latest commit d9d5ab7
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-training/deploys/67d3399d0d36a2000713178c
😎 Deploy Preview https://deploy-preview-565--nextflow-training.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@bentsherman
Copy link
Member

Please let me know if you have opinions about formatting. I will hear you out. Get your views in before things become set in stone!

Copy link
Collaborator

@vdauwera vdauwera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some questions/comments that may be more for @bentsherman


script:
"""
echo '$greeting' > output.txt
echo '${greeting}' > output.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are the curly braces going to be mandatory? why do we need them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are optional in certain cases like simple variable names. The formatter doesn't try to be smart here, it just always uses braces

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for whether we should put them everywhere, there is a trade-off between consistency and shortcuts

Using curly braces everywhere requires a bit more typing but makes the code more consistent, therefore easier to read

Omitting the curly braces where possible saves a bit on typing but makes the code less consistent, thus the reader has to work a bit more to understand the different syntax forms

So I think it comes down to how much time we spend reading vs writing code. At least in software engineering, we spend way more time reading code, so I tend to favor consistency

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's easier to teach consistency as well. So I like this change.

@@ -1,4 +1,10 @@
#!/usr/bin/env nextflow
// Include modules
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea here that module includes should go before anything else? Is there a functional reason or is it purely stylistic? Any interactions wrt setting any variable values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, it's just a convention as it shouldn't matter where you put the includes

In practice, users sometimes put the params first so that they are propagated to the included modules as a side effect. We are trying to move away from that pattern anyway

greeting_ch = Channel.fromPath(params.greeting)
.splitCsv()
.map { line -> line[0] }
greeting_ch = Channel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like having to put the .fromPath() bit on the next line. I would keep that like it was before tbh. That way you have the first line do a full thing, then the next lines modify that thing. If you have just greeting_ch = Channel on one line that feels... interrupted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I agree. I would just need to add a special case in the formatter. Though I wonder if we would feel the same about any old foo.bar() or if it's just with channel factories?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me I would do this for any old foo.bar() as well I think.. The first line should contain the first action. Whether it's Channel or foo, neither of these are actions, only the starting points.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I imagine something like this:

ch_samples
  .filter { sample -> sample.n_reads >= 1_000_000 }
  .map { sample -> sample.id }

I think I prefer both calls to be on their own line. I think I treat channel factories differently in my head because the Channel and the factory call are sort of a unit, but in general it seems like the "starting point" should be separate from any transformations performed on it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the format command will now put Channel.fromPath() on a single line, as well as any other channel factory

@@ -30,7 +31,7 @@ workflow {
collectGreetings(convertToUpper.out.collect(), params.batch)

// emit a message about the size of the batch
collectGreetings.out.count.view { "There were $it greetings in this batch" }
collectGreetings.out.count.view { "There were ${it} greetings in this batch" }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are the extra curlies necessary here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also should we be explicitly naming the variable like we do later with line -> line[0] etc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my point above about curly braces. As for the implicit it, yes I would recommend as a best practice that you declare an explicit name such as line. Right now it's only a "paranoid" warning but it will eventually become an error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I got all of the $its in #538 - must have missed this one.

@vdauwera
Copy link
Collaborator

@ewels Before this can go in we're also going to have to update a bunch of line numbers, check the explanatory text for any code that is quoted and needs to be updated to match etc., so this will still need some substantial effort.

Also, do you have any thoughts on how this will make the code different in a lot of places compared to what you show in the videos? I know we can tell people to expect minor changes but a few of these aren't really minor, at least for newcomers (eg chunks of code that move to a different part of the file) and in general it's going to add 'noise' in terms of people trying to follow what's going on on screen and on their own machine.

So I'm a little reluctant to push this out right away if there's not a driving need for it. At some level I'd rather sit on it until we feel we may need to update the recordings themselves (hopefully not for some number of months as they are so good and it takes so much effort to make), OR there's a strong reason for why we need the formatting to be strictly aligned.

@vdauwera vdauwera added improve New feature or request hello series labels Mar 17, 2025
@ewels
Copy link
Member Author

ewels commented Mar 17, 2025

Yup, totally agree - the material will need some significant changes. Also reordering some sections, eg. introducing workflows first then processes.

I wanted to get feedback on the formatter's style choices first really, to see how many changes it would make and see if we agree with them. Then we can adjust it before it goes live in Nextflow.

Agree that we shouldn't rush this, and happy to leave it for quite a while before doing.

That said, maybe we can progressively implement it, using it for any new courses / material from now on. And maybe applying to the courses that don't have videos. Does that sound ok?

Comment on lines +2 to +7
workflow {

// emit a greeting
sayHello()
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike having workflow above processes. I use this order:

  1. include
  2. params
  3. functions
  4. processes
  5. workflows
  6. anonymous workflow

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm debating whether to add an option to control the order (e.g. "Formatting > Entry workflow position")

Or maybe just not try to re-order the definitions at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've always done processes first then workflows, but I'm coming around to it.

I don't mind making @adamrtalbot feel uncomfortable, auto-formatters have that effect on everyone 😅 Though maybe this goes beyond what most formatters do?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll survive if it gets reordered, but do we have a particular order that is preferred? It feels weird to be opinionated about something that isn't consequential (not that it's stopped me before).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that what auto-formatters do? 😆 Basically no formatting is consequential. It's mostly for convention and consistency, so that folks get used to navigating pipeline code in the same way everywhere.

Any colour, as long as it's black.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most languages don't have this problem because they just have functions, no processes / workflows.

I do think that most people put their workflows at the bottom. This is also the convention that Paolo used when he introduced DSL2 in the docs, which could explain it.

I'm starting to think the formatter should just preserve the user's order for now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - or at least make it opt in, off by default for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..Maybe with includes as an exception? I do like having them at the top and alphabetically sorted 👀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, feature flags / includes / params should go first no matter what

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the format command now will not try to re-order your declarations unless you enable it, should make the formatting less intrusive overall

@ewels ewels marked this pull request as draft March 19, 2025 23:54
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
hello series improve New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants