Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Move igenomes config into pipeline code #3131

Closed
bentsherman opened this issue Aug 21, 2024 · 4 comments
Closed

Move igenomes config into pipeline code #3131

bentsherman opened this issue Aug 21, 2024 · 4 comments
Milestone

Comments

@bentsherman
Copy link

The params.genomes is a map that is conditionally included, which is a very clever hack that will not be supported by the new config parser 😅

Instead, I suggest refactoring the igenomes.config into a function that returns a giant map:

def getGenomesMap(base_url) {
  [
    'GRCh37': [
      fasta: "${base_url}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa",
      // ...
    ],
    // ...
  ]
}

You can then include this function into your workflow and use it however you want:

include { getGenomesMap } from '...'

workflow {
  genomes = !params.igenomes_ignore ? getGenomesMap(params.igenomes_base) : [:]
}
@bentsherman
Copy link
Author

@ewels mentioned that users like to modify this genomes config. In that case, it should be refactored into an index file (e.g. JSON, YAML). Then you can load it in the config, although it still might be easier to have a param for the file name (e.g. params.igenomes_index) and load it in the pipeline code:

def getGenomesIndex(filename) {
  new groovy.json.JsonSlurper().parseText(file(filename).text)
}

@ewels
Copy link
Member

ewels commented Aug 23, 2024

My summary is that we have two problems here:

  1. The includeConfig is within an if statement in the config. This won't work with the new config parser, and needs resolving ASAP.
  2. The config itself uses nested parameters, support for which may be dropped in the slightly longer term.

So the most urgent thing is a short term fix for (1). Instead of this:

// Load igenomes.config if required
if (!params.igenomes_ignore) {
    includeConfig 'conf/igenomes.config'
} else {
    params.genomes = [:]
}

Suggestion is to use a ternary expression, which should work:

// Load igenomes.config if required
includeConfig !params.igenomes_ignore ? 'conf/igenomes.config' : 'conf/igenomes_ignored.config'

Where a new igenomes_ignored.config file would simply contain:

params.genomes = [:]

Can maybe think of more elegant syntax, but that's the gist of it.


Regarding (2) in the longer term - it'd be nice to rewrite how all of the iGenomes configuration works. This will break how existing user's config files work, but I think it's ok if we're overhauling the entire references system anyway. So hopefully we can incorporate this syntax change along with the new references back end.

@ewels ewels added this to the 3.0 milestone Sep 11, 2024
@ewels
Copy link
Member

ewels commented Sep 11, 2024

See #3132 (comment) for another issue about includeConfig, essentially the same as above.

@mirpedrol
Copy link
Member

Done by #3168

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

3 participants