parallelize sequential flag with drmaa or multiple cores #133

Terf · 2021-08-09T18:18:22Z

Submitting validator jobs to the cluster (in batches) provides a massive speedup; running on multiple cores provides a more modest one. One drawback is that the progress bar is kind of meaningless since jobs run asynchronously. Also, when qsub-ing jobs, a bunch of tempfiles need to be created in a network-mounted directory (this uses bids_dir) so they're visible to the exec nodes, which is a bit hacky.
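For reference, here is a minimal sketch of the multi-core idea, not the code in this PR. It assumes each validator run is an independent command; `run_call` and `run_parallel` are hypothetical names, and the commands below are stand-ins for real validator calls. Threads are enough here because the work happens in the child processes. Counting jobs as they finish is one way to keep progress reporting meaningful when completion order is not submission order.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_call(call):
    """Run one command and return (call, returncode, stdout)."""
    ret = subprocess.run(call, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return call, ret.returncode, ret.stdout.decode("utf-8")


def run_parallel(calls, n_workers=4):
    """Run commands concurrently; results arrive in completion order."""
    results = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(run_call, call) for call in calls]
        # as_completed yields each future when it finishes, so the counter
        # reflects finished jobs rather than submitted ones.
        for done, future in enumerate(as_completed(futures), start=1):
            results.append(future.result())
            print(f"{done}/{len(futures)} finished", file=sys.stderr)
    return results


# Stand-in commands; a real run would pass the validator calls instead.
calls = [[sys.executable, "-c", f"print({i})"] for i in range(3)]
results = run_parallel(calls, n_workers=2)
```

Because results come back in completion order, each tuple carries its original command so the output can be matched back to its subject.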

Terf · 2021-08-09T18:30:10Z

also when running on CUBIC you'll see the output

User-specified core binding strategy: linear
User-specified core binding type: set
User-specified core binding amount: 1

for each job submitted, which kinda nukes your terminal, but Mark said it's just because he was debugging the scheduler and he will take them out.

mattcieslak · 2021-08-09T19:31:28Z

cubids/cli.py

-                    call = build_validator_call(tmpdirname,
-                                                nifti_head,
-                                                subj_consist)
+                # TMPDIR isn't networked (available on login + exec nodes), so use bids_dir


will this be ok if bids_dir is in datalad?

Hmm, good point, I hadn't thought about needing to unlock stuff. I admit it's very hacky, and it almost made me think this isn't a good problem to submit to the grid, as it requires so many temporary files that need to be on a network drive (not $TMPDIR), but I'm not sure what the best solution would be. Maybe we could use a user's home directory, say, ~/.cubids, as the tmpdir?

Is there a way to get a tmpdir on the compute node and copy the files into that?

I think it'd be possible to move more of the logic into the grid job so scripts don't have to be written to a networked drive. But since it's impossible to connect the stdout of the grid job to the main process, the output will ultimately have to be written to some file, which needs to be on a networked drive unless all the jobs, including the main process, are running on the same exec node.
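A rough sketch of how the two ideas in this thread could combine, under assumptions and not as the PR's implementation: the job script and its output file live in a shared directory (for example the `~/.cubids` location floated above), while scratch work happens in a node-local directory created by `mktemp -d` inside the job. `write_job_script` and the `.out` naming are hypothetical, and a local temporary directory stands in for the networked one.

```python
import os
import tempfile
from pathlib import Path


def write_job_script(shared_dir, command, job_name):
    """Write a job script into shared_dir and return (script_path, output_path).

    Scratch work happens in a node-local directory made by `mktemp -d`;
    only the script and the final output file live in shared_dir.
    """
    shared_dir = Path(shared_dir)
    shared_dir.mkdir(parents=True, exist_ok=True)
    output_path = shared_dir / f"{job_name}.out"
    script = "\n".join([
        "#!/bin/bash",
        "scratch=$(mktemp -d)",
        "trap 'rm -rf \"$scratch\"' EXIT",  # clean up node-local scratch
        'cd "$scratch"',
        f'{command} > "{output_path}" 2>&1',  # only the result crosses the network
        "",
    ])
    # Hidden, uniquely named script file, like the prefix="." tempfile in the diff.
    fd, script_path = tempfile.mkstemp(dir=shared_dir, prefix=".", suffix=".sh")
    with os.fdopen(fd, "w") as handle:
        handle.write(script)
    os.chmod(script_path, 0o755)
    return Path(script_path), output_path


# Stand-in for a networked directory such as ~/.cubids (hypothetical location).
shared = tempfile.mkdtemp()
script_path, output_path = write_job_script(shared, "echo hello", "job0")
```

The main process would then submit `script_path` to the scheduler and read `output_path` once the job finishes; keeping these files outside bids_dir would also sidestep the datalad unlock question.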

mattcieslak · 2021-08-09T19:33:23Z

cubids/cli.py

                    if ret.returncode != 0:
                        logger.error("Errors returned "
-                                     "from validator run, parsing now")
+                                        "from validator run, parsing now")


this may break flake8

Is there a particular formatter (e.g. black or autopep8) you're using for the project?

mattcieslak · 2021-08-09T19:38:15Z

cubids/cli.py

+                    jids = []
+
+                    for batch in build_drmaa_batch(queue):
+                        tmp = tempfile.NamedTemporaryFile(delete=False, dir=opts.bids_dir, prefix=".", suffix=".sh")


Is this something a user can customize? Or will they need to customize it? Does this work out of the box on CUBIC?

Not sure what would need to be customized? It does work out of the box on CUBIC. LSF also supports DRMAA, but PMACS set it up in a weird way and sounded uninterested in changing that when I asked :(

Terf added 2 commits August 9, 2021 14:11

parallelize sequential flag with drmaa or multiple cores

1dc5369

oops left in a debug stmt

10e72d6

mattcieslak reviewed Aug 9, 2021

small edit

808220e
