Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

RangeSet: nD folding optimization (#485) #486

Merged
merged 1 commit into from
Nov 16, 2022

Conversation

thiell
Copy link
Collaborator

@thiell thiell commented Aug 31, 2022

Optimize multi-dimensional range set folding algorithm:

  • get rid of heuristic when pre-expanding vectors for folding to improve
    reproducibility
  • when merging vectors, perform O(n) "folding" passes as long as changes
    are detected to reduce the problem size, before attempting a final
    O(n^2) pass
  • avoid casting string into integer when sorting (as RangeSet is now
    using strings natively)

Closes #485

Optimize multi-dimensional range set folding algorithm:

* get rid of heuristic when pre-expanding vectors for folding to improve
reproducibility
* when merging vectors, perform O(n) "folding" passes as long as changes
are detected to reduce the problem size, before attempting a final
O(n^2) pass
* avoid casting string into integer when sorting (as RangeSet is now
using strings natively)

Closes cea-hpc#485
@thiell
Copy link
Collaborator Author

thiell commented Aug 31, 2022

$ time (cluset -e x[2000-2073]c[0-7]s[0-7]b[0-1] | python3 -m cProfile -s cumulative lib/ClusterShell/CLI/Nodeset.py  -f)
x[2000-2073]c[0-7]s[0-7]b[0-1]
         4773441 function calls (4276373 primitive calls) in 5.067 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    107/1    0.009    0.000    5.169    5.169 {built-in method exec}
        1    0.000    0.000    5.169    5.169 Nodeset.py:27(<module>)
        1    0.001    0.001    5.046    5.046 Nodeset.py:334(main)
        1    0.000    0.000    5.045    5.045 Nodeset.py:155(nodeset)
        1    0.285    0.285    5.042    5.042 Nodeset.py:44(process_stdin)
    18950    0.031    0.000    3.148    0.000 RangeSet.py:884(inner)
590373/94356    0.058    0.000    2.538    0.000 {built-in method len}
     9473    0.008    0.000    2.537    0.000 RangeSet.py:1147(_fold)
        3    0.000    0.000    2.527    0.842 NodeSet.py:238(__len__)
        3    0.000    0.000    2.527    0.842 RangeSet.py:926(__len__)
        1    0.000    0.000    2.459    2.459 RangeSet.py:1180(_fold_multivariate)
    18947    0.023    0.000    2.104    0.000 NodeSet.py:1508(update)
     9474    0.041    0.000    1.873    0.000 NodeSet.py:1202(__init__)
        1    0.199    0.199    1.827    1.827 RangeSet.py:1194(_fold_multivariate_merge)
    18947    0.031    0.000    1.418    0.000 NodeSet.py:789(parse)
     9472    0.064    0.000    1.375    0.000 NodeSet.py:810(parse_string)
        6    0.000    0.000    1.074    0.179 RangeSet.py:1126(_sort)
        6    0.083    0.014    1.074    0.179 {method 'sort' of 'list' objects}
    28419    0.045    0.000    1.067    0.000 NodeSet.py:539(update)
    37889    0.085    0.000    1.043    0.000 NodeSet.py:490(_add)
    14876    0.057    0.000    0.991    0.000 RangeSet.py:1128(rgveckeyfunc)
    74380    0.128    0.000    0.899    0.000 RangeSet.py:1136(<genexpr>)
   148807    0.231    0.000    0.826    0.000 RangeSet.py:533(copy)
    18944    0.047    0.000    0.770    0.000 NodeSet.py:996(_scan_string)
   119008    0.154    0.000    0.765    0.000 RangeSet.py:478(__getitem__)
   224583    0.507    0.000    0.737    0.000 RangeSet.py:106(__init__)
   156896    0.243    0.000    0.705    0.000 RangeSet.py:262(_sorted)
        1    0.015    0.015    0.632    0.632 RangeSet.py:1188(_fold_multivariate_expand)
     9472    0.195    0.000    0.608    0.000 NodeSet.py:962(_scan_string_single)
    75776    0.164    0.000    0.603    0.000 RangeSet.py:194(fromone)
    18946    0.042    0.000    0.581    0.000 RangeSet.py:898(copy)
    18946    0.020    0.000    0.500    0.000 RangeSet.py:904(<listcomp>)```

@thiell thiell added this to the 1.9 milestone Sep 9, 2022
@thiell thiell self-assigned this Sep 9, 2022
@mattaezell
Copy link
Contributor

We've been running with this patch for more than 2 months. Are there any blockers to it being merged?

@thiell thiell merged commit ec14536 into cea-hpc:master Nov 16, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance on large many-dimensional nodesets
2 participants