HNP topology not found in hetero node scenario #803
In case it helps diagnose these new issues with topology info: there are two "elements" unpacked, one for the "batch" node (which has 1 slot on it) and one for the worker compute node, and their sig fields are:
@acolinisi @jjhursey I think my question over whether I think I have an answer, but will investigate when I wake up fully in a bit. Just didn't want to lose this track.
All I can offer is that this is definitely a regression introduced after 2020-12-02, because there are no errors in that older version; I just rebuilt and checked:
Working versions are:
I'm unable to replicate the cited error message, even when I force the topologies to be hetero. I've made an attempt to do a better job of matching topos with nodes in #808, but will have to wait and see if you find that helped.
Fixed by #808
Assuming I just built the right thing, we're back to the error without :NOLOCAL:
With -n 1 the first two errors are printed but there is no timeout, exits immediately:
All good with :NOLOCAL: