[k8s] Add sky status flag to query global Kubernetes status
#4040
Updated to query job status from all running controllers:
Still needs cleanup and edge-case handling.
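As a rough illustration of querying every running controller and merging the results, here is a minimal sketch. It is not the code in this PR: collect_jobs, QueueFn, and the controller_pod key are hypothetical names, and the per-pod query is passed in as a callable so the sketch makes no assumption about how a pod is actually reached. It also shows one possible edge case, a controller that cannot be queried.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical per-pod query: (pod_name, context) -> list of job records.
QueueFn = Callable[[str, Optional[str]], List[Dict[str, Any]]]


def collect_jobs(
    pod_names: List[str],
    fetch_queue: QueueFn,
    context: Optional[str] = None,
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Merges job records from several controller pods.

    Returns (jobs, unreachable_pods). A pod whose query raises is recorded
    and skipped, so one failing controller does not hide the others.
    """
    jobs: List[Dict[str, Any]] = []
    unreachable: List[str] = []
    for pod_name in pod_names:
        try:
            pod_jobs = fetch_queue(pod_name, context)
        except Exception:  # Broad on purpose: this is a best-effort view.
            unreachable.append(pod_name)
            continue
        for job in pod_jobs:
            # Tag each record with its source pod; job IDs may repeat
            # across controllers.
            jobs.append({**job, 'controller_pod': pod_name})
    return jobs, unreachable
```

Returning the unreachable pods separately lets the caller print a warning next to the table instead of failing the whole status command.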
…o k8s_global_status # Conflicts: # sky/data/storage_utils.py
UX LGTM; quick nits:
I tried launching a managed job on the same shared k8s cluster, and the job loops forever in starting. Controller logs:
Fixed UX comments and added error handling.
LGTM, thanks @romilbhardwaj.
sky/jobs/core.py
def queue_kubernetes(pod_name: str,
                     context: Optional[str] = None,
                     skip_finished: bool = False) -> List[Dict[str, Any]]:
    """Gets the jobs queue from a specific controller pod.
This naming is surprising. From the name I thought it's "gets the queue info for an entire k8s cluster". Maybe rename?
Updated to queue_from_kubernetes_pod
Adds a --kubernetes flag to sky status to show the global state of the Kubernetes cluster, including SkyPilot clusters created by other users. This helps users see the current state of the Kubernetes cluster.
Example:
TODO:
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
conda deactivate; bash -i tests/backward_compatibility_tests.sh