
Propagate Kubernetes custom metadata annotations to sub-services #3767

Merged 3 commits on Jul 19, 2024

Conversation

fozziethebeat (Contributor)
This forwards Kubernetes annotation values from ~/.sky/config to sub-services such as the load balancer. This lets users set custom annotations, for example on AWS, to make load balancers internet-facing.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

Manually updated my ~/.sky/config to have the following contents:

kubernetes:
  ports: loadbalancer
  custom_metadata:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
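The propagation itself amounts to overlaying the user's custom_metadata annotations onto the Service manifest before it is created. A minimal sketch of that merge follows; the function and variable names here are illustrative, not SkyPilot's actual API (the real logic lives in sky/provision/kubernetes/network_utils.py):

```python
# Hypothetical sketch: merge user-specified custom_metadata annotations
# from ~/.sky/config into a Kubernetes Service manifest before creation.
# Names are illustrative and do not mirror SkyPilot's internal functions.

def merge_custom_annotations(service_manifest: dict,
                             custom_metadata: dict) -> dict:
    """Overlay user annotations onto the manifest's metadata.annotations."""
    annotations = custom_metadata.get('annotations', {})
    metadata = service_manifest.setdefault('metadata', {})
    metadata.setdefault('annotations', {}).update(annotations)
    return service_manifest

# A LoadBalancer Service as SkyPilot might template it (simplified).
manifest = {
    'apiVersion': 'v1',
    'kind': 'Service',
    'metadata': {'name': 'skypilot-lb'},
    'spec': {'type': 'LoadBalancer', 'ports': [{'port': 8000}]},
}
# The annotations block from the ~/.sky/config example above.
custom_metadata = {
    'annotations': {
        'service.beta.kubernetes.io/aws-load-balancer-scheme':
            'internet-facing',
    }
}
merged = merge_custom_annotations(manifest, custom_metadata)
```

Using update() on an existing (or freshly defaulted) annotations dict means any annotations SkyPilot sets itself are preserved unless the user explicitly overrides the same key.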

And ran sky launch with this config:

service:
  readiness_probe: /v1/models

resources:
  # Can change to use more via `--gpus A100:N`.  N can be 1 to 8.
  accelerators: A100:2
  cpus: 22
  memory: 500
  # Note: Big models need LOTS of disk space, especially if saved in float32.
  # So specify a lot of disk.
  disk_size: 400
  # Keep fixed.
  cloud: kubernetes
  ports: 8000
  image_id: docker:vllm/vllm-openai:latest

envs:
  # Specify the training config via `--env MODEL=collinear-ai/model-repo-name`
  MODEL: ""

setup: |
  conda deactivate
  python3 -c "import huggingface_hub; huggingface_hub.login('${HUGGINGFACE_TOKEN}')"

run: |
  conda deactivate
  python3 -u -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --port 8000 \
    --tensor-parallel-size $SKYPILOT_NUM_GPUS_PER_NODE \
    --trust-remote-code \
    --model $MODEL

I verified that my EKS cluster in AWS launched a standard network load balancer that was internet-facing.
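One way to check the verification step programmatically is to inspect the created Service's annotations, e.g. from `kubectl get svc <name> -o json` output. This is a hedged sketch of such a check; the service name and helper are hypothetical:

```python
import json

# Annotation key used in the ~/.sky/config example above.
SCHEME_KEY = 'service.beta.kubernetes.io/aws-load-balancer-scheme'

def is_internet_facing(svc_json: str) -> bool:
    """Given `kubectl get svc <name> -o json` output, check that the
    AWS load-balancer scheme annotation was propagated to the Service."""
    svc = json.loads(svc_json)
    annotations = svc.get('metadata', {}).get('annotations', {})
    return annotations.get(SCHEME_KEY) == 'internet-facing'

# Sample output standing in for a real kubectl call (illustrative only).
sample = json.dumps({
    'metadata': {'annotations': {SCHEME_KEY: 'internet-facing'}},
    'spec': {'type': 'LoadBalancer'},
})
```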

@romilbhardwaj (Collaborator) left a comment:

Thanks @fozziethebeat! Left a quick comment about doing the same for labels. Otherwise LGTM!

Review comments on sky/provision/kubernetes/network_utils.py (resolved).
@romilbhardwaj (Collaborator) left a comment:

This is awesome, thanks for the fix @fozziethebeat!

@romilbhardwaj added this pull request to the merge queue on Jul 19, 2024
Merged via the queue into skypilot-org:master with commit aea7322 on Jul 19, 2024
20 checks passed
@fozziethebeat deleted the k8s-loadbalancers branch on July 19, 2024