Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support node selector in the API server #2523

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

blublinsky
Copy link
Contributor

Why are these changes needed?

Current version of the API server do not supports node selector, which are very useful for Ray deployment

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

if len(template.NodeSelector) != 0 {
t.Errorf("failed to convert config map, expected no node selector, got %d", len(template.NodeSelector))
}
if len(template.NodeSelector) != 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@kevin85421
Copy link
Member

can you rebase with the master branch? I just fixed a CI issue.

@blublinsky
Copy link
Contributor Author

can you rebase with the master branch? I just fixed a CI issue.

done

Copy link
Contributor

@tedhtchang tedhtchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/LGTM
Created the a RayCluster and verified pods are deployed to the ray-test-worker2

cd apiserver && KIND_CLUSTER_NAME=ray-test make cluster
helm install kuberay-operator kuberay/kuberay-operator --version 1.2.2
make run

# Create a template with nodeselector
curl -X 'POST'   'http://localhost:8888/apis/v1/namespaces/default/compute_templates'   -H 'accept: application/json'   -H 'Content-Type: application/json'   -d '{
  "name": "default-template",
  "namespace": "default",
  "cpu": "2",
  "memory": "2",
  "nodeSelector": {
    "kubernetes.io/hostname": "ray-test-worker2"
  }
}'

curl -X 'POST' \
  'http://localhost:8888/apis/v1/namespaces/default/clusters' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "test-cluster",
  "namespace": "default",
  "user": "tedchang",
  "version": "2.9.0",
  "environment": "DEV",
  "clusterSpec": {
    "headGroupSpec": {
      "computeTemplate": "default-template",
      "image": "rayproject/ray:2.9.0",
      "serviceType": "NodePort",
      "rayStartParams": {"dashboard-host": "0.0.0.0"}
    },
    "workerGroupSpec": [
      {
        "groupName": "small-wg",
        "computeTemplate": "default-template",
        "image": "rayproject/ray:2.9.0",
        "replicas": 1,
        "minReplicas": 0,
        "maxReplicas": 1,
        "rayStartParams": {"metrics-export-port": "8080"}
      }
    ]
  }
}'

Should see something:

# kubectl get po -owide
NAME                                 READY   STATUS    RESTARTS   AGE     IP           NODE               NOMINATED NODE   READINESS GATES
kuberay-operator-79bb76fc8-lp9bv     1/1     Running   0          4m1s    10.244.3.2   ray-test-worker2   <none>           <none>
test-cluster-head-chg9t              1/1     Running   0          2m12s   10.244.3.3   ray-test-worker2   <none>           <none>
test-cluster-small-wg-worker-n74s7   1/1     Running   0          2m12s   10.244.3.4   ray-test-worker2   <none>           <none>

@blublinsky
Copy link
Contributor Author

@kevin85421 can you, please, merge?

@blublinsky
Copy link
Contributor Author

@andrewsykim, @kevin85421 any progress on merging this?

@blublinsky
Copy link
Contributor Author

@kevin85421 what seems to be a holdup? also there is another #2556 that is the same as mine

@blublinsky
Copy link
Contributor Author

@andrewsykim, @kevin85421 it has been 2 weeks with no progress

@blublinsky
Copy link
Contributor Author

blublinsky commented Dec 10, 2024

@kevin85421, @andrewsykim any chance we can close this before the end of the year?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants