Use longer exec probe timeouts for Head pods #2353

Merged: 1 commit, Sep 10, 2024

Conversation

andrewsykim
Collaborator

Why are these changes needed?

This is a follow-up to #2264 and #2265

For the Head pod, the exec probe timeout should be higher than 2 seconds, because the probe runs two "wget" commands in sequence, each with its own 2-second timeout (-T 2):

wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep success &&
  wget -T 2 -q -O- http://localhost:8265/api/gcs_healthz | grep success

This PR updates the probe timeout for Head pods to 5 seconds. See #2265 (comment) for more details.
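
To make the timing argument concrete, here is a minimal sketch of the worst-case arithmetic. The function name `worst_case_probe_seconds` is hypothetical and not part of KubeRay. It assumes each command finishes within roughly its own `-T` value, which wget does not strictly guarantee, since `-T` sets network timeouts rather than a wall-clock limit.

```shell
#!/bin/sh
# Hypothetical helper: rough upper bound, in seconds, for a chain of
# commands that run one after another (as with "&&").
# Assumption: each command is bounded by the timeout passed in for it.
# Usage: worst_case_probe_seconds TIMEOUT [TIMEOUT ...]
worst_case_probe_seconds() {
  total=0
  for t in "$@"; do
    total=$((total + t))
  done
  echo "$total"
}

# Two wget calls, each with -T 2, as in the Head pod exec probe.
worst_case_probe_seconds 2 2   # prints 4
```

Under that assumption, both commands can together take up to about 4 seconds when the first one succeeds slowly, which exceeds a 2-second probe timeout but fits within 5 seconds.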

Related issue number

#2264

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@andrewsykim changed the title from "User longer exec probe timeouts for Head pods" to "Use longer exec probe timeouts for Head pods" on Sep 5, 2024
@andrewsykim
Collaborator Author

I did more testing and I'm not sure that increasing the probe timeout is actually helping. Let's hold off on this PR for now. I will do more testing and report back.

@andrewsykim
Collaborator Author

I opened a separate PR to use HTTP probes which resolves some of the issues I'm encountering with exec probes: #2360

However, HTTP probes are limited to checking a single endpoint per probe.
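
For contrast, an exec probe can cover several endpoints in one check. The sketch below illustrates that idea only; `fetch_body` and `check_all_endpoints` are hypothetical names, not part of KubeRay, and the wget flags mirror the command in the PR description.

```shell
#!/bin/sh
# Hypothetical wrapper around the wget call used by the exec probe.
fetch_body() {
  wget -T 2 -q -O- "$1"
}

# Hypothetical helper: succeed only if every URL returns a body
# containing "success". Stops at the first failing endpoint, like "&&".
check_all_endpoints() {
  for url in "$@"; do
    fetch_body "$url" | grep -q success || return 1
  done
  return 0
}

# Example invocation with the two endpoints from the PR description:
# check_all_endpoints \
#   http://localhost:52365/api/local_raylet_healthz \
#   http://localhost:8265/api/gcs_healthz
```

Because the endpoints are checked one after another, the total run time grows with the number of endpoints, which is the reason the exec probe needs a longer timeout than a single request would.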

@kevin85421 merged commit 6cbb5df into ray-project:master on Sep 10, 2024
27 checks passed