Buggy behavior after failed health check recover #12694

mhkarimi1383 · 2024-03-06T06:12:27Z

Is there an existing issue for this?

I have searched the existing issues

Kong version (`$ kong version`)

3.5.0 (With KIC 2.12)

Current Behavior

Sometimes, after a health check fails and then recovers, Kong still responds with 503 for that service.

Expected Behavior

Responses recover when the health check recovers.

Steps To Reproduce

1. In a K8s environment
2. Bring up a project and create an Ingress and an UpstreamPolicy (with a TCP or HTTP health check; TCP preferred)
3. Configure the health check to fail for some time (you will get a 503 error)
4. Make the health check succeed again (you may still get a 503 error)

Anything else?

No response


StarlightIbuki · 2024-03-19T07:40:35Z

The behavior sounds expected to me. The health check status does not update immediately, and the passive health checker cannot predict if the next request will succeed. Could you elaborate?

mhkarimi1383 · 2024-03-19T07:58:45Z

@StarlightIbuki
Hi
After the interval passes it should recover, but it doesn't.
Also, clearing the Kong cache via the Admin API fixes the issue.
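
For readers who want to try the same workaround, here is a minimal sketch in Python. The thread does not say which call was actually used, so this rests on assumptions: that the Admin API is reachable at `http://localhost:8001` (Kong's default admin port) and that the purge is done with the Admin API's `DELETE /cache` endpoint. The helper names are made up for illustration.

```python
import urllib.request

# Assumption: the Admin API listens on Kong's default admin port on localhost.
ADMIN_URL = "http://localhost:8001"


def build_cache_purge_request(admin_url: str = ADMIN_URL) -> urllib.request.Request:
    """Build (but do not send) a DELETE /cache request against the Admin API."""
    return urllib.request.Request(f"{admin_url}/cache", method="DELETE")


def purge_cache(admin_url: str = ADMIN_URL) -> int:
    """Send the purge request and return the HTTP status code of the response."""
    request = build_cache_purge_request(admin_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Usage (requires a reachable Admin API, so it is left commented out):
# print(purge_cache())
```

In a Kubernetes deployment the Admin API is usually not exposed publicly, so the URL would have to be adjusted (for example, after a port-forward to the Kong pod).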

mhkarimi1383 · 2024-03-19T08:00:24Z

It happens during a rolling update of our K8s Deployment.

StarlightIbuki · 2024-03-19T09:17:14Z

@mhkarimi1383 Is the upstream failing in a predictable or controllable manner, so that you can be sure the reported status does not reflect the actual state?

mhkarimi1383 · 2024-03-19T09:37:28Z

@StarlightIbuki
Yes
I sent a request to that pod, and I monitor the health check endpoint using a blackbox exporter pointing to its K8s service.

StarlightIbuki · 2024-03-19T09:43:30Z

@mhkarimi1383 Could you share the config that you are using?

mhkarimi1383 · 2024-03-19T09:46:46Z

@StarlightIbuki

        upstream:
          healthchecks:
            active:
              healthy:
                interval: 5
                successes: 3
              type: tcp
              unhealthy:
                tcp_failures: 1
                interval: 5

Here is my KongIngress spec
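
As a rough sanity check on what this spec implies, the sketch below estimates how long recovery should take once the backend is reachable again. It assumes the usual reading of Kong's active health check fields: an unhealthy target is probed every `unhealthy.interval` seconds and is marked healthy again after `healthy.successes` consecutive successful probes. It ignores probe timeouts and timer alignment, so treat the result as an approximate lower bound. The function name is hypothetical.

```python
def expected_recovery_seconds(successes: int, unhealthy_interval: float) -> float:
    """Approximate time for an unhealthy target to be marked healthy again.

    Assumes one probe per `unhealthy_interval` seconds and that every probe
    after the backend recovers succeeds.
    """
    return successes * unhealthy_interval


# Values copied from the spec above.
active = {
    "healthy": {"interval": 5, "successes": 3},
    "unhealthy": {"tcp_failures": 1, "interval": 5},
}

window = expected_recovery_seconds(
    active["healthy"]["successes"],
    active["unhealthy"]["interval"],
)
print(f"expected recovery after about {window} seconds")
```

Under these assumptions the target should recover in roughly 15 seconds, so a 503 that persists for minutes would not be explained by the probe timing alone.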

StarlightIbuki · 2024-03-19T09:52:08Z

5s seems a short interval. How long do you wait before inspecting the status?

mhkarimi1383 · 2024-03-19T09:56:43Z

@StarlightIbuki
About 5 minutes

StarlightIbuki · 2024-03-20T06:18:54Z

I still do not really understand the reproduction steps. When the health checker reports green and you get 503, what real status are you expecting?

mhkarimi1383 · 2024-03-20T06:21:06Z

> I still do not really understand the reproduction steps. When the health checker reports green and you get 503, what real status are you expecting?

Yes. Kong reports the project as unhealthy although it is healthy; clearing the Kong cache fixes the problem.

StarlightIbuki · 2024-03-20T06:26:12Z

> > I still do not really understand the reproduction steps. When the health checker reports green and you get 503, what real status are you expecting?
>
> Yes, Kong said the project is unhealthy but it is healthy, clear Kong cache fixes the problem

Sorry, but let me confirm that my understanding is correct: in step 4, we configure the upstream to work again, and we then still observe the health checker reporting an unhealthy condition?

mhkarimi1383 · 2024-03-20T06:27:18Z

@StarlightIbuki Yes

github-actions · 2024-04-04T01:48:27Z

This issue is marked as stale because it has been open for 14 days with no activity.

ADD-SP · 2024-05-27T05:59:17Z

I have reproduced this issue locally using the master branch. @mhkarimi1383, thanks for your report.

Internal ticket for tracking: KAG-4588

_format_version: "3.0"

_transform: true

services:
- name: service_1
  host: upstream_1
  routes:
   - name: route_1
     paths:
     - /1



upstreams:
- name: upstream_1
  targets:
  - target: localhost:80
  healthchecks:
    active:
      timeout: 10
      healthy:
        interval: 5
      unhealthy:
        http_statuses: [500]
        http_failures: 1
        interval: 5

mhkarimi1383 · 2024-05-27T06:02:08Z

Thanks

Sometimes clearing the cache does not work, and we have to wait (for example, 20 minutes) or restart to fix the problem.

chronolaw added the area/kubernetes label on Mar 6, 2024

StarlightIbuki added the pending author feedback label on Mar 19, 2024

github-actions bot added the stale label on Apr 4, 2024

StarlightIbuki removed the pending author feedback and stale labels on Apr 7, 2024

ADD-SP added the bug label on May 27, 2024
