fix(discovery/kubernetes): preserve cached endpoints when payload is empty #13299
Open
andrefun wants to merge 2 commits into apache:master from
Conversation
fix(discovery/kubernetes): preserve cached endpoints when payload is empty
test: add unit test for empty endpoint guard
Type of change
What this PR does / why we need it
The Kubernetes service discovery currently writes an empty endpoint payload
to the shared dict whenever the inbound `Endpoints`/`EndpointSlice` event
contains no ready addresses. Subsequent requests against that service then
receive `no valid upstream node: nil` (HTTP 503): a route is matched and an
upstream is selected, but `dis.nodes(service_name)` returns nil because the
dict-stored endpoint object has no entry for the requested port.
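
For illustration, a minimal sketch of the failing lookup. The payload shape and
the `lookup` helper are simplified assumptions, not the exact internals of the
discovery module:

```lua
-- Hypothetical, simplified shape of the dict-stored endpoint payload.
-- A healthy write for a service looks roughly like:
local healthy = {
    ["80"] = { { host = "10.0.0.12", port = 8080, weight = 50 } },
}

-- After an "all not ready" event the stored payload becomes empty:
local empty = {}

-- dis.nodes(service_name) boils down to a per-port lookup on that payload;
-- with the empty payload every port resolves to nil, which surfaces as
-- "no valid upstream node: nil" (HTTP 503).
local function lookup(payload, port)
    return payload[tostring(port)]
end

assert(lookup(healthy, 80) ~= nil)
assert(lookup(empty, 80) == nil)
```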
This is the failure mode reported in #12923: a small burst of 503s on a
single APISIX pod during a canary rollout. The error log only shows
`init.lua:541: handle_upstream(): failed to set upstream: no valid upstream node: nil`;
there is no other warn / info / error before it.

Why this happens
`on_endpoint_modified` (and `on_endpoint_slices_modified`) only considers
`subset.addresses` (i.e. ready pods); `notReadyAddresses` is ignored. While
that is the correct routing decision, the write to the discovery dict still
replaces the previous endpoint payload, even when the new payload is empty.
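
A simplified sketch of that pre-patch flow (function and variable names
approximate `core.lua`; this is not the exact implementation):

```lua
-- Build the per-port node buffer from an Endpoints event; only ready pods
-- (subset.addresses) are considered, subset.notReadyAddresses is ignored.
local function build_endpoint_buffer(endpoint)
    local buffer = {}
    for _, subset in ipairs(endpoint.subsets or {}) do
        if subset.addresses then
            for _, port in ipairs(subset.ports or {}) do
                local nodes = {}
                for _, addr in ipairs(subset.addresses) do
                    nodes[#nodes + 1] = { host = addr.ip, port = port.port, weight = 50 }
                end
                buffer[tostring(port.port)] = nodes
            end
        end
    end
    return buffer
end

-- An event whose only subset carries notReadyAddresses yields an empty buffer:
assert(next(build_endpoint_buffer({
    subsets = { { notReadyAddresses = { { ip = "10.0.0.12" } },
                  ports = { { port = 8080 } } } },
})) == nil)

-- Pre-patch, the dict write happens unconditionally, so this empty buffer
-- overwrites the previously stored payload.
```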
Several legitimate transient k8s states can produce an "all not ready"
payload at one watch tick:
- the watch stream skipped
Once the empty payload is written, every request to that service on this
worker returns 503 until the next non-empty event arrives. The window can be
arbitrarily long (we observed ~13s in production); the dict does not
self-heal because nothing forces a re-list, and
`endpoint_lrucache` keys on the dict-stored version, so the empty result is cached.
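
A rough sketch of why the empty result sticks. This is simplified: the real
code uses an `ngx.shared` dict plus `core.lrucache`, while the sketch below
uses plain Lua tables to show the version-keyed caching:

```lua
-- Plain-table stand-ins for the shared dict and the per-worker lrucache.
local cache = {}

local function nodes(endpoint_dict, service_name, port)
    -- the cache key includes the version stored next to the payload, so the
    -- cached value is only recomputed when the dict payload itself changes
    local version = endpoint_dict[service_name .. "#version"]
    local key = service_name .. ":" .. port .. "#" .. tostring(version)
    if cache[key] ~= nil then
        return cache[key].nodes
    end
    local payload = endpoint_dict[service_name]   -- JSON-decoded in the real code
    cache[key] = { nodes = payload and payload[tostring(port)] }
    return cache[key].nodes
end

-- Until a non-empty event bumps the stored version, every request keeps
-- resolving to the cached empty result, so the 503s persist.
```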
What this PR changes
In `apisix/discovery/kubernetes/core.lua`:

- `on_endpoint_modified`: if the computed `endpoint_buffer` is empty
  (no port has any node), skip `update_endpoint_dict` and emit a warn log.
  The previous payload is preserved.
- `on_endpoint_slices_modified`: same guard, applied to the merged
  `cached_endpoints` view across all cached slices for the service.

Both callbacks now emit a warn-level diagnostic when an inbound event has
no ready addresses (independent of whether the guard fires), so operators
can see the empty events even when the guard prevents the 503s. A sketch of
the guard follows below.
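
A minimal sketch of the guard. The surrounding code is simplified and the
argument names are assumptions, not the exact `core.lua` signature:

```lua
local core = require("apisix.core")

-- "empty" means no port carries a single node
local function endpoint_buffer_is_empty(buffer)
    for _, nodes in pairs(buffer) do
        if #nodes > 0 then
            return false
        end
    end
    return true
end

local function on_endpoint_modified(handle, endpoint, endpoint_buffer, operate)
    if endpoint_buffer_is_empty(endpoint_buffer) then
        core.log.warn("endpoints event with no ready addresses for ",
                      endpoint.metadata.namespace, "/", endpoint.metadata.name,
                      ", keeping previously cached endpoints")
        -- the operate == "list" accounting still marks this key as "seen"
        -- so post_list's dirty-data cleanup does not delete the payload
        -- we intentionally kept
        return
    end
    -- non-empty payload: overwrite the dict as before
    -- update_endpoint_dict(handle, endpoint, endpoint_buffer)
end
```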
The legitimate "service truly went away" path is unchanged; it is handled
by `on_endpoint_deleted` / `on_endpoint_slices_deleted`, which delete the
key outright. The `operate == "list"` accounting in the guarded path still
marks the key as "seen" so that `post_list`'s dirty-data cleanup does not
remove a key we intentionally left in place.
What about truly-dead nodes still in the dict?
If every pod really is gone, the cached payload now points at IPs that fail
to connect. APISIX active health checks (default 5s interval, 3 failures)
detect and evict them within ~15s, after which the upstream balancer
naturally returns no candidates and the request fails. Crucially, the
failure mode degrades from "instant 503 with no upstream attempted" to
"~15s of routing to dead pods, then upstream balancer says no node" — which
is the same end state, but exposed via the normal upstream-down path
(observable via active-check logs, retry counters, etc.) instead of the
silent discovery-state failure.
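
For reference, an illustrative upstream `checks` block matching the timing
above, written as a Lua table. The values mirror the "5s interval, 3 failures"
figure from this description; the exact field set should be checked against
the upstream schema of your APISIX version:

```lua
-- Example active health-check settings: with a 5s probe interval and a
-- 3-failure threshold, a dead pod is marked unhealthy after roughly 15s.
local checks = {
    active = {
        type = "http",
        http_path = "/healthz",          -- example probe path, adjust to your service
        healthy   = { interval = 5, successes = 1 },
        unhealthy = { interval = 5, http_failures = 3 },
    },
}

return checks
```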
Reproduction
A minimal local reproduction (no k8s required) is included as a unit test
in `t/discovery/kubernetes_empty_endpoint.t`. The unit test fails on master
without the patch (the dict gets overwritten to `{}` after the empty events)
and passes with it.
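
The core of what the test exercises, sketched in plain Lua. The actual `.t`
file wraps this in the usual Test::Nginx harness; the event tables and the
pseudo-assertions here are illustrative, not the literal test content:

```lua
-- 1. feed an Endpoints event with one ready address  -> dict holds the nodes
-- 2. feed the same service with only notReadyAddresses -> without the patch
--    the dict is overwritten to {}, with the patch it still holds step 1
local ready_event = {
    metadata = { namespace = "default", name = "demo" },
    subsets = {
        { addresses = { { ip = "10.0.0.12" } }, ports = { { port = 8080 } } },
    },
}

local not_ready_event = {
    metadata = { namespace = "default", name = "demo" },
    subsets = {
        { notReadyAddresses = { { ip = "10.0.0.12" } }, ports = { { port = 8080 } } },
    },
}

-- pseudo-assertions:
-- on_endpoint_modified(handle, ready_event)
-- assert(dict:get("default/demo") ~= "{}")
-- on_endpoint_modified(handle, not_ready_event)
-- assert(dict:get("default/demo") ~= "{}")  -- fails on master, passes with the patch
```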
End-to-end verification on apisix 3.10.0
The same logic was back-ported to apisix 3.10.0 (which still has the
single-file `discovery/kubernetes/init.lua`) and exercised against the
`apache/apisix:3.10.0-debian` image, using a serverless plugin to
synthesize `on_endpoint_modified` events:

| synthesized event             | without patch | with patch |
| ----------------------------- | ------------- | ---------- |
| `subsets[0].addresses = [ip]` | 502           | 502        |
| `notReadyAddresses = [ip]`    | 503           | 502        |
| `subsets = []`                | 503           | 502        |

(502 here means `set_upstream` succeeded and the request reached the
load-balancer phase; the fake upstream is intentionally unreachable so
the test isolates the discovery layer from real network success. The
relevant assertion is "no longer 503".)
When the guard fires, the warn log is emitted.
Compatibility
No public API change, no schema change, no metric / log format change other
than the two new warn lines. The dict layout is identical. No operator could
have silently relied on "empty endpoint events clear the dict": that behavior
was a bug, and its only observable consequence was 503s.
Checklist