VYPR
Moderate severity · NVD Advisory · Published Aug 16, 2024 · Updated Aug 16, 2024

Cilium vulnerable to information leakage via incorrect ReferenceGrant update logic in Gateway API

CVE-2024-42486

Description

Cilium is a networking, observability, and security solution with an eBPF-based dataplane. In versions on the 1.15.x branch prior to 1.15.8 and the 1.16.x branch prior to 1.16.1, ReferenceGrant changes are not correctly propagated in Cilium's Gateway API controller, which could lead to Gateway resources being able to access secrets for longer than intended, or to Routes having the ability to forward traffic to backends in other namespaces for longer than intended. This issue has been patched in Cilium v1.15.8 and v1.16.1. As a workaround, any modification of a related Gateway/HTTPRoute/GRPCRoute/TCPRoute resource (for example, adding any label to any of these resources) will trigger a reconciliation of ReferenceGrants on an affected cluster.
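
The workaround amounts to making any change to a related resource. One way to do that is a JSON merge patch that sets a label. The following is a minimal sketch that only builds the patch body; the label key `example.com/reconcile-nudge` and the helper name `labelMergePatch` are hypothetical and are not taken from the advisory or from Cilium.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// labelMergePatch builds a JSON merge patch that sets one label on a
// Kubernetes object. The helper is illustrative, not a Cilium API.
func labelMergePatch(key, value string) ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"labels": map[string]string{key: value},
		},
	}
	return json.Marshal(patch)
}

func main() {
	// Hypothetical label key; any label change counts as a modification.
	p, err := labelMergePatch("example.com/reconcile-nudge", "1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(p))
}
```

The resulting body could then be applied to an affected Gateway or Route, for instance with `kubectl patch --type=merge -p`. The patch only touches `metadata.labels`, so the resource spec is left unchanged.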

Affected packages

Versions sourced from the GitHub Security Advisory.

Package                        Affected versions      Patched versions
github.com/cilium/cilium (Go)  >= 1.16.0, < 1.16.1    1.16.1
github.com/cilium/cilium (Go)  >= 1.15.0, < 1.15.8    1.15.8
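
The affected ranges listed above can be turned into a simple check. This is a rough sketch, assuming plain `MAJOR.MINOR.PATCH` version strings with an optional `v` prefix; the function names `parseVersion` and `isAffected` are made up for illustration, and pre-release suffixes such as `-rc.1` are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "v1.15.8" or "1.15.8" into major, minor, patch.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// isAffected reports whether v falls in one of the advisory's ranges:
// >= 1.15.0, < 1.15.8  or  >= 1.16.0, < 1.16.1.
func isAffected(v string) (bool, error) {
	ver, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	if ver[0] != 1 {
		return false, nil
	}
	switch ver[1] {
	case 15:
		return ver[2] < 8, nil
	case 16:
		return ver[2] < 1, nil
	}
	return false, nil
}

func main() {
	for _, v := range []string{"v1.15.7", "v1.15.8", "v1.16.0", "v1.16.1"} {
		affected, err := isAffected(v)
		if err != nil {
			fmt.Println(v, "error:", err)
			continue
		}
		fmt.Printf("%s affected=%t\n", v, affected)
	}
}
```

Versions outside the 1.15.x and 1.16.x branches return false here only because the advisory does not list them; the sketch makes no claim about their actual status.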

Affected products

1

Patches

3
92c110e58a7b

gateway-api: Enqueue gateway for Reference Grant changes

https://github.com/cilium/cilium · Tam Mach · Jul 26, 2024 · via ghsa
2 files changed · +36 -0
  • backporter-state.json · +1 -0 · added
    @@ -0,0 +1 @@
    +{"Process":"oss","RepoDir":"/workspace/cilium/cilium","TargetBranch":"v1.15","Upstream":"cilium/cilium","Downstream":"cilium/cilium","StateFilePath":"/workspace/cilium/cilium/backporter-state.json","Status":1,"Candidates":[{"Number":33666,"Title":"linux/node: reallocate nodeID upon conflict","Body":"NodeIDs and IPsec state suffer from a lack of reconciliation. If the agent misses a node deletion event, stale state is never cleaned up. This is somewhat known (#29822, #26298), but was generally considered not of huge consequence. Stale XFRM states/policies can accumulate, but will not match traffic - the effect is mostly slowing down processing in agent and kernel. nodeIDs can eventually run out if too many node deletions are missed, but the rate at which these are missed is expected to be low.\r\n\r\nUnfortunately, there are large clusters with high node churn in which rare events become common, and hence the following sequence of events is probable enough to actually observe:\r\n\r\n1. a node is deleted while the agent is down (e.g. due to being upgraded)\r\n2. a new node joins the cluster and is allocated IPs which overlap with previously used IPs.\r\n\r\nIf this occurs, the agent can have a partioned view of what nodeID this node should have - in the BPF map, the k8s internal IP will map to a different nodeID than the cilium internal ip. This breaks IPsec traffic towards this node, as BPF applies a mark based on the BPF map nodeID of the tunnnel endpoint, but the xfrm states expect to match the mark based on the cilium internal IP. The result is traffic which doesn't match any xfrm state/policy, falling back to the catch all block policy.\r\n\r\nTo work around this, we enforce that all IPs of a node get the same nodeID - even if an IP was already pointing to an existing nodeID. Since this node update is more current than whatever state we had held, it seems more correct to ensure all IPs point to the same nodeID than avoiding a BPF map write. 
We do so by forcing the allocation of a new nodeID.\r\n\r\nIt is possible that unmapping the IPs but not deallocating the nodeID can leak the nodeID. Deallocating unconditionally would be wrong, as the stale mapping might point to a nodeID which represents a different, alive node.\r\n\r\n```release-note\r\nThe cilium agent will now recover from stale nodeID mappings which could occur in clusters with high node churn, possibly manifesting itself in dropped IPsec traffic.\r\n```\r\n","State":"closed","Labels":["kind/bug","release-note/bug","ready-to-merge","affects/v1.13","affects/v1.14","backport-pending/1.14","feature/ipsec","needs-backport/1.15","backport-done/1.16"],"Author":"bimmlerd","MergedAt":"2024-07-23T17:27:54Z","MergeCommitSHA":"50b9a31dd25e5498079528bb534059106b5bbb51","HTMLURL":"https://github.com/cilium/cilium/pull/33666","Commits":[{"Author":{"Name":"David Bimmler","Email":"david.bimmler@isovalent.com","Date":"2024-06-06T08:49:57Z"},"Message":"linux/node: reallocate nodeID upon conflict\n\nNodeIDs and IPsec state suffer from a lack of reconciliation. If the\nagent misses a node deletion event, stale state is never cleaned up.\nThis is somewhat known (#29822, #26298), but was generally considered\nnot of huge consequence. Stale XFRM states/policies can accumulate, but\nwill not match traffic - the effect is mostly slowing down processing\nin agent and kernel. nodeIDs can eventually run out if too many node\ndeletions are missed, but the rate at which these are missed is expected\nto be low.\n\nUnfortunately, there are large clusters with high node churn in which\nrare events become common, and hence the following sequence of events is\nprobable enough to actually observe:\n\n1. a node is deleted while the agent is down (e.g. due to being\n   upgraded)\n2. 
a new node joins the cluster and is allocated IPs which overlap with\n   previously used IPs.\n\nIf this occurs, the agent can have a partioned view of what nodeID this\nnode should have - in the BPF map, the k8s internal IP will map to a\ndifferent nodeID than the cilium internal ip. This breaks IPsec traffic\ntowards this node, as BPF applies a mark based on the BPF map nodeID of\nthe tunnnel endpoint, but the xfrm states expect to match the mark based\non the cilium internal IP. The result is traffic which doesn't match any\nxfrm state/policy, falling back to the catch all block policy.\n\nTo work around this, we enforce that all IPs of a node get the same\nnodeID - even if an IP was already pointing to an existing nodeID. Since\nthis node update is more current than whatever state we had held, it\nseems more correct to ensure all IPs point to the same nodeID than\navoiding a BPF map write. We do so by forcing the allocation of a new\nnodeID.\n\nIt is possible that unmapping the IPs but not deallocating the nodeID\ncan leak the nodeID. Deallocating unconditionally would be wrong, as the\nstale mapping might point to a nodeID which represents a different,\nalive node.\n\nSigned-off-by: David Bimmler \u003cdavid.bimmler@isovalent.com\u003e","UpstreamSHA":"50b9a31dd25e5498079528bb534059106b5bbb51","PRSHA":"48dbe1855e645e15ddb83a4faba7a3ada10a1da3"}],"DownstreamIssue":0},{"Number":33894,"Title":"Make hubble-relay more resilient to transient errors","Body":"Fixes: #33891 \r\n\r\nSee the commit messages for a concise description of what each change aims to do. Overall, in some cases, hubble-relay isn't very robust to transient errors, typically caused by individual agents restarting or the Hubble API being briefly unavailable. As a result, hubble-relay often gets restarted when downstream dependencies like Cilium are down for a only a moderate amount of time. 
In general, it's probably not wise for us to use the peer manager to define Hubble-relay's health, however changing that requires a more substantial change and rethink of how we do health for API services.\r\n\r\nAs an intermediate solution, we should give hubble-relay more time to do the actions it needs to become healthy (ie: connect to the peer service). This means giving it more time to do various actions such as connecting to the peer service, and retrying.\r\n\r\nTo start, and address the issue reported in #33891, we can increase the default dial timeout for the peer service, which should improve Hubble-relay in constrained environments like small dev clusters or KIND clusters, where it may take longer than 5 seconds to connect to cilium's Hubble API. \r\n\r\nAdditionally, in the event that hubble-relay cannot connect to the peer service, such as when cilium is being updated, reconfigured or otherwise restarted, we should retry more aggressively to improve how quickly it responds to failures and reduce the amount of exponential backoff accrued. During rollouts, especially in smaller clusters, it's fairly likely Hubble-relay will end up being disconnected from the cilium pod it's currently using as its peer service. It will then try to connect to the same pod, or another one which might get restarted while it's retrying; quickly causing the exponential backoff timer to grow, resulting in relay only attempting to reconnect a small number times before failing it's livenessProbe and being killed, negating many of the benefits of having support for retrying at all. Retrying more quickly should improve this, and shouldn't adversely impact Cilium or Hubble with the new default backoff values.\r\n\r\nAs mentioned, with the current livenessProbe settings, since it relies on relay being connected to the peer service, its health is tied to downstream dependencies, which is out of its control. 
To reduce the impact of downstream agents being restarted on Hubble-relay, we should give it enough time to retry and reconnect before it's livenessProbe results in the pod being restarted, so let's increase the number of failures in the livenessProbe to give hubble-relay ample time to reconnect and become healthy. ","State":"closed","Labels":["kind/enhancement","release-note/minor","sig/hubble","needs-backport/1.15","backport-done/1.16"],"Author":"chancez","MergedAt":"2024-07-23T21:54:33Z","MergeCommitSHA":"45db088d7de70dc0cf8e99f06c5ed455070ad252","HTMLURL":"https://github.com/cilium/cilium/pull/33894","Commits":[{"Author":{"Name":"Chance Zibolski","Email":"chance.zibolski@gmail.com","Date":"2024-07-18T17:39:56Z"},"Message":"hubble: Increase the default relay dial timeout to 30 seconds\n\nWhen connecting to the peer service sometimes it takes longer than 5\nseconds to connect, especially on over burdened nodes or in CI, so\nchange the default to a more reasonable default of 30 seconds.\n\nSigned-off-by: Chance Zibolski \u003cchance.zibolski@gmail.com\u003e","UpstreamSHA":"275b78671cedd8a2dc3a22fe010e84dac8369ef3","PRSHA":"e4aae4c0997958279e94b2e8f4c6620623da6573"},{"Author":{"Name":"Chance Zibolski","Email":"chance.zibolski@gmail.com","Date":"2024-07-18T17:49:00Z"},"Message":"hubble: Reduce relay peer manager exponential back-off min/max\n\nThe current back-off settings for the hubble peer manager gRPC\nretries are way too high.\n\nMost users tend to only run 1 replica, or a small number of replicas of\nhubble-relay, meaning we aren't at high risk of impacting cilium/hubble,\neven with a large number of retries.\nThis is especially true since each relay pod only connects to a single\npeer service, thus a single cilium agent/hubble server is being\nconnected to by each replica of relay.\n\nTuning the back-off for hubble-relay's peer manager is especially\nimportant since the relay's livenessProbe relies on the health of it's\npeers, meaning we need to be more 
aggressive in retries to avoid being\nkilled due to the health status failing.\n\nStarting at 10 seconds for back-off is pretty long for a retry. In my\nexperience, most of the time back-offs start in the milliseconds and ramp\nup quickly from there.\n\nGiven most failures are expected to be transient it doesn't make any\nsense to start with such a high retry delay, as it just delays the\nsuccessful connections needlessly and Starting at 10 seconds means we\nwill quickly see the delay increase as the back-off doubles.\n\nBy setting it to 1 second we'll reach 10 second of back-off (the current\nstarting value) after 3 retries and 12 seconds of elapsed time, so it\nwill only add 3 additional requests total in the same period as the\ncurrent settings first retry of 10 seconds, which should be negligible.\n\nAdditionally, 90 minutes is way too long for a maximum back-off, it\nbasically means relay would be unhealthy for an extended period of time\ndespite the fact that the underlying hubble servers might be fine.\n1 minute feels reasonably acceptable for a maximum retry rate for\nconnecting to the peer manager.\n\nSigned-off-by: Chance Zibolski \u003cchance.zibolski@gmail.com\u003e","UpstreamSHA":"2325bc5afa843f955835e0fe53b3c9ebd314eec4","PRSHA":"8f400a2d81f9af1b7eea6cd21d72af5d458feaaa"},{"Author":{"Name":"Chance Zibolski","Email":"chance.zibolski@gmail.com","Date":"2024-07-18T18:13:49Z"},"Message":"helm: Update hubble-relay livenessProbe and startupProbe\n\nAdd initialDelaySeconds of 10 seconds to both, since hubble-relay may\ntake a bit of time to start, and modify the defaults for the\nlivenessProbe since livenessProbes kill the pod which should be a last\nresort.\n\nWe want to give relay time to retry before killing it, and the default\nlivenessProbe only lets it fail for 30 seconds before terminating the\npod. 
Increase this to 2 minutes worth of failures before allowing the\npod to be terminated, giving relay time to retry and become healthy\nagain.\n\nSigned-off-by: Chance Zibolski \u003cchance.zibolski@gmail.com\u003e","UpstreamSHA":"45db088d7de70dc0cf8e99f06c5ed455070ad252","PRSHA":"f7639f6ce6d6afa16c7803c14e898922ce685007"}],"DownstreamIssue":0},{"Number":33908,"Title":"gha: lint absence of trailing spaces in workflow files","Body":"Trailing spaces in workflow files tend to be a pain during backports, as they cause conflicts due to slight divergences across branches, and require manual intervention. Additionally, depending on the editor settings of each developer, they either lead to unnecessary churn in the code-base, or require extra effort to explicitly ignore them.\r\n\r\nTo prevent these issues, let's introduce a linter that verifies the absence of trailing spaces in all files under `.github`, and prompts a command to remove them if found.\r\n\r\nExample run with failures: https://github.com/cilium/cilium/actions/runs/10005692170/job/27656895161?pr=33908","State":"closed","Labels":["area/CI","ready-to-merge","release-note/ci","backport-pending/1.14","needs-backport/1.15","needs-backport/1.16"],"Author":"giorio94","MergedAt":"2024-07-24T14:28:42Z","MergeCommitSHA":"83338599965a579ef635f9e7b77f3c53cdfec056","HTMLURL":"https://github.com/cilium/cilium/pull/33908","Commits":[{"Author":{"Name":"Marco Iorio","Email":"marco.iorio@isovalent.com","Date":"2024-07-19T09:02:13Z"},"Message":"gha: lint absence of trailing spaces in workflow files\n\nTrailing spaces in workflow files tend to be a pain during backports,\nas they cause conflicts due to slight divergences across branches,\nand require manual intervention. 
Additionally, depending on the editor\nsettings of each developer, they either lead to unnecessary churn in\nthe code-base, or require extra effort to explicitly ignore them.\n\nTo prevent these issues, let's introduce a linter that verifies the\nabsence of trailing spaces in all files under `.github`, and prompts\na command to remove them if found.\n\nSigned-off-by: Marco Iorio \u003cmarco.iorio@isovalent.com\u003e","UpstreamSHA":"e7ebc9a4489fb209384d7a85d3f1e43a7b0269a7","PRSHA":"1a57fff09f1568b5b48032143ca6cf4cb59e8b2b"},{"Author":{"Name":"Marco Iorio","Email":"marco.iorio@isovalent.com","Date":"2024-07-19T09:36:03Z"},"Message":"test: fix empty named It clause in Checks E/W loadbalancing test\n\nOtherwise, it leads to a trailing space in the main-focus.yaml file,\nwhich is now flagged by the linter.\n\nSigned-off-by: Marco Iorio \u003cmarco.iorio@isovalent.com\u003e","UpstreamSHA":"f6fae66998bfbe71ce178909e6f3c1991d2ecc10","PRSHA":"c052de9081b1e27689a20a0f31435643c7149014"},{"Author":{"Name":"Marco Iorio","Email":"marco.iorio@isovalent.com","Date":"2024-07-19T09:24:09Z"},"Message":"gha: drop trailing spaces from all files under .github\n\nFollowing the introduction of the dedicated linter, in the previous\ncommit, let's address all existing occurrences:\n\n$ find .github -type f -exec sed -ri 's/[[:blank:]]+$//' {} \\;\n\nSigned-off-by: Marco Iorio \u003cmarco.iorio@isovalent.com\u003e","UpstreamSHA":"83338599965a579ef635f9e7b77f3c53cdfec056","PRSHA":"b32aaa887b90f6dfc6cba65264167e3923ef9867"}],"DownstreamIssue":0},{"Number":34004,"Title":"test: use cgr.dev/chainguard/busybox:latest instead of docker.io image.","Body":"Right now, we're seeing regular ImagePullBackoffs on test images being pulled from docker.io.  
Switching busybox off of the docker.io image should reduce pressure on the ratelimit and reduce flakes.\r\n\r\nAs a follow up, I'll track failures in ci-ginkgo and begin to move other images to quay.io as needed.\r\n\r\nRunning this through CI a couple times, I didn't see any pull issues (but that could just be the time of day) so the plan is to merge this and see if it makes any difference in CI failures.\r\n\r\n**note:** the chainguard repo only allows unauthenticated access to \"latest\" versions, but this should be fine for test containers just using busybox.","State":"closed","Labels":["ready-to-merge","release-note/ci","backport-pending/1.14","needs-backport/1.15","needs-backport/1.16"],"Author":"tommyp1ckles","MergedAt":"2024-07-25T19:01:22Z","MergeCommitSHA":"16ed7db50ac3359d78a175dfa16afb4d5f63c42c","HTMLURL":"https://github.com/cilium/cilium/pull/34004","Commits":[{"Author":{"Name":"Tom Hadlaw","Email":"tom.hadlaw@isovalent.com","Date":"2024-07-25T00:09:50Z"},"Message":"test: use cgr.dev/chainguard/busybox:1.36.0 instead of docker.io image.\n\nSigned-off-by: Tom Hadlaw \u003ctom.hadlaw@isovalent.com\u003e","UpstreamSHA":"16ed7db50ac3359d78a175dfa16afb4d5f63c42c","PRSHA":"5ad873c83d2b0da7b2f66ba6287431dfec1fa4e5"}],"DownstreamIssue":0},{"Number":33905,"Title":"auth: Fix data race in Upsert","Body":"Fixes: #33899\r\n\r\n```release-note\r\nauth: Fix data race in Upsert\r\n```\r\n","State":"closed","Labels":["release-note/bug","ready-to-merge","needs-backport/1.15","needs-backport/1.16"],"Author":"chaunceyjiang","MergedAt":"2024-07-26T14:45:39Z","MergeCommitSHA":"b9bb0c28b9bde4e231c2fc1427334ddb9cf41535","HTMLURL":"https://github.com/cilium/cilium/pull/33905","Commits":[{"Author":{"Name":"chaunceyjiang","Email":"chaunceyjiang@gmail.com","Date":"2024-07-19T07:14:37Z"},"Message":"auth: Fix data race in Upsert\n\nFixes: #33899\n\nSigned-off-by: chaunceyjiang 
\u003cchaunceyjiang@gmail.com\u003e","UpstreamSHA":"b9bb0c28b9bde4e231c2fc1427334ddb9cf41535","PRSHA":"04322be4adf6fdea21271b5ad832c96988a35171"}],"DownstreamIssue":0},{"Number":34032,"Title":"gateway-api: Enqueue gateway for Reference Grant changes","Body":"This commit is to make sure that the reconciliation loop is kicked off for Gateway object if there is any change in ReferenceGrant (mainly for SecretObjectReference).\r\n\r\n","State":"closed","Labels":["release-note/bug","ready-to-merge","needs-backport/1.15","needs-backport/1.16"],"Author":"sayboras","MergedAt":"2024-07-29T04:54:00Z","MergeCommitSHA":"ed3dfa0aab8b80f7e841a6d49d2a990ac2dca053","HTMLURL":"https://github.com/cilium/cilium/pull/34032","Commits":[{"Author":{"Name":"Tam Mach","Email":"tam.mach@cilium.io","Date":"2024-07-26T12:33:36Z"},"Message":"gateway-api: Enqueue gateway for Reference Grant changes\n\nThis commit is to make sure that the reconciliation loop is kicked off\nfor Gateway object if there is any change in ReferenceGrant (mainly for\nSecretObjectReference).\n\nSigned-off-by: Tam Mach \u003ctam.mach@cilium.io\u003e","UpstreamSHA":"ed3dfa0aab8b80f7e841a6d49d2a990ac2dca053","PRSHA":"15389c00cd89836021812086563c5f80af07e014"}],"DownstreamIssue":0},{"Number":34044,"Title":"doc: update slack channel reference","Body":"Please ensure your pull request adheres to the following guidelines:\r\n\r\n- [x] For first time contributors, read [Submitting a pull request](https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#submitting-a-pull-request)\r\n- [x] All code is covered by unit and/or runtime tests where feasible.\r\n- [x] All commits contain a well written commit description including a title,\r\n      description and a `Fixes: #XXX` line if the commit addresses a particular\r\n      GitHub issue.\r\n- [x] If your commit description contains a `Fixes: \u003ccommit-id\u003e` tag, then\r\n      please add the commit author[s] as reviewer[s] to this issue.\r\n- [x] All 
commits are signed off. See the section [Developer’s Certificate of Origin](https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#dev-coo)\r\n- [x] Provide a title or release-note blurb suitable for the release notes.\r\n- [x] Are you a user of Cilium? Please add yourself to the [Users doc](https://github.com/cilium/cilium/blob/main/USERS.md)\r\n- [x] Thanks for contributing!\r\n\r\n\u003c!-- Description of change --\u003e\r\n\r\nFixes: #34043\r\n\r\nUpdate the dead links `https://cilium.herokuapp.com/` `https://cilium.io/slack` to  `https://slack.cilium.io`","State":"closed","Labels":["ready-to-merge","release-note/misc","kind/community-contribution","backport-pending/1.14","needs-backport/1.15","needs-backport/1.16"],"Author":"Huweicai","MergedAt":"2024-07-30T09:36:34Z","MergeCommitSHA":"83fb3812a9bfa5a1f59e17958ae65b271bf1eea7","HTMLURL":"https://github.com/cilium/cilium/pull/34044","Commits":[{"Author":{"Name":"Huweicai","Email":"i@huweicai.com","Date":"2024-07-27T18:34:52Z"},"Message":"doc: update slack channel reference\n\nSigned-off-by: Huweicai \u003ci@huweicai.com\u003e","UpstreamSHA":"83fb3812a9bfa5a1f59e17958ae65b271bf1eea7","PRSHA":"4ef4e7c03c0fb079d4b51be84763af279c084ac0"}],"DownstreamIssue":0},{"Number":34106,"Title":"lbipam: fixed bug in sharing key logic","Body":"@brb found a bug in the sharing key logic where if you remove a service that is part of a sharing key and then add it back, it would get a new IP while it should have gotten the same IP back.\r\n\r\nThis turns out to be caused by the `sharingKeyToServiceViewIPs` index which maps a sharing key to a list of IPs associated with that sharing key. Each IP can be assigned to multiple services, however, when a service was removed or received a new IP because it was no longer compatible with the rest of the services with the same sharing key, we would always remove the IP from the index. 
Because of this, upon re-adding the service it seems like the sharing key wasn't in use so a new IP is allocated and added as the sole IP used by the sharing key.\r\n\r\nThe fix is to now we only remove it if the IP is not used by any other service with the same sharing key.\r\n\r\nThis PR also adds a recession test for this case.\r\n\r\n```release-note\r\nlbipam: fixed bug in sharing key logic\r\n```\r\n","State":"closed","Labels":["kind/bug","release-note/bug","feature/lb-ipam","needs-backport/1.15","needs-backport/1.16"],"Author":"dylandreimerink","MergedAt":"2024-07-31T16:50:12Z","MergeCommitSHA":"e5912040fe2751529c9cfab3856f4bd8634539a6","HTMLURL":"https://github.com/cilium/cilium/pull/34106","Commits":[{"Author":{"Name":"Dylan Reimerink","Email":"dylan.reimerink@isovalent.com","Date":"2024-07-31T11:01:22Z"},"Message":"lbipam: fixed bug in sharing key logic\n\n@brb found a bug in the sharing key logic where if you remove a service\nthat is part of a sharing key and then add it back, it would get a new\nIP while it should have gotten the same IP back.\n\nThis turns out to be caused by the `sharingKeyToServiceViewIPs` index\nwhich maps a sharing key to a list of IPs associated with that sharing\nkey. Each IP can be assigned to multiple services, however, when a\nservice was removed or received a new IP because it was no longer\ncompatible with the rest of the services with the same sharing key, we\nwould always remove the IP from the index. 
Because of this, upon\nre-adding the service it seems like the sharing key wasn't in use so\na new IP is allocated and added as the sole IP used by the sharing key.\n\nThe fix is to now we only remove it if the IP is not used by any other\nservice with the same sharing key.\n\nThis commit also adds a recession test for this case.\n\nSigned-off-by: Dylan Reimerink \u003cdylan.reimerink@isovalent.com\u003e","UpstreamSHA":"e5912040fe2751529c9cfab3856f4bd8634539a6","PRSHA":"76ef2cc4a8e938794ad1c42559fb36356b57ef97"}],"DownstreamIssue":0},{"Number":34109,"Title":"gateway-api: Add HTTP method condition in sortable routes","Body":"As per the below, method match should be considered before the largest of header/query param matches. This commit is to consider method attribute into sorting rule.\r\n\r\nRelates: https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1.HTTPRouteRule\r\n","State":"closed","Labels":["release-note/bug","ready-to-merge","needs-backport/1.15","needs-backport/1.16"],"Author":"sayboras","MergedAt":"2024-08-01T00:43:29Z","MergeCommitSHA":"a3510fe4a92305822aa1a5e08cb6d6c873c8699a","HTMLURL":"https://github.com/cilium/cilium/pull/34109","Commits":[{"Author":{"Name":"Tam Mach","Email":"tam.mach@cilium.io","Date":"2024-07-31T13:43:58Z"},"Message":"gateway-api: Add HTTP method condition in sortable routes\n\nAs per the below, method match should be considered before the largest\nof header/query param matches. This commit is to consider method attr\ninto sorting rule.\n\nRelates: https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1.HTTPRouteRule\nSigned-off-by: Tam Mach \u003ctam.mach@cilium.io\u003e","UpstreamSHA":"a3510fe4a92305822aa1a5e08cb6d6c873c8699a","PRSHA":"05b6238efb6844c315b441954c89166605d03f14"}],"DownstreamIssue":0},{"Number":34121,"Title":"tests-clustermesh-upgrade: Don't hardcode test namespace","Body":"https://github.com/cilium/cilium-cli/pull/2637 changed the behavior of --test-namespace flag. 
It's now treated as a prefix, and the namespaces always contain \"-$index\" suffix even if --test-concurrency is set to 1. Instead of hardcoding the namespace, use app.kubernetes.io/name label to find the test namespace.\r\n\r\nRef: https://github.com/cilium/cilium-cli/releases/tag/v0.16.14","State":"closed","Labels":["area/clustermesh","release-note/ci","needs-backport/1.15","needs-backport/1.16"],"Author":"michi-covalent","MergedAt":"2024-08-01T08:26:07Z","MergeCommitSHA":"b189c578ea4da02467dc636e338683b8b6b4b012","HTMLURL":"https://github.com/cilium/cilium/pull/34121","Commits":[{"Author":{"Name":"Michi Mutsuzaki","Email":"michi@isovalent.com","Date":"2024-08-01T00:04:43Z"},"Message":"tests-clustermesh-upgrade: Don't hardcode test namespace\n\nhttps://github.com/cilium/cilium-cli/pull/2680 changed the behavior of\n--test-namespace flag. It's now treated as a prefix, and the namespaces\nalways contain \"-$index\" suffix even if --test-concurrency is set to 1.\nInstead of hardcoding the namespace, use app.kubernetes.io/name label to\nfind the test namespace.\n\nSigned-off-by: Michi Mutsuzaki \u003cmichi@isovalent.com\u003e","UpstreamSHA":"b189c578ea4da02467dc636e338683b8b6b4b012","PRSHA":"be4d82682fee5824395003d26b6435ddf1687b65"}],"DownstreamIssue":0},{"Number":34110,"Title":"bugtool: dumping more Envoy information","Body":"Currently, during a Cilium sysdump the bugtool dumps the full Envoy config and prometheus metrics.\r\n\r\nIn some cases it would be nice to have some information about listeners and clusters easier at hand.\r\n\r\nTherefore this commit extends the bugtool to dump the listeners, clusters and server_info that are available though the [Envoy admin interface](https://www.envoyproxy.io/docs/envoy/latest/operations/admin#get--listeners) during a 
sysdump.","State":"closed","Labels":["kind/feature","area/proxy","ready-to-merge","area/bugtool","area/cli","release-note/misc","needs-backport/1.15","needs-backport/1.16"],"Author":"mhofstetter","MergedAt":"2024-08-02T07:21:56Z","MergeCommitSHA":"43dd4ee416a8eec7d4b2d9a3e8fccbc0893d04d7","HTMLURL":"https://github.com/cilium/cilium/pull/34110","Commits":[{"Author":{"Name":"Marco Hofstetter","Email":"marco.hofstetter@isovalent.com","Date":"2024-07-31T16:08:21Z"},"Message":"bugtool: enhance dumping Envoy information\n\nCurrently, during a Cilium sysdump the bugtool dumps the full\nEnvoy config and prometheus metrics.\n\nIn some cases it would be nice to have some information about\nlisteners and clusters easier at hand.\n\nTherefore this commit enrichers the bugtool to also dump the\nlisteners, clusters and server_info during a sysdump.\n\nSigned-off-by: Marco Hofstetter \u003cmarco.hofstetter@isovalent.com\u003e","UpstreamSHA":"43dd4ee416a8eec7d4b2d9a3e8fccbc0893d04d7","PRSHA":"e76049f36f32d58abf261d97574da7dbbc37aaf6"}],"DownstreamIssue":0},{"Number":34091,"Title":"etcd: fix paginated list missing events with parallel operations","Body":"Currently, the etcd paginatedList implementation is affected by a bug that can lead to missing events of both upsertions and deletions that are performed between the end of the first Get call and the last Get call. Indeed, the tracked revision is incorrectly updated after every Get call, and etcd always returns the current revision, regardless of whether WithRev is actually requesting a specific revision. 
In turn, this leads to subsequent Get calls targeting different revisions, hence missing all the events happened in between for the already processed chunks of the prefix, as the watch operation is eventually started from the latest retrieved revision.\r\n\r\nLet's address this by consistently using the same revision during the entire paginatedList execution, to ensure that we correctly list all entries at that revision, and subsequently start watching events from the next one. Additionally, let's update the associated unit test to additionally cover the parallel operations case and prevent future regressions in this respect.\r\n\r\nThis bug affected both Cilium running in KVStore mode and clustermesh, although the race condition window is sufficiently short to trigger its occurrence only rarely and in high scale environments.\r\n\r\nMarked for backport to all affected versions, as IMO it fits into the\r\n\r\n\u003e Major bugfixes relevant to the correct operation of Cilium\r\n\r\ncategory, considering the Cilium then only recovers after a restart of the agent (although that may again trigger the same problem).\r\n\r\n\u003c!-- Description of change --\u003e\r\n\r\n```release-note\r\nFix bug causing etcd upsertion/deletion events to be potentially missed during the initial synchronization, when Cilium operates in KVStore mode, or Cluster Mesh is enabled. 
\r\n```\r\n","State":"closed","Labels":["kind/bug","release-note/bug","area/clustermesh","sig/kvstore","backport-pending/1.14","needs-backport/1.15","needs-backport/1.16"],"Author":"giorio94","MergedAt":"2024-08-02T16:05:37Z","MergeCommitSHA":"8b210fb7124eef3194378cdfd9971da39b5b5f08","HTMLURL":"https://github.com/cilium/cilium/pull/34091","Commits":[{"Author":{"Name":"Marco Iorio","Email":"marco.iorio@isovalent.com","Date":"2024-07-30T13:23:57Z"},"Message":"etcd: fix paginated list missing events with parallel operations\n\nCurrently, the etcd paginatedList implementation is affected by a bug\nthat can lead to missing events of both upsertions and deletions that\nare performed between the end of the first Get call and the last Get\ncall. Indeed, the tracked revision is incorrectly updated after every\nGet call, and etcd always returns the current revision, regardless of\nwhether WithRev is actually requesting a specific revision. In turn,\nthis leads to subsequent Get calls targeting different revisions,\nhence missing all the events happened in between for the already\nprocessed chunks of the prefix, as the watch operation is eventually\nstarted from the latest retrieved revision.\n\nLet's address this by consistently using the same revision during the\nentire paginatedList execution, to ensure that we correctly list all\nentries at that revision, and subsequently start watching events from\nthe next one. 
Additionally, let's update the associated unit test\nto additionally cover the parallel operations case and prevent future\nregressions in this respect.\n\nThis bug affected both Cilium running in KVStore mode and clustermesh,\nalthough the race condition window is sufficiently short to trigger\nits occurrence only rarely and in high scale environments.\n\nFixes: e33b9f9bff20 (\"kvstore: add support for paginated lists in etcd.ListAndWatch\")\nSigned-off-by: Marco Iorio \u003cmarco.iorio@isovalent.com\u003e","UpstreamSHA":"8b210fb7124eef3194378cdfd9971da39b5b5f08","PRSHA":"48266fae5e3605ecbb8ac5cdac46d20d255edb1d"}],"DownstreamIssue":0}],"Completed":[{"Number":33803,"Title":"helm: remove duplicate metrics for Envoy pod","Body":"Currently, having Prometheus enabled `envoy.prometheus.enabled=true` results in duplicated metrics for the Envoy daemonset Pods if scraping via a dedicated `ServiceMonitor` isn't enabled. The reason is that prometheus are added to the `Pod` (in the `DaemonSet`) and to the `Service`.\r\n\r\nTherefore, this commit removes the Envoy K8s `Service` completely, as this is only used for prometheus scraping.\r\n\r\nFixes: #32747","State":"closed","Labels":["kind/bug","release-note/bug","area/helm","area/servicemesh","needs-backport/1.15","backport-done/1.16"],"Author":"mhofstetter","MergedAt":"2024-07-15T09:23:08Z","MergeCommitSHA":"abe0acc8aadb5c41a016c1c133cb12362bfd6ba6","HTMLURL":"https://github.com/cilium/cilium/pull/33803","Commits":[{"Author":{"Name":"Marco Hofstetter","Email":"marco.hofstetter@isovalent.com","Date":"2024-07-15T07:27:23Z"},"Message":"helm: remove duplicate metrics for Envoy pod\n\nCurrently, having Prometheus enabled `envoy.prometheus.enabled=true` results\nin duplicated metrics for the Envoy daemonset Pods if scraping via a dedicated\n`ServiceMonitor` isn't enabled. 
The reason is that prometheus are added to the\n`Pod` (in the `DaemonSet`) and to the `Service`.\n\nTherefore, this commit removes the Envoy K8s `Service` completely, as this is only\nused for prometheus scraping.\n\nSigned-off-by: Marco Hofstetter \u003cmarco.hofstetter@isovalent.com\u003e","UpstreamSHA":"abe0acc8aadb5c41a016c1c133cb12362bfd6ba6","PRSHA":"7ff7050f01f8907e794d898e5eac2559601a6b6b"}],"DownstreamIssue":0}],"Skipped":null,"PR":{"Branch":"pr/v1.15-backport-2024-08-02-07-03","Title":"v1.15 Backports 2024-08-02","URL":""},"Conflicts":{"33803":{}}}
    
  • operator/pkg/gateway-api/gateway.go · +35 0 · modified
    @@ -20,6 +20,7 @@ import (
     	"sigs.k8s.io/controller-runtime/pkg/reconcile"
     	gatewayv1 "sigs.k8s.io/gateway-api/apis/v1"
     	gatewayv1alpha2 "sigs.k8s.io/gateway-api/apis/v1alpha2"
    +	gatewayv1beta1 "sigs.k8s.io/gateway-api/apis/v1beta1"
     
     	"github.com/cilium/cilium/operator/pkg/gateway-api/helpers"
     	"github.com/cilium/cilium/operator/pkg/model/translation"
    @@ -89,6 +90,8 @@ func (r *gatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
     		// Watch related namespace in allowed namespaces
     		Watches(&corev1.Namespace{},
     			r.enqueueRequestForAllowedNamespace()).
    +		// Watch for changes to Reference Grants
    +		Watches(&gatewayv1beta1.ReferenceGrant{}, r.enqueueRequestForReferenceGrant()).
     		// Watch created and owned resources
     		Owns(&ciliumv2.CiliumEnvoyConfig{}).
     		Owns(&corev1.Service{}).
    @@ -287,3 +290,35 @@ func (r *gatewayReconciler) enqueueRequestForAllowedNamespace() handler.EventHan
     func (r *gatewayReconciler) usedInGateway(obj client.Object) bool {
     	return len(getGatewaysForSecret(context.Background(), r.Client, obj)) > 0
     }
    +
    +func (r *gatewayReconciler) enqueueRequestForReferenceGrant() handler.EventHandler {
    +	return handler.EnqueueRequestsFromMapFunc(r.enqueueAll())
    +}
    +
    +func (r *gatewayReconciler) enqueueAll() handler.MapFunc {
    +	return func(ctx context.Context, o client.Object) []reconcile.Request {
    +		scopedLog := log.WithFields(logrus.Fields{
    +			logfields.Controller: "gateway",
    +			logfields.Resource:   client.ObjectKeyFromObject(o),
    +		})
    +		list := &gatewayv1.GatewayList{}
    +
    +		if err := r.Client.List(ctx, list, &client.ListOptions{}); err != nil {
    +			scopedLog.WithError(err).Error("Failed to list Gateway")
    +			return []reconcile.Request{}
    +		}
    +
    +		requests := make([]reconcile.Request, 0, len(list.Items))
    +		for _, item := range list.Items {
    +			gw := client.ObjectKey{
    +				Namespace: item.GetNamespace(),
    +				Name:      item.GetName(),
    +			}
    +			requests = append(requests, reconcile.Request{
    +				NamespacedName: gw,
    +			})
    +			scopedLog.Info("Enqueued Gateway for resource", gateway, gw)
    +		}
    +		return requests
    +	}
    +}
    
414a96b53d51

gateway-api: Enqueue gateway for Reference Grant changes

https://github.com/cilium/cilium · Tam Mach · Jul 26, 2024 · via ghsa
1 file changed · +35 0
  • operator/pkg/gateway-api/gateway.go · +35 0 · modified
    @@ -20,6 +20,7 @@ import (
     	"sigs.k8s.io/controller-runtime/pkg/reconcile"
     	gatewayv1 "sigs.k8s.io/gateway-api/apis/v1"
     	gatewayv1alpha2 "sigs.k8s.io/gateway-api/apis/v1alpha2"
    +	gatewayv1beta1 "sigs.k8s.io/gateway-api/apis/v1beta1"
     
     	"github.com/cilium/cilium/operator/pkg/gateway-api/helpers"
     	ciliumv2 "github.com/cilium/cilium/pkg/k8s/apis/cilium.io/v2"
    @@ -90,6 +91,8 @@ func (r *gatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
     		// Watch related namespace in allowed namespaces
     		Watches(&corev1.Namespace{},
     			r.enqueueRequestForAllowedNamespace()).
    +		// Watch for changes to Reference Grants
    +		Watches(&gatewayv1beta1.ReferenceGrant{}, r.enqueueRequestForReferenceGrant()).
     		// Watch created and owned resources
     		Owns(&ciliumv2.CiliumEnvoyConfig{}).
     		Owns(&corev1.Service{}).
    @@ -288,3 +291,35 @@ func (r *gatewayReconciler) enqueueRequestForAllowedNamespace() handler.EventHan
     func (r *gatewayReconciler) usedInGateway(obj client.Object) bool {
     	return len(getGatewaysForSecret(context.Background(), r.Client, obj)) > 0
     }
    +
    +func (r *gatewayReconciler) enqueueRequestForReferenceGrant() handler.EventHandler {
    +	return handler.EnqueueRequestsFromMapFunc(r.enqueueAll())
    +}
    +
    +func (r *gatewayReconciler) enqueueAll() handler.MapFunc {
    +	return func(ctx context.Context, o client.Object) []reconcile.Request {
    +		scopedLog := log.WithFields(logrus.Fields{
    +			logfields.Controller: "gateway",
    +			logfields.Resource:   client.ObjectKeyFromObject(o),
    +		})
    +		list := &gatewayv1.GatewayList{}
    +
    +		if err := r.Client.List(ctx, list, &client.ListOptions{}); err != nil {
    +			scopedLog.WithError(err).Error("Failed to list Gateway")
    +			return []reconcile.Request{}
    +		}
    +
    +		requests := make([]reconcile.Request, 0, len(list.Items))
    +		for _, item := range list.Items {
    +			gw := client.ObjectKey{
    +				Namespace: item.GetNamespace(),
    +				Name:      item.GetName(),
    +			}
    +			requests = append(requests, reconcile.Request{
    +				NamespacedName: gw,
    +			})
    +			scopedLog.Info("Enqueued Gateway for resource", gateway, gw)
    +		}
    +		return requests
    +	}
    +}
    
ed3dfa0aab8b

gateway-api: Enqueue gateway for Reference Grant changes

https://github.com/cilium/cilium · Tam Mach · Jul 26, 2024 · via ghsa
1 file changed · +32 0
  • operator/pkg/gateway-api/gateway.go · +32 0 · modified
    @@ -20,6 +20,7 @@ import (
     	"sigs.k8s.io/controller-runtime/pkg/reconcile"
     	gatewayv1 "sigs.k8s.io/gateway-api/apis/v1"
     	gatewayv1alpha2 "sigs.k8s.io/gateway-api/apis/v1alpha2"
    +	gatewayv1beta1 "sigs.k8s.io/gateway-api/apis/v1beta1"
     
     	"github.com/cilium/cilium/operator/pkg/gateway-api/helpers"
     	"github.com/cilium/cilium/operator/pkg/model/translation"
    @@ -92,6 +93,8 @@ func (r *gatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
     		// Watch related namespace in allowed namespaces
     		Watches(&corev1.Namespace{},
     			r.enqueueRequestForAllowedNamespace()).
    +		// Watch for changes to Reference Grants
    +		Watches(&gatewayv1beta1.ReferenceGrant{}, r.enqueueRequestForReferenceGrant()).
     		// Watch created and owned resources
     		Owns(&ciliumv2.CiliumEnvoyConfig{}).
     		Owns(&corev1.Service{}).
    @@ -280,3 +283,32 @@ func (r *gatewayReconciler) enqueueRequestForAllowedNamespace() handler.EventHan
     func (r *gatewayReconciler) usedInGateway(obj client.Object) bool {
     	return len(getGatewaysForSecret(context.Background(), r.Client, obj, r.logger)) > 0
     }
    +
    +func (r *gatewayReconciler) enqueueRequestForReferenceGrant() handler.EventHandler {
    +	return handler.EnqueueRequestsFromMapFunc(r.enqueueAll())
    +}
    +
    +func (r *gatewayReconciler) enqueueAll() handler.MapFunc {
    +	return func(ctx context.Context, o client.Object) []reconcile.Request {
    +		scopedLog := r.logger.With(logfields.Controller, gateway, logfields.Resource, client.ObjectKeyFromObject(o))
    +		list := &gatewayv1.GatewayList{}
    +
    +		if err := r.Client.List(ctx, list, &client.ListOptions{}); err != nil {
    +			scopedLog.Error("Failed to list Gateway", logfields.Error, err)
    +			return []reconcile.Request{}
    +		}
    +
    +		requests := make([]reconcile.Request, 0, len(list.Items))
    +		for _, item := range list.Items {
    +			gw := client.ObjectKey{
    +				Namespace: item.GetNamespace(),
    +				Name:      item.GetName(),
    +			}
    +			requests = append(requests, reconcile.Request{
    +				NamespacedName: gw,
    +			})
    +			scopedLog.Info("Enqueued Gateway for resource", gateway, gw)
    +		}
    +		return requests
    +	}
    +}
    

Vulnerability mechanics

Generated by null/stub on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
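
The three fix commits above share one pattern: the Gateway reconciler gains a watch on ReferenceGrant objects, and the watch's map function ignores which grant changed and enqueues every Gateway for reconciliation. The following is a minimal sketch of that fan-out using only the Go standard library. It is not the Cilium code: `ObjectKey`, `Request`, `ListGateways`, `EnqueueAll` and `errListFailed` are hypothetical stand-ins for the controller-runtime types and the `r.Client.List` call that appear in the diffs.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectKey is a simplified, hypothetical stand-in for client.ObjectKey.
type ObjectKey struct {
	Namespace string
	Name      string
}

// Request is a simplified, hypothetical stand-in for reconcile.Request.
type Request struct {
	NamespacedName ObjectKey
}

// ListGateways is a hypothetical lister standing in for r.Client.List
// on a GatewayList.
type ListGateways func() ([]ObjectKey, error)

// errListFailed is an illustrative error used to exercise the failure path.
var errListFailed = errors.New("list failed")

// EnqueueAll returns a map function that is invoked with the object that
// changed (for example a ReferenceGrant). It deliberately does not inspect
// that object: every known Gateway is returned as a reconcile request.
func EnqueueAll(list ListGateways) func(changed ObjectKey) []Request {
	return func(changed ObjectKey) []Request {
		gateways, err := list()
		if err != nil {
			// Mirror the diff: on a list failure, enqueue nothing.
			return []Request{}
		}
		requests := make([]Request, 0, len(gateways))
		for _, gw := range gateways {
			requests = append(requests, Request{NamespacedName: gw})
		}
		return requests
	}
}

func main() {
	mapFn := EnqueueAll(func() ([]ObjectKey, error) {
		return []ObjectKey{
			{Namespace: "infra", Name: "gw-a"},
			{Namespace: "apps", Name: "gw-b"},
		}, nil
	})

	// A grant in an unrelated namespace still enqueues both Gateways.
	for _, req := range mapFn(ObjectKey{Namespace: "secrets", Name: "allow-infra"}) {
		fmt.Printf("enqueue %s/%s\n", req.NamespacedName.Namespace, req.NamespacedName.Name)
	}
}
```

Enqueuing every Gateway appears to trade some redundant reconciles for simplicity, since the handler never has to work out which Gateways a particular grant affects. Note that on a list error the sketch, like the diffs, returns an empty slice, so that ReferenceGrant event does not trigger any reconcile.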
