From 2de60ea721a8347d0bf31cb7a97faf1448e782c4 Mon Sep 17 00:00:00 2001
From: Kjeld Schouten
Date: Sat, 20 Apr 2024 11:31:55 +0200
Subject: [PATCH] fix(common): improve probes (#787)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

**Description**
There is a design mistake in the probe setup:

Liveness probe: Stops the container on failure. This should happen AFTER the readiness probe detects failure.

Readiness probe: Stops the service from forwarding traffic. This should happen BEFORE the liveness probe detects failure.

Startup probe: With a TCP startup probe, charts might pass CI when not fully working or, worse, get handed over to the readiness/liveness probes while the container is still doing maintenance work (such as Plex database updates), leading to it being thrown out right away by liveness/readiness.

I also offset the start of the liveness and readiness probes. This ensures they don't send probes at the same time, while making it most likely that liveness clears before readiness on startup.
The intervals are offset as well, to prevent the same overlap while running.

**⚙️ Type of change**

- [ ] ⚙️ Feature/App addition
- [x] 🪛 Bugfix
- [ ] ⚠️ Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] 🔃 Refactor of current code

**🧪 How Has This Been Tested?**

**📃 Notes:**

**✔️ Checklist:**

- [ ] ⚖️ My code follows the style guidelines of this project
- [ ] 👀 I have performed a self-review of my own code
- [ ] #️⃣ I have commented my code, particularly in hard-to-understand areas
- [ ] 📄 I have made corresponding changes to the documentation
- [ ] ⚠️ My changes generate no new warnings
- [ ] 🧪 I have added tests to this description that prove my fix is effective or that my feature works
- [ ] ⬆️ I increased versions for any altered app according to semantic versioning
- [ ] I made sure the title starts with `feat(chart-name):`, `fix(chart-name):` or `chore(chart-name):`

**➕ App addition**

If this PR is an app addition please make sure you have done the following.

- [ ] 🖼️ I have added an icon in the Chart's root directory called `icon.png`

---

_Please don't blindly check all the boxes. Read them and only check those that apply.
Those checkboxes are there for the reviewer to see what is this all about and the status of this PR with a quick glance._
---
 library/common/Chart.yaml                          |  2 +-
 library/common/templates/lib/container/_probes.tpl |  1 +
 library/common/values.yaml                         | 12 ++++++------
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/library/common/Chart.yaml b/library/common/Chart.yaml
index 78028ddd..dfd5b170 100644
--- a/library/common/Chart.yaml
+++ b/library/common/Chart.yaml
@@ -15,7 +15,7 @@ maintainers:
 name: common
 sources: null
 type: library
-version: 20.3.10
+version: 20.3.11
 annotations:
   artifacthub.io/category: "integration-delivery"
   artifacthub.io/license: "BUSL-1.1"
diff --git a/library/common/templates/lib/container/_probes.tpl b/library/common/templates/lib/container/_probes.tpl
index a76d9ba7..7f9ba057 100644
--- a/library/common/templates/lib/container/_probes.tpl
+++ b/library/common/templates/lib/container/_probes.tpl
@@ -90,6 +90,7 @@ objectData: The object data to be used to render the container.
       {{- fail (printf "Container - Expected [probes] [successThreshold] to be 1 on [%s] probe" $probeName) -}}
     {{- end -}}
   {{- end }}
+  initialDelaySeconds: {{ $timeouts.initialDelaySeconds }}
   failureThreshold: {{ $timeouts.failureThreshold }}
   successThreshold: {{ $timeouts.successThreshold }}
diff --git a/library/common/values.yaml b/library/common/values.yaml
index f051919e..dc05811a 100644
--- a/library/common/values.yaml
+++ b/library/common/values.yaml
@@ -34,21 +34,21 @@ global:
   # -- Default probe timeouts
   probeTimeouts:
     liveness:
-      initialDelaySeconds: 10
-      periodSeconds: 10
+      initialDelaySeconds: 12
+      periodSeconds: 15
       timeoutSeconds: 5
       failureThreshold: 5
       successThreshold: 1
     readiness:
       initialDelaySeconds: 10
-      periodSeconds: 10
+      periodSeconds: 12
       timeoutSeconds: 5
-      failureThreshold: 5
+      failureThreshold: 4
       successThreshold: 2
     startup:
       initialDelaySeconds: 10
       periodSeconds: 5
-      timeoutSeconds: 2
+      timeoutSeconds: 3
       failureThreshold: 60
       successThreshold: 1
   # -- Define a postgresql version for CNPG
@@ -200,7 +200,7 @@ workload:
               port: "{{ $.Values.service.main.ports.main.targetPort | default .Values.service.main.ports.main.port }}"
             startup:
               enabled: true
-              type: "tcp"
+              type: "{{ .Values.service.main.ports.main.protocol }}"
               port: "{{ $.Values.service.main.ports.main.targetPort | default .Values.service.main.ports.main.port }}"
 # -- Timezone used everywhere applicable
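
The ordering argument in the commit message (readiness should give up before liveness, and the two probes should not fire at the same moment) can be sanity-checked with a little arithmetic. The following is a minimal Python sketch, not part of the chart: the `ProbeTimeouts` class and `coinciding_probe_times` helper are hypothetical names introduced here for illustration. It assumes an idealized schedule in which each probe first fires at `initialDelaySeconds` and then exactly every `periodSeconds`, and it approximates the time to act on a failing container as `periodSeconds * failureThreshold`. Real kubelet scheduling is not this exact, so treat the numbers as rough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeTimeouts:
    """Hypothetical holder for the probe settings touched by this patch."""

    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int

    def failure_window(self) -> int:
        # Rough duration of continuous failure before the threshold is hit.
        return self.period_seconds * self.failure_threshold

    def probe_times(self, horizon: int) -> set[int]:
        # Idealized schedule: first probe at the initial delay, then one per period.
        return set(range(self.initial_delay_seconds, horizon + 1, self.period_seconds))


def coinciding_probe_times(a: ProbeTimeouts, b: ProbeTimeouts, horizon: int) -> list[int]:
    """Seconds (up to horizon) at which both probes would fire together."""
    return sorted(a.probe_times(horizon) & b.probe_times(horizon))


# Default values before and after the patch, copied from the values.yaml diff.
OLD_LIVENESS = ProbeTimeouts(initial_delay_seconds=10, period_seconds=10, failure_threshold=5)
OLD_READINESS = ProbeTimeouts(initial_delay_seconds=10, period_seconds=10, failure_threshold=5)
NEW_LIVENESS = ProbeTimeouts(initial_delay_seconds=12, period_seconds=15, failure_threshold=5)
NEW_READINESS = ProbeTimeouts(initial_delay_seconds=10, period_seconds=12, failure_threshold=4)

if __name__ == "__main__":
    print("old windows:", OLD_READINESS.failure_window(), OLD_LIVENESS.failure_window())
    print("new windows:", NEW_READINESS.failure_window(), NEW_LIVENESS.failure_window())
    print("old overlaps:", coinciding_probe_times(OLD_LIVENESS, OLD_READINESS, 60))
    print("new overlaps:", coinciding_probe_times(NEW_LIVENESS, NEW_READINESS, 600))
```

Under these assumptions the old defaults give both probes the same 50-second window and the same schedule. The new defaults give readiness about 48 seconds and liveness about 75 seconds, so traffic is withdrawn before the container is restarted, and the two idealized schedules never land on the same second.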