Pods are getting killed after 100 minutes and marked as Failed progressing.
We have some big boys in our environment: our image is ~5 GB and needs around 90-110 minutes to be up and running (first deployment).
After approximately 100 minutes our pod(s) get deleted for no apparent reason, even though everything inside was going well.
conditions:
- lastTransitionTime: '2019-09-27T05:33:28Z'
  lastUpdateTime: '2019-09-27T05:33:28Z'
  message: Deployment config does not have minimum availability.
  status: 'False'
  type: Available
- lastTransitionTime: '2019-09-27T07:30:32Z'
  lastUpdateTime: '2019-09-27T07:30:32Z'
  message: replication controller "app-wls-1" has failed progressing
  reason: ProgressDeadlineExceeded
  status: 'False'
  type: Progressing
We have also tried patching the DC and setting this field directly in the DC YAML, but OpenShift seems to ignore it (is it available only for Deployments?). The patch command returns the following output:
$ oc patch dc app --patch='{"spec":{"progressDeadlineSeconds":7200}}'
deploymentconfig.apps.openshift.io/app not patched
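For context, `progressDeadlineSeconds` is a field of Kubernetes Deployments, so it may simply not exist on a DeploymentConfig; on a DC the rollout deadline appears to live under the strategy params instead. A sketch of what the DC spec might need (not verified on our cluster, and assuming the Recreate strategy; for Rolling the analogous field would be `rollingParams.timeoutSeconds`):

```yaml
# Sketch: on a DeploymentConfig the deadline sits under spec.strategy,
# not at the top level of spec like a Deployment's progressDeadlineSeconds.
spec:
  strategy:
    type: Recreate
    recreateParams:
      timeoutSeconds: 7200   # give the rollout two hours before it is failed
```

Or, as a patch: `oc patch dc app --patch='{"spec":{"strategy":{"recreateParams":{"timeoutSeconds":7200}}}}'`.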
Expected Result
Successful deployment without exceeding any deadline. ;)
Additional Information
$ oc get all -o yaml -n szymon-sandbox >> namespace.yml
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 1h replication-controller Created pod: app-wls-10-mtlpz
Normal SuccessfulDelete 13m replication-controller Deleted pod: app-wls-10-mtlpz
I have been forwarding the logs to a local file and the last entry is:
rpc error: code = Unknown desc = Error: No such container: df6088d60dd12b4d2ff69108b350a66487db07541d1ab45712a06e8ea5e42956
I believe it's not related to the application itself :)
Version
oc v3.11.0+0cbc58b
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
openshift v3.11.69
kubernetes v1.11.0+d4cacc0
namespace.yml
rc.yml
Please advise. :) Feel free to ask me for any additional info or missing details.
Thanks!