While testing a k8s workload charm with Juju 2.8-rc3, I executed the following changes in quick succession:
juju upgrade-charm [...] # caused pod spec changes, pods recycled
juju config [...] # also pod spec changes, pods recycled again
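For context, a rough sketch of the sequence (the charm path and config key below are placeholders standing in for the elided arguments, not the values actually used):
juju upgrade-charm mattermost --path ./mattermost-charm  # hypothetical local charm; pod spec changed, pods recycled
juju config mattermost some-key=new-value                # hypothetical key; pod spec changed again before the first rollout settled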
Final state:
[agnew(~)] juju status mattermost/6
Model       Controller  Cloud/Region   Version  SLA          Timestamp
mattermost  mm-rc3      k8s/localhost  2.8-rc3  unsupported  09:43:42+12:00

SAAS        Status  Store   URL
postgresql  active  mm-rc3  admin/database.postgresql

App         Version  Status  Scale  Charm       Store  Rev  OS          Address         Notes
mattermost           active    0/2  mattermost  local    4  kubernetes  10.152.183.202

Unit          Workload    Agent   Address    Ports     Message
mattermost/6  terminated  failed  10.1.1.93  8065/TCP  unit stopped by the cloud
[agnew(~)] juju debug-log --replay | grep mattermost/6
application-mattermost: 09:34:31 DEBUG juju.worker.uniter starting uniter for "mattermost/6"
application-mattermost: 09:34:31 DEBUG juju.worker.caasoperator start "mattermost/6"
application-mattermost: 09:34:31 INFO juju.worker.caasoperator start "mattermost/6"
application-mattermost: 09:34:31 DEBUG juju.worker.caasoperator "mattermost/6" started
application-mattermost: 09:34:31 DEBUG juju.worker.leadership mattermost/6 making initial claim for mattermost leadership
application-mattermost: 09:34:31 INFO juju.worker.leadership mattermost leadership for mattermost/6 denied
application-mattermost: 09:34:31 DEBUG juju.worker.leadership mattermost/6 waiting for mattermost leadership release
application-mattermost: 09:34:31 INFO juju.worker.uniter unit "mattermost/6" started
application-mattermost: 09:34:32 DEBUG juju.worker.leadership mattermost/6 is not mattermost leader
application-mattermost: 09:34:32 DEBUG juju.worker.caasoperator trigger running status for caas unit mattermost/6
application-mattermost: 09:34:35 DEBUG unit.mattermost/6.juju-log Legacy hooks/install does not exist.
application-mattermost: 09:34:35 DEBUG unit.mattermost/6.juju-log Emitting Juju event install.
application-mattermost: 09:34:35 DEBUG juju.worker.leadership mattermost/6 is not mattermost leader
application-mattermost: 09:34:39 DEBUG unit.mattermost/6.juju-log db:0: Legacy hooks/db-relation-created does not exist.
application-mattermost: 09:34:39 DEBUG unit.mattermost/6.juju-log db:0: Emitting Juju event db_relation_created.
application-mattermost: 09:34:42 DEBUG unit.mattermost/6.juju-log Legacy hooks/leader-settings-changed does not exist.
application-mattermost: 09:34:42 DEBUG unit.mattermost/6.juju-log Emitting Juju event leader_settings_changed.
application-mattermost: 09:34:42 DEBUG unit.mattermost/6.juju-log mirroring app relation data for relation 0
application-mattermost: 09:34:42 DEBUG juju.worker.leadership mattermost/6 is not mattermost leader
application-mattermost: 09:34:51 INFO juju.worker.uniter unit "mattermost/6" shutting down: executing operation "remote init": attempt count exceeded: container not running not found
application-mattermost: 09:34:51 DEBUG juju.worker.uniter.remotestate got leadership change for mattermost/6: leader
application-mattermost: 09:34:51 INFO juju.worker.caasoperator stopped "mattermost/6", err: executing operation "remote init": attempt count exceeded: container not running not found
application-mattermost: 09:34:51 DEBUG juju.worker.caasoperator "mattermost/6" done: executing operation "remote init": attempt count exceeded: container not running not found
application-mattermost: 09:34:51 ERROR juju.worker.caasoperator exited "mattermost/6": executing operation "remote init": attempt count exceeded: container not running not found
application-mattermost: 09:34:51 INFO juju.worker.caasoperator restarting "mattermost/6" in 3s
application-mattermost: 09:34:54 INFO juju.worker.caasoperator start "mattermost/6"
application-mattermost: 09:34:54 DEBUG juju.worker.caasoperator "mattermost/6" started
application-mattermost: 09:35:13 INFO juju.worker.caasoperator stopped "mattermost/6", err: executing operation "remote init": Internal error occurred: error executing command in container: failed to exec in container: failed to create exec "65308676472d86e7dbbd91c44e2e52ab28b5ed85ae135ce8a57d454d887aefbb": cannot exec in a stopped state: unknown
application-mattermost: 09:35:13 DEBUG juju.worker.caasoperator "mattermost/6" done: executing operation "remote init": Internal error occurred: error executing command in container: failed to exec in container: failed to create exec "65308676472d86e7dbbd91c44e2e52ab28b5ed85ae135ce8a57d454d887aefbb": cannot exec in a stopped state: unknown
application-mattermost: 09:35:13 ERROR juju.worker.caasoperator exited "mattermost/6": executing operation "remote init": Internal error occurred: error executing command in container: failed to exec in container: failed to create exec "65308676472d86e7dbbd91c44e2e52ab28b5ed85ae135ce8a57d454d887aefbb": cannot exec in a stopped state: unknown
application-mattermost: 09:35:13 INFO juju.worker.caasoperator restarting "mattermost/6" in 3s
application-mattermost: 09:35:16 INFO juju.worker.caasoperator start "mattermost/6"
application-mattermost: 09:35:16 DEBUG juju.worker.caasoperator "mattermost/6" started
[agnew(~)] _
With Kubernetes deployments, pods get new identities when they are restarted, so Juju marks the original unit as stopped until the pod is removed from the cluster; a new unit is created when the replacement pod spins up.
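One way to watch that correspondence (a sketch, assuming a model named mattermost; Juju puts the workload pods in a namespace matching the model name, and the pod names you see will differ):
juju status mattermost             # the terminated unit lingers until its pod is gone from the cluster
kubectl get pods -n mattermost -w  # watch the old pod terminate and the replacement appear under a new name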
It seems you've found a race in how this is handled when pod updates happen in quick succession. Restarting the controller causes Juju to re-evaluate the state of the cluster and resync the Juju model. It could also be that the pods were terminating and transitioned to removed during the controller restart.
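If you want to trigger that resync deliberately, a rough sketch (assuming the controller mm-rc3 is hosted on the same Kubernetes cloud; the controller-mm-rc3 namespace and controller-0 pod name follow Juju's usual convention, so verify them before deleting anything):
kubectl get pods -n controller-mm-rc3                  # confirm the controller pod name
kubectl delete pod controller-0 -n controller-mm-rc3   # the controller StatefulSet recreates it, and Juju resyncs against the cluster on restart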