Troubleshooting
If an ActionSet fails to perform an action, then the failure events can be seen in the respective ActionSet as well as its associated Blueprint by using the following commands:
# Example of failure events in an ActionSet:
$ kubectl --namespace kanister describe actionset <ActionSet Name>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started Action 14s Kanister Controller Executing action delete
Normal Started Phase 14s Kanister Controller Executing phase deleteFromS3
Warning ActionSetFailed Action: delete 13s Kanister Controller Failed to run phase 0 of action delete: command terminated with exit code 1
# Example of failure events of ActionSet emitted to its associated Blueprint:
$ kubectl --namespace kanister describe blueprint <Blueprint Name>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Added 4m Kanister Controller Added blueprint 'Blueprint Name'
Warning ActionSetFailed Action: delete 1m Kanister Controller Failed to run phase 0 of action delete: command terminated with exit code 1
If you ever need to debug a live Kanister system and the information available in ActionSets you might have created is not enough, looking at the Kanister controller logs might help. Assuming you have deployed the controller in the kanister
namespace, you can use the following commands to get controller logs.
$ kubectl get pods --namespace kanister
NAME READY STATUS RESTARTS AGE
release-kanister-operator-1484730505-l443d 1/1 Running 0 1m
$ kubectl logs -f <operator-pod-name-from-above> --namespace kanister
If you are not successful in verifying the reason behind the failure, please reach out to us on Slack or file an issue on GitHub. A mailing list is also available if needed.
Validating webhook for Blueprints
For the validating webhook to work, the Kubernetes API Server needs to connect to port 9443
of the Kanister operator. If your cluster has a firewall setup, it has to be configured to allow that communication.
GKE
If you get an error while applying a blueprint, that the webhook can't be reached, check if your firewall misses a rule for port 9443
:
$ kubectl apply -f blueprint.yaml
Error from server (InternalError): error when creating "blueprint.yaml": Internal error occurred: failed calling webhook "blueprints.cr.kanister.io": failed to call webhook: Post "https://kanister-kanister-operator.kanister.svc:443/validate/v1alpha1/blueprint?timeout=5s": context deadline exceeded
See GKE: Adding firewall rules for specific use cases and kubernetes/kubernetes: Using non-443 ports for admission webhooks requires firewall rule in GKE for more details.