Debugging Argo Events
Introduction⌗
Event-driven systems like Argo Events can be quite painful to troubleshoot. I'm very familiar with Argo Events and use it to achieve a lot of things; several of my posts mention it as part of the stack.
But when things do not work the way you want, you need to know what is happening and whether every component is getting the information it needs.
Debugging event flow⌗
Checking the resources⌗
- Make sure you have an EventBus deployed: kubectl get eventbus
- Make sure you have an EventSource deployed: kubectl get eventsource
- Make sure you have a Sensor deployed: kubectl get sensor
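The three checks above can be wrapped in one small helper. This is only a minimal sketch: the function name `check_argo_resources` and the default namespace are my own assumptions, and it relies on nothing more than `kubectl get <kind> -o name` printing one line per resource it finds.

```shell
# Hypothetical helper: report which Argo Events resource kinds exist in a
# namespace. Assumes kubectl is already pointed at the right cluster.
check_argo_resources() {
  ns="${1:-default}"   # namespace to inspect; "default" is an assumption
  missing=0
  for kind in eventbus eventsource sensor; do
    # -o name prints one identifier per resource, or nothing if none exist
    found="$(kubectl get "$kind" -n "$ns" -o name 2>/dev/null)"
    if [ -z "$found" ]; then
      echo "missing: $kind"
      missing=1
    else
      echo "found: $kind"
    fi
  done
  # Non-zero exit status when at least one kind is missing
  return "$missing"
}
```

Calling it as check_argo_resources argo-events would inspect the argo-events namespace instead of default.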
Even if all resources are deployed, that does not mean they are running: the controller might not have created the deployments / pods.
To check the pods behind all three resources with a single command, run: kubectl get pods
NAME                                    READY   STATUS    RESTARTS        AGE
my-eventsource-kgnvq-5b54758d96-v68nf   1/1     Running   0               4h42m
my-sensor-fnhql-597b8cc5ff-kk6xb        1/1     Running   2 (4h42m ago)   4h42m
eventbus-default-stan-0                 2/2     Running   0               4h42m
eventbus-default-stan-1                 2/2     Running   0               4h42m
eventbus-default-stan-2                 2/2     Running   0               4h42m
This should show you an EventBus, an EventSource and a Sensor. If it doesn't, something is wrong with the deployment. As this could be many things, here are a few pointers:
- Check the affected resource, e.g.:
  kubectl describe eventbus default
  kubectl describe eventsource my-eventsource
  kubectl describe sensor my-sensor
- Check the controller logs of the affected resource, e.g.:
  kubectl logs deployment/eventbus-controller -n argo-events
  kubectl logs deployment/eventsource-controller -n argo-events
  kubectl logs deployment/sensor-controller -n argo-events
Read the logs carefully; the relevant message may be hidden among the other lines.
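Before digging through controller logs by hand, it can help to narrow the pod listing from earlier in this section down to the pods that look unhealthy. The sketch below is an assumption of mine, not part of Argo Events: a hypothetical `unhealthy_pods` filter that reads the default column layout of kubectl get pods (NAME, READY, STATUS) on stdin.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name, READY and STATUS of every pod that is not fully ready or not Running.
unhealthy_pods() {
  awk 'NR > 1 {                        # skip the header line
    split($2, ready, "/")              # READY column, e.g. "1/1"
    if (ready[1] != ready[2] || $3 != "Running") print $1, $2, $3
  }'
}
```

It is meant to be used as kubectl get pods | unhealthy_pods. Empty output means every listed pod is Running with all of its containers ready; anything it prints is a good candidate for kubectl describe.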
Checking the flow⌗
If all resources are there then there could be something wrong with the flow of information.
- Check the logs of the EventSource: kubectl logs my-eventsource-kgnvq-5b54758d96-v68nf
  When an event is posted, this should show you something similar to:
  2022-12-10T04:07:37.942338269Z {"level":"info","ts":1670645257.9422746,"logger":"argo-events.eventsource","caller":"eventsources/eventing.go:512","msg":"succeeded to publish an event","eventSourceName":"my-eventsource","eventName":"webhook","eventSourceType":"webhook","eventID":"36633463313738302d653536622d346263362d383838312d613631346231313133643333"}
  If that is not the case, the EventSource is probably configured incorrectly, cannot authenticate (SQS, Slack, etc.), or the event is being sent to the wrong EventSource type.
- Check the logs of the Sensor: kubectl logs my-sensor-fnhql-597b8cc5ff-kk6xb
  This should show you something similar to:
  2022-12-10T04:07:37.3631563Z {"level":"info","ts":1670645257.3631563,"logger":"argo-events.sensor","caller":"sensors/listener.go:416","msg":"successfully processed the trigger","sensorName":"my-sensor","triggerName":"workflow-trigger","triggerType":"Kubernetes","triggeredBy":["webhook"],"triggeredByEvents":["36633463313738302d653536622d346263362d383838312d613631346231313133643333"]}
  If that is not the case, the Sensor is probably configured incorrectly. Make sure your dependency is pointing to the correct EventSource:
  apiVersion: argoproj.io/v1alpha1
  kind: Sensor
  metadata:
    name: my-sensor
    namespace: default
  spec:
    dependencies:
      - name: my-dependency
        eventSourceName: the-event-source-name
        eventName: the-name-of-the-event-source-type
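Since both log lines carry the same event ID, the two checks above can be combined to find events that were published but never triggered anything. The helpers below are a rough sketch with names I made up; they assume the log lines look like the two samples shown in this post (the message texts and the eventID / triggeredByEvents fields), which may differ between Argo Events versions.

```shell
# Hypothetical helpers for matching EventSource and Sensor logs by event ID.
# Each of the first two reads log lines on stdin.

# Print the ID of every event the EventSource reports as published.
published_event_ids() {
  grep 'succeeded to publish an event' |
    sed -n 's/.*"eventID":"\([^"]*\)".*/\1/p'
}

# Print the ID of every event that led to a successfully processed trigger.
triggered_event_ids() {
  grep 'successfully processed the trigger' |
    sed -n 's/.*"triggeredByEvents":\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' | tr -d '"'
}

# Print IDs that were published but never triggered anything.
# $1 = file with EventSource logs, $2 = file with Sensor logs.
untriggered_event_ids() {
  triggered="$(triggered_event_ids < "$2")"
  # -F fixed strings, -x whole-line match, -v keep only IDs not in the list
  published_event_ids < "$1" | grep -v -x -F -e "$triggered" || true
}
```

To use it, save the output of the two kubectl logs commands into files and pass them as the first and second argument of untriggered_event_ids. Any ID it prints reached the EventBus but did not result in a trigger, which points at the Sensor's dependencies or filters rather than at the EventSource.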
Note: when using the resource trigger to deploy a Workflow, make sure all namespaces are set correctly, especially if you're using it in conjunction with Kustomize. Kustomize will overwrite the namespace of all base Kubernetes resources, but not namespaces that are part of a custom resource (one defined by a CustomResourceDefinition), such as a Workflow embedded in a Sensor trigger. To see which namespaces end up in the output, run: kustomize build . | grep "namespace:"
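When the rendered output is large, a per-namespace count is easier to scan than raw grep output. The following is a small sketch of that idea; `namespace_counts` is a hypothetical name, and it simply matches any line whose key is namespace, at any indentation level.

```shell
# Hypothetical helper: read rendered manifests on stdin and print each
# distinct namespace value together with the number of times it occurs.
namespace_counts() {
  grep -E '^[[:space:]]*namespace:' |
    sed 's/^[[:space:]]*namespace:[[:space:]]*//' |
    sort | uniq -c
}
```

It is meant to be used as kustomize build . | namespace_counts. A namespace you did not expect to see in that list is a hint that a nested namespace field was left untouched.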
For me, the steps above were enough 9 times out of 10 to figure out what was going wrong with triggering the Sensor.