PagerDuty#
| Description | PagerDuty creates incidents from EDA alarms and query results. |
| Author | Nokia |
| Supported OS | N/A |
| Catalog | nokia-eda/catalog |
| Language | Go |
Overview#
The PagerDuty app connects EDA to PagerDuty and creates or resolves PagerDuty incidents from EDA alarms and query results.
The app exposes two resources:
- Pager: namespace-scoped incident generator created in a user namespace such as eda.
- ClusterPager: cluster-scoped incident generator created in the EDA base namespace.
Each Pager or ClusterPager references a Kubernetes Secret that contains the PagerDuty service integration routing key under data.key.
Installation#
Before installing the app, create a secret containing the PagerDuty REST API key in the EDA base namespace. The controller uses this key when communicating with PagerDuty.
Install the app from the EDA Store, or run an AppInstaller workflow with kubectl or edactl.
Install Settings#
The app exposes the following install-time settings:
- pdApiKeySecretName: name of the secret containing the PagerDuty REST API key.
- proxyConfigMapName: ConfigMap name used for HTTP_PROXY, HTTPS_PROXY, and NO_PROXY.
- pdConfigMapName: ConfigMap name used for runtime tuning.
- pdCPULimit: CPU limit for the controller pod.
- pdMemoryLimit: memory limit for the controller pod.
These settings control the deployment in the EDA base namespace and can be provided through spec.apps[].appSettings in the AppInstaller workflow or directly in the EDA UI.
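As a rough illustration, the install settings might appear under one entry of spec.apps[] in the AppInstaller workflow as follows. This is a fragment only, and it assumes appSettings is a flat map of setting names to values. The fields that identify the app are omitted, the secret and ConfigMap names are hypothetical, and the limit values assume standard Kubernetes resource quantities.

```yaml
# Fragment of an AppInstaller workflow spec (not a complete resource).
spec:
  apps:
    # Fields that identify the PagerDuty app are omitted from this entry.
    - appSettings:
        pdApiKeySecretName: pagerduty-api-key   # hypothetical secret name
        proxyConfigMapName: pagerduty-proxy     # hypothetical ConfigMap name
        pdConfigMapName: pagerduty-tuning       # hypothetical ConfigMap name
        pdCPULimit: "500m"                      # assumed quantity format
        pdMemoryLimit: "256Mi"                  # assumed quantity format
```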
If you provide the ConfigMap named by pdConfigMapName, the controller can tune rate limiting and logging with these keys:
- PD_APIKEY_REQ_PER_MIN
- PD_APIKEY_BURST_REQUESTS
- PD_RKEY_REQ_PER_MIN
- PD_RKEY_BURST_REQUESTS
- PD_RKEY_BUFFER_SIZE
- PD_RKEY_ENQ_TIMEOUT
- PD_LOG_STATS
- PD_LOG_STATS_INTERVAL
- LOG_LEVEL
- ENABLE_WEBHOOKS
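A tuning ConfigMap might look like the sketch below. The ConfigMap name and all values are illustrative assumptions: this page lists the keys but not their accepted values or defaults, so verify them before use. The namespace eda-system stands in for the EDA base namespace, as in the other examples on this page.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pagerduty-tuning        # hypothetical; must match pdConfigMapName
  namespace: eda-system         # assumed EDA base namespace
data:
  # ConfigMap values are strings, so numbers are quoted.
  PD_RKEY_REQ_PER_MIN: "60"     # illustrative value
  PD_RKEY_BURST_REQUESTS: "10"  # illustrative value
  LOG_LEVEL: "info"             # illustrative value
```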
Getting Started#
The setup has two steps:
- Store the PagerDuty service integration routing key in a Kubernetes Secret.
- Create a Pager or ClusterPager that references the routing key secret and defines an alarm or query source.
If the secret reference is just a name, the app looks for it in the same namespace as the Pager or ClusterPager resource. You can also use namespace/name to reference a secret in another namespace.
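The two reference forms can be sketched as follows. Both fragments show only spec.routingKeySecret, and the namespace shared-secrets is a hypothetical name used for illustration.

```yaml
# Name only: the secret is looked up in the namespace of the Pager or ClusterPager.
spec:
  routingKeySecret: eda-pd-rkey
---
# namespace/name: the secret lives in another namespace (hypothetical: shared-secrets).
spec:
  routingKeySecret: shared-secrets/eda-pd-rkey
```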
Namespace behavior
Pager resources must be created outside the EDA base namespace. ClusterPager resources must be created in the EDA base namespace.
PagerDuty Routing Key Secret#
Each resource references a Kubernetes secret through spec.routingKeySecret.
The secret must contain this key:
key
For a namespace-scoped Pager, create the secret in the same namespace as the Pager, or use namespace/name in routingKeySecret. For ClusterPager, create the secret in the EDA base namespace, or use an explicit namespace/name reference.
Example routing key secret in namespace eda:
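A minimal sketch of such a secret, assuming a standard Kubernetes Opaque Secret. The name eda-pd-rkey matches the routingKeySecret used in the examples on this page, and the value is a placeholder. Kubernetes stores stringData entries base64-encoded under data, so the routing key ends up under data.key.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: eda-pd-rkey
  namespace: eda
type: Opaque
stringData:
  # Placeholder: replace with the integration routing key of your PagerDuty service.
  key: "REPLACE_WITH_ROUTING_KEY"
```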
Alarm Sources#
Use sources.alarms to create PagerDuty events for matching EDA alarms.
Notable specification fields:
- sources.alarms.include: alarm types that create PagerDuty events.
- sources.alarms.exclude: alarm types to ignore.
- sources.alarms.autoResolve: sends a PagerDuty resolve event when the alarm clears.
- sources.alarms.namespaces: limits which namespaces are watched by a ClusterPager.
Payload behavior:
- summary: <namespace> - <alarm type> - <resource>
- source: EDA-ALARMS
- component: <namespace>/<resource>
- group: alarm kind
- severity: mapped from the EDA alarm severity. major and minor are sent as error.
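For instance, under these rules a hypothetical major InterfaceDown alarm in namespace eda would produce payload fields along the following lines. The resource name is invented for illustration, the group value is left as a placeholder because alarm kinds are not listed here, and the fields are shown as YAML for readability rather than in the format sent to PagerDuty.

```yaml
# Hypothetical alarm: type InterfaceDown, severity major, namespace eda,
# resource leaf1-ethernet-1-1 (illustrative name).
summary: eda - InterfaceDown - leaf1-ethernet-1-1
source: EDA-ALARMS
component: eda/leaf1-ethernet-1-1
group: <alarm kind>
severity: error   # major and minor are both sent as error
```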
Example ClusterPager watching interface alarms:
cat << 'EOF' | kubectl apply -f -
apiVersion: pagers.eda.nokia.com/v1alpha1
kind: ClusterPager
metadata:
  name: pagerduty-alarms
  namespace: eda-system
spec:
  description: Raise incidents for interface-related alarms
  routingKeySecret: eda-pd-rkey
  sources:
    alarms:
      autoResolve: true
      include:
        - InterfaceDown
        - TopoLinkDown
      exclude:
        - InterfaceMemberDown
EOF
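To limit a ClusterPager to selected namespaces, sources.alarms.namespaces can be added to the alarm source. The sketch below assumes the field takes a list of namespace names. The resource name and the namespace eda-lab are hypothetical.

```yaml
apiVersion: pagers.eda.nokia.com/v1alpha1
kind: ClusterPager
metadata:
  name: pagerduty-alarms-scoped   # hypothetical name
  namespace: eda-system
spec:
  description: Raise incidents for interface alarms in selected namespaces
  routingKeySecret: eda-pd-rkey
  sources:
    alarms:
      autoResolve: true
      namespaces:                 # assumed to be a list of namespace names
        - eda
        - eda-lab
      include:
        - InterfaceDown
```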
Query Sources#
Use sources.query to subscribe to an EDA table and generate a PagerDuty event whenever a matching object appears or changes.
Notable specification fields:
- sources.query.table: the EDA table to watch.
- sources.query.where: filter applied to the subscription.
- sources.query.fields: optional list of fields to fetch. If omitted, all fields of the subscribed table are used.
- sources.query.autoResolve: resolves the PagerDuty incident when the matching object disappears.
- sources.query.includeDetails: copies the returned query data into PagerDuty custom_details.
- sources.query.summary, sources.query.source, and sources.query.severity: required templates.
- sources.query.component, sources.query.group, and sources.query.class: optional templates.
Query Templates#
Use Go templates to build the PagerDuty payload. Because many keys contain dots, index is usually the easiest way to reference them, for example {{ index . "node.name" }}.
The severity template must render to one of critical, error, warning, or info.
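A severity template can also branch on the query data. The fragment below is a sketch that assumes fetched fields are exposed to the template under their field names (here oper-state) and that the standard Go template functions eq and index are available. Both branches render to allowed values.

```yaml
# Fragment of sources.query (other required fields omitted).
# Assumption: the fetched oper-state field is available under the key "oper-state".
severity: '{{ if eq (index . "oper-state") "down" }}critical{{ else }}warning{{ end }}'
```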
For a namespace-scoped Pager, the controller automatically rewrites .namespace... paths so the subscription stays inside the Pager's namespace.
Example Pager that raises an incident when an interface goes down:
cat << 'EOF' | kubectl apply -f -
apiVersion: pagers.eda.nokia.com/v1alpha1
kind: Pager
metadata:
  name: pagerduty-interface-down
  namespace: eda
spec:
  description: Raise incidents for down interfaces in the eda namespace
  routingKeySecret: eda-pd-rkey
  sources:
    query:
      autoResolve: true
      table: .namespace.node.srl.interface
      fields:
        - oper-state
        - admin-state
      where: oper-state = "down" AND admin-state = "enable"
      summary: 'Interface {{ index . "interface.name" }} is down on node {{ index . "node.name" }}'
      source: '{{ index . "node.name" }}'
      severity: "critical"
      component: '{{ index . "namespace.name" }}/{{ index . "node.name" }}'
      group: "interfaces"
      class: "state-change"
      includeDetails: true
EOF
Cluster-Scoped Resources#
Use ClusterPager from the EDA base namespace when you want centralized paging across namespaces.
The fields are the same as the namespace-scoped Pager, but:
- ClusterPager must exist in the EDA base namespace.
- Alarm sources can watch selected namespaces through sources.alarms.namespaces.
- Query subscriptions are not namespace-rewritten, so they can watch cross-namespace data with fully qualified .namespace paths.
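A cluster-scoped version of an interface query might look like this sketch. It reuses the table, filter, and templates from the namespace-scoped Pager example and assumes the same keys are available in the template data. The resource name is hypothetical.

```yaml
apiVersion: pagers.eda.nokia.com/v1alpha1
kind: ClusterPager
metadata:
  name: pagerduty-interface-down-all   # hypothetical name
  namespace: eda-system
spec:
  description: Raise incidents for down interfaces across namespaces
  routingKeySecret: eda-pd-rkey
  sources:
    query:
      autoResolve: true
      # Not rewritten for a ClusterPager, so this path is not confined to one namespace.
      table: .namespace.node.srl.interface
      where: oper-state = "down" AND admin-state = "enable"
      summary: 'Interface {{ index . "interface.name" }} is down on node {{ index . "node.name" }} ({{ index . "namespace.name" }})'
      source: '{{ index . "node.name" }}'
      severity: "critical"
      component: '{{ index . "namespace.name" }}/{{ index . "node.name" }}'
```

Including the namespace in the summary and component keeps incidents from different namespaces distinguishable in PagerDuty.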
Validation Notes#
When creating resources, follow these rules:
- You must configure at least one source: alarms or query.
- Alarm sources must define at least one of include or exclude.
- Query sources must define table, summary, source, and severity.
- Query severity templates must render to one of critical, error, warning, or info.
- Query templates are validated when the resource is created or updated.