I've been involved in the DevOps world for a while and yet, I finished reading The Phoenix Project only recently. This piqued my interest in how teams execute their incident response playbooks. It's enlightening to see the different approaches teams take to hone what works best for them.

(Disclaimer: The Author is the Head of Developer Relations at n8n)

I wanted to see what automating a minimalist incident response playbook would look like, and I decided to test it out with three of my favorite tools: n8n, PagerDuty, and Mattermost. Here's a quick introduction to the three tools, in case you aren't aware of them:

- n8n is a fair-code licensed tool that helps you automate tasks, sync data between various sources, and react to events, all via a visual workflow editor.
- PagerDuty is a SaaS incident response platform for IT departments in companies.
- Mattermost is a flexible and open-source messaging alternative to Slack.

To avoid panic during an incident, a lot of companies have an incident response playbook. I created a minimalist six-step playbook for this tutorial. Whenever a service goes down or something unexpected happens, the on-call team would follow this high-level protocol:

1. Triage issue in Jira
2. Create auxiliary channel
3. Invite the on-call team to the channel
4. Acknowledge the issue
5. Fix the issue
6. Resolve the ticket

We will automate this playbook with three workflows in n8n, and this is how the result will look once we are done.

Workflow 1: Make sure everyone knows what happened

Our first workflow will cover the first three steps of the playbook. Whenever a service goes down and creates an incident report on PagerDuty, we want the workflow to automate the following tasks for us:

- A webhook gets triggered and informs a general Incidents channel on Mattermost that something is wrong.
- Create an auxiliary channel for the specific incident, invite the on-call team to it, and share its link for those interested in the incident.
- Triage an issue on Jira.
- Share the links of the auxiliary channel, the PagerDuty incident, and the Jira issue in the Incidents channel and the auxiliary channel.
- Share action buttons in the auxiliary channel to acknowledge and resolve the incident.

Let's get started with the nodes of the first workflow. I have also submitted Workflow 1 on n8n.io, in case you'd like to skim through this workflow. Please note that you'll still need to configure a couple of things like your credentials, the channels on Mattermost, and the settings of the nodes. You can find information on how to set up n8n in the documentation.

1. Webhook node: Get data from PagerDuty

First of all, we need to pull in the new incident reports from PagerDuty. To do that, start n8n with the tunnel parameter:

n8n start --tunnel

Note: Make sure that you don't forget to add the --tunnel parameter.

Add a new node by clicking on the + button on the top right of the Editor UI. Select the Webhook node under the Triggers section.

In the Node Editor view, set the HTTP method to POST. For the Path, I have entered webhook, but feel free to add something else here according to your preferred convention. Now, you'll need to save the workflow. I named it "Incident Response Workflow". Once the workflow is saved, click on Webhook URLs, select Test, and then click on the URL to copy it to the clipboard.

Note: Don't forget to save the workflow first before copying the Webhook URLs.

Here's a GIF of me following the steps mentioned above.

Now that we have our Webhook node ready on n8n, we'll need to configure the settings on PagerDuty so that it sends the new incident reports to the webhook. Unless your team already uses PagerDuty, you can create a free trial account on PagerDuty. If you are creating a new account, you'll also have to create a service that PagerDuty will be monitoring. PagerDuty integrates with a lot of services so that it can monitor them in case something goes wrong.
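Before pointing PagerDuty at the webhook, it can be handy to poke the Test URL yourself and watch the Webhook node react. The following is only a minimal sketch in Python: the URL is a placeholder for the Test URL you copied, the payload is arbitrary, and the helper names `build_test_request` and `send` are made up for this illustration.

```python
import json
import urllib.request

# Placeholder: replace with the Test URL copied from your own Webhook node.
TEST_WEBHOOK_URL = "https://example.invalid/your-test-webhook-url"


def build_test_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the Webhook node."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # matches the HTTP method set on the Webhook node
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code of the response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    req = build_test_request(TEST_WEBHOOK_URL, {"hello": "n8n"})
    print(req.get_method(), req.full_url)
    # Uncomment once TEST_WEBHOOK_URL points at your own n8n instance
    # and the workflow is listening for a test event:
    # print(send(req))
```

The request is built separately from sending it so that you can inspect exactly what would be posted before any network call is made.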
Once you have created your service, let's configure the webhooks for the service. To do that, select the Configuration menu on the top and click on Services. Click on the More button on the right side and select View Integrations from the menu (do this for the service that you want to configure the webhook for). Now, under the section called Extensions, click on the New Extension button and select "Generic V2 Webhook" as the Extension Type. I entered n8n as the name and entered the URL that I copied from the Webhook node. Click on the Save button and we are done!

Here's a GIF of me following the steps mentioned above.

Now, click on the Execute Workflow button to register the webhook. Once you've done that, you can create a new incident on PagerDuty. Your Webhook node will receive all the details. Keep in mind that the Test webhooks are only valid for 120 seconds. It should look something like the following image.

At times, when you are sending too many requests from PagerDuty, it will disable the webhook. You'll have to re-enable it by going to the list of extensions and clicking on the Re-enable button.

2. Mattermost node: Create an auxiliary channel

Now, we need to create a Mattermost node that will create an auxiliary channel so that the on-call team can coordinate on a fix for the incident. To do that, click on the + button and click on the Mattermost node.

In the Node Editor, enter your Mattermost credentials. Here's some detailed information on how to create an access token for the credentials. I have used an access token from a bot account, but you can also use the access token from your own account.

Note: Throughout the tutorial, please make sure that the nodes are connected properly before you start the configuration in the Node Editor. If you don't do this, the variables mentioned in the tutorial might not be visible to you.
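Since the rest of the workflow reads fields out of whatever the Webhook node received, it may help to picture that data as nested JSON. The sketch below is in Python and uses a trimmed, invented payload: the key names mirror the Variable Selector paths used in this tutorial, but every value is made up, real PagerDuty payloads carry many more fields, and the helpers `dig` and `incident_fields` are hypothetical names for this illustration.

```python
from typing import Any

# Trimmed, made-up example of the data exposed by the Webhook node.
# Only the keys used in this tutorial are shown; all values are invented.
SAMPLE_WEBHOOK_JSON = {
    "body": {
        "messages": [
            {
                "id": "msg-0001",
                "incident": {
                    "id": "PINC001",
                    "summary": "Checkout service is down",
                    "html_url": "https://example.invalid/incidents/PINC001",
                },
                "log_entries": [
                    {
                        "incident": {
                            "summary": "Checkout service is down",
                            "html_url": "https://example.invalid/incidents/PINC001",
                        }
                    }
                ],
            }
        ]
    }
}


def dig(data: Any, *path: Any) -> Any:
    """Follow a path of dict keys and list indexes, one step at a time."""
    for step in path:
        data = data[step]
    return data


def incident_fields(webhook_json: dict) -> dict:
    """Collect the handful of fields the three workflows rely on."""
    message = dig(webhook_json, "body", "messages", 0)
    return {
        "message_id": message["id"],
        "incident_id": dig(message, "incident", "id"),
        "summary": dig(message, "log_entries", 0, "incident", "summary"),
        "html_url": dig(message, "incident", "html_url"),
    }


if __name__ == "__main__":
    for name, value in incident_fields(SAMPLE_WEBHOOK_JSON).items():
        print(f"{name}: {value}")
```

Each argument passed to `dig` corresponds to one level you would click through in the Variable Selector, with `0` standing in for `[Item: 0]`.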
Once you are all sorted out with the credentials, select "Channel" as the Resource in the Node Editor. Now select your team as the Team ID (in case you are unable to acquire that, please check with your system admin). We now need to enter a Display Name for the channel. Since this is a dynamic piece of information, click on the gears icon next to the field and select Add Expression. Select the following in the Variable Selector:

Nodes > Webhook > Output Data > JSON > body > messages > [Item: 0] > log_entries > [Item: 0] > incident > summary

Quite some indentation, I know! This makes sure that the display name of the channel is the same as the incident summary on PagerDuty, to keep things coherent.

Now you need to enter a Name. This needs to be a unique value, so we'll select the id from the incident report. Click on Add Expression and select the following in the Variable Selector:

Nodes > Webhook > Output Data > JSON > body > messages > [Item: 0] > id

Perfect, now click on Execute Node and this will create an auxiliary channel on Mattermost.

Here's a GIF of me following the steps mentioned above.

3. Mattermost node: Add on-call team to auxiliary channel

Once the auxiliary channel has been created, we need to make sure that all the on-call team members have been added to the channel. For now, however, we'll add a single user to the channel. To do that, create another Mattermost node.

Select the credentials that you entered earlier. Select "Channel" as the Resource and click on "Add User" for the Operation. Now we have to specify the Channel ID where the user should be added. Since this is another dynamic piece of information, click on Add Expression and, in the Variable Selector, select the following:

Nodes > Mattermost > Output Data > JSON > id

Now we will specify a user by selecting ourselves from the dropdown list for User ID. Click on the Execute Node button and you will notice that you have been added to the channel.
This node ensures that the specified user is always added to the auxiliary channel created by the workflow.

Here's a GIF of me following the steps mentioned above.

As an exercise, try using the PagerDuty API to pull a list of the email IDs of the people who are on-call and add them to the auxiliary channel in Mattermost. Feel free to pick this up once you are finished with the tutorial.

4. Jira Software node: Triage the issue in Jira

Since the playbook specifies that the issue should also be triaged in Jira, we'll need to add a node that creates a ticket in Jira. To do that, create a Jira node by clicking on the + button on the top right.

In the Node Editor, enter the Credentials for Jira. Here's detailed information on how you can create a new API Token for the credentials.

Once you are sorted out with the Credentials, select the Project where the tickets will be created. I selected a test project that I created specifically for this tutorial. In the Issue Type, I selected "Story", but feel free to select "Bug" or something else. Summary is a dynamic piece of information, so select Add Expression and pick the summary variable just like you did for the Display Name section while configuring the Mattermost node to create a channel.

Click on Execute Node and this will create a Jira ticket for you.

Here's a GIF of me following the steps mentioned above.

5. Mattermost node: Post details in the Incidents channel

The next thing that needs to be done is to post the details of the incident in the Incidents channel. We will need to share the following information in the channel:

- Summary of the incident
- Link to the auxiliary channel
- Link to the PagerDuty incident
- Link to the Jira ticket

Sharing these pieces of information ensures that if someone outside of the on-call team is interested in checking out what is going on, they can get this information from the Incidents channel.

To do this, create a new Mattermost node. In the Node Editor, select your Credentials.
Now we need to enter the Channel ID. Since this is not a dynamic piece of information (the Incidents channel will always be there and hence the ID will remain the same), we need to grab its Channel ID.

If you don't already have a channel like this for the tutorial, you can manually create a new channel on Mattermost. To get its ID, click on the down arrow next to the channel name and click on the View Info option. This will reveal the ID of the channel. You can then copy and paste that in the Channel ID field in the node.

In the Message section, I entered the following expression to include the information that we mentioned in the list above.

🚨 New incident: {{$node["Webhook"].json["body"]["messages"][0]["incident"]["summary"]}}
Auxiliary Channel -> https://mattermost.internal.n8n.io/test/channels/{{$node["Mattermost"].json["name"]}}
PagerDuty Incident -> {{$node["Webhook"].json["body"]["messages"][0]["incident"]["html_url"]}}
Jira Issue -> https://n8n.atlassian.net/browse/{{$node["Jira Software"].json["key"]}}

Finally, click on the Execute Node button to send this information to your Incidents channel.

Here's a GIF of me following the steps mentioned above.

6. Mattermost node: Post details and action buttons in the auxiliary channel

As the last step of this workflow, we need to provide the information that we talked about in the previous node to the auxiliary channel as well. Moreover, we will need to provide the following two buttons in the channel:

- Acknowledge: Clicking this button will change the status of the incident on PagerDuty from "Triggered" to "Acknowledged".
- Resolve: Clicking this button will change the status of the incident on PagerDuty from "Acknowledged" to "Resolved" and mark the ticket in Jira as "Done".

To do this, create a new Mattermost node and connect it to the Jira node. This will ensure that this and the previous Mattermost node can run in parallel. In the Node Editor, select your Credentials.
Next, you'll need to enter the Channel ID of the auxiliary channel. You can follow the steps mentioned in Workflow 1, Step 3 to do that. In the Message section, I entered the following expression (this is quite similar to the Message from the previous node):

⚠️ {{$node["Webhook"].json["body"]["messages"][0]["log_entries"][0]["incident"]["summary"]}}
PagerDuty incident: {{$node["Webhook"].json["body"]["messages"][0]["log_entries"][0]["incident"]["html_url"]}}
Jira issue: https://n8n.atlassian.net/browse/{{$node["Jira Software"].json["key"]}}

Now, we need to create the buttons that will trigger the actions we talked about. To do that, under Attachments, click on the Add attachment button, click on Add attachment item, and select Actions. Then click on the Add Actions button and name it Acknowledge.

Now click on the Add Integration button. This will allow us to give the URL of the webhook that this button will trigger when clicked. We'll leave this empty for now.

We'll also need to send details (to the next workflow) about the PagerDuty incident to mark as acknowledged when the button is clicked. To do that, click on the Add Context to Integration button under the Context section. We'll enter pagerduty_incident as the Property Name. Since the Property Value is a dynamic piece of information, click on Add Expression. In the Variable Selector, select the following:

Nodes > Webhook > Output Data > JSON > body > messages > [Item: 0] > incident > id

Now, add another button called Resolve by following the same steps mentioned above. For this button, we'll need to add the context of the PagerDuty incident and the Jira ticket key. I'll leave this as an exercise for you. For the sake of uniformity, you can name the Property Name jira_key.
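To make the button setup a little more concrete, here is a rough Python sketch of the kind of attachment the two buttons correspond to. It follows the shape Mattermost documents for interactive message buttons as I understand it (`actions`, each with an `integration` holding a `url` and a `context`); what the n8n node sends under the hood may differ. The function names, the URLs, and the incident and ticket values are all placeholders invented for this illustration.

```python
import json


def button(name: str, url: str, context: dict) -> dict:
    """Describe one button; the context travels with the click to the URL."""
    return {
        "id": name.lower(),  # simple identifier derived from the label
        "name": name,        # label shown on the button
        "type": "button",
        "integration": {"url": url, "context": context},
    }


def incident_buttons(ack_url: str, resolve_url: str,
                     pagerduty_incident: str, jira_key: str) -> list:
    """Build a single attachment carrying the Acknowledge and Resolve buttons."""
    return [
        {
            "actions": [
                button("Acknowledge", ack_url,
                       {"pagerduty_incident": pagerduty_incident}),
                button("Resolve", resolve_url,
                       {"pagerduty_incident": pagerduty_incident,
                        "jira_key": jira_key}),
            ]
        }
    ]


if __name__ == "__main__":
    attachments = incident_buttons(
        "https://example.invalid/acknowledge-webhook",  # placeholder URL
        "https://example.invalid/resolve-webhook",      # placeholder URL
        "PINC001",                                      # made-up incident ID
        "TEST-1",                                       # made-up Jira key
    )
    print(json.dumps(attachments, indent=2))
```

Because the context is stored with each button, the workflow that receives the click can read `pagerduty_incident` (and `jira_key` for Resolve) straight from the request instead of guessing which incident the click belongs to.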
In case you were wondering, it is important to send the context with the buttons, as there might be multiple auxiliary channels at any given time and multiple people clicking on different Acknowledge and Resolve buttons. We need the correct context so that we don't close the wrong PagerDuty incidents and Jira tickets by mistake.

Click on the Execute Node button to send all this information to the auxiliary channel.

Here's a GIF of me following the steps mentioned above.

Workflow 2: Make sure that the incident is acknowledged

Our second workflow will cover the fourth step of the playbook. Once all the people responsible have been notified that an incident has occurred, we need to make sure that there is a quick and easy way to acknowledge the incident, so that it is clear that someone in the on-call team has got it.

Let's get started with the nodes of the second workflow. I have also submitted Workflow 2 on n8n.io, in case you'd like to skim through this workflow. Please note that you'll still need to configure a couple of things like your credentials as well as the settings of the nodes.

1. Webhook node: Get data from the Acknowledge button

We now need to set up a Webhook node that listens for the event when somebody clicks on the Acknowledge button in the auxiliary channel.

Create a Webhook node the same way you did in Workflow 1, Step 1. Now copy the Test link of the webhook from this Webhook node, go to the node from Workflow 1, Step 6, and paste it in the URL field in the Integration section of the Acknowledge button under Actions.

Once you are done with that, click on the Execute Node button to register the webhook and test it by clicking on the Acknowledge button in the auxiliary channel.

Here's a GIF of me following the steps mentioned above.

2. PagerDuty node: Acknowledge the incident on PagerDuty

Now we need to get the ID of the incident from the Webhook node to know which incident to mark as acknowledged.
We get this information from the context that we added to the Integration of the button.

Add a PagerDuty node by clicking on the + button on the right side. In the Node Editor view, first of all, you'll have to enter the Credentials for PagerDuty. Here's detailed information on how you can create a new API Token for the credentials. Once you are done with that, select "Update" as the Operation. Since the Incident ID is a dynamic piece of information, click on Add Expression and select the following in the Variable Selector:

Nodes > Webhook > Output Data > JSON > body > context > pagerduty_incident

In the Email field, I have just entered my email. In the Update Fields section, click on the Add Field button and select Status. From the dropdown list in the Status field, select "Acknowledged".

Now, click on the Execute Workflow button. Go to the auxiliary channel and click on the Acknowledge button. This will change the status of your incident report from "Triggered" to "Acknowledged".

Here's a GIF of me following the steps mentioned above.

3. Mattermost node: Confirm the acknowledgment

Now we just need to confirm the change of status of the PagerDuty incident by sending a message to the auxiliary channel. I'll leave this as an exercise for you. In case you run into any trouble, here's a GIF of me creating this node.

Workflow 3: Make sure that everything is marked resolved after the fix

Our third workflow will cover the sixth step of the playbook. Once the issue has been fixed, we need to make sure that the incident on PagerDuty has been marked as "Resolved" and the ticket on Jira has been marked as "Done". We also need to ensure that everyone in the Incidents channel and the auxiliary channel is aware of the resolution.

Let's get started with the nodes of the third workflow. The nodes of this workflow have been left as an exercise for you. I have added GIFs for the nodes and have also submitted Workflow 3 on n8n.io, in case you run into any trouble.
Please note that you'll still need to configure a couple of things like your credentials as well as the settings of the nodes.

1. Webhook node: Get details from the Resolve button

Just like in the last workflow, we need a Webhook node that listens for the event when somebody clicks on the Resolve button in the auxiliary channel. Here's a GIF of me creating this node.

2. PagerDuty node: Resolve the incident on PagerDuty

Now we need to change the status of the PagerDuty incident from "Acknowledged" to "Resolved". This is very similar to Workflow 2, Step 2. Here's a GIF of me creating this node.

3. Jira Software node: Resolve the incident on Jira

Now we need to update the status of the Jira ticket to "Done". Here's a GIF of me creating this node.

4. Mattermost nodes: Announce the resolution in the auxiliary and Incidents channels

Lastly, we need to create two Mattermost nodes:

- One to acknowledge in the auxiliary channel that the incident report on PagerDuty and the ticket on Jira have been resolved.
- One to announce in the Incidents channel that the incident has been resolved.

Here's a GIF of me creating this node.

Congratulations, you successfully built an automated incident response workflow using n8n, PagerDuty, and Mattermost.

Let's run the whole system end to end. First of all, you'll have to click on the Execute Workflow button on all three workflows to register the Webhook nodes. Go ahead and get started by creating a new incident on PagerDuty.

Now, to make sure that the workflows run permanently without you having to press the Execute Workflow button on all three workflows before each incident creation, we'll need to use the Production webhook.

To do that, you'll just need to get the Production webhook URL from the different Webhook nodes, update the URLs on PagerDuty and in the Mattermost node from Workflow 1, Step 6, save the workflows, and finally activate the workflows. This will make your workflows ready to use.
Note: When working with a Production webhook, please ensure that you have saved and activated the workflow. Don't forget that the data flowing through the webhook won't be visible in the Editor UI with the Production webhook.

Conclusion

Today we created an automated incident response workflow using a variety of n8n nodes. The first-class support for webhooks and APIs allows n8n to integrate a very wide array of services and products to create powerful workflows in a simplified way.

This was an example of automating a minimalist incident response playbook. Which other services are you using for managing incidents in your organization? In case you have created other workflows with n8n that use different nodes, I'd love to check them out, so please consider sharing those workflows with the community.

In case you've run into an issue while following the tutorial, feel free to reach out to me on Twitter or ask for help on our forum.