
Sentinel Incident Automation – Playbook dependencies

Intro

In this blog post I follow up on my previous blog post. There we addressed the challenge of handling the (potentially massive) delay in entity mapping for security incidents.

Here’s the link in case you missed the blog post: Sentinel Incident Automation – handle entity mapping delay in Playbooks – Workplace Ninja’s (wpninjas.ch)

Cool, you know how to solve the entity mapping case (yes, previous blog post). But how can other playbooks (LogicApps) benefit from the solution we’ve just added?

The challenge

In my case, I faced the challenge of multiple playbooks (LogicApps) triggered for a security incident:

  1. The first playbook (LogicApp) does the incredible magic and automates stuff on multiple conditions and checks
  2. The second playbook (LogicApp) even adds some information to the security incident from external sources (we all know the VirusTotal enrichment thingy)
  3. And then there is the last playbook (LogicApp), which simply alerts everyone

All playbooks (LogicApps) are triggered one by one by Microsoft Sentinel Automation rules. Automation rules run in a sequence after the full security incident has been generated in the Log Analytics workspace.

For some customers all three playbooks (LogicApps) are triggered; for others, only a subset is triggered (you get the idea). So we don’t want to permanently hardcode any links between the playbooks (LogicApps). Their trigger should remain solely within the Automation rules in Microsoft Sentinel.

So, how do we make the playbooks (LogicApps) wait for each other? Automation must neither alert on nor enrich an already closed security incident.

Playbook dependencies - Identification of the common ground

I approached the challenge by asking: what is the common ground from a playbook’s (LogicApp) perspective? Obviously the security incident!

I cannot control or delay the playbook (LogicApp) trigger in Automation rules. A static delay within the playbook (LogicApp) does not make sense either: after the delay, the playbook (LogicApp) does not get updated security incident information (e.g. that the security incident has been closed).

Loop it!

Looking at the picture below, where playbook (LogicApp)#1 gets triggered first followed by the playbook (LogicApp)#2, we leave the trigger as is; the main part lies within the playbooks (LogicApps) triggered after the first one.

Playbook (LogicApp) #1:

  1. Playbook (LogicApp) #1 adds a tag indicating that automation was applied (e.g. Tag = AutomationApplied)
    1. Entities are enumerated (through the loop part as this is the way to go in case of entity mapping delay imho)
    2. This first playbook (LogicApp) has a maximum overall runtime of Duration(EntityMappingLoopMaxRunTime [minutes]) + Duration(AutomationSteps [seconds])
    3. So our assumption is that the first playbook (LogicApp) is done within 10 minutes (max)
    4. If no automation was applied, the tag will be removed
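
The Playbook #1 flow above can be sketched in a few lines of Python. This is only an illustration of the logic, not a LogicApp definition: the `Incident` class, `run_first_playbook` and `max_runtime_seconds` are hypothetical names, the entity mapping loop from the previous post is left out, and the 9 minutes + 60 seconds split in the usage note is just one assumed way to arrive at the 10 minutes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

AUTOMATION_TAG = "AutomationApplied"  # example tag name used in this post


@dataclass
class Incident:
    """In-memory stand-in for a security incident (hypothetical, for illustration)."""
    tags: Set[str] = field(default_factory=set)
    entities: List[str] = field(default_factory=list)


def run_first_playbook(incident: Incident,
                       automate: Callable[[List[str]], bool]) -> bool:
    """Tag the incident first, automate, and remove the tag if nothing was applied."""
    incident.tags.add(AUTOMATION_TAG)  # signal to later playbooks: automation in progress
    applied = False
    try:
        if incident.entities:  # no entities means nothing to automate
            applied = automate(incident.entities)
    finally:
        if not applied:
            # Nothing was automated (or automate() raised): release the incident
            incident.tags.discard(AUTOMATION_TAG)
    return applied


def max_runtime_seconds(entity_loop_max_minutes: float,
                        automation_steps_seconds: float) -> float:
    """Maximum overall runtime = entity mapping loop limit + automation steps."""
    return entity_loop_max_minutes * 60 + automation_steps_seconds
```

For example, `max_runtime_seconds(9, 60)` gives 600 seconds, i.e. the 10 minutes assumed above. Removing the tag in a `finally` block is a design choice of this sketch: if the automation step fails, the incident should not stay marked as automated.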

Playbook (LogicApp) #2:

  1. Playbook (LogicApp) #2 loops until the tag (e.g. AutomationApplied) is no longer present on the security incident
    1. The maximum runtime of this loop must be slightly greater than the maximum overall runtime of Playbook (LogicApp) #1
      Why? Playbook (LogicApp) #1 has then certainly finished, so a removed tag tells us that the first playbook (LogicApp) was not able to automate the security incident (e.g. no entities => loop timed out or other conditions were not met)
  2. After the loop, check whether the automation tag is still present. If the tag is still there, we know the security incident has been automated and we can cancel the workflow in this playbook (LogicApp)
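
The wait logic of Playbook #2 can be sketched as a polling loop with a deadline. Again, this is a minimal Python illustration under assumptions, not the actual LogicApp: `fetch_tags` is a hypothetical callable that returns the incident's current tags on every call (so each iteration sees fresh incident data), and the 30-second poll interval is an arbitrary example value.

```python
import time
from typing import Callable, Iterable

AUTOMATION_TAG = "AutomationApplied"  # example tag name used in this post


def wait_for_tag_removal(
    fetch_tags: Callable[[], Iterable[str]],  # hypothetical: re-reads the incident's tags
    max_wait_seconds: float,
    poll_interval_seconds: float = 30.0,      # assumed value, tune as needed
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the automation tag is gone or the deadline has passed.

    Returns True if the tag is gone (Playbook #1 did not automate, so continue),
    False if the tag is still present at the deadline (automated, so cancel).
    """
    deadline = clock() + max_wait_seconds
    while True:
        if AUTOMATION_TAG not in set(fetch_tags()):
            return True   # tag removed or never set: go on with this playbook
        if clock() >= deadline:
            return False  # tag survived the whole window: incident was automated
        sleep(poll_interval_seconds)
```

With Playbook #1 assumed to finish within 10 minutes, a call such as `wait_for_tag_removal(fetch_tags, max_wait_seconds=11 * 60)` keeps the loop slightly longer than the first playbook's maximum runtime; the extra minute is an assumption to tune. A `False` result is the point where this playbook would terminate its workflow.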

Closing thoughts

Play around with the timeouts. You might want to fine-tune these. Furthermore, from my experience, it’s worth first enriching the security incidents with comments describing what would happen before actually terminating playbooks (LogicApps) – especially for alerting playbooks (LogicApps), which are crucial to SLA/contract agreements!

This approach adds LogicApp runtime, with a direct impact on costs/budget. From my point of view, and based on this scenario, I see that:

  1. Microsoft Sentinel currently does not provide a safe option to control how Automation rules trigger playbooks (LogicApps) in a sequence based on the results of previous runs
  2. It is better to pay for the additional runtime than to have a human do the automation checks manually; an analyst woken up during the night or performing the checks during work time is way more expensive

This solution is definitely not a beauty. I hope we see better, built-in options to solve this challenge.

Interested in technical details or, even better, have you already implemented a better approach? Leave a comment, happy to learn ;-)!
