How to Improve Alert Management?
This page is a comprehensive guide to Advanced Alert Management for SOCs.
What is Alert Management?
In modern SOCs, security analysts deal with a large number of alerts generated by tools such as SIEMs and EDRs. Alerts indicate potentially malicious activities that organizations face day in, day out, and are generated by detection rules built on collected event logs. Field reports show that the number of alerts generated in a SOC can range from tens of thousands to millions per day. Alert management defines a set of processes for handling these large numbers of alerts.
Five main areas indicate whether alert management in a SOC is handled effectively:
- Alert coverage: Are alerts triggered for all security events? Or, are there security events that go undetected?
- Percentage of false alerts: What percentage of the alerts do not indicate real security events?
- The number of alerts triaged: How many alerts are triaged in a given time period?
- Speed of handling alerts: How quickly are alerts triaged?
- Assigning the right priority: Are alerts handled with the right priorities?
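As a rough illustration, the five measures above can be computed from triage records. The sketch below is in Python and rests on assumptions: the `Alert` fields, the `alert_metrics` function and the sample values are hypothetical and do not come from any specific SIEM or EDR. It also assumes the total number of security events is known, for example from a controlled exercise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    # All fields are hypothetical; real SIEM/EDR exports will differ.
    alert_id: str
    priority: int                       # priority assigned at triage, 1 = highest
    true_priority: int                  # priority confirmed after investigation
    is_false_positive: bool
    minutes_to_triage: Optional[float]  # None if the alert was never triaged


def alert_metrics(alerts, detected_events, total_events):
    """Summarize the five alert-management measures for one time period."""
    triaged = [a for a in alerts if a.minutes_to_triage is not None]
    false_alerts = [a for a in triaged if a.is_false_positive]
    well_prioritized = [a for a in triaged if a.priority == a.true_priority]
    n = len(triaged)
    return {
        "coverage": detected_events / total_events if total_events else 0.0,
        "false_alert_rate": len(false_alerts) / n if n else 0.0,
        "alerts_triaged": n,
        "mean_minutes_to_triage": (
            sum(a.minutes_to_triage for a in triaged) / n if n else 0.0
        ),
        "priority_accuracy": len(well_prioritized) / n if n else 0.0,
    }


# Made-up sample data for one reporting period.
alerts = [
    Alert("a1", 1, 1, False, 12.0),
    Alert("a2", 3, 3, True, 4.0),
    Alert("a3", 2, 1, False, 20.0),
    Alert("a4", 3, 3, False, None),  # never triaged
]
metrics = alert_metrics(alerts, detected_events=8, total_events=10)
```

Tracking these values per week or per month makes it easier to see whether a change in rules or staffing actually moved any of the five measures.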
How Do You Best Improve Detection Rate and Speed?
Alert management is a field that requires utmost collaboration among cybersecurity teams and should be managed holistically, with end-to-end planning and execution. To run a SOC effectively, pre-alert and post-alert functions should work well and be strategically aligned.
The pre-alert stage involves log management and detection rule engineering, and this is where data turns into knowledge and action. Log management provides visibility into security events that take place in a network.
Detection engineering adds intelligence to log data. Detection engineers write rules to trigger alerts if:
- A data point in a log is worth an analyst's attention.
- Security events that take place on different user devices, in different network zones, or at different times are part of the same cyber attack chain.
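The second condition is essentially a correlation rule. Here is a minimal sketch in Python, assuming hypothetical event fields (`user`, `host`, `stage`, `timestamp`), made-up stage names and an arbitrary one-hour window. Real detection platforms express this kind of logic in their own rule languages.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical attack-chain stages that, seen together, warrant an alert.
CHAIN_STAGES = {"initial_access", "lateral_movement", "exfiltration"}
WINDOW = timedelta(hours=1)  # assumed correlation window


def correlate(events):
    """Return users whose events cover every chain stage within WINDOW."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event["user"]].append(event)

    flagged = []
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e["timestamp"])
        for i, first in enumerate(user_events):
            # Events from this starting point that fall inside the window.
            in_window = [
                e for e in user_events[i:]
                if e["timestamp"] - first["timestamp"] <= WINDOW
            ]
            if CHAIN_STAGES <= {e["stage"] for e in in_window}:
                flagged.append(user)
                break
    return flagged


t0 = datetime(2024, 1, 1, 9, 0)
events = [
    {"user": "alice", "host": "laptop-1", "stage": "initial_access", "timestamp": t0},
    {"user": "alice", "host": "srv-db", "stage": "lateral_movement",
     "timestamp": t0 + timedelta(minutes=20)},
    {"user": "alice", "host": "srv-db", "stage": "exfiltration",
     "timestamp": t0 + timedelta(minutes=45)},
    {"user": "bob", "host": "laptop-2", "stage": "initial_access", "timestamp": t0},
    {"user": "bob", "host": "srv-web", "stage": "exfiltration",
     "timestamp": t0 + timedelta(hours=5)},
]
flagged_users = correlate(events)
```

In this sample only `alice` is flagged: her events span two hosts but complete the chain within the window, while `bob`'s events are too far apart and miss a stage.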
Quality and coverage of logs and detection rules are key in defining the success of the pre-alert stage.
The post-alert stage involves alert triage. Alert triage can result in three potential actions:
- Escalate for further triage,
- Escalate as an incident,
- Drop (due to false positives, identification of low/no risk status, etc.)
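The three outcomes can be modeled as a small decision function. This is only a sketch: the 0 to 100 `risk_score` scale, the thresholds of 20 and 70, and the `evidence_complete` flag are assumptions chosen for illustration, not values from any standard.

```python
from enum import Enum


class TriageAction(Enum):
    ESCALATE_FOR_TRIAGE = "escalate for further triage"
    ESCALATE_AS_INCIDENT = "escalate as an incident"
    DROP = "drop"


def triage(is_false_positive, risk_score, evidence_complete):
    """Map an analyst's findings to one of the three triage outcomes.

    The 0-100 risk_score scale and both thresholds are assumptions.
    """
    if is_false_positive or risk_score < 20:
        return TriageAction.DROP
    if risk_score >= 70 and evidence_complete:
        return TriageAction.ESCALATE_AS_INCIDENT
    # Not clearly benign, not yet a confirmed incident: needs a deeper look.
    return TriageAction.ESCALATE_FOR_TRIAGE
```

For example, `triage(False, 85, True)` returns `TriageAction.ESCALATE_AS_INCIDENT`, while `triage(False, 50, False)` returns `TriageAction.ESCALATE_FOR_TRIAGE`.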
Factors such as the quality of alerts, the skill set of security analysts, and the use of automation define the effectiveness of the alert triage phase.
Pre-alert and post-alert stages are tightly interconnected and should be operationalized as part of the same function. Any shortcoming in the chain of log management, detection rule engineering and alert triage affects the overall efficiency.
What are the Challenges of Alert Management?
Security analysts and detection engineers face a good number of challenges as part of alert management.
Alert Fatigue
First and foremost, SOCs need to handle far too many alerts: no stone can be left unturned, yet each analyst can only do so much. Different reports indicate that a security analyst can triage 10 to 20 alerts per day, whereas a SOC receives alerts in the range of thousands to millions, depending on the size and vertical of the business. As a consequence, alert fatigue has become a widespread phenomenon that causes stress, dissatisfaction at the workplace and high employee turnover. In 2019, 8 out of 10 SOCs experienced 10% to 50% analyst turnover (for more information: Critical Start, Research Report: The Impact of Security Alert Overload, 2019).
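A quick back-of-the-envelope calculation shows the scale of the gap. The sketch below uses the 10 to 20 alerts per analyst per day quoted above. The daily volume of 10,000 alerts is an illustrative assumption picked from the low end of the quoted range, and `analysts_needed` is a hypothetical helper.

```python
import math


def analysts_needed(alerts_per_day, alerts_per_analyst_per_day):
    """Analysts required to manually triage every alert, rounded up."""
    return math.ceil(alerts_per_day / alerts_per_analyst_per_day)


ALERTS_PER_DAY = 10_000  # illustrative assumption, low end of the range

best_case = analysts_needed(ALERTS_PER_DAY, 20)   # fastest quoted triage rate
worst_case = analysts_needed(ALERTS_PER_DAY, 10)  # slowest quoted triage rate
```

Under these assumptions, purely manual triage would need between 500 and 1,000 analysts, which is why filtering, prioritization and automation matter.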
Tracking and Adapting to Changes in Risk Factors
Data that SOCs collect and process can easily fall out of date. Changes in the adversarial landscape and network environment may not always be seen or captured, for reasons inside or outside the remit of SOCs. Being blind to new attack techniques and unaware of network changes or new applications may mean that no alert is raised for high-severity security events, which puts digital assets at risk.
Adapting the detection rule set to external and internal changes is also a challenge in its own right. Detection engineers need advanced cyber security, platform and software knowledge to create rules that trigger alerts to start detection and response activities. Rule development is a tedious process. It can take time and, if not handled well, rules can cause false alerts or fail to generate alerts when they should.
False Positives, Alert Noise and Challenge of Prioritization
According to a study by Ponemon, on average 25% of the alerts a SOC deals with today are false positives and 55% of the alerts are not investigated. An excessive number of alerts and shortcomings in detection rule quality and log content contribute to the high false-positive rate, alert noise and difficulty in prioritization.
Skill Set Limitations
In a SOC, a good number of junior and senior cyber security professionals work together. Junior cyber security professionals monitor, manage and assign priority to alerts. They watch the logs and related alerts of network intrusion prevention systems and other controls to strengthen defences, and they investigate suspicious emails. In handling these tasks, junior members often need guidance. Senior members, scarce in number, mentor their junior colleagues while dealing with high-severity incidents under constraints of time and budget.
Risk-Based Alert Handling
No matter what, there will be a large number of security events taking place in an enterprise network. The most effective way to deal with this heavy load is to understand security event patterns, categorize and prioritize accordingly, and deal with categories as a whole. Defining user behaviours and mapping adversarial activities to frameworks such as MITRE ATT&CK and the Cyber Kill Chain help a great deal. Being in control of the attack surface, knowing what the key assets are and what needs to be secured before others also helps prioritize better.
The territory between a broadly defined detection rule set and a narrowly defined one is wide. The former produces a large number of alerts, false positives, noise and alert fatigue. The latter may miss critical security events. Managing alerts based on risk is key to striking the right balance.
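One simple way to manage alerts by risk is to weight each alert's severity by the criticality of the affected asset and work the queue in that order. The sketch below is an illustration only: the weights, the asset categories and the alert fields are hypothetical and should be replaced with the organization's own risk model.

```python
# Hypothetical weights; tune these to the organization's own risk model.
SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}
ASSET_CRITICALITY = {"workstation": 1, "server": 2, "domain_controller": 3}


def risk_score(alert):
    """Severity times asset criticality; unknown values get the lowest weight."""
    severity = SEVERITY_WEIGHT.get(alert["severity"], 1)
    criticality = ASSET_CRITICALITY.get(alert["asset_type"], 1)
    return severity * criticality


def prioritize(alerts):
    """Order alerts so the highest-risk ones are handled first."""
    return sorted(alerts, key=risk_score, reverse=True)


queue = prioritize([
    {"id": "a1", "severity": "high", "asset_type": "workstation"},          # score 3
    {"id": "a2", "severity": "medium", "asset_type": "domain_controller"},  # score 6
    {"id": "a3", "severity": "low", "asset_type": "server"},                # score 2
])
```

Here the medium-severity alert on a domain controller is handled before the high-severity alert on a workstation, which is the kind of trade-off a purely severity-based queue would miss.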
Building and Implementing Detection Content with Agility
Developing detection rules is a demanding and error-prone task. If not handled with precision, detection rules can cause false alerts or fail to generate alerts when they should. Rapid changes in adversarial and internal contexts make it difficult to keep up and adapt the existing rule set. Blue teams should streamline the rule development process. They should build the capability to acquire indicators of compromise for new and targeted attacks and use verified detection content when it is available.
Putting Threats in the Centre in Aligning Priorities
A key competence is internalizing new threat intelligence as quickly as possible to understand how ready the SOC is to detect the associated techniques. If this competence becomes a continuous blue-team capability, security analysts at every level are empowered to prioritize alerts and resolve incidents.
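A readiness check of this kind can be sketched as a comparison between the techniques named in a threat report and the techniques the current rule set covers. In the Python sketch below, the technique IDs follow MITRE ATT&CK numbering, while the rule names, the `rule_coverage` mapping and the `detection_readiness` function are hypothetical.

```python
def detection_readiness(threat_techniques, rule_coverage):
    """Return the covered share of techniques and the list of gaps.

    threat_techniques: set of ATT&CK technique IDs from a threat report.
    rule_coverage: technique ID -> list of detection rules (assumed format).
    """
    covered = {t for t in threat_techniques if rule_coverage.get(t)}
    gaps = sorted(set(threat_techniques) - covered)
    ratio = len(covered) / len(threat_techniques) if threat_techniques else 1.0
    return ratio, gaps


# Techniques attributed to a hypothetical campaign.
campaign = {"T1566", "T1059", "T1003", "T1486"}

# Hypothetical inventory of existing rules per technique.
rule_coverage = {
    "T1566": ["rule_phishing_attachment"],
    "T1059": ["rule_encoded_powershell"],
    "T1003": [],  # technique is tracked, but no rule exists yet
}

readiness, gaps = detection_readiness(campaign, rule_coverage)
```

In this example half of the campaign's techniques are covered, and the gap list (`T1003`, `T1486`) tells detection engineers where new rules are needed first.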
Using Automation
Automation helps! SIEM, SOAR and EDR technologies offer a good degree of automation. Eliminating low-severity alerts with automation frees up time and allows security teams to focus on targeted and other high-severity attacks.
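As a simplified illustration of this idea, the sketch below automatically closes low-severity alerts that match a list of signatures already confirmed as benign, and leaves everything else for analysts. It is not the API of any SIEM, SOAR or EDR product; the alert fields, the signature names and the `auto_triage` function are assumptions.

```python
def auto_triage(alerts, benign_signatures):
    """Split alerts into those needing an analyst and those closed automatically.

    Only low-severity alerts with a known-benign signature are auto-closed.
    """
    needs_analyst, auto_closed = [], []
    for alert in alerts:
        if alert["severity"] == "low" and alert["signature"] in benign_signatures:
            auto_closed.append(alert)
        else:
            needs_analyst.append(alert)
    return needs_analyst, auto_closed


# Hypothetical signatures previously confirmed as benign by analysts.
benign = {"scheduled_backup_login", "vuln_scanner_probe"}

incoming = [
    {"id": "a1", "severity": "low", "signature": "vuln_scanner_probe"},
    {"id": "a2", "severity": "high", "signature": "vuln_scanner_probe"},
    {"id": "a3", "severity": "low", "signature": "unknown_binary_executed"},
]
to_review, closed = auto_triage(incoming, benign)
```

Keeping the auto-close condition narrow (low severity and a known-benign signature) is a deliberate choice: a high-severity alert is still routed to an analyst even when its signature is on the benign list.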
Log management is challenging
New perspective to cope with this problem
Alert Management with Attack Simulation
As we discussed, adapting the detection rule base to the changing adversarial context is a difficult task. This difficulty results in detection gaps, false positives, alert noise and alert fatigue.
Challenging SIEM and EDR detection rules with extensive attack simulation on an automated platform addresses some of the key challenges mentioned. The Picus platform offers security insights that combine detection gaps and detection content, empowers red and blue team practices, and makes purple teaming an integrated capability whereby cyber defense teams can improve security posture. Explore how the Picus Security Control Validation Platform helps you:
➔ Fix detection gaps in minutes with vendor-specific detection content and Sigma
➔ Proactively improve EDR alerting capabilities
➔ Proactively improve SIEM alerting capabilities
➔ Hunt the threats that matter the most with speed