Problems and incidents are inextricably linked. One causes the other, and teams have to pay attention to both. For traditional IT teams, the latest ITIL guidelines call for teams to manage both problems and incidents, but to do so separately. Problem management is a practice focused on preventing incidents or reducing their impact. Incident management is focused on addressing incidents in real time.
The benefit of the ITIL approach is that it prioritizes the core goals of both problem management and incident management. By making them separate and equally important practices, the guidelines are presumably attempting to avoid the common problem of IT teams constantly putting out incident fires without dealing with the root cause of those fires. If an incident manager’s primary goal is the quick resolution of incidents and a problem manager’s primary goal is prevention, combining these roles may mean that one of those goals, both of which are vital to an organization, gets deprioritized in favor of the other. The downside to this approach is that separating the two practices, which are so tightly linked in reality, can create knowledge gaps and a breakdown in communication between incident resolution and the root cause analysis that uncovers the underlying cause.
Many teams have nonetheless shifted toward handling the two practices together. This shift comes not only from the fact that the practices are two sides of the same coin (preventing and resolving incidents) but also from a DevOps approach, which typically affirms that there is often more than one root cause of an incident, that postmortems should be blameless and inclusive of any team impacted by an incident, and that collaboration is at the core of continuous improvement. The overlap in problem and incident management may also be connected with the industry-wide shift toward a “you build it, you run it” approach. As the teams who build systems become responsible for resolving incidents within those systems, it’s logical that the same team be responsible for running postmortems, doing the detective work to get to the root cause of an incident, and making recommendations that will prevent or lessen the impact of future incidents.