The Why and How of Critical Risk Management
Critical Risk Management focuses on the prevention of serious injuries and fatalities and, in some cases, can include other unwanted material events (UMEs) relating to significant financial loss or business disruption to an organisation.
In almost all cases, low probability, extreme consequence events are identified through a structured process and entered into the organisation’s Risk Register, where controls are identified. However, these events are often not treated any differently from other risks from a disciplined Risk Management perspective. They generally don’t present an issue, that is, until the event happens. Without an effective and well managed Critical Risk Management system, early warning signs and control weaknesses usually remain unidentified.
When designed correctly, Critical Risk Management programs provide effective Lead Indicators, multi-layered Control Effectiveness tests conducted by employees at all levels of the organisation, and a scheduled auditing and reporting process that focuses solely on the Critical Risks and their identified Critical Controls. Dashboards are set up as a visual representation of how each Critical Risk and Control is performing.
In order to create an effective and efficient Critical Risk Management system, the correct framework must first be established. The framework needs to take into account such things as:
The size of the organisation,
The existing risk management system,
The current safety climate of the organisation,
Existing procedures and controls, and
The composition of the workforce, for example how contractors are utilised and who is the PCBU (person conducting a business or undertaking).
A crucial part of the Critical Risk Management system is designing fit for purpose tools to engage the workforce. These tools give each worker the ability to understand whether the task they are about to complete is a Critical Risk and, if so, help them confirm that the correct Critical Controls are in place before commencing the task. These tools will vary from organisation to organisation in order to meet each organisation’s specific needs.
A Case Study on the Why and How of Effective Critical Risk Management
Generative HSE was engaged to assist a client to embed a Critical Risk Management program at their operation. It was a FIFO operation with an open cut mine and an underground hard rock mine, and work across the asset was carried out by a combination of the organisation’s employees and multiple contracting companies. The organisation had completed some preliminary work on the Critical Risk program, which formed a good foundation for the Generative HSE team to build on.
The client demonstrated risk mindfulness and organisational maturity in recognising the need to embed a Critical Risk Management program as part of the “business as usual” daily running of the operation.
The initial work conducted by the client prior to engagement consisted of:
Identification of the Critical Risks,
Facilitated workshops to complete in-depth bowtie analysis of each identified Critical Risk,
Assigned Critical Controls for each of the Critical Risks,
Assigned Risk Owners for each Critical Risk,
Easy to identify graphics designed for each identified Critical Risk, and
Commenced an awareness campaign to educate the workforce.
Before we progress any further, it is important that we are on the same page, so please take some time to digest the following definitions.
Risk Owners are responsible for the management of their assigned risk or risks. They choose the Control Owners for each Critical Control, approve the bowtie analysis for their risk, decide which controls are the Critical Controls for that risk, and approve the Critical Control Standards. Risk Owners review all tasks that have been conducted by the Task Owners and Control Owners. They propose a control effectiveness rating for each individual Critical Control and an Overall Risk Control Effectiveness rating that takes into account the ratings of all of the controls. These proposed ratings and all of the findings from the verification process are then reported to a review committee, which endorses the ratings.
Control Owners are an essential part of the Critical Risk process. They are responsible for ensuring that all scheduled verifications are completed throughout the reporting cycle, and they work with Risk Owners to detect early warning signs relating to their Critical Control. Control Owners need to be subject matter experts in the field to which the control relates. They analyse the data captured by Task Owners and raise actions where control weaknesses or non-conformances are identified.
Each Critical Control will have Task Owners, who have scheduled tasks that are stipulated in the Control Standard. These tasks are completed either in the field or as a desktop study, as specified in the Control Standard. Task Owners are provided with a checklist that clearly defines what is to be measured and how to measure it. Once completed, these checklists are reviewed by the Control Owners. Task Owners are responsible for raising corrective actions and reporting incidents if any are discovered during the process. There can be one or many Task Owners for each Critical Control; this is decided in consultation between the Risk Owner and the Control Owner.
Critical Control Standards
Critical Control Standards identify the “Who, What, Where, When and How” to measure and monitor the effectiveness of the Critical Control.
Review Committee
A review committee comprising identified members of the leadership team needs to be established. The role of this committee is to provide governance over the entire Critical Risk process. The committee endorses the Control Effectiveness and Overall Risk Control Effectiveness ratings, and it also monitors the progress of actions relating to the Critical Risk Management process.
Critical Risk Observations
Rather than relying on an audit schedule to assess whether controls are in place and effective, Critical Risk Observations are designed and implemented. These observations are specific to each risk and use an easy to follow format based on binary questions. The intent is that they can be completed by any member of the organisation, either on a schedule or ad hoc. Targets can be established for the number of observations to be completed, and performance against this target can be a useful Lead Indicator metric. The value of these observations is that they provide information that can be used proactively to manage risk, rather than discovering gaps during the investigation of an incident that has already occurred.
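As a rough illustration of how such observations might be recorded and turned into a Lead Indicator, the sketch below models an observation as a set of yes/no answers and reports progress against an observation target. Every name in it (Observation, gaps, lead_indicator, the example risk and questions) is hypothetical and chosen only for illustration; a real form would be designed around the organisation's own Critical Risks.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One completed Critical Risk Observation (hypothetical structure)."""
    risk: str                 # the Critical Risk being observed
    observer: str             # any member of the organisation
    answers: dict[str, bool]  # binary question -> True if the answer was "yes"

    def gaps(self) -> list[str]:
        """Return the questions answered 'no', i.e. possible control weaknesses."""
        return [question for question, ok in self.answers.items() if not ok]


def lead_indicator(observations: list[Observation], target: int) -> float:
    """Fraction of the observation target achieved in the period, capped at 1.0."""
    if target <= 0:
        raise ValueError("target must be a positive number of observations")
    return min(len(observations) / target, 1.0)


# Example data is invented purely to show the shape of the record.
completed = [
    Observation("Working at heights", "supervisor",
                {"Is the harness inspected and tagged?": True,
                 "Is the anchor point rated and in use?": False}),
    Observation("Working at heights", "operator",
                {"Is the harness inspected and tagged?": True,
                 "Is the anchor point rated and in use?": True}),
]

print(completed[0].gaps())                  # questions answered "no"
print(lead_indicator(completed, target=4))  # progress against a target of four
```

Keeping every question binary means the "no" answers can be counted and trended per risk, which is what allows gaps to surface before an incident rather than during an investigation.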
Critical Control Plan, Do, Check, Act tool
In our opinion, this tool is the most crucial part of the whole process. Our belief is that if an operator commences a task without knowing that it is a Critical Risk, without knowing the controls and without making sure they are in place, the whole Critical Risk process ceases to exist. If a Critical Control is absent, the whole process fails.
The tool gives workers the ability to determine, by answering a few simple questions:
A. Is the task they are about to commence a Critical Risk? And if so,
B. What are the required Critical Controls for that task?
This tool is not a tick and flick process: it prompts the user to actually write down what the Critical Controls are. It needs to be designed around the specific needs of the organisation so that it sets workers up for success.
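To make the logic of the two questions concrete, here is a minimal sketch of a pre-task check. It assumes the worker records whether the task is a Critical Risk, writes the Critical Controls out as free text, and then confirms they are in place. The class, field and message names are hypothetical, and a real tool would be built around the organisation's own risks and wording.

```python
from dataclasses import dataclass, field


@dataclass
class PreTaskCheck:
    """Hypothetical record of a worker's answers before starting a task."""
    task: str
    is_critical_risk: bool                                      # question A
    written_controls: list[str] = field(default_factory=list)  # question B, free text
    controls_confirmed_in_place: bool = False

    def may_start(self) -> tuple[bool, str]:
        """Return (ok, message); the task should not start unless ok is True."""
        if not self.is_critical_risk:
            return True, "Not a Critical Risk: follow normal task procedures."
        # Not tick and flick: blank entries do not count as written controls.
        named = [c.strip() for c in self.written_controls if c.strip()]
        if not named:
            return False, "Write down the Critical Controls before starting."
        if not self.controls_confirmed_in_place:
            return False, "Confirm every listed Critical Control is in place."
        return True, "Critical Controls identified and in place."


check = PreTaskCheck(task="Change out conveyor roller", is_critical_risk=True)
print(check.may_start())  # blocked until the controls are written down
```

The point of the sketch is the ordering: a task flagged as a Critical Risk cannot pass on a yes/no answer alone, because the controls have to be named before they can be confirmed.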
Generative HSE provided a dedicated consultant to the client on a 4:3 roster for six months. The consultant was integrated into the commercial team and given access to all the resources required to execute the project effectively. Our consultant worked directly with the identified Risk Owners to construct simplified bowties that could be used as working documents, to establish Critical Control Owners and Task Owners, and to refine the Critical Controls, including the development of Critical Control Standards.
Our team also assisted in establishing a review committee, designing the Critical Risk Observations and designing a Critical Control Plan, Do, Check, Act tool for the workforce to complete prior to conducting a task.
Implementing a Critical Risk Management System can be a cause of great angst for the workforce, especially for those who are identified as Risk Owners. Without a framework that is specifically designed for the organisation, people struggle to see how all of the pieces fit together and how they can possibly have oversight of the process, all in addition to existing workloads and demands. During the initial meet and greet with this client, one of the managers said, “We have done all of this work, but it feels like there are a few pieces missing”. He was correct: they were missing a few of the necessary components of the framework.
With a constructive engagement process established, consultants collaborate with Risk Owners to help build all of the aspects relating to their Critical Risk. This leaves the Risk Owner feeling in control of the process and not merely a passenger who has been handed something to take care of.
One Risk Owner we collaborated with owned quite a few risks and understandably felt overwhelmed by the process. Prior to our engagement with the organisation, he could not see how he could incorporate all of this very important additional work into an already packed workload. He was rightfully sceptical that the Critical Risk Management System could be successfully implemented. By the end of the process he was not only confident with the system, he was an advocate of it. He showed tremendous leadership of the process and led his department as early adopters. He worked with his supervisors to help them understand the process and set his personnel targets to achieve for Critical Risk Observations.
Critical Risk is not a hard thing to sell to the workforce, because everyone wants to work together to prevent fatalities in their workplace. What can be difficult is getting people to understand the requirement to implement and maintain the required framework. There is often scepticism, and these are some of the points a few of the organisation’s personnel raised:
There is nothing wrong with how we are doing it now; we have been doing it for years and we have not had a fatality,
We are doing fine; our lag indicators have been trending down for a long time,
Why do we need to measure that? We know that we do it,
That’s how we do it around here and it works.
I have worked in an organisation where for years there was not a fatality; however, there were many early warning signs. There had been many significant incidents that resulted in thorough investigations and ICAMs. One common thread throughout this time was that there was no change to the Risk Management System. Layers of controls were piled on top of each other, but nothing made one particular control more relevant or important than another. Operations continued, incidents continued to occur, and then the day came when there was a fatality. This was a very sobering moment for the organisation. There was a significant change relating only to that specific task, but no substantive change to the risk management system. Two years later there was another fatality, and this was the catalyst for the implementation of Critical Risk at this organisation. It serves as a reminder that just because an extreme consequence has not transpired in the past, that cannot be credibly relied on as assurance that it will not occur in the future.
You get what you measure
If organisations fixate on measures such as the Total Recordable Injury Frequency Rate (TRIFR), they will generally see a change in the result. This happens because expectations and accountability have been set, which makes people feel compelled to make a change. When organisations fixate on lag indicators like TRIFR, they can take their eye off the ball for the low probability, extreme consequence events. When we see good results in reductions of lag indicators, we must ask ourselves two key questions to establish whether they are a result of good management or a result of good luck. The first question is “How has this change occurred?” If there is no evidence of how the change has occurred, there is cause for concern. The second question is “How will the reduction be sustained?” If there is no plan for how the change will be capitalised on, or at least sustained, again there is cause for concern. We know of an operation that had a TRIFR of one for over three years, and then a fatality tragically happened. During the investigation there was evidence that the main contributory cause had been occurring for some time. That leads us to believe that, in that particular case, it was just luck that the fatality had not occurred sooner.
You get what you inspect, not what you expect
When we set out to formalise the checks that are currently conducted, there is quite often resistance, as it is seen as double handling - why check that the check is occurring? If something has been recognised as a Critical Control, it is vital that we regularly monitor its Control Effectiveness. We have seen in many cases that once we actually measure how we do things against how procedures tell us to do things, there are significant gaps. When Task Owners have completed verifications of how the metrics are performing, we often see that either the measurements are not occurring as often as they should, or the data they produce is not being analysed correctly or not analysed at all.
Another thing we find is that when checks are in place, they have not been formally established or documented. By this we mean it is not clearly defined who is responsible or how often the checks are meant to occur. They do happen, but if certain people suddenly left the organisation, the whole process would most likely fall over quickly.
The success of the Critical Risk Management system at this operation was largely due to the Risk Owners being active participants in the process rather than mere passengers, a very supportive leadership team, and a great working synergy between the Generative HSE team and the client.
Both sides worked together and understood the requirement to make the system one that would meet the intent (prevention of fatalities), be achievable, be measurable and be grounded in how the operation actually works.