Out-Law News
14 Sep 2023, 9:27 am
A recent report commissioned by the Bank of England has found that fail-safes designed to ensure safety and reduce systemic risks for the financial system are essential as the financial sector increases its use of artificial intelligence (AI).
Tests carried out by the researchers showed that different models of deep learning (DL), a subset of AI used in automated decision-making in the financial sector, can produce similar predictions but provide different explanations. This means that different models, when given the same set of data, can show different reasoning for their predictions or decisions despite displaying very similar results overall. The explanations can differ substantially as a result of small or even arbitrary changes to the models.
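As a rough illustration of the effect described, and not drawn from the paper itself, the short Python sketch below trains two neural networks on synthetic data that differ only in their random seed: their predictions agree almost everywhere, yet the feature importances used to "explain" them can rank the inputs differently. The dataset, the architecture and the use of permutation importance are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in for a credit-decision dataset; entirely illustrative.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=4,
                           random_state=42)

# Two networks identical in every respect except the random seed.
models = [MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=1000,
                        random_state=seed).fit(X, y) for seed in (0, 1)]

# The predictions are near-identical...
agreement = np.mean(models[0].predict(X) == models[1].predict(X))
print(f"prediction agreement: {agreement:.1%}")

# ...but the "explanations" (here, permutation importances) can rank the
# input features differently, which is the fragility the paper describes.
for i, clf in enumerate(models):
    imp = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
    print(f"model {i} top features: {np.argsort(imp.importances_mean)[::-1][:3]}")
```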
The research warned that getting variable explanations for the same automated decisions made by deep learning models could damage consumer trust and create moral hazard in financial institutions. In the Bank of England’s staff working paper, the researchers explained that “the presence of different explanations presents a morally hazardous scenario where model developers could choose deep learning models with explanations that might be appealing to their risk and governance. If the governance mechanisms are unaware of DL model fragility they might underweight the risk of adverse outcomes under different scenarios.”
The research also highlighted that a model with variable explanations could be hard to debug, because there is no certainty that efforts to rectify flaws will in fact lead to desired changes.
Model developers, therefore, are urged to conduct fragility testing to understand their models’ robustness in terms of explanations and predictions. A risk mitigation approach recommended by the paper is to develop a wide array of competing models with a view to adopting only those that display acceptably low variability.
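A hedged sketch of what such fragility testing could look like in practice is set out below: each candidate architecture is retrained under several seeds, the stability of its explanations across retrainings is scored, and only candidates above an illustrative threshold are adopted. The candidate architectures, the stability measure and the threshold are assumptions for the example, not the paper’s methodology.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2000, n_features=10, n_informative=4,
                           random_state=42)

# Competing candidate architectures; names and threshold are illustrative.
candidates = {"shallow_net": (16,), "deep_net": (32, 32, 32)}
STABILITY_THRESHOLD = 0.8

for name, layers in candidates.items():
    # Retrain the same architecture under several arbitrary seeds.
    importances = []
    for seed in range(3):
        clf = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000,
                            random_state=seed).fit(X, y)
        imp = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
        importances.append(imp.importances_mean)

    # Average pairwise rank correlation of explanations across retrainings:
    # low values signal fragile explanations.
    corrs = [spearmanr(importances[i], importances[j])[0]
             for i in range(3) for j in range(i + 1, 3)]
    verdict = "adopt" if np.mean(corrs) >= STABILITY_THRESHOLD else "reject"
    print(f"{name}: explanation stability {np.mean(corrs):.2f} -> {verdict}")
```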
More importantly, the research pointed out that the prediction sets generated by different DL models become more concentrated, or consistent, during crises such as the 2008 financial crisis. This creates “concentration risk”, which could become one of the main causes of major losses in a credit institution. “Measures of concentration become significantly stronger in times of distress. This could prove dangerous for the financial system which is increasingly becoming automated and relying more and more on complex machine learning models such as neural nets,” said the researchers.
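The toy sketch below, which uses entirely simulated numbers rather than any data or results from the paper, shows one simple way such concentration could be measured: the inverse of the dispersion of forecasts across competing models, which rises sharply when the models herd.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical default-probability forecasts from 20 competing models over
# 250 trading days, simulated for a calm and a stressed regime.
calm = rng.normal(loc=0.05, scale=0.030, size=(250, 20)).clip(0, 1)
stressed = rng.normal(loc=0.40, scale=0.005, size=(250, 20)).clip(0, 1)

def concentration(predictions):
    """Inverse of average cross-model dispersion: higher means more herded."""
    return 1.0 / predictions.std(axis=1).mean()

print(f"calm regime concentration:     {concentration(calm):.1f}")
print(f"stressed regime concentration: {concentration(stressed):.1f}")
```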
Developers of DL models are therefore urged to invest more time in understanding how their models would behave during turbulent times and to include fail-safes that stop the models under certain untested conditions. Having human-in-the-loop settings, where human experts are actively involved to provide oversight, training and intervention throughout the decision-making process, is another recommendation to ensure safety.
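One possible shape for such a fail-safe, sketched here as an assumption rather than a design prescribed by the paper, is a wrapper that refuses to act autonomously when an input falls outside the range of conditions the model was trained and tested on, and instead escalates the case to a human reviewer.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class GuardedModel:
    """Wraps a model with a simple fail-safe: escalate rather than decide
    when an input looks unlike anything seen in training."""

    def __init__(self, model, X_train, z_limit=4.0):
        self.model = model
        self.mean = X_train.mean(axis=0)
        self.std = X_train.std(axis=0) + 1e-9
        self.z_limit = z_limit  # illustrative "untested conditions" cut-off

    def decide(self, x):
        # Crude out-of-distribution check on each feature.
        z = np.abs((x - self.mean) / self.std)
        if z.max() > self.z_limit:
            return "ESCALATE_TO_HUMAN"  # human in the loop takes over
        return self.model.predict(x.reshape(1, -1))[0]

# Illustrative usage with a toy classifier.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 5))
y_train = (X_train[:, 0] > 0).astype(int)
guarded = GuardedModel(LogisticRegression().fit(X_train, y_train), X_train)
print(guarded.decide(X_train[0]))        # familiar input: automated decision
print(guarded.decide(np.full(5, 50.0)))  # extreme input: escalated to a human
```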
This is especially crucial in mitigating flash crash events, according to the working paper, as “it is hard to monitor and prepare for DL model misbehaviour in different scenarios”.
“This research highlights that many organisations will still have work to do in developing their strategies on the explainability of AI systems and that a one-size-fits-all approach may not be appropriate for all use cases,” said financial services and technology expert Luke Scanlon of Pinsent Masons. “Where explanations vary, it is fundamental that other safeguards be put in place, particularly human oversight review mechanisms and measures for considering the robustness of choices made in using a particular model. Sometimes there will be trade-offs between the usefulness of decisions that cannot be consistently explained and other objectives which can be justified – it is important however that those justifications are documented.”
“It is interesting to see the need for fail-safes highlighted in this research, which aligns with the approaches taken internationally, including that of the EU, with the fail-safe concept as it stands remaining in the final drafts of the EU AI Act,” he said.
At the regulation and policy level, the research acknowledged that “existing financial sector model governance and monitoring regimes, built in an earlier era of data analytic technology, are likely to fall short in addressing the new risks posed by deep learning”. The paper made several recommendations for financial regulators when engaging with financial institutions to ensure stability and safety in the age of AI.
It suggests that regulators could ask whether financial institutions have included the fragility of deep learning models in their internal governance and risk management processes. Regulators could also pay more attention to whether the institution in question had tried to use rule-based methods to build simpler, more stable models when developing its tools.
Lastly, the paper recommended that regulators create tools and technologies to supervise AI systems. Network analysis is one approach to monitoring autonomous agents and finding vulnerabilities, as regulators stand at “a unique point” where they can gather all the data and look for vulnerabilities.
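By way of a hedged illustration only, with hypothetical institution and vendor names, the sketch below uses the networkx library to flag the most connected nodes in a map of shared model and data dependencies, a simple first pass at the kind of vulnerability screening described.

```python
import networkx as nx

# Entirely hypothetical map of which institutions depend on which shared
# models or data feeds, of the kind a regulator could assemble.
G = nx.Graph()
G.add_edges_from([
    ("Bank A", "Model Vendor X"), ("Bank B", "Model Vendor X"),
    ("Bank C", "Model Vendor X"), ("Bank C", "Data Feed Y"),
    ("Bank D", "Data Feed Y"),
])

# Centrality flags nodes whose failure or misbehaviour would touch many
# institutions at once: a simple first-pass vulnerability screen.
for node, score in sorted(nx.degree_centrality(G).items(),
                          key=lambda kv: -kv[1])[:3]:
    print(f"{node}: degree centrality {score:.2f}")
```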