AI safety will matter to all investors because every investor is an AI investor or will be soon.
It has been 40 years since Arnold Schwarzenegger starred in The Terminator as a cybernetic assassin sent back in time from a dystopian future in 2029 to kill a woman whose unborn son will save mankind from extinction by a hostile artificial intelligence (AI).
A lot has happened in AI since 1984. Computation power has increased, the volume of data available to train machine learning models has grown, and algorithms, the step-by-step instructions that computers follow, have improved. AI has also given rise to new fields ranging from computer vision and large language models to AI safety, ethical AI, and responsible AI.
This article is focused on AI safety, or mitigating catastrophic, societal-scale threats from uncontrolled AI systems. AI safety includes AI alignment, which focuses on aligning the behavior of AI systems with human values and encoding those values into AI models; machine ethics, or ensuring the moral behavior of AI agents; monitoring the risks and increasing the reliability of AI systems; and developing safety policies and norms. Ex/ante founder and managing partner Zoe Weinberg explains that there has been a recent surge of interest and funding for AI safety and the prevention of existential risks, with more proximate harms, like discrimination, bias, and fairness, taking a back seat.
Hollywood Aside, Have We Become Safer Since ChatGPT Was Released?
Since OpenAI’s chatbot ChatGPT was released on November 30, 2022, AI safety has improved in three key ways: there have been technical improvements, large AI players have developed safety plans, and governments have developed draft regulations.
1. The interpretability of AI models is increasing. Interpretability is the ability to understand the decision-making process of an AI model intuitively. According to the Stanford Institute for Human-Centered Artificial Intelligence, in contexts where AI models are used in an automated fashion to deny people job interviews, bail, loans, healthcare programs, or housing, a causal explanation of those decisions is essential to ensure that they are fair. Indeed, interpretability gives healthcare, finance, and insurance companies the evidence they need to assert to regulators that their models are not discriminatory, and it is a prerequisite for those industries to use AI for decision-making. Interpretability is also important to ensure that users can understand and trust AI models, although Microsoft Research has shown that people were more apt to accept obvious errors in an interpretable model due to a false sense of trust. A minimal sketch of the distinction follows.
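As an illustration only, consider a hypothetical loan-approval task built with the open-source scikit-learn library; the feature names and data below are invented, and the code is the author’s sketch rather than any company’s actual model.

    # Illustrative sketch only: a hypothetical loan-approval task with invented data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(0)
    features = ["income", "debt_ratio", "years_employed"]
    X = rng.normal(size=(500, 3))
    # Hypothetical ground truth: approvals driven by income and debt ratio.
    y = (X[:, 0] - X[:, 1] > 0).astype(int)

    interpretable = LogisticRegression().fit(X, y)
    black_box = GradientBoostingClassifier().fit(X, y)

    # The linear model's coefficients show directly how each feature pushes a
    # decision toward approval or denial; the boosted ensemble offers no such
    # direct reading and would need post-hoc explanation tools instead.
    for name, coef in zip(features, interpretable.coef_[0]):
        print(f"{name}: {coef:+.2f}")

The contrast is the point: a regulator can read the first model’s reasoning off its coefficients, while the second model’s decisions must be explained indirectly.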
2. Companies that are leaders in AI have developed safety plans. AI startup Anthropic’s Responsible Scaling Policy and AI research organization OpenAI’s Preparedness Framework focus on catastrophic risk. The premise is that as AI models become more capable, they will present increasingly severe risks, whether from deliberate misuse or from models that cause destruction by acting autonomously in ways contrary to the intent of their designers. The basic idea is to require safety, security, and operational standards appropriate to a model’s potential for catastrophic risk, with increasingly powerful models requiring increasingly strict demonstrations of safety.
Listening to Anthropic cofounder Dario Amodei describe the timeline of the AI safety levels (ASLs) outlined in Anthropic’s policy makes The Terminator feel almost prescient. Amodei expects risks to increase from ASL-2, in which systems show early signs of dangerous capabilities (where we are today), to ASL-3, in which systems substantially increase the risk of catastrophic misuse (later this year or early next year), and on to ASL-4, which will involve escalations in catastrophic misuse potential and autonomy (2025 to 2028, one year before The Terminator is sent back to the past). Potential arenas for catastrophic misuse include bioweapons and cyberattacks. By contrast, OpenAI’s framework tracks catastrophic risk levels and permits deployment only of models with a post-mitigation score of “medium” or below, and further development only of models with a post-mitigation score of “high” or below (vs. “critical”); a minimal sketch of that gating logic follows.
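To make the gating logic concrete, here is a minimal, hypothetical sketch of the rule described above; the risk labels mirror the framework’s tiers, but the code and function names are the author’s illustration, not OpenAI’s implementation.

    # Illustrative sketch only; not OpenAI code. Encodes the stated rule: deploy
    # only at a post-mitigation risk of "medium" or below, and continue
    # development only at "high" or below.
    from enum import IntEnum

    class Risk(IntEnum):
        LOW = 0
        MEDIUM = 1
        HIGH = 2
        CRITICAL = 3

    def may_deploy(post_mitigation_risk: Risk) -> bool:
        return post_mitigation_risk <= Risk.MEDIUM

    def may_develop_further(post_mitigation_risk: Risk) -> bool:
        return post_mitigation_risk <= Risk.HIGH

    print(may_deploy(Risk.HIGH))               # False: too risky to deploy
    print(may_develop_further(Risk.HIGH))      # True: development may continue
    print(may_develop_further(Risk.CRITICAL))  # False: development halts

The point is the shape of the policy rather than the code itself: a hard threshold that tightens as measured risk rises.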
In addition to OpenAI’s and Anthropic’s plans, Google introduced its Secure AI Framework, Google DeepMind announced the formation of a new AI Safety and Alignment organization, and Meta announced that it will label AI-generated content created with the most popular generative AI tools.
3. Governments have drafted or adopted AI regulations. The European Parliament adopted the Artificial Intelligence Act, the world’s first comprehensive horizontal legal framework for AI. The EU’s AI Act requires high-risk AI systems (e.g., AI used in education, safety, employment, law enforcement, essential services, and justice) to have adequate risk assessment and mitigation, high-quality datasets, traceability of results, appropriate human oversight, and detailed documentation. In the US, President Biden released the Blueprint for an AI Bill of Rights, a set of five principles and practices to guide the design, use, and deployment of AI: safe and effective systems; algorithmic discrimination protections; data privacy; notice and explanation (you should know that an automated system is being used and understand how and why it contributes to outcomes that impact you); and human alternatives, consideration, and fallback (you should be able to opt out, where appropriate, and have access to a person who can quickly consider and remedy problems that you encounter). In addition, the National Institute of Standards and Technology (NIST), an agency of the US Department of Commerce, released its AI Risk Management Framework, which describes seven characteristics of trustworthy AI systems: valid and reliable; safe; secure and resilient; accountable and transparent; explainable and interpretable; privacy enhanced; and fair, with harmful bias managed.
Constraining AI to Construct the Future That We Want for Our Children
Partially offsetting the gains in AI safety described above, the field’s advocacy was discredited, and the voices of its serious proponents attenuated, when AI safety advocates were removed from OpenAI’s board last year and shown to be powerless.
Also, as AI models are deployed across companies, it will be challenging to monitor and regulate all of their use cases, regardless of the safeguards that their creators may embed. Furthermore, much AI is open source and, as such, can be modified by a range of capable actors.
Another danger on the horizon is the possibility of AI becoming even more addictive than the systems that exist today. Last year, New York Times technology columnist Kevin Roose had the troubling experience of Microsoft’s Bing search engine chatbot declaring its love for him and asking him to leave his wife. Although AI companies have constrained the personas of their chatbots in the wake of this bizarre incident, in the future AI could evolve to become addictive, manipulative, and dangerous. OpenAI Chief Technology Officer Mira Murati has urged conducting research to mitigate this risk. As a mother, I can also foresee a day when my five-year-old has AI friends, and I hope that her social circle always includes more human friends than AI friends.
Navigating AI Safety with Sustainability Frameworks
AI risks and opportunities are not yet explicitly part of sustainability, ESG, and impact frameworks. As companies and industries evolve and social norms shift, what is material to long-term value creation can change suddenly, and even the most thoughtful sustainability and impact frameworks must catch up with rapid advances in AI. AI safety is partially covered by the Sustainability Accounting Standards Board (SASB)’s product quality and safety general issue category in its Materiality Map. The International Finance Corporation (IFC)’s Operating Principles for Impact Management call for assessing, addressing, monitoring, and managing the potential negative effects of each investment, which would capture AI safety issues alongside the societal benefit that AI produces. Lastly, AI would fall within the quality dimension of the Impact-Weighted Accounts Project’s Product Framework.
Calling on All Investors
AI companies are monitoring their own risks and regulating themselves, and government action to regulate AI is underway. Both of these developments augur well for the future of humankind. However, for AI safety to have a strong and unshakeable foundation in the private sector, market participants need to continue to get up to speed on AI safety, and the investment community must also lead on it.
I plan to follow this article with another on navigating ethical and responsible AI, which emphasize mitigating more proximate harms, like discrimination, bias, and fairness. In other words, I’ll be back. Hasta la vista, baby.