What insights are needed for the AI that we want?

--

Kai Sandbrink

October 26, 2023

Over the past year, the world has witnessed calls for AI regulation from an increasingly prominent list of players: Tech figures like Steve Wozniak and Elon Musk have signed open letters calling for the regulation of their own industry; Yoshua Bengio and other AI researchers have testified before the US Senate; on November 1 and 2, the UK held the first global summit on AI safety; and the White House put into place the first executive order on safe and responsible AI. With the arrival of ChatGPT, safety concerns related to AI have finally gone mainstream.

Traditional arguments for and against AI regulation focus on a cost-benefit analysis, weighing the risks associated with the widespread development and deployment of AI against its potential benefits. On both sides, the arguments are numerous: The risks include existential threats from misuse in biosecurity and cybersecurity, destabilizing effects on democracy from misinformation, and ethical concerns arising from unfairness and bias in deployed systems. The potential benefits, meanwhile, include greater productivity and accessibility in critical industries such as healthcare, and the possibility of scientific discoveries that would otherwise be out of reach, which could help in the fight against climate change or the search for a cure for cancer.

This debate assumes a static view of “AI,” in which there is a set version of AI waiting to be developed and “discovered.” However, the AI that we build is dynamic: it is a function not only of what is technically possible but also of the social structures and incentives that we put in place. We need to construct structures that incentivize the safe development of AI and that take into account its implications not only for business but also for social systems and international security.

From a technical perspective, the implementation of safe and ethical AI requires progress in two primary areas: (1) better insight into the uncertainty bounds and technological limits of AI systems, and (2) understanding how an AI system computes a given response. Solving these two problems will not by itself be enough, but it is a necessary foundation for the safe and responsible development and deployment of AI.

Most AI systems today function as black-box models that output predictions based on a given input. They are frequently implemented as artificial neural networks: systems composed of a multitude of individual nodes that together implement a function that is not specified by the programmer but instead learned from millions of training examples. This decentralization gives them great computational flexibility, and given enough training, neural networks of sufficient size are theoretically capable of approximating any function to arbitrary precision. However, these same systems also have a track record of being confidently wrong, particularly in the face of changes in the environment, unfamiliar inputs, or direct attacks. Ideally, AI systems deployed in the real world would be robust to perturbations in their inputs, yet in the noisy and chaotic world we live in, it is impossible to cover every situation during training. Instead, for systems to be deployed safely in the real world, an AI system must be able to report when it cannot produce a response with adequate confidence, so that it can be taken offline or defer to additional guidance when needed.
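As a rough illustration of what reporting low confidence could look like, the sketch below is hypothetical code that uses the maximum softmax probability as a stand-in for confidence (only one of many possible measures, and one known to be imperfect): the classifier abstains and defers rather than answering whenever that confidence falls below a chosen threshold.

```python
import numpy as np

def predict_or_abstain(logits: np.ndarray, threshold: float = 0.9):
    """Return (predicted class, confidence), or (None, confidence) to abstain.

    A minimal sketch: confidence is the maximum softmax probability, and the
    model "abstains" (defers to a human or a fallback system) whenever that
    confidence falls below the chosen threshold.
    """
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    probs = exp / exp.sum()
    confidence = float(probs.max())
    if confidence < threshold:
        return None, confidence           # abstain: escalate for guidance
    return int(probs.argmax()), confidence

# Example: a hesitant prediction is flagged rather than silently returned.
label, conf = predict_or_abstain(np.array([1.2, 1.0, 0.9]), threshold=0.9)
print(label, round(conf, 2))              # None 0.39 -> defer rather than guess
```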

Similarly, understanding how an AI system computes a given response would allow us to address numerous concerns about safety and ethics. The distributed nature of computation in neural networks makes it impossible to determine which factors drive a network's decisions by examining its structure directly, which means that inconsistencies can usually only be detected by post-hoc statistical analysis. Famously, Amazon had to take an AI-based application-screening system offline after it was found to discriminate against candidates on the basis of gender, and the company was unable to find a way to fix the problem. Bias, unequal access, and inconsistent data quality are key sources of risk for AI systems in sensitive fields such as healthcare and criminal justice.
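To make the idea of post-hoc statistical analysis concrete, the sketch below compares selection rates across groups in a model's decision log, a simple disparity check of the kind that can surface problems like the one in Amazon's system. The data, group labels, and the 0.8 threshold (a common rule of thumb for flagging disparate impact) are illustrative assumptions, not details from that case.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Compute the fraction of positive decisions per group from an audit log.

    `decisions` is an iterable of (group, accepted) pairs, e.g. logged outputs
    of a screening model. Comparing per-group selection rates is one simple
    post-hoc check for disparate impact.
    """
    totals, positives = defaultdict(int), defaultdict(int)
    for group, accepted in decisions:
        totals[group] += 1
        positives[group] += int(accepted)
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical audit log: the model accepts group A far more often than group B.
log = [("A", True)] * 70 + [("A", False)] * 30 + \
      [("B", True)] * 40 + [("B", False)] * 60
rates = selection_rates(log)
print(rates)                                      # {'A': 0.7, 'B': 0.4}
print(min(rates.values()) / max(rates.values()))  # ~0.57, below a 0.8 rule of thumb
```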

Policymakers and regulators should therefore craft measures requiring progress in these two areas, and users should demand it of the companies they patronize. In both domains, progress is difficult but possible: Red-teaming, in which developers stress-test AI systems by trying to get them to produce dangerous or inconsistent output, offers insight into the domains where AI systems struggle to give robust answers and where additional safeguards are needed. Adversarial training reduces the susceptibility of neural networks to certain forms of attack. Models of superposition, such as those developed by Anthropic, a company founded by former OpenAI researchers, provide mechanistic insight into how data is represented in distributed artificial neural systems. These requirements need not all be imposed at once: rather, requirements for insight into robustness and uncertainty could scale with the capabilities of AI systems and the amount of responsibility they are entrusted with.
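As an illustration of one such technique, the sketch below shows a single step of simple adversarial training using the fast gradient sign method in PyTorch. This is a generic, minimal example of the approach under those assumptions, not the specific method used by any company or researcher mentioned above.

```python
import torch
import torch.nn.functional as F

def fgsm_adversarial_step(model, x, y, optimizer, epsilon=0.03):
    """One training step of simple FGSM adversarial training (a sketch).

    The input is perturbed in the direction that most increases the loss
    (the fast gradient sign method), and the model is then trained on the
    perturbed input, making it less sensitive to small, targeted changes.
    """
    # Build the adversarial example by following the sign of the input gradient.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()

    # Train the model on the perturbed input.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```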

With these two foundational steps in place, we may finally be in a position to make AI systems fair, well-aligned, and robust. Understanding which factors drive an AI system's decision-making, for instance, is the first step toward constraining and directing its reasoning to include (or exclude) the factors we consider important. Beyond the myriad other potential benefits, as a neuroscientist working on how insights from AI can be leveraged to increase our understanding of the brain, I am particularly excited about what fundamental advances in these areas could teach us about cognition and learning in the face of uncertainty.

Historical precedent from industries such as automotive manufacturing, aviation, medicine, and consumer safety has shown that regulation can make dangerous technologies safe and put them to work for our benefit. Over the coming years, we will have the chance to shape how a generation-defining technology is developed and implemented. We should use this chance wisely.

--
