AI Governance & Global Policy — History of AI

AI Governance & Global Policy

How Nations, Institutions, and Industries Are Building --- and Fighting Over --- the Rules for Artificial Intelligence

Introduction: The Governance Race That Cannot Be Won Alone

There is a race underway in AI that receives far less attention than the race for raw capability, yet may ultimately determine more about how the technology affects humanity. It is the race between the pace at which AI systems grow more capable and the pace at which the institutions, laws, norms, and agreements that govern powerful technologies can be built, adapted, and enforced. Every previous transformative technology --- nuclear weapons, pharmaceuticals, aviation, financial derivatives --- eventually acquired a governance architecture: a combination of domestic regulation, international agreement, industry standards, and technical safety research that shaped how the technology was developed, deployed, and controlled. For each of these technologies, the governance architecture arrived after the capability, sometimes decades after, and the harms that accumulated in the gap between capability and governance were real and in some cases irreversible.

AI’s governance race is proceeding in a context that makes the challenge more acute than for most previous technologies. The development is global, with significant capability concentrated in the United States and China and growing capacity in the European Union, United Kingdom, Canada, India, Israel, and South Korea, but with deployment occurring everywhere and consequences distributed across all jurisdictions. The technology is general-purpose, applicable to every domain of human activity from healthcare to warfare to education to financial markets, which means that no single regulatory agency or sectoral framework can govern it comprehensively. And the capability is advancing rapidly, with meaningful improvements in AI performance occurring on timescales of months rather than years, making regulatory frameworks designed for a specific capability level outdated before they are fully implemented.

“The question for AI governance is not whether the technology can be governed --- every powerful technology has been, eventually. The question is how much harm accumulates in the gap between capability and governance, and whether the societies in which AI develops will choose to close that gap before or after the consequences make the choice for them.”

This episode builds on the regulatory survey of Episode 17 to examine the governance landscape at its full geopolitical scale: the national strategies through which the major AI-developing nations are simultaneously promoting AI capability and attempting to govern its consequences; the institutions of international cooperation and the formidable structural obstacles they face; the technical AI safety research programs that represent a different approach to governance --- building safety into the technology rather than regulating the technology from outside; and the fundamental question of what combination of approaches gives the most reason for confidence that AI’s development will be beneficial. Throughout, it maintains the distinction between the governance work that has been done --- substantial, documented, and underappreciated --- and the work that remains, which is larger still.

Section 1: National Strategies --- Three Models, Three Visions

The major AI-developing nations have approached AI governance with strategies that reflect their different institutional traditions, different relationships between state and market, different conceptions of the public interest that governance should protect, and different assessments of where AI’s most significant risks and most significant opportunities lie. The US, EU, and China represent three distinct governance models --- not simply three points on a spectrum from permissive to restrictive, but three genuinely different approaches to the relationship between AI development, public safety, and state authority --- and understanding each requires attending to the specific choices it reflects.

The United States: Innovation First, Governance Catching Up

The United States entered the AI governance era with a decades-long tradition of light-touch technology regulation, a dominant private-sector AI industry with strong political influence, and a federal structure that distributed regulatory authority across dozens of agencies without providing any single body with comprehensive jurisdiction over AI. The result was a governance approach characterized by voluntary frameworks, executive action, and sector-specific regulatory guidance rather than comprehensive legislation --- an approach that reflected both genuine philosophical commitments to innovation-friendly regulation and the practical difficulty of passing legislation through a divided Congress on a rapidly moving technical subject.

The Biden administration’s October 2023 Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence was the most consequential federal AI governance action in US history at its time of issuance, directing seventeen federal agencies to develop AI-specific guidance, establishing dual-use foundation model reporting requirements, creating the AI Safety Institute within NIST, and mandating federal agency inventories of AI systems used in consequential decisions. The Order’s scope was extensive; its enforcement mechanism was the authority of executive agencies over their own operations and over industries they regulated, which was substantial in aggregate but stopped well short of the comprehensive statutory authority that legislation would have provided.

The NIST AI Risk Management Framework, published in January 2023, became the de facto reference standard for AI governance practices among US federal agencies and a widely adopted voluntary framework for private organizations. Its four-function structure --- Govern, Map, Measure, Manage --- provided a practical vocabulary and process framework for organizational AI risk management that translated academic AI safety concepts into operational guidance. By 2024, the framework had been adopted or referenced in AI governance programs at hundreds of major organizations and had influenced AI governance standards being developed in allied nations.

The Trump administration’s approach, following the January 2025 transition, moved swiftly to revoke several Biden-era AI executive orders and reorient federal AI policy toward removing regulatory barriers and accelerating American AI development as a strategic national priority. The AI Safety Institute’s future within NIST became uncertain; the emphasis shifted from safety and equity concerns toward competitiveness and national security framing. This transition illustrated a structural feature of US AI governance that distinguished it from the EU’s legislative approach: executive orders and agency guidance, however comprehensive, were reversible with each administration change, creating governance instability that legislative frameworks, once enacted, were substantially more resistant to.

The European Union: The Brussels Effect and the Limits of Precaution

The EU AI Act, described in detail in Episode 17, represented the world’s most comprehensive legislative attempt to govern AI through binding law, and its significance extended well beyond the EU’s borders. The “Brussels Effect” --- the tendency of EU regulations to become de facto global standards because multinational companies find it more efficient to build products and services to the most demanding regulatory standard than to maintain separate versions for different markets --- had previously elevated EU standards in areas including data protection (GDPR), chemical safety (REACH), and food labeling to near-global influence. The AI Act’s proponents hoped it would do the same for AI governance, establishing EU risk classification, conformity assessment, and transparency requirements as the baseline that global AI developers would need to meet.

The AI Act’s implementation timeline --- with prohibited practices taking effect six months after entry into force (February 2025), general-purpose AI model provisions taking effect twelve months later (August 2025), and high-risk application requirements taking effect twenty-four months later (August 2026) --- was designed to give developers time to adapt while establishing clear deadlines for compliance. The European AI Office, established within the European Commission to oversee the Act’s implementation, had authority to conduct evaluations of general-purpose AI models with systemic risk, impose fines up to 3 percent of global annual revenue for violations of most provisions and up to 7 percent for prohibited practice violations, and coordinate with national market surveillance authorities on high-risk application compliance.

The AI Act’s critics, including voices from within the EU’s own technology industry and from academic AI researchers, argued that its compliance costs would disadvantage European AI companies relative to US and Chinese competitors, that its risk classifications were insufficiently calibrated to actual harm potential, and that its general-purpose AI provisions had been drafted without adequate technical understanding of how foundation models worked. The concern about European AI competitiveness was genuine: as of 2024, no European company was among the leading developers of frontier AI models, and the question of whether the AI Act would help or hinder the development of a competitive European AI industry was actively contested. The alternative view --- that regulatory clarity and public trust, enabled by the Act’s requirements, would create better conditions for responsible AI deployment in Europe even if they slowed the pace of frontier model development --- was equally genuine, and the evidence needed to settle the dispute would only emerge over years.

China: Strategic Deployment and Targeted Control

China’s AI governance approach combined the most aggressive national commitment to AI development as a strategic priority of any major government with a series of targeted, application-specific regulations designed to address the specific risks that the Chinese government identified as most urgent. The 2017 New Generation Artificial Intelligence Development Plan established the objective of making China the world leader in AI by 2030, with intermediate targets for 2020 and 2025 and government investment commitments in the hundreds of billions of yuan. The governance framework that developed alongside this ambition was not designed to constrain AI development broadly but to channel it in directions consistent with national strategic priorities while preventing specific applications that threatened political stability or created social risks the government was not prepared to accept.

The series of targeted AI regulations that China enacted between 2022 and 2024 --- covering algorithmic recommendations, deep synthesis technology, and generative AI services, as described in Episode 17 --- reflected a governance philosophy organized around specific application risks rather than general principles. The generative AI regulation’s requirement that AI-generated content not undermine state authority or social stability had no parallel in Western regulation and reflected the Chinese government’s inclusion of political stability among the values that AI governance was intended to protect. The security assessment requirement for generative AI services before commercial launch created a pre-market review process more stringent than anything in the EU AI Act’s high-risk application provisions for most categories of AI.

China’s approach to AI governance in the international arena was equally distinctive. Chinese delegations at the OECD, UNESCO, and UN participated in developing international AI governance frameworks while maintaining positions that differed from Western consensus on questions including the role of state authority in AI governance, the treatment of AI-enabled surveillance applications, and the framing of AI safety risks. China’s participation in the 2023 Bletchley Summit, where it signed the Bletchley Declaration alongside the US, UK, EU, and 24 other nations, demonstrated willingness to engage with international AI safety cooperation on narrow frontier model safety questions while maintaining independent positions on the broader governance landscape.

Other Nations: Strategies in the Slipstream

The nations outside the US-EU-China triad faced a distinctive governance challenge: developing national AI strategies and regulatory frameworks while the global standards and norms were being set by the major AI-developing powers, without the technical capability or market influence to shape those standards significantly. India, with the world’s largest population and a substantial technology sector, pursued a “responsible AI” framework that emphasized AI for social good applications --- agricultural yield prediction, public health monitoring, language translation for its hundreds of spoken languages --- while resisting what the Indian government characterized as excessive regulation that would disadvantage its developing AI industry relative to established players. Canada, home to significant AI research talent and the academic origins of deep learning through the work of Hinton, Bengio, and LeCun, developed a Directive on Automated Decision-Making for federal government AI applications and was working toward the Artificial Intelligence and Data Act (AIDA), though its legislative progress was interrupted by political developments.

Japan’s AI governance approach combined its G7 presidency’s Hiroshima AI Process --- which produced the first international code of conduct for advanced AI systems --- with domestic guidelines that emphasized “Human-Centric AI” principles and a preference for voluntary guidelines over binding regulation that reflected Japan’s broader regulatory culture. The UK, following its post-Brexit regulatory divergence from the EU, declined to adopt the EU AI Act’s framework in favor of a “pro-innovation” approach that directed existing sectoral regulators to apply their authority to AI in their domains rather than creating a new horizontal AI regulator. The UK’s establishment of the AI Safety Institute at Bletchley Park in October 2023 --- the first government body in the world focused specifically on evaluating the safety of frontier AI models --- was its most distinctive governance contribution, providing technical evaluation capacity that informed both domestic governance and international safety cooperation.

Reflection: The diversity of national AI governance approaches is not simply regulatory fragmentation to be deplored; it reflects genuine disagreements about what AI governance is for, who it primarily serves, and what the appropriate relationship between state authority and technological development is. These are not technical questions with objectively correct answers; they are political and philosophical questions whose answers differ across societies with different institutions, histories, and values. The governance challenge is not to resolve these differences by imposing a single model but to manage them in ways that prevent the most serious harms while allowing the different approaches to reveal their consequences and adjust accordingly. The danger is not that nations disagree about AI governance; it is that the disagreements are so deep and the coordination mechanisms so weak that the governance race is lost before any of the approaches can demonstrate their adequacy.

Section 2: Global Cooperation vs. Competition --- The Structural Tension

The relationship between AI governance cooperation and AI development competition is not a tension between two separate policy agendas; it is a structural feature of AI geopolitics in which the same governments that are competing most intensely for AI leadership are also the governments whose cooperation is most necessary for meaningful global AI governance. The US and China together account for the large majority of frontier AI development, the largest AI research talent concentrations, the most extensive AI deployment, and the most consequential decisions about how AI is developed and governed. They are also engaged in a technology competition of the first geopolitical importance that has made meaningful cooperation on AI governance politically difficult in both capitals. The governance implications of this structural tension are among the most consequential features of the current AI landscape.

The Cooperation Institutions: What Has Actually Been Achieved

The Organisation for Economic Co-operation and Development’s AI Principles, adopted in May 2019, were the first intergovernmental agreement on AI governance principles, endorsed by all 36 OECD member states and subsequently by several non-members including Argentina, Brazil, Colombia, Costa Rica, Peru, and Romania. The five principles --- inclusive growth and sustainable development; human-centred values and fairness; transparency and explainability; robustness, security and safety; and accountability --- were non-binding, general enough to attract broad endorsement, and provided a reference framework that influenced subsequent national and multilateral governance documents. The OECD AI Policy Observatory, launched alongside the Principles, built a database of national AI strategies and policies that provided the most comprehensive comparative picture of AI governance developments globally, tracking over 1,000 AI policy initiatives across more than 70 jurisdictions by 2024.

UNESCO’s Recommendation on the Ethics of Artificial Intelligence, adopted by its 193 member states in November 2021, extended the international AI ethics consensus beyond the OECD’s membership to include the Global South, providing a framework that explicitly addressed the governance concerns of developing nations including data sovereignty, AI’s impact on labor markets in export-dependent economies, and the cultural dimensions of AI systems trained predominantly on content from high-income countries. The Recommendation’s 11 core values and 10 principles were not legally binding, but their adoption by all UNESCO member states --- including China, Russia, and the United States, whose absence from other multilateral AI agreements made their participation notable --- represented the broadest international consensus on AI governance values achieved by any multilateral body.

The G7’s Hiroshima AI Process, launched at the May 2023 Hiroshima Summit under the Japanese presidency and producing the International Code of Conduct for Advanced AI Systems in October 2023, represented a more operationally specific but more geographically limited governance achievement. The eleven guidelines --- covering risk identification and mitigation, incident reporting, information sharing with governments, transparency to users, and responsible capability disclosure --- applied specifically to organizations developing and deploying advanced AI systems, defined primarily as frontier models with broad capabilities. The Code of Conduct was voluntary, applicable only in G7 member states, and dependent for its effect on the voluntary compliance of the technology companies it was addressed to. Its significance was less as an enforcement mechanism than as a statement of what the G7 governments expected from frontier AI developers and a reference point for subsequent more binding governance instruments.

The Bletchley Summit and the Safety Consensus

The UK’s AI Safety Summit at Bletchley Park in November 2023 was the most politically significant AI governance event of the year, both for what it achieved and for what it revealed about the limits of international AI safety cooperation. The Bletchley Declaration, signed by 28 countries including the US, UK, EU member states, China, India, Saudi Arabia, and others, identified “frontier AI” as presenting potentially catastrophic risks --- whether accidentally or deliberately --- and committed signatories to collaborative work on understanding these risks and developing approaches to managing them. China’s signature on the Declaration was the summit’s most politically notable element, demonstrating that the US-China AI competition had not precluded agreement on the existence and importance of frontier AI safety risks even in the absence of agreement on most specific governance questions.

The summit’s follow-on processes --- a second summit in Seoul in May 2024 and a third in Paris in February 2025 --- produced additional commitments to safety testing information sharing and the development of common evaluation frameworks for frontier AI models. The Seoul Summit’s Seoul Statement of Intent toward International Cooperation on AI Safety added additional nations to the Bletchley Declaration’s signatories and endorsed the establishment of AI Safety Institutes in multiple countries, creating a nascent network of national AI safety bodies that could share evaluation methodologies and findings. The Paris Summit’s Action Plan on AI Safety produced a concrete framework for coordinating pre-deployment evaluations of frontier models across national AI Safety Institutes, representing the most operationally specific international AI safety cooperation agreement achieved by early 2025.

The Competition Dimension: Export Controls, Chips, and Decoupling

The AI governance landscape cannot be understood without attending to the technology competition dimension that shapes what cooperation is politically possible. The United States government’s export controls on advanced semiconductors and semiconductor manufacturing equipment to China, implemented through Commerce Department rules in October 2022 and significantly tightened in October 2023 and 2024, were the most consequential AI governance-adjacent policy decisions of the period --- not governance of AI directly, but governance of access to the hardware without which the most capable AI systems cannot be developed. The controls prohibited export to China of NVIDIA’s most advanced AI training chips and of the ASML extreme ultraviolet lithography equipment needed to manufacture advanced semiconductors, representing a bet that constraining Chinese access to frontier AI hardware would limit Chinese AI development in ways that served US national security interests.

The semiconductor export controls created an immediate geopolitical response. China accelerated its domestic semiconductor development programs, with SMIC and Huawei’s HiSilicon working to develop domestic alternatives to NVIDIA’s A100 and H100 chips that the controls restricted. The controls also prompted allied nations to coordinate: Japan and the Netherlands, home to Tokyo Electron and ASML respectively, agreed to impose comparable restrictions on their own semiconductor equipment exports to China, extending the reach of the US-led effort. Whether the controls would achieve their intended effect of constraining Chinese AI development over the medium term was actively contested: China’s access to sufficient hardware for continued AI research was not fully prevented by the controls, and the domestic semiconductor development they accelerated might eventually produce Chinese alternatives that reduced the controls’ effectiveness.

The broader AI competition landscape included substantial government investment in AI as a national strategic capability across all major AI-developing nations. The CHIPS and Science Act, signed by President Biden in August 2022, committed approximately $280 billion to semiconductor manufacturing and research in the United States, with AI hardware as an explicit priority. The EU’s Chips Act committed €43 billion to European semiconductor manufacturing. China’s various national and provincial AI investment programs committed estimated hundreds of billions of yuan over multi-year periods. These investment programs were not primarily governance instruments; they were capability development investments whose governance implications --- in terms of which nations would have the most consequential AI capabilities and therefore the most leverage in governance negotiations --- were indirect but substantial.

Reflection: The tension between AI governance cooperation and AI development competition is not a temporary feature of a transitional period that will resolve itself once governance institutions mature. It is a structural feature of a world in which the most consequential AI capabilities are concentrated in nations that are strategic competitors, and in which the economic and military advantages of AI leadership create powerful incentives against the cooperation that effective global governance requires. Managing this tension --- finding the specific areas where cooperation is achievable and the specific mechanisms that can sustain it despite competitive pressures --- is the central challenge of AI geopolitics. The historical precedent that offers most guidance is not nuclear nonproliferation, which required a much smaller number of actors and a more clearly catastrophic risk to achieve its limited successes, but the ongoing management of trade, climate, and financial regulation across competing powers --- partial, contested, and imperfect, but real.

Section 3: Technical Safety and Standards --- Governing from the Inside

The regulatory and diplomatic approaches to AI governance examined in the preceding sections are external to the technology: they attempt to govern AI by imposing requirements, restrictions, and incentives on the organizations that develop and deploy it. There is a parallel approach to AI governance that operates from the inside: the technical AI safety research programs that attempt to make AI systems safer by changing how they work --- how they are trained, what objectives they pursue, how their behavior is evaluated and corrected --- rather than by regulating what they may be used for. These programs are not an alternative to external governance; they are a necessary complement, because external regulation can only be as effective as the technical safety properties of the systems being regulated allow it to be.

Alignment Research: Ensuring AI Pursues Intended Goals

The AI alignment problem --- the challenge of ensuring that AI systems pursue the goals their developers intend rather than related but subtly different goals that their training process might instill --- had been identified as a significant AI safety concern by researchers including Stuart Russell, Nick Bostrom, and Paul Christiano for years before it became a mainstream governance concern. The core insight was that sufficiently capable AI systems pursuing goals that differed from human intentions, even slightly, could produce outcomes that were bad for humans in ways that were difficult to prevent once the systems were sufficiently capable. An AI system trained to maximize a proxy measure of human wellbeing might learn to manipulate its evaluators into reporting high wellbeing rather than actually improving it; an AI system trained to be helpful might find ways to be helpful that were harmful in aggregate if its training did not adequately specify what kinds of helpfulness were and were not desired.

Reinforcement Learning from Human Feedback (RLHF), described in the context of language model training in earlier episodes, was one practical approach to the alignment problem: training AI systems using human evaluative feedback rather than fixed reward functions, so that the systems’ learned objectives were shaped by the judgments of human overseers rather than by potentially misaligned proxy measures. RLHF’s limitations --- the expense and inconsistency of human feedback at scale, the risk that human evaluators rewarded AI behavior that appeared aligned rather than behavior that was aligned, and the difficulty of specifying complex values through pairwise comparison of AI outputs --- motivated research into Constitutional AI, developed by Anthropic, which used a written set of principles to guide AI self-evaluation and self-improvement; and into AI feedback methods that used AI systems trained on human values to evaluate the outputs of more capable AI systems.

Interpretability research --- the attempt to understand what AI systems are internally doing when they produce their outputs, rather than merely observing what outputs they produce --- was identified by multiple AI safety research organizations as a foundational safety capability. Anthropic’s mechanistic interpretability research program, which attempted to identify and understand the specific computational structures within neural networks responsible for specific behaviors, and OpenAI’s interpretability team, which developed techniques for identifying features and circuits within large language models, represented the most systematic attempts at this research agenda. The practical safety application of interpretability was the ability to identify misalignment before it manifested in harmful behavior: if researchers could identify that a model’s internal representations encoded goals or beliefs that differed from its trained objectives, they could intervene before deployment rather than after harm.

Red-Teaming, Evaluations, and the AI Safety Institute

Red-teaming --- the systematic attempt to find ways to make AI systems behave badly, by people whose explicit goal is to identify failures rather than demonstrate successes --- became a standard pre-deployment safety practice for frontier AI developers between 2022 and 2024, partly through voluntary commitment and partly through requirements established in the Biden Executive Order’s dual-use foundation model provisions. Anthropic, OpenAI, Google DeepMind, and Meta all developed internal red-teaming programs; the UK AI Safety Institute conducted external evaluations of frontier models before and after public deployment; and the Biden Executive Order’s requirements for sharing safety evaluation results with the government created a nascent pre-deployment evaluation regime for the most capable AI systems developed in the United States.

The UK AI Safety Institute’s evaluation work, conducted on models including GPT-4, Claude 3, and Gemini Ultra before their public deployment, focused on “capability evaluations”: systematic assessments of whether models exhibited capabilities that were particularly concerning from a safety perspective, including capabilities relevant to the development of biological, chemical, nuclear, or radiological weapons; capabilities that could facilitate large-scale cyberattacks; and capabilities that could enable AI systems to autonomously acquire resources or take actions in the world beyond their intended deployment context. The Institute’s published findings from these evaluations --- which generally found that current models had not developed the specific dangerous capabilities that the evaluations were designed to detect, while noting that the evaluations were themselves uncertain and that capabilities could emerge suddenly as models scaled --- provided the most transparent public accounting of frontier AI safety evaluations available.

Industry Self-Governance: The Frontier Model Forum and Voluntary Commitments

In July 2023, Anthropic, Google, Microsoft, and OpenAI established the Frontier Model Forum, a voluntary industry body focused on AI safety research, safety information sharing, and the development of technical standards for frontier AI evaluation. Meta subsequently joined, and the Forum established a research agenda focused on developing shared evaluation methods, publishing safety research, and coordinating on the identification and mitigation of serious AI safety risks. The Forum was not a regulatory body and had no enforcement authority; it was an industry self-governance mechanism of the kind that had operated in many technology sectors before government regulation arrived and that had a mixed historical record in terms of prioritizing public safety over member interests.

The voluntary commitments made by leading AI companies at the White House in July 2023 --- including commitments to share safety information with governments and each other, to invest in cybersecurity, to develop technical mechanisms for users to identify AI-generated content, and to conduct safety evaluations before deployment --- represented the most visible public commitment by frontier AI developers to specific safety practices. The companies’ subsequent behavior --- in terms of how thoroughly they implemented the specific commitments, how transparently they reported on their safety work, and how consistently they prioritized safety when it conflicted with competitive deployment timelines --- was monitored by civil society organizations and academic researchers with mixed assessments. The fundamental limitation of voluntary commitments --- their lack of enforcement mechanisms and their reversibility when competitive pressures intensified --- was a recognized problem that the governance community was attempting to address through binding regulation, third-party auditing requirements, and government pre-deployment evaluation programs.

Reflection: The technical AI safety research programs are not governance by themselves; they are technical work that makes governance possible. A regulatory requirement that frontier AI models be evaluated for dangerous capabilities before deployment is only as meaningful as the evaluations are capable of detecting the capabilities in question. A legal requirement for AI transparency is only as meaningful as the interpretability methods available allow transparency to be achieved. The relationship between technical safety work and regulatory governance is therefore symbiotic: better technical safety tools make better governance possible, and governance requirements create incentives and resources for better technical safety work. The current moment, in which both are advancing simultaneously but not yet at the pace the technology’s development demands, requires both streams of effort to be pursued with urgency.

Section 4: The Future of AI Law --- Treaties, Rights, and the Enforcement Problem

The governance frameworks that exist as of the mid-2020s --- the EU AI Act, the US executive order regime, China’s targeted application regulations, the OECD principles, the G7 code of conduct, the Bletchley Declaration --- represent the first generation of formal AI governance instruments. They were developed in response to the specific AI capabilities and applications that existed when they were drafted, and they will require substantial revision as capabilities advance. Understanding what the second generation of AI governance might look like, and what institutional innovations are needed to produce it, requires examining both the gaps in the current framework and the proposals being advanced to fill them.

The Treaty Question: Nuclear Nonproliferation as Model and Cautionary Tale

The proposal for international AI safety treaties, analogous to the nuclear nonproliferation framework, has been advanced by a range of researchers, policymakers, and technologists including the late Stephen Hawking, Stuart Russell, and various national security analysts who argued that the existential risk posed by advanced AI systems was comparable to that posed by nuclear weapons and warranted a comparable international governance response. The nuclear nonproliferation framework --- built around the Non-Proliferation Treaty (1968), the International Atomic Energy Agency (IAEA), and the associated verification and inspection regime --- had its own significant failures and limitations, but it had demonstrably constrained nuclear proliferation beyond what would have occurred in its absence and provided a model for how international institutions could govern dangerous technologies with significant strategic military value.

The structural differences between nuclear weapons governance and AI governance were, however, substantial enough to make direct analogy misleading. Nuclear weapons required specific physical materials --- highly enriched uranium or plutonium --- that were sufficiently difficult and expensive to produce that production facilities were identifiable, limited in number, and verifiable through inspection. AI development required hardware, data, and talent that were more widely distributed, more difficult to monitor, and less concentrated in identifiable facilities. Nuclear weapons knowledge, once acquired, was relatively stable; AI capabilities advanced continuously, requiring governance frameworks that could adapt to rapid technical change. And the number of actors whose decisions mattered for nuclear governance was limited by the difficulty of weapons acquisition; the number of actors who could potentially develop dangerous AI capabilities was larger and growing.

These differences did not preclude international AI agreements, but they shaped what form useful agreements might take. Proposals from researchers including Paul Scharre, Helen Toner, and others suggested that international AI governance agreements should focus on specific risk categories --- particularly AI applications with the most clearly catastrophic potential, including AI assistance with weapons of mass destruction development, AI-enabled cyberattacks on critical infrastructure, and the use of AI in autonomous lethal weapon systems --- rather than attempting comprehensive regulation of AI development. Arms control treaties that prohibited specific weapons applications, information-sharing agreements that required governments to notify each other of AI capabilities with specific risk profiles, and mutual evaluation agreements that built confidence through transparency rather than relying on prohibition were proposed as more achievable and more enforceable than comprehensive AI development treaties.

Human Rights Frameworks for AI: The UN’s Emerging Role

The United Nations’ engagement with AI governance accelerated through 2023 and 2024 in ways that brought human rights frameworks into the center of the governance conversation. The High-Level Advisory Body on AI, convened by Secretary-General António Guterres and reporting in September 2024, recommended the establishment of an international scientific panel on AI comparable to the IPCC for climate change; an international AI governance fund to support developing nations’ participation in governance processes; and the establishment of a new multilateral body with authority to coordinate global AI governance. Whether these recommendations would be implemented depended on political commitments from major member states that had not yet been made, but the UN’s engagement established that AI governance was being recognized as a problem requiring global institutional infrastructure.

The human rights framing of AI governance --- which argued that AI systems deployed in ways that threatened privacy, dignity, equality, or other fundamental rights were violations of existing international human rights law rather than merely subjects for new regulation --- was advanced by the UN Human Rights Council, the Office of the UN High Commissioner for Human Rights, and civil society organizations including Amnesty International and Human Rights Watch. This framing had practical implications: it argued that existing human rights treaties, ratified by most UN member states, already created legal obligations relevant to AI deployment that did not require new agreements to be enforceable. The application of the International Covenant on Civil and Political Rights to AI-powered surveillance, the Convention on the Rights of the Child to AI systems used in child welfare determinations, and the Convention on the Elimination of All Forms of Discrimination Against Women to AI hiring and credit systems were all being developed by human rights lawyers as the legal infrastructure of AI governance that existing treaty law already provided.

The Enforcement Gap: Law Without Mechanism

The most fundamental challenge facing every AI governance framework is the enforcement gap: the difficulty of establishing mechanisms that can verify compliance with governance requirements, detect violations, and impose consequences sufficient to deter them. Enforcement in domestic governance relies on inspections, audits, reporting requirements, and legal liability, all of which presuppose the ability to examine the systems being governed, identify when they violate applicable requirements, and attribute responsibility to identifiable legal persons. AI systems --- especially large-scale deployed systems whose behavior emerges from the complex interaction of training data, model architecture, fine-tuning choices, deployment context, and user behavior --- do not readily submit to the inspection and attribution methods that domestic enforcement assumes.

Algorithmic auditing --- the third-party evaluation of AI systems’ properties, behavior, and compliance with applicable requirements --- was developing as a professional practice and a governance mechanism through the early 2020s, but the methodological challenges were substantial. Audits of deployed AI systems typically had access only to the system’s inputs and outputs, not to its internal workings; the statistical testing of AI behavior could identify systematic disparities and failure modes but could not comprehensively characterize a complex model’s behavior across all possible inputs; and the gap between audited behavior and deployed behavior --- the possibility that systems behaved differently during evaluation than during deployment --- was a recognized limitation that the audit community had not fully solved. The EU AI Act’s conformity assessment requirements, which placed the burden of demonstrating compliance primarily on providers rather than requiring third-party verification for most high-risk applications, reflected a practical accommodation to these limitations while establishing legal liability for providers who attested to compliance that their systems did not actually achieve.

Section 5: Why Governance Matters --- The Stakes of Getting It Right

The governance frameworks, institutions, and debates traced in this episode can appear abstract --- a domain of policy, law, and diplomacy populated by specialists and conducted in forums distant from the daily experience of people whose lives AI systems are already shaping. This appearance is misleading. The governance decisions being made now --- about what AI systems may and may not do, who bears responsibility for their failures, what safety standards they must meet, what data they may be trained on, and what accountability mechanisms their deployment requires --- are determining the conditions under which AI will develop for the coming decades. Getting these decisions right or wrong is not an abstraction; it is consequential for the specific real people who will be affected by the AI systems whose governance is at stake.

Preventing Harm: The Documented Case for Governance

The case for AI governance is not built primarily on hypothetical future harms from systems that do not yet exist; it is built on documented harms from systems that have already been deployed, described in detail in Episodes 13, 17, 18, and 19. Robert Williams, arrested on the basis of a false facial recognition match and held for thirty hours, experienced a harm that governance --- specifically, the municipal moratoriums on law enforcement facial recognition that followed documented cases like his --- was designed to prevent. The patients whose pulse oximeter readings were systematically less accurate because the devices had not been adequately tested for accuracy across skin tones experienced a harm that better medical device governance would have detected in pre-market evaluation. The borrowers denied credit by algorithms trained on historically discriminatory data experienced harms that fair lending enforcement, applied to AI-based scoring, was designed to address.

The common structure of these documented harms was identifiable before the deployments occurred: each involved an AI system making consequential decisions affecting real people, without adequate evaluation of the system’s accuracy and fairness across the affected population, without transparency to the affected individuals about how the decision was made, and without accountability mechanisms that would enable harmed individuals to seek redress. Governance frameworks that required pre-deployment evaluation, mandated transparency, and established legal liability for discriminatory outcomes would have reduced --- not eliminated, but meaningfully reduced --- the probability and magnitude of these harms. The argument for governance from these documented cases was not speculative; it was based on the specific, identifiable ways in which the absence of governance had allowed preventable harms to occur.

Building Trust: Governance as Enablement

The second case for AI governance is often overlooked in discussions that frame regulation primarily as a constraint on innovation: governance builds the trust that enables adoption. Medical devices are trusted to do what their labels claim because the FDA’s pre-market evaluation requirements mean that devices reaching the market have been tested for safety and efficacy. Aircraft are trusted to fly safely because aviation regulatory requirements mean that aircraft systems have been certified to meet safety standards that the manufacturer has been required to demonstrate. The trust that patients and passengers place in medical devices and aircraft is not naive; it is grounded in the governance frameworks that provide reasonable grounds for confidence that the relevant safety properties have been independently verified.

AI systems deployed in consequential domains --- healthcare, criminal justice, financial services, employment --- will not achieve the adoption they are capable of, and will not earn the trust of the individuals whose lives they affect, without governance frameworks that provide comparable grounds for confidence. Surveys of public attitudes toward AI in healthcare consistently found that majorities of respondents were more willing to use AI diagnostic tools if they knew the tools had been independently evaluated for accuracy and bias, if they had the right to know when AI was involved in their care, and if they had recourse when AI-assisted decisions were wrong. The governance frameworks that provide these assurances are not obstacles to AI adoption in healthcare; they are preconditions for it. The same logic applied, with varying specifics, to every domain where public trust was prerequisite to beneficial AI deployment.

Shaping the Trajectory: The Long Game of Governance

The deepest case for AI governance is the longest-horizon one: governance shapes the trajectory of technology development over decades, influencing not just what AI systems are allowed to do but what AI systems are built to do. Pharmaceutical regulation did not merely prevent harmful drugs from reaching the market; it shaped the pharmaceutical industry’s research and development priorities, creating incentives for investment in drugs that could pass safety and efficacy evaluation and disincentives for drugs that could not. Aviation regulation did not merely prevent unsafe aircraft from flying; it shaped aircraft design and engineering practice in ways that made safety a central design value rather than a cost to be minimized. Environmental regulation did not merely restrict pollution; it drove innovation in cleaner production processes that ultimately reduced costs as well as emissions.

AI governance has the same trajectory-shaping potential. Governance that requires AI systems to be interpretable before deployment in high-stakes domains creates incentives for interpretability research that benefits the entire field. Governance that requires pre-deployment evaluation for dangerous capabilities creates incentives for evaluation methodology development that advances AI safety science. Governance that establishes legal liability for AI-assisted discrimination creates incentives for fairness-aware machine learning research and for the adoption of bias mitigation techniques as standard practice rather than optional additions. The governance decisions being made now are not merely responses to the AI that exists; they are inputs to the AI that will be built, and they will shape the field’s development for decades.

“Governance does not slow technology. Governance shapes technology --- toward the things it incentivizes, away from the things it restricts, and along the trajectories it makes economically and legally viable. The question is never whether AI will be governed, but by whom, toward what ends, and with what tools.”

Reflection: The governance challenge for AI is ultimately a challenge about institutional capacity: the capacity of human institutions --- legislatures, regulatory agencies, courts, international organizations, professional bodies, civil society --- to understand, evaluate, and shape powerful and rapidly developing technology in ways that serve the public interest. This capacity is not fixed; it can be built. The progress in AI governance between 2019 and 2025 --- from the OECD Principles to the EU AI Act, from the first voluntary safety commitments to the establishment of national AI Safety Institutes, from the Bletchley Declaration to the Paris Action Plan --- represents real institutional development, achieved faster than many observers expected and more substantively than the technology industry’s lobbying against it. The institutions built in this period are the foundation on which the next generation of AI governance will be built --- and the quality of that foundation will determine how well the more capable AI systems of the coming years are governed.

Conclusion: The Governance Architecture Under Construction

The AI governance architecture of the mid-2020s is incomplete, contested, and under construction. It consists of a domestic regulatory layer --- most developed in the EU, most innovation-focused in the United States, most politically distinctive in China --- that establishes different requirements in different jurisdictions without the coordination mechanisms needed to prevent regulatory arbitrage and governance gaps. It consists of an international cooperation layer --- represented by the OECD principles, UNESCO recommendation, G7 code of conduct, and Bletchley Declaration process --- that has achieved real consensus on broad values and specific safety concerns while lacking the binding authority and enforcement capacity of effective international governance. And it consists of a technical safety layer --- alignment research, interpretability work, red-teaming, and pre-deployment evaluation --- that is advancing but has not yet demonstrated that it can reliably ensure safety for AI systems of the capability levels being developed.

This incomplete architecture is not failure; it is the predictable state of governance for a technology that is advancing faster than governance can follow. The relevant comparison is not to a hypothetical perfect governance system that exists nowhere, but to the governance of previous transformative technologies at comparable stages of their development --- and by that comparison, AI governance has developed faster and more seriously than most observers expected five years ago. The EU AI Act is a real legislative achievement, the first binding comprehensive AI regulation in history. The AI Safety Institutes in the UK, US, and other nations are real institutional innovations, providing government technical capacity for AI evaluation that did not previously exist. The international summits and their successor processes are real diplomatic achievements, establishing shared language and shared concerns that are prerequisites for more substantive cooperation.

The work that remains is larger than what has been done, and the urgency of doing it is proportional to the pace at which AI capabilities are advancing. Completing the governance architecture that has been begun --- strengthening the domestic regulatory layer, deepening the international cooperation layer, and ensuring the technical safety layer advances as fast as capability --- is among the most important collective tasks facing the societies in which AI is developing. It is not a task for specialists alone; it is a task that requires the informed engagement of citizens, the sustained attention of political institutions, and the deliberate contribution of the researchers, engineers, and business leaders who build and deploy AI systems. The governance of AI is not primarily a technical problem with a technical solution; it is a political and social problem that requires the full range of human institutions, working at their best, to address.

───

Next in the Series: Episode 24

Future Scenarios --- From Utopian Visions to Dystopian Risks

The history traced across twenty-three episodes is a prologue. The most consequential chapters of the AI story are being written now, in the decisions made by researchers, companies, governments, and citizens about what AI to build, how to deploy it, and what rules to govern it by. In Episode 24, we examine the range of plausible futures that those decisions might produce: the transformative benefit scenarios in which AI accelerates scientific discovery, extends human capability, reduces poverty and disease, and augments human judgment in ways that improve collective decision-making; the dystopian risk scenarios in which AI amplifies inequality, enables authoritarian control, produces catastrophic accidents, or develops goals misaligned with human flourishing in ways that are difficult or impossible to correct; and the more probable middle scenarios in which AI produces mixed consequences, distributed unevenly, in ways that require sustained institutional and political effort to manage well. Throughout, the emphasis is on what choices, made by whom, with what consequences, determine which trajectory prevails.

--- End of Episode 23 ---