AI HISTORY SERIES --- EPISODE 6

The Birth of AI as a Field

From Turing’s Question to Dartmouth’s Answer --- How a Dream Became a Discipline

How Turing’s imitation game and the Dartmouth workshop of 1956 transformed artificial intelligence from philosophical fantasy into a scientific discipline.

Introduction: The Moment a Question Became a Field

By 1950, the computer existed. Five episodes of this series have traced the long road to that point: the myths of ancient civilizations that first expressed the dream of artificial life; the automata of Hero, Al-Jazari, and Leonardo that translated that dream into engineering; the philosophers --- Descartes, Leibniz, Hobbes, Pascal --- who provided the theoretical framework; the Victorian visionaries --- Babbage, Lovelace, Boole --- who designed the architecture and the mathematics; and finally the extraordinary decade of the 1930s and 1940s in which Turing, Shannon, Zuse, and von Neumann built the theoretical and physical foundations of electronic computation. The machine was real. The infrastructure of computation existed.

But building the computer was the easy part, in a certain sense. The harder question --- the question that had haunted philosophers since Descartes, that Turing had posed with new precision, that the builders of ENIAC and Colossus had bracketed in favour of practical urgency --- was still wide open: could the machine think? Could the new computers, given the right programs, be made to reason, to learn, to understand language, to solve problems in the open-ended way that human intelligence could? Was artificial intelligence, in other words, not just a philosophical possibility but a practical engineering project that could be pursued systematically and with realistic hope of success?

“The computer had been built. The question that remained --- the question that would define the next seventy years of research --- was what to do with it.”

The answer to that question, at least in institutional and programmatic terms, came in the years between 1950 and 1956. It came from a small number of researchers --- mathematicians, psychologists, engineers, and logicians --- who had independently arrived at the conviction that the new computers could be programmed to perform intelligent behavior, that this was a legitimate scientific project rather than a philosophical fantasy, and that it was worth organizing a systematic effort to pursue it. This episode traces the intellectual and institutional emergence of artificial intelligence as a field: from Turing’s foundational 1950 paper, through the Dartmouth workshop of 1956 that gave the field its name and its founding agenda, to the early programs and systems that defined the first decade of AI research and the wave of optimism --- and the seeds of its eventual disappointment --- that they generated.

Section 1: Turing and the Question ‘Can Machines Think?’

Alan Turing’s 1950 paper “Computing Machinery and Intelligence,” published in the philosophical journal Mind, opens with one of the most famous sentences in the history of science: “I propose to consider the question, ‘Can machines think?’” What follows is a masterclass in intellectual strategy. Rather than attempting to answer the question directly --- which would require definitions of “machine” and “thinking” that are themselves deeply contested --- Turing immediately proposes to replace it with a different question, one that can be answered empirically rather than philosophically.

The Imitation Game

The replacement question takes the form of a game, which Turing calls the Imitation Game and which history has come to know as the Turing Test. In the form that became standard, the game involves three participants: a human interrogator communicating by typewritten messages, a human respondent, and a machine. (Turing’s own version opens as a parlour game in which the interrogator tries to distinguish a man from a woman; the machine then takes the man’s place.) The interrogator’s task is to determine, through questioning alone, which of the other two participants is the human and which is the machine. The machine’s task is to answer in ways indistinguishable from a human. If the machine can do this consistently --- if the interrogator cannot do better than chance at identifying it --- Turing argues that it is reasonable to say the machine thinks.

The elegance of this formulation is that it sidesteps the definitional quagmire that had previously made the question unanswerable. We do not need to agree on what “thinking” means, or whether machines can in principle possess consciousness or understanding, or whether there is something it is “like” to be a machine processing information. We only need to evaluate behavior: does the machine produce, in conversational interchange, outputs indistinguishable from those of a thinking human being? If yes, Turing proposes, we have no more reason to deny that the machine thinks than we have to deny that other humans think --- which we routinely infer from their behavior, without any direct access to their inner experience.

Turing was careful to acknowledge objections to his proposal, and the paper’s central section is a systematic survey of the most powerful of them. He considers the theological objection (only beings with souls can truly think, and machines have no souls), the mathematical objection (Gödel’s incompleteness theorems show there are things humans can know that machines cannot), the argument from consciousness (a machine cannot truly understand what it is doing), the argument from informality of behavior (human behavior is too flexible and contextual to be captured by rules), and several others. His rebuttals vary in their persuasiveness, but his treatment is honest and thorough, and the objections he raises are essentially the same objections that philosophers and cognitive scientists are still debating today.

Turing’s Prediction and Its Legacy

Perhaps the most striking passage in the paper, from the perspective of historical hindsight, is Turing’s prediction. He writes that he believes that within fifty years --- that is, by approximately the year 2000 --- computers will have been programmed well enough to play the Imitation Game so successfully that an average interrogator will have no more than a seventy percent chance of correctly identifying the machine after five minutes of questioning. He then adds, with characteristic understatement, that he believes the use of the phrase “machines think” will no longer seem so strange.

This prediction was, in retrospect, both prescient and premature. By the year 2000, no program had convincingly passed the Turing Test in a rigorous setting. But by 2023, large language models were engaging in extended conversations that many users found compelling and difficult to distinguish from human communication, at least in limited contexts. The fifty-year timeline was off; the general direction was right. More importantly, Turing’s paper accomplished something more valuable than any specific prediction: it provided the field of AI with both a practical criterion for success and a philosophical justification for pursuing the project. After 1950, researchers could say, with Turing’s authority behind them, that building intelligent machines was not mere science fiction but a legitimate scientific program with a clear operational goal.

“Turing did not answer the question ‘Can machines think?’ He did something more useful: he showed researchers how to stop arguing about it and start building.”

Reflection: Turing’s 1950 paper is not just a historical landmark. It is the founding document of AI as a practical discipline --- the moment when the question ‘Can machines think?’ was transformed from a philosophical puzzle into an engineering challenge. Every AI researcher who has ever built a system intended to exhibit intelligent behavior is, in some sense, responding to the invitation Turing issued in that paper.

Section 2: The Dartmouth Conference of 1956 --- Naming the Field

Six years after Turing’s paper, and against a background of rapidly accelerating computer development and increasing confidence among a small community of researchers that machine intelligence was achievable, a proposal was submitted to the Rockefeller Foundation for funding to support a summer workshop. The proposal was drafted by four men: John McCarthy, a young mathematician at Dartmouth College; Marvin Minsky, a mathematician and neuroscientist at Harvard; Nathaniel Rochester, the chief architect of the IBM 701, IBM’s first commercially available scientific computer; and Claude Shannon, the inventor of information theory at Bell Labs. The workshop, they proposed, would be held at Dartmouth College in the summer of 1956, and its subject would be what they proposed to call “artificial intelligence.”

The Proposal and Its Audacious Claim

The funding proposal contained a sentence that is now one of the most celebrated --- and most contested --- in the history of science. The study, it declared, was to “proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” This was an extraordinary claim, and it is worth pausing to appreciate its full audacity. Not some aspects of intelligence. Not simple, well-defined aspects. Every aspect. In principle. The conjecture that underwrote the entire enterprise of AI research was, at its core, a philosophical thesis about the nature of intelligence: that it is, in its entirety, a process precise enough to be described and simulated.

This conjecture was, of course, not proven. It remains, in important respects, unproven today. But it was a productive conjecture --- a working hypothesis precise enough to generate a research program, bold enough to attract talented researchers, and grounded enough in the results already achieved by Turing, Shannon, and the builders of the first computers to be scientifically respectable. McCarthy and his colleagues were not fantasists; they were researchers who had seen what machines could already do and were making an educated bet about what they could eventually be made to do.

The Workshop and Its Participants

The Dartmouth Summer Research Project on Artificial Intelligence ran for roughly eight weeks in the summer of 1956, on the campus of Dartmouth College in Hanover, New Hampshire. About twenty researchers attended at various points, though the full group was rarely assembled simultaneously. The atmosphere, by all accounts, was informal and intellectually electric: small groups of researchers gathered in the evenings to argue about what intelligence was, what programs could already do, and what might be achievable with sufficient effort.

The workshop did not produce a unified research agenda or a set of landmark results. What it produced was a community: a network of researchers who shared a common vocabulary, a common set of ambitions, and a common conviction that the project of building intelligent machines was legitimate, tractable, and worth dedicating a career to. The term “artificial intelligence,” coined by McCarthy for the proposal, gave that community a name and an identity. Before Dartmouth, there were isolated researchers working on machine translation, theorem proving, neural networks, and game playing. After Dartmouth, there was a field.

The Founding Figures

The Dartmouth workshop brought together, in one place for the first time, the researchers who would define AI’s first decade. Each deserves individual attention, because their different backgrounds and temperaments shaped the field’s early character in ways that are still visible today.

John McCarthy was born in Boston in 1927 and trained as a mathematician, completing his doctorate at Princeton under Solomon Lefschetz. He was the organizational architect of the Dartmouth workshop and would go on to co-found the MIT AI Laboratory with Minsky (later moving to Stanford, where he founded a second AI lab) and to invent the programming language LISP --- the lingua franca of AI research for the next three decades. McCarthy was a committed believer in symbolic AI, the approach that represented knowledge as formal logical structures and intelligence as manipulation of those structures. He would remain a defender of this approach long after the field had moved in other directions, and his insistence on formal precision shaped the culture of AI research in its early decades.

Marvin Minsky was born in New York in 1927 and brought to AI an unusually eclectic intellectual background: he had studied psychology, mathematics, and neuroscience, and his approach to intelligence was always more biological and cognitive than McCarthy’s purely formal one. Minsky’s most important early work was on neural networks and the perceptron (which we will examine in Section 3), but his interests ranged far beyond any single approach, and his 1969 book “Perceptrons,” co-authored with Seymour Papert, would prove as significant for what it closed down as for what it opened up. Minsky spent most of his career at MIT and was, by any measure, the most intellectually adventurous and provocative of the Dartmouth generation.

Allen Newell was born in San Francisco in 1927 and came to AI via an undergraduate degree in physics at Stanford and a career as a researcher at the RAND Corporation; his doctorate, in industrial administration at Carnegie Tech, came only in 1957, after his AI work was already under way. He was less interested in formal mathematics than in cognitive science: his driving question was not what logic could achieve in principle, but how human beings actually thought, and whether computers could be made to replicate those cognitive processes. His collaborator Herbert A. Simon --- born in Milwaukee in 1916, trained in political science and economics, and a future Nobel laureate --- shared this cognitive orientation. Together, Newell and Simon would produce the most impressive early AI programs of the 1950s and articulate the “physical symbol system hypothesis” that became one of the field’s central theoretical commitments. Their approach --- building AI by modelling the cognitive processes of human problem solving --- was, from the beginning, distinct from McCarthy’s more purely formal approach, and the tension between these two orientations would run through AI research for decades.

Claude Shannon, the fourth proposer and the oldest and most established member of the group, contributed his towering reputation, his information-theoretic framework, and his early work on machine chess playing. He was not primarily an AI researcher; his interests were broader and his commitments less fixed than those of the younger men. But his presence at Dartmouth lent the enterprise credibility, and his intellectual framework --- the idea that information could be quantified and that the manipulation of information was the fundamental operation of both communication and cognition --- permeated the field’s early culture.

Reflection: The Dartmouth conference did not produce a single breakthrough or a unified program. What it produced was something more valuable and more lasting: a community, a vocabulary, and a shared conviction. Before Dartmouth, AI was a set of isolated research threads. After Dartmouth, it was a field with a name, a founding mythology, and a cohort of brilliant, ambitious researchers determined to make it succeed.

Section 3: Early Symbolic Programs and Pioneering Systems

The confidence expressed at Dartmouth was not merely aspirational. It was grounded in programs that had already been written --- programs that, for the first time in history, performed tasks that had previously seemed to require human intelligence: proving mathematical theorems, solving algebraic problems, engaging in conversation, understanding and acting on instructions in natural language. These early programs were narrow, fragile, and limited in ways that their creators sometimes failed to appreciate. But they were real, they worked, and they demonstrated beyond philosophical argument that machines could, in some meaningful sense, reason.

The Logic Theorist: Machines That Prove Theorems

The most impressive demonstration at the Dartmouth workshop --- and the result that generated the most excitement among attendees --- was a program called the Logic Theorist, written by Allen Newell, Herbert Simon, and their RAND colleague Cliff Shaw. The Logic Theorist was designed to prove theorems in propositional logic using heuristic search: rather than exhaustively checking all possible proofs, it used rules of thumb --- heuristics --- to guide its search toward promising proof strategies, mimicking (as Newell and Simon believed) the way human mathematicians actually approached proof problems.
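The flavor of heuristic search is easy to convey in code. What follows is a generic best-first search on a deliberately toy problem --- a sketch of the general idea, not a reconstruction of the Logic Theorist, which searched over logical expressions; the heuristic function here stands in for the rules of thumb Newell, Simon, and Shaw built into their program.

```python
import heapq

def best_first_search(start, is_goal, successors, heuristic):
    """Expand the most promising state first, as ranked by the heuristic."""
    frontier = [(heuristic(start), start, [start])]   # priority queue
    seen = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)      # best candidate so far
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), nxt, path + [nxt]))
    return None   # search space exhausted without reaching the goal

# Toy usage: reach 17 from 1, where the only "inference rules" are
# doubling a number or adding one to it.
print(best_first_search(
    1,
    is_goal=lambda n: n == 17,
    successors=lambda n: [n * 2, n + 1],
    heuristic=lambda n: abs(17 - n),   # prefer states closer to the goal
))   # -> [1, 2, 4, 8, 16, 17]
```

The point of the heuristic is visible in the result: an exhaustive search would enumerate every reachable number, while the rule of thumb sends the search almost straight to the goal.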

The results were striking. The Logic Theorist was able to prove thirty-eight of the first fifty-two theorems in chapter two of Russell and Whitehead’s “Principia Mathematica” --- the foundational text of mathematical logic that had taken two of the greatest logicians of the twentieth century years to produce. For one theorem, it found a proof more elegant than the one in the Principia. When Simon showed the results to his wife one evening in December 1955, he reportedly told her that they had invented something that thought.

The Logic Theorist was significant not just for its results but for its methodology. Newell and Simon were not trying to prove theorems by brute force enumeration; they were trying to model the cognitive processes by which humans solve problems, and to test that model by running it on a computer. This approach --- AI as cognitive simulation, the computer as a tool for testing theories of human thinking --- would generate a rich research tradition and eventually develop into the field of cognitive science. It also generated Newell and Simon’s most ambitious project: the General Problem Solver.

The General Problem Solver: A Universal Reasoner

Building on the success of the Logic Theorist, Newell, Simon, and Shaw developed the General Problem Solver (GPS) between 1957 and 1959. Where the Logic Theorist was designed for the specific domain of propositional logic, GPS was intended to be a domain-independent problem solver --- a program that could, given a description of any problem in terms of an initial state, a goal state, and a set of available operators, find a sequence of operators that transformed the initial state into the goal state. The key mechanism was means-ends analysis: at each step, GPS would identify the difference between the current state and the goal state, select an operator likely to reduce that difference, and apply it.
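Means-ends analysis is simple enough to sketch in a few lines. The version below is a minimal illustration, with states as sets of facts and operators as precondition/add/delete triples in a hypothetical monkey-and-bananas domain; GPS itself was far richer, with difference tables, goal types, and more sophisticated operator selection.

```python
from collections import namedtuple

# An operator: its name, preconditions, the facts it adds, the facts it removes.
Op = namedtuple("Op", "name pre add delete")

OPS = [
    Op("walk-to-box",  {"at-door"}, {"at-box"},      {"at-door"}),
    Op("climb-box",    {"at-box"},  {"on-box"},      set()),
    Op("grab-bananas", {"on-box"},  {"has-bananas"}, set()),
]

def achieve(state, goals, plan):
    """For each difference between state and goals, pick an operator that
    reduces it, recursively achieve that operator's preconditions, apply it."""
    for goal in goals - state:
        op = next((o for o in OPS if goal in o.add), None)
        if op is None:
            return None                       # no operator reduces this difference
        state = achieve(state, op.pre, plan)  # subgoal: the preconditions
        if state is None:
            return None
        state = (state - op.delete) | op.add  # apply the operator
        plan.append(op.name)
    return state

plan = []
achieve({"at-door"}, {"has-bananas"}, plan)
print(plan)   # -> ['walk-to-box', 'climb-box', 'grab-bananas']
```

Even this toy exposes the method’s fragility: the recursion will happily undo goals it has already achieved when operators interact, and the real GPS needed increasingly elaborate machinery to manage such interactions as domains grew.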

GPS was genuinely impressive. It could solve simple problems in propositional logic, symbolic integration, and a number of other domains when given appropriate operator descriptions. And Newell and Simon’s analysis of its behavior in comparison with “think-aloud” protocols from human subjects --- records of people talking through their reasoning as they solved problems --- suggested real structural similarities between the way GPS worked and the way people actually approached problems. This was the foundation of Newell and Simon’s “physical symbol system hypothesis,” their claim that intelligent behavior, in both humans and machines, consists in the manipulation of symbolic structures according to rules, and that any physical system capable of doing this is capable of intelligent behavior.

The limitations of GPS were, in retrospect, equally revealing. It worked in toy domains with small, well-defined state spaces and precisely specified operators, but it scaled very poorly: as the state space grew larger, the combinatorial explosion of possibilities overwhelmed its search capacity. Real-world problems rarely come with clean operator descriptions; they are embedded in messy, ambiguous contexts that resist formalization. GPS --- like many of its successors in classical AI --- was very good at what it could do, and what it could do turned out to be a narrow slice of what human intelligence actually involves.

ELIZA: The Illusion of Understanding

If the Logic Theorist and GPS represented the high-water mark of early AI optimism, a program written in the mid-1960s by Joseph Weizenbaum at MIT offered a very different kind of lesson. ELIZA, completed in 1966, was designed to demonstrate something Weizenbaum considered philosophically interesting but practically trivial: that the appearance of intelligent conversation could be produced by a program of remarkable simplicity, using nothing more than pattern matching and scripted responses.

ELIZA’s most famous script, DOCTOR, was designed to simulate a Rogerian psychotherapist --- a style of therapy based on reflecting the patient’s own statements back to them in the form of questions. The technique was ideal for ELIZA because it required almost no actual understanding: if the user wrote “I am feeling sad,” ELIZA could respond “Why do you say you are feeling sad?” without needing to know anything about sadness, psychology, or the user’s situation. When the user’s input matched a pattern like “I am [X],” ELIZA would respond “Why do you say you are [X]?” The script contained a few hundred such patterns, along with fallback responses for inputs that matched nothing.
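The mechanism is simple enough to reproduce in miniature. The sketch below implements the pattern-and-template idea with a few invented rules; the actual DOCTOR script was larger and also reflected pronouns (turning “my” into “your”), but the principle is the same.

```python
import re

# A few invented rules in the DOCTOR style: (pattern, response template).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {}."),
]
FALLBACKS = ["Please go on.", "Can you say more about that?"]

def respond(text, turn=0):
    """First matching pattern wins; otherwise cycle through fallback replies."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return FALLBACKS[turn % len(FALLBACKS)]

print(respond("I am feeling sad"))        # -> Why do you say you are feeling sad?
print(respond("The weather is awful."))   # -> Please go on.
```

Nothing in this code models sadness, the weather, or the user; the conversation is an artifact of string substitution.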

The result was a program that many users, encountering it for the first time, found eerily convincing. Weizenbaum was disturbed to discover that his secretary asked him to leave the room so she could have a private conversation with it; that psychiatrists seriously suggested it might have therapeutic applications; and that users consistently attributed to it understanding, empathy, and insight that it manifestly did not possess. He wrote a book, “Computer Power and Human Reason,” to argue that this tendency to attribute intelligence to programs that merely manipulated symbols was a dangerous category error --- and that AI researchers, as much as anyone, needed to be honest about the difference between the appearance of intelligence and the real thing. ELIZA was, in Weizenbaum’s view, a cautionary tale masquerading as a demonstration.

“ELIZA taught the field a lesson it has had to relearn many times since: the appearance of intelligent behavior and the reality of intelligent understanding are very different things, and the gap between them is treacherous.”

SHRDLU: Intelligence in a Microworld

Where ELIZA demonstrated how little understanding was needed to simulate conversation in the right context, Terry Winograd’s SHRDLU, completed in 1970 as a doctoral thesis at MIT, demonstrated how much genuine understanding was possible --- in the right, carefully chosen context. SHRDLU was a program that could understand and respond to natural language instructions about a simulated “blocks world”: a virtual table populated by geometric shapes of different colors and sizes, which SHRDLU could manipulate, describe, and reason about.

The results were impressive. SHRDLU could parse complex natural language sentences, understand pronouns and ellipsis, reason about the state of the world and the effects of actions, ask clarifying questions when an instruction was ambiguous, and explain its own reasoning. It could handle instructions like “Place the large red block on top of the blue pyramid, but first make sure nothing is on top of it” and execute them correctly, explaining its actions as it went. In its microworld, SHRDLU appeared to understand natural language and physical reasoning at a level that no previous program had approached.
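A few lines of code show how a microworld makes this kind of competence possible. The sketch below is a hypothetical, drastically reduced blocks world --- just a record of what rests on what, plus a single precondition check --- with none of SHRDLU’s parsing, planning, or dialogue.

```python
class BlocksWorld:
    """A toy blocks world: `on` maps each block to whatever it rests on."""

    def __init__(self, on):
        self.on = dict(on)

    def is_clear(self, thing):
        if thing == "table":
            return True                          # the table always has room
        return all(support != thing for support in self.on.values())

    def put_on(self, block, target):
        # The check behind "first make sure nothing is on top of it":
        # both the block being moved and its destination must be clear.
        if not self.is_clear(block):
            raise ValueError(f"something is on top of {block}; clear it first")
        if not self.is_clear(target):
            raise ValueError(f"{target} is not clear")
        self.on[block] = target

world = BlocksWorld({"red-block": "table", "blue-block": "table",
                     "green-block": "red-block"})
world.put_on("green-block", "table")     # clear the red block first...
world.put_on("red-block", "blue-block")  # ...then the instruction succeeds
print(world.on)
```

Because the world contains nothing but blocks and a table, “understanding” an instruction reduces to a handful of checks like these --- which is precisely why the approach did not survive contact with richer domains.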

The lesson of SHRDLU was subtler than it appeared. Its success depended entirely on the carefully circumscribed nature of its domain: the blocks world was simple enough that every relevant concept could be precisely defined and every relevant inference pre-specified. As soon as researchers tried to extend SHRDLU’s approach to richer, more open-ended domains, performance collapsed. The program’s apparent understanding was, in reality, a kind of brittle completeness --- it worked because every case had been thought of, not because it had genuine comprehension. SHRDLU illustrated, perhaps more clearly than any other program of the era, the limits of the “microworld” approach to AI: you could build a system that fully understood a toy world, but building a system that understood the real world was a qualitatively different and vastly harder problem.

The Perceptron: The Neural Alternative

While Newell, Simon, McCarthy, and Minsky were developing symbolic AI --- programs that represented knowledge as explicit symbolic structures and intelligence as rule-governed manipulation of those structures --- a parallel research tradition was exploring a very different approach, inspired not by logic and formal reasoning but by the biology of the brain. The key figure in this tradition, in the late 1950s, was Frank Rosenblatt, a psychologist and computer scientist at the Cornell Aeronautical Laboratory.

In 1957, Rosenblatt introduced the perceptron: a mathematical model of a neuron that could learn to classify input patterns by adjusting the weights of connections between its inputs and its output. The perceptron was not a computer program in the conventional sense; it was an algorithm --- a learning procedure --- that, given a set of labelled training examples, would iteratively adjust the connection weights until it could correctly classify the examples. Rosenblatt showed that the perceptron learning algorithm was guaranteed to converge --- to find a set of weights that correctly classified all training examples --- provided the examples were linearly separable (that is, provided there existed a linear boundary that separated the classes).
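The learning rule itself fits in a dozen lines. Below is a minimal illustrative sketch, trained on logical AND --- a linearly separable function, so Rosenblatt’s convergence theorem guarantees the loop terminates.

```python
def train_perceptron(examples, epochs=100, lr=1.0):
    """Perceptron learning rule: nudge the weights on every mistake."""
    n = len(examples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in examples:
            # Output 1 if the weighted sum crosses the threshold, else 0.
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != target:
                mistakes += 1
                delta = lr * (target - pred)               # +1 or -1
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
        if mistakes == 0:        # a full error-free pass: converged
            break
    return w, b

# Logical AND is linearly separable, so training is guaranteed to converge.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(AND))
```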

The perceptron generated enormous excitement. For the first time, a machine was not just following rules programmed into it; it was learning from data, adjusting its own behavior in response to experience. The New York Times reported that the Navy was showing it off as “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” Rosenblatt himself made confident claims about the perceptron’s potential that, with hindsight, dramatically outran the evidence.

The backlash was decisive. In 1969, Minsky and Papert published “Perceptrons,” a mathematical analysis of what single-layer perceptrons could and could not do. Their central result was that a single-layer perceptron could not learn to compute the logical XOR function --- the exclusive-or, which returns true if its inputs differ and false if they are the same. Since XOR was a simple and entirely natural logical operation, this was a significant limitation. More broadly, Minsky and Papert showed that single-layer perceptrons could only learn linearly separable functions, which excluded many practically important classification problems. The book was widely read as a refutation of the neural network approach to AI, and funding for neural network research largely dried up for more than a decade --- the first of the field’s periodic swings between enthusiasm and disillusionment.
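Minsky and Papert’s point can be made concrete by reusing the train_perceptron sketch above: on XOR, no weights and threshold can classify all four examples, so training simply exhausts its epoch budget.

```python
# XOR is not linearly separable: no line separates {(0,1), (1,0)} from
# {(0,0), (1,1)}, so the perceptron rule never finds an error-free pass.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
w, b = train_perceptron(XOR)
wrong = sum((1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0) != t
            for x, t in XOR)
print(f"misclassified after training: {wrong}")   # always at least 1
```

Adding a hidden layer removes the limitation, but no one knew how to train multi-layer networks efficiently until backpropagation was popularized in the 1980s --- part of why the book’s verdict stuck for so long.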

Reflection: The early AI programs --- the Logic Theorist, GPS, ELIZA, SHRDLU, the perceptron --- collectively established both the promise and the persistent difficulty of artificial intelligence. They showed that machines could perform tasks previously thought to require intelligence. They also showed, with equal clarity, that performing those tasks in narrow, well-defined domains did not imply the ability to generalize. The gap between narrow competence and general intelligence would define the field’s central challenge for the next six decades.

Section 4: Early Applications, Optimism, and Institutional Support

Game Playing and Early Demonstrations

Some of the most publicly visible early AI work was in game playing, a domain that combined the appeal of competition with the technical virtue of being fully specified: games have exact rules, clear winning conditions, and no ambiguity about what counts as intelligent play. Arthur Samuel at IBM spent much of the 1950s developing a checkers-playing program that used a technique he called “machine learning” --- a term he coined in his 1959 paper on the program --- to improve its play through self-generated experience. The program soon played better than Samuel himself, and in 1962 it famously defeated Robert Nealey, a self-proclaimed checkers master --- results Samuel presented as evidence that machines could learn to improve performance at tasks beyond the explicit knowledge of their programmers. It was a landmark demonstration, and the term “machine learning” that Samuel attached to it would eventually become the name of one of AI’s most important subfields.

Chess was a more complex target, but early chess programs --- building on Shannon’s 1950 paper outlining how a chess-playing program might be designed --- generated consistent interest and investment throughout the 1950s and 1960s. The programs were not strong players by human standards, but their existence challenged the intuition that chess required genuine intelligence, and they provided useful testing grounds for search algorithms, evaluation functions, and other techniques that would find broader application.
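The search-and-evaluate core that Shannon outlined --- look ahead a fixed number of moves, then score the resulting positions with a static evaluation function --- is the minimax algorithm, sketched generically below. The moves, apply_move, and evaluate arguments are placeholders that a real chess or checkers program would supply.

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    """Look ahead `depth` plies, then fall back on static evaluation."""
    options = moves(state)
    if depth == 0 or not options:
        return evaluate(state)                 # leaf: score the position
    scores = (minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, evaluate) for m in options)
    return max(scores) if maximizing else min(scores)

# Toy usage: players alternately add 1 or 2 to a counter for four plies;
# the maximizing player wants the final total to be divisible by 3.
print(minimax(0, 4, True,
              moves=lambda s: [1, 2],
              apply_move=lambda s, m: s + m,
              evaluate=lambda s: 1 if s % 3 == 0 else -1))
# -> -1: the minimizing player can always steer the total away from a
#    multiple of 3, a fact the four-ply search discovers by itself.
```

Samuel’s innovation was to leave this skeleton intact but let the machine adjust the evaluation function’s parameters from experience --- the step that turned game-tree search into machine learning.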

Machine Translation: The First Big Disappointment

Not all early AI applications were greeted with sustained enthusiasm. Machine translation --- the automatic translation of text from one natural language to another --- was one of the most heavily funded AI applications of the 1950s, driven in part by Cold War interest in translating Russian scientific literature into English. The Georgetown-IBM experiment of 1954 demonstrated the automatic translation of more than sixty Russian sentences into English and generated optimistic predictions that machine translation would be a solved problem within three to five years.

It was not. The 1966 ALPAC (Automatic Language Processing Advisory Committee) report, commissioned by the US government after a decade of disappointing results and escalating costs, concluded that machine translation was slower, less accurate, and twice as expensive as human translation, and recommended that funding be sharply reduced. The report was controversial and its prescriptions arguably too severe, but its underlying diagnosis was correct: translating natural language required an understanding of meaning, context, and world knowledge that the symbolic AI approaches of the 1950s and 1960s could not provide. Machine translation would not become practically useful until statistical methods in the 1990s and neural methods in the 2010s provided new approaches to the problem.

The AI Labs: MIT, Carnegie Mellon, Stanford

The institutional infrastructure of early AI research was established in the late 1950s and early 1960s at three universities that would remain the field’s dominant centers for decades. At MIT, McCarthy and Minsky established the MIT Artificial Intelligence Laboratory in 1959 (initially as the AI Group within the Research Laboratory of Electronics), which would become one of the most productive research environments in computing history, producing not just AI programs but programming languages (LISP), foundational work in robotics, computer vision, and natural language processing, and --- through Project MAC, the umbrella organization the AI Group was part of for much of the 1960s --- time-sharing operating systems such as Multics.

At Carnegie Mellon, Newell and Simon built a research culture focused on cognitive simulation and symbolic reasoning that produced a distinguished tradition of work in problem solving, knowledge representation, and expert systems. The CMU AI program was, from its beginning, more psychologically oriented than MIT’s --- more interested in understanding human cognition than in building systems that could perform any particular task --- and this orientation gave it a distinctive character that is still visible today.

At Stanford, McCarthy’s arrival in 1962 led to the founding of the Stanford Artificial Intelligence Laboratory (SAIL) in 1963, which became a center for work on formal logic, knowledge representation, and robotics. Stanford was also home to one of the most ambitious and influential early AI projects: the DENDRAL program, which used AI techniques to infer the molecular structure of organic compounds from mass spectrometry data. DENDRAL, developed from the mid-1960s by Edward Feigenbaum, Joshua Lederberg, and their colleagues, was one of the first successful expert systems --- programs that encoded the specialized knowledge of human experts and used it to perform expert-level tasks in a specific domain. It pointed toward a research direction --- knowledge-based AI, expert systems --- that would dominate much of the 1970s and 1980s.

Government Funding and Cold War Context

The early growth of AI research was substantially funded by the United States government, and particularly by the Defense Advanced Research Projects Agency (DARPA, known until 1972 as ARPA). The Cold War context was pervasive: the Soviet Union’s launch of Sputnik in 1957 had galvanized American investment in science and technology, and AI was seen as a potentially strategic capability --- both for military applications (autonomous weapons, codebreaking, logistics optimization) and for the broader project of maintaining American technological supremacy. DARPA’s funding was generous, its oversight light, and its time horizons long: AI researchers in the 1960s had resources and freedom that are difficult to imagine in the more constrained funding environments of later decades.

This combination of generous funding, intellectual excitement, and relative freedom from accountability created the conditions for both rapid progress and extravagant overconfidence. Researchers made predictions that, in retrospect, were wildly optimistic: Herbert Simon predicted in 1965 that within twenty years machines would be capable of doing any work a man can do. Minsky told Life magazine in 1970 that within three to eight years we would have a machine with the general intelligence of an average human being. These predictions were not idle boasts; they were the honest assessments of brilliant people working at the frontier of a genuinely exciting field, convinced that the remaining obstacles were primarily engineering challenges rather than fundamental conceptual ones. They were wrong, and their wrongness would have consequences for the field’s funding and reputation in the years to come.

“The optimism of AI’s first decade was not stupidity. It was the natural response of brilliant people who had built programs that seemed, for the first time, to reason. The difficulty was that reasoning in a microworld is not the same as reasoning in the world.”

Reflection: The institutional infrastructure built in the late 1950s and 1960s --- the AI labs at MIT, Carnegie Mellon, and Stanford; the DARPA funding; the community of researchers united by the Dartmouth vision --- provided the organizational foundation for everything that followed. Even when the first AI winter came, these institutions survived and preserved the knowledge, the techniques, and the culture that would make the eventual breakthroughs possible.

Conclusion: A Field With an Agenda

In the decade between Turing’s 1950 paper and the early 1960s, artificial intelligence was transformed from a philosophical speculation into a scientific field. It acquired a name, a founding myth, a community of researchers, an institutional infrastructure, and a clear --- if ultimately overambitious --- agenda. The early programs it produced --- the Logic Theorist, GPS, ELIZA, SHRDLU, the perceptron, Samuel’s checkers player --- were genuine intellectual achievements, and the excitement they generated was not baseless. Machines were, for the first time, performing tasks that had previously been the exclusive province of human intelligence.

But the field had also accumulated, without always recognizing them, a set of foundational problems that would require decades to address. The brittleness of symbolic programs --- their dependence on precisely specified domains and complete operator descriptions --- pointed toward the challenge of common-sense knowledge: the vast, implicit, contextual understanding of the world that humans bring to every cognitive task and that proved extraordinarily difficult to encode in any formal system. The limitations of the perceptron, exposed by Minsky and Papert, pointed toward the challenge of learning: building systems that could acquire knowledge from data rather than having it programmed in. And the gap between ELIZA’s superficial conversational competence and SHRDLU’s brittle microworld understanding pointed toward the challenge of genuine language comprehension: understanding not just the syntactic structure of sentences but their meaning, their context, and the intentions behind them.

These problems would not be solved in the 1960s. They would not be fully solved in the 1970s or 1980s either. The history of AI from the mid-1960s onward is the history of a field repeatedly discovering that the remaining distance to human-level intelligence was larger than it had appeared --- and repeatedly finding new ways to make progress despite that discovery. It is a history of booms and busts, of summer and winter, of paradigm shifts and paradigm restorations. And it is, ultimately, a history of success: of a field that started from Turing’s question and, through seventy years of effort, arrived at systems whose capabilities would have astonished the founders of the Dartmouth workshop.

“The 1950s and 1960s established what AI was trying to do. The next sixty years were the story of discovering how hard it actually was --- and doing it anyway.”

───

Next in the Series: Episode 7

AI Winters and Expert Systems --- Setbacks, Recalibrations, and the Limits of Symbolic AI

The optimism of the 1950s and 1960s could not survive contact with the fundamental difficulties of general intelligence. In Episode 7, we trace the first AI winter of the 1970s --- the funding cuts, the critical reports, and the sobering realization that the remaining problems were harder than anyone had thought. We examine the partial recovery built on expert systems: rule-based programs that encoded specialized human knowledge and achieved genuine commercial success in the 1980s, before their own limitations triggered a second, deeper winter. And we ask what the boom-and-bust cycle of early AI tells us about the nature of the challenge: why human-level intelligence is not a single engineering problem but a thousand different ones, and why progress in AI has so often been mistaken for proximity to a goal that remains, decade after decade, just out of reach.

--- End of Episode 6 ---