```mermaid
flowchart TB
AI[Artificial Intelligence] --> Sym[Symbolic AI]
AI --> ML[Machine Learning]
ML --> Class[Classical ML — SVMs, RFs, GBMs]
ML --> DL[Deep Learning]
DL --> CNN[CNNs — vision]
DL --> RNN[RNNs / LSTMs — sequence]
DL --> Trans[Transformers — attention-based]
Trans --> FM[Foundation Models]
FM --> LLM[Large Language Models]
FM --> VLM[Vision-Language Models]
FM --> Agents[Agentic systems]
style AI fill:#003F66,stroke:#003F66,color:#fff
style FM fill:#006DAE,stroke:#006DAE,color:#fff
style Agents fill:#C8102E,stroke:#C8102E,color:#fff
```
Chapter 1 — Introduction: the adoption-value paradox
Two observations frame everything that follows. First, AI adoption has crossed a decisive threshold in 2024–2026. Second, value capture has not. The gap between the two is the central managerial puzzle of the era — and, this chapter argues, the most economically important productivity puzzle of the decade.
Chapter overview
This chapter does five things. §1.2 lays out the empirical puzzle that organises the book — the gap between AI adoption rates (around 78% of organisations by 2025) and AI value capture (around 5–6% of organisations report material EBIT impact). §1.3 situates this puzzle in the longer Solow-paradox lineage running from the personal-computer wave of the 1980s through the internet wave of the 1990s to the AI wave today. §1.4 defines what this book treats as “AI in business” and explains why we adopt an inclusive definition spanning rule-based expert systems through agentic foundation models. §1.5 develops Iansiti and Lakhani’s runtime argument as the book’s organising frame. §1.6 introduces a simple formal model of the adoption-value gap that we use throughout the rest of the book.
Reading this chapter
This is a graduate-level introduction. It assumes basic familiarity with managerial economics, an undergraduate course in production functions, and casual exposure to recent AI developments. It is heavier on conceptual framing than on case material; the cases come in Chapters 6–12 and 18.
The empirical puzzle
The McKinsey State of AI series
The McKinsey State of AI survey (McKinsey & Company, 2025), fielded annually since 2017, is the longest continuous series of large-sample enterprise AI adoption data. Its 2025 wave reports adoption figures that would have been called fanciful only three years earlier.
- 78% — use AI in ≥1 function (McKinsey 2025)
- 71% — use generative AI specifically (up from 33% in 2023)
- 5–6% — are AI high performers — meaningful EBIT impact
- ~1% — claim full AI maturity (executives’ self-report)
The headline figures are based on a survey of \(n = 1{,}491\) executives across 101 countries fielded in February–March 2025, stratified across industries and regions (McKinsey & Company, 2025). The combined picture is unambiguous: adoption has gone mainstream while value capture has not.
Methodology and its limits
A graduate reader should immediately ask: what does “use AI in ≥1 function” actually measure? The McKinsey instrument asks executives to confirm AI deployment across 27 candidate use cases spanning marketing, sales, product development, service operations, software engineering, supply chain, manufacturing, risk and compliance, strategy and finance, HR, and IT. A “yes” on any one cell counts toward the headline 78%.
Three methodological caveats follow:
- Self-report bias. Executives have professional incentive to over-report AI adoption — McKinsey notes a 4–6 percentage point gap between executive and operational-staff reports of the same firm.
- Definitional drift. What counts as “AI” has expanded substantially since 2017. A firm using vendor-supplied software with embedded ML (e.g., a CRM with a propensity model) now plausibly counts; in 2017 it might not have.
- Selection. McKinsey’s panel is professional-services-firm-curated; the population of firms surveyed skews toward larger and more digitally mature organisations than a true random sample of the global firm distribution would.
The right reading of the 78% figure is therefore: at least 78% of large, digitally mature firms have some AI somewhere in their operations, by an inclusive 2025 definition. The same caveats apply, in roughly mirror form, to the 5–6% high-performer figure: it captures executives’ self-reported confidence in EBIT-level impact, with no audited financial verification.
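The gap between headline adoption and depth of adoption is partly a mechanical consequence of the inclusive-OR construction. A toy simulation makes the point (all numbers here are invented; only the 27-cell grid mirrors the survey instrument described above):

```python
import random

random.seed(0)  # deterministic toy example

N_FIRMS, N_CELLS = 1000, 27   # 27 cells mirrors the survey instrument
p_cell = 0.05                 # invented: each firm deploys AI in any given
                              # use-case cell with 5% probability

grid = [[random.random() < p_cell for _ in range(N_CELLS)]
        for _ in range(N_FIRMS)]

# Headline measure: "uses AI in >= 1 function" (any yes across 27 cells).
headline = sum(any(row) for row in grid) / N_FIRMS
# Depth measure: average share of cells with AI actually deployed.
mean_depth = sum(sum(row) for row in grid) / (N_FIRMS * N_CELLS)

print(f"Headline adoption: {headline:.0%}")        # ~75%
print(f"Mean cell-level depth: {mean_depth:.0%}")  # ~5%
```

With shallow 5% adoption in every cell, the inclusive-OR headline is \(1 - 0.95^{27} \approx 75\%\): a near-80% headline is arithmetically consistent with very little depth, which is exactly the caveat the 78% figure carries.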
The Stanford AI Index complementary evidence
The 2025 AI Index from Stanford HAI (Stanford HAI, 2025) provides triangulating evidence with different methodology. Two findings are particularly important for this chapter:
- Inference cost decline of approximately 280× in under three years, measured as cost per million tokens for GPT-3.5-class output quality. This is the single most important supply-side fact about the period.
- US private AI investment of $109.1B in 2024 vs China’s $9.3B. Investment-side evidence of concentration (which we take up in Chapter 5).
The AI Index is methodologically more transparent than the McKinsey survey — its raw data and source attributions are public — but it measures inputs to AI deployment rather than outcomes from it. The two sources together give us the period’s supply curve (collapsing) and the demand curve (rising rapidly), without yet being able to say definitively whether marginal value is rising or falling.
The Gartner abandonment forecasts
Gartner’s 2024 forecast that 30% of generative AI projects would be abandoned after proof of concept by end of 2025 has been broadly vindicated. The mechanism is well-understood: median pilot cost has risen to $5–20 million per deployment for large enterprises, while time-to-measurable-value has stayed at 9–18 months. Gartner’s 2025 follow-up extends the same finding to agentic AI: more than 40% of agentic AI projects will be cancelled by end of 2027.
Deloitte’s Q4 2024 State of Generative AI in the Enterprise (Deloitte, 2024) (n ≈ 2,800 leaders globally) finds two-thirds of executives reporting that no more than 30% of their experiments will scale within 3–6 months. The triangulation is consistent: AI projects are easy to start and hard to scale.
The McKinsey 2026 update: top quintile pulling away
McKinsey’s January 2026 AI Transformation Manifesto (McKinsey & Company, 2026) reports that the gap between leaders and laggards is widening. Top-quintile firms are now capturing 16–30% productivity improvements in functional areas (software engineering, customer service, operations, knowledge work), and 31–45% improvements in software quality. For the bulk of firms in the middle three quintiles, productivity gains remain in the low single digits.
This is the most economically consequential pattern of the post-2022 wave: AI’s mean productivity effect at the firm level is small, but its variance is enormous. The top decile is pulling away; the bulk of firms are not.
The Solow paradox lineage
“You can see the computer age everywhere…”
In 1987, Robert Solow famously remarked that “you can see the computer age everywhere but in the productivity statistics” (Solow, 1987). The personal computer had become ubiquitous in US offices by the mid-1980s, but aggregate measured productivity growth had been slowing since 1973 and showed no sign of reversal. The remark, half off-handed, named what became known as the productivity paradox: a wedge between visible technological adoption and measured aggregate productivity outcomes.
The Brynjolfsson–Hitt resolution
Brynjolfsson and Hitt (1996) reformulated the paradox using firm-level rather than aggregate data. They estimated production functions of the form
\[ \ln Y_i = \alpha + \beta_K \ln K_i + \beta_L \ln L_i + \beta_C \ln C_i + \varepsilon_i, \]
where \(Y_i\) is output, \(K_i\) is non-IT capital, \(L_i\) is labour, and \(C_i\) is computer capital, on a sample of 367 large US firms. They found \(\beta_C\) substantially larger than the cost-of-capital share — implying that, at the firm level, IT investments were producing excess returns even as aggregate productivity was flat.
The reconciliation: aggregate measurement was missing IT’s contribution because the gains were concentrated in firms that had also invested in complementary intangibles — workflow redesign, training, organisational restructuring, product innovation. Firms that bought the computers but did not redesign the workflows captured no gains. The aggregate average was being pulled down by the long tail of partial-investment firms.
This finding generalises. David (1990) had argued, by analogy, that the electric-motor productivity paradox of the 1900s–1920s was resolved similarly: firms that bought electric motors but kept their factory layouts unchanged (with line shafts radiating out from a central steam engine) captured almost none of the gains; firms that redesigned their factory layouts around distributed motors captured most. The lag between technology availability and productivity gain ran roughly forty years.
The current AI productivity slowdown
The same pattern appears to be repeating. US labour-productivity growth averaged 1.4% per year in 2010–2019, slowed to roughly 1.1% per year in 2019–2023, and only began to recover toward 1.7–2.0% in 2024 (BLS data). Acemoglu (2024) estimates total-factor-productivity gains from AI at approximately 0.66% over a decade — at the low end of widely circulated figures. Goldman Sachs’ 2023 estimate of $4.4 trillion in annual generative-AI value sits at the high end. The two estimates differ by roughly two orders of magnitude.
The right reading is that we cannot yet distinguish empirically between an AI productivity boom and an AI productivity damp squib. The McKinsey microeconomic evidence (top-quintile firms capturing 16–30% function-level gains) is real; the macroeconomic evidence will be visible only with a five-to-ten-year lag, exactly as the PC and internet booms played out.
📚 Further reading
For the productivity paradox in its historical context, the canonical readings are Brynjolfsson and Hitt (1996), David (1990), and Triplett and Bosworth (2003). For the current AI slice of the debate, see Acemoglu (2024) and the Brynjolfsson, Rock, and Syverson (2021) J-curve framework that we develop in Chapter 15.
What this book treats as “AI in business”
The technology stack
A precise definition matters because where the boundary is drawn drives the empirical findings. The nesting of technical concepts (artificial intelligence at the root; symbolic AI and machine learning below it; deep learning, transformers, foundation models, and agentic systems nested in turn) is shown in the flowchart that opens this chapter.
We use AI in business inclusively to mean any computational system that performs tasks normally requiring human cognition and is deployed in a commercial or organisational context. This deliberately covers:
- Rule-based expert systems (DENDRAL, MYCIN, XCON, INTERNIST-1, PROSPECTOR) — historically important even if rarely called “AI” in the deep-learning era. See Buchanan and Shortliffe (1984) for the canonical methodological treatment.
- Classical statistical learning (FICO Falcon fraud detection, Amazon collaborative filtering after Linden, Smith, and York (2003), Netflix Prize matrix factorisation (Koren, Bell, and Volinsky, 2009), Google PageRank (Brin and Page, 1998)) — quietly powering most enterprise AI value through 2020.
- Deep learning (AlexNet (Krizhevsky, Sutskever, and Hinton, 2012), ResNet (He et al., 2016), AlphaGo (Silver et al., 2016), AlphaFold (Jumper et al., 2021), BERT (Devlin et al., 2019)) — the 2012–2022 wave.
- Foundation models and generative AI (GPT-3 (Brown et al., 2020), Llama, DeepSeek (DeepSeek-AI, 2024), Qwen, Mistral, Claude, Gemini) — the post-2022 wave; conceptual umbrella defined by Bommasani et al. (2021).
- Agentic systems (AutoGPT, OpenAI Operator, Salesforce Agentforce, Anthropic Computer Use, Cognition Devin) built on the ReAct architecture (Yao et al., 2023) and tool-use methodology (Schick et al., 2023) — the 2024+ wave.
Why an inclusive definition
Drawing the line generously matters because most of the value in operational enterprises today still comes from the second and third categories — credit scoring, demand forecasting, recommender systems, image classification — even as headlines and budgets concentrate on the fifth. A textbook that treated only generative AI would mislead a graduate student about where economic value actually lives.
The data backs this up. McKinsey 2025 (McKinsey & Company, 2025) estimates that of the roughly $200B in attributable enterprise AI economic value globally as of 2024, classical ML and deep-learning-classical-task deployments (fraud detection, recommendation, forecasting, image classification) account for roughly 60–65% of the total; generative and agentic AI account for the balance, and that share is rising. By 2027 the balance may flip, but the older deployments will not vanish — they are infrastructure.
Five waves and their commercial signatures
| Wave | Period | Commercial signature | Reading |
|---|---|---|---|
| Pre-ML / expert systems | 1965–1995 | XCON saved DEC ~$25M/year; FICO score (1989) | Buchanan and Shortliffe (1984); Lindsay et al. (1980) |
| Statistical ML | 1995–2010 | FICO Falcon (1992); Amazon item-to-item (1998); Netflix Prize (2009) | Vapnik (1995); Breiman (2001); Koren, Bell, and Volinsky (2009) |
| Deep learning | 2012–2022 | AlexNet (2012); BERT in Search (2019); Tesla Autopilot | Krizhevsky, Sutskever, and Hinton (2012); He et al. (2016); Devlin et al. (2019) |
| Foundation models | 2017–2024 | ChatGPT (Nov 2022); GPT-4 (Mar 2023); Microsoft Copilot (Nov 2023) | Vaswani et al. (2017); Brown et al. (2020); Hoffmann et al. (2022) |
| Agentic | 2024–present | Salesforce Agentforce; Operator; Devin; MCP | Yao et al. (2023); Schick et al. (2023) |
Chapter 2 develops each wave in depth.
Iansiti and Lakhani: the runtime argument
The runtime metaphor
> AI is becoming the universal engine of execution. As digital technology increasingly shapes “all of what we do” and enables a rapidly growing number of tasks and processes, AI is becoming the new operational foundation of business — the core of a company’s operating model, defining how the company drives the execution of tasks. AI is not only displacing human activity, it is changing the very concept of the firm.
>
> — Iansiti and Lakhani (2020), Ch. 1
Iansiti and Lakhani’s central claim — that AI is becoming the “runtime” of the firm in the sense Satya Nadella means when he says “AI is the runtime that is going to shape all of what we do” — frames the textbook’s argument. When a business is driven by AI, software instructions and algorithms make up the critical path in the way the firm delivers value. Humans may have designed the operational systems, but computers are doing the work in real time: setting a price on Amazon, recommending a film on Netflix, qualifying a borrower for an Ant Group loan, painting the digital Rembrandt at ING.
Digital scale, scope, and learning
Three properties characterise firms whose runtime is software-mediated rather than human-mediated:
- Digital scale: serving the next user costs essentially zero. Marginal cost of an additional Netflix subscriber, an additional Amazon shopper, an additional Ant Group borrower is microscopic relative to a traditional bank’s marginal customer.
- Digital scope: the same operational backbone supports many products and services. Amazon’s recommendation infrastructure powers product, ad, video, and grocery recommendations; the same fraud detection system protects payments, marketplace transactions, and AWS billing.
- Digital learning: every interaction generates labelled data that improves future predictions. The system is more accurate tomorrow than today, and more accurate next week than tomorrow.
These three properties produce increasing returns to scale in the strict economic sense (Arthur, 1989) — average cost falls and average quality rises as the firm grows. This contrasts sharply with the U-shaped average cost curve of traditional firms, where coordination costs rise faster than revenue beyond some efficient scale.
The Ant Group archetype
Iansiti and Lakhani open Competing in the Age of AI with a contrast that has become the canonical illustration:
| Firm | Customers (~) | Employees (~) | Customers per employee |
|---|---|---|---|
| Industrial and Commercial Bank of China (ICBC) | 700 million | 425,000 | 1,650 |
| Ant Group | 700 million | 10,000 | 70,000 |
The 40× employee differential is not because Ant’s bankers are 40× more productive. It is because Ant Group’s operational critical path is run by software, with humans designing, supervising, and improving the system rather than executing inside it. The decisions that ICBC’s tellers, branch managers, and credit officers execute by hand are executed at Ant Group by algorithms. The runtime is different.
The Ant Group case also illustrates the regulatory failure mode of frictionless impact (Iansiti–Lakhani Rule 4 in Chapter 5). The Chinese regulator’s 2020 intervention — first the suspended IPO in November 2020, then forced restructuring in 2021–2023 — was the single largest regulatory action against a digital firm in modern financial history. We return to this case in detail in Chapters 3 and 5.
A formal model of the adoption-value gap
The puzzle this book examines admits of a simple formal treatment that clarifies the intuition.
Setup
Consider a firm with output \(Y\) produced from three inputs: AI capital \(A\), complementary intangible capital \(C\), and a residual factor \(X\) comprising labour, physical capital, and traditional inputs. Assume a Cobb–Douglas form:
\[ Y = X^{1-\alpha-\beta} \cdot A^{\alpha} \cdot C^{\beta} \cdot e^{\varepsilon}, \]
where \(\alpha, \beta > 0\) and \(\alpha + \beta < 1\). The exponent \(\alpha\) captures the elasticity of output with respect to AI capital (algorithms, models, compute, data infrastructure); \(\beta\) captures the elasticity with respect to complementary intangibles (workflow redesign, training, governance, organisational structure); \(\varepsilon\) is an idiosyncratic shock.
Two firms
Compare two firms making AI investments at the same time:
Firm L (laggard) invests \(\Delta A\) in AI capital but holds complementary intangibles fixed at the pre-investment level \(C_0\). Output rises by approximately
\[ \frac{\Delta Y_L}{Y_0} \approx \alpha \cdot \frac{\Delta A}{A_0}. \]
Firm H (high performer) invests both \(\Delta A\) in AI capital and \(\Delta C\) in complementary intangibles, with \(\Delta C / C_0 \approx \Delta A / A_0 = r\). Output rises by approximately
\[ \frac{\Delta Y_H}{Y_0} \approx (\alpha + \beta) \cdot r. \]
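Both approximations follow from taking logs of the production function and differentiating, holding \(X\) and \(\varepsilon\) fixed:
\[ \Delta \ln Y = \alpha \, \Delta \ln A + \beta \, \Delta \ln C \approx \alpha \cdot \frac{\Delta A}{A_0} + \beta \cdot \frac{\Delta C}{C_0}. \]
Setting \(\Delta C = 0\) recovers Firm L’s expression; setting \(\Delta C / C_0 = \Delta A / A_0 = r\) recovers Firm H’s. The approximation is first-order, so it modestly overstates both gains for large \(r\).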
If \(\alpha = 0.05\) and \(\beta = 0.10\) (illustrative values consistent with Brynjolfsson and Hitt (1996)), and both firms invest 50% more in AI capital (\(r = 0.5\)), then:
- Firm L sees a \(0.05 \times 0.5 = 2.5\%\) output gain.
- Firm H sees a \((0.05 + 0.10) \times 0.5 = 7.5\%\) output gain.
The 3× ratio between the two firms is roughly consistent with the McKinsey 2026 finding that top-quintile firms capture 16–30% productivity gains while the bulk capture single digits (McKinsey & Company, 2026).
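The first-order numbers above can be checked against the exact Cobb–Douglas form; a minimal sketch using the illustrative \(\alpha\), \(\beta\), and \(r\) from the text:

```python
# Numeric check of the two-firm comparison in this section, using the
# exact Cobb-Douglas form rather than the first-order approximation.

def output_gain(alpha: float, beta: float, g_A: float, g_C: float) -> float:
    """Exact proportional output gain when A grows by g_A and C by g_C,
    holding the residual factor X fixed."""
    return (1 + g_A) ** alpha * (1 + g_C) ** beta - 1

alpha, beta, r = 0.05, 0.10, 0.5   # illustrative values from the text

gain_L = output_gain(alpha, beta, g_A=r, g_C=0.0)  # laggard: A only
gain_H = output_gain(alpha, beta, g_A=r, g_C=r)    # high performer: A and C

print(f"Firm L: {gain_L:.3%}")            # ~2.0% exact vs 2.5% approx
print(f"Firm H: {gain_H:.3%}")            # ~6.3% exact vs 7.5% approx
print(f"Ratio H/L: {gain_H / gain_L:.2f}")
```

The exact gains (about 2.0% and 6.3%) sit slightly below the first-order approximations, but the roughly 3× ratio between the two firms survives.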
Why complementary intangibles are hard to acquire
The model is mechanical; the empirical question is why \(\Delta C\) is hard to achieve. Three reasons recur in the case literature:
- Complementary capital is firm-specific and tacit. Workflow redesign cannot be bought from a vendor; it is built by the people who already do the work, in a process that takes 6–24 months and reduces measured productivity in the interim.
- It requires authority and coordination across silos. The benefit of redesigning a customer-service workflow accrues to the operations function, but the cost falls partly on IT, HR, and legal. Without C-suite sponsorship, the cross-silo bargain breaks down.
- It is not separately measured. Conventional accounting often allows \(\Delta A\) to be capitalised but treats \(\Delta C\) as a current expense. Firms that do well on \(\Delta C\) look less profitable in the short run, even though they will outperform in the medium run. This is precisely the J-curve phenomenon that Brynjolfsson, Rock, and Syverson (2021) document and that we develop in Chapter 15.
What the model implies for managers
Three implications follow:
- Returns to AI investment are bimodal, not continuous. A firm that invests in \(A\) alone captures the small marginal share \(\alpha r\); a firm that invests in \(A\) and \(C\) together captures \((\alpha + \beta) r\). The middle path — partial investment in complements — captures something closer to the laggard’s outcome than the leader’s.
- The order of investment matters. Firms that build \(C\) first (governance frameworks, workflow redesign capacity, change-management muscle) and then invest in \(A\) tend to capture the upside faster than firms that buy \(A\) first and try to retrofit \(C\) around it.
- The right unit of measurement is firm-level financial performance, not project-level productivity. A firm can have many high-productivity projects and still no measurable EBIT impact, because the gains leak out to customers (price compression), competitors (capability diffusion), or are absorbed by parallel inefficiency elsewhere in the firm.
We return to this model in Chapters 4 and 15.
What is genuinely new in 2024–2026
Despite this book’s insistence that the bottleneck is organisational, the technology curve really has steepened. Five facts deserve to be memorised by every graduate student in this area.
Inference cost decline of approximately 280×
Stanford AI Index 2025 (Stanford HAI, 2025) documents that inference cost for GPT-3.5-equivalent quality fell by approximately 280× from late 2022 to early 2025. The decline is driven by smaller models trained better (post-Chinchilla scaling (Hoffmann et al., 2022)), distillation, quantisation (running models at INT8 or INT4 instead of FP16/32), and hardware specialisation (NVIDIA H100, B200; Google TPU v5p; AWS Trainium2; Cerebras and Groq inference-optimised silicon).
The economic implication is profound: a use case that cost $10 per 1,000 queries in November 2022 costs roughly $0.04 per 1,000 queries by early 2025. Many use cases that were uneconomic in late 2022 are commercially viable by mid-2024. We will see this directly in Chapters 6, 7, and 8.
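The arithmetic is worth making explicit; a back-of-envelope sketch in which the $10 starting price and the 280× factor come from the figures above, while the willingness-to-pay number is an invented illustration:

```python
# Back-of-envelope economics of the ~280x inference cost decline.
# Inputs are the figures quoted in the text; everything else is arithmetic.

start_cost_per_1k = 10.00   # USD per 1,000 queries, Nov 2022 (from text)
decline_factor = 280        # Stanford AI Index 2025 estimate (from text)

end_cost_per_1k = start_cost_per_1k / decline_factor
print(f"Early-2025 cost: ${end_cost_per_1k:.3f} per 1,000 queries")  # ~$0.036

# A use case is viable when value per query exceeds cost per query.
# Invented example: a support-deflection query worth $0.005 was
# uneconomic at $0.010/query in 2022 but has ~140x headroom in 2025.
value_per_query = 0.005                       # hypothetical willingness to pay
cost_per_query_2022 = start_cost_per_1k / 1000
cost_per_query_2025 = end_cost_per_1k / 1000
print(value_per_query > cost_per_query_2022)  # False: uneconomic in 2022
print(value_per_query > cost_per_query_2025)  # True: viable by 2025
```

The same calculation, run for any candidate use case, separates the “newly viable” set from the “still uneconomic” set; the 280× decline moves the boundary by more than two orders of magnitude.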
Open-weight models reach parity with closed frontier models
DeepSeek-R1 (DeepSeek-AI, 2025) (January 2025, MIT-licensed) matched OpenAI’s o1 on most reasoning benchmarks; Llama 4 (April 2025) is competitive with GPT-4-class models; Qwen 2.5 and Mistral Large 2 followed similar trajectories. The implication for enterprise architecture is that vendor lock-in to a single foundation-model provider is no longer required. We develop the strategic implications in Chapter 5.
Multimodality is the default
GPT-4o (May 2024), Gemini 2.5 Pro (March 2025), Claude 4 (May 2025) combine vision, language, code, and audio in single models with sub-second voice latency. The architectural implication is that the boundary between AI use cases and traditional software functions is dissolving — a single model can handle what previously required a vision API, a speech-to-text API, an LLM API, and an OCR API stitched together.
Agentic capability has crossed a usability threshold
OpenAI Operator scored 87% on WebVoyager in January 2025 — a benchmark of autonomous web-browsing tasks. Salesforce Agentforce 3.0 (June 2025) supports cross-platform tool use via the Model Context Protocol. The Anthropic Computer Use capability (October 2024) lets Claude 3.5 Sonnet perceive screens, move cursors, and execute keyboard input. We develop the agentic frontier in Chapter 13.
Reasoning models inflected the cost-quality curve
OpenAI o1 (September 2024) and DeepSeek-R1 (January 2025) introduced inference-time reasoning — models that “think” before they answer. This category materially extends what models can do: complex multi-step reasoning, verifiable mathematics, and code that compiles and passes tests on the first attempt all improve substantially.
The cost structure shift matters. Reasoning models burn far more inference compute per query than classical LLMs — and the 280× drop in inference cost over 2022–2025 is partly absorbed by this shift. The net economics depend on the task: for high-stakes, low-volume tasks (code, finance, legal), reasoning models pay back; for high-volume, low-stakes tasks (FAQ chatbots), they do not.
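The task-dependence in the previous paragraph reduces to a one-line break-even condition; a minimal sketch in which every number is an invented illustration, not a vendor price:

```python
# Break-even rule for choosing a reasoning model over a classical LLM:
# the extra value per query must exceed the extra inference cost per query.
# All figures below are invented illustrations.

def prefer_reasoning(value_uplift: float, cost_multiple: float,
                     base_cost: float) -> bool:
    """True if the reasoning model's per-query value uplift exceeds its
    extra per-query cost: cost_multiple * base_cost - base_cost."""
    return value_uplift > base_cost * (cost_multiple - 1)

# High-stakes, low-volume: a legal-drafting query worth $2.00 more if
# correct, at 20x a base cost of $0.01/query -> extra cost $0.19.
print(prefer_reasoning(2.00, 20, 0.01))    # True: worth it
# High-volume FAQ: $0.001 uplift per query at the same cost structure.
print(prefer_reasoning(0.001, 20, 0.01))   # False: not worth it
```

The rule makes the chapter’s claim precise: reasoning models win where value per query is high relative to the inference-compute premium, and lose where it is not.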
Where the book lands
The book takes positions on three contested empirical questions, each defended in the chapters that follow.
Position 1: Foundation models are commoditising; AI factories are not
The post-DeepSeek strategic landscape moves the locus of advantage up the stack from algorithms to operational architecture. Chapters 3 and 5 develop the argument. The empirical evidence is the rapid convergence between open-weight and closed-weight frontier models documented above, against the durable persistence of AI-factory advantage at firms like DBS, Amazon, Netflix, and Ant Group.
Position 2: The labour effects of generative AI compress the skill distribution
Novices benefit substantially more than experts; the augmentation effect is real and large in customer service (Brynjolfsson, Li, and Raymond, 2025), software engineering, and professional writing (Noy and Zhang, 2023) — but the “jagged frontier” matters (Dell’Acqua et al., 2023). Chapter 15 develops the empirical evidence. The right policy and managerial implication is that AI changes the composition of work, with a particular pattern (compression) rather than a uniform substitution.
Position 3: The right minimum unit of AI investment is all six Rewired capabilities at once
Capability investment in concert outperforms capability investment in isolation by a wide margin (Lamarre, Smaje, and Zemmel, 2023). Chapters 4 and 16 develop the argument. The implication is that the “use case zoo” approach (many pilots, none integrated) systematically underperforms the focused-domain-with-full-capability approach, even though the latter is harder to start.
The unit of analysis problem
A subtle but important point that recurs throughout the book: different empirical literatures measure AI’s effects at different units of analysis, and the units do not always agree.
| Unit | Studies | Typical finding | Source |
|---|---|---|---|
| Task | Brynjolfsson–Mitchell SML rubric | Most occupations have some SML tasks, few are fully replaceable | Brynjolfsson and Mitchell (2017); Brynjolfsson, Mitchell, and Rock (2018) |
| Worker | RCT studies on copilots | 14–55% productivity gains, larger for novices | Brynjolfsson, Li, and Raymond (2025); Noy and Zhang (2023) |
| Firm | McKinsey survey of AI high performers | 5–6% capture EBIT-level value | McKinsey & Company (2025) |
| Industry | Acemoglu macro estimate | ~0.66% TFP gain over a decade | Acemoglu (2024) |
| Economy | Goldman Sachs aggregate | $4.4T annual GenAI value | Goldman Sachs (2023) |
The same evidence supports different stories at different units. A firm where every worker captures a 30% productivity gain may show only a 5% EBIT improvement (if competition forces price compression) or no measurable industry-level productivity gain (if the gains are entirely absorbed by parallel inefficiency). A graduate student should learn to read each empirical claim and ask: at what unit of analysis is this measured, and what would the same phenomenon look like at a higher or lower unit?
Outline of the book
Structure
Part I — Foundations (Chapters 1–5). Establishes vocabulary, chronology, and the foundational frameworks. Read these in order.
Part II — Sectors (Chapters 6–12). Covers seven sector domains — they can be read independently or in any order.
Part III — Frontier and Synthesis (Chapters 13–18). Takes up the frontier (agents), the empirical evidence on labour and ROI, the maturity-and-roadmap question, consolidates the conceptual frameworks, and presents teaching cases.
For a one-semester graduate unit, the recommended path is Chapters 1–5 (weeks 1–4), three sector chapters from Part II of the student’s choice (weeks 5–7), then Chapters 13–17 (weeks 8–12), with the teaching cases in Chapter 18 distributed across the semester. Chapters 1, 3, 4, 5, 14, and 15 each support two-hour graduate seminars; the sector chapters support one-hour seminars.
Exercises 1.1
The McKinsey 78% figure. The McKinsey survey instrument asks executives to confirm AI deployment across 27 candidate use cases. (a) Identify three sources of bias in this measurement. (b) For each, suggest a methodological improvement that would reduce the bias. (c) What would a more conservative point estimate of “real” AI adoption be?
Solow paradox redux. David (1990) argued that the electric-motor productivity paradox of the 1900s–1920s ran for about 40 years from technology availability to aggregate productivity gain. Identify two structural differences between the 1900s and the 2020s that might shorten or lengthen this lag for AI. Defend each.
The formal model. Using the production function in §1.6, suppose \(\alpha = 0.05\) and \(\beta = 0.10\). (a) What is the output gain for a firm that increases \(A\) by 100% and \(C\) by 25%? (b) What is the output gain for a firm that increases \(A\) by 50% and \(C\) by 50%? (c) Which strategy yields a higher return on invested capital, assuming each unit of \(A\) and \(C\) has the same cost?
Unit of analysis. A pharmaceutical firm finds that its AI-augmented medicinal chemists are 40% more productive in pre-clinical compound design. Six years later, the firm’s gross margin is unchanged. Construct three plausible explanations. Which is most likely, and what evidence would you collect to distinguish them?
The runtime metaphor. Iansiti and Lakhani (2020) claim AI is becoming the “runtime” of the firm. Identify (a) one firm in your country where this claim is essentially true and (b) one firm where it is not. What distinguishes them? What would the latter firm need to invest in to make the claim true?
Inference cost forecasting. Stanford AI Index 2025 reports that inference cost has fallen ~280× in under three years. Construct a five-year price forecast (2026–2031) using two different functional forms (e.g., log-linear and exponential decay with a floor). Identify two business models that become viable at the lower forecast.
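As a scaffold for this exercise, the two suggested functional forms can be set up as follows; the 2025 anchor price, the extrapolated annual decline, and the price floor are all assumptions to be replaced with the student’s own:

```python
import math

# Two hedged functional forms for an inference-price forecast.
# c0, annual_factor, and floor are assumptions, not data: c0 anchors to
# roughly $0.036 per 1,000 queries in 2025, and the ~280x-in-~2.5-years
# historical decline implies the log-linear slope if the trend continues.

c0 = 0.036                        # USD per 1k queries, 2025 anchor (assumed)
annual_factor = 280 ** (1 / 2.5)  # ~9.5x decline per year if trend continues
floor = 0.001                     # assumed hardware/energy price floor

def log_linear(t_years: float) -> float:
    """Pure log-linear extrapolation: constant proportional decline."""
    return c0 / annual_factor ** t_years

def decay_with_floor(t_years: float) -> float:
    """Exponential decay toward a positive floor price."""
    return floor + (c0 - floor) * math.exp(-t_years * math.log(annual_factor))

for year in range(2026, 2032):
    t = year - 2025
    print(year, round(log_linear(t), 6), round(decay_with_floor(t), 6))
```

The two forms diverge sharply after two or three years: the log-linear path goes to zero while the floored path flattens, which is what makes part (b) of the exercise bite.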
The DeepSeek shock. Why did DeepSeek-R1’s release on 20 January 2025 produce a $600B Nvidia loss seven days later? What assumption did the market revise? What is the strongest counter-argument that this revision was wrong?
A research design. Design a longitudinal study to test whether the 5–6% high-performer share captured by McKinsey is genuine or an artefact of measurement. Specify the firm panel, the treatment and control groups, the outcome measures, and the identification strategy.
The complementarity hypothesis. Brynjolfsson and Hitt (1996) attribute IT productivity gains to complementary intangibles. Apply the same lens to a 2024 AI deployment in your industry. What complementary intangibles are required? Estimate their cost relative to the AI capital cost.
Quiet value. §1.2 asserts that the most economically valuable AI deployments are quiet. (a) Identify a quiet 2024–2026 deployment that is plausibly already producing $500M+ annual value but has received little press coverage. (b) Identify a noisy deployment that is plausibly producing less value than its press coverage suggests. (c) What systematic biases produce the gap between value and coverage?
Reading McKinsey critically. McKinsey is both a measurement source and a market participant — its consulting practice benefits from clients believing AI investment is necessary. Identify three specific points in this chapter where this conflict of interest could distort the interpretation of McKinsey-sourced evidence. Construct an alternative interpretation for each.
The next chapter. Chapter 2 argues that the most economically valuable deployments are quiet. Pre-read the chapter and identify three “quiet” deployments from the pre-2020 period. For each, explain why it received less attention than it merited.
Further reading
For the foundational treatment of AI factory architecture and digital scale/scope/learning, read Iansiti and Lakhani (2020) cover to cover; the framework is denser than this chapter can convey. For the productivity-paradox lineage, the canonical readings are Brynjolfsson and Hitt (1996) and David (1990), with Triplett and Bosworth (2003) for the 1990s catch-up. For the AI-specific productivity literature, Brynjolfsson, Rock, and Syverson (2021) is essential; Acemoglu (2024) is the current low-end macro estimate. For the foundation-model technical substrate, Bommasani et al. (2021) is the standard introduction; Vaswani et al. (2017), Brown et al. (2020), and Hoffmann et al. (2022) are the three indispensable papers. For the macroeconomic literature on increasing returns and platform competition, Arthur (1989) and Rochet and Tirole (2003) establish the conceptual foundations.
References for this chapter
- McKinsey & Company (2025). The state of AI: Global survey.
- Stanford HAI (2025). AI Index Report 2025.
- Deloitte (2024). State of generative AI in the enterprise, Q4 2024.
- McKinsey & Company (2026). The AI transformation manifesto: 12 themes driving growth.
- Solow, R. M. (1987). We’d better watch out. New York Times Book Review, 12 July 1987, p. 36.
- Brynjolfsson, E. and Hitt, L. (1996). Paradox lost? Firm-level evidence on the returns to information systems spending. Management Science 42(4): 541–558.
- David, P. A. (1990). The dynamo and the computer: An historical perspective on the modern productivity paradox. American Economic Review 80(2): 355–361.
- Acemoglu, D. (2024). The simple macroeconomics of AI. NBER Working Paper 32487.
- Triplett, J. E. and Bosworth, B. P. (2003). Productivity measurement issues in services industries: “Baumol’s disease” has been cured. Federal Reserve Bank of New York Economic Policy Review 9(3): 23–33.
- Brynjolfsson, E., Rock, D., and Syverson, C. (2021). The productivity J-curve: How intangibles complement general purpose technologies. American Economic Journal: Macroeconomics 13(1): 333–372.
- Buchanan, B. G. and Shortliffe, E. H., eds. (1984). Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley.
- Linden, G., Smith, B., and York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 7(1): 76–80.
- Koren, Y., Bell, R., and Volinsky, C. (2009). Matrix factorization techniques for recommender systems. IEEE Computer 42(8): 30–37.
- Brin, S. and Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30(1–7): 107–117.
- Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. NeurIPS.
- He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. CVPR.
- Silver, D., Huang, A., Maddison, C. J., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature 529(7587): 484–489.
- Jumper, J. et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596: 583–589.
- Devlin, J. et al. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL.
- Brown, T. et al. (2020). Language models are few-shot learners. NeurIPS.
- DeepSeek-AI (2024). DeepSeek-V3 technical report. arXiv:2412.19437.
- Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). On the opportunities and risks of foundation models. arXiv:2108.07258.
- Yao, S. et al. (2023). ReAct: Synergizing reasoning and acting in language models. ICLR.
- Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., and Scialom, T. (2023). Toolformer: Language models can teach themselves to use tools. NeurIPS.
- Lindsay, R. K., Buchanan, B. G., Feigenbaum, E. A., and Lederberg, J. (1980). Applications of Artificial Intelligence for Organic Chemistry: The DENDRAL Project. McGraw-Hill.
- Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer.
- Breiman, L. (2001). Random forests. Machine Learning 45(1): 5–32.
- Vaswani, A. et al. (2017). Attention is all you need. NeurIPS.
- Hoffmann, J. et al. (2022). Training compute-optimal large language models (Chinchilla). arXiv:2203.15556.
- Iansiti, M. and Lakhani, K. R. (2020). Competing in the Age of AI: Strategy and Leadership When Algorithms and Networks Run the World. Harvard Business Review Press.
- Arthur, W. B. (1989). Competing technologies, increasing returns, and lock-in by historical events. Economic Journal 99(394): 116–131.
- DeepSeek-AI (2025). DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv:2501.12948.
- Brynjolfsson, E., Li, D., and Raymond, L. R. (2025). Generative AI at work. Quarterly Journal of Economics 140(2): 889–942.
- Noy, S. and Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science 381(6654): 187–192.
- Dell’Acqua, F. et al. (2023). Navigating the jagged technological frontier. Harvard Business School Working Paper 24-013.
- Lamarre, E., Smaje, K., and Zemmel, R. (2023). Rewired: The McKinsey Guide to Outcompeting in the Age of Digital and AI. Wiley.
- Brynjolfsson, E. and Mitchell, T. (2017). What can machine learning do? Workforce implications. Science 358(6370): 1530–1534.
- Brynjolfsson, E., Mitchell, T., and Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings 108: 43–47.
- Rochet, J.-C. and Tirole, J. (2003). Platform competition in two-sided markets. Journal of the European Economic Association 1(4): 990–1029.