Chapter 16 — Maturity frameworks
This chapter develops the maturity-assessment framework for AI deployment. Maturity frameworks are diagnostic tools — they help analysts evaluate where specific deployments stand, where they need to develop, and what the operational and strategic implications of the current stage are. The framework this chapter develops integrates the deployment-maturity framework introduced in §12.8 with capability-maturity dimensions, firm-level frameworks, application-level frameworks, and explicit AI Deployment Maturity Levels that allow systematic comparison.
The framework’s analytical purpose is structurally important. Part II’s case material demonstrated that AI deployment success and failure depend less on raw technical capability than on the deployment-environment fit — the alignment between the technology, the organisational context, the regulatory environment, the data-and-feedback infrastructure, and the integration with existing operations. Several of Part II’s cautionary cases (Watson Health, Klarna, Tradelens, Robodebt) failed not because the underlying technology was incapable but because the deployment context was inadequate for the technology’s actual maturity. The maturity framework provides the diagnostic tools to assess deployment-context-and-capability alignment before deployment decisions, rather than learning the lessons through expensive failures.
The chapter develops the framework with explicit cross-references to Part II cases (which provide empirical material for the framework’s categories) and to Part V playbook chapters (which apply the framework’s discipline to the build process). The framework is operational rather than purely descriptive: students completing the unit will use the framework to assess their own deployment decisions, evaluate cases they encounter, and inform strategic recommendations to firms.
The chapter proceeds in fourteen sections. Section 16.1 explains why maturity frameworks matter. Section 16.2 surveys existing maturity frameworks in adjacent domains. Section 16.3 distinguishes capability maturity from deployment maturity. Section 16.4 develops the five-factor deployment-maturity framework. Section 16.5 covers capability maturity adapted from technology-readiness levels. Section 16.6 covers firm-level AI maturity with five levels. Section 16.7 covers application-level maturity. Section 16.8 covers sector-level maturity, extending §12.8. Section 16.9 integrates the Iansiti-Lakhani factor framework. Section 16.10 covers assessment methodologies. Section 16.11 develops explicit AI Deployment Maturity Levels. Section 16.12 covers pathologies of premature maturity claims. Section 16.13 covers operational implications. Section 16.14 sketches the forward trajectory.
16.1 Why maturity frameworks matter
Maturity frameworks support specific kinds of decisions across multiple stakeholder groups.
Strategic decisions for AI-deploying firms. A firm considering an AI deployment investment needs to know whether the underlying technology is mature enough to produce reliable operational value, and whether the firm’s deployment context (data infrastructure; operational integration; organisational capability; regulatory environment) is mature enough to support successful deployment. Premature deployment commitment based on overestimates of either capability maturity or deployment maturity has been the underlying pattern in many of Part II’s cautionary cases. The Watson Health case (Section 7.3) overestimated the capability of medical AI in the 2014–2018 period; the Klarna case (Section 8.4) overestimated the deployment maturity of customer-service-AI substitution. The strategic decision to invest, deploy, scale, or retreat depends substantially on the maturity assessment that precedes the decision.
Investment decisions. Venture investors, public-market investors, and corporate-development functions all make investment decisions about AI firms and AI applications. The investment thesis typically depends on assumptions about both the firm’s specific maturity and the broader sector’s maturity. The 2010s VC enthusiasm for autonomous-vehicle startups (Section 12.4) reflected optimistic maturity assumptions that did not hold; the 2024 retrenchment in the AV industry was substantially a correction to those assumptions. Maturity frameworks support more-rigorous investment analysis by making the implicit assumptions explicit.
Regulatory decisions. Regulators making decisions about AI deployment requirements need to assess where specific applications stand on capability and deployment maturity. The EU AI Act’s risk-based framework (Chapter 14) implicitly incorporates maturity assessments — high-risk applications face requirements that low-risk applications do not, with the categorisation reflecting both inherent risk and assessment of how mature current capability is in addressing that risk. The FDA AI/ML SaMD framework, the various financial-services AI frameworks, and adjacent sector-specific frameworks similarly incorporate maturity reasoning.
Evaluation decisions. Researchers, journalists, and analysts evaluating AI deployment claims need frameworks to distinguish substantive from inflated claims. The “AI hype cycle” pattern that recurs in industry analysis is partly a product of inadequate maturity assessment — claims that AI applications have reached production maturity that subsequent operational experience does not bear out. Maturity frameworks support more-rigorous evaluation by providing structured criteria for the evidence that claims of maturity should rest on.
The motivating cases. Several Part II cases motivate the maturity framework directly.
Watson Health — IBM’s clinical-AI deployment in 2014–2018 substantially overstated the capability and deployment maturity of medical AI. The eventual collapse and divestiture (Section 7.3) reflected the gap between claimed and actual maturity. The five-billion-dollar capital write-down provided expensive evidence of the cost of premature maturity claims.
Klarna — The February 2024 announcement (Section 8.4) implicitly claimed deployment maturity for customer-service AI substitution that operational experience over the subsequent year did not bear out. The May 2025 reversal demonstrated that the deployment maturity for substitution was lower than claimed.
Tradelens — Maersk and IBM’s industry platform (Section 11.2) claimed sector-level deployment maturity for digital trade infrastructure that the broader industry adoption pattern did not produce. The 2022 closure reflected the gap between claimed and actual sector maturity.
Robodebt — The Australian government’s deployment (Section 12.1) implicitly claimed legal, operational, and ethical maturity that subsequent Royal Commission analysis showed was inadequate. The AUD 1.8 billion refund liability and the cumulative political accountability provided evidence of the cost of premature maturity claims in public-sector contexts.
The cumulative case material demonstrates that maturity assessment is not optional. The cost of getting maturity wrong is substantial; the discipline of explicit maturity assessment is among the most-valuable analytical tools that AI deployment work requires.
16.2 Existing maturity frameworks in adjacent domains
Maturity frameworks have been developed in several adjacent domains over recent decades. Understanding what translates to AI and what does not provides a foundation for the AI-specific framework.
The Capability Maturity Model Integration (CMMI). The CMMI framework, developed at Carnegie Mellon’s Software Engineering Institute through the 1990s and progressively elaborated since, is the foundational maturity framework for software-engineering practice. The framework specifies five maturity levels: Initial (Level 1, ad hoc and chaotic processes); Managed (Level 2, processes are planned and tracked); Defined (Level 3, processes are characterised and consistent); Quantitatively Managed (Level 4, processes are measured and controlled); Optimising (Level 5, focus on continuous improvement).
CMMI has been substantially influential beyond software engineering; multiple subsequent maturity frameworks have adopted its five-level structure. The framework has limitations for AI deployment specifically — the focus on process maturity does not directly capture the technological and capability dimensions that AI deployment requires — but the structural insight (that organisations progress through identifiable maturity stages with characteristic features) generalises.
Cloud maturity frameworks. Multiple frameworks address organisational maturity for cloud-computing deployment. The AWS Cloud Adoption Framework, the Microsoft Cloud Adoption Framework, and the Google Cloud Adoption Framework each provide multi-level maturity assessment for cloud deployment. The frameworks typically address dimensions including business strategy, people, governance, platform, security, and operations. The cloud-maturity frameworks are useful templates for AI maturity frameworks because AI deployment shares structural features with cloud adoption (organisational change requirements; data governance; cost management; operational integration).
Digital transformation maturity frameworks. A broader category of maturity frameworks addresses digital transformation generally. The MIT Center for Digital Business framework, the Gartner Digital Maturity Model, and various consulting-firm frameworks (McKinsey, BCG, Deloitte, Accenture, Capgemini) each provide multi-level assessments of organisational digital capability. The frameworks are useful for understanding the broader organisational context within which AI deployment occurs but are typically less granular than AI-specific assessment requires.
The Technology Readiness Level (TRL) framework. The TRL framework, developed by NASA in the 1970s and standardised by various organisations subsequently (the European Commission adopted it in 2014 for Horizon 2020 funding decisions; the US Department of Defense uses it for technology investment decisions), provides nine levels of technology maturity from basic principles observed (Level 1) through actual system proven in operational environment (Level 9). The TRL framework is useful for capability-maturity assessment specifically; Section 16.5 develops AI-adapted TRL.
Industry-specific maturity frameworks. Several sectors have developed industry-specific maturity frameworks. The FDA’s Software as a Medical Device (SaMD) framework includes implicit maturity progression. The Financial Stability Board’s work on AI in financial services includes maturity considerations. Sector-specific frameworks have advantages (alignment with sector-specific regulatory and operational concerns) and disadvantages (limited cross-sector applicability).
What translates to AI. Several elements of adjacent-domain frameworks translate to AI maturity assessment:
- The multi-level structural pattern (typically five levels) supports systematic assessment without excessive granularity.
- The distinction between capability maturity and process maturity generalises to the distinction between technical capability and operational deployment.
- The dimensional decomposition (CMMI’s process areas; cloud frameworks’ six dimensions; digital-transformation frameworks’ multiple dimensions) supports comprehensive assessment.
- The TRL framework’s specific focus on capability maturity maps to the AI-specific capability dimensions.
What does not translate. Some adjacent-domain framework elements do not translate well:
- The deterministic-process assumption of CMMI does not fit AI’s probabilistic-and-evolving nature.
- The sequential-improvement framing of most frameworks understates the complexity of AI deployment, where regression on specific dimensions can occur as the deployment evolves.
- The single-organisation focus understates the cross-organisational dimensions of AI deployment (foundation-model providers; deployment-platform providers; data partners; the firm itself; downstream customers).
The AI-specific framework this chapter develops draws on the adjacent-domain frameworks while addressing the specific characteristics of AI deployment.
16.3 Capability maturity vs deployment maturity
A foundational distinction for the AI-specific framework is between capability maturity and deployment maturity. The two dimensions are independent: a system may have high capability maturity (the underlying technology works reliably for the specific task) but low deployment maturity (the operational and organisational context cannot support successful production use), or vice versa. Substantive maturity assessment requires evaluating both dimensions independently.
Capability maturity. Capability maturity addresses the technology itself — does the AI system actually work for the specific task at the required level of reliability, robustness, and quality? Capability maturity is assessed through technical evaluation: benchmark performance; production-deployment performance metrics; reliability under varying conditions; robustness against adversarial inputs; quality of outputs across the input distribution. Capability maturity is partially objective; benchmark scores can be compared across systems and over time. It is partially context-dependent; the same system may have high capability maturity for one task and low capability maturity for an adjacent task.
The capability maturity dimension is what the TRL framework (Section 16.5) specifically addresses. AI capability maturity has improved substantially through 2018–2026, with foundation-model-based systems reaching high capability maturity for many tasks where prior approaches did not.
Deployment maturity. Deployment maturity addresses the broader system around the technology — can the organisation actually run the AI system in production successfully? Deployment maturity encompasses: data infrastructure (does the organisation have the data to train and operate the system?); operational integration (does the AI system fit into existing business processes?); organisational capability (does the organisation have the skills and structures to operate the system?); change management (can the organisation absorb the changes the AI system implies?); regulatory and compliance readiness (does the deployment context meet applicable regulatory requirements?); incident response (does the organisation have the capability to handle problems when they occur?).
The deployment maturity dimension is what the §12.8 framework specifically addresses with its five factors. Section 16.4 develops the framework in detail.
Why both matter. The independence of the dimensions has substantial operational implications.
High capability maturity with low deployment maturity produces deployments that fail despite the technology working. The Klarna case (Section 8.4) is structurally an example: foundation-model customer-service AI had high capability maturity by early 2024, but the deployment maturity at Klarna was inadequate (alpha-skipping; wrong evaluation metrics; insufficient hybrid-handoff design). The deployment failed despite the technology being capable.
Low capability maturity with high deployment maturity produces deployments that fail because the technology cannot deliver. The Watson Health case (Section 7.3) is structurally an example: IBM had substantial deployment-infrastructure capability, but the underlying medical-AI capability in 2014–2018 was inadequate to the broad clinical applications the deployment promised. The deployment failed because the technology was not yet capable enough.
Low capability maturity with low deployment maturity produces compounded failure. The Cambridge Analytica case (Section 10.4) had elements of both — the underlying psychographic-profiling capability was overstated, and the deployment context (broad data access; political-campaign use; cross-jurisdictional regulatory environment) was structurally inadequate.
High capability maturity with high deployment maturity is the success pattern. The contemporary GitHub Copilot deployment (Section 13.5), the Stitch Fix data-flywheel pattern (Section 8.2), the various successful operational-AI deployments across sectors all combine adequate capability with adequate deployment context. The success pattern is not an accident; it reflects the alignment of both dimensions.
The framework’s structure. The combined framework addresses both dimensions: capability maturity (Section 16.5) and deployment maturity (Section 16.4) are assessed independently; the combined assessment supports deployment decisions, regulatory decisions, and evaluation decisions. The integrated AI Deployment Maturity Levels (Section 16.11) combine the dimensional assessments into composite levels.
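The two-dimensional assessment can be illustrated with a minimal sketch in Python. The quadrant labels, score types, and thresholds below are illustrative assumptions rather than part of the framework itself; a real assessment would substitute the TRL and firm-level scores developed in Sections 16.5 and 16.6.
```python
from dataclasses import dataclass


@dataclass
class MaturityAssessment:
    capability: int  # e.g. an AI-adapted TRL score, 1-9 (Section 16.5)
    deployment: int  # e.g. a firm-level maturity score, 1-5 (Section 16.6)


def classify(assessment: MaturityAssessment,
             capability_threshold: int = 7,
             deployment_threshold: int = 3) -> str:
    """Place a deployment in one of the four quadrants discussed in Section 16.3."""
    cap_high = assessment.capability >= capability_threshold
    dep_high = assessment.deployment >= deployment_threshold
    if cap_high and dep_high:
        return "success pattern: capable technology in a ready deployment context"
    if cap_high and not dep_high:
        return "capable technology, inadequate deployment context (the Klarna pattern)"
    if not cap_high and dep_high:
        return "ready context, technology not yet capable enough (the Watson Health pattern)"
    return "compounded risk: neither dimension is adequate"


# Example: high capability maturity paired with low deployment maturity.
print(classify(MaturityAssessment(capability=8, deployment=2)))
```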
16.4 The five-factor deployment-maturity framework
The deployment-maturity framework introduced in §12.8 identifies five factors that drive the speed and quality of deployment maturation. The framework supports both diagnostic assessment (where does a specific deployment stand?) and prospective analysis (what would it take for this deployment to mature?).
Factor 1 — task definition. Deployment maturity requires clear operational definition of the task the AI system performs. Tasks with sharp definition (predict whether a transaction is fraudulent; identify defects on a circuit board; route a delivery vehicle through a known network) support high deployment maturity. Tasks with ambiguous definition (recommend the best cancer treatment; advise on the best legal strategy; design an effective marketing campaign) face structural challenges in reaching high deployment maturity.
The factor matters because operational deployment requires the system to make decisions consistently against well-defined criteria. Ambiguous task definitions produce ambiguous evaluation (success and failure cannot be cleanly distinguished), ambiguous improvement directions (it is unclear what to optimise), and ambiguous user expectations (different users have different views of what success looks like). The Watson Health case is structurally relevant: the broad framing (“AI for medicine”) prevented the operational task definition that mature deployment requires.
The factor is not entirely binary. Tasks have more or less sharp definition; the maturity assessment evaluates where the specific task sits on the spectrum. Tasks can be progressively sharpened — the same broad domain can be decomposed into specific tasks with sharp definition that support deployment, even when the broad domain itself does not.
Factor 2 — feedback signals. Deployment maturity requires reliable feedback signals about whether the AI system is performing as intended. Signals can be direct (the system makes a decision; the outcome is observed and provides direct evidence on the decision quality), delayed (the outcome is observable but only after a substantial lag), or indirect (the outcome is not directly observable; proxies must be used).
Strong feedback signals support deployment maturity by enabling continuous improvement: the system’s performance can be monitored; errors can be identified and corrected; the data flywheel can operate. Weak feedback signals constrain deployment maturity because the system’s performance cannot be reliably evaluated and improvement is structurally difficult.
The factor is critical for the data-flywheel dynamics that Iansiti and Lakhani (2020) identified (Section 16.9). Sectors with strong feedback (advertising click-through; e-commerce purchases; financial-services transaction outcomes) develop deployment maturity faster than sectors with weak feedback (specific clinical applications; long-horizon planning; certain creative work).
Factor 3 — data availability. Deployment maturity requires substantial training and operational data. Some deployments operate against substantial existing data assets (financial-services transaction histories; e-commerce platforms’ user-behaviour data; manufacturing sensor networks); others must collect data from scratch (specific clinical applications; novel scientific domains; nascent product categories).
The data-availability factor affects both initial deployment (sufficient data to train the system) and ongoing operations (continuous data flow to maintain and improve the system). The factor produces structural advantage for incumbents with substantial existing data assets relative to entrants without comparable assets. The factor also produces specific challenges for deployment in data-scarce contexts; some deployments simply cannot reach high maturity until data accumulation has occurred.
Factor 4 — regulatory environment. Deployment maturity requires alignment with the applicable regulatory environment. Some deployment contexts have established regulatory frameworks that accommodate AI deployment (financial services with mature AI regulation; manufacturing with established product-safety frameworks; advertising with established consumer-protection frameworks). Others have new or contested regulatory frameworks (autonomous vehicles with developing AV regulation; public-sector decision-making with developing algorithmic-accountability frameworks; generative content with developing IP and content-authenticity frameworks).
The regulatory factor matters because deployment in incompatible regulatory environments produces operational risk (penalties; injunctions; reputational damage) that cannot be eliminated by technical capability alone. The factor has been progressively binding through 2024–2026 as the regulatory landscape (Chapter 14) has matured; firms operating in jurisdictions with mature AI regulation must align deployment with the regulatory framework or face substantial costs.
Factor 5 — deployment-environment friction. Deployment maturity requires manageable friction in integrating the AI system with the broader operational environment. Low-friction deployments include web-based or app-based deployments at consumer scale (where the integration is with the firm’s existing digital infrastructure); high-friction deployments include hospital deployments (with the substantial workflow-integration complexity that Chapter 7 covered), safety-critical infrastructure (with certification and operational-integration requirements), and cross-firm platforms (with the multi-party-coordination complexity that Tradelens demonstrated).
The friction factor matters because high friction produces high deployment cost and slow deployment speed. The factor explains why the same AI capability can produce mature deployment in one sector and immature deployment in another; the friction differential rather than capability differential is often the binding constraint.
Applying the framework. The five factors are assessed independently for specific deployments. Each factor is rated (low, mid, high) for the specific deployment context. The combined assessment characterises the deployment maturity prospects: deployments scoring high on all five factors will mature rapidly; deployments scoring low on multiple factors will face structural delays.
The framework supports specific operational decisions. A firm considering a deployment can assess the factors before committing; if multiple factors are low, the deployment may need substantial additional preparation before commitment. A regulator considering a regulatory framework can assess where deployments stand on the factors; if regulatory environment is the binding constraint, regulatory clarification may unblock substantial value. An investor considering an AI investment can assess where the firm and sector stand on the factors; investment thesis depends on factor assessment.
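As a worked illustration, a minimal Python sketch of the five-factor rating follows. The factor names follow this section; the low/mid/high scoring and the simple aggregation rule are illustrative assumptions rather than a prescribed scoring method.
```python
FACTORS = [
    "task_definition",
    "feedback_signals",
    "data_availability",
    "regulatory_environment",
    "deployment_friction",  # rated so that "high" means favourable (low friction)
]

RATING_SCORES = {"low": 0, "mid": 1, "high": 2}


def assess_deployment(ratings: dict) -> dict:
    """Summarise a five-factor rating into a rough maturity-prospect view."""
    missing = [f for f in FACTORS if f not in ratings]
    if missing:
        raise ValueError(f"missing factor ratings: {missing}")
    scores = {f: RATING_SCORES[ratings[f]] for f in FACTORS}
    binding = [f for f, s in scores.items() if s == 0]
    return {
        "total": sum(scores.values()),   # 0-10; higher suggests faster maturation
        "binding_constraints": binding,  # low-rated factors that need attention first
        "outlook": ("rapid maturation likely" if not binding
                    else "structural delays until binding factors improve"),
    }


# Example: strong task definition, feedback, and data, but a contested regulatory environment.
example = {
    "task_definition": "high",
    "feedback_signals": "high",
    "data_availability": "high",
    "regulatory_environment": "low",
    "deployment_friction": "mid",
}
print(assess_deployment(example))
```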
16.5 Capability maturity — TRL adapted for AI
The Technology Readiness Level (TRL) framework provides a structured approach to capability maturity. The original NASA framework defines nine levels:
- TRL 1 — Basic principles observed
- TRL 2 — Technology concept formulated
- TRL 3 — Experimental proof of concept
- TRL 4 — Technology validated in lab
- TRL 5 — Technology validated in relevant environment
- TRL 6 — Technology demonstrated in relevant environment
- TRL 7 — System prototype demonstration in operational environment
- TRL 8 — System complete and qualified
- TRL 9 — Actual system proven in operational environment
The framework has been adapted for AI by various organisations (NIST; the European Commission; the UK Defence Science and Technology Laboratory; specific firms). The AI adaptations preserve the nine-level structure while specifying AI-specific criteria for each level.
The AI-adapted TRL framework.
TRL 1 — Basic principles observed. The relevant AI capability is in active research; specific algorithmic or architectural ideas are being developed. The deployment scope is research literature.
TRL 2 — Technology concept formulated. Specific approaches to applying the basic principles to particular AI tasks are formulated. Conference papers, technical reports, and early prototypes characterise this level.
TRL 3 — Experimental proof of concept. Specific demonstrations on small-scale tasks. The capability has been shown to work in principle but not at scale or in realistic conditions.
TRL 4 — Technology validated in lab. The capability has been demonstrated on benchmark tasks with reasonable performance. Reproducibility has been established. The technology works in controlled conditions.
TRL 5 — Technology validated in relevant environment. The capability has been demonstrated on realistic tasks in conditions approximating production deployment. Performance is meaningful but not yet production-ready.
TRL 6 — Technology demonstrated in relevant environment. The capability operates in production-like environments with documented performance. Specific issues have been identified and mitigations implemented.
TRL 7 — System prototype demonstration in operational environment. The capability operates in actual production with limited scope. Real users interact with the system; real outcomes are observed; real issues emerge and are addressed.
TRL 8 — System complete and qualified. The capability operates in full production. Operational issues have been substantially addressed. The system is qualified for the specific deployment context.
TRL 9 — Actual system proven in operational environment. The system is in mature production use with sustained operational performance. The technology has been proven across the full operational distribution.
Capability vs robustness vs reliability. A specific complication for AI capability maturity is the distinction between capability, robustness, and reliability.
Capability addresses what the system can do at its best — the peak performance on specific tasks under favourable conditions.
Robustness addresses how the system performs across varying conditions — does performance degrade with input variation; does the system handle adversarial inputs; does it fail gracefully when conditions exceed its design envelope?
Reliability addresses consistency over time — does performance remain stable across extended deployment; does the system degrade gradually or catastrophically; does the maintenance burden remain manageable?
The TRL framework can be applied separately for each dimension. A system may have TRL 7 capability (works in production for the specific task) but TRL 4 robustness (fails under adversarial conditions) and TRL 5 reliability (performance degrades over time without intervention). The dimensional assessment supports more-rigorous evaluation than capability assessment alone provides.
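A minimal sketch of the dimensional assessment, in Python, follows. The convention of bounding the composite readiness by the weakest dimension is an illustrative assumption consistent with the discussion above, not a standardised TRL rule.
```python
from dataclasses import dataclass


@dataclass
class DimensionalTRL:
    capability: int   # peak performance on the task under favourable conditions
    robustness: int   # behaviour under varying or adversarial conditions
    reliability: int  # consistency over sustained deployment

    def effective_level(self) -> int:
        """Bound the composite readiness by the weakest dimension."""
        return min(self.capability, self.robustness, self.reliability)


# The example from the text: TRL 7 capability, TRL 4 robustness, TRL 5 reliability.
system = DimensionalTRL(capability=7, robustness=4, reliability=5)
print(system.effective_level())  # -> 4: the robustness gap dominates the assessment
```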
The 2024–2026 AI capability landscape. Foundation-model capability for many tasks has progressed substantially through 2023–2026. Specific applications are at high TRL (typically 7–9): customer-service handling for routine inquiries; coding assistance for routine implementation; document summarisation; basic content generation. Other applications remain at lower TRL: complex multi-step reasoning at production reliability; long-horizon planning; specific creative-judgment tasks; certain safety-critical contexts. The capability landscape is heterogeneous; assessment of specific deployments must address the specific application rather than applying generic AI-capability assumptions.
16.6 Firm-level AI maturity — five levels
A specific application of maturity frameworks is firm-level assessment — where does a particular organisation stand in its AI capability and deployment? The five-level framework adapts the CMMI structure to AI-specific dimensions.
Level 1 — Exploration. The firm is exploring AI capabilities through pilots, proofs of concept, and limited experimentation. Specific characteristics: ad hoc AI activity; no centralised AI strategy; individual teams exploring use cases; minimal AI-specific infrastructure; no AI-specific governance. Most firms in 2018–2020 were at this level; many firms in 2024–2026 are still at this level for specific functions even if other functions have progressed.
Level 2 — Experimentation. The firm has multiple active AI projects with some level of strategic coordination. Specific characteristics: AI strategy exists but is uneven across functions; some shared AI infrastructure (typically cloud-based foundation-model APIs and basic ML platforms); emerging AI-specific governance and policy; AI talent recruitment underway; operational AI deployments at limited scale. Many firms in 2024–2026 are at this level.
Level 3 — Operationalisation. The firm has operational AI deployments at scale across multiple functions. Specific characteristics: clear AI strategy with executive sponsorship; substantial shared AI infrastructure (data infrastructure; ML platforms; foundation-model integration; operations-and-monitoring); AI-specific governance with clear policy and review processes; substantial AI talent across the organisation; operational AI deployments contributing measurable business value. Major banks, retailers, manufacturers, and adjacent firms increasingly reach this level through 2024–2026.
Level 4 — Integration. AI is deeply integrated into the firm’s core operations. Specific characteristics: AI strategy is integrated with overall business strategy; substantial AI-and-digital infrastructure as core operational capability; comprehensive AI governance integrated with broader risk management; AI-fluent leadership across the organisation; AI deployment contributing substantially to business outcomes; AI-driven business model evolution. Specific exemplar firms (Google, Microsoft, Meta, Amazon in technology; Stripe in financial services; Stitch Fix in retail; specific advanced manufacturers) approach this level. Iansiti and Lakhani’s “AI-first firms” framework (Section 16.9) captures this level.
Level 5 — Transformation. AI fundamentally transforms the firm’s business model and operating structure. Specific characteristics: the firm’s competitive advantage substantially derives from AI capability; the operating structure is built around AI as core infrastructure; the firm’s economics differ structurally from traditional firms in the same industry; the firm’s value chain has been substantially reshaped by AI integration. Few firms have reached this level in 2026; the most-prominent examples are specific AI-native firms (OpenAI, Anthropic, the major foundation-model providers) and specific AI-transformed firms (some specific consumer-internet firms; particular financial-services firms).
The progression dynamics. The progression through maturity levels is not automatic. Several patterns recur:
Stuck at Level 2. Many firms reach Level 2 (multiple AI experiments) but struggle to progress to Level 3 (operational deployment at scale). The barriers include: insufficient operational integration; unclear value-capture mechanisms; talent constraints; governance complications; competing priorities. The progression from 2 to 3 is the most-difficult transition for many firms.
Uneven across functions. Many firms have different maturity levels across different functions. A firm may be at Level 3 in marketing AI, Level 2 in operations AI, and Level 1 in HR AI. The unevenness produces specific operational challenges; cross-functional AI initiatives face the constraint of the lowest-maturity function involved.
Progression requires investment. Each level transition requires substantial investment in infrastructure, talent, governance, and strategic alignment. Firms that progress without adequate investment produce surface-level claims of maturity that operational reality does not bear out.
Level 4 and 5 are different in kind. Reaching Level 4 (integration) and Level 5 (transformation) requires more than incremental progression; it requires fundamental changes in how the firm operates and what its competitive position is. Most firms will not reach these levels without major strategic transformation.
Assessment in practice. Firm-level assessment typically uses a combination of methods: self-assessment surveys filled out by senior executives; structured interviews with AI leaders; review of AI-related documentation (strategy documents; governance frameworks; deployment inventories); benchmark comparison against peer firms. The assessment is necessarily approximate; the value comes from the systematic identification of strengths, gaps, and improvement priorities.
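The unevenness point from the progression dynamics can also be made concrete. The sketch below, in Python, records per-function maturity levels and applies the rule that a cross-functional initiative is constrained by its least-mature function; the function names and levels are illustrative, echoing the example earlier in this section.
```python
# Per-function maturity levels (1-5), following the five-level framework.
FIRM_MATURITY = {
    "marketing": 3,
    "operations": 2,
    "hr": 1,
    "finance": 2,
}


def cross_functional_ceiling(functions, maturity):
    """A cross-functional AI initiative is bounded by its least-mature function."""
    return min(maturity[f] for f in functions)


# Example: an initiative spanning marketing and HR inherits HR's Level 1 constraint.
print(cross_functional_ceiling(["marketing", "hr"], FIRM_MATURITY))  # -> 1
```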
16.7 Application-level maturity — proof to production
Beyond firm-level assessment, individual AI applications progress through a maturity sequence. The application-level framework supports specific decisions about when to advance an application from one stage to the next.
Stage 1 — Proof of concept. A small-scale demonstration that the AI capability works for the specific task. The deployment scope is typically a small dataset, a specific test environment, and limited user exposure. Specific characteristics: minimal infrastructure; limited evaluation; no production integration; success criteria for advancement focus on whether the technical approach can produce useful results.
Stage 2 — Pilot. A controlled deployment with real users in real conditions but limited scope. The deployment scope is typically a small subset of the production environment with explicit boundaries. Specific characteristics: production-like infrastructure; structured evaluation against specific metrics; limited integration with production systems; success criteria for advancement include both technical performance and operational integration.
Stage 3 — Limited deployment. Production use within bounded contexts. The deployment scope is typically a subset of the full production environment with deliberate constraints. Specific characteristics: production infrastructure with operational support; comprehensive evaluation against business metrics; integration with relevant production systems; success criteria for advancement include sustained performance, manageable operational burden, and demonstrated business value.
Stage 4 — Production deployment. Full production use across the relevant scope. Specific characteristics: complete production infrastructure with full operational support; comprehensive monitoring and maintenance; integration with all relevant production systems; success criteria for ongoing operation include sustained performance, ongoing improvement, and contribution to business outcomes.
Stage 5 — Mature operations. Sustained production operations with continuous improvement. Specific characteristics: optimised production operations; substantial accumulated learning from operational data; continuous improvement through the data flywheel; integration with adjacent applications and broader business processes; the deployment is part of the firm’s competitive infrastructure rather than a discrete project.
The transition dynamics. The transition between stages requires specific decisions and investments. Premature transition (advancing before the current stage’s success criteria are met) produces deployment failures; the cautionary cases of Part II include several premature-transition examples. Excessively cautious transition (delaying advancement after success criteria have been met) produces opportunity costs; firms that get stuck at limited deployment when they should advance to production deployment forgo the value that production-scale deployment would produce.
The Part V playbook chapters (Chapters 19–28) specifically address the transition decisions for the unit’s worked example. Chapter 24 (Alpha) addresses the Stage 2 → Stage 3 transition with explicit discipline. Chapter 25 (Beta) addresses the Stage 3 → Stage 4 transition. Chapter 28 (Commercialisation) addresses the broader transition from initial deployment to sustained operations. The playbook discipline is the operational application of the maturity framework that this chapter develops.
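The transition discipline can be expressed as a stage-gate check, sketched minimally in Python below. The stage names follow this section; the gate criteria are placeholder strings standing in for the specific operational and business metrics a real deployment would agree before each transition.
```python
STAGES = ["proof_of_concept", "pilot", "limited_deployment", "production", "mature_operations"]

# Placeholder gate criteria: every criterion must be evidenced before advancing out of a stage.
GATE_CRITERIA = {
    "proof_of_concept": ["technical approach produces useful results"],
    "pilot": ["target metrics met with real users",
              "operational integration demonstrated"],
    "limited_deployment": ["sustained performance",
                           "manageable operational burden",
                           "demonstrated business value"],
    "production": ["sustained performance at full scale",
                   "ongoing improvement",
                   "contribution to business outcomes"],
    # mature_operations has no further gate; it is sustained rather than exited.
}


def may_advance(stage: str, evidence: set) -> bool:
    """Advance only when every gate criterion for the current stage is evidenced."""
    return all(criterion in evidence for criterion in GATE_CRITERIA.get(stage, []))


# Example: a pilot with metric evidence but no integration evidence should not advance.
print(may_advance("pilot", {"target metrics met with real users"}))  # -> False
```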
Common patterns in stage progression.
The lab-to-pilot gap. Many AI applications work well in lab conditions but fail to transition to pilot. The gap reflects the difference between controlled-environment performance and real-world performance; specific issues (data drift; user behaviour; edge cases; integration friction) emerge that lab evaluation did not surface.
The pilot-to-limited-deployment gap. Many AI applications work well in pilot but struggle to scale to limited deployment. The gap reflects the difference between bounded-pilot conditions and broader production conditions; specific issues (cost economics; operational burden; integration complexity) emerge that pilot evaluation did not surface.
The limited-deployment-to-production gap. Many AI applications work well in limited deployment but face challenges in scaling to full production. The gap reflects the difference between deliberate-constraint conditions and full-production conditions; specific issues (performance at full scale; broader business-process integration; competitive dynamics) emerge.
The production-to-mature-operations gap. Many AI applications reach production but stall before mature operations. The gap reflects the difference between achieving deployment and producing sustained value; specific issues (ongoing maintenance burden; competitive responses; technology drift) constrain the application’s value contribution.
The framework supports systematic identification of where specific applications stand and what advancement requires. Rigorous application-level maturity assessment is what separates well-managed AI deployment from the cautionary-case patterns of Part II.
16.8 Sector-level maturity — extending §12.8
Section 12.8 introduced the sector-level maturity framework with three categories (mature; mid-stage; early-stage) and five factors driving maturation speed. This section extends the framework with more-detailed analysis.
The mature deployment sectors revisited. The sectors and applications identified in §12.8 as mature share specific characteristics that the five factors capture systematically:
Programmatic advertising (Section 10.1): Sharp task definition (predict click-through rate; optimise bid); clear feedback signals (clicks and conversions); abundant data (impression histories at major platforms); established regulatory environment (with progressive privacy-driven shifts); low integration friction (web-based deployment).
Consumer recommendation systems (Sections 8.1, 10.5): Sharp task definition (recommend items to users); clear feedback signals (engagement and consumption metrics); substantial data (user-behaviour history at major platforms); manageable regulatory environment (with progressive content-and-platform regulation); manageable integration friction (web-based deployment with API integration).
Imaging-AI in radiology (Section 7.2): Sharp task definition (detect specific findings in specific imaging modalities); clear feedback signals (radiologist confirmation and clinical outcomes); growing data (large imaging datasets at major academic medical centres); established regulatory environment (FDA AI/ML SaMD; EU MDR); high integration friction but manageable (PACS integration is engineered).
Predictive maintenance in manufacturing (Section 9.2): Sharp task definition (predict equipment failure); clear feedback signals (equipment outcomes); abundant data (sensor networks at major manufacturers); manageable regulatory environment; high but addressed integration friction (manufacturing systems are integrated).
The mature sectors share high scores on the five factors; the maturity is not coincidental but follows directly from factor alignment.
The mid-stage sectors revisited. Sectors at mid-stage maturity face specific factor constraints:
Generative AI in ad creative (Section 10.2): Task definition is moderate (creative generation has more ambiguity than CTR prediction); feedback signals are clear in some contexts but ambiguous in others (long-run brand effects are harder to measure); data availability is adequate; regulatory environment is established for advertising broadly but new for AI-generated creative; deployment friction is low. Mid-stage maturity reflects task-definition and feedback-signal limitations.
Drug discovery with AI (Section 7.4): Task definition is moderate (drug discovery involves substantial multi-objective optimisation); feedback signals are very delayed (drug efficacy emerges over years); data availability is moderate (patent and trial datasets are substantial but uneven); regulatory environment is established but layered with complexity; integration friction is high (drug discovery requires extensive lab-and-clinical infrastructure). Mid-stage maturity reflects feedback-signal delay and integration friction.
Manufacturing computer-vision QA (Section 9.3): Task definition is sharp; feedback signals are clear; data availability is moderate (each manufacturer’s defect distributions differ); regulatory environment is established; integration friction is high (manufacturing-line integration is substantial). Mid-stage to mature; specific deployments at advanced manufacturers (ViTrox; specific others) approach mature, while broader-industry adoption is mid-stage.
Agricultural precision applications (Sections 11.5–11.6): Task definition is moderate; feedback signals are seasonal-delayed; data availability is uneven across farm sizes; regulatory environment is light; integration friction is high (equipment and infrastructure investment). Mid-stage maturity reflects data and integration-friction constraints.
Legal and accounting AI (Sections 11.10–11.11): Task definition varies sharply across applications (sharp for document review; moderate for legal research; ambiguous for legal strategy); feedback signals are uneven; data availability is moderate (firms have substantial document repositories but legal-and-confidentiality issues constrain training); regulatory environment is established with new AI-specific overlays; integration friction is moderate. Mid-stage maturity with specific applications approaching mature.
The early-stage and contested sectors revisited. Sectors at early stage face binding constraints on multiple factors:
Autonomous vehicles (Section 12.4): Task definition is sharp at the high level but ambiguous at the operational level (handle all driving conditions safely); feedback signals are clear but rare-event problems (catastrophic failures are rare but important); data availability is substantial but coverage of long-tail conditions is uneven; regulatory environment is developing; integration friction is high (vehicle-and-infrastructure integration). Early-stage maturity reflects multi-factor constraints.
Generative video at production scale (Section 10.6): Task definition is moderate; feedback signals are uneven; data availability is constrained by IP-and-rights considerations; regulatory environment is developing; integration friction is moderate. Early-stage maturity reflects rights and capability constraints.
Government and public-sector AI (Section 12.1): Task definition varies; feedback signals are slow and complicated by political dynamics; data availability is constrained by privacy and consent; regulatory environment is developing post-Robodebt; integration friction is high. Early-stage maturity with substantial caution post-Robodebt.
The factor-based analysis explains the maturity differential; the framework supports prediction of which mid-stage sectors will progress to mature and what conditions would need to develop for early-stage sectors to advance.
The factor evolution. A specific dimension of sector-level maturity is how the factors evolve over time. Capability advancement can sharpen task definition (more capable models can address more-ambiguous tasks). Deployment scaling can produce feedback signals where none existed (early-deployment data accumulates into evaluation infrastructure). Data availability can grow over time (operational deployments produce data; data partnerships emerge; synthetic data generation can supplement scarce real data). Regulatory environments evolve toward stability (the EU AI Act’s progressive implementation through 2025–2027 will substantially reduce regulatory uncertainty). Integration friction can decrease (standardised protocols like MCP, Section 13.13, reduce integration burden).
The factor evolution supports prediction: sectors where multiple factors are improving will likely advance in maturity through 2026–2030; sectors where binding factors remain constrained will remain at current maturity stages until the binding factors evolve.
16.9 The Iansiti-Lakhani factor integration
The Iansiti and Lakhani (2020) “Competing in the Age of AI” framework, introduced in Chapter 3, provides specific dimensions that integrate with the maturity framework.
The data-flywheel as a maturity dimension. The data-flywheel concept (Section 12.9) is a specific maturity dimension. A deployment with a functioning data flywheel — where operational use generates data that improves the system, which produces better outcomes that generate more usage — has higher maturity than a deployment without one. The data flywheel is the operational mechanism through which the feedback-signal factor (Section 16.4 Factor 2) and the data-availability factor (Section 16.4 Factor 3) combine into sustained competitive advantage.
The successful data flywheels identified in Section 12.9 (Stitch Fix; Amazon recommendation; Netflix recommendation; Stripe Radar; GE Aviation engine twins; programmatic advertising) all reflect mature deployment-and-firm combinations where the flywheel operates. The failed data flywheels (Watson Health; Tradelens; various government-AI deployments) reflect immature deployments where the flywheel structure was not in place.
The operational architecture as a maturity dimension. Iansiti and Lakhani’s framework emphasises the AI-first firm’s operational architecture — the explicit design of operations around data, software, and AI as core infrastructure rather than as adjuncts to traditional operations. The architecture dimension overlaps substantially with firm-level Levels 4 (Integration) and 5 (Transformation) from Section 16.6.
The architecture dimension is binary in some respects (a firm either is or isn’t designed around AI as core) and continuous in others (firms can progressively re-architect operations toward AI-first). The contemporary pattern is that most firms are progressively re-architecting; few have completed the transition. The transition is structurally costly and politically contested within firms; the maturity progression on this dimension is genuinely difficult.
Connection to Part I §3. Chapter 3 covered the Iansiti-Lakhani framework in detail. The integration with the maturity framework: the AI factory framework provides the organisational architecture; the maturity framework provides the assessment of how mature the architecture is. Both perspectives are necessary for comprehensive analysis.
16.10 Maturity assessment methodologies
Maturity frameworks support assessment only if the assessment is methodologically sound. Several methodologies are used in practice.
Self-assessment surveys. The most-common assessment methodology is self-assessment by the firm or stakeholders involved. Surveys typically ask respondents to rate their organisation on specific dimensions; aggregated responses produce maturity assessments. The methodology is low-cost and easy to administer. The methodology has substantial limitations: respondents may have incomplete information; respondents may have incentives to overstate maturity (to support strategic narratives) or understate (to justify resource requests); the rating scale’s interpretation varies across respondents.
Despite limitations, self-assessment surveys are useful for specific purposes: tracking changes within a firm over time (where the consistent respondent set reduces variation); identifying perception gaps across functions (where divergent responses signal organisational issues); supporting initial conversations about improvement priorities.
Independent audits. A more-rigorous methodology is independent audit by external assessors. The auditors review documentation, conduct structured interviews, observe operations, and produce assessment based on standardised criteria. The methodology is more-costly but produces more-reliable assessment. The audit ecosystem is developing through 2024–2026; specific firms (Big Four; specialised AI-audit firms; academic centres) provide audit services with varying quality and scope.
Independent audits are increasingly required by regulatory frameworks (the EU AI Act’s conformity assessment for high-risk systems; NYC Local Law 144’s bias-audit requirements; specific sector-specific audit requirements). The assessment methodology is therefore maturing both because of internal demand from firms seeking rigorous assessment and external demand from regulators requiring independent verification.
Benchmark studies. Cross-firm benchmark studies provide context for individual-firm assessment. Industry-research organisations (Gartner; Forrester; IDC; specific consulting firms) publish benchmark studies that compare maturity across firms within industries. The benchmarks support relative assessment (where does this firm stand relative to peers?) but are subject to methodological limitations (the methodology is opaque; the sample selection is unclear; the participating firms may not be representative).
Specific benchmark studies through 2024–2026 have included: McKinsey’s annual State of AI survey; BCG’s AI Maturity studies; Deloitte’s State of AI reports; specific industry-vertical benchmarks. The cumulative benchmark literature is substantial but uneven in quality.
Hybrid methodologies. Rigorous practice typically combines methodologies: self-assessment for ongoing tracking; independent audit for periodic comprehensive review; benchmark studies for context. The combination produces more-reliable assessment than any single methodology provides.
The methodology choice. The choice of methodology depends on the assessment purpose. Strategic decisions call for longer-cycle, audit-based assessment. Investment decisions benefit from due-diligence audits. Regulatory compliance often requires independent audit. Ongoing tracking and improvement are well served by self-assessment. The assessment design should match the decision it supports.
16.11 AI Deployment Maturity Levels — explicit framework
Synthesising the prior sections, the AI Deployment Maturity Levels (ADML) framework provides explicit composite levels that combine capability and deployment dimensions. The framework supports systematic comparison across deployments.
ADML 0 — Pre-deployment. No operational deployment exists. The capability may be at any TRL; the deployment context is not yet established. Most “AI products” announced but not yet deployed sit at this level despite marketing claims.
ADML 1 — Initial deployment. First operational deployment with limited scope, limited users, and explicit experimental framing. Capability typically at TRL 6–7; deployment context typically Level 1–2 firm maturity. Common characteristics: pilot or limited-deployment status; explicit failure-tolerance framing; limited business-process integration; substantial human oversight.
ADML 2 — Operational deployment. Production deployment with documented operational performance. Capability typically at TRL 7–8; deployment context typically Level 2–3 firm maturity. Common characteristics: production status with bounded scope; structured operational metrics; defined business-process integration; structured human-oversight or escalation.
ADML 3 — Mature operational deployment. Sustained production deployment with ongoing operational and business-value delivery. Capability typically at TRL 8–9; deployment context typically Level 3–4 firm maturity. Common characteristics: scale production status; comprehensive operational and business metrics; integrated with broader business processes; continuous improvement through data flywheel.
ADML 4 — Strategic capability. The deployment provides substantial competitive advantage and is integrated with the firm’s broader strategic positioning. Capability at TRL 9; deployment context at Level 4 firm maturity. Common characteristics: substantial business-value contribution; competitive differentiation derived from the deployment; ongoing investment supports continuous capability advancement; the deployment is part of the firm’s competitive infrastructure.
ADML 5 — Foundation capability. The deployment is part of the firm’s core foundation, with the firm structurally built around the AI capability. Capability at TRL 9 with continuous advancement; deployment context at Level 5 firm maturity. Common characteristics: the firm’s economics depend substantially on the AI capability; the firm cannot operate without the AI capability; the firm’s competitive position is structurally derived from the AI capability.
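The typical pairings listed for each level suggest a rough composite lookup, sketched below in Python. The thresholds are an illustrative heuristic drawn from those descriptions, not a formal scoring rule; borderline cases require the judgment that the preceding sections develop.
```python
def indicative_adml(deployed: bool, trl: int, firm_level: int) -> int:
    """Return a rough ADML level from capability maturity (TRL 1-9) and firm maturity (1-5)."""
    if not deployed:
        return 0          # ADML 0: pre-deployment, regardless of TRL
    if trl >= 9 and firm_level >= 5:
        return 5          # foundation capability
    if trl >= 9 and firm_level >= 4:
        return 4          # strategic capability
    if trl >= 8 and firm_level >= 3:
        return 3          # mature operational deployment
    if trl >= 7 and firm_level >= 2:
        return 2          # operational deployment
    return 1              # initial deployment


# Example: a TRL 8 system at a Level 3 firm sits around ADML 3.
print(indicative_adml(deployed=True, trl=8, firm_level=3))  # -> 3
```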
Examples by ADML level.
ADML 5 — Foundation: Google’s search business depends fundamentally on AI capability; the firm cannot operate without it. Stripe’s fraud-prevention business depends on Stripe Radar; the AI capability is foundational. Specific other contexts where AI is the core business.
ADML 4 — Strategic: Netflix’s recommendation infrastructure provides substantial competitive advantage. Amazon’s recommendation and logistics-optimisation infrastructure are strategically central. Stitch Fix’s data-flywheel-driven personalisation supports the firm’s distinctive market position.
ADML 3 — Mature operational: Many specific operational AI deployments at major banks (fraud detection; underwriting); manufacturers (predictive maintenance; quality control); healthcare systems (specific imaging applications; ambient scribes at deployment scale); retailers (specific recommendation and inventory applications).
ADML 2 — Operational: Specific AI applications at firms across sectors that have moved past initial deployment to bounded production use. The bulk of contemporary commercial AI deployment sits at ADML 2.
ADML 1 — Initial: The substantial layer of pilot and limited-deployment activity across firms in 2024–2026.
ADML 0 — Pre-deployment: Many “AI initiatives” and “AI products” that have been announced but not yet operationally deployed.
The framework’s use. The ADML framework supports specific decisions: whether to commit to a specific deployment (the level should match the firm’s risk tolerance and strategic ambitions); whether to invest in a firm’s AI products (the level should match the investment thesis); whether to regulate specific deployments (the level should match the regulatory framework); how to evaluate AI claims (the claimed level should be supported by the deployment evidence).
The framework also supports more-rigorous public discourse about AI deployment. Many “AI deployment” claims that receive media attention are at ADML 0 or 1 — pre-deployment or initial deployment — but framed as if at ADML 4 or 5. The framework supports the distinction; rigorous evaluation can identify where specific claims sit on the actual deployment spectrum.
16.12 Pathologies of premature claims of maturity
The cautionary cases of Part II provide systematic evidence on pathologies of premature maturity claims. The patterns recur across cases and constitute one of the framework’s most-valuable diagnostic tools.
Pathology 1 — Broad framing without operational definition. The pattern: a deployment is announced with broad framing (“AI for medicine”; “AI for trade”; “AI for customer service”) that does not specify operational tasks. The implicit maturity claim is high (the broad framing implies the deployment can address the full domain), but the actual capability is much narrower. The Watson Health case (Section 7.3) is the canonical example.
The pathology is diagnostic: when a deployment is announced with broad framing but cannot specify which operational tasks it actually addresses with what reliability, the maturity claim is overstated. The diagnosis should produce skepticism about the claim and demand for more-specific operational characterisation.
Pathology 2 — Brand or political momentum substituting for evaluation. The pattern: a deployment is committed to based on strategic, brand, or political momentum rather than on rigorous evaluation. The momentum produces commitment that subsequent evidence cannot easily reverse. The Klarna case (Section 8.4) and the Robodebt case (Section 12.1) both exhibit this pattern.
The pathology is diagnostic: when deployment commitment is announced without supporting evaluation evidence, the maturity claim should be questioned. The evaluation evidence — what specific operational metrics support the claim; what specific deployment evidence has been produced — is what supports rigorous maturity assessment.
Pathology 3 — Alpha-skipping or staged-rollout failure. The pattern: a deployment is rolled out at scale without intermediate staging, and the scale produces failure modes that intermediate staging would have surfaced. Klarna’s full-scale customer-service deployment without staged rollout, Robodebt’s nationwide rollout without proportional appeal-pathway scaling, and the Boeing 737 MAX entering service without additional pilot training on its new flight-control behaviour are structurally similar examples.
The pathology is diagnostic: when a deployment claims production maturity without evidence of intermediate-stage success (Section 16.7’s transition discipline), the claim should be questioned. The staged-rollout evidence — what specific intermediate stages were completed; what evidence supported each transition — is what supports rigorous maturity assessment.
Pathology 4 — Single points of failure that cannot be assessed. The pattern: a deployment depends on specific components (data sources; specific algorithms; specific operational dependencies) whose failure produces systemic failure. The reliance on a single angle-of-attack (AoA) sensor in the Boeing 737 MAX, the reliance on income averaging in Robodebt, and the reliance on the Facebook Graph API in the Cambridge Analytica case are structurally similar. The pathology is partly architectural: rigorous design should have identified and mitigated the single points of failure.
The pathology is diagnostic: when a deployment cannot be fully assessed because specific components are opaque or untested, the maturity claim should be questioned. The transparency evidence — what specific components are involved; what specific failure modes have been considered; what specific mitigations exist — is what supports rigorous maturity assessment.
Pathology 5 — Defensive post-incident management. The pattern: when problems emerge during deployment, the response is defensive (defending the deployment; attributing problems to users; resisting substantive review) rather than constructive (acknowledging problems; investigating root causes; implementing corrections). The pattern recurs across the cautionary cases — Watson Health’s defensive response to early evidence of clinical problems; Klarna’s resistance to acknowledging the customer-service degradation; Robodebt’s defensive posture toward early criticism; Cambridge Analytica’s initial defensive response to revelations.
The pathology is diagnostic: when a deployment’s incident-response framework is unclear or defensive, the maturity claim should be questioned. The incident-response evidence — what specific frameworks exist for handling problems; what specific cases have been handled and how — is what supports rigorous maturity assessment.
The diagnostic application. The five pathologies provide diagnostic tools for evaluating maturity claims. When a specific deployment claim exhibits one or more pathologies, the claim is likely overstated; when the deployment’s actual evidence does not support the maturity level claimed, the gap between claim and evidence is the risk that subsequent failure surfaces.
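The five pathologies can be applied as a red-flag checklist. The sketch below is one minimal way to encode that checklist; the evidence fields and their mapping to the pathologies are illustrative assumptions, not a prescribed instrument.

```python
from dataclasses import dataclass

@dataclass
class DeploymentClaim:
    """Illustrative evidence record for a single deployment claim."""
    operational_tasks_defined: bool   # counters Pathology 1
    evaluation_evidence_cited: bool   # counters Pathology 2
    staged_rollout_documented: bool   # counters Pathology 3
    failure_points_assessed: bool     # counters Pathology 4
    incident_response_defined: bool   # counters Pathology 5

def red_flags(claim: DeploymentClaim) -> list:
    """Return the pathologies of Section 16.12 that the claim exhibits."""
    checks = {
        "P1 broad framing without operational definition": claim.operational_tasks_defined,
        "P2 momentum substituting for evaluation": claim.evaluation_evidence_cited,
        "P3 alpha-skipping / no staged rollout": claim.staged_rollout_documented,
        "P4 unassessed single points of failure": claim.failure_points_assessed,
        "P5 no credible incident-response framework": claim.incident_response_defined,
    }
    return [pathology for pathology, evidenced in checks.items() if not evidenced]

# A claim with broad framing and no staged-rollout evidence triggers two
# red flags and should be treated as overstated until the gap is closed.
claim = DeploymentClaim(False, True, False, True, True)
print(red_flags(claim))
```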
The pathologies do not recur across Part II’s cautionary cases by coincidence. The patterns are structural to AI deployment; the discipline of avoiding them is what distinguishes successful from failed deployments. Part V’s playbook chapters operationalise that discipline: Chapter 23’s evaluation framework, Chapter 24’s alpha discipline, and Chapter 25’s beta discipline directly address the pathologies.
16.13 Operational implications for management
The maturity framework has specific operational implications for different stakeholder groups.
For deployers. Firms deploying AI applications should assess maturity rigorously before commitment, during deployment, and during operations.
Pre-commitment assessment should evaluate both capability and deployment maturity for the specific application. The five-factor framework (Section 16.4) supports systematic evaluation. Where specific factors are weak, deployment commitment should be conditional on factor improvement or on an explicit decision to accept the risk.
During-deployment monitoring should track maturity progression against the application-level framework (Section 16.7). Stage advancement should be conditional on success criteria for the prior stage being met; premature advancement is the source of many cautionary-case failures.
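A minimal sketch of stage-gated advancement follows. The stage labels and success criteria are placeholder assumptions; the point is the gating rule itself: no advancement until every criterion for the current stage is evidenced.

```python
# Stage labels and success criteria are placeholder assumptions; the gating
# rule is the point: advance only when every criterion is evidenced.
STAGE_GATES = {
    "pilot": {"task-level accuracy target met", "stakeholder sign-off obtained"},
    "alpha": {"error rate within tolerance on live traffic", "escalation pathway exercised"},
    "beta": {"performance sustained over the review period", "incident response tested"},
}

def may_advance(current_stage: str, evidence: set) -> bool:
    """Advance past the current stage only if all of its gate criteria are met."""
    return STAGE_GATES[current_stage] <= evidence  # subset test

# An alpha deployment with only one of its two criteria evidenced must not
# advance; premature advancement is the recurring cautionary-case failure.
print(may_advance("alpha", {"error rate within tolerance on live traffic"}))  # False
```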
Operational assessment should track ongoing maturity. The data-flywheel dynamics (Section 16.9) require ongoing maintenance; deployment maturity can regress if operational conditions deteriorate.
For investors. Investors evaluating AI investments should assess both firm-level maturity (Section 16.6) and application-level maturity (Section 16.7) for the specific investment thesis.
The investment-due-diligence process should include explicit maturity assessment. The methodology (Section 16.10) — combining self-assessment with independent verification and benchmark comparison — supports rigorous due diligence. The pathologies (Section 16.12) provide diagnostic warnings; investments in firms exhibiting pathologies should be priced accordingly.
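The sketch below illustrates one way the three due-diligence inputs of Section 16.10 (self-assessment, independent verification, benchmark comparison) might be reconciled; the conservative-minimum rule is an illustrative assumption, reflecting the tendency of self-assessment to inflate.

```python
def reconciled_maturity(self_assessed: int, independently_verified: int,
                        benchmark_implied: int) -> int:
    """Take the most conservative of the three inputs, since self-assessment
    tends to overstate the deployed level."""
    return min(self_assessed, independently_verified, benchmark_implied)

# A firm self-assessing at ADML 4 but independently verified only at ADML 2
# should be diligenced and priced as an ADML 2 deployment.
print(reconciled_maturity(4, 2, 3))  # prints 2
```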
The investment thesis should be explicit about the maturity assumptions. An investment thesis that assumes ADML 4 (strategic) should be supported by evidence of ADML 4 deployment; an investment in an ADML 1 (initial) deployment should be priced as the early-stage investment it is.
For regulators. Regulators developing AI-related rules should consider where deployments stand on maturity. Regulatory requirements that align with deployment maturity support successful deployment without imposing infeasible requirements; requirements that exceed current maturity may delay beneficial deployment without producing proportional safety gains.
The risk-based framework of the EU AI Act (Chapter 14) implicitly incorporates maturity reasoning; explicit maturity-framework integration could support more-rigorous regulatory development. The 2026–2030 regulatory trajectory will likely produce more-explicit maturity-framework integration as the field’s understanding deepens.
For evaluators. Researchers, journalists, and analysts evaluating AI claims should apply the maturity framework systematically. The pathologies (Section 16.12) provide red-flag indicators. The ADML framework (Section 16.11) provides composite assessment. The dimensional decomposition (capability vs deployment; the five factors; the application-level stages) supports rigorous evaluation that distinguishes substantive from inflated claims.
Connection to Part V playbook discipline. The Part V playbook chapters explicitly apply the maturity framework. Chapter 19 (idea selection) addresses maturity at the strategic level. Chapter 21 (MVP) addresses operational definition. Chapter 23 (evaluation) addresses the metrics that support maturity assessment. Chapter 24 (alpha) addresses Stage 2 → Stage 3 transition. Chapter 25 (beta) addresses Stage 3 → Stage 4 transition. Chapter 28 (commercialisation) addresses the broader maturity progression. The playbook is the operational application of the analytical framework that this chapter develops.
16.14 The forward trajectory of maturity assessment
Five trajectories define the forward outlook for maturity assessment.
Trajectory 1 — methodology maturation. Maturity assessment methodologies will continue to develop through 2026–2030. The audit ecosystem will mature; standardised assessment criteria will emerge; benchmark methodology will improve. The field will become substantially more rigorous; the framework’s diagnostic power will increase.
Trajectory 2 — regulatory integration. Regulatory frameworks will increasingly integrate maturity assessment. The EU AI Act’s conformity assessment, the FDA AI/ML SaMD framework, and the various sector-specific frameworks will each progressively elaborate the implicit maturity reasoning currently embedded in their structures. The 2030 regulatory landscape will likely include explicit maturity-framework references.
Trajectory 3 — capability-deployment co-evolution. The relationship between capability maturity and deployment maturity will continue to evolve. Capability advancement will support broader deployment in some contexts; deployment-environment evolution (regulatory; infrastructure; talent) will support broader applications in others. The co-evolution is bidirectional; both dimensions will be areas of active development.
Trajectory 4 — sector trajectory clarification. The mid-stage sectors of Section 16.8 will progressively clarify. Some will advance to mature; others will face structural constraints that prevent advancement. The 2030 picture will be substantially clearer than the 2026 picture; specific sectors will have demonstrated trajectories.
Trajectory 5 — pathology recognition. The diagnostic pathologies of Section 16.12 will become more-broadly recognised through the cumulative case material. Future cautionary cases will exhibit similar patterns; the patterns will become more-established as diagnostic tools. The 2030 field will have a more-developed pathology framework; rigorous practitioners will use it to identify problems early.
The bridge to subsequent Part III chapters: Chapter 17 integrates the analytical frameworks of Chapters 13–16 into the broader synthesis. Chapter 18 returns to specific cases at greater depth, applying the integrated frameworks to additional case material.
The maturity framework is not a checklist; it is a diagnostic tool that supports rigorous analysis of AI deployment. The framework’s value comes from systematic application: assessing capability and deployment dimensions independently; evaluating specific factors against evidence; recognising pathologies that signal overstated claims; and grounding decisions in an assessment that precedes them. The discipline of maturity assessment is among the most-valuable analytical capabilities that AI deployment work requires; the framework’s value is realised in the discipline of its application.
References for this chapter
Maturity-framework foundations
- Software Engineering Institute, Carnegie Mellon University (1991, current). Capability Maturity Model (CMM) and its successor, Capability Maturity Model Integration (CMMI).
- NASA (1989, current). Technology Readiness Levels Definitions.
- European Commission (2014). Horizon 2020 TRL definitions.
- US Department of Defense (2011). Technology Readiness Assessment Guidance.
AI maturity frameworks
- Iansiti, M. and Lakhani, K. R. (2020). Competing in the Age of AI. Harvard Business Review Press.
- McKinsey & Company (2023, 2024). The State of AI surveys.
- Boston Consulting Group (2024). AI Maturity Index.
- Deloitte (2024). State of AI in the Enterprise.
- Gartner (2024). Hype Cycle for Artificial Intelligence.
Cloud and digital-transformation maturity
- Amazon Web Services (2024). AWS Cloud Adoption Framework.
- Microsoft (2024). Cloud Adoption Framework.
- Google Cloud (2024). Cloud Adoption Framework.
- MIT Center for Digital Business (2024). Digital Transformation Maturity Model.
Specific case studies
- IBM (2018, 2022). Watson Health communications and divestiture announcements.
- Klarna AB (2024, 2025). AI customer service deployment and reversal communications.
- A.P. Møller-Maersk (2018, 2022). Tradelens launch and closure announcements.
- Royal Commission into the Robodebt Scheme (2023). Final Report.
- US Federal Trade Commission (2019). Settlement with Facebook Inc.
Audit and assessment methodology
- US Federal Reserve (2024). Supervisory Letter on AI use in financial services.
- New York City Council (2021). Local Law 144 (effective 2023).
- US National Institute of Standards and Technology (2023). AI Risk Management Framework.
- European Commission AI Office (2025). High-risk AI conformity assessment guidance.
Foundational labour and innovation literature
- Acemoglu, D. and Restrepo, P. (2020). Robots and jobs: Evidence from US labor markets. Journal of Political Economy 128(6): 2188–2244.
- Brynjolfsson, E. and Hitt, L. M. (2003). Computing productivity: Firm-level evidence. Review of Economics and Statistics 85(4): 793–808.
Cross-sector synthesis literature
- Iansiti, M. and Lakhani, K. R. (2020). Competing in the Age of AI. Harvard Business Review Press.
- Brynjolfsson, E. and McAfee, A. (2017). Machine, Platform, Crowd. W. W. Norton.
- Davenport, T. H. and Mittal, N. (2023). All-in on AI. Harvard Business Review Press.