Chapter 20 — Week 2: Customer discovery for AI products

Welcome to Week 2. Last week you committed to an idea on the basis of 8 short pre-validation calls and your own pattern recognition. This week you turn that hypothesis into a falsifiable, evidence-backed problem statement through 20 structured 30–45-minute customer interviews. By Friday you will have an affinity-diagrammed corpus of customer voices, a mapped as-is workflow, a named primary segment, and a one-page problem statement. The single most-common Week-2 mistake is to confuse interviews-as-validation (asking customers if they like your idea) with interviews-as-research (eliciting how customers actually behave). This chapter exists primarily to keep you on the research side of that line.

Chapter overview

This chapter follows the same six-part structure as Chapter 19. §20.1 (Concept) sets out customer discovery as a research methodology — Blank’s customer development framework, jobs-to-be-done theory, the Value Proposition Canvas, and the five ways AI customer discovery differs from classical SaaS or hardware discovery. §20.2 (Method) is the day-by-day Week 2 sprint with recruiting templates, the 30–45-minute interview structure, workflow mapping, and synthesis discipline. §20.3 (Lessons from the cases) pulls eight specific lessons on discovery from Parts I–III, including the most-cited cases (Watson Health, JPMorgan COiN, Ant Group, DBS) and several that we develop in upcoming Part II chapters. §20.4 (Tools and templates) gives you the recruiting scripts, interview structures, affinity-diagramming method, and Value Proposition Canvas template. §20.5 (Worked example) continues Team Aroma’s Pulse project from Week 1 through 22 customer interviews, an unexpected segmentation pivot, and a problem statement v1. §20.6 (Course exercises and deliverables) specifies the Week 2 submission with grading rubric.

How to read this chapter. Read §20.1 in full at the team’s Monday meeting. Read §20.2 individually before you make your first interview booking — recruiting badly is the highest-velocity path to wasting a week. Treat §20.3 as your Wednesday-evening reading once you have a few interviews under your belt; the lessons land harder when you have your own first-hand data to triangulate against. Use §20.4 throughout the week. Read §20.5 on Thursday for synthesis structure. Submit against §20.6 by Friday 23:59.

20.1 Concept

20.1.1 Customer development as a research methodology

Steve Blank’s customer development framework (Blank, 2005; expanded in Blank and Dorf, 2012) is the canonical alternative to product-development-driven startup methodology. Where product development organises a startup around shipping features to a hypothesised market, customer development organises a startup around iterative learning about customers, with feature decisions deferred until customer hypotheses have been validated.

The framework has four sequential stages, each with a falsifiable exit criterion:

  1. Customer Discovery — validate that you understand the customer, their problem, and the conditions under which they would adopt a solution. Exit criterion: a written, evidence-backed problem-customer-segment hypothesis.
  2. Customer Validation — validate that customers will actually pay for a solution at scale. Exit criterion: repeatable revenue from at least the early-adopter segment.
  3. Customer Creation — validate that demand can be created at scale through marketing and channels. Exit criterion: a sales pipeline with predictable conversion rates.
  4. Company Building — transition from search to execution. The startup becomes an operating company.

This week occupies the heart of Stage 1. By Friday you should be able to state in writing: who the customer is (with named segments and demographic specificity), what their problem is (with frequency, severity, and workflow context), and under what conditions they would adopt a solution (the “early-adopter test” in §20.1.6).

The framework’s most important methodological contribution is the “get out of the building” discipline (a Blank slogan): customer hypotheses cannot be validated from inside the startup’s office. Every assumption about the customer must be tested against evidence collected from people who are not the founders. Twenty interviews is a low number for this purpose — most professional researchers would consider 30–50 the saturation threshold for early-stage qualitative discovery — but twenty is the realistic Week-2 target for a five-person team, and the discipline of interviewing is more important at this stage than the absolute count.

20.1.2 Jobs-to-be-done

Clayton Christensen, Karen Dillon, Taddy Hall and David Duncan’s Competing Against Luck (2016) develops the jobs-to-be-done (JTBD) framework as a complement to demographic and psychographic segmentation. The central claim: customers do not “buy products”; they “hire products to do a job.” A useful product is one that performs the customer’s job better than the alternatives the customer is currently hiring.

The famous worked example is the McDonald’s milkshake. Christensen’s research team observed that 40% of milkshake sales happened before 8:30am, in the drive-through, and that each customer bought a milkshake with no other items. Conventional segmentation (kids, teens, adults; demographic factors) had not predicted this pattern. The JTBD analysis revealed the customers were commuters hiring the milkshake to do three things at once: occupy their hand during the drive, provide stomach-filling fuel that lasted until lunch, and entertain them during the commute. Once the job was named, the implications for the milkshake’s design (thicker, easier to hold, with chunks for variety) followed directly.

Three things make JTBD particularly useful for AI products:

  • Functional, emotional, and social dimensions of the job. Most products have more than one dimension. A Form-5 student preparing for SPM has the functional job of mastering the maths rubric, the emotional job of feeling prepared and confident, and the social job of meeting parental expectations. A successful product addresses more than one dimension. A product that addresses only the functional job (a maths textbook with answer keys) competes against many alternatives; a product that addresses functional + emotional + social well will face fewer competitors.
  • The “hiring and firing” frame. When a customer adopts your product, what are they firing? When they leave, what would they be hiring instead? The frame forces you to think about your product in the context of competing alternatives, including non-obvious ones. (Form-5 maths tutoring competes against not just other tutoring services but YouTube, peer study groups, and “doing nothing.” The non-consumption alternative is often the most-important competitor.)
  • The “progress in life” frame. Customers hire products to make progress on something they care about. The progress is the unit of analysis, not the product features. This shifts the customer-discovery questions from “what features would you want?” (a useless question, per the Mom Test) to “what are you trying to make progress on, and what is currently in your way?” (a useful question).

In Week 2, JTBD provides the organising frame for your interviews. Every interview should produce, at minimum, a job-articulation: what is this customer trying to make progress on, and what is currently in their way? That articulation is what you affinity-diagram across the 20 interviews to identify the recurring jobs.

20.1.3 Why customer discovery is different for AI products

Five distinctive properties of AI products affect customer-discovery methodology. A team that discovers as if for a generic SaaS product will produce findings that are systematically misleading for an AI product.

Technology anchoring. When you describe an AI product, customers anchor on either the technology (ChatGPT, Copilot) or the most-publicised competitor in your space. Both anchors are usually wrong. A parent who has tried ChatGPT for help with their child’s homework will frame your SPM-tutor product through the lens of that experience — including its weaknesses (inconsistent quality, no progress tracking, no localisation). A clinician who has heard about Watson Health will frame your clinical-decision-support tool through the lens of Watson Health’s failure. The anchors are sticky and shape what the customer perceives even before you describe your specific product.

The methodological response: in the discovery interview, describe the workflow change you are proposing rather than the technology you are using. Do not say “an AI tutor that uses GPT-4 to…”; say “imagine a tool that gives your child a personalised practice question after each exercise, with explanations in BM, English, or Mandarin, that adapts the difficulty as they improve.” The customer can react to the workflow change without being anchored to a specific technology.

Workflow specificity. AI products are most often deployed as substitutes or complements for specific steps in existing workflows, not as wholesale replacements. The unit of analysis is therefore the workflow step, not the workflow as a whole. A discovery interview that captures the customer’s journey at the level of “I do my work” is too coarse; an interview that maps the journey at the level of “first I open the spreadsheet, then I scan column F for outliers, then I cross-reference against the prior month’s data, then I draft an email” is the right granularity for an AI product to find its insertion point.

Trust and verification cost. AI products in regulated or B2B contexts must clear trust thresholds that classical SaaS does not. A clinician adopting an AI diagnostic tool faces malpractice exposure if the tool is wrong; a banker adopting an AI risk score faces regulatory exposure; a teacher adopting an AI tutor faces parent and principal scrutiny. The trust cost has two components: (a) the cost of evaluating whether the tool is trustworthy in the first place, and (b) the cost of verifying outputs in production once adopted. Both costs must be elicited in customer discovery; they are often the binding constraint on adoption even when the technical performance is adequate.

Demo effects and the wow gap. Generative AI demos elicit enthusiasm in interviews that does not translate to actual usage. A user who says “this is amazing” in an interview will use the product three times, conclude it is “less reliable than I thought,” and abandon it. The demo-versus-usage gap is much larger for AI products than for classical SaaS, because the demos are exceptionally good and the production reality is exceptionally noisy. The methodological response: never present a working demo in a discovery interview. Demos belong in customer-validation (Stage 2), once the problem hypothesis has been validated. In Week 2, you are eliciting behaviour and pain, not selling.

Data sensitivity and consent. AI products often require customers to share data with the system. Customers’ attitudes toward data sharing are a function of (a) the sensitivity of the data, (b) the customer’s trust in the firm, and (c) the customer’s understanding of what happens to the data. All three must be elicited. A customer who says “I’d love a personalised maths tutor for my child” may, on follow-up, say “but I don’t want anyone to know what questions she gets wrong” — the second answer is the binding constraint, not the first. Data-sensitivity questions belong in every Week 2 interview.

20.1.4 The customer interview as a research instrument

A customer interview is a piece of qualitative research. Like all qualitative research, its quality depends on the discipline of the researcher more than on the responsiveness of the subject. Three failure modes recur in student interviews and account for most of the wasted time at this stage.

Leading questions. “Wouldn’t it be useful if…” “Don’t you find it frustrating that…” Leading questions produce leading answers; the answers tell you about the interviewer’s hypotheses, not about the customer’s experience. The Mom Test rule (Fitzpatrick, 2013) applies: ask about past behaviour, not about future preferences. “What did you do the last time…” beats “What would you do if…” every time.

The interviewer’s own talk-time. Novice interviewers talk too much. They explain the product, they justify their interest, they fill silences. The customer’s talk-time should be at least 80% of the conversation; the interviewer’s questions should be short (typically under 15 words) and followed by silence. Long pauses after a customer’s answer often produce the most-informative second-pass answers, because the customer fills the silence with what they really meant.

The “wow factor” trap. Customer enthusiasm in an interview is a weak signal. The customer is being polite, is perhaps flattered to be asked, and is talking about an idealised future rather than a constrained present. Treat enthusiasm with mild suspicion. Treat specific criticisms as gold. A customer who says “I love it” has given you a generic compliment; a customer who says “the BM translations are inconsistent and my daughter complained that the explanations sound robotic” has given you a research finding.

Confirmation bias on the synthesis side. After the interview, the team’s confirmation bias takes over. Founders remember the answers that supported the original idea and forget the ones that contradicted it. The methodological response: write up each interview immediately after it ends, including the contradicting evidence, and synthesise across interviews using a structured affinity diagram (§20.4.4) rather than a narrative summary.

A final discipline worth naming: the “negative interview.” Of your 20 interviews, at least three should be with people you would expect not to be customers — Form-5 students whose parents do not pay for tutoring; clinicians who do not work in your target specialty; bankers in functions adjacent to, but outside, your target one. Negative interviews bound your customer hypothesis from below; they tell you who is not your customer, which is often more useful than another confirmation that your imagined customer is your real customer.

20.1.5 Segmentation in Week 2

A common Week-2 mistake is to assume your customer segment is known and to interview only within it. This is begging the question. The 20 interviews should be designed to discover segmentation, not assume it.

Four lenses for segmentation, applied to AI products:

  • Vertical (industry). Are you targeting healthcare, finance, retail, education, manufacturing? Each industry has its own regulatory environment, sales cycle, and decision-making structure. AI products almost never succeed in two verticals simultaneously without distinct go-to-market strategies; pick one for the MVP.
  • Horizontal (function). Within a vertical, are you targeting the customer-service function, the analytics function, the operations function? Different functions have different budgets, different decision-makers, and different tolerance for risk.
  • Behavioural. Within a function, are you targeting heavy users vs light users, early adopters vs laggards, individual contributors vs managers? Behavioural segmentation often discriminates among customers more sharply than demographic segmentation.
  • Psychographic. Risk-tolerance, AI-comfort, technical sophistication. In B2C products particularly, psychographic segmentation often dominates demographic segmentation. The early adopters of consumer AI products in 2024–2026 are roughly the same psychographic cohort as the early adopters of smartphone apps in 2008–2010 — across a huge demographic spread.

In Week 2 your 20 interviews should be designed to span these lenses. Specifically:

  • 12+ interviews in your hypothesised primary segment (so you can characterise it well)
  • 5+ interviews in adjacent segments (to discover whether the segmentation is correct)
  • 3+ “negative” interviews (to bound the segment from below)

The ratio is more important than the absolute count. A team that conducts all 20 interviews in the primary segment learns less than a team that distributes across the four segment categories.

20.1.6 The early-adopter test

Geoffrey Moore’s Crossing the Chasm (1991) introduced the technology-adoption-lifecycle framework that organises modern startup go-to-market strategy. The framework identifies five customer cohorts: innovators (~2.5%), early adopters (~13.5%), early majority (~34%), late majority (~34%), and laggards (~16%). Most startups die trying to cross the “chasm” between early adopters and early majority. The Week-2 implication: you must find your early adopters before anything else, because they are the customers who will adopt your MVP despite its inevitable flaws.

The early-adopter test, adapted for AI startups, has five components:

  1. Pain awareness. Does the customer recognise the problem you are solving as a problem? (A customer who does not recognise the problem cannot adopt a solution to it.)
  2. Active workaround. Has the customer cobbled together their own workaround — a spreadsheet, a manual process, a freelance contractor, a half-functional script? An active workaround is the strongest possible signal of pain. (Customers without workarounds are usually not in enough pain to adopt.)
  3. Budget or budget intent. Is there money already being spent on the problem (existing tutoring fees, existing software licences, existing labour cost) that could be redirected? Or is there a credible budget allocation for new spending?
  4. Buying authority. Can the customer themselves authorise the purchase, or do they need to escalate to a manager / parent / spouse / committee? Early adopters in B2B are usually individual contributors with discretionary purchase authority for tools under a threshold (~USD 100/month is a typical ceiling in 2026).
  5. Tolerance for imperfection. Is the customer willing to use a v1 product that is rough? Or do they need a polished, production-grade tool from day one? Early adopters tolerate roughness in exchange for being early.

Customers who score well on all five tests are your early adopters. They become the focus of your Week 3 MVP design and your Week 6–7 alpha and beta testing. Most teams find that 3–5 of their 20 Week-2 interviewees fit this profile.
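
For teams that want to tabulate the scoring mechanically across the corpus, the sketch below (Python) shows one way to do it. The per-dimension floor of 4 is an illustrative assumption (the rubric above says only that early adopters “score well on all five tests”), and the field names mirror the §20.4.3 write-up template.

# Sketch: tabulating early-adopter test scores from interview write-ups.
# The >=4-per-dimension floor is an illustrative assumption, not a
# number from Moore (1991) or from this chapter's rubric.
from dataclasses import dataclass

@dataclass
class EarlyAdopterScore:
    interviewee_id: str
    pain_awareness: int          # 1-5, per the §20.4.3 template
    active_workaround: int       # 1-5
    budget_intent: int           # 1-5
    buying_authority: int        # 1-5
    imperfection_tolerance: int  # 1-5

    def dimensions(self) -> tuple:
        return (self.pain_awareness, self.active_workaround,
                self.budget_intent, self.buying_authority,
                self.imperfection_tolerance)

    def composite(self) -> int:
        return sum(self.dimensions())

    def is_early_adopter(self, floor: int = 4) -> bool:
        # "Scores well on all five tests": every dimension clears the floor.
        return all(score >= floor for score in self.dimensions())

scores = [
    EarlyAdopterScore("T2", 5, 5, 5, 5, 4),  # illustrative values
    EarlyAdopterScore("P7", 4, 2, 1, 3, 3),
]
print([s.interviewee_id for s in scores if s.is_early_adopter()])  # ['T2']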

20.1.7 The Value Proposition Canvas

Alexander Osterwalder, Yves Pigneur, Greg Bernarda, and Alan Smith’s Value Proposition Design (2014) develops the Value Proposition Canvas as a tool for designing products that fit a target customer profile. The canvas has two halves:

  • Customer Profile (right side). Three boxes: jobs (what is the customer trying to get done?), pains (what frustrates the customer about the current state?), gains (what would make the customer’s situation better?). The customer profile is the output of customer discovery.
  • Value Proposition (left side). Three boxes: products and services (what do you offer?), pain relievers (how do your offerings reduce pains?), gain creators (how do your offerings create gains?). The value proposition is the output of MVP design.

Fit is achieved when each pain has a pain reliever and each gain has a gain creator. Misfit is the most common cause of MVP failure: founders who designed pain relievers for pains the customers do not in fact have.

In Week 2, your output is the customer profile (right side). Your Week 3 work (Chapter 21) produces the value proposition (left side) and tests the fit. This explicit sequencing matters: a team that designs the value proposition before completing the customer profile is engineering pain relievers without knowing the pains.

20.2 Method — the Week 2 sprint

20.2.1 Days 1–2: recruiting and interview booking

By Tuesday evening, your team should have 22+ interview slots booked across the rest of Week 2 (the 20-interview target plus a 10% buffer for cancellations). Distribute the 22+ slots across four channels.

| Channel | Yield (typical) | KL specifics | Melbourne specifics |
|---|---|---|---|
| Warm intros via founders’ networks | 8–12 | Family WhatsApp, Monash KL alumni, school network | LinkedIn 1st-degree, Monash Melbourne alumni, parent/family |
| Cold outreach (LinkedIn, email) | 3–5 | Target sector communities; use Monash Malaysia affiliation | Same; Monash Australia affiliation often opens doors |
| Communities (Reddit, FB groups, Discord, sector Slack) | 3–5 | r/malaysia, Malaysia-specific FB groups, local sector Discord servers | r/AusFinance, sector-specific subreddits, regional Slack communities |
| In-person ambush (where ethical) | 2–4 | University cafes, public events, sector meetups | Same; respect cultural norms |

The recruiting templates in §20.4.1 provide tested message structures for each channel. The 22+ target with 10% buffer matters because typical no-show rates for Week 2 interviews are 10–20%; without the buffer, you finish the week with 16 interviews and miss the deliverable.
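
The buffer arithmetic generalises: to finish with N interviews at an expected no-show rate r, book ceil(N / (1 − r)) slots. A minimal Python check, using only the no-show rates quoted above, shows why “22+” is a floor rather than an exact target:

# Sketch: slots to book for a target interview count, given an
# assumed no-show rate. Rates are the 10-20% quoted in the text.
import math

def slots_to_book(target: int, no_show_rate: float) -> int:
    return math.ceil(target / (1.0 - no_show_rate))

for rate in (0.10, 0.15, 0.20):
    print(f"no-show {rate:.0%}: book {slots_to_book(20, rate)} slots")
# no-show 10%: book 23 slots
# no-show 15%: book 24 slots
# no-show 20%: book 25 slots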

20.2.2 Days 2–5: the 30–45 minute interview

The interview structure has six segments. Total target time: 30–45 minutes; guard against drift in either direction. Under 25 minutes is too short to elicit workflow detail; over 50 minutes wastes the interviewee’s time and your synthesis budget.

Segment 1: Anchor and rapport (3–5 min).

Introduce yourself, the unit, and the research nature of the interview. Confirm consent for recording (audio only is acceptable; video is more accurate but creates more distraction). Establish that you are not selling, are not building a product yet, and are interested in the customer’s experience. The Mom Test framing: “I’m trying to understand how people actually [do X] today, before I decide whether there’s anything worth building.”

Segment 2: Behavioural deep-dive (8–12 min).

The single most important segment. Anchor the conversation on a specific past instance, not on a generalisation. “Tell me about the most recent time you [did the activity].” Resist the customer’s natural tendency to abstract (“I usually…”) by repeatedly bringing them back to the specific instance (“OK, but on that specific Tuesday, what did you do?”). Aim to capture: who was involved, what tools they used, where they got stuck, what they did when they got stuck, how long it took, what the outcome was, and how they felt about the outcome.

Segment 3: Workflow mapping (8–12 min).

In real time, draw the customer’s workflow on a shared screen or in a notebook visible to the customer. Walk through the steps with them: “So step 1 is X. Then what?” The drawing forces specificity. Many customers will only realise the granularity of their own workflow when they see it being mapped. The most-important moments are the points where the customer says “and then…”, which usually indicate a step they had not previously articulated. Capture each step with a verb, an artefact (what document/screen/tool is involved), and a duration estimate.

Segment 4: Pain articulation (5–10 min).

Now go back through the workflow you just mapped and ask, at each step, “what’s annoying about that?” or “where does that go wrong?” The structure forces the customer to surface micro-frictions they would not have volunteered in a generic complaint. Distinguish high-frequency low-severity pains (small frictions encountered daily) from low-frequency high-severity pains (catastrophic failures encountered occasionally). Both are research findings, but they imply different product responses.

Segment 5: Solution exploration and data sensitivity (5–8 min).

Ask about workarounds: “what have you tried to address [the named pain]?” Ask about data sharing: “if a tool wanted access to [your students’ progress / your patients’ records / your transaction data] in exchange for [a benefit], what would your reaction be?” The data question elicits the trust-and-verification dimension that classical customer discovery often misses for AI products.

Resist the temptation to describe your product in this segment. If the customer asks what you are building, deflect: “we’re still in research; what we’re learning today is what shapes that decision.” Customers respect this answer; they do not respect a half-formed product pitch.

Segment 6: Close and referral request (2–3 min).

Thank the customer. Ask whether they would be willing to be re-contacted in 4–6 weeks for a follow-up. Ask for referrals: “is there anyone else who experiences [the named pain] that you’d be willing to introduce me to?” Referrals from validated subjects are typically 3–5× more likely to result in interviews than cold outreach, and they carry an implicit endorsement that improves rapport.

20.2.3 Same-day write-up

Within 60 minutes of the interview ending, complete the per-interview write-up template (§20.4.3). The discipline of immediate write-up matters because (a) memory degrades quickly — by the next morning the verbatim quotes are gone and the workflow drawings are partially fictionalised; and (b) the next interview’s quality is improved by reflection on the prior one. Allocate 15–20 minutes per interview for write-up; budget for it in your calendar.

The write-up captures, at minimum: the four-element synthesis from Week 1 (one-sentence surprise, one verbatim quote, updated assessment, next-question), plus the workflow map (transcribed and tidied from the live drawing), the named pains in priority order, the data-sensitivity finding, and the early-adopter test scoring (5 dimensions, scored 1–5 each).

20.2.4 Day 5: synthesis through affinity diagramming

By Friday morning your team should have ~20 interview write-ups in a shared workspace. Synthesis transforms the corpus from a collection of individual notes into an evidence-backed problem statement. The technique is affinity diagramming, originally due to Kawakita Jiro (the “KJ method”) and now standard in qualitative research methodology.

The procedure (full method in §20.4.4):

  1. Extract. From each write-up, extract the named pains, jobs, and workarounds onto separate digital sticky notes (Miro, FigJam, Notion). One pain per sticky. Aim for 8–15 stickies per interview, so 160–300 stickies total.
  2. Cluster. As a team, cluster the stickies into thematic groups. Do not name the clusters yet; let them emerge. Stickies that do not fit anywhere stay in an “outliers” pile.
  3. Name. Once the clusters are stable (typically 30–60 minutes of clustering), name each cluster with a short descriptive phrase. Resist abstract labels (“frustration”); favour specific labels (“BM/English code-switching makes existing tools feel foreign”).
  4. Count. Count the stickies in each cluster. The most-populous clusters are your most-recurring findings. Single-mention clusters are worth noting but should not drive your problem statement.
  5. Quote. For each major cluster, identify the most-vivid verbatim quote from your interviews. Quotes are the evidence you will cite in the problem statement and (later) in the pitch deck.

The output of synthesis is a labelled affinity diagram you can screenshot, plus a list of ranked themes with quote citations. This is the primary input to the problem statement.

20.2.5 Day 5: problem statement v1

The problem statement v1 is a one-page document with the following structure:

PROBLEM STATEMENT v1 — [PROJECT NAME]
Date: [Friday of Week 2]

WHO: [primary customer segment, with demographic specificity]
WHAT: [the named primary pain, in customers' own language]
WHEN: [the frequency of pain — daily, weekly, monthly]
WHY: [the structural reason the pain persists despite existing alternatives]
HOW MUCH: [the cost/value at stake — money, time, or both]

EVIDENCE: [3–5 specific verbatim quotes from interviews,
           with interviewee identifiers]

BOUNDARY: [who is NOT in this segment, with rationale]

EARLY ADOPTERS: [the 3–5 specific interviewees who fit the
                 §20.1.6 early-adopter test, with their fit
                 evidence]

WHAT WE STILL DON'T KNOW: [3 specific open questions that
                            Week 3 work should resolve]

OPEN-ENDED ALTERNATIVES: [if the team is considering
                          pivoting away from the Week 1 idea,
                          state the conditions for pivot here]

The form forces you to be specific. A team that cannot fill in the WHO with demographic specificity has not yet narrowed; a team that cannot fill in the EVIDENCE with verbatim quotes has not interviewed deeply enough; a team that cannot fill in the BOUNDARY has not done the negative interviews.

20.2.6 The customer profile (Value Prop Canvas right side)

In parallel with the problem statement, produce the customer profile (right side of the Value Proposition Canvas), with three boxes:

  • Jobs. What is the customer trying to get done? Functional, emotional, social. Aim for 5–8 jobs.
  • Pains. What frustrates the customer about the current state? Aim for 6–10 pains in priority order.
  • Gains. What would make the customer’s situation better? Aim for 5–8 gains, distinguishing required gains (without which the customer cannot adopt) from desired gains (which would delight them).

The customer profile is the artefact you carry into Week 3, where you design the value proposition (left side) to fit. Most teams find that the customer profile takes 3–4 hours to produce well; allocate Friday afternoon for it.

20.2.7 The Friday submission

Submit the Week 2 deliverable bundle by 23:59 Friday. The deliverable specification is in §20.6. As in Week 1, do not submit components that no team member other than the author has read; the team-comprehension penalty applies.

20.3 Lessons from the cases

Eight specific lessons from Parts I–III shape Week 2 customer-discovery decisions.

20.3.1 Watson Health — the failure to discover clinician workflows (Chapters 2, 7)

IBM positioned Watson Health as a clinical decision-support system for cancer treatment. In retrospect, the failure was not the technology; it was that IBM had not done the customer-discovery work to understand how oncologists actually make treatment decisions. Oncologists do not read structured patient summaries and select treatment options from a ranked list; they integrate clinical, social, family, and trial-eligibility considerations through a deliberative process that varies substantially by institution and by physician. Watson Health’s recommendations were structured for a workflow that did not exist.

Operational implication. In Week 2, your workflow mapping (§20.2.2 Segment 3) is the most-important segment for AI products, particularly in B2B contexts. A team that captures only the customer’s stated preferences without mapping the actual decision-making workflow will produce a product that does not fit.

20.3.2 JPMorgan COiN — interviewing the operators, not the executives (Chapter 6)

The COiN deployment succeeded partly because the bank’s discovery work (in 2014–2016, before the system was built) was done with the actual lawyers and analysts who would use the system, not just with the executives who would approve it. The lawyers told the discovery team where in the contract review process they spent the most time, what kinds of clauses caused them the most trouble, and what their existing workarounds were. The system was designed against this evidence.

Operational implication. Discovery must reach the user of the eventual product, not just the purchaser. In B2B products, the purchaser (a CIO, a procurement director, a department head) often has different priorities from the user (a lawyer, an analyst, a customer-service representative). Your 20 interviews should include both, weighted toward users.

20.3.3 Ant Group — revealed preference over stated preference (Chapter 3, §3.11)

Ant Group’s 3-1-0 lending model was designed against revealed-preference data — actual transaction patterns from Alipay — not against stated-preference surveys about lending. This produced a system whose underwriting reflected what borrowers did rather than what they said. The accuracy gap between revealed-preference and stated-preference designs is large in financial services and large in any domain where social desirability or self-image distorts stated preferences.

Operational implication. In Week 2, your behavioural deep-dive segment (§20.2.2 Segment 2) is the closest your interviews can come to revealed preference. Push for specific past behaviour, not general claims. “How many times did you do X last week?” beats “How often do you do X?” every time, because the first answer is bounded by memory of specific events while the second is bounded only by self-image.

20.3.4 DBS — observing behaviour at scale (Chapter 4, §4.11; Chapter 6, §6.10)

DBS’s 50,000 daily personalised nudges programme was built on observing customers’ actual behaviours — when they had idle balances, when their typical low-balance points occurred, when they had subscriptions that changed price — rather than on surveying customers about what kinds of nudges they would want. The behavioural-economics framing produced nudges customers responded to; a survey-based design would have produced nudges that customers thought they should want but ignored in practice.

Operational implication. For B2C products, where you can observe (or simulate) customer behaviour at scale, supplement interviews with behavioural data when available. A team with access to even a small sample of telemetry data, transaction logs, or app analytics can triangulate interview findings against behaviour. For most student teams, this is impractical in Week 2 — but the principle that revealed preference > stated preference applies to interview question design as well.

20.3.5 Klarna — the gap between “would you use AI for this?” and “would you keep paying us if AI handled this?” (Chapter 8, forthcoming)

Klarna’s pre-deployment discovery (2022–2023) reportedly indicated that customers were comfortable with AI handling their service queries. Post-deployment data showed substantially declining customer satisfaction once AI was the primary handler. The gap was not in the technology — the AI was reasonably capable — but in the discovery framing. Customers asked “would you be comfortable using AI for service queries?” answered yes; customers asked “would you keep paying us if AI was the only available service channel?” might have answered differently.

Operational implication. Distinguish opt-in willingness from substitution willingness in your interviews. “Would you use [AI tool] alongside your current process?” is a different question from “would you replace your current process with [AI tool]?”, and the second is the question that matters for your eventual product economics. Push for substitution questions, not augmentation questions, during the interview.

20.3.6 Cursor — when the founder is the customer, “discovery” is autoethnography (Chapter 5, AI-native disruption)

Anysphere’s founders did not need 20 customer interviews because they were the customers. They had spent thousands of hours using GitHub Copilot and could articulate every micro-friction in the IDE-around-the-model workflow with first-person specificity. Their early product decisions were autoethnographic — they built what they themselves needed.

Operational implication. If your team includes a founder who is genuinely the customer (a student building a tool for students; a clinician building a tool for clinicians; an SME owner building a tool for SME owners), you have a discovery advantage that the framework does not fully capture. Use it. Document the founder’s own first-person workflow and pains as if they were one of the 20 interviews — but not all of them. Founder autoethnography validates only the founder’s own experience; the other 19 interviews validate that the founder is representative.

20.3.7 Stitch Fix — data collection IS discovery (Chapter 8, forthcoming)

Stitch Fix’s customer discovery was structurally embedded in its product: the initial style quiz collects 60+ data points about a customer’s preferences, body shape, lifestyle, and budget. Each subsequent shipment generates additional revealed-preference data through the keep/return decisions. The company never had to do separate “customer discovery” because every interaction was discovery.

Operational implication. For consumer AI products, design your eventual MVP to collect discovery-grade data as a byproduct of customer use. The discovery does not stop at the end of Week 2; it continues throughout the product’s life. Week 2 produces the initial hypothesis; the product itself produces the ongoing validation. The implication for Week-3 MVP design: the MVP should include analytics, structured customer feedback channels, and (where appropriate) explicit preference-elicitation moments — not because the customer asked for them, but because they will be your future discovery infrastructure.

20.3.8 The Mom Test failure mode — leading questions distort findings (Fitzpatrick, 2013)

Most novice interviewers ask leading questions because they are anxious to validate their idea. “Wouldn’t it be useful if…” is a pure leading question; it produces compliant answers that do not test the hypothesis. The cumulative effect across 20 interviews is a corpus that systematically over-states customer pain and over-states willingness to adopt the proposed solution. Teams that fall into this pattern in Week 2 typically discover the bias in Week 6 when their alpha launch produces dramatically lower engagement than the interview data predicted.

Operational implication. Audit your interview transcripts for leading questions before synthesis. The simplest test: read your script in advance and circle every question that contains a hypothesised positive answer or product feature. Replace each with a behavioural question. Run a peer-review of two interviews per founder, with another founder reading the transcript and flagging leading questions. The discipline catches the failure mode before it contaminates the corpus.
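
A first pass of the audit can be scripted. The Python sketch below assumes transcripts are available as plain text with one speaker turn per line; the pattern list is illustrative and catches only the classic violations named above. Peer review remains the real check.

# Sketch: flag likely leading questions in a transcript. The patterns
# are illustrative, not exhaustive; a human reviewer makes the call.
import re

LEADING_PATTERNS = [
    r"\bwouldn't it be (useful|great|nice)\b",
    r"\bdon't you (find|think|hate)\b",
    r"\bwould you (use|want|buy|pay)\b",  # future-preference framing
    r"\bhow much would you pay\b",
]

def flag_leading_questions(lines):
    flagged = []
    for lineno, line in enumerate(lines, start=1):
        if any(re.search(p, line, re.IGNORECASE) for p in LEADING_PATTERNS):
            flagged.append((lineno, line.strip()))
    return flagged

transcript = [
    "Interviewer: Tell me about the last time you marked a practice paper.",
    "Interviewer: Wouldn't it be useful if a tool graded these for you?",
]
for lineno, text in flag_leading_questions(transcript):
    print(f"line {lineno}: {text}")  # flags only the second question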

20.4 Tools and templates

20.4.1 Recruiting templates

Warm intro via WhatsApp / messaging app (KL primary):

Hi [name], I'm a Monash student doing research on
[brief, neutral description of the domain — not the product].
I'm hoping to interview about 20 people who [characterisation
of the target experience], for about 30–45 minutes. No pitch,
no selling — just trying to understand how people actually
handle this. Would you have time this week, or do you
know someone who might?

Cold outreach via LinkedIn (Melbourne and KL, professional contexts):

Subject: Monash student research — 30 min on [domain]

Hi [name],

I'm [your name], a [program] student at Monash University
[Melbourne / Malaysia]. As part of a graduate unit, I'm
researching how [target audience] handles [specific
activity]. Your experience at [specific point of relevance
in their LinkedIn] would be particularly useful.

I'm looking for 30–45 minutes of conversation about your
actual experience — not a pitch, not a survey. The
research is for academic and pre-startup work, with no
commercial agenda yet. I can fit your schedule and would
be happy to share findings afterwards if useful.

Would any time in the next two weeks work?

[your name]
[your contact]
[your Monash student page or LinkedIn]

Community outreach (Reddit / FB group / sector Discord):

[Title or post header]: Quick research request — 30 min
on [domain] for Monash student project

Hi all — Monash student here, doing research for a
graduate unit on how [target audience] handles
[specific activity]. Looking for 5–10 people willing to
chat for 30–45 min about your real experience. No pitch
involved; happy to share findings.

If you've ever [specific concrete experience], I'd love
to hear from you. DM me.

In-person ambush (where ethical and culturally appropriate):

A specific script depends heavily on context. The general structure: introduce yourself, name your research purpose, ask for 5 minutes (not 30 — the in-person ask is for a brief conversation that may lead to a longer scheduled interview). For KL: cafes near Monash Sunway, the BMS school complex, university hostels. For Melbourne: cafes near Caulfield/Clayton, the Carlton/Fitzroy startup-adjacent spaces, sector-specific events advertised on Eventbrite. Always carry a Monash student ID, an explicit research framing, and a way to follow up (a QR code linking to a Calendly is more efficient than business cards).

20.4.2 Structured 30–45 min interview script

The script below is a template; adapt the specifics to your domain. The structure is six segments per §20.2.2, with timings.

INTERVIEW SCRIPT — [PROJECT]
Total time: 30–45 min

[1. ANCHOR AND RAPPORT — 3 to 5 min]
- Introduce self, unit, research purpose
- Confirm consent for recording (audio only OK)
- Establish "no pitch, no selling" frame
- Restate Mom Test framing: "I'm trying to understand
  how people actually do X today"

[2. BEHAVIOURAL DEEP-DIVE — 8 to 12 min]
- "Tell me about the most recent time you [activity]."
- "On that specific day, what did you do first?"
- "Then what?" (repeat as needed)
- "Who else was involved?"
- "What tools did you use?"
- "How long did it take?"
- "What was the outcome?"
- "How did you feel about the outcome?"

[3. WORKFLOW MAPPING — 8 to 12 min]
- Open shared screen / notebook visible to interviewee
- Draw step 1: [verb + artefact + duration]
- "OK, so step 1 is X. Then what?"
- Draw each subsequent step
- For each step: "What document or screen or tool is
  involved? About how long does this take?"
- Identify branch points: "Does this always happen this
  way, or are there alternative paths?"

[4. PAIN ARTICULATION — 5 to 10 min]
- Walk back through the workflow you just mapped
- At each step: "What's annoying about that?"
- "Where does this go wrong?"
- "When was the last time this step caused a real problem?"
- Distinguish high-frequency-low-severity from
  low-frequency-high-severity pains

[5. SOLUTION EXPLORATION AND DATA SENSITIVITY — 5 to 8 min]
- "What have you tried to address [the named pain]?"
- "Did the workaround work? What did you give up?"
- "If a tool wanted access to [specific data], in
  exchange for [hypothetical benefit], what would your
  reaction be?"
- "Who else would need to approve adopting a new tool?"
- DO NOT describe your product

[6. CLOSE AND REFERRALS — 2 to 3 min]
- Thank interviewee
- "Is it OK to follow up in 4–6 weeks?"
- "Is there anyone else who experiences [the named pain]
  that you'd be willing to introduce me to?"
- Confirm contact details for follow-up

20.4.3 Per-interview write-up template

INTERVIEW WRITE-UP
Date / time: [DATE TIME]
Interviewer: [NAME]
Interviewee ID: [PSEUDONYM_OR_ID]
Channel: [warm intro / cold outreach / community / ambush]
Demographic: [role, sector, region, relevant context]

ONE-SENTENCE SURPRISE
[The thing you didn't expect.]

VERBATIM QUOTE
"[The single most-vivid sentence the interviewee said,
in their own words.]"

WORKFLOW MAP (transcribed)
Step 1: [verb + artefact + duration]
Step 2: [verb + artefact + duration]
...
Branch points: [where the workflow forks]

NAMED PAINS (priority order)
1. [pain 1, with specific frequency and severity]
2. [pain 2 ...]

WORKAROUNDS
[What the interviewee currently does to address the pains.]

DATA SENSITIVITY FINDING
[How the interviewee responded to the data-access question.]

EARLY-ADOPTER TEST SCORING (1–5)
- Pain awareness: [score]
- Active workaround: [score]
- Budget or budget intent: [score]
- Buying authority: [score]
- Tolerance for imperfection: [score]
Composite: [sum]

UPDATED ASSESSMENT
[Has my problem-customer hypothesis become more or less
plausible? What did I learn?]

NEXT QUESTIONS
[What I should ask the next interviewee about.]

REFERRALS RECEIVED
[Names and contact details for follow-up.]

20.4.4 Affinity diagramming method

The procedure for synthesis (Friday morning, 2–3 hours team time):

  1. Set up. Open a Miro / FigJam / Notion canvas. Create a column per interview and a sticky-extraction zone.
  2. Extract. Each founder takes 4 interviews and extracts named pains, jobs, workarounds, and data-sensitivity findings onto separate digital stickies. Aim for 8–15 stickies per interview. Use a colour code: pains in red, jobs in blue, workarounds in green, data-sensitivity in yellow.
  3. Cluster. As a team (synchronous, both campuses), drag stickies into clusters by similarity. Resist naming. Move stickies freely; the early arrangement will be wrong. After 30–60 minutes the clusters stabilise.
  4. Name. Once clusters are stable, name each. Avoid abstract labels (“frustration”); favour specific labels (“BM/English code-switching makes existing tools feel foreign”).
  5. Count. Annotate each cluster with its sticky count. The most-populous clusters are your most-recurring findings.
  6. Identify representative quotes. For each major cluster, link 1–2 verbatim quotes (with interviewee ID) for citation in the problem statement.
  7. Identify the outliers. Stickies that did not cluster are research findings too — they are evidence of segment heterogeneity, possible secondary segments, or signals that you have under-sampled certain customer types.

The affinity diagram is the artefact you screenshot for the deliverable bundle; the named-cluster ranking with quote citations is the structured input to the problem statement.
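
Steps 5 and 6 become mechanical once each sticky carries a cluster label. A minimal Python sketch, assuming stickies have been exported from the canvas tool as structured records (the field names and the sample data are illustrative, drawn loosely from §20.5):

# Sketch: rank clusters by sticky count. Field names mirror the
# colour code in step 2; cluster labels come from step 4.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Sticky:
    interviewee_id: str
    kind: str      # "pain" | "job" | "workaround" | "data-sensitivity"
    text: str      # verbatim or near-verbatim
    cluster: str   # assigned in step 4 (name)

def rank_clusters(stickies):
    return Counter(s.cluster for s in stickies).most_common()

stickies = [
    Sticky("S5", "pain", "explanations didn't match the marking scheme",
           "SPM-rubric alignment"),
    Sticky("P3", "pain", "waits until Saturday for help",
           "24/7 availability"),
    Sticky("T2", "pain", "teachers spend most of their time grading",
           "teacher productivity"),
]
for cluster, count in rank_clusters(stickies):
    print(f"{count:3d}  {cluster}")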

20.4.5 Workflow mapping template

For each customer’s as-is workflow, capture:

WORKFLOW MAP — [INTERVIEWEE ID]

Step 1
  Verb: [what the customer does]
  Artefact: [document/screen/tool involved]
  Duration: [time spent]
  Pain: [from §4 of interview]
  Workaround: [if any]

Step 2
  ...

[Branch point — describe alternative paths]

Step N (terminal)
  Outcome: [success/failure conditions]

Workflow maps from individual interviews are then composited into a segment-level workflow map that captures the modal pattern across the segment, with annotations for the major variations. The composite is the input to MVP scoping in Week 3.
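
A naive first pass at the compositing can be scripted: align the per-interview workflows by step position and take the most common (verb, artefact) pair at each position. Real workflows need alignment by step meaning rather than position, so treat the Python sketch below (with invented sample steps) as a starting point for hand-tidying, not as the method itself.

# Sketch: naive positional compositing of per-interview workflow maps
# into a modal segment-level workflow. Sample steps are invented.
from collections import Counter

workflows = [  # each: ordered list of (verb, artefact) steps per §20.4.5
    [("collect", "homework stack"), ("grade", "answer sheets"), ("re-explain", "whiteboard")],
    [("collect", "homework stack"), ("grade", "answer sheets"), ("draft", "parent update")],
    [("collect", "Google Form"),    ("grade", "answer sheets"), ("re-explain", "whiteboard")],
]

for pos in range(max(len(w) for w in workflows)):
    steps_here = [w[pos] for w in workflows if len(w) > pos]
    (verb, artefact), freq = Counter(steps_here).most_common(1)[0]
    print(f"step {pos + 1}: {verb} / {artefact} (in {freq} of {len(steps_here)} maps)")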

20.4.6 Customer Profile (Value Prop Canvas right side)

The customer profile is one canvas per primary segment. Format:

CUSTOMER PROFILE — [SEGMENT NAME]

JOBS (functional, emotional, social)
  Functional:
    1. [job 1]
    2. [job 2]
    3. ...
  Emotional:
    1. [job 1]
    ...
  Social:
    1. [job 1]
    ...

PAINS (priority order)
  1. [pain 1 — with frequency and severity]
  2. [pain 2 — ...]
  ...

GAINS (required vs desired)
  Required:
    1. [gain 1 — without which adoption is blocked]
    ...
  Desired:
    1. [gain 1 — would delight]
    ...

Use the actual Strategyzer Value Proposition Canvas template (free at strategyzer.com) for the team-facing version; the structured text above is for the deliverable submission.

20.4.7 Segmentation grid

A grid — one row per interviewed segment, one column per discovery dimension — that maps your interview corpus across the segments you have discovered. A typical Week-2 output:

| Segment | n interviews | Modal job | Modal pain | Early-adopter density | Notes |
|---|---|---|---|---|---|
| [Primary] | 12 | [job] | [pain] | 4 of 12 | Selected for MVP |
| [Adjacent A] | 4 | [job] | [pain] | 1 of 4 | Worth tracking; revisit Week 4 |
| [Adjacent B] | 3 | [job] | [pain] | 0 of 3 | Out of scope for MVP |
| [Negative — chosen as control] | 3 | [job] | [pain] | 0 of 3 | Confirms boundary |

The segmentation grid is the artefact that justifies your MVP-target choice. A team that cannot fill in this grid has not narrowed enough.

20.5 Worked example — Team Aroma’s Week 2

Team Aroma continues with the Pulse SPM-tutor concept from Week 1.

Days 1–2: recruiting

By Tuesday evening they have 24 interviews booked across the rest of the week. Recruiting distribution:

  • Aliyah’s network (8 interviews): 5 parents from her tutoring side-business contact list, 3 SPM students she previously tutored.
  • Sara’s design network (3 interviews): connections to Form-5 students through a school-design programme she volunteered with.
  • Wei Hao’s network (2 interviews): a cousin in Form 5; a tutoring-centre owner he encountered through his fintech internship.
  • LinkedIn cold outreach (4 interviews): Aliyah identified 12 tutoring-centre owners in Klang Valley via LinkedIn search; 4 responded.
  • Daniel’s edtech connections (4 interviews): 2 Melbourne-based parents of recent VCE students (the nearest Australian analogue to SPM) for cross-comparison; 2 Australian edtech product managers for category context.
  • Priya’s curriculum network (2 interviews): 2 retired Malaysian secondary-school teachers her family knows.
  • Buffer (1): a friend-of-a-friend introduction kept in reserve; converted to an interview when one parent cancelled mid-week.

Most of the interviews are scheduled for 30 minutes; the tutoring-centre owners and retired teachers each get 45 minutes, since they are denser sources of workflow detail. The team distributes the interviewing across all five members so each runs 4–5 interviews across the week.

Days 2–5: the interviews

The team conducts 22 interviews across Tuesday through Friday morning. (Two scheduled interviews fall through; the buffer covers one.) Selected findings, as captured in the write-ups:

Interviewee P3 (parent of Form-5 student in SMK Sri Permata): “I pay RM 350 a week for two-on-one tutoring at the [redacted] centre. The teacher is good but my daughter only goes Saturday morning. The rest of the week she’s stuck. If she gets a question wrong on Tuesday she has to wait until Saturday for help. I would pay RM 80 to RM 100 a month for something she can use any day.”

Interviewee S5 (Form-5 student, SMJK Yu Hua): “I tried Snapask last year. The questions were OK but the explanations didn’t match my school’s marking scheme. SPM has its own format and the answers needed to follow that format. The Snapask tutors didn’t always know that.”

Interviewee T2 (owner of a Klang Valley tutoring centre): “Teachers spend most of their time grading homework and giving the same explanations five times a week. If they had a tool that handled the routine practice, they could focus on the harder stuff. I would consider paying RM 30 per student per month if it actually worked for SPM.”

Interviewee P7 (parent, MARA-scholarship household): “We can’t afford the centres. My son uses YouTube and his school textbook. He gets stuck a lot. RM 50 a month is more than we spend on his school books for a year. RM 20 maybe.”

The pattern emerging from the synthesis: the parent-paying segment splits into at least two sub-segments — the centre-paying segment with WTP RM 50–100/month, and the textbook-only segment with WTP RM 10–20/month. The centre-paying segment is much smaller (perhaps 15–20% of Form-5 households nationally, concentrated in Klang Valley, Penang, and Johor Bahru) but has the budget. The textbook-only segment is the larger market but the unit economics do not work at WTP RM 20/month for the team’s projected cost structure.

A surprise emerges from interviews T1 through T4 (the four tutoring-centre owners): they universally describe Pulse not as a competitor but as a teacher productivity tool. Their pain is not parents demanding better products; it is teacher time being absorbed by routine grading and re-explanation. If Pulse could function as a teacher’s homework-handling assistant, with the centre as the customer and the student as the user, the unit economics shift dramatically: a centre with 200 students at RM 30/student/month is RM 6,000/month per centre, with each centre representing a single sales transaction.
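
The arithmetic behind the shift is easy to make explicit. A minimal Python comparison, using only the WTP bands from the interviews above (the single price points chosen within each band are illustrative):

# Sketch: revenue per sales transaction under each framing. Price
# points are picked from the interview WTP bands quoted above.
def monthly_revenue_per_sale(price_rm: float, units_per_sale: int) -> float:
    return price_rm * units_per_sale

b2c_centre_paying = monthly_revenue_per_sale(80, 1)    # one parent; WTP band RM 50-100
b2c_textbook_only = monthly_revenue_per_sale(20, 1)    # one parent; WTP band RM 10-20
b2b_centre        = monthly_revenue_per_sale(30, 200)  # one centre of 200 students

print(f"B2C centre-paying parent: RM {b2c_centre_paying:>8,.0f} / month per sale")
print(f"B2C textbook-only parent: RM {b2c_textbook_only:>8,.0f} / month per sale")
print(f"B2B tutoring centre:      RM {b2b_centre:>8,.0f} / month per sale")
# The same student-facing product earns ~75x more per sales
# transaction when the centre, not the parent, is the buyer.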

Day 5: synthesis

The team runs the affinity diagram on Friday morning. 220 stickies from 22 interviews. The major clusters that emerge:

| Cluster | Sticky count | Verbatim anchor quote |
|---|---|---|
| SPM-rubric alignment is the binding constraint | 38 | “Snapask explanations didn’t match my school’s marking scheme” (S5) |
| 24/7 availability matters more than quality | 27 | “If she gets a question wrong on Tuesday she has to wait until Saturday” (P3) |
| BM/English code-switching is constant in tutoring | 24 | “Sometimes the teacher explains in Malay, sometimes in English. My son uses both.” (P11) |
| Parents distrust AI for children’s education | 19 | “I want to see what questions she’s getting and how she’s doing” (P3) |
| Tutoring-centre owners want teacher productivity, not student replacement | 17 | “Teachers spend most of their time grading homework” (T2) |
| Form-5 students themselves are reluctant about anything that “feels like school” | 15 | “I don’t want another homework app” (S2) |
| WTP varies sharply across household income | 14 | “RM 20 maybe” (P7) vs “RM 80–100” (P3) |
| Existing alternatives (Snapask, Pandai) are seen as too generic | 12 | “Pandai is OK but my son says it’s boring” (P9) |
| Outliers (singletons) | 54 | (heterogeneous; mostly minor frictions) |

Day 5 evening: the segmentation pivot

The team makes a critical decision Friday evening. The original Week-1 framing was direct-to-parent — Pulse as a B2C subscription. Two interview-derived findings challenge this:

  1. WTP heterogeneity is sharper than expected. The centre-paying segment has WTP RM 50–100/month; the textbook-only segment has WTP RM 10–20/month. The former is too small; the latter does not pay back the build cost.
  2. Tutoring centres have higher WTP, simpler sales cycles, and pre-existing teacher-time pain. A B2B sale to centres at RM 30/student/month routes around both the small-centre-paying-segment problem and the WTP-too-low-for-textbook-segment problem.

The team decides to narrow the primary segment to tutoring-centre owners, with the eventual product positioned as a teacher productivity tool that students use under teacher direction. The student-facing surface remains technically similar; the customer, the buyer, and the sales channel all change.

This is a significant pivot from Week 1, but it is exactly the kind of pivot Week 2 is supposed to enable. The signals are evidence-backed (17 sticky-mentions across 4 of 4 tutoring-centre interviews); the alternative is well-defined; the founder-market fit (Aliyah’s tutoring background) carries directly into the new framing. Aliyah resists the pivot at first because her original idea was direct-to-consumer; the team’s discipline of writing pivot conditions explicitly in the Week-1 memo (§19.5) makes the conversation tractable.

Day 5 evening: problem statement v1

The team writes the problem statement v1:

PROBLEM STATEMENT v1 — Pulse
Date: Friday, Week 2

WHO: Klang-Valley-based small-to-medium tutoring centres (10–200 students), specialising in SPM preparation, with 2–8 teachers each. Pilot focus on the ~120 such centres in PJ, Subang, and Klang.

WHAT: Teachers spend 60–70% of their hours on routine grading, re-explanation of SPM-format questions, and creating practice sets. Centre owners cannot easily scale teacher hours because qualified SPM-trained tutors are expensive (RM 60–120/hour) and difficult to retain.

WHEN: Daily; the pain compounds with student count. A centre with 50 students has roughly 200 hours/month of teacher routine work that could be redirected to higher-value teaching.

WHY: Existing solutions (Snapask, Pandai, free YouTube) are designed for student self-study, not for teacher productivity. None align specifically with the SPM rubric. None support BM/English code-switching that matches Malaysian classroom practice. None integrate with how centres actually run.

HOW MUCH: A centre with 100 students currently spends approximately RM 25,000/month on teacher hours, with an estimated 65% (RM 16,250) spent on routine work. At our pilot pricing of RM 30/student/month (RM 3,000/month), the centre saves an estimated RM 10,000+/month in teacher-hour redirection.

EVIDENCE:
  - “Teachers spend most of their time grading homework and giving the same explanations five times a week.” (T2, centre owner, ~80 students)
  - “I would consider paying RM 30 per student per month if it actually worked for SPM.” (T2)
  - “We need something where the answer format matches what SPM wants. None of the apps do that.” (T1, centre owner, ~50 students)
  - “If teachers had time for the harder stuff, my retention rate would go up. Now I lose teachers to bigger centres because they’re tired.” (T3, centre owner, ~150 students)

BOUNDARY:
  - Not in segment: large national chains (BMS, Kumon, Wong Wei Yong). They have internal proprietary systems and do not buy from external vendors at student-level pricing.
  - Not in segment: home-tutor freelancers. Their economics are too variable for a per-student SaaS model.
  - Not in segment (initially): non-SPM tutoring (Cambridge IGCSE, IB, Australian VCE). Different rubric, different sales cycle.

EARLY ADOPTERS: T1, T2, T3, T4 (all four interviewed centre owners). All four passed the §20.1.6 early-adopter test on at least 4 of 5 dimensions. T2 explicitly offered to pilot at his centre at RM 30/student/month for 3 months in exchange for input into the product.

WHAT WE STILL DON’T KNOW:
  1. Whether the SPM-rubric alignment is achievable at the quality bar centres expect, given the limitations of foundation models on Malay-language and Malaysian-specific content.
  2. Whether teacher adoption (teachers using the tool with students) will follow centre-owner adoption (centre owner buying the tool). Adoption may be slower than purchase.
  3. Whether the scale we need (10+ centres in pilot, 1,000+ students) is achievable in the 8-week build window.

OPEN-ENDED ALTERNATIVES: If pilot conversion in Week 6–7 falls below 30% (i.e., fewer than 3 of 10 centres convert from free-trial to paid), reconsider the direct-to-consumer framing for high-WTP urban-Klang-Valley parents.
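The HOW MUCH and WHEN figures are the load-bearing numbers in the statement, so it is worth making the arithmetic executable. The short Python sketch below reproduces the calculation; the 80% displacement fraction and the per-teacher hours are assumptions for illustration, not interview data.

```python
# Worked check of Problem Statement v1's HOW MUCH arithmetic.
# ROUTINE_SHARE comes from the interviews; DISPLACEMENT is an assumed
# share of routine work the tool can absorb, not an interview finding.

STUDENTS = 100
TEACHER_SPEND = 25_000        # RM/month on teacher hours (from the statement)
ROUTINE_SHARE = 0.65          # share of teacher hours spent on routine work
DISPLACEMENT = 0.80           # assumed displacement fraction
PRICE_PER_STUDENT = 30        # RM/student/month pilot pricing

routine_spend = TEACHER_SPEND * ROUTINE_SHARE      # RM 16,250
displaced = routine_spend * DISPLACEMENT           # RM 13,000
subscription = PRICE_PER_STUDENT * STUDENTS        # RM 3,000
net_saving = displaced - subscription              # RM 10,000

print(f"Routine spend: RM {routine_spend:,.0f}/month")
print(f"Net saving:    RM {net_saving:,.0f}/month")
assert net_saving >= 10_000   # the statement's "RM 10,000+/month" claim

# The WHEN figure is consistent under similar assumptions: a 50-student
# centre with ~4 teachers at ~77 contact-hours/month each carries
# 4 * 77 * 0.65 ≈ 200 routine hours/month.
```

Writing the economics this way keeps the claim falsifiable: every constant is a hypothesis the Week 6–7 pilot can test, and the pivot condition fires if the displacement assumption proves optimistic.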

Day 5 late: customer profile and segmentation grid

The team produces the customer profile (the right-hand side of the Value Proposition Canvas) for the centre-owner segment, with separate rough profiles for the teacher and student users so that Week 3’s value-proposition design can address the multi-actor structure.

The segmentation grid:

  • Klang-Valley tutoring centres (10–200 students). Interviews: 4. Modal job: maximise teacher productivity. Modal pain: teachers spend 60–70% of hours on routine work. Early-adopter density: 4 of 4. Selected as primary.
  • Klang-Valley parents, centre-paying (HH income RM 8K+). Interviews: 8. Modal job: help child master SPM. Modal pain: centre availability constrained to Saturday. Early-adopter density: 2 of 8. Worth tracking; possible Phase-2 segment.
  • Klang-Valley parents, textbook-only (HH income < RM 5K). Interviews: 5. Modal job: help child master SPM at low cost. Modal pain: no quality alternative at the price point. Early-adopter density: 0 of 5. Out of scope for MVP (WTP too low).
  • Australian VCE parents (cross-comparison control). Interviews: 2. Modal job: same. Modal pain: different rubric, different system. Early-adopter density: 0 of 2. Confirms the boundary.
  • Form-5 students (across both parental segments above). Interviews: 3. Modal job: pass SPM with minimum effort. Modal pain: school/tutoring is “boring”. Early-adopter density: 0 of 3. Users, not buyers; insight only.

Total: 22 interviews.

The grid is the artefact that justifies the centre-as-primary-customer choice; the early-adopter-density column is a simple per-segment tally over the write-ups (a sketch follows). The team submits the grid Friday at 22:30 alongside the rest of the deliverable bundle.
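Early-adopter density is mechanical to compute. A minimal sketch, assuming each interview write-up records its segment and how many of the five §20.1.6 dimensions the interviewee passed; the data layout and segment labels here are illustrative, not a prescribed format.

```python
# Per-segment early-adopter density from interview write-ups.
# Each record is (segment_label, dimensions_passed out of 5); the
# records shown are an illustrative subset of Team Aroma's 22.
from collections import defaultdict

EARLY_ADOPTER_THRESHOLD = 4   # "at least 4 of 5 dimensions" (§20.1.6 test)

interviews = [
    ("tutoring_centre", 5), ("tutoring_centre", 4),
    ("tutoring_centre", 4), ("tutoring_centre", 5),
    ("parent_centre_paying", 4), ("parent_centre_paying", 2),
    ("parent_textbook_only", 1),
]

totals = defaultdict(int)
adopters = defaultdict(int)
for segment, dims in interviews:
    totals[segment] += 1
    adopters[segment] += dims >= EARLY_ADOPTER_THRESHOLD  # bool counts as 0/1

for segment in totals:
    print(f"{segment}: {adopters[segment]} of {totals[segment]} early adopters")
```

Keeping the tally in code rather than by eye matters once the corpus passes 20 write-ups; it also forces the team to score every interview against the same five dimensions.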

What Team Aroma got right and what they almost got wrong

Three things they did well: (1) they distributed interviews across segments and ran four “tutoring-centre owner” interviews even though the original idea had centred on parents; (2) they were willing to pivot when the evidence pointed away from the Week-1 hypothesis; (3) they wrote pivot conditions into the Week-1 memo, which made the Week-2 pivot procedural rather than emotional.

Three things they almost got wrong: they almost ran 18 of 22 interviews with parents (because that was the Week-1 segment); they almost did not include the negative-control Australian VCE parents (because the trip to interview them seemed off-strategy); they almost wrote a problem statement that hedged across both the centre and direct-parent segments (because Aliyah resisted abandoning the original framing). Each of these would have produced a less-defensible Week 2 deliverable. The team’s discipline in following the §20.1.5 segmentation rules and the §20.2.7 narrowing requirement saved them.

20.6 Course exercises and Week 2 deliverable

Submit the Week 2 deliverable bundle as a shared folder by Friday 23:59; a scripted completeness check (sketched after the artefact list in §20.6.1) is cheap insurance against a missing artefact.

20.6.1 Required artefacts

  1. Twenty interview write-ups (§20.4.3). One per interview, with the full template populated. Pseudonymise interviewees per ethics requirements; keep the master de-identification key in a separate, restricted-access document.
  2. Affinity diagram (§20.4.4). Screenshot of the labelled affinity diagram with major clusters annotated by sticky count.
  3. Workflow map (§20.4.5). Both per-interviewee maps for at least the primary-segment interviews (12+ maps) and a composite segment-level workflow with major variations annotated.
  4. Problem statement v1 (§20.2.5). One page using the structured format. The most-important deliverable. Read aloud at the team’s Friday meeting before submission.
  5. Customer profile (§20.4.6). The right side of the Value Proposition Canvas for the primary segment; rough profiles for any secondary user actors.
  6. Segmentation grid (§20.4.7). The grid with named segments, interview counts, modal jobs and pains, and early-adopter density.
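As flagged above, it is worth scripting the completeness check rather than eyeballing the folder. A minimal sketch, assuming a flat shared-folder layout; every file and directory name here is a hypothetical convention, not a course requirement.

```python
# Pre-submission completeness check for the Week-2 bundle.
# All paths below are hypothetical naming conventions, not course
# requirements; adapt them to your team's actual folder layout.
from pathlib import Path

BUNDLE = Path("week2_bundle")

checks = [
    ("20+ interview write-ups",
     len(list(BUNDLE.glob("interviews/*.md"))) >= 20),
    ("affinity diagram screenshot",
     (BUNDLE / "affinity_diagram.png").exists()),
    ("12+ workflow maps plus composite",
     len(list(BUNDLE.glob("workflows/*.md"))) >= 12
     and (BUNDLE / "workflow_composite.md").exists()),
    ("problem statement v1",
     (BUNDLE / "problem_statement_v1.md").exists()),
    ("customer profile",
     (BUNDLE / "customer_profile.md").exists()),
    ("segmentation grid",
     (BUNDLE / "segmentation_grid.md").exists()),
]

for label, ok in checks:
    print(("OK      " if ok else "MISSING ") + label)
```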

20.6.2 Grading rubric (50 points)

Each component is listed with its point weighting and distinction-level criteria:

  • Interview quantity and distribution (5 points): ≥20 interviews, distributed across primary, adjacent, and negative segments per §20.1.5.
  • Interview quality (15 points): verbatim quotes captured per interview; behavioural framing rather than leading questions; data-sensitivity question included; same-day write-ups.
  • Synthesis rigour (10 points): affinity diagram with named clusters and counts; outlier analysis; quote citations.
  • Workflow mapping depth (5 points): per-interview maps with verb/artefact/duration; segment-level composite.
  • Problem statement quality (10 points): all fields of the §20.2.5 format populated with specific evidence; falsifiable claims; pivot conditions explicit.
  • Segmentation discipline (5 points): named primary segment with rationale; bounded by a negative segment; secondary-segment notes.

Pass: 30. Credit: 36. Distinction: 42. High Distinction: 47.

The team-comprehension penalty from §19.6.2 applies: the team grade is reduced by 5 points if any team member, when asked, cannot articulate the reasoning behind any component their team submitted.

20.6.3 Things to do before Monday of Week 3

By Sunday evening of Week 2, in addition to the deliverable submission:

  • Schedule 5–8 of your interviewees for Week 5–6 alpha testing. Booking the alpha cohort early gives you 3 weeks of lead time and avoids the Week-6 panic.
  • Read Chapter 2 (Five eras of business AI) and §21.1–§21.3 of Chapter 21 (MVP design) before Monday of Week 3. The Chapter 2 reading establishes the technological substrate against which your Week-3 MVP scoping decisions will be made.
  • Discuss with the team any segmentation pivots that emerged in Week 2. If the team has ended the week on a different segment than it ended Week 1 (as Team Aroma did in §20.5), schedule a 30-minute Sunday call to align on the implications for Week 3.

References for this chapter

Customer development and lean methodology

  • Blank, S. (2005). The Four Steps to the Epiphany: Successful Strategies for Products That Win. K&S Ranch.
  • Blank, S. and Dorf, B. (2012). The Startup Owner’s Manual: The Step-by-Step Guide for Building a Great Company. K&S Ranch.
  • Ries, E. (2011). The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. Crown Business.
  • Maurya, A. (2012). Running Lean: Iterate from Plan A to a Plan That Works. O’Reilly.

Jobs-to-be-done and value proposition design

  • Christensen, C. M., Hall, T., Dillon, K., and Duncan, D. S. (2016). Competing Against Luck: The Story of Innovation and Customer Choice. Harper Business.
  • Ulwick, A. W. (2016). Jobs To Be Done: Theory to Practice. Idea Bite Press.
  • Osterwalder, A., Pigneur, Y., Bernarda, G., and Smith, A. (2014). Value Proposition Design: How to Create Products and Services Customers Want. Wiley.
  • Strategyzer AG (2024). The Value Proposition Canvas — official template and guide. strategyzer.com.

Customer interview methodology

  • Fitzpatrick, R. (2013). The Mom Test: How to Talk to Customers and Learn If Your Business Is a Good Idea When Everyone Is Lying to You. Founder Centric.
  • Portigal, S. (2013). Interviewing Users: How to Uncover Compelling Insights. Rosenfeld Media.
  • Goodman, E., Kuniavsky, M., and Moed, A. (2012). Observing the User Experience: A Practitioner’s Guide to User Research. (2nd ed.) Morgan Kaufmann.

Diffusion, adoption, and segmentation

  • Moore, G. A. (1991, revised 2014). Crossing the Chasm: Marketing and Selling Disruptive Products to Mainstream Customers. HarperBusiness.
  • Rogers, E. M. (2003). Diffusion of Innovations. (5th ed.) Free Press.

Cases referenced in §20.3

  • Iansiti, M. and Lakhani, K. R. (2020). Competing in the Age of AI: Strategy and Leadership When Algorithms and Networks Run the World. Harvard Business Review Press. (Watson Health, Ant Group, DBS, Stitch Fix.)
  • Lamarre, E., Smaje, K., and Zemmel, R. (2023). Rewired: The McKinsey Guide to Outcompeting in the Age of Digital and AI. Wiley.
  • Klarna AB (2024). Klarna AI assistant handles two-thirds of customer service chats in its first month. Press release, 28 February 2024.
  • Klarna AB (2025). CEO interview, May 2025; reversal of full-AI customer service strategy.

Affinity diagramming and qualitative research methodology

  • Beyer, H. and Holtzblatt, K. (1998). Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann.
  • Saldaña, J. (2021). The Coding Manual for Qualitative Researchers. (4th ed.) Sage.
  • Miles, M. B., Huberman, A. M., and Saldaña, J. (2020). Qualitative Data Analysis: A Methods Sourcebook. (4th ed.) Sage.

Further reading

For the deepest treatment of customer-discovery interviewing, Portigal’s Interviewing Users is the practitioner reference; Goodman, Kuniavsky, and Moed’s Observing the User Experience is the textbook. For jobs-to-be-done specifically, Christensen et al.’s Competing Against Luck is the readable introduction; Ulwick’s Jobs To Be Done: Theory to Practice is the more rigorous methodology. For the value-proposition canvas, Osterwalder et al.’s Value Proposition Design is the primary reference, and the free Strategyzer templates are the standard production version.

For B2B AI-product-specific customer discovery, the most-current public-record sources are practitioner blogs from a16z (the Andreessen Horowitz AI portfolio newsletters) and First Round Capital (the First Round Review customer-discovery series). Academic literature on customer discovery for AI products specifically is thin; the methodological transfer from generic SaaS customer discovery is mostly direct, with the five distinctive properties in §20.1.3 being the substantive additions.

For Malaysian and Australian customer-discovery contexts, the Cradle and LaunchVic case-study libraries provide locally-grounded examples; reach out to your mentor network for warm intros into their alumni founders, who are typically generous with time when asked specifically and briefly.
