May 2026 · Design

Methodology in action — a prototype

Stage 2 of a series on the design choices behind CoNoggin. Stage 1 mapped the field. This one runs a worked example.

The question

Can the nine parameters from Stage 1 be applied to a real case, used to pick a complementary set of methodologies, and turned into a CoNoggin change intervention that holds together?

Inputs

Scenario

A public-broadcasting newsroom adopts AI research tools — fast triangulation of facts, lead surfacing, long-document structuring — without weakening verification discipline. A documented, current pressure across the industry.

The room

What's already true, before any goal is authored:

Public-service broadcaster, charter-bound to inform, educate, entertain. Trust is the operating capital.
News division, ~6,000 staff, sub-units (Investigations · World · Verify · Programmes).
Published guidelines on AI use in journalism (2024–2025) emphasising human oversight, transparency, accuracy.
AI-saturated information environment; regulator and audience scrutiny on impartiality.
Editorial-standards-coded culture; long memory for high-profile errors.

The goal (three-field narrative input)

Authored by the Head of News Operations, Newsroom Investigations:

New goal · Simple

For my team

The change (and how we'll see it)

The team starts using AI research tools regularly for early scoping — without weakening verification discipline. We'll see it in: AI-assisted research entering early drafts with the same standards we apply to anything else. No AI-derived material reaching script without a human-verified trail.

For whom

The full Newsroom Investigations team — 30 people across senior reporters, producers, and editors. Mixed exposure to AI tools so far; mixed appetites; same standards required of all.

The obstacle

Time pressure. When an AI tool gives a fast plausible answer, the temptation is to take it as a starting point and skip the verification step. Two near-misses already this quarter.

Or co-author with CoNoggin

The nine-parameter pass

Question	Read	Confidence
What kind of problem is this? (Cynefin)	Complex. AI tooling is too new for stable best practice; behaviour patterns must emerge through small experiments. Probe-sense-respond, not analyse-then-prescribe.	High
What's the change?	Verification rigour preserved while AI use expands.	High
What's the behaviour shift?	Journalists run an explicit verification step on AI-assisted material before it reaches draft scripts.	High
Where's the audience starting from?	Mixed across Stages of Change.	Medium
Why now?	Two near-misses already; ongoing regulator and audience scrutiny.	High
How will we know?	Lead: verification-step completion when AI is used. Lagging: error rate, near-miss count.	High
By when?	~12 weeks to embed habits; ongoing reinforcement after.	Medium
What's the lead measure?	Frequency of explicit verification steps in AI-assisted drafts.	High
What's the obstacle?	Named: time pressure → lower verification's Ability cost (Fogg's B = M × A × P).	High
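The pass above can be treated as structured data: nine reads, each with a confidence level. The sketch below is illustrative only, not CoNoggin's implementation. The `Read` class and `clarifying_questions` helper are hypothetical names, and the reads are abbreviated from the table. It shows how the Medium-confidence rows could be surfaced as candidates for a follow-up question instead of blocking the goal author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    question: str
    read: str
    confidence: str  # "High" or "Medium", as in the table above


# Abbreviated reads copied from the nine-parameter table.
PASS = [
    Read("What kind of problem is this?", "Complex", "High"),
    Read("What's the change?", "Verification rigour preserved", "High"),
    Read("What's the behaviour shift?", "Explicit verification step", "High"),
    Read("Where's the audience starting from?", "Mixed", "Medium"),
    Read("Why now?", "Two near-misses; scrutiny", "High"),
    Read("How will we know?", "Lead and lagging measures", "High"),
    Read("By when?", "~12 weeks", "Medium"),
    Read("What's the lead measure?", "Verification-step frequency", "High"),
    Read("What's the obstacle?", "Time pressure", "High"),
]


def clarifying_questions(reads):
    """Return the questions whose read is below High confidence."""
    return [r.question for r in reads if r.confidence != "High"]


print(clarifying_questions(PASS))
# ["Where's the audience starting from?", 'By when?']
```

Only the soft reads come back, so the goal author is asked about those and nothing else.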

Outputs

Methodology pick

A complementary set, not a single framework. The four load-bearing cards:

Sense-making

Cynefin

Dave Snowden · 2007

Five domains — Simple · Complicated · Complex · Chaotic · Confused — each demanding a different kind of action. Classifies this problem as Complex.

Primary stance · probe-sense-respond

Behaviour

Tiny Habits / B=MAP

B.J. Fogg · 2019

Behaviour = Motivation × Ability × Prompt. Time pressure attacks the Ability axis: lower verification's ability cost, don't raise motivation.

Behavioural backbone · addresses the named obstacle

Performance support

Five Moments of Need

Mosher & Gottfredson · 2009

Training addresses New and More; performance support addresses Apply, Solve, Change. Most things shouldn't be a course.

Discipline · keeps work in real journalism

Evaluation

Reverse Kirkpatrick

Brinkerhoff and others · 2000s

Start at the desired Level 4 result and design backwards. Produces a credible lead measure, distinct from the lagging result.

Measurement architecture · designed backwards

The full set, with reasons:

Sense-making — Cynefin (Snowden, 2007). Classifies the problem as Complex. Rules out a course-led intervention; commits to probe-sense-respond.
Behaviour — Tiny Habits / B=MAP (Fogg, 2019). Behaviour = Motivation × Ability × Prompt. Time pressure attacks Ability; lower verification's ability cost rather than raise motivation.
Performance support — Five Moments of Need (Mosher & Gottfredson, 2009) + 70-20-10. This goal lives in Apply / Solve / Change, not New / More. Most learning happens in real journalism, not in a course.
Discipline — Action Mapping (Moore, 2008). Every activity collapses to observable on-the-job behaviour. No “understand AI” activities.
Framing — Self-Determination Theory (Deci & Ryan, 1985). Autonomy, competence, relatedness. Craft augmentation, not policing.
Evaluation — Reverse Kirkpatrick. Start from Level 4 (no AI-induced editorial errors) and design backwards to leading-measure architecture.

Not used: ADKAR, Kotter's 8 Steps. Both assume a stable end state. AI tooling isn't stabilising; ADKAR's Reinforcement phase can't lock in a behaviour the field will keep updating.

Activity composition (within the org's allowed palette)

The organisation's vocabulary of permitted activity types:

1. Asynchronous interactive activities · 2. Collaborative trio cards · 3. Submitted assignments to manager · 4. Briefing docs or videos · 5. Small sub-team meetings of 5 · 6. Recorded webinars.

The composed outline as it appears in CoNoggin — eleven activities in the goal's bucket:

Goal · Newsroom Investigations · 12 weeks

AI-assisted research with verification standards intact

11 activities · ~12 weeks · 30 people

Briefing — Verification under AI: what counts · W0 · 5 min · solo
Verification reflection (private) · W0 · 5 min · vault
Webinar — kickoff with senior editor + Q&A · W1 · 45 min · recorded
Async — the plausible-but-wrong walk · W1 · 15 min · revisitable
Async — the plausible-but-wrong simulator · W2–11 · 5 min · weekly
Weekly verification capture (lead measure) · W2–11 · 3 lines · weekly
Trio AI-trace audit · W3·5·7·9 · fortnightly · trio
Sub-team pre-mortem · W5 · 60 min · team-of-5
Submitted assignment to manager · W8 · review · gate
Sub-team wrap — what stuck, what didn't · W12 · 30 min · team-of-5
Async — what changed · W12 · solo

The full table, with what each activity does:

#	Activity	What it does
1	Briefing doc/video (week 0)	Verification under AI: what counts. Five-min video + one-page doc from leadership.
2	Async — verification reflection (week 0, private)	Where my habit is strongest, where it might slip. Stays in the journalist's vault.
3	Webinar recorded (week 1)	45-min kickoff. Senior editor + a journalist using AI tools well + Q&A.
4	Async — the plausible-but-wrong walk (week 1)	15-min self-paced module. Three composite cases on the team's beat. Reader makes choices, sees consequences. Revisitable.
5	Async — the plausible-but-wrong simulator (weekly)	5-min weekly. The system gives confident, partly-fabricated answers; the journalist catches and traces.
6	Async — weekly verification capture (weeks 2–11)	Three lines per week. One AI-assisted piece, what was verified, what almost slipped. The capture is the lead measure.
7	Trio audit (weeks 3, 5, 7, 9)	Trios audit one of each member's AI-assisted pieces against a rubric. Roles rotate.
8	Sub-team pre-mortem (week 5)	60-min per sub-team. What does it look like when AI-induced error reaches broadcast? Risk register produced.
9	Submitted assignment to manager (week 8)	One real piece with the verification trace shown explicitly. Manager reviews.
10	Sub-team wrap (week 12)	30-min per sub-team. What stuck, what didn't, carry-forward proposals.
11	Async — what changed (week 12)	Self-rated habit shift; named verification patterns; opt-in knowledge-card.
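One way to check that a composition stays inside the org's palette is to tag each activity with its palette type and flag anything that falls outside. This is a minimal sketch under stated assumptions: the short type keys and the `off_palette` helper are hypothetical, and the activity-to-type mapping is read off the table above, not taken from CoNoggin itself.

```python
from collections import Counter

# The six permitted activity types, with hypothetical short keys.
PALETTE = {
    "async_interactive",
    "trio_card",
    "manager_assignment",
    "briefing",
    "sub_team_meeting",
    "recorded_webinar",
}

# (number, activity, palette type) for the eleven activities in the table.
OUTLINE = [
    (1, "Briefing doc/video", "briefing"),
    (2, "Verification reflection", "async_interactive"),
    (3, "Kickoff webinar", "recorded_webinar"),
    (4, "Plausible-but-wrong walk", "async_interactive"),
    (5, "Plausible-but-wrong simulator", "async_interactive"),
    (6, "Weekly verification capture", "async_interactive"),
    (7, "Trio audit", "trio_card"),
    (8, "Sub-team pre-mortem", "sub_team_meeting"),
    (9, "Submitted assignment to manager", "manager_assignment"),
    (10, "Sub-team wrap", "sub_team_meeting"),
    (11, "What changed", "async_interactive"),
]


def off_palette(outline, palette):
    """Return (number, activity) for every activity whose type is not permitted."""
    return [(n, name) for n, name, kind in outline if kind not in palette]


counts = Counter(kind for _, _, kind in OUTLINE)
print(off_palette(OUTLINE, PALETTE))  # []
print(counts.most_common(1))          # [('async_interactive', 5)]

# A hypothetical draft with a live activity added is flagged immediately.
draft = OUTLINE + [(12, "Weekly live role-play", "live_roleplay")]
print(off_palette(draft, PALETTE))    # [(12, 'Weekly live role-play')]
```

The count makes the async-heavy shape of the outline visible: five of the eleven activities are asynchronous interactive.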

Measurement

#	Measure	Success
1	Opened; ack-line submitted	≥80% / ≥75%
2	Completion	≥80% week 1
3	Attended-or-watched in 7d; capture submitted	≥90% / ≥75%
4	Completion; failure modes flagged	≥85% / ≥3 of 5 first pass
5	Weekly completion; per-failure-mode catch rate	≥70% sustained; catch rate ≥80% by w6, ≥90% by w11
6	Weekly submission rate per person (lead measure)	≥75% sustained
7	Audits submitted; rubric scores	≥90% complete; scores trending upward
8	Session held; risk register	6/6 sub-teams; ≥5 named failure modes
9	Submission (gate); rubric scores	100%; ≥80% meets-standard
10	Session held; carry-forward proposals	6/6 sub-teams; ≥3 each
11	Completion; self-rated habit shift (1–5)	≥85% / ≥70% rate ≥3
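The lead measure in row 6 reduces to a small calculation. The sketch below assumes that "sustained" means every week from 2 to 11 meets the 75% threshold on its own. That reading is an assumption, since the table does not define the term. The function names are hypothetical and the capture data is synthetic, for illustration only.

```python
def weekly_rates(captures, cohort_size, weeks):
    """Share of the cohort submitting a capture in each week.

    captures: iterable of (person_id, week) pairs, one per submitted capture.
    """
    seen = {w: set() for w in weeks}
    for person, week in captures:
        if week in seen:
            seen[week].add(person)  # a set, so duplicate submissions count once
    return {w: len(seen[w]) / cohort_size for w in weeks}


def sustained(rates, threshold=0.75):
    """True only if every week meets the threshold (assumed meaning of 'sustained')."""
    return all(rate >= threshold for rate in rates.values())


WEEKS = range(2, 12)  # capture runs weeks 2-11
COHORT = 30

# Synthetic data: 24 of 30 people submit each week, except week 7, when 21 do.
captures = [(p, w) for w in WEEKS for p in range(21 if w == 7 else 24)]

rates = weekly_rates(captures, COHORT, WEEKS)
dipped = [w for w, r in rates.items() if r < 0.75]
print(sustained(rates), dipped)  # False [7]
```

Under this reading a single dipped week fails the measure. A rolling average would be more forgiving, but it would also hide exactly the slip the capture exists to surface.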

Intervention judgment (Reverse Kirkpatrick)

Four headline numbers:

L4 — Results. Zero AI-induced editorial errors reach broadcast in the 12-week window or the 12 weeks following. (Lagging.)
L3 — Behaviour. Weekly capture submission rate ≥75% sustained AND week-8 exemplar submissions ≥80% meets-standard.
L2 — Learning. Simulator catch rate ≥90% by week 11.
L1 — Reaction. Week-12 self-rating ≥3 from ≥70% of cohort + qualitative wrap-session sentiment positive on craft-augmentation framing.

Successful if L1–L3 hit and L4 holds across the trailing window.
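The success rule can be written out as a single check over the four levels. This is a sketch with hypothetical field names, not CoNoggin's data model. The thresholds are the ones stated above, and the wrap-session sentiment is kept as a human-recorded boolean because the text treats it as a qualitative judgement.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    ai_errors_broadcast: int          # L4: 12-week window plus the 12 weeks following
    lowest_weekly_capture_rate: float # L3: weakest week of capture submission
    exemplar_meets_standard: float    # L3: share of week-8 submissions meeting standard
    simulator_catch_rate_w11: float   # L2
    self_rating_share_ge3: float      # L1: share of cohort self-rating >= 3 at week 12
    wrap_sentiment_positive: bool     # L1: qualitative, recorded by a human


def judge(o):
    """Evaluate each Kirkpatrick level, then the overall rule: L1-L3 hit and L4 holds."""
    levels = {
        "L1": o.self_rating_share_ge3 >= 0.70 and o.wrap_sentiment_positive,
        "L2": o.simulator_catch_rate_w11 >= 0.90,
        "L3": o.lowest_weekly_capture_rate >= 0.75
        and o.exemplar_meets_standard >= 0.80,
        "L4": o.ai_errors_broadcast == 0,
    }
    levels["successful"] = all(levels.values())
    return levels


# Illustrative values only: everything hits except the simulator catch rate.
example = Outcome(0, 0.78, 0.85, 0.88, 0.74, True)
print(judge(example))
# {'L1': True, 'L2': False, 'L3': True, 'L4': True, 'successful': False}
```

Returning the per-level results alongside the verdict shows which level caused a miss, which is the point of designing backwards from Level 4.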

What worked

The nine-parameter pass produced a clear methodology pick from a 200-word goal + standing context. No further intake form.
Productive friction between methodologies earned its keep. Each cluster compensated for another's weakness — Tiny Habits without sense-making becomes habit-formation that misses the domain; sense-making without Tiny Habits is “let's experiment” with no behavioural backbone.
The org's allowed activity palette kept the composition realistic without changing the methodology stance. Same intervention, different organisation, different activity composition. The stance is universal; the composition is local.
The lead measure (weekly verification capture) carried the story. Designed to survive the same constraint the goal does — three lines, not a form.

What didn't, or wobbled

Two soft-confidence reads (audience starting state; time horizon) needed one targeted clarifying question rather than gating the goal author. Worth designing in as a fallback, not as a default.
First-pass tendency to over-instrument the measurement — quality-scoring every text artefact across the cohort. Real cost; rarely acted on. Simpler architecture: capture the simple thing; flag exceptions for the manager; let human judgement do quality work.
Initial activity composition over-reached on AI-native activities (live group choreography, weekly synchronous role-plays). The org-palette constraint forced honesty; the async-heavy backbone is more credible. Adding the role-play and scenario-walk back as async-interactive flavours kept the training value at lower cost.

Conclusions (for now)

Methodology stance is universal; activity composition is local. Two distinct problems, both required.
One question shapes everything downstream — what kind of problem is this? Get this wrong and every methodology choice that follows is misapplied.
Surface stays human; substrate stays disciplined. Goal is articulated as narrative; activity outline is deterministic; measurement is structured. None competes with the others when they're properly separated.
The simplicity imperative applies to measurement, not just the intervention. Measurement has to survive the same constraint the intervention does.

CoNoggin is built by Alt Shift Lab. We're in pilot with one client and opening to a small group later this year.