Your governance committee is applying the same framework to a chatbot and an autonomous trading system.
That uniform treatment is the structural failure most boards cannot see until a material loss makes it visible. Episode 4 introduced the AI Risk Classification Matrix — the four-quadrant instrument that maps any AI investment by Deployment Autonomy and Reversibility of Consequence.
This issue delivers the operational version: an eight-question scorecard you can run against any specific AI investment in your organization’s portfolio. Two questions per dimension. The pattern of your answers tells you which quadrant the investment belongs in — and whether the governance structure currently applied to it actually fits.
HOW TO USE THE SCORECARD
Pick one specific AI investment in your organization’s active portfolio. Not the AI portfolio in aggregate — one investment. The scorecard is designed for individual investment classification.
Answer all eight questions. Each pair maps to one dimension of the matrix. Score honestly: the value is in surfacing where governance is misaligned, not in confirming what you already believe.
The scorecard returns a quadrant assignment and a governance prescription. Compare that prescription to the governance structure currently applied. If they don’t match, you have a classification gap — the upstream cause of the failure patterns documented in Episodes 1, 2, and 3.
THE AI INVESTMENT CLASSIFICATION SCORECARD
DIMENSION 1 Deployment Autonomy
How independently does the system act on individual decisions?
Q1. When this system produces an output, does a human review and approve each individual output before any consequence follows?
Scoring: Yes, human reviews each output = LOW autonomy. No, the system acts at scale and humans review aggregate performance = HIGH autonomy. Mixed (some outputs reviewed, some not) = HIGH autonomy with oversight gaps.
What it tells you: If outputs reach consequential state without individual human review, the system is acting — not suggesting. That is high autonomy regardless of how it was originally categorized.
Q2. Could an incorrect output from this system trigger a binding action (financial transaction, regulatory filing, customer-facing decision, automated escalation) before any human reviewer would catch it?
Scoring: No, human review is required before any binding action = LOW autonomy. Yes, the system can trigger binding action autonomously = HIGH autonomy.
What it tells you: This is the action-test. If the system can produce binding consequences without a review checkpoint, it is operating in the high-autonomy half of the matrix.
DIMENSION 2 Reversibility of Consequence
If the system produces an incorrect output, how reversible is the resulting harm?
Q3. If this system produced an incorrect output today, would the resulting harm be detectable, correctable, and reversible without material consequence to the organization, customer, or third party?
Scoring: Yes, errors are catchable and correctable without consequence = HIGH reversibility. No, errors produce durable financial, legal, reputational, or human-impact consequences before correction = LOW reversibility.
What it tells you: This is the consequence-test. The question is not ‘is the system accurate’ but ‘what happens if it isn’t.’ If the answer involves litigation, regulatory action, or customer harm that cannot be unwound, reversibility is low.
Q4. If an incorrect output from this system became public — reported by a regulator, surfaced in litigation, or disclosed in an audit — what would the cost of remediation, restitution, or reputational repair look like?
Scoring: Trivial or contained = HIGH reversibility. Material to the operating budget, the board agenda, or the brand = LOW reversibility.
What it tells you: The reversibility test is not technical — it is institutional. If a single incorrect output could put the system on the board agenda or in a regulator’s file, the consequence profile is low-reversibility regardless of error rate.
DIMENSION 3 Current Governance Architecture
What governance is the investment receiving today?
Q5. Where in the organization’s structure is this investment currently being reviewed — standard IT procurement, an AI-specific governance committee, business unit leadership, board-level oversight, or a combination?
Scoring: Standard IT procurement only = lightest tier. AI-specific committee or business leadership = elevated tier. Documented board-level oversight with defined escalation = highest tier.
What it tells you: This question surfaces the current governance assignment. The next question tests whether that assignment is appropriate.
Q6. What evidence threshold, if crossed, would trigger a formal review or capital stop on this investment? Is that threshold documented?
Scoring: Documented threshold with defined escalation = governance is operational. Implied or undocumented = governance is reactive, not architectural. No threshold has been defined = the investment is operating without an escalation trigger.
What it tells you: Per the Evidence Discipline Audit (Newsletter Issue #3), undocumented thresholds are the failure mode that produces the Watson Health pattern. If this question returns ‘no threshold,’ the governance gap exists regardless of the matrix quadrant.
DIMENSION 4 External Constraint Layer
Are external rules in scope, and is that scope acknowledged in the governance structure?
Q7. Does this investment’s use case fall under any external regulatory framework that imposes specific obligations — EU AI Act high-risk categories, sector-specific regulation (financial services, healthcare, employment), or jurisdictional requirements your organization is exposed to?
Scoring: No external framework applies = LOW external constraint. Sector regulation applies but obligations are administered by legal/compliance only = MEDIUM, with potential gap. External framework applies AND obligation cost is priced into the investment business case = HIGH, with proper integration.
What it tells you: If external regulatory obligations apply but are not yet priced into the business case, the investment will absorb the obligation cost as overrun after capital is committed. That is the failure mode this dimension surfaces.
Q8. If this system is supplied by a vendor: does the vendor agreement explicitly address responsibility allocation for incorrect outputs, regulatory non-compliance, or substantial modification by your organization — and does it survive contract renewal?
Scoring: Yes, fully addressed and durable = LOW vendor risk. Partially addressed or pre-AI Act language = MEDIUM, with pass-through exposure. Not addressed or vendor disclaims responsibility = HIGH, with unassigned risk.
What it tells you: Many AI vendor contracts were negotiated before substantial AI regulation took effect. If the contract does not assign accountability for the issues that matter, the risk has not disappeared — it has been left unallocated.
INTERPRETING THE RESULT
Pattern recognition across the eight answers maps to a quadrant assignment:
Quadrant | Pattern of Answers | Required Governance Architecture
Q1: Advisory AI (Low Autonomy / High Reversibility) | Q1–2 = LOW autonomy; Q3–4 = HIGH reversibility | Lightweight. Standard IT procurement review. Periodic performance monitoring. No board escalation required.
Q2: Monitored AI (Low Autonomy / Low Reversibility) | Q1–2 = LOW autonomy; Q3–4 = LOW reversibility | Elevated. Domain-expert sign-off. Defined escalation thresholds. Regular bias and accuracy audits. Legal or compliance review.
Q3: Scaled AI (High Autonomy / High Reversibility) | Q1–2 = HIGH autonomy; Q3–4 = HIGH reversibility | Process governance. Monitoring architecture. Drift detection. Population-level review, not individual output review.
Q4: Autonomous AI (High Autonomy / Low Reversibility) | Q1–2 = HIGH autonomy; Q3–4 = LOW reversibility | Board-level oversight. Mandatory human review thresholds. Pre-deployment bias testing. Documented escalation triggers. Regulatory compliance review may be required depending on use case and jurisdiction.
The Cross-Check: Now compare the Required Governance Architecture for your investment's quadrant to the answers you gave on Questions 5 and 6. If the current architecture is lighter than the quadrant requires, you have a classification gap. If the threshold from Question 6 is undocumented, the gap is structural, not an oversight that can be closed at the next committee meeting.
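The quadrant mapping and cross-check above can be expressed as a short sketch. This is illustrative only, not published tooling: the function names and the numeric tier scale (1 = procurement-only, 2 = elevated, 3 = board-level) are assumptions introduced here, and the rule that a single HIGH-autonomy or LOW-reversibility answer is decisive follows the "Mixed = HIGH autonomy" scoring note under Question 1.

```python
# Illustrative sketch of the scorecard's quadrant mapping and cross-check.
# Function names and the tier scale are assumptions, not real governance tooling.

GOVERNANCE_TIERS = {  # required tier, lightest (1) to heaviest (3)
    "Q1: Advisory AI": 1,      # standard IT procurement review
    "Q2: Monitored AI": 2,     # elevated: sign-off, thresholds, audits
    "Q3: Scaled AI": 2,        # process governance, population-level review
    "Q4: Autonomous AI": 3,    # board-level oversight
}

def classify(q1, q2, q3, q4):
    """Map HIGH/LOW answers on Questions 1-4 to a matrix quadrant.

    Conservative rule, consistent with the 'Mixed = HIGH autonomy'
    scoring note: one HIGH-autonomy answer places the system in the
    high-autonomy half; one LOW-reversibility answer places it in the
    low-reversibility half.
    """
    autonomy = "HIGH" if "HIGH" in (q1, q2) else "LOW"
    reversibility = "LOW" if "LOW" in (q3, q4) else "HIGH"
    return {
        ("LOW", "HIGH"): "Q1: Advisory AI",
        ("LOW", "LOW"): "Q2: Monitored AI",
        ("HIGH", "HIGH"): "Q3: Scaled AI",
        ("HIGH", "LOW"): "Q4: Autonomous AI",
    }[(autonomy, reversibility)]

def classification_gap(quadrant, current_tier):
    """How many tiers lighter the current governance is than required."""
    return max(0, GOVERNANCE_TIERS[quadrant] - current_tier)

# Applied example from this issue: the claims-routing system answers
# HIGH, HIGH on autonomy and LOW, LOW on reversibility, and is governed
# at tier 1 (IT procurement only).
quadrant = classify("HIGH", "HIGH", "LOW", "LOW")
print(quadrant, "| gap:", classification_gap(quadrant, current_tier=1))
# Q4: Autonomous AI | gap: 2
```

The gap of 2 matches the diagnosis in the applied example below: a Q4 investment governed with Q1 architecture is two tiers light.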
APPLIED EXAMPLE
Consider a Director of Operations at a regional health insurer whose organization has deployed an AI-assisted claims-routing system. The system uses ML to flag claims for additional review or fast-track them to payment. Approximately 60 percent of claims pass through autonomously.
Running the scorecard:
Q1: Sixty percent of claims are routed without individual human review. → HIGH autonomy.
Q2: Auto-routed claims trigger payment workflows or denial letters before human review. → HIGH autonomy.
Q3: An incorrect denial could cause patient financial harm and trigger regulatory complaint. Reversal is technically possible but the harm — delayed care, financial stress, complaint record — is largely durable. → LOW reversibility.
Q4: A pattern of incorrect denials would be a state insurance commissioner matter. → LOW reversibility.
Q5: System reviewed by IT procurement at deployment, with quarterly performance reports to the operations VP. No board-level escalation defined.
Q6: No documented threshold for triggering a capital stop or review. The team has discussed one but never formalized it.
Q7: Sector regulation (state insurance code, federal claims processing rules) applies. Compliance team owns it; obligation cost is not in the investment business case.
Q8: Vendor contract pre-dates current AI regulation. Liability for system errors is largely vendor-disclaimed.
Quadrant Assignment & Diagnosis: Q4, Autonomous AI. The required governance architecture is board-level oversight with mandatory human review thresholds, pre-deployment bias testing, documented escalation triggers, and a regulatory compliance review. The current architecture (Q5) is IT procurement plus quarterly operations reports — the structure appropriate for Q1 Advisory AI, not Q4. The classification gap is two quadrants wide. The Q6 threshold is undocumented, which means escalation is reactive. The Q7 obligation cost sits in compliance only, not in the business case. The Q8 vendor allocation has not been updated for current regulation. Four distinct gaps. None of them indicates weak intent. All of them indicate that the governance architecture was designed for a different risk profile than the investment actually carries.
THREE QUESTIONS TO ASK MONDAY
1 | Has every active AI investment over a defined materiality threshold been classified into a quadrant of the matrix — with that classification documented in the governance record, not just the business case? If classification exists only in informal discussion, the governance architecture cannot be tested against it.
2 | For the largest active Q4 investment (High Autonomy / Low Reversibility): does the governance currently applied to it match the architecture the matrix prescribes — or is it operating with Q1 or Q2 oversight? Q4 misclassification is the upstream cause of the failure pattern documented in Episodes 1–3. If the answer is 'Q1 or Q2 oversight,' that is the highest-leverage governance correction available.
3 | Where in the capital approval workflow does the classification question get asked — before the ROI is calculated, or after legal review of the proposed contract? If classification happens after legal review, governance is operating downstream of capital commitment. The cost of misclassification has already been locked in by the time the question is asked. |
WHAT’S NEXT
Episode 4 closed by signaling the external constraint layer: the EU AI Act's Annex III high-risk system requirements take effect on August 2, 2026, and those categories overlap most clearly with Q4 of the matrix — the highest-risk quadrant the scorecard surfaces.
Episode 5 — publishing next week — opens that constraint layer fully. The argument: the EU AI Act is not a compliance event. It is a re-pricing event. Organizations treating it as compliance will price AI investments at pre-Act economics and absorb the obligation stack as overrun after capital is committed. The episode introduces a four-field classification gate that belongs upstream of every Annex III investment decision — the regulatory companion to the matrix this issue makes operational.
Watch Episode 4: 'How to Classify AI Risk Before You Approve It' is available now. Link in the description and on the channel. If this scorecard surfaced classification gaps inside your organization, forwarding this issue to a colleague who governs AI investments is the most valuable action you can take.
Strategic Risk Lab | The Governance Brief | Issue #4 | May 4, 2026
AI Strategy, Risk & Governance for Decision Makers | [email protected]
