Physical AI Governance: How to Design and Run It at a Production Site


Physical AI is moving from lab to factory floor faster than anyone expected. NVIDIA's GTC made that clear. The World Economic Forum published a white paper in February calling physical AI "the next wave of smart manufacturing." Deloitte followed with their own. The hardware is shipping, the venture money is flowing, and GTC demos are making their way into board presentations.

We've spent the last decade designing and diagnosing digital twins for production sites across mining, manufacturing, recycling, and energy. We've seen enough technology waves to know what comes next. Somebody buys the technology. Somebody deploys it. And then somebody asks: who's governing this thing?

That question should come first.

Why Physical AI Governance Is Different

Physical AI is different from the AI your IT team has been managing. A language model that hallucinates a wrong answer wastes time. A physical AI system that misreads a production state can halt a conveyor, misroute a haul truck, or fail to flag a proximity breach. The stakes are measured in safety incidents, lost production hours, and equipment damage. The regulators are paying attention. The insurers will be next.

The governance conversation is mostly absent from the physical AI vendor pitch. I've reviewed dozens of demo decks this year. The pattern is: advanced capabilities, integrations, ROI projections, reference customers. Zero mention of how the system is governed once it's running.

The vendors aren't hiding anything. The gap is structural. The industry has no shared framework for how you govern physical AI at a production site. The WEF's three-layer governance pyramid (executive accountability, runtime governance, frontline authority) is a useful starting point, but it was written for policy audiences and stops at the conceptual level. It doesn't answer the questions a site manager needs answered: what do I govern, how do I govern it, and what system do I use to do the governing?

The Prior Question

The most common failure mode is starting with the technology instead of the problem, and physical AI governance is following the same pattern. Everyone is asking "how do we deploy this?" Nobody is asking the prior question: does our operation have the capacity to govern these systems once they're deployed?

For a governance system to work, it needs to distinguish at least as many operational states as the physical AI systems it governs. If your crusher has twenty meaningfully different operating conditions and your AI can distinguish eight of them, you don't have a governance problem. You have a blindness problem. You're governing eight states and hoping the other twelve take care of themselves.

Count the states your physical AI can be in. Count the states your governance system can see and respond to. The gap between those two numbers is your exposure.
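
To make the counting concrete, here is a minimal sketch with invented numbers; the systems and counts are placeholders for whatever your own whiteboard session produces.

```python
# Hypothetical counts for illustration only; replace them with the numbers
# from your own whiteboard session.
systems = {
    # system: (states the physical AI can be in, states governance can see and respond to)
    "autonomous haul truck": (24, 10),
    "machine vision safety": (18, 6),
    "maintenance predictor": (12, 9),
}

for name, (ai_states, governed_states) in systems.items():
    exposure = max(ai_states - governed_states, 0)  # states nobody is governing
    print(f"{name}: {governed_states}/{ai_states} states covered, exposure = {exposure}")
```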

Dashboards Are Not Governance

The default response to "how do we govern this?" is to build another dashboard. A new AI system goes live. Someone asks for visibility. A dashboard appears. It shows data points, status indicators, alert counts. Management reviews it weekly. The team feels like they're governing.

They're not. A dashboard that shows ten thousand data points has transferred the burden of governance to whoever is staring at the screen. That person now has to synthesise, prioritise, decide, and act, all from raw data. When they leave, your governance walks out the door with them.

This distinction matters: monitoring, governance, and orchestration are three different things. Monitoring tells you what is happening. Governance determines who makes which decisions, with what authority, under what constraints. Orchestration distributes a decision across every system that needs to change. Most organisations have monitoring (dashboards, alerts, SCADA screens). Some have governance (procedures and meetings). Almost none have orchestration that works across systems at the speed physical AI demands.

You need all three.

Designing Physical AI Governance with the DTDP

Before you pick a governance tool or build a governance dashboard, you need to map the territory. The methodology I developed for this is the Digital Twin Design Process (DTDP), built over six years to solve a simpler version of the same problem: how do you design a connected system that gets used to produce changes in the real world?

The DTDP applied to physical AI governance works in five steps.

Map the problem, not the technology. Pick a physical AI system you've deployed or plan to deploy. An autonomous haul truck. A machine vision safety system. An AI-powered maintenance predictor. Forget the technology and ask: what operational problem was this supposed to solve, and for whom? Each answer is a governance domain. Trying to "govern AI" as a single initiative is like trying to "maintain equipment" as a single initiative. You govern specific systems for specific reasons.

Count the states. For each physical AI system, estimate how many meaningfully different operational states it can be in, and how many of those your current capability can see and respond to. A machine vision system watches six zones. Under normal daytime conditions with standard crew, it works well. What about night shift? Maintenance shutdowns with scaffolding in Zone 3? Contractor crews without the right PPE tags? Rain fogging the cameras? Each of those is an operational state that needs governance. Two hours, a whiteboard, and honest answers from the people who operate the system.

Design the governance loops. For each state where you need governance, design a Sense-Decide-Act-Learn (SDAL) circuit. What signal tells you this state has occurred, and how fast does it arrive? Who interprets it and decides on a response? Where does the action land (a system of record, not someone's head)? And did the action produce the intended outcome? Each SDAL loop is a governance circuit.

Score readiness. Not all governance circuits can be built immediately. Score each one against five dimensions: data (do you have the signals?), infrastructure (can you connect to the systems of record?), process (will existing workflows accommodate the new loops?), people (will the affected teams adopt?), and problem (is this gap worth closing?). Any dimension below 3 out of 5 is a blocker.

Sequence for speed. Pick the governance circuit with the highest risk exposure, the highest readiness score, and the shortest time to a working loop. One physical AI system. One set of ungoverned operational states. One SDAL loop. Prove the pattern works, then expand.
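
As a rough illustration of steps three to five, the sketch below treats each governance circuit as a record carrying its SDAL pieces and readiness scores, then picks the first circuit to build. The systems, states, and scores are invented, and the sequencing rule is simplified to the highest risk exposure among circuits with no blockers.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of DTDP steps 3-5: each governance circuit is an
# SDAL loop with readiness scores; any dimension below 3 out of 5 blocks it.
@dataclass
class GovernanceCircuit:
    system: str           # physical AI system being governed
    state: str            # operational state the circuit covers
    sense: str            # signal that tells you the state has occurred
    decide: str           # who interprets it and decides on a response
    act: str              # system of record the action lands in
    learn: str            # how you check the action produced the outcome
    risk_exposure: int    # 1 (low) to 5 (high)
    readiness: dict = field(default_factory=dict)  # data/infrastructure/process/people/problem, 1-5

    def blockers(self):
        return [dim for dim, score in self.readiness.items() if score < 3]

circuits = [
    GovernanceCircuit(
        system="machine vision safety", state="night shift with scaffolding in Zone 3",
        sense="camera confidence drop plus shutdown flag", decide="shift supervisor",
        act="permit-to-work system", learn="review overrides weekly",
        risk_exposure=5,
        readiness={"data": 4, "infrastructure": 3, "process": 4, "people": 3, "problem": 5},
    ),
    GovernanceCircuit(
        system="maintenance predictor", state="bearing degradation on a running crusher",
        sense="model alert", decide="maintenance planner", act="CMMS work order",
        learn="compare predicted against actual failure", risk_exposure=4,
        readiness={"data": 5, "infrastructure": 2, "process": 3, "people": 4, "problem": 4},
    ),
]

# Step 5, simplified: highest risk exposure among circuits with no blockers.
buildable = [c for c in circuits if not c.blockers()]
first = max(buildable, key=lambda c: c.risk_exposure)
print("Build first:", first.system, "-", first.state)
```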

Running Governed Operations with the Site Twin

Designing governance circuits is half the problem. The other half is running them at the speed physical AI demands.

Consider what a single governance circuit requires. A predictive maintenance model flags bearing degradation on Crusher 2. The governance circuit says: check the prediction against the current maintenance schedule, the production plan for the next 72 hours, spare parts inventory, and crew availability. Decide whether to schedule a replacement now or defer. If you schedule it, the action needs to land in the CMMS as a work order, in the production schedule as a throughput adjustment, in the shift plan as a crew notification, and in the site forecast as an output revision.

That is four systems, three departments, and two time horizons, triggered by one AI signal. No single person has the access or the bandwidth. No committee meets fast enough.

The Site Twin sits above and between your existing systems and runs three functions:

Monitoring: compress upward. The Site Twin compresses raw operational data into five to ten actionable signals per shift. Not a dashboard with ten thousand indicators. The supervisor doesn't get "bearing degradation alert." They get "bearing on Crusher 2 needs replacement by Thursday; here's what that does to your week and here are two options." Local signals projected to system-wide impacts.

Governance: encode the decision structure. The governance circuits from the DTDP live inside the running system. Each circuit defines who decides, with what authority, and what type of response is required: communicate (inform the right people), collaborate (assemble context and options for a shared decision), escalate (raise authority when a response stalls or exceeds scope), or intervene (change the system state to reduce immediate risk). Time-based escalation is built in: the notification that sits unactioned for four hours is what otherwise grows into the urgent fire dealt with over two-way radios and WhatsApp. Building escalation into the governance structure breaks that reactive cycle.

Orchestration: distribute downward. When the planner approves the Thursday replacement, one action triggers coordinated updates across the CMMS, the production schedule, the downstream notification, and the forecast. The planner makes one decision. The Site Twin does the distribution.
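
To show how the three functions hang together, here is a minimal sketch of one circuit at runtime, using the Crusher 2 bearing example: a response type and escalation window on the governance side, and one approved decision fanning out on the orchestration side. The field names, system names, and orchestrate() function are placeholders, not the Site Twin's interface.

```python
from datetime import timedelta

# Illustrative only: these names are placeholders, not a real Site Twin API.
RESPONSE_TYPES = ("communicate", "collaborate", "escalate", "intervene")

circuit = {
    "signal": "bearing degradation on Crusher 2",
    "decider": "maintenance planner",
    "response_type": "collaborate",         # assemble context and options
    "escalate_after": timedelta(hours=4),   # unactioned -> raise authority
    "escalate_to": "maintenance superintendent",
}
assert circuit["response_type"] in RESPONSE_TYPES

def orchestrate(decision):
    """Fan one approved decision out to every system that has to change."""
    actions = [
        ("CMMS", f"create work order: {decision['work']}"),
        ("production schedule", f"adjust throughput around {decision['window']}"),
        ("shift plan", f"notify the crew for {decision['window']}"),
        ("site forecast", "revise output for the affected period"),
    ]
    for system, action in actions:
        print(f"-> {system}: {action}")

orchestrate({"work": "replace Crusher 2 bearing", "window": "Thursday day shift"})
```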

What Stays Human

The Site Twin does not set policy. It does not decide risk appetite. The governance circuits encode decision structure. The monitoring compresses operational reality into context. The orchestration distributes decisions into action. But the policy decisions, the risk calls, the trade-offs between production and safety stay with the people who carry the consequences.

Operators know things the system doesn't. Every time an operator overrides a governance recommendation and logs the reason, the system learns where the circuit's coverage runs out. The correct response is not compliance training. It's investigating what they know that the circuit doesn't, and absorbing that knowledge so the circuit improves.

The system starts conservative: every signal is a recommendation that a human reviews before action. As governance circuits prove reliable, you move the autonomy boundary. First, recommend and wait. Then, act and notify. Eventually, for high-severity, time-critical situations only, act and inform after the fact. The progression is earned, not assumed, and configurable per circuit.
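
One way that boundary might be expressed is as a per-circuit setting; the sketch below is hypothetical and simply names the three stages of the progression.

```python
from enum import Enum

# Hypothetical per-circuit autonomy setting; the names mirror the progression
# described above and are not a product API.
class Autonomy(Enum):
    RECOMMEND_AND_WAIT = 1   # a human reviews every signal before action
    ACT_AND_NOTIFY = 2       # the system acts, the decider is told immediately
    ACT_AND_INFORM = 3       # high-severity, time-critical circuits only

autonomy_by_circuit = {
    "crusher bearing replacement": Autonomy.RECOMMEND_AND_WAIT,  # new circuit: start conservative
    "proximity breach response": Autonomy.ACT_AND_INFORM,        # earned after the circuit proved itself
}
```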

Start with a Whiteboard

Physical AI governance is a design problem before it is a technology problem. The DTDP gives you the methodology to design governance circuits grounded in your real operational states. The Site Twin gives you the system to run those circuits across monitoring, governance, and orchestration at the speed physical AI demands.

Week one is a whiteboard and a conversation. Week twelve is a governed operation. Month six is a system that's smarter than when you started.

If you're evaluating physical AI for your operation, or you've already deployed it and you're starting to wonder who's governing it, start with a Site Twin Assessment or a DTDP Workshop. Two weeks or two days. Independent. We'll map your governance requirements against your current capability and tell you where the gaps are.

---

Geminum is an NVIDIA Inception member. We design and build operational digital twins for production-intensive industries: mining, steel, recycling, and heavy manufacturing. Our Site Twin product unifies production, safety, and maintenance into a single operating loop, enabling teams to make faster decisions with better context, and progressively reduce the human effort required to run safe, high-performing operations.

If you're exploring how physical AI fits into your site operations, we'd like to talk.
