agents

Published: June 23, 2026

World Models in Pieces: Structural Certification for General Agents

By Yikai Lu, Yifei Wu, Xinyu Lu, Tongxin Li

Research TL;DR

"Introduces structural certification to verify when agents' world models reliably support long-horizon planning, proving a bound on error for compositional goals."

Abstract

In the big-world regime, agents cannot be universally capable and their ability is inevitably specialized across a world model in pieces. Consequently, standard uniform guarantees fail to distinguish between the understanding of critical bottlenecks and irrelevant failures. We first formalize this limitation by proving that general agents are not universal, rendering standard worst-case analysis uninformative. To overcome this, we introduce structural certification, a transition-local framework that maps bounded goal-conditioned performance to entry-wise guarantees on the agent's internal world model. Our main contribution is constructive. We provide algorithms that filter specific transitions using deep compositional goals and prove that a general agent on these goals has a structural world model with an $\mathcal{O}(1/n) + \mathcal{O}(\delta)$ error bound. Conversely, this bound is tight in the small-$\delta$ regime, whose existence is explicitly guaranteed by our certification. These results enable the certifiable deployment of general agents by localizing the specific transitions where long-horizon planning is reliable.

Read full paper on arXiv →

Related Research

InSight: Self-Guided Skill Acquisition via Steerable VLAs

OpenThoughts-Agent: Data Recipes for Agentic Models

Grading the Grader: Lessons from Evaluating an Agentic Data Analysis System