claim
active
claim:one-word-is-enough-for-both-agentic-and-latent-visual-reasoning
One word is enough for both agentic and latent visual reasoning.
The paper's thesis, as stated in its title.
Source paper
extracted_from: Ziyu Guo · Rain Liu · Xinyan Chen · Pheng-Ann Heng
Neighborhood — ranked by edge-count
Papers (1)
paper
Communities (3)
community
- Active inference & agent ecology (members_of): Free energy minimization, Markov blankets, trust gradients, and multi-agent rhythm/deferral frameworks
- Methods bridging external tool-use agents and internal hidden reasoning, examining trade-offs like context-switching latency and unified token representations.
- Functional token unified reasoning (members_of): Single discrete word bridging agentic and latent visual reasoning in ATLAS framework
Frameworks (1)
framework
- ATLAS Framework (cites): A framework where a single discrete word (functional token) serves both agentic operation and latent visual reasoning, requiring no visual supervision.
Concepts (1)
concept
- Functional Token (cites): A discrete token in the vocabulary that represents a visual operation (e.g., <|Line|>, <|Shape|>, <|Text|>), generated via next-token prediction within autoregressive sequences.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Core idea of ATLAS: a single discrete token serves dual purpose of operation specification and latent reasoning.
- Core framework proposing discrete functional tokens as a unified solution for visual reasoning in VLMs, bridging agentic and latent approaches.
- Paradigm where VLM acts as controller generating code or tool calls to external modules for visual operations, incurring context-switching latency.
- Core claim of the paper: a unified token bridges the gaps.
- Statement of existing alternatives to direct generation.
- Reasoning approach using learnable hidden embeddings.
- Author's interpretive assertion on the direction of the field.
- Core claim: Turing test and brain homology fail for synthetic, AI, and radically non-human agents; new frameworks required.
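The "related by similarity" criterion above (cosine ≥ 0.65 and no typed edge) can be illustrated with a minimal sketch. This is not the graph's actual implementation: the function names, the toy embeddings, and the edge representation as (source, target) pairs are all assumptions made for illustration; only the 0.65 threshold and the "no typed edge" condition come from the page.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similar_untyped(entity, embeddings, typed_edges, threshold=0.65):
    """Return (neighbor, score) pairs with cosine >= threshold and no typed edge to `entity`.

    `embeddings` maps entity name -> vector; `typed_edges` is a list of
    (source, target) pairs. Both structures are hypothetical.
    """
    # Collect every entity already connected to `entity` by a typed edge, in either direction.
    linked = {b for a, b in typed_edges if a == entity}
    linked |= {a for a, b in typed_edges if b == entity}
    hits = []
    for other, vec in embeddings.items():
        if other == entity or other in linked:
            continue
        score = cosine(embeddings[entity], vec)
        if score >= threshold:
            hits.append((other, score))
    # Highest similarity first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data (invented): "atlas" is similar but already linked, "far" is dissimilar.
embeddings = {
    "claim": [1.0, 0.0],
    "atlas": [0.9, 0.1],
    "dup": [0.8, 0.6],
    "far": [0.0, 1.0],
}
typed_edges = [("claim", "atlas")]
result = similar_untyped("claim", embeddings, typed_edges)
```

In this toy case only "dup" survives the filter: it clears the threshold and has no typed edge, which is exactly the kind of entity the section flags as a candidate for a new edge or an unrecognized duplicate.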