One word is enough for both agentic and latent visual reasoning.

The paper's thesis, as stated in its title.

Source paper

extracted_from

ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both

Ziyu Guo · Rain Liu · Xinyan Chen · Pheng-Ann Heng

Neighborhood — ranked by edge-count

Papers (1)

paper

ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both
mentions

Communities (3)

community

Active inference & agent ecology
members_of
Free energy minimization, Markov blankets, trust gradients, and multi-agent rhythm/deferral frameworks
Hybrid agentic-latent reasoning architectures
members_of
Methods bridging external tool-use agents and internal hidden reasoning, examining trade-offs like context-switching latency and unified token representations.
Functional token unified reasoning
members_of
A single discrete word bridging agentic and latent visual reasoning in the ATLAS framework

Frameworks (1)

framework

ATLAS Framework
cites
A framework where a single discrete word (functional token) serves both agentic operation and latent visual reasoning, requiring no visual supervision.

Concepts (1)

concept

Functional Token
cites
A discrete token in the vocabulary that represents a visual operation (e.g., <|Line|>, <|Shape|>, <|Text|>), generated via next-token prediction within autoregressive sequences.
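A minimal sketch of how functional tokens could be picked out of a decoded model output, assuming they appear inline in the generated text. The three token names are the examples given in the entry above. The helper `split_functional_tokens` and the sample string are hypothetical illustrations, not part of ATLAS.

```python
import re

# Token names taken from the examples in the entry above; the real
# vocabulary may contain others (assumption).
FUNCTIONAL_TOKENS = ("<|Line|>", "<|Shape|>", "<|Text|>")

# Escape each token so that "|" is matched literally, then join as alternatives.
_PATTERN = re.compile("|".join(re.escape(t) for t in FUNCTIONAL_TOKENS))


def split_functional_tokens(decoded: str):
    """Split decoded text into ("text", ...) and ("op", ...) segments.

    Hypothetical helper: it only locates functional tokens in an
    already-generated sequence and does not run any visual operation.
    """
    segments = []
    pos = 0
    for match in _PATTERN.finditer(decoded):
        if match.start() > pos:
            segments.append(("text", decoded[pos:match.start()]))
        segments.append(("op", match.group()))
        pos = match.end()
    if pos < len(decoded):
        segments.append(("text", decoded[pos:]))
    return segments


# Made-up sample output, for illustration only.
sample = "Draw <|Line|> then <|Text|>"
segments = split_functional_tokens(sample)
```

Because each operation is an ordinary vocabulary item, a plain scan over the decoded sequence is enough to recover where operations occur; no separate tool-call format needs to be parsed.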

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

One word is enough for both agentic operation and latent reasoning unit · concept · 0.898
Core idea of ATLAS: a single discrete token serves dual purpose of operation specification and latent reasoning.
ATLAS: Agentic or Latent Visual Reasoning · framework · 0.807
Core framework proposing discrete functional tokens as a unified solution for visual reasoning in VLMs, bridging agentic and latent approaches.
Agentic Visual Reasoning · concept · 0.799
Paradigm where VLM acts as controller generating code or tool calls to external modules for visual operations, incurring context-switching latency.
ATLAS combines the strengths of agentic and latent reasoning by using a single discrete word (functional token) that serves both roles. · claim · 0.798
Core claim of the paper: a unified token bridges the gaps.
Recent alternatives include agentic reasoning through code or tool calls, and latent reasoning with learnable hidden embeddings. · claim · 0.795
Statement of existing alternatives to direct generation.
latent reasoning · concept · 0.778
Reasoning approach using learnable hidden embeddings.
Visual reasoning, often interleaved with intermediate visual states, has emerged as a promising direction in the field. · claim · 0.772
Authors' interpretive assertion on the direction of the field.
Verbal reports, homology, and phylogenetic provenance are insufficient to determine sentience in unconventional agents. · claim · 0.767
Core claim: Turing test and brain homology fail for synthetic, AI, and radically non-human agents; new frameworks required.
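
The selection rule for the list above (cosine ≥ 0.65, no typed edge) could be sketched as follows. This is an illustration under assumptions, not the page's actual implementation: the embedding vectors, entity ids, edge list, and the function name `similarity_candidates` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_candidates(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that share
    no typed edge with the target, highest score first."""
    target = embeddings[target_id]
    # Entities already connected to the target by any typed edge.
    linked = {b if a == target_id else a
              for a, b in typed_edges if target_id in (a, b)}
    scored = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            scored.append((entity_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, for illustration only.
embeddings = {"t": [1.0, 0.0], "a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
typed_edges = [("t", "a")]
candidates = similarity_candidates("t", embeddings, typed_edges)
```

In the toy data, "a" is skipped because it already has a typed edge to the target, and "b" falls below the threshold, so only "c" remains. A filter like this also lets an unrelated entity through if its embedding happens to lie nearby, which is why the page treats the results as candidates rather than confirmed relations.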