claim
active
claim:larger-models-can-support-higher-dimensional-truth-cones-than-smaller-models
Larger models can support higher-dimensional truth cones than smaller models
Interpretation of ASR degradation patterns by model size across cone dimensions
Source paper
extracted_from · (2025) · Kevin Shengyang Yu · Vaidehi Bulusu · Oscar Yasunaga · Lau, Clayton +4
Neighborhood — ranked by edge-count
Findings (4)
finding
- Experiment 2 result showing large models can support high-dimensional truth cones
- Experiment 2 result showing large Gemma model supports high-dimensional truth cones
- Small Gemma model shows severe ASR degradation at higher cone dimensions
- Smaller models show non-monotonic and diminished ASR with increasing cone dimensionality
Claims (1)
claim
- Interpretation of weaker PCA separation and lower ASR in smaller models
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Interpretive synthesis of DIM and cone intervention successes
- Bigger models are more likely to converge to a shared representation than smaller models · hypothesis · 0.793 · Selective pressure toward convergence via model capacity
- Concept cone truth interventions would generalize to larger frontier models and multimodal settings · hypothesis · 0.772 · Key robustness question raised as future work
- The model appears to encode truth differently under passive versus active truth evaluation prompts. · claim · 0.765 · Key finding from Section 5 based on low cosine similarity between no-prompt and ask-correct probes.
- Implication of PRH for AI fairness and bias
- Features may not be strictly one-dimensional objects; higher-dimensional feature manifolds may exist in model representations · hypothesis · 0.761 · Extension of superposition hypothesis to account for continuous families of features
- Overarching conclusion summarizing the paper's contribution relative to prior universality claims.
- Establishes generalizability of the core difficulty-boundary finding across model families.