Bits-Per-Byte Language Modeling Score

Language model performance metric used in cross-modal alignment experiments to rank LLM competence

Neighborhood — ranked by edge-count

paper

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Bits-Per-Byte Metric · concept · 0.815
Normalized cross-entropy metric used as a language model performance measure on OpenWebText
Bias in language models · concept · 0.726
Features related to gender, racial, ethnic biases, slurs, and hate speech.
Judge Model Scoring · method · 0.722
Claude 4.5 Haiku used to segment responses into attempts and score each attempt 0-100 for relevance
Language Models · concept · 0.717
Primary substrate for manifold steering experiments; demonstrates method on reasoning and in-context tasks.
Language Model · concept · 0.716
Primary test domain for manifold steering, including reasoning and ICL tasks
Autoregressive Language Modeling · concept · 0.711
Training objective interpretable as optimizing a diverse set of tasks; thus subject to multitask scaling convergence pressures
Tests of performance on specific tasks, including language modeling, are insufficient for determining consciousness status · claim · 0.706
Systems directly optimized for output can produce it without the prerequisite processes for conscious experience; simplest explanation for LLM consciousness reports is pattern matching
Discovering Language Model Behaviors with Model-Written Evaluations (Perez et al. 2022) · concept · 0.704
Prior work studying sycophancy and desire not to be shut down in RLHF-trained models