Mistral-7B

One of four LLMs selected for representation analysis; hidden dimension D = 4096.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
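
The filter described above (cosine ≥ 0.65, no typed edge) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the toy two-dimensional embeddings, and the representation of typed edges as a set of name pairs are all hypothetical and are not taken from the system that produced this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs similar to `target` that have no typed edge to it.

    `embeddings` maps entity name -> vector; `typed_edges` is a set of
    (name, name) pairs, treated as undirected here (an assumption).
    """
    hits = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked to the target by a typed relation.
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first, matching the ordering of the list on this page.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): "D" is close to "A" but already has a typed edge.
toy_embeddings = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.6],
    "C": [0.0, 1.0],
    "D": [1.0, 0.1],
}
toy_edges = {("A", "D")}
neighbors = untyped_neighbors("A", toy_embeddings, toy_edges)
```

In this toy setup only "B" passes: "C" falls below the threshold and "D" is excluded because it already has a typed edge to "A". Candidates returned this way would still need manual review to decide whether they warrant a new edge or are duplicates.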

Mistral AI · institute · 0.756
Developer of Mistral models, mentioned as 'horrible' but large enough for threshold effects.
Mistral-7B Latent SOO MSE reduced from 0.107 to 0.078 ± 0.001 after SOO fine-tuning · finding · 0.743
SOO fine-tuning reduced the MSE between self and other activations in Mistral-7B MLP layers
Mistral-7B MT-Bench score minimally changed from 7.26 to 7.3 ± 0.06 after SOO fine-tuning · finding · 0.742
SOO fine-tuning had negligible impact on Mistral-7B general capabilities
Mistral-7B Perspectives accuracy remains 100% after SOO fine-tuning · finding · 0.716
SOO fine-tuning did not collapse Mistral-7B self-other distinction needed for perspective-taking
Mistral-7B-Instruct-v0.2 deceptive response rate reduced from 73.6% to 17.27% ± 1.88% after SOO fine-tuning · finding · 0.706
Primary result showing SOO fine-tuning significantly reduces deception in Mistral-7B
Qwen2.5-VL-7B · concept · 0.705
Base vision-language model used to instantiate ATLAS.
Mixtral-8x7B · concept · 0.702
One of four LLMs selected; Mixture-of-Experts model; had substantial sample loss under IIT 4.0 due to PyPhi network initialization issues.
Mistral-7B on False Belief (IIT 4.0) is the sole case exhibiting statistically significant Φ differences between score categories under temporal permutation at the task level · finding · 0.690
Only Criterion 2 is satisfied for this single case at the task level (granularity without aggregation).