framework
pending-review
framework:natural-language-autoencoders-nla
Natural Language Autoencoders (NLA)
natural.md
Frontmatter (10 fields)
{
"doc": "natural.md",
"context": "An unsupervised method for generating natural language explanations of LLM activations through a verbalizer-reconstructor pair trained jointly with RL.",
"category": "ai",
"norm_label": "Natural Language Autoencoders (NLA)",
"graphify_id": "framework_nla",
"source_file": "natural.md",
"imported_from": "/Users/antonborzov/Documents/Research.nosync/papers/extract_typed_out/natural/graph.json",
"extracted_type": "framework",
"source_location": "§Introduction",
"graphify_file_type": "framework"
}
Outgoing (7)
about (1)
- Unverbalized Evaluation Awareness(concept)
Associated with (2)
- Confabulation(concept)
- Sparse Autoencoders (SAE)(method)
Implements (4)
Incoming (1)
introduces (1)
Mentions (1)
- papers-typed
natural.md