method
pending-review
method:grpo-group-relative-policy-optimization
GRPO (Group Relative Policy Optimization)
natural.md
Frontmatter (10 fields)
{
"doc": "natural.md",
"context": "RL algorithm used to train the activation verbalizer on open models; samples a group of candidate descriptions and applies policy optimization.",
"category": "ai",
"norm_label": "GRPO (Group Relative Policy Optimization)",
"graphify_id": "grpo",
"source_file": "natural.md",
"imported_from": "/Users/antonborzov/Documents/Research.nosync/papers/extract_typed_out/natural/graph.json",
"extracted_type": "method",
"source_location": "§Method: NLA training",
"graphify_file_type": "method"
}
Outgoing (0)
None.
Incoming (2)
Implemented by (2)
- Activation Verbalizer (AV) (method)
- Natural Language Autoencoders (NLA) (framework)
Mentions (2)
- papers-typed: natural.md
- papers-typed: natural.md
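
Illustration (not from natural.md)
The context field says GRPO samples a group of candidate descriptions and then applies policy optimization. A minimal sketch of the group-relative step, assuming the commonly described formulation in which each candidate's reward is normalized by the mean and standard deviation of its own group. The function name, the `eps` constant, and the reward values are hypothetical; this record does not state how rewards are computed for the activation verbalizer.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).

    `eps` is an assumed stabilizer so a group with identical rewards
    yields zero advantages instead of dividing by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate descriptions of one activation.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Candidates scoring above the group mean receive positive advantages and are reinforced; those below receive negative ones. No separate value network is needed, since the group itself serves as the baseline.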