framework
pending-review
framework:reinforce
REINFORCE
nguyen-2025-sfr-deepresearch.md
Frontmatter (10 fields)
{
"doc": "nguyen-2025-sfr-deepresearch.md",
"context": "Classical RL algorithm adapted by the paper with modifications including clipped-surrogate losses and length-normalized advantages for agentic training.",
"category": "ai",
"norm_label": "REINFORCE",
"graphify_id": "reinforce_algorithm",
"source_file": "nguyen-2025-sfr-deepresearch.md",
"imported_from": "/Users/antonborzov/Documents/Research.nosync/papers/extract_typed_out/nguyen-2025-sfr-deepresearch/graph.json",
"extracted_type": "framework",
"source_location": "§3.3",
"graphify_file_type": "framework"
}
Outgoing (0)
None.
Incoming (3)
Extended by (2)
- Length-Normalized Advantage Formulation (method)
- Trajectory Filtering (method)
Implemented by (1)
- SFR-DeepResearch (framework)
Mentions (1)
- papers-typed: nguyen-2025-sfr-deepresearch.md