dataset
active
dataset:the-pile
The Pile
Training corpus used for the 67M-parameter model tested with VPD.
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Empirical demonstration of VPD on a mid-scale transformer, establishing feasibility.