artifact
active
artifact:safety-research-assistant-axis-github-repository
safety-research/assistant-axis GitHub repository
Code and full transcripts of case studies released alongside the paper
Neighborhood — ranked by edge-count
Papers (1)
paper
Frameworks (1)
framework
- Assistant Axis (about): Contrast vector between the mean default Assistant activation and the mean of all fully role-playing role vectors; main contribution of the paper
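
The definition of the Assistant Axis above can be illustrated with a minimal sketch. This is not code from the repository: the function names (`mean_vector`, `contrast_axis`, `project`) are hypothetical, and activations are assumed to be plain lists of floats of equal length. The sign convention (Assistant mean minus role mean) is also an assumption, since the entry only says the vector is a contrast between the two means.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    if not rows:
        raise ValueError("need at least one vector")
    dim = len(rows[0])
    if any(len(r) != dim for r in rows):
        raise ValueError("all vectors must have the same length")
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def contrast_axis(
    assistant_acts: Sequence[Sequence[float]],
    role_vectors: Sequence[Sequence[float]],
) -> Vector:
    """Contrast vector: mean Assistant activation minus mean role vector.

    Assumption: the direction points from the role-playing mean toward the
    default Assistant mean. The source does not state the sign.
    """
    assistant_mean = mean_vector(assistant_acts)
    role_mean = mean_vector(role_vectors)
    if len(assistant_mean) != len(role_mean):
        raise ValueError("Assistant and role vectors must share a dimension")
    return [a - r for a, r in zip(assistant_mean, role_mean)]


def project(activation: Sequence[float], axis: Sequence[float]) -> float:
    """Scalar projection of one activation onto the unit-normalized axis."""
    norm = math.sqrt(sum(x * x for x in axis))
    if norm == 0.0:
        raise ValueError("axis has zero norm")
    return sum(a * x for a, x in zip(activation, axis)) / norm


# Toy two-dimensional data, invented purely for illustration.
toy_assistant = [[2.0, 0.0], [4.0, 0.0]]
toy_roles = [[0.0, 1.0], [0.0, 3.0]]
toy_axis = contrast_axis(toy_assistant, toy_roles)
```

Under this convention, a larger value from `project` means an activation lies further toward the default Assistant end of the axis. How the repository actually extracts activations and role vectors is not described in this entry.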