paper-buf

LLM safety
- OpenAI Model Spec by John Schulman
- Inference-time-compute: More faithful? A research note
- Iterative label refinement matters more than preference optimization under weak supervision (code)
- NOTES
LLM
- Enhancing automated interpretability with output-centric feature description
Others
- Emergent symbolic-like number variables in artificial neural networks
- NOTES