paper-buf

LLM safety
- Towards scalable oversight: Meta-evaluation of LLMs as evaluators via agent debate, the arXiv version, and code
- Transformer feed-forward layers build predictions by promoting concepts in the vocabulary space
- NOTES
LLM