
Teaching Claude why
In this post, Anthropic discusses a few of the updates they've made to alignment training.
Reads I found worth sharing.

A new paper from the Center for AI Safety, an AI safety nonprofit, suggests that more is going on under the surface.