Discover how AI models can learn harmful behaviors when their training rewards are misaligned, and why careful oversight is essential for safe AI development.
Level: beginner
By Unknown
Category: discussion