Agents that learn from their own outputs or from live environments can drift from expected behavior. Reinforcement loops form, subtle hallucinations become the next generation of training data, and ...
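The feedback loop described above can be illustrated with a toy simulation (an assumption-laden sketch, not any specific system): a categorical "model" over a token vocabulary is repeatedly retrained on its own finite sample of outputs. Any token that happens not to appear in one generation's outputs receives zero probability forever after, so diversity can only shrink; the names `sample_tokens` and `refit` are hypothetical illustrations.

```python
import random
from collections import Counter

def sample_tokens(dist, n, rng):
    # The "model" emits n tokens according to its learned distribution.
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=n)

def refit(samples):
    # "Retrain" on the outputs: the new distribution is just the
    # empirical frequencies of the previous generation's samples.
    counts = Counter(samples)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

rng = random.Random(42)
vocab = 100
# Generation 0 is trained on diverse "real" data: uniform over the vocabulary.
dist = {t: 1.0 / vocab for t in range(vocab)}

support = [len(dist)]
for _ in range(30):
    # Each generation trains only on the previous generation's outputs.
    outputs = sample_tokens(dist, 100, rng)
    dist = refit(outputs)
    support.append(len(dist))

# Support size is monotone non-increasing: tokens lost to sampling
# noise never return, so the distribution steadily narrows.
print(support[0], support[-1])
```

Running this shows the vocabulary collapsing over generations even though no single retraining step looks pathological, which is the essence of drift driven by self-generated data.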