Gemma Needs Help: Investigating and Mitigating Emotional Instability in LLMs
arXiv:2603.10011v1. Abstract: Large language models can generate responses that resemble emotional distress, and this raises concerns around model reliability and safety. We …
Anna Soligo, Vladimir Mikulik, William Saunders