LLMs Get Lost In Multi-Turn Conversation

Best AI Papers Explained - A podcast by Enoch H. Kang

This paper examines the performance of Large Language Models (LLMs) in multi-turn conversations compared to single-turn interactions. The authors developed a method to create "sharded" instructions from fully specified tasks, allowing for controlled simulation of underspecified, multi-turn exchanges. They found that LLMs exhibit significantly lower performance and drastically increased unreliability in multi-turn settings, attributing this "lost in conversation" phenomenon primarily to poor context management and premature, incorrect assumptions. The study concludes by urging LLM builders to improve multi-turn reliability alongside single-turn aptitude, as current techniques such as lowering temperature or using agent-like frameworks offer only limited improvements.