Large Language Models (LLMs) have demonstrated significant capabilities in answering questions using techniques such
as Chain of Thought (CoT) and Retrieval-Augmented Generation (RAG). CoT enables step-by-step reasoning to improve
accuracy, while RAG supplements LLMs with relevant external information. Retrieval-Augmented Thoughts (RAT) combines
CoT and RAG to provide a more robust factual foundation and coherence in reasoning chains. However, RAT is limited
in its ability to handle uncertainty and lacks a replanning mechanism, often resulting in unnecessary retrievals, inefficiency, and globally inconsistent reasoning. To address these limitations, we introduce iRAT, a novel
reasoning framework that enhances RAT through retrieval control and replanning. iRAT dynamically evaluates
uncertainty in initial responses, employs controlled and filtered retrievals to obtain only the most relevant
context, revises thoughts to align with new content, and uses replanning to correct previous thoughts. Evaluations demonstrate that iRAT outperforms RAT on the HumanEval, MBPP, and GSM8K benchmarks while substantially reducing the number of retrievals. The source code is available at
github.com/prane-eth/iRAT.
Keywords: large language models, artificial intelligence, chain-of-thought reasoning, uncertainty-aware language models, reasoning in LLMs, context-aware reasoning, LLM reasoning frameworks
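The abstract outlines a loop of uncertainty estimation, controlled retrieval, thought revision, and replanning. The sketch below illustrates one plausible shape of such a loop; it is not the authors' implementation, and every name (llm_generate, estimate_uncertainty, retrieve_and_filter, revise_thought, needs_replan) and threshold is a hypothetical placeholder assumed for illustration.

```python
# Hypothetical sketch of an iRAT-style control loop, inferred from the
# abstract only. All helper functions and thresholds are placeholders,
# not the actual iRAT API (see github.com/prane-eth/iRAT for the real code).

from dataclasses import dataclass, field


@dataclass
class ReasoningState:
    question: str
    thoughts: list[str] = field(default_factory=list)
    retrieval_count: int = 0


def llm_generate(prompt: str) -> str:
    """Placeholder for an LLM call that drafts the next thought."""
    return f"draft thought for: {prompt[:40]}..."


def estimate_uncertainty(thought: str) -> float:
    """Placeholder uncertainty score in [0, 1]; a real system might use
    token-level entropy or self-consistency disagreement."""
    return 0.5


def retrieve_and_filter(query: str, top_k: int = 3) -> list[str]:
    """Placeholder for controlled retrieval: fetch candidates, then keep
    only passages judged relevant so irrelevant context is dropped."""
    return [f"filtered passage {i} for '{query[:30]}'" for i in range(top_k)]


def revise_thought(thought: str, passages: list[str]) -> str:
    """Placeholder: rewrite the thought to agree with retrieved facts."""
    return thought + " [revised against retrieved context]"


def needs_replan(thoughts: list[str]) -> bool:
    """Placeholder global-consistency check over the chain so far."""
    return False


def irat_loop(question: str, max_steps: int = 5,
              uncertainty_threshold: float = 0.6) -> ReasoningState:
    state = ReasoningState(question)
    for _ in range(max_steps):
        thought = llm_generate(question + "\n".join(state.thoughts))
        # Retrieval control: retrieve only when the draft is uncertain,
        # unlike plain RAT's unconditional per-step retrieval.
        if estimate_uncertainty(thought) > uncertainty_threshold:
            passages = retrieve_and_filter(thought)
            state.retrieval_count += 1
            thought = revise_thought(thought, passages)
        state.thoughts.append(thought)
        # Replanning: if the chain has become globally inconsistent,
        # discard later thoughts and regenerate from an earlier point.
        if needs_replan(state.thoughts):
            state.thoughts = state.thoughts[:1]
    return state
```

Gating retrieval on an uncertainty score is what would cut retrieval counts relative to RAT, which retrieves at every step regardless of confidence.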