Think$^{2}$: Grounded Metacognitive Reasoning in Large Language Models

Abraham Paul Elenjical; Vivek Hruday Kavuri; Vasudeva Varma

February 21, 2026

arXiv preprint

Abstract

Large Language Models (LLMs) demonstrate strong reasoning performance, yet their ability to reliably monitor, diagnose, and correct their own errors remains limited. We introduce a psychologically grounded metacognitive framework that operationalizes Ann Brown's regulatory cycle (Planning, Monitoring, and Evaluation) as a structured prompting architecture, and study its integration within a lightweight dual-process MetaController for adaptive effort allocation. Across diverse reasoning and diagnostic benchmarks (GSM8K, CRUXEval, MBPP, AIME, CorrectBench, and TruthfulQA) using Llama-3 and Qwen-3 (8B), explicit regulatory structuring substantially improves error diagnosis and yields a threefold increase in successful self-correction. Blinded human evaluations over 580 query pairs show an 84% aggregate preference for trustworthiness and metacognitive self-awareness over standard and Chain-of-Thought baselines. Grounding LLM reasoning in established cognitive theory offers a principled path toward more transparent and diagnostically robust AI systems.
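
As a rough illustration of the idea summarized above, the following minimal sketch shows how a Planning, Monitoring, and Evaluation cycle might be laid out as a structured prompt, with a simple controller deciding when to spend the extra effort. It is not the authors' implementation: the names `regulated_prompt` and `meta_controller`, the stage instructions, the `confidence` callable, and the 0.8 threshold are all hypothetical choices made for this example.

```python
from typing import Callable

# The three regulatory stages named in the abstract (Brown's cycle).
STAGES = ("Planning", "Monitoring", "Evaluation")

# Hypothetical stage instructions; the paper's actual prompt wording is not given here.
INSTRUCTIONS = {
    "Planning": "State the goal and outline the steps before solving.",
    "Monitoring": "Work through the steps, checking each one as you go.",
    "Evaluation": "Review the final answer and correct any error found.",
}


def regulated_prompt(question: str) -> str:
    """Build a prompt that asks for the three stages in a fixed order."""
    lines = [f"Question: {question}", ""]
    for stage in STAGES:
        lines.append(f"[{stage}] {INSTRUCTIONS[stage]}")
    return "\n".join(lines)


def meta_controller(
    question: str,
    model: Callable[[str], str],
    confidence: Callable[[str], float],
    threshold: float = 0.8,  # assumed cutoff, not taken from the paper
) -> str:
    """Dual-process routing: answer directly when confident, else regulate."""
    if confidence(question) >= threshold:
        return model(question)  # fast path: plain query
    return model(regulated_prompt(question))  # slow path: structured cycle


if __name__ == "__main__":
    # Stub model that echoes its prompt, so the routing is visible.
    print(meta_controller("What is 17 * 24?", lambda p: p, lambda q: 0.3))
```

Here `model` and `confidence` are placeholders for whatever LLM call and difficulty estimate a real system would use; the sketch only shows the control flow, in which the structured cycle is applied selectively rather than on every query.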

Keywords

self-awareness, LLM, monitoring, evaluation, reasoning
