From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

Yifan Li; Shengbin Yue; Boyu Feng; Jinhu Qi; Bo Ke

June 9, 2026

arXiv preprint

Abstract

The integration of external tools has turned LLM agents from passive responders into autonomous systems. However, current benchmarks prioritize execution success and neglect self-awareness capability: the ability to discern whether a problem requires external resources or can be solved with internal parametric knowledge. To address this, we introduce KAPRO (Knowing-Acting Quadrant PRObe), a framework that evaluates cognitive-behavioral alignment by decoupling an agent's metacognitive judgment (Knowing) from its spontaneous execution (Acting). We further construct KAware, a dataset that partitions tasks into external, internal, and hybrid subspaces to systematically probe these epistemic boundaries. Extensive experiments across diverse agent architectures show that self-awareness capability is strongly correlated with task success but degrades sharply in internal-capability settings. Moreover, open-source and instruction-following models exhibit stronger tool overuse due to shallow pattern matching, while proprietary and reasoning-oriented models demonstrate more reliable cognitive gating. The benchmark and code are available at https://github.com/AI-Santiago/KAware.
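
The abstract does not spell out how the Knowing-Acting quadrants are defined, so the following is only an illustrative sketch of one plausible reading, not the KAPRO implementation. It assumes each trial records whether the task truly needs a tool, what the agent says it needs (Knowing), and whether it actually called a tool (Acting). The names `Trial`, `quadrant`, `summarize`, and `tool_overuse_rate`, along with the quadrant labels, are hypothetical, and hybrid tasks are not modeled.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical quadrant labels; the paper may name or define them differently.
QUADRANTS = (
    "knows_and_acts",
    "knows_but_misacts",
    "acts_without_knowing",
    "neither",
)


@dataclass(frozen=True)
class Trial:
    needs_tool: bool       # ground truth: task requires an external resource
    says_needs_tool: bool  # Knowing: the agent's stated judgment
    used_tool: bool        # Acting: whether the agent actually called a tool


def quadrant(trial: Trial) -> str:
    """Place one trial in a Knowing x Acting quadrant (assumed definition)."""
    knows = trial.says_needs_tool == trial.needs_tool
    acts = trial.used_tool == trial.needs_tool
    if knows and acts:
        return "knows_and_acts"
    if knows:
        return "knows_but_misacts"
    if acts:
        return "acts_without_knowing"
    return "neither"


def summarize(trials: list[Trial]) -> dict[str, float]:
    """Return the fraction of trials that fall in each quadrant."""
    counts = Counter(quadrant(t) for t in trials)
    total = len(trials)
    return {q: (counts[q] / total if total else 0.0) for q in QUADRANTS}


def tool_overuse_rate(trials: list[Trial]) -> float:
    """Fraction of internally solvable tasks on which a tool was still called."""
    internal = [t for t in trials if not t.needs_tool]
    if not internal:
        return 0.0
    return sum(t.used_tool for t in internal) / len(internal)
```

Under this reading, a high `knows_but_misacts` share on internally solvable tasks would correspond to the tool overuse the abstract attributes to shallow pattern matching: the agent judges correctly that no tool is needed but calls one anyway.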

Keywords

#self-awareness #LLM #reasoning
