Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs

Kunfeng Chen, Qihuang Zhong, Juhua Liu, Bo Du, Dacheng Tao

arXiv:2603.11495v1 Announce Type: new Abstract: Tool-calling empowers Large Language Models (LLMs) to interact with external environments. However, current methods often struggle to handle massive and noisy candidate tools in long-context tool-calling tasks, limiting their real-world application. To this end, we propose Tool-DC, a Divide-and-Conquer framework for boosting tool-calling performance of LLMs. The core of Tool-DC is to reduce the reasoning difficulty and make full use of self-reflection ability of LLMs via a "Try-Check-Retry" paradigm. Specifically, Tool-DC involves two variants: 1) the training-free Tool-DC (TF), which is plug-and-play and flexible; 2) the training-based Tool-DC (TB), which is more inference-efficient. Extensive experiments show that both Tool-DC methods outperform their counterparts by a clear margin. Tool-DC (TF) brings up to +25.10% average gains against the baseline on BFCL and ACEBench benchmarks, while Tool-DC (TB) enables Qwen2.5-7B to achieve comparable or even better performance than proprietary LLMs, e.g., OpenAI o3 and Claude-Haiku-4.5.

Executive Summary

The article 'Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs' proposes Tool-DC, a framework for improving the performance of Large Language Models (LLMs) on long-context tool-calling tasks, where models must select among massive, noisy candidate tools. The framework employs a 'Try-Check-Retry' paradigm that reduces reasoning difficulty and leverages the self-reflection ability of LLMs. Two variants are presented: a training-free version, Tool-DC (TF), which is plug-and-play, and a training-based version, Tool-DC (TB), which is more inference-efficient. Experiments on the BFCL and ACEBench benchmarks show clear improvements over baselines: Tool-DC (TF) achieves up to +25.10% average gains, while Tool-DC (TB) enables Qwen2.5-7B to match or exceed proprietary LLMs such as OpenAI o3 and Claude-Haiku-4.5. This research could broaden the practical, real-world applicability of LLM tool-calling.

Key Points

  • Tool-DC is a novel framework for boosting LLM performance in long-context tool-calling tasks
  • The framework employs a 'Try-Check-Retry' paradigm to reduce reasoning difficulty and leverage self-reflection
  • Two variants of Tool-DC are presented: training-free (TF) and training-based (TB)
  • Experimental results demonstrate significant improvements over baseline models
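The 'Try-Check-Retry' loop over a divided tool catalogue can be illustrated with a minimal sketch. Note that the paper's actual Tool-DC algorithm is not detailed in this summary; the chunking strategy and the `select_tool` / `check_call` callables below are illustrative assumptions, standing in for an LLM's tool-selection attempt and its self-reflection check.

```python
def chunk(tools, size):
    """Divide a massive, noisy candidate-tool list into small sub-lists
    (the 'divide' step), so each selection attempt reasons over few tools."""
    return [tools[i:i + size] for i in range(0, len(tools), size)]


def try_check_retry(query, tools, select_tool, check_call,
                    chunk_size=4, max_retries=2):
    """Hypothetical Try-Check-Retry loop ('conquer' step):
    for each chunk, try a tool selection, check it via self-reflection,
    and retry within the chunk on failure before moving on."""
    for sub_tools in chunk(tools, chunk_size):
        for _ in range(max_retries + 1):
            candidate = select_tool(query, sub_tools)   # Try
            if candidate is None:
                break                                   # nothing plausible here
            if check_call(query, candidate):            # Check
                return candidate
            # otherwise Retry within the same chunk
    return None
```

With toy keyword-matching stand-ins for the model calls, a query like "weather in Paris" against a catalogue containing `get_weather` resolves to that tool, while an unmatched query falls through every chunk and returns `None`.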

Merits

Strength

By dividing massive, noisy tool catalogues into smaller sub-problems, the framework reduces reasoning difficulty, and its check-and-retry loop exploits the self-reflection ability of LLMs rather than relying on a single selection attempt.

Demerits

Limitation

The framework's reliance on the quality of pre-trained LLMs may limit its generalizability to other tasks and domains.

Expert Commentary

The article presents a significant contribution to LLM research, particularly for long-context tool-calling tasks. The proposed framework, Tool-DC, is well designed and effectively leverages the self-reflection abilities of LLMs to improve performance, and the experimental results on BFCL and ACEBench are convincing. However, the framework's reliance on the quality of the underlying pre-trained LLMs may limit its generalizability to other tasks and domains; future research should address this limitation and explore the framework's applicability to other areas of LLM research.

Recommendations

  • Further research should focus on addressing the framework's reliance on pre-trained LLMs and exploring its applicability to other tasks and domains.
  • The proposed framework should be evaluated on a broader range of benchmarks and tasks to demonstrate its generalizability and robustness.
