Two-Stage Optimizer-Aware Online Data Selection for Large Language Models

arXiv:2604.00001v1 Announce Type: cross Abstract: Gradient-based data selection offers a principled framework for estimating sample utility in large language model (LLM) fine-tuning, but existing methods are mostly designed for offline settings. They are therefore less suited to online fine-tuning, where data arrives sequentially, sample utility is step-dependent, and the effective update geometry is shaped by adaptive optimizers. We propose an optimizer-aware framework for gradient-based online data selection and reweighting in LLM fine-tuning. Our key idea is to view online selection not as static sample ranking, but as shaping the next target-oriented update under the optimizer state. We formulate this as an optimizer-aware update-matching problem, establish its connection to second-order target utility, and show why subset-level construction must account for interactions and redundancy among selected samples. Based on this view, we develop a two-stage Filter-then-Weight algorithm that first filters geometrically useful candidates and then optimizes their coefficients. To make the framework practical for LLMs, we introduce a factorized outer-product gradient representation and optimized matrix computations for long-context data. Experiments show that our method consistently improves convergence and downstream performance over existing online data selection baselines under the same data budget.

Executive Summary

The article proposes an optimizer-aware framework for gradient-based online data selection and reweighting in large language model (LLM) fine-tuning. The authors introduce a two-stage Filter-then-Weight algorithm to select and optimize data samples under adaptive optimizers. Experiments demonstrate improved convergence and downstream performance over existing online data selection baselines. The framework's key feature is its ability to account for interactions and redundancy among selected samples, making it well-suited for online fine-tuning. However, the algorithm assumes access to the optimizer's state, which may not be feasible in all settings.

Key Points

  • The authors propose an optimizer-aware framework for online data selection in LLM fine-tuning.
  • The framework uses a two-stage Filter-then-Weight algorithm that first filters geometrically useful candidates and then optimizes their mixing coefficients.
  • The algorithm accounts for interactions and redundancy among selected samples.
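The paper's exact algorithm is not reproduced in this digest; the following is a minimal sketch of the two-stage idea under assumed simplifications (an Adam-style diagonal preconditioner standing in for "the optimizer state", and hypothetical names throughout):

```python
import numpy as np

def filter_then_weight(cand_grads, target_grad, v, k=4, eps=1e-8):
    """Sketch of one Filter-then-Weight selection step.

    cand_grads:  (n, d) per-sample gradients of the candidate batch
    target_grad: (d,)   gradient on the target objective
    v:           (d,)   Adam second-moment estimate (optimizer state)
    Returns the selected indices and their mixing weights.
    """
    # Optimizer-aware geometry: rescale gradients the way Adam rescales
    # updates, so utility is measured in the effective update space
    # rather than in raw gradient space.
    precond = 1.0 / (np.sqrt(v) + eps)
    G = cand_grads * precond          # (n, d) preconditioned gradients
    t = target_grad * precond         # (d,)   preconditioned target

    # Stage 1 (Filter): keep the k candidates whose preconditioned
    # gradients align best with the target direction.
    scores = G @ t
    idx = np.argsort(scores)[-k:]

    # Stage 2 (Weight): choose coefficients w so the weighted update
    # matches the target update in a least-squares sense; solving
    # jointly over the subset couples the samples and penalizes
    # redundancy among them.
    Gs = G[idx]                                   # (k, d)
    w, *_ = np.linalg.lstsq(Gs.T, t, rcond=None)  # solve Gs^T w ≈ t
    return idx, w
```

Because Stage 2 solves for all coefficients jointly, two near-duplicate samples share weight instead of both being counted at full value, which is the subset-level redundancy effect the abstract emphasizes.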

Merits

Strength in Mathematical Formulation

The authors provide a rigorous mathematical formulation of the optimizer-aware update-matching problem, which is a key contribution of the article.
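The article does not reproduce the formulation itself, but an update-matching objective of the kind the abstract describes can be sketched as follows (notation assumed here, not taken from the paper):

```latex
% Hedged sketch of an optimizer-aware update-matching objective.
% P_t: preconditioner induced by the optimizer state at step t,
%      e.g. P_t = \mathrm{diag}\!\big(1/(\sqrt{v_t}+\epsilon)\big) for Adam;
% g_i: per-sample gradient; g^{\mathrm{tgt}}_t: target-task gradient;
% S: the selected subset; w_i: per-sample mixing weights.
\min_{w \ge 0}\;
\Big\| \sum_{i \in S} w_i \, P_t \, g_i \;-\; P_t \, g^{\mathrm{tgt}}_t \Big\|_2^2
```

The preconditioner is what makes the problem optimizer-aware: the same set of gradients can match the target update well or poorly depending on the current optimizer state.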

Improved Convergence and Downstream Performance

The experiments show consistent improvements in convergence and downstream performance over existing online data selection baselines under the same data budget, making the framework a promising approach for LLM fine-tuning.

Demerits

Assumption of Access to Optimizer's State

The algorithm assumes access to the optimizer's state, which may not be feasible in all settings, limiting its practical applicability.

Computational Complexity

The factorized outer-product gradient representation and the optimized matrix computations are introduced to make the method tractable, but they still add overhead on top of standard fine-tuning, which could be a challenge for large-scale LLM runs, especially with long-context data.
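The paper's exact factorization is not given in this digest, but the standard identity such representations rely on is easy to illustrate: for a linear layer, the per-sample weight gradient is an outer product, so gradient inner products collapse to two small dot products (a minimal sketch, with hypothetical dimensions):

```python
import numpy as np

# For a linear layer y = W x, the per-sample weight gradient is the
# outer product (dL/dy) x^T. Storing the two factors instead of the
# full d_out x d_in matrix cuts per-sample memory from d_out*d_in to
# d_out + d_in, and gradient inner products reduce via the trace
# identity  <a x^T, b z^T>_F = (a . b)(x . z).
rng = np.random.default_rng(0)
d_out, d_in = 64, 128
a, x = rng.normal(size=d_out), rng.normal(size=d_in)  # factors, sample i
b, z = rng.normal(size=d_out), rng.normal(size=d_in)  # factors, sample j

dense = np.sum(np.outer(a, x) * np.outer(b, z))  # O(d_out * d_in)
factored = (a @ b) * (x @ z)                     # O(d_out + d_in)
assert np.isclose(dense, factored)
```

Inner products like these are exactly what both stages of a gradient-matching method consume, which is why a factorized representation can make the bookkeeping affordable at LLM scale.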

Expert Commentary

The article makes a significant contribution to the field of LLM fine-tuning by proposing an optimizer-aware framework for online data selection and reweighting. The two-stage Filter-then-Weight algorithm is the key innovation: it treats selection as shaping the next target-oriented update under the optimizer state rather than as static sample ranking, and the experimental results support its effectiveness. The main open questions are the assumption of access to the optimizer's state and the computational overhead of the gradient machinery; further research is needed to address these limitations and to fully explore the framework's potential.

Recommendations

  • Future research should focus on more efficient and scalable methods for the factorized outer-product gradient representation and the associated matrix computations.
  • The proposed framework should be evaluated on a broader range of LLM architectures and tasks to demonstrate its generalizability and effectiveness.

Sources

Original: arXiv - cs.AI