TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models
arXiv:2604.00666v1 Announce Type: new Abstract: Diffusion language models (DLMs) offer a promising path toward low-latency generation through parallel decoding, but their practical efficiency depends heavily on the decoding trajectory. In practice, this advantage often fails to fully materialize because standard training does not provide explicit supervision over token reveal order, creating a train-inference mismatch that leads to suboptimal decoding behavior. We propose Trajectory-Ranked Instruction Masked Supervision (TRIMS), a simple trajectory-guided supervised fine-tuning framework that injects trajectory supervision into standard Masked Diffusion Language Model (MDLM) training with minimal overhead. Instead of relying on costly DLM-based distillation, TRIMS uses lightweight signals from an autoregressive teacher to guide a trajectory-aware masking strategy, encouraging the model to learn more effective decoding orders. Experiments on LLaDA and Dream across math and coding benchmarks show that TRIMS significantly improves the accuracy-parallelism trade-off over both standard MDLM training and train-free acceleration baselines, while achieving competitive performance with prior distillation-based approaches at substantially lower training cost. Further analysis shows that TRIMS leads to better decoding trajectories, validating the effectiveness of trajectory-guided supervision for DLMs.
Executive Summary
This article proposes Trajectory-Ranked Instruction Masked Supervision (TRIMS), a framework for fine-tuning diffusion language models (DLMs) to improve their decoding efficiency. TRIMS injects trajectory supervision into standard masked diffusion language model (MDLM) training, using lightweight per-token signals from an autoregressive teacher to guide a trajectory-aware masking strategy, so the model learns effective token reveal orders instead of leaving them unsupervised. Experiments on LLaDA and Dream show that TRIMS improves the accuracy-parallelism trade-off over both standard MDLM training and train-free acceleration baselines, while matching prior distillation-based approaches at substantially lower training cost. Further analysis confirms that TRIMS yields better decoding trajectories. Its scalability and generalizability, however, remain to be explored.
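The core idea, teacher-ranked trajectory-aware masking, can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes the autoregressive teacher scores each target token with a log-probability, and that positions the teacher is least confident about are masked first, so high-confidence tokens remain visible and are learned as early reveals.

```python
def trajectory_ranked_mask(teacher_logprobs, mask_ratio):
    """Illustrative sketch of trajectory-aware masking (assumed mechanism).

    teacher_logprobs: per-token log-probabilities assigned to the target
        response by an autoregressive teacher (one float per position).
    mask_ratio: fraction of positions to mask at this training step.
    Returns a boolean list, True where the token is masked.
    """
    seq_len = len(teacher_logprobs)
    n_mask = max(1, round(mask_ratio * seq_len))
    # Rank positions by teacher confidence, least confident first,
    # so the hardest tokens are hidden and predicted late in the trajectory.
    order = sorted(range(seq_len), key=lambda i: teacher_logprobs[i])
    masked = set(order[:n_mask])
    return [i in masked for i in range(seq_len)]
```

In an actual SFT loop, a mask like this would replace the uniformly random masking of standard MDLM training, biasing the loss toward trajectories the teacher finds easy to decode early.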
Key Points
- ▸ TRIMS injects trajectory supervision into standard MDLM training
- ▸ TRIMS leverages lightweight signals from an autoregressive teacher
- ▸ TRIMS improves the accuracy-parallelism trade-off and matches distillation-based approaches at lower training cost
Merits
Novel and Efficient Approach
TRIMS offers a novel and efficient approach to fine-tuning DLMs, leveraging trajectory supervision to improve decoding efficiency.
Enhanced Performance
TRIMS demonstrates improved accuracy-parallelism trade-off and competitive performance with prior distillation-based approaches at lower training costs.
Improved Decoding Trajectories
TRIMS leads to better decoding trajectories, validating the effectiveness of trajectory-guided supervision for DLMs.
Demerits
Open Questions
The scalability of TRIMS and its generalizability beyond the evaluated models and benchmarks remain to be explored.
Expert Commentary
The proposed TRIMS framework offers a promising direction for improving the efficiency of DLMs. By replacing costly DLM-based distillation with lightweight signals from an autoregressive teacher, TRIMS makes trajectory supervision practical within standard fine-tuning. The experimental results on math and coding benchmarks support its central claim: explicit supervision over token reveal order closes the train-inference mismatch and improves the accuracy-parallelism trade-off. Open questions remain around scalability to larger models and generalizability to other task domains, but the method's simplicity and demonstrated effectiveness make it a compelling direction for efficient language model decoding.
Recommendations
- ✓ Future research should focus on scaling up TRIMS to larger datasets and exploring its generalizability to other types of language models.
- ✓ Investigating TRIMS in latency-sensitive real-world applications, such as real-time translation or text summarization, would clarify its practical impact.
Sources
Original: arXiv - cs.CL