One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

Amal Akli, Maxime Cordy, Mike Papadakis, Yves Le Traon

arXiv:2603.09978v1

Abstract: Large language models have recently surpassed specialized systems on code generation, yet their effectiveness on other code-analysis tasks remains less clear. At the same time, multi-task learning offers a way to unify diverse objectives within a single model, but fully fine-tuning LLMs across tasks is computationally prohibitive. Parameter-efficient fine-tuning mitigates this cost by updating only a small fraction of weights. Although PEFT has proven effective in single-task settings, its potential for multi-task learning has not yet been systematically explored. We present the first comprehensive evaluation of multi-task PEFT for code analysis, comparing several methods across diverse tasks and model architectures. Our experiments show that a single PEFT module shared across tasks can match, and in some cases surpass, full multi-task fine-tuning, confirming that the benefits of PEFT extend beyond isolated tasks. When comparing single-task and multi-task setups, we find that multi-task PEFT achieves a favorable performance-efficiency trade-off: it delivers accuracy close to single-task fine-tuning while reducing storage requirements, cutting the number of trainable parameters by a factor of the task count, and lowering computation costs by as much as 85%. At the same time, multi-task gains remain sensitive to task grouping. Through task-pairing experiments, we identify key factors shaping outcomes: task stability, model architecture, task complementarity, asymmetry, and dataset quality determine the success of co-fine-tuning. Finally, we benchmark efficient multi-task PEFT against direct prompting of open-source general-purpose LLMs, including DeepSeek, Qwen, Mistral, CodeLlama, and StarCoder. Despite their strong performance in code generation, these models underperform on analysis tasks, where even a 1B-parameter model with multi-task PEFT achieves significantly better results.

Executive Summary

This paper presents the first systematic evaluation of parameter-efficient fine-tuning (PEFT) for multitask code analysis. The authors compare several PEFT methods across diverse tasks and model architectures, showing that a single PEFT module shared across tasks can match, and in some cases surpass, full multi-task fine-tuning. Multi-task PEFT achieves a favorable performance-efficiency trade-off: accuracy close to single-task fine-tuning, trainable parameters cut by a factor of the task count, and computation costs lowered by as much as 85%. The gains remain sensitive to task grouping, however; task stability, model architecture, task complementarity, asymmetry, and dataset quality determine the success of co-fine-tuning.

Key Points

  • Parameter-efficient fine-tuning (PEFT) is effective for multitask code analysis
  • A single PEFT module can match or surpass full multi-task fine-tuning
  • Multi-task PEFT achieves a favorable performance-efficiency trade-off, delivering accuracy close to single-task fine-tuning while cutting computation costs by as much as 85%
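
The shared-adapter idea behind multi-task PEFT can be sketched with a LoRA-style low-rank update. The following is a minimal NumPy illustration under assumed sizes (hidden dimension 1024, rank 8, a single linear layer), not the paper's implementation; the specific PEFT methods, ranks, and layers used in the study are not reproduced here.

```python
import numpy as np

# LoRA-style PEFT sketch: the frozen weight W is augmented with a
# low-rank product B @ A, and only A and B are trained.
rng = np.random.default_rng(0)
d, r = 1024, 8                           # hidden size and adapter rank (assumed)

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init

def forward(x):
    # Adapted layer: base output plus the low-rank correction.
    return x @ W.T + x @ (B @ A).T

full_params = W.size                     # weights a full fine-tune would update
peft_params = A.size + B.size            # weights PEFT updates instead
print(f"trainable fraction: {peft_params / full_params:.2%}")  # ~1.56%

# Multi-task savings: T per-task adapters need T * peft_params trainable
# parameters in total; one adapter shared across all T tasks needs only
# peft_params -- the factor-of-task-count cut reported in the abstract.
T = 5
print(T * peft_params, "->", peft_params)
```

Zero-initializing B means the adapted layer starts out identical to the frozen model, so training perturbs it only through the low-rank path; sharing one (A, B) pair across tasks is what yields the factor-of-task-count parameter saving.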

Merits

Improved Efficiency

PEFT reduces storage requirements and computation costs while maintaining accuracy

Unified Model

A single PEFT module can be shared across tasks, simplifying model management

Demerits

Task Grouping Sensitivity

The success of co-fine-tuning is sensitive to how tasks are grouped; task stability, complementarity, asymmetry, and dataset quality all shape the outcome, so task pairings must be chosen carefully

Limited Generalizability

The study's findings may not generalize to all code analysis tasks or model architectures

Expert Commentary

The findings have practical implications for deploying large language models in code analysis: a single compact model with a shared PEFT module can serve multiple analysis tasks at a fraction of the training cost of full fine-tuning, and the paper shows that even a 1B-parameter model fine-tuned this way outperforms direct prompting of much larger general-purpose LLMs on analysis tasks. However, the sensitivity of co-fine-tuning to task grouping means that task selection and pairing deserve as much attention as the choice of PEFT method. Further work is needed to characterize which task combinations cooperate and which interfere.

Recommendations

  • Further research should characterize task-grouping strategies, since multi-task gains depend on task stability, complementarity, asymmetry, and dataset quality
  • Developers should consider using PEFT in their model development pipelines to accelerate the creation of multitask code analysis models
