Structure-Aware Set Transformers: Temporal and Variable-Type Attention Biases for Asynchronous Clinical Time Series

Joohyung Lee, Kwanhyung Lee, Changhun Kim, Eunho Yang

arXiv:2603.06605v1 Announce Type: new Abstract: Electronic health records (EHR) are irregular, asynchronous multivariate time series. As time-series foundation models increasingly tokenize events rather than discretizing time, the input layout becomes a key design choice. Grids expose time$\times$variable structure but require imputation or missingness masks, risking error or sampling-policy shortcuts. Point-set tokenization avoids discretization but loses within-variable trajectories and time-local cross-variable context (Fig.1). We restore these priors in STructure-AwaRe (STAR) Set Transformer by adding parameter-efficient soft attention biases: a temporal locality penalty $-|\Delta t|/\tau$ with learnable timescales and a variable-type affinity $B_{s_i,s_j}$ from a learned feature-compatibility matrix. We benchmark 10 depth-wise fusion schedules (Fig.2). On three ICU prediction tasks, STAR-Set achieves AUC/APR of 0.7158/0.0026 (CPR), 0.9164/0.2033 (mortality), and 0.8373/0.1258 (vasopressor use), outperforming regular-grid, event-time grid, and prior set baselines. Learned $\tau$ and $B$ provide interpretable summaries of temporal context and variable interactions, offering a practical plug-in for context-informed time-series models.
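The abstract's two biases are simple additive terms on the attention logits: a temporal penalty $-|\Delta t|/\tau$ and a type-affinity lookup $B_{s_i,s_j}$. A minimal single-head NumPy sketch of that idea follows; the function name and the scalar (rather than per-head) `tau` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def star_biased_attention(q, k, v, times, var_types, tau, B):
    """Single-head attention with STAR-style soft biases (illustrative sketch).

    q, k, v   : (n, d) query/key/value matrices for n event tokens
    times     : (n,) event timestamps
    var_types : (n,) integer variable-type ids
    tau       : timescale for the temporal locality penalty (scalar here)
    B         : (num_types, num_types) feature-compatibility matrix
    """
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)                    # standard content term
    dt = np.abs(times[:, None] - times[None, :])     # pairwise |t_i - t_j|
    logits = logits - dt / tau                       # temporal locality penalty
    logits = logits + B[var_types[:, None], var_types[None, :]]  # type affinity
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)            # row-wise softmax
    return w @ v, w
```

With identical content (zero q and k) and a flat `B`, attention then decays purely with temporal distance, which is the locality prior the paper restores to set-based tokenization.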

Executive Summary

This article introduces the STructure-AwaRe (STAR) Set Transformer, a model designed for irregular, asynchronous clinical time series. It augments set-based attention with two parameter-efficient soft biases: a temporal locality penalty with learnable timescales and a variable-type affinity from a learned feature-compatibility matrix. These biases yield more accurate predictions and interpretable summaries of temporal context and variable interactions. STAR-Set outperforms regular-grid, event-time grid, and prior set baselines on three ICU prediction tasks, and the learned timescales and affinity matrix offer insight into the underlying data structure, making it a promising plug-in for context-informed time-series models.

Key Points

  • Introduces the STAR Set Transformer for asynchronous clinical time series data
  • Adds temporal and variable-type attention biases that restore structural priors lost by point-set tokenization
  • Outperforms regular-grid, event-time grid, and prior set baselines on three ICU prediction tasks

Merits

Improved Prediction Accuracy

The STAR model achieves higher AUC/APR scores than regular-grid, event-time grid, and prior set baselines across all three ICU prediction tasks

Interpretable Summaries

The model provides interpretable summaries of temporal context and variable interactions, offering valuable insights into the underlying data structure

Demerits

Computational Complexity

Although the biases themselves are parameter-efficient, computing pairwise temporal penalties and type affinities in every attention layer adds overhead, which may limit scalability to very long event sequences

Dependence on Learnable Parameters

The model's performance depends on learning suitable timescales and feature-compatibility matrices, which may require careful initialization, tuning, and regularization

Expert Commentary

The STAR model represents a significant advancement in the field of time-series analysis, particularly in the context of asynchronous clinical data. The incorporation of temporal and variable-type attention biases enables the model to capture complex patterns and relationships in the data, leading to improved prediction accuracy and interpretable summaries. However, further research is needed to address the potential limitations of the model, such as computational complexity and dependence on learnable parameters. Nevertheless, the STAR model has the potential to make a substantial impact in healthcare applications, and its development contributes to the growing field of explainable AI.

Recommendations

  • Further evaluation of the STAR model on diverse healthcare datasets to assess its generalizability and robustness
  • Investigation of techniques to reduce computational complexity and improve the scalability of the model, such as knowledge distillation or pruning
