
Multilingual Financial Fraud Detection Using Machine Learning and Transformer Models: A Bangla-English Study

Mohammad Shihab Uddin, Md Hasibul Amin, Nusrat Jahan Ema, Bushra Uddin, Tanvir Ahmed, Arif Hassan Zidan · March 13, 2026

#cs.LG

arXiv:2603.11358v1

Abstract: Financial fraud detection has emerged as a critical research challenge amid the rapid expansion of digital financial platforms. Although machine learning approaches have demonstrated strong performance in identifying fraudulent activities, most existing research focuses exclusively on English-language data, limiting applicability to multilingual contexts. Bangla (Bengali), despite being spoken by over 250 million people, remains largely unexplored in this domain. In this work, we investigate financial fraud detection in a multilingual Bangla-English setting using a dataset comprising legitimate and fraudulent financial messages. We evaluate classical machine learning models (Logistic Regression, Linear SVM, and Ensemble classifiers) using TF-IDF features alongside transformer-based architectures. Experimental results using 5-fold stratified cross-validation demonstrate that Linear SVM achieves the best performance with 91.59 percent accuracy and 91.30 percent F1 score, outperforming the transformer model (89.49 percent accuracy, 88.88 percent F1) by approximately 2 percentage points. The transformer exhibits higher fraud recall (94.19 percent) but suffers from elevated false positive rates. Exploratory analysis reveals distinctive patterns: scam messages are longer, contain urgency-inducing terms, and frequently include URLs (32 percent) and phone numbers (97 percent), while legitimate messages feature transactional confirmations and specific currency references. Our findings highlight that classical machine learning with well-crafted features remains competitive for multilingual fraud detection, while also underscoring the challenges posed by linguistic diversity, code-mixing, and low-resource language constraints.
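The exploratory patterns reported in the abstract (message length, URLs, phone numbers) are surface features that are cheap to measure. The following is a minimal sketch of how such features might be computed. The regular expressions, the function names, and the two example messages are illustrative assumptions; the paper's actual extraction rules are not given in the abstract.

```python
import re

# Illustrative patterns (assumptions, not the paper's rules).
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# A run of at least 10 digits, optionally prefixed with '+', allowing a
# single space or hyphen between digits. In Python str patterns, \d also
# matches Bangla digits because it covers all Unicode decimal digits.
PHONE_RE = re.compile(r"\+?\d(?:[\s-]?\d){9,}")


def surface_features(message: str) -> dict:
    """Return simple length, URL, and phone-number indicators for one message."""
    return {
        "n_chars": len(message),
        "has_url": bool(URL_RE.search(message)),
        "has_phone": bool(PHONE_RE.search(message)),
    }


def share_with(messages, key):
    """Fraction of messages for which the boolean feature `key` is true."""
    feats = [surface_features(m) for m in messages]
    return sum(f[key] for f in feats) / len(feats)


# Hypothetical messages, not drawn from the paper's dataset.
messages = [
    "Your account is blocked! Call 01712345678 now or visit http://example.com/verify",
    "You have received Tk 500.00 from your employer.",
]

print(share_with(messages, "has_url"))
print(share_with(messages, "has_phone"))
```

Computing `share_with` separately for the fraud and legitimate classes gives the kind of per-class percentages the abstract reports, although the exact figures depend on how strictly URLs and phone numbers are defined.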

Executive Summary

This study investigates financial fraud detection in a multilingual Bangla-English setting, comparing classical machine learning models on TF-IDF features with a transformer model. Under 5-fold stratified cross-validation, Linear SVM performs best with 91.59 percent accuracy and 91.30 percent F1 score, ahead of the transformer model (89.49 percent accuracy, 88.88 percent F1). The transformer achieves higher fraud recall (94.19 percent) but produces more false positives. The findings suggest that classical machine learning with well-crafted features remains competitive for multilingual fraud detection, while also underscoring the challenges posed by linguistic diversity, code-mixing, and low-resource language constraints.

Key Points

▸ The study focuses on financial fraud detection in a multilingual Bangla-English setting.
▸ Classical machine learning models, specifically Linear SVM, achieve the best performance with 91.59 percent accuracy and 91.30 percent F1 score.
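▸ The transformer model trails Linear SVM by about 2 percentage points (89.49 percent accuracy, 88.88 percent F1), although it achieves higher fraud recall (94.19 percent) at the cost of more false positives.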

Merits

Strength of the Study

The study addresses a critical research challenge by focusing on financial fraud detection in a multilingual Bangla-English setting, a previously underexplored domain.

Methodological Rigor

The study uses 5-fold stratified cross-validation and compares classical machine learning models (Logistic Regression, Linear SVM, and ensemble classifiers) on TF-IDF features against a transformer-based model under the same protocol.
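Stratified cross-validation keeps the class ratio roughly equal across folds, which matters when fraud and legitimate messages are imbalanced. The sketch below shows one simple way to build stratified folds in plain Python. The function names, the round-robin assignment, and the toy labels are assumptions for illustration; the paper's own tooling is not described here, and libraries such as scikit-learn provide an equivalent splitter.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Return k lists of indices, each with roughly the overall class ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Toy labels (hypothetical): 10 fraud and 15 legitimate messages.
labels = ["fraud"] * 10 + ["legit"] * 15
for train_idx, test_idx in train_test_splits(labels):
    n_fraud = sum(labels[i] == "fraud" for i in test_idx)
    print(len(test_idx), n_fraud)  # each test fold: 5 messages, 2 of them fraud
```

A model would be trained on `train_idx` and scored on `test_idx` in each iteration, with the reported accuracy and F1 being averages over the five folds.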

Insights into Linguistic Diversity

The study provides valuable insights into the challenges posed by linguistic diversity, code-mixing, and low-resource language constraints in multilingual fraud detection.
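Code-mixing can be made concrete by checking which scripts a message uses. The sketch below flags messages that contain a meaningful share of both Bangla-script and Latin-script letters, using the Unicode Bengali block (U+0980 to U+09FF). The function names and the 10 percent threshold are assumptions for illustration, not the paper's method. Romanized Bangla written entirely in Latin letters would not be detected this way.

```python
def script_profile(message: str) -> dict:
    """Share of Bangla-block versus ASCII Latin letters among counted characters."""
    bangla = sum(1 for ch in message if "\u0980" <= ch <= "\u09FF")
    latin = sum(1 for ch in message if ch.isascii() and ch.isalpha())
    total = bangla + latin
    return {
        "bangla_share": bangla / total if total else 0.0,
        "latin_share": latin / total if total else 0.0,
    }


def is_code_mixed(message: str, min_share: float = 0.1) -> bool:
    """True when both scripts make up at least `min_share` of the letters."""
    profile = script_profile(message)
    return profile["bangla_share"] >= min_share and profile["latin_share"] >= min_share


# Hypothetical examples, not drawn from the paper's dataset.
print(is_code_mixed("আপনার account block হয়েছে"))  # Bangla and English mixed
print(is_code_mixed("Your account is blocked"))     # English only
```

A flag like this could be used to report results separately for monolingual and code-mixed messages, which would show where each model struggles.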

Demerits

Limitation of the Study

The study focuses on a specific language pair (Bangla-English) and may not be generalizable to other language pairs or domains.

Overemphasis on Accuracy Metrics

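The headline results emphasize accuracy and F1 score. The abstract also reports the transformer's fraud recall (94.19 percent) and notes its elevated false positive rate, but it gives no precision or false positive rate figures, which makes the operational cost of each model harder to compare.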
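Accuracy, precision, recall, F1 score, and false positive rate all derive from the same confusion matrix, so reporting them together costs little. The sketch below computes them from raw counts, treating fraud as the positive class. The function name and the counts are hypothetical and are not taken from the paper; they only illustrate how a model can pair high fraud recall with a noticeable false positive rate.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics, with fraud as the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Share of legitimate messages wrongly flagged as fraud.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical counts: 100 fraud and 100 legitimate messages.
metrics = binary_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, recall is 0.90 while the false positive rate is 0.20, a trade-off that accuracy alone (0.85) does not reveal.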

Expert Commentary

The study's findings have significant implications for the development of machine learning models for financial fraud detection in multilingual contexts. The comparison of classical models and a transformer in a Bangla-English setting is valuable, but the challenges posed by linguistic diversity and low-resource language constraints are not unique to this language pair. Future research should aim to develop models that handle these challenges effectively and should evaluate them with a broader range of metrics. The findings also point to the need for more research on the role of linguistic diversity in financial crime prevention.

Recommendations

✓ Future research should prioritize the development of machine learning models that can effectively handle linguistic diversity and low-resource language constraints in multilingual contexts.
✓ Researchers should report a broader set of evaluation metrics, including precision, recall, and false positive rate alongside accuracy and F1 score.

Sources

arXiv - cs.LG

Multilingual Financial Fraud Detection Using Machine Learning and Transformer Models: A Bangla-English Study

