Survey of Text Mining Techniques Applied to Judicial Decisions Prediction

Olga Alejandra Alcántara Francia

This paper reviews recent literature on experiments that apply Machine Learning, Deep Learning and Natural Language Processing techniques to the prediction of judicial and administrative decisions. Among the main findings, the most used data mining techniques are Support Vector Machine (SVM), k-Nearest Neighbours (k-NN) and Random Forest (RF), while the most used deep learning techniques are Long Short-Term Memory (LSTM) networks and transformers such as BERT. In the papers reviewed, machine learning techniques have prevailed over deep learning ones. Regarding the origin of the research, 64% of the works come from English-speaking countries, 8% from Portuguese-speaking countries and 28% from countries where other languages are spoken (such as German, Chinese, Turkish and Spanish). Very few works of this type have been carried out in Spanish-speaking countries. The works are classified, on the one hand, by the classifiers used to predict legally relevant events or judicial decisions and, on the other hand, by the branch of law to which the classifiers are applied: criminal, constitutional, human rights, administrative, intellectual property, family law, tax law and others. The corpus size analyzed in the reviewed works reached 100,000 documents in 2020. Finally, the reported accuracy of these predictive techniques exceeds 60% in different branches of law.

Executive Summary

The article surveys text mining techniques applied to the prediction of judicial decisions and highlights the prevalence of machine learning over deep learning. It identifies Support Vector Machine (SVM), k-Nearest Neighbours (k-NN) and Random Forest (RF) as the most commonly used data mining techniques, and Long Short-Term Memory (LSTM) networks and transformers such as BERT as the most used deep learning techniques. The research also reveals a marked geographical disparity: 64% of the studies originate from English-speaking countries, and very few from Spanish-speaking countries. The reported accuracy of these predictive techniques exceeds 60% across various branches of law, and the corpus size analyzed reached 100,000 documents in 2020.

Key Points

  • Machine learning techniques prevail over deep learning in predicting judicial decisions.
  • SVM, K-NN, and RF are the most used data mining techniques.
  • LSTM and BERT are prominent deep learning techniques in this field.
  • 64% of studies originate from English-speaking countries, with a lack of research in Spanish-speaking countries.
  • Predictive techniques achieve over 60% accuracy in various legal domains.
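
To make one of the classical techniques named above concrete, the following is a minimal sketch of a k-NN text classifier over bag-of-words vectors, written in plain Python. It is not taken from the survey or from any study it reviews: the function names, the whitespace tokenisation, the cosine similarity measure and the six training snippets are all assumptions invented for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Return the majority label among the k training texts most similar to `text`."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda pair: cosine(bag_of_words(pair[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: invented snippets, not real judicial decisions.
TRAIN = [
    ("the appeal is dismissed and the conviction upheld", "dismissed"),
    ("the court dismissed the appeal for lack of merit", "dismissed"),
    ("appeal dismissed with costs awarded to the respondent", "dismissed"),
    ("the appeal is allowed and the conviction quashed", "allowed"),
    ("the court allowed the appeal and ordered a retrial", "allowed"),
    ("appeal allowed and the sentence set aside", "allowed"),
]

prediction = knn_predict(TRAIN, "the conviction is quashed and the appeal allowed")
print(prediction)
```

Raw token counts let frequent words such as "the" dominate the similarity score, which is one reason real pipelines typically add stop-word removal or TF-IDF weighting before classification.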

Merits

Comprehensive Review

The article provides a thorough review of various text mining techniques applied to judicial decision prediction, offering a clear overview of the current state of research.

Geographical Insight

The study highlights the geographical distribution of research, providing valuable insights into the global landscape of legal text mining.

Accuracy Metrics

The article reports on the accuracy of predictive techniques, which is crucial for understanding their practical applicability.

Demerits

Limited Scope

The article does not delve deeply into the specific methodologies or datasets used in the studies, which could provide more nuanced insights.

Geographical Bias

The predominance of studies from English-speaking countries may introduce a bias, limiting the generalizability of the findings.

Lack of Detailed Analysis

The article could benefit from a more detailed analysis of the reasons behind the prevalence of certain techniques and the challenges faced in their application.

Expert Commentary

The article offers a valuable overview of the current landscape of text mining techniques applied to judicial decision prediction. The prevalence of machine learning techniques over deep learning is noteworthy and may indicate that simpler models remain competitive in this context, possibly due to the structured nature of legal texts, although prevalence alone does not establish that they are more effective. The geographical distribution of research highlights a significant gap, particularly in Spanish-speaking countries, which may be due to linguistic barriers or resource limitations. The reported accuracy of over 60% is promising but should be interpreted with caution, as accuracy can vary widely depending on the specific legal domain and the quality of the training data. Future research should aim to address these geographical disparities and provide more detailed analyses of the methodologies and datasets used, to offer a more comprehensive understanding of the field.
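
One way to see why an accuracy figure needs context is to compare it with the accuracy of always predicting the most frequent outcome. The sketch below does this in plain Python. The comparison is an assumption added here, not something the survey reports, and the label lists are hypothetical values chosen only to make the arithmetic visible.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent true label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical outcomes for ten cases: seven dismissed, three allowed.
Y_TRUE = ["dismissed"] * 7 + ["allowed"] * 3
# Hypothetical model output: six of the ten predictions are correct.
Y_PRED = ["dismissed"] * 5 + ["allowed"] * 3 + ["dismissed"] * 2

model_accuracy = accuracy(Y_TRUE, Y_PRED)
baseline_accuracy = majority_baseline_accuracy(Y_TRUE)
print(model_accuracy, baseline_accuracy)
```

In this invented example the model scores 60% while the trivial baseline scores 70%, so the same headline figure can mean very different things depending on how the outcomes are distributed in a given legal domain.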

Recommendations

  • Encourage more research in underrepresented regions to ensure a more balanced global perspective.
  • Conduct detailed case studies on specific legal domains to understand the nuances of applying these techniques in different contexts.
