Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology

Editors: Wanxiang Che, Ekaterina Shutova
Anthology ID: 2022.emnlp-demos
Month: December
Year: 2022
Address: Abu Dhabi, UAE
Venue: EMNLP
Publisher: Association for Computational Linguistics
URL: https://aclanthology.org/2022.emnlp-demos/
DOI: 10.18653/v1/2022.emnlp-demos
PDF: https://aclanthology.org/2022.emnlp-demos.pdf

CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding
Authors: Zhuoran Jin, Tianyi Men, Hongbang Yuan, Yuyang Zhou, Pengfei Cao, Yubo Chen, Zhipeng Xue, Kang Liu, Jun Zhao
As the first step of modern natural language processing, text representation encodes discrete texts as continuous embeddings. Pre-trained language models (PLMs) have demonstrated strong ability in text representation and significantly promoted the development of natural language understanding (NLU). However, existing PLMs represent a text solely by its context, which is not enough to support knowledge-intensive NLU tasks. Knowledge is power, and fusing external knowledge explicitly into PLMs can provide knowledgeable text representations. Because previous knowledge-enhanced methods differ in many aspects, it is difficult to reproduce previous methods, implement new methods, and transfer between different methods; a unified paradigm encompassing all kinds of methods in one framework is therefore highly desirable. In this paper, we propose CogKTR, a knowledge-enhanced text representation toolkit for natural language understanding.
According to our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR consists of four key stages: knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks. Our unified, knowledgeable and modular toolkit is publicly available at GitHub, with an online system and a short instruction video.

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
Authors: Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg
The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred wide interest in interpreting their predictions. However, current interpretation methods mostly focus on probing models from outside, executing behavioral tests, and analyzing salient input features, while the internal prediction construction process is largely not understood. In this work, we introduce LM-Debugger, an interactive debugger tool for transformer-based LMs, which provides a fine-grained interpretation of the model’s internal prediction process, as well as a powerful framework for intervening in LM behavior. For its backbone, LM-Debugger relies on a recent method that interprets the inner token representations and their updates by the feed-forward layers in the vocabulary space. We demonstrate the utility of LM-Debugger for single-prediction debugging by inspecting the internal disambiguation process done by GPT2. Moreover, we show how easily LM-Debugger allows users to shift model behavior in a direction of their choice, by identifying a few vectors in the network and inducing effective interventions to the prediction process. We release LM-Debugger as an open-source tool and a demo over GPT2 models.

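The backbone idea behind LM-Debugger (reading internal representation updates in vocabulary space) can be illustrated with a toy sketch. Everything below (the mini-vocabulary, the embedding matrix, and the update vector) is invented for illustration; this is not LM-Debugger's actual interface or data.

```python
# Toy illustration: projecting a hidden-state update onto an output
# embedding matrix so it can be read as scores over the vocabulary.
# All names and numbers here are invented for illustration.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_tokens(update, embedding, vocab, k=2):
    """Score each vocabulary token by its dot product with the update
    vector and return the k best-matching tokens."""
    scores = {tok: dot(update, emb) for tok, emb in zip(vocab, embedding)}
    return sorted(scores, key=scores.get, reverse=True)[:k]

vocab = ["cat", "dog", "car", "tree"]
embedding = [
    [1.0, 0.0, 0.2],   # cat
    [0.9, 0.1, 0.1],   # dog
    [0.0, 1.0, 0.0],   # car
    [0.1, 0.0, 1.0],   # tree
]

# A feed-forward "value vector" update that points toward animal tokens.
update = [1.0, 0.0, 0.0]
print(top_tokens(update, embedding, vocab))  # ['cat', 'dog']
```

Reading an update this way shows which tokens it promotes, which is the kind of per-layer inspection the tool exposes at real model scale.
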
EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing
Authors: Chengyu Wang, Minghui Qiu, Taolin Zhang, Tingting Liu, Lei Li, Jianing Wang, Ming Wang, Jun Huang, Wei Lin
Pre-Trained Models (PTMs) have reshaped the development of Natural Language Processing (NLP) and achieved significant improvement in various benchmarks. Yet, it is not easy for industrial practitioners to obtain high-performing PTM-based models without a large amount of labeled training data and deploy them online with fast inference speed. To bridge this gap, EasyNLP is designed to make it easy to build NLP applications, supporting a comprehensive suite of NLP algorithms. It further features knowledge-enhanced pre-training, knowledge distillation and few-shot learning functionalities, and provides a unified framework of model training, inference and deployment for real-world applications. EasyNLP has powered over ten business units within Alibaba Group and is seamlessly integrated to the Platform of AI (PAI) products on Alibaba Cloud. The source code of EasyNLP is released at GitHub (https://github.com/alibaba/EasyNLP).

An Explainable Toolbox for Evaluating Pre-trained Vision-Language Models
Authors: Tiancheng Zhao, Tianqi Zhang, Mingwei Zhu, Haozhan Shen, Kyusong Lee, Xiaopeng Lu, Jianwei Yin
We introduce VL-CheckList, a toolbox for evaluating Vision-Language Pretraining (VLP) models, including the preliminary datasets that deepen the image-texting ability of a VLP model. Most existing VLP works evaluated their systems by comparing fine-tuned downstream task performance. However, average downstream task accuracy alone provides little information about the pros and cons of each VLP method. In this paper, we demonstrate how minor input changes in language and vision affect the prediction outputs. Then, we describe detailed user guidelines for utilizing the toolbox and contributing to the community.
We show new findings on one of the representative VLP models to provide an example analysis. The data and code are available at https://github.com/om-ai-lab/VL-CheckList

TweetNLP: Cutting-Edge Natural Language Processing for Social Media
Authors: Jose Camacho-collados, Kiamehr Rezaee, Talayeh Riahi, Asahi Ushio, Daniel Loureiro, Dimosthenis Antypas, Joanne Boisson, Luis Espinosa Anke, Fangyu Liu, Eugenio Martínez Cámara
In this paper we present TweetNLP, an integrated platform for Natural Language Processing (NLP) in social media. TweetNLP supports a diverse set of NLP tasks, including generic focus areas such as sentiment analysis and named entity recognition, as well as social media-specific tasks such as emoji prediction and offensive language identification. Task-specific systems are powered by reasonably-sized Transformer-based language models specialized on social media text (in particular, Twitter), which can be run without the need for dedicated hardware or cloud services. The main contributions of TweetNLP are: (1) an integrated Python library for a modern toolkit supporting social media analysis using our various task-specific models adapted to the social domain; (2) an interactive online demo for codeless experimentation using our models; and (3) a tutorial covering a wide variety of typical social media applications.

JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Authors: Mayumi Ohta, Julia Kreutzer, Stefan Riezler
JoeyS2T is a JoeyNMT extension for speech-to-text tasks such as automatic speech recognition and end-to-end speech translation. It inherits the core philosophy of JoeyNMT, a minimalist NMT toolkit built on PyTorch, seeking simplicity and accessibility. JoeyS2T’s workflow is self-contained, starting from data pre-processing, over model training and prediction, to evaluation, and is seamlessly integrated into JoeyNMT’s compact and simple code base.
On top of JoeyNMT’s state-of-the-art Transformer-based Encoder-Decoder architecture, JoeyS2T provides speech-oriented components such as convolutional layers, SpecAugment, CTC-loss, and WER evaluation. Despite its simplicity compared to prior implementations, JoeyS2T performs competitively on English speech recognition and English-to-German speech translation benchmarks. The implementation is accompanied by a walk-through tutorial and available at https://github.com/may-/joeys2t

FairLib: A Unified Framework for Assessing and Improving Fairness
Authors: Xudong Han, Aili Shen, Yitong Li, Lea Frermann, Timothy Baldwin, Trevor Cohn
This paper presents FairLib, an open-source Python library for assessing and improving model fairness. It provides a systematic framework for quickly accessing benchmark datasets, reproducing existing debiasing baseline models, developing new methods, evaluating models with different metrics, and visualizing their results. Its modularity and extensibility enable the framework to be used for diverse types of inputs, including natural language, images, and audio. We implement 14 debiasing methods, including pre-processing, at-training-time, and post-processing approaches. The built-in metrics cover the most commonly acknowledged fairness criteria and can be further generalized and customized for fairness evaluation.

ELEVANT: A Fully Automatic Fine-Grained Entity Linking Evaluation and Analysis Tool
Authors: Hannah Bast, Matthias Hertel, Natalie Prange
We present Elevant, a tool for the fully automatic fine-grained evaluation of a set of entity linkers on a set of benchmarks. Elevant provides an automatic breakdown of the performance by various error categories and by entity type. Elevant also provides a rich and compact, yet very intuitive and self-explanatory visualization of the results of a linker on a benchmark in comparison to the ground truth.
A live demo, a link to the complete code base on GitHub, and a link to a demo video are provided at https://elevant.cs.uni-freiburg.de

A Pipeline for Generating, Annotating and Employing Synthetic Data for Real World Question Answering
Authors: Matt Maufe, James Ravenscroft, Rob Procter, Maria Liakata
Question Answering (QA) is a growing area of research, often used to facilitate the extraction of information from within documents. State-of-the-art QA models are usually pre-trained on domain-general corpora like Wikipedia and thus tend to struggle on out-of-domain documents without fine-tuning. We demonstrate that synthetic domain-specific datasets can be generated easily using domain-general models, while still providing significant improvements to QA performance. We present two new tools for this task: a flexible pipeline for validating the synthetic QA data and training downstream models on it, and an online interface to facilitate human annotation of this generated data. Using this interface, crowdworkers labelled 1117 synthetic QA pairs, which we then used to fine-tune downstream models and improve domain-specific QA performance by 8.75 F1.

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
Authors: Ningyu Zhang, Xin Xu, Liankuan Tao, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li
We present an open-source and extensible knowledge extraction toolkit, DeepKE, supporting complicated low-resource, document-level and multimodal scenarios in knowledge base population. DeepKE implements various information extraction tasks, including named entity recognition, relation extraction and attribute extraction. With a unified framework, DeepKE allows developers and researchers to customize datasets and models to extract information from unstructured data according to their requirements.
Specifically, DeepKE not only provides various functional modules and model implementations for different tasks and scenarios but also organizes all components within consistent frameworks to maintain sufficient modularity and extensibility. We release the source code at GitHub (https://github.com/zjunlp/DeepKE) with Google Colab tutorials and comprehensive documentation for beginners. Besides, we present an online system at http://deepke.openkg.cn/EN/re_doc_show.html for real-time extraction of various tasks, along with a demo video.

AnEMIC: A Framework for Benchmarking ICD Coding Models
Authors: Juyong Kim, Abheesht Sharma, Suhas Shanbhogue, Jeremy Weiss, Pradeep Ravikumar
Diagnostic coding, or ICD coding, is the task of assigning diagnosis codes defined by the ICD (International Classification of Diseases) standard to patient visits based on clinical notes. The current process of manual ICD coding is time-consuming and often error-prone, which suggests the need for automatic ICD coding. However, despite the long history of automatic ICD coding, there have been no standardized frameworks for benchmarking ICD coding models. We open-source an easy-to-use tool named AnEMIC, which provides a streamlined pipeline for preprocessing, training, and evaluating automatic ICD coding. We correct errors in the preprocessing of existing works, and provide key models and weights trained on the correctly preprocessed datasets. We also provide an interactive demo performing real-time inference from custom inputs, and visualizations drawn from explainable AI to analyze the models. We hope the framework helps move the research of ICD coding forward and helps professionals explore the potential of ICD coding. The framework and the associated code are available here.

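ICD coding, as described in the AnEMIC abstract, is a multi-label task, and benchmarks for it typically report micro-averaged F1. A minimal sketch of that metric follows; the ICD codes in the example are placeholders, and this is not AnEMIC's actual evaluation code.

```python
# Toy sketch of micro-averaged F1 for multi-label ICD coding, the kind
# of metric an ICD-coding benchmark reports. The codes below are
# placeholders, not outputs of any real model.

def micro_f1(gold_sets, pred_sets):
    """Micro-F1 over a batch: pool true positives, false positives, and
    false negatives across all visits before computing precision/recall."""
    tp = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    fp = sum(len(p - g) for g, p in zip(gold_sets, pred_sets))
    fn = sum(len(g - p) for g, p in zip(gold_sets, pred_sets))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = [{"401.9", "250.00"}, {"414.01"}]
pred = [{"401.9"}, {"414.01", "428.0"}]
print(round(micro_f1(gold, pred), 3))  # 0.667
```

Micro-averaging weights frequent codes more heavily than macro-averaging, which is why benchmarks usually report both.
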
SPEAR: Semi-supervised Data Programming in Python
Authors: Guttu Abhishek, Harshad Ingole, Parth Laturia, Vineeth Dorna, Ayush Maheshwari, Ganesh Ramakrishnan, Rishabh Iyer
We present SPEAR, an open-source Python library for data programming with semi-supervision. The package implements several recent data programming approaches, including facilities to programmatically label and build training data. SPEAR facilitates weak supervision in the form of heuristics (or rules) and the association of noisy labels with the training dataset. These noisy labels are aggregated to assign labels to the unlabeled data for downstream tasks. We have implemented several label aggregation approaches that aggregate the noisy labels and then train using the noisily labeled set in a cascaded manner. Our implementation also includes other approaches that jointly aggregate and train the model for text classification tasks. Thus, in our Python package, we integrate several cascaded and joint data-programming approaches while also providing the facility of data programming by letting the user define labeling functions or rules. The code and tutorial notebooks are available at https://github.com/decile-team/spear , extensive documentation at https://spear-decile.readthedocs.io/ , and video tutorials at https://youtube.com/playlist?list=PLW8agt_HvkVnOJoJAqBpaerFb-z-ZlqlP . We also present some real-world use cases of SPEAR.

Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
Authors: Leandro Von Werra, Lewis Tunstall, Abhishek Thakur, Sasha
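The data-programming workflow described in the SPEAR abstract (rules emit noisy labels, which are then aggregated) can be sketched minimally. The rules and the simple majority-vote aggregator below are illustrative stand-ins, not SPEAR's actual API, which also offers learned aggregation and joint training.

```python
# Minimal sketch of data programming: heuristic "labeling functions"
# emit noisy labels or abstain, and a majority vote aggregates them
# into one label per example. Illustrative only; not SPEAR's API.
from collections import Counter

ABSTAIN, SPAM, HAM = None, 1, 0

def lf_contains_offer(text):
    return SPAM if "offer" in text.lower() else ABSTAIN

def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_short_message(text):
    return HAM if len(text.split()) <= 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_offer, lf_contains_free, lf_short_message]

def majority_vote(text):
    """Aggregate non-abstaining votes; return None if all rules abstain."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]

print(majority_vote("Free offer, click now"))  # 1 (spam rules fire)
print(majority_vote("See you soon"))           # 0 (short message)
```

The aggregated labels can then train a downstream classifier, in a cascaded fashion as the abstract describes.
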

Executive Summary

The Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) System Demonstrations showcase a wide range of NLP tools; this summary highlights two notable contributions. The first, CogKTR, introduces a knowledge-enhanced text representation toolkit designed to integrate external knowledge into pre-trained language models (PLMs) to improve natural language understanding (NLU) tasks. The second, LM-Debugger, offers an interactive tool for inspecting and intervening in transformer-based language models, addressing their opacity and unexplained behavior. Both tools aim to enhance the transparency, reproducibility, and effectiveness of NLP systems.

Key Points

  • CogKTR provides a unified framework for knowledge-enhanced text representation in NLU tasks.
  • LM-Debugger offers an interactive tool for inspecting and intervening in transformer-based language models.
  • Both tools address practical gaps in current NLP workflows: integrating external knowledge into NLU models, and making model internals interpretable.

Merits

Innovative Framework

CogKTR's Unified Knowledge-Enhanced Paradigm (UniKEP) offers a comprehensive and modular approach to integrating external knowledge into PLMs, making it easier to reproduce, implement, and transfer between different methods.
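The four UniKEP stages (acquisition, representation, injection, application) can be pictured as a composable pipeline. The sketch below is hypothetical: the function names, the stub knowledge base, and the fusion-by-summation step are assumptions for illustration, not CogKTR's real interfaces.

```python
# Hypothetical sketch of the four UniKEP stages as a pipeline.
# All names and data shapes are invented for illustration.

def acquire_knowledge(text):
    """Stage 1: look up external facts relevant to the text (stubbed)."""
    knowledge_base = {"Paris": "capital_of France"}
    return [fact for entity, fact in knowledge_base.items() if entity in text]

def represent_knowledge(facts):
    """Stage 2: turn facts into (toy) vector representations."""
    return [[float(len(tok)) for tok in fact.split()] for fact in facts]

def inject_knowledge(text_embedding, knowledge_embeddings):
    """Stage 3: fuse knowledge into the text representation (here, by
    element-wise summation with each knowledge vector)."""
    fused = list(text_embedding)
    for emb in knowledge_embeddings:
        fused = [a + b for a, b in zip(fused, emb)]
    return fused

def apply_to_task(fused_embedding):
    """Stage 4: feed the enriched representation to a downstream task
    (stubbed as a threshold on the first feature)."""
    return "knowledge-rich" if fused_embedding[0] > 5 else "plain"

facts = acquire_knowledge("Paris is beautiful")
fused = inject_knowledge([1.0, 1.0], represent_knowledge(facts))
print(apply_to_task(fused))  # knowledge-rich
```

The value of such a decomposition is that each stage can be swapped independently, which is what makes reproducing and transferring between methods easier.
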

Interactive Debugging

LM-Debugger provides an interactive tool that allows for deeper inspection and intervention in transformer-based language models, enhancing transparency and interpretability.
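The intervention side of this idea can be shown with a toy model: if the output is a sum of coefficient-weighted value vectors in vocabulary space, boosting one coefficient steers the prediction. The vocabulary, vectors, and coefficients below are invented; this is not LM-Debugger's interface.

```python
# Toy illustration of intervening on a prediction: amplify one value
# vector's coefficient and the top predicted token changes.
# All names and numbers are invented for illustration.

VOCAB = ["happy", "sad"]

# Two "value vectors" expressed directly as scores per vocabulary token.
VALUE_VECTORS = {
    "v_positive": [2.0, 0.0],
    "v_negative": [0.0, 1.5],
}

def predict(coefficients):
    """Sum coefficient-weighted value vectors; return the top token."""
    logits = [0.0] * len(VOCAB)
    for name, coef in coefficients.items():
        for i, score in enumerate(VALUE_VECTORS[name]):
            logits[i] += coef * score
    return VOCAB[max(range(len(VOCAB)), key=logits.__getitem__)]

baseline = {"v_positive": 1.0, "v_negative": 1.0}
print(predict(baseline))  # happy (2.0 vs 1.5)

# Intervention: amplify the negative value vector's coefficient.
intervened = dict(baseline, v_negative=2.0)
print(predict(intervened))  # sad (2.0 vs 3.0)
```

At real model scale the same principle applies: identifying a few influential vectors is enough to shift behavior in a chosen direction.
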

Demerits

Complexity

The integration of external knowledge into PLMs, as proposed by CogKTR, may introduce additional complexity and computational overhead, which could be a barrier for some users.

Limited Scope

LM-Debugger focuses primarily on transformer-based language models, which may limit its applicability to other types of NLP models.

Expert Commentary

The Proceedings of the 2022 EMNLP System Demonstrations highlight two innovative tools that address critical gaps in the field of NLP. CogKTR's Unified Knowledge-Enhanced Paradigm (UniKEP) represents a significant advancement in the integration of external knowledge into PLMs, offering a modular and unified framework that simplifies the process of reproducing, implementing, and transferring between different methods. This toolkit is particularly valuable for knowledge-intensive NLU tasks, where the ability to incorporate external knowledge can significantly enhance model performance. On the other hand, LM-Debugger provides an interactive tool for inspecting and intervening in transformer-based language models, addressing the opacity and unexplained behavior that has been a persistent challenge in the field. By offering deeper insights into the internal prediction construction process, LM-Debugger enhances transparency and interpretability, which are crucial for the ethical and responsible use of NLP models. Together, these tools contribute to the ongoing efforts to make NLP systems more effective, transparent, and accountable.

Recommendations

  • Researchers and practitioners should explore the use of CogKTR for enhancing the performance of NLU tasks by integrating external knowledge into PLMs.
  • Developers of NLP models should consider incorporating LM-Debugger into their workflow to improve model interpretability and transparency.
