MSQ-BioBERT: Ambiguity Resolution to Enhance BioBERT Medical Question-Answering

Document Type

Conference Proceeding

Publication Title

ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023

Publication Date



Question answering (QA) is a task in natural language processing (NLP) and information retrieval with pivotal applications in areas such as online reading comprehension and web search engines. Currently, Bidirectional Encoder Representations from Transformers (BERT) and its biomedical variant (BioBERT) achieve impressive results on reading-comprehension and medical QA datasets, and so they are widely used for a variety of passage-based QA tasks. However, their performance deteriorates rapidly when they encounter ambiguity in the passage or context. This issue is prevalent and unavoidable in many fields, notably the web-based medical field. In this paper, we introduce a novel approach called Multiple Synonymous Questions BioBERT (MSQ-BioBERT), which integrates question augmentation, using multiple synonymous questions in place of the single question used by traditional BioBERT, to improve BioBERT's performance on medical QA tasks. In addition, we construct an ambiguous medical dataset based on information from Wikipedia. Experiments on both this web-based medical dataset and open biomedical datasets demonstrate significant performance gains for the MSQ-BioBERT approach, showcasing a new method for addressing ambiguity in medical QA tasks.
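
The general idea of asking several synonymous questions instead of one can be illustrated with a minimal sketch. The abstract does not say how MSQ-BioBERT combines the questions, so the aggregation rule here (summing per-answer confidence across paraphrases) is an assumption made purely for illustration, not the authors' method. The names `answer_with_synonymous_questions` and `toy_model` are hypothetical, and `toy_model` is a canned stand-in for a fine-tuned BioBERT reader.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a QA model maps (question, passage)
# to (answer_span, confidence).
QAModel = Callable[[str, str], Tuple[str, float]]


def answer_with_synonymous_questions(
    qa_model: QAModel, questions: List[str], passage: str
) -> str:
    """Ask every paraphrase and return the answer with the highest summed confidence.

    Assumption: summed confidence is the aggregation rule; the paper may
    combine the synonymous questions differently.
    """
    if not questions:
        raise ValueError("at least one question is required")
    totals: Dict[str, float] = defaultdict(float)
    for question in questions:
        answer, score = qa_model(question, passage)
        # Normalize case and whitespace so identical spans pool their scores.
        totals[answer.strip().lower()] += score
    return max(totals, key=lambda answer: totals[answer])


def toy_model(question: str, passage: str) -> Tuple[str, float]:
    """Canned stand-in for a real reader; the outputs are invented for the demo."""
    canned = {
        "What drug treats the condition?": ("aspirin", 0.40),
        "Which medication is used for the condition?": ("Ibuprofen", 0.35),
        "What medicine is given for the condition?": ("ibuprofen", 0.30),
    }
    return canned[question]


if __name__ == "__main__":
    paraphrases = [
        "What drug treats the condition?",
        "Which medication is used for the condition?",
        "What medicine is given for the condition?",
    ]
    print(answer_with_synonymous_questions(toy_model, paraphrases, "passage text"))
```

In this toy setup the first question alone yields a different answer from the pooled result, which is the kind of single-question ambiguity the abstract describes. A real reader would replace `toy_model` with a model call that returns an answer span and a score.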

First Page


Last Page