TY - GEN
T1 - A Knowledge-Graph-Based Intrinsic Test for Benchmarking Medical Concept Embeddings and Pretrained Language Models
AU - Aracena, Claudio
AU - Villena, Fabián
AU - Rojas, Matias
AU - Dunstan, Jocelyn
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Using language models created from large data sources has improved the performance of several deep-learning-based architectures, achieving state-of-the-art results on several extrinsic NLP tasks. However, little research has addressed creating intrinsic tests that allow us to compare the quality of different language models when obtaining contextualized embeddings. This gap widens further when working on specific domains in languages other than English. This paper proposes a novel graph-based intrinsic test that allows us to measure the quality of different language models in the clinical and biomedical domains in Spanish. Our results show that clinical and biomedical language models perform better on our intrinsic test than a general-domain one. The test also correlates with better outcomes on an NER task using a probing model over contextualized embeddings. We hope our work will help the clinical NLP research community evaluate and compare new language models in other languages and find the most suitable models for solving downstream tasks.
AB - Using language models created from large data sources has improved the performance of several deep-learning-based architectures, achieving state-of-the-art results on several extrinsic NLP tasks. However, little research has addressed creating intrinsic tests that allow us to compare the quality of different language models when obtaining contextualized embeddings. This gap widens further when working on specific domains in languages other than English. This paper proposes a novel graph-based intrinsic test that allows us to measure the quality of different language models in the clinical and biomedical domains in Spanish. Our results show that clinical and biomedical language models perform better on our intrinsic test than a general-domain one. The test also correlates with better outcomes on an NER task using a probing model over contextualized embeddings. We hope our work will help the clinical NLP research community evaluate and compare new language models in other languages and find the most suitable models for solving downstream tasks.
UR - https://www.scopus.com/pages/publications/85154552744
M3 - Conference contribution
AN - SCOPUS:85154552744
T3 - LOUHI 2022 - 13th International Workshop on Health Text Mining and Information Analysis, Proceedings of the Workshop
SP - 197
EP - 206
BT - LOUHI 2022 - 13th International Workshop on Health Text Mining and Information Analysis, Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 13th International Workshop on Health Text Mining and Information Analysis, LOUHI 2022, co-located with EMNLP 2022
Y2 - 7 December 2022
ER -