EMNLP | SustaiNLP 2020
Authors: Harshil Shah, Julien Fauqueur
Abstract
Extracting biomedical relations from large corpora of scientific documents is a challenging natural language processing task. Existing approaches usually focus on identifying a relation either in a single sentence (mention-level) or across an entire corpus (pair-level). In both cases, recent methods have achieved strong results by learning a point estimate to represent the relation; this is then used as the input to a relation classifier. However, the relation expressed in the text between a pair of biomedical entities is often more complex than can be captured by a point estimate. To address this issue, we propose a latent variable model with an arbitrarily flexible distribution to represent the relation between an entity pair. Additionally, our model provides a unified architecture for both mention-level and pair-level relation extraction. We demonstrate that our model achieves results competitive with strong baselines for both tasks while having fewer parameters and being significantly faster to train. We make our code publicly available.
GitHub
The code can be accessed on GitHub here.
![](https://bai-13902-s3.s3.eu-west-2.amazonaws.com/media/5416/7113/9908/5fbd30ac7534b058c4688f0a_Screenshot_2020-11-24_at_16.11.05.png)