BERTKG-DDI: Towards Incorporating Entity-specific Knowledge Graph Information in Predicting Drug-Drug Interactions

Ishani Mondal

Abstract

Off-the-shelf biomedical embeddings obtained from recently released pre-trained language models (such as BERT and XLNet) have demonstrated state-of-the-art results (in terms of accuracy) on various natural language understanding (NLU) tasks in the biomedical domain. Relation Classification (RC) is one of the most critical of these tasks. In this paper, we explore how to incorporate domain knowledge of biomedical entities (such as drugs, diseases, and genes), obtained from Knowledge Graph (KG) embeddings, for predicting Drug-Drug Interactions from a textual corpus. We propose a new method, BERTKG-DDI, that combines drug embeddings, obtained from the drugs' interactions with other biomedical entities, with a domain-specific BioBERT embedding-based RC architecture. Experiments conducted on the DDIExtraction 2013 corpus clearly indicate that this strategy outperforms other baseline architectures by 4.1% in macro F1-score.

Introduction

During the concurrent administration of multiple drugs to a patient, an ailment might get cured, or the combination can lead to serious side-effects. These types of interactions are known as Drug-Drug Interactions (DDIs). Predicting DDIs is a difficult task, as it requires understanding the underlying mechanism of action of the interacting drugs. Numerous recent research efforts have addressed the automatic extraction of DDIs from textual corpora (Sahu and Anand 2018; Liu et al. 2016; Sun et al. 2019; Li and Ji 2019; Mondal 2020) and the prediction of unknown DDIs from KGs (Purkayastha et al. 2019). Automatic extraction of DDIs from texts helps maintain large-scale databases and thereby assists medical experts in their diagnoses.

In parallel to the progress of DDI extraction from textual corpora, researchers have recently proposed various strategies of augmenting chemical structure information and textual descriptions of the drugs (Zhu et al. 2020) to improve DDI prediction performance from corpora and Knowledge Graphs. DDI prediction from a textual corpus has been framed by earlier researchers as a relation classification problem (Sahu and Anand 2018; Liu et al. 2016; Sun et al. 2019; Li and Ji 2019) using CNN- or RNN-based neural networks.

Recently, with the massive success of pre-trained language models (Devlin et al. 2019; Yang et al. 2019) on many NLP classification tasks, we formulate DDI classification as a relation classification task that leverages both entity and contextual information. We propose a model that uses domain-specific contextual embeddings (BioBERT) (Lee et al. 2019) of the target entities (drugs) together with external information about them. In recent years, representation learning has played a pivotal role in solving various machine learning tasks.

In this work, we explore the direction of augmenting graph embeddings to predict the relation between two drugs in a textual corpus. We make use of an in-house Knowledge Graph (Bio-KG) built by curating the interactions among drugs, diseases, and genes from multiple ontologies. To capture the complex underlying mechanism of interactions among the biomedical entities, we apply translation-based and semantics-preserving heterogeneous graph embeddings to Bio-KG and augment the entity representations to jointly train the relation classification model. Experiments conducted on the DDIExtraction 2013 corpus (Herrero-Zazo et al. 2013) reveal that this method outperforms the existing baseline models and aligns with the emerging research direction of fusing heterogeneous information for DDI prediction. In a nutshell, the major contributions of this work are summarized as follows:

  1. We propose a novel method that jointly leverages textual and external knowledge information to classify the relation type between drug pairs mentioned in text, showing the efficacy of external entity-specific information.

  2. Our method achieves new state-of-the-art performance on the DDIExtraction 2013 corpus.

Problem Statement

Given an input instance or sentence s with two target drug entities d_1 and d_2, the task is to classify the type of relation y that the drugs hold between them, y ∈ {y_1, …, y_N}, where N denotes the number of relation types.

Methodology

Text-based Relation Classification

Our model for extracting DDIs from texts is based on the pre-trained BERT-based relation classification model of (Wu and He 2019). Given a sentence s with drugs d_1 and d_2, let H be the final hidden-state output from the BERT module, where the vectors H_i to H_j are the final hidden states for entity d_1 and H_k to H_m are those for entity d_2. An average operation is applied to obtain a vector representation for each drug entity. A tanh activation followed by a fully connected layer is then applied to each of the two vectors, yielding outputs H_1' and H_2' for d_1 and d_2 respectively:

H_1' = W\left[\tanh\left(\frac{1}{j-i+1}\sum_{t=i}^{j} H_t\right)\right] + b \qquad (1)
H_2' = W\left[\tanh\left(\frac{1}{m-k+1}\sum_{t=k}^{m} H_t\right)\right] + b \qquad (2)

The weight (W) and bias (b) parameters are shared. For the final hidden-state vector of the first token (‘[CLS]’), we also add an activation operation and a fully connected layer, which is formally expressed as:

H_0' = W_0\,\tanh(H_0) + b_0 \qquad (3)

Matrices W_0, W_1, W_2 have the same dimensions, i.e., W_0 ∈ R^{d×d}, W_1 ∈ R^{d×d}, W_2 ∈ R^{d×d}, where d is the hidden-state size of BERT. We concatenate H_0', H_1' and H_2' and then add a fully connected layer and a softmax layer, expressed as:

h'' = W_3\,[\mathrm{concat}(H_0', H_1', H_2')] + b_3 \qquad (4)
y_t' = \mathrm{softmax}(h'') \qquad (5)

Here W_3 ∈ R^{N×3d}, and y_t' is the softmax probability output over the N relation types. The bias vectors in Equations (1), (2), (3), (4) are b, b_0, and b_3. We use cross-entropy as the loss function. We denote this text-based architecture as BERT-Text-DDI.
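To make the head concrete, the following is a minimal PyTorch sketch of Equations (1)-(5); the class and variable names are ours, and it assumes the final BERT hidden states and inclusive entity-span indices are already available.

```python
import torch
import torch.nn as nn

class BertTextDDIHead(nn.Module):
    """R-BERT-style classification head over BERT hidden states (Eqs. 1-5)."""

    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        self.entity_fc = nn.Linear(hidden_size, hidden_size)     # shared W, b for d_1 and d_2
        self.cls_fc = nn.Linear(hidden_size, hidden_size)        # W_0, b_0
        self.out_fc = nn.Linear(3 * hidden_size, num_relations)  # W_3, b_3

    @staticmethod
    def entity_avg(H, start, end):
        # average the hidden states H_start..H_end of an entity span (inclusive)
        return H[:, start:end + 1, :].mean(dim=1)

    def forward(self, H, span1, span2):
        # H: (batch, seq_len, hidden), the final hidden states from BERT
        h0 = self.cls_fc(torch.tanh(H[:, 0, :]))                     # Eq. (3), '[CLS]' token
        h1 = self.entity_fc(torch.tanh(self.entity_avg(H, *span1)))  # Eq. (1)
        h2 = self.entity_fc(torch.tanh(self.entity_avg(H, *span2)))  # Eq. (2)
        logits = self.out_fc(torch.cat([h0, h1, h2], dim=-1))        # Eq. (4)
        return logits  # softmax of Eq. (5) is folded into the cross-entropy loss
```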

Entity Representation from KG

To infuse external information about the entities into the relation classification task, we obtain representations of the two drug entities mentioned in each input instance. We use an in-house heterogeneous biomedical Knowledge Graph (Bio-KG) consisting of target-target, drug-drug, drug-disease, drug-target, disease-disease, and disease-target interactions curated from a number of ontologies such as DrugBank (https://go.drugbank.com/), BioSNAP (http://snap.stanford.edu/biodata/), and UniProt (https://www.uniprot.org/) (The UniProt Consortium 2016). The overall statistics of Bio-KG are enumerated in Table 1. The real-world facts observed in Bio-KG are stored as a collection of triples of the form (h, r, t). Each triple is composed of a head entity h ∈ E, a tail entity t ∈ E, and a relation r ∈ R between them, e.g., (paracetamol, treats, fever), which stores the fact that paracetamol is effective in curing fever. Here E denotes the set of entities and R the set of relations. Bio-KG contains three types of entities (drugs, diseases, targets) and five types of relations (target-target, drug-disease, drug-target, disease-disease, and disease-target interactions).

Table 1: Statistics of Bio-KG.

Node Types  | Count | Edge Types      | Count
Drug        | 6512  | Drug-Target     | 15245
Target      | 30098 | Target-Target   | 77108
Disease     | 23458 | Drug-Disease    | 84745
            |       | Disease-Disease | 35382
            |       | Disease-Target  | 31161
Total Nodes | 60068 | Total Edges     | 243641

The aim of Knowledge Graph embedding is to embed the entities and relations into a low-dimensional continuous vector space, so as to simplify computations on the KG. KGE methods mostly use the facts in the KG to perform the embedding task, enforcing the embeddings to be compatible with those facts; they thus provide a generalizable context about the overall KG that can be used to infer relations. In this work, we employ off-the-shelf KG embeddings to encode a representation of each drug in terms of its relationships with other entities. The embeddings are computed so that they satisfy certain properties, i.e., they follow a given KGE model. Each KGE model defines a score function that measures the distance between two entities relative to their relation type in the low-dimensional embedding space. These score functions are used to train the KGE models so that entities connected by a relation are close to each other while unconnected entities are far apart. The KGE models used in our experiments are explained below:

  • TransE (Bordes et al. 2013): Given a fact (h, r, t), the relation is interpreted as a translation vector r so that the embedded entities h and t can be connected by r, i.e., h + r ≈ t when (h, r, t) holds. The scoring function is defined as the (negative) distance between h + r and t, i.e.,

    f_r(h, t) = \| h + r - t \| \qquad (6)
  • TransR (Lin et al. 2015): Given a fact (h, r, t), TransR first projects the entity representations h and t into the space specific to relation r, where M_r is a projection matrix from the entity space to the relation space of r:

    h_r = M_r h, \quad t_r = M_r t \qquad (7)

    The score is then computed as in TransE over the projected vectors, i.e., f_r(h, t) = \| h_r + r - t_r \|.
  • RESCAL (Nickel, Tresp, and Kriegel 2011): Each relation is represented as a matrix that models pairwise interactions between latent factors. The score of a fact (h, r, t) is defined by a bilinear function, where h, t are vector representations of the entities and M_r is a matrix associated with the relation. This score captures pairwise interactions between all components of h and t:

    f_r(h, t) = h^T M_r t = \sum_{i=0}^{d-1}\sum_{j=0}^{d-1} [M_r]_{ij}\,[h]_i\,[t]_j \qquad (8)
  • DistMult (Yang et al. 2015): DistMult simplifies RESCAL by restricting M_r to diagonal matrices. For each relation r, it introduces a vector embedding r and requires M_r = \mathrm{diag}(r). The scoring function is defined as:

    f_r(h, t) = h^T \mathrm{diag}(r)\, t = \sum_{i=0}^{d-1} r_i\,[h]_i\,[t]_i \qquad (9)

    This score captures pairwise interactions between only the components of h and t along the same dimension, and reduces the number of parameters to O(d) per relation.

From Bio-KG, we train these KG embeddings and obtain representations of all the nodes. In our case, we are only interested in the representations of the drug nodes. We denote the KG representation of drug d as kge.
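For concreteness, a minimal PyTorch sketch of two of these score functions is given below; the negative-sampling and loss machinery used for actual training is omitted, and all function names are ours.

```python
import torch

def transe_score(h, r, t):
    # TransE, Eq. (6): (negative) L2 distance between h + r and t
    return -torch.norm(h + r - t, p=2, dim=-1)

def distmult_score(h, r, t):
    # DistMult, Eq. (9): bilinear score with a diagonal relation matrix
    return (h * r * t).sum(dim=-1)

# toy usage: score a batch of four (head, relation, tail) embedding triples
d = 200
h, r, t = (torch.randn(4, d) for _ in range(3))
print(transe_score(h, r, t).shape)    # torch.Size([4])
print(distmult_score(h, r, t).shape)  # torch.Size([4])
```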

BERTKG-DDI

From the input instance s with two tagged target drug entities d_1 and d_2, we obtain the KG embedding representations of the two drugs, kge_1 and kge_2 respectively, using Bio-KG. We concatenate these two embeddings and pass them through a fully connected layer:

kge = W\,[\mathrm{concat}(kge_1, kge_2)] + b \qquad (10)

W and b are the parameters of the fully connected layer applied to the concatenated KG representations kge_1 and kge_2. The final layer of the BERTKG-DDI model concatenates all the previous text-based outputs with the drug representation from the KG:

o' = W_3\,[\mathrm{concat}(H_0', H_1', H_2', kge)] + b_3 \qquad (11)
y_t' = \mathrm{softmax}(o') \qquad (12)

Finally, training is optimized using the cross-entropy loss.
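A minimal sketch of this fusion step is shown below, assuming the three text-side vectors from the BERT-Text-DDI head and the two drug KG embeddings are already computed; the output dimension of the KG fully connected layer is our assumption, as it is not specified above.

```python
import torch
import torch.nn as nn

class BERTKGDDIFusion(nn.Module):
    """Fuses drug KG embeddings with the text-based vectors (Eqs. 10-12)."""

    def __init__(self, hidden_size: int, kge_dim: int, num_relations: int):
        super().__init__()
        self.kge_fc = nn.Linear(2 * kge_dim, kge_dim)                      # W, b of Eq. (10)
        self.out_fc = nn.Linear(3 * hidden_size + kge_dim, num_relations)  # W_3, b_3 of Eq. (11)

    def forward(self, h0, h1, h2, kge1, kge2):
        kge = self.kge_fc(torch.cat([kge1, kge2], dim=-1))          # Eq. (10)
        logits = self.out_fc(torch.cat([h0, h1, h2, kge], dim=-1))  # Eq. (11)
        return logits  # softmax of Eq. (12) is folded into the cross-entropy loss
```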

Experimental Setup

Dataset and Pre-processing

We follow the task setting of Task 9.2 in the DDIExtraction 2013 shared task (Herrero-Zazo et al. 2013) for evaluation. It consists of MEDLINE documents annotated with drug mentions and five types of interactions: Mechanism, Effect, Advice, Interaction, and Other. The task is multi-class classification: each drug pair in a sentence is classified into one of these types. We evaluate using three standard metrics: Precision (P), Recall (R), and F1-score (F1).

During pre-processing, we obtain the DRUG mentions in the corpus and map them to unique DrugBank (https://go.drugbank.com/) identifiers. Converting each drug mention into its DrugBank ID is a form of entity linking (Mondal et al. 2019; Leaman, Dogan, and Lu 2013). This mention normalization is performed based on the longest overlap between drug mentions and DrugBank names, after which the drugs are mapped to the different knowledge sources used to construct Bio-KG.
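As an illustration of this normalization step, here is a hedged sketch of longest-overlap matching against a name-to-ID dictionary; the dictionary format and all names are our assumptions, not the exact lookup used in this work.

```python
from typing import Dict, Optional

def normalize_mention(mention: str, drugbank_names: Dict[str, str]) -> Optional[str]:
    """Map a drug mention to a DrugBank ID by longest-overlap name matching."""
    mention_lc = mention.lower()
    best_name = None
    for name in drugbank_names:
        # keep the longest dictionary name contained in the mention
        if name in mention_lc and (best_name is None or len(name) > len(best_name)):
            best_name = name
    return drugbank_names[best_name] if best_name is not None else None

# toy usage with a hypothetical two-entry dictionary
names = {"aspirin": "DB00945", "warfarin": "DB00682"}
print(normalize_mention("Aspirin tablets", names))  # -> DB00945
```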

Training Details

For our experiments, we initialize the transformer encoder in BERTKG-DDI with various pre-trained contextual embeddings: bert-base-cased (https://huggingface.co/bert-base-cased), scibert-scivocab-uncased (Beltagy, Lo, and Cohan 2019) (https://github.com/allenai/scibert), and the domain-specific biobert v1.0 pubmed pmc and biobert v1.1 pubmed (https://github.com/dmis-lab/biobert). We uniformly keep the maximum sequence length at 300 for all the embedding ablations and train for 5 epochs. For the KG embeddings, we set the embedding dimension to 200; Stochastic Gradient Descent (SGD) is used for optimization with an initial learning rate of 0.0001, and the model is trained for 300 epochs. After training the embeddings, we obtain the final representation of each drug. For the drugs mentioned in the input instance, we use the obtained embeddings as in Equation (11). We initialize non-normalized drugs using pre-trained word2vec embeddings (of dimension 200, the same as the KG embeddings) trained on PubMed (http://evexdb.org/pmresources/ngrams/PubMed/).
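For reference, the hyperparameters above can be collected into a small configuration; the dictionary keys are our own convention, and the values are those stated in the text.

```python
# Text encoder settings for BERT-Text-DDI / BERTKG-DDI (as stated above).
TEXT_ENCODER_CONFIG = {
    "pretrained_weights": [
        "bert-base-cased",
        "scibert-scivocab-uncased",
        "biobert v1.0 pubmed pmc",
        "biobert v1.1 pubmed",
    ],
    "max_seq_length": 300,
    "epochs": 5,
}

# KG embedding training settings on Bio-KG (as stated above).
KGE_CONFIG = {
    "embedding_dim": 200,  # also the word2vec dim for non-normalized drugs
    "optimizer": "SGD",
    "initial_learning_rate": 1e-4,
    "epochs": 300,
}
```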

Table 2: Ablation of contextual embeddings on BERT-Text-DDI.

Embeddings on BERT-Text-DDI | Test set Macro F1
bert-base-cased             | 0.806
scibert-scivocab-uncased    | 0.812
biobert v1.0 pubmed pmc     | 0.818
biobert v1.1 pubmed         | 0.822

Table 3: Ablation of KG embeddings on BERTKG-DDI.

KG Embeddings on BERTKG-DDI | Test set Macro F1
BERTKG-DDI w/ TransE        | 0.826
BERTKG-DDI w/ TransR        | 0.829
BERTKG-DDI w/ RESCAL        | 0.834
BERTKG-DDI w/ DistMult      | 0.840

Table 4: Effect of augmenting KG information.

Models        | Contextual Embeddings   | Macro F1
BERT-Text-DDI | biobert v1.0 pubmed pmc | 0.818
BERTKG-DDI    | biobert v1.0 pubmed pmc | 0.831
BERT-Text-DDI | biobert v1.1 pubmed     | 0.822
BERTKG-DDI    | biobert v1.1 pubmed     | 0.840

Table 5: Comparison with existing baselines on the DDIExtraction 2013 test set (F1-scores).

Methods                         | Advice | Effect | Mechanism | Interaction | Total
(Zhang et al. 2017)             | 0.80   | 0.71   | 0.74      | 0.54        | 0.72
(Vivian et al. 2017)            | 0.85   | 0.76   | 0.77      | 0.57        | 0.77
(Asada, Miwa, and Sasaki 2018)  | 0.81   | 0.71   | 0.73      | 0.45        | 0.72
(Sun et al. 2019)               | 0.80   | 0.73   | 0.78      | 0.58        | 0.75
(Zhu et al. 2020)               | 0.86   | 0.80   | 0.84      | 0.56        | 0.80
Our method (BERTKG-DDI)         | 0.88   | 0.81   | 0.87      | 0.59        | 0.84

Results and Discussion

In this section, we provide a detailed analysis of the results and findings observed during our experiments, showing empirical results for BERTKG-DDI with both text and KG information.

Ablation of Embeddings on BERT-Text-DDI: During ablation analysis, we observe that the domain-specific information in biobert v1.1 pubmed boosts predictive performance in terms of macro F1-score (across all relation types) by 2.3% compared to bert-base-cased. Moreover, the scibert-scivocab-uncased embeddings, pretrained on scientific text, also achieve a reasonable boost in performance. BERT-Text-DDI based on biobert v1.1 pubmed is the best-performing text-based relation classification model. The results are enumerated in Table 2.

Ablation analysis of KG Embeddings on BERTKG-DDI: In Table 3, we compare the different KG embeddings for drugs obtained from Bio-KG after augmenting the BERT-Text-DDI model. The semantic-matching models, RESCAL and DistMult, measure the plausibility of facts by matching the latent semantics of both relations and entities in their vector space. In our experiments, they outperform the translation-based KGE models, TransE and TransR, by an average of 1% macro F1-score.

Advantage of KG information on BERTKG-DDI: We next examine how much performance gain can be achieved by augmenting KG embeddings. From the macro F1-scores across all relation types in Table 4, we observe that the best-performing BERT-Text-DDI model gains 1.8% after augmenting KG information in BERTKG-DDI.

Comparison with the existing baselines: We compare our best-performing model with several strong existing baselines. (Asada, Miwa, and Sasaki 2018) proposed a neural method to extract DDIs from texts using external drug molecular structure information: they encode textual drug pairs with convolutional neural networks and their molecular pairs with graph convolutional networks (GCNs), and then concatenate the outputs of the two networks. (Vivian et al. 2017) proposed an effective model that classifies DDIs from the literature by combining an attention mechanism and a recurrent neural network with long short-term memory (LSTM) units. (Zhang et al. 2017) presented a hierarchical recurrent neural network (RNN)-based method that integrates the shortest dependency path (SDP) and the sentence sequence for the DDI extraction task. (Sun et al. 2019) proposed a recurrent hybrid convolutional neural network (RHCNN) for DDI extraction from biomedical literature: in the embedding layer, texts mentioning two entities are represented as a sequence of semantic and position embeddings, where the complete semantic embedding is obtained by fusing a word embedding with contextual information learnt by the recurrent structure. Recently, (Zhu et al. 2020) proposed multiple entity-aware attentions with various entity information to strengthen the representations of drug entities in sentences; they integrate drug descriptions from Wikipedia and DrugBank into their model to enhance the semantic information of drug entities, and also modify the output of the BioBERT model, showing that this works better than using BioBERT directly. In contrast, our method achieves state-of-the-art performance on the DDIExtraction 2013 corpus (in terms of F1-scores for all the relation types), as shown in Table 5.

Conclusion

In this paper, we propose an approach, BERTKG-DDI, for DDI relation classification based on pre-trained language models and Knowledge Graph embeddings of the drug entities. Experiments conducted on a benchmark DDI dataset prove the effectiveness of our proposed method. A possible direction for further research is to explore other external drug representations, such as chemical structure and textual descriptions, for predicting DDIs from textual corpora.

References

  • Asada, Miwa, and Sasaki (2018) Asada, M.; Miwa, M.; and Sasaki, Y. 2018. Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 680–685. Melbourne, Australia: Association for Computational Linguistics. doi:10.18653/v1/P18-2108. URL https://www.aclweb.org/anthology/P18-2108.
  • Beltagy, Lo, and Cohan (2019) Beltagy, I.; Lo, K.; and Cohan, A. 2019. SciBERT: A Pretrained Language Model for Scientific Text. In EMNLP/IJCNLP.
  • Bordes et al. (2013) Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; and Yakhnenko, O. 2013. Translating Embeddings for Modeling Multi-relational Data. In NIPS.
  • Devlin et al. (2019) Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.
  • Herrero-Zazo et al. (2013) Herrero-Zazo, M.; Segura-Bedmar, I.; Martínez, P.; and Declerck, T. 2013. The DDI corpus: An annotated corpus with pharmacological substances and drug–drug interactions. Journal of Biomedical Informatics 46(5): 914–920. ISSN 1532-0464. doi:10.1016/j.jbi.2013.07.011. URL http://www.sciencedirect.com/science/article/pii/S1532046413001123.
  • Leaman, Dogan, and Lu (2013) Leaman, R.; Dogan, R.; and Lu, Z. 2013. DNorm: Disease Name Normalization with Pairwise Learning to Rank. Bioinformatics 29. doi:10.1093/bioinformatics/btt474.
  • Lee et al. (2019) Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C. H.; and Kang, J. 2019. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36(4): 1234–1240. ISSN 1367-4803. doi:10.1093/bioinformatics/btz682. URL https://doi.org/10.1093/bioinformatics/btz682.
  • Li and Ji (2019) Li, D.; and Ji, H. 2019. Syntax-aware Multi-task Graph Convolutional Networks for Biomedical Relation Extraction. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019), 28–33. Hong Kong: Association for Computational Linguistics. doi:10.18653/v1/D19-6204. URL https://www.aclweb.org/anthology/D19-6204.
  • Lin et al. (2015) Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu, X. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI'15, 2181–2187. AAAI Press. ISBN 0262511290.
  • Liu et al. (2016) Liu, S.; Tang, B.; Chen, Q.; and Wang, X. 2016. Drug-Drug Interaction Extraction via Convolutional Neural Networks. Computational and Mathematical Methods in Medicine 2016: 1–8. doi:10.1155/2016/6918381.
  • Mondal (2020) Mondal, I. 2020. BERTChem-DDI: Improved Drug-Drug Interaction Prediction from text using Chemical Structure Information. In Proceedings of Knowledgeable NLP: the First Workshop on Integrating Structured Knowledge and Neural Networks for NLP, 27–32. Suzhou, China: Association for Computational Linguistics. URL https://www.aclweb.org/anthology/2020.knlp-1.4.
  • Mondal et al. (2019) Mondal, I.; Purkayastha, S.; Sarkar, S.; Goyal, P.; Pillai, J.; Bhattacharyya, A.; and Gattu, M. 2019. Medical Entity Linking using Triplet Network. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, 95–100. Minneapolis, Minnesota, USA: Association for Computational Linguistics. doi:10.18653/v1/W19-1912. URL https://www.aclweb.org/anthology/W19-1912.
  • Nickel, Tresp, and Kriegel (2011) Nickel, M.; Tresp, V.; and Kriegel, H.-P. 2011. A Three-Way Model for Collective Learning on Multi-Relational Data. In Proceedings of the 28th International Conference on Machine Learning, ICML'11, 809–816. Madison, WI, USA: Omnipress. ISBN 9781450306195.
  • Purkayastha et al. (2019) Purkayastha, S.; Mondal, I.; Sarkar, S.; Goyal, P.; and Pillai, J. K. 2019. Drug-Drug Interactions Prediction Based on Drug Embedding and Graph Auto-Encoder. In 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE), 547–552.
  • Sahu and Anand (2018) Sahu, S. K.; and Anand, A. 2018. Drug-drug interaction extraction from biomedical texts using long short-term memory network. Journal of Biomedical Informatics 86: 15–24. ISSN 1532-0464. doi:10.1016/j.jbi.2018.08.005. URL http://www.sciencedirect.com/science/article/pii/S1532046418301606.
  • Sun et al. (2019) Sun, X.; Dong, K.; Ma, L.; Sutcliffe, R.; He, F.; Chen, S.; and Feng, J. 2019. Drug-Drug Interaction Extraction via Recurrent Hybrid Convolutional Neural Networks with an Improved Focal Loss. Entropy 21(1): 37. ISSN 1099-4300. doi:10.3390/e21010037. URL http://dx.doi.org/10.3390/e21010037.
  • The UniProt Consortium (2016) The UniProt Consortium. 2016. UniProt: the universal protein knowledgebase. Nucleic Acids Research 45(D1): D158–D169. ISSN 0305-1048. doi:10.1093/nar/gkw1099. URL https://doi.org/10.1093/nar/gkw1099.
  • Vivian et al. (2017) Vivian, V.; Lin, H.; Luo, L.; Zhao, Z.; Zhengguang, L.; Yijia, Z.; Yang, Z.; and Wang, J. 2017. An attention-based effective neural model for drug-drug interactions extraction. BMC Bioinformatics 18. doi:10.1186/s12859-017-1855-x.
  • Wu and He (2019) Wu, S.; and He, Y. 2019. Enriching Pre-trained Language Model with Entity Information for Relation Classification. CoRR abs/1905.08284. URL http://arxiv.org/abs/1905.08284.
  • Yang et al. (2015) Yang, B.; Yih, W.-t.; He, X.; Gao, J.; and Deng, L. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. CoRR abs/1412.6575.
  • Yang et al. (2019) Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.; and Le, Q. V. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In NeurIPS.
  • Zhang et al. (2017) Zhang, Y.; Zheng, W.; Lin, H.; Wang, J.; Yang, Z.; and Dumontier, M. 2017. Drug–drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths. Bioinformatics 34(5): 828–835. ISSN 1367-4803. doi:10.1093/bioinformatics/btx659. URL https://doi.org/10.1093/bioinformatics/btx659.
  • Zhu et al. (2020) Zhu, Y.; Li, L.; Lu, H.; Zhou, A.; and Qin, X. 2020. Extracting drug-drug interactions from texts with BioBERT and multiple entity-aware attentions. Journal of Biomedical Informatics 106: 103451. ISSN 1532-0464. doi:10.1016/j.jbi.2020.103451. URL http://www.sciencedirect.com/science/article/pii/S1532046420300794.