R.S. Olayan, H. Ashoor, V.B Bajic
Bioinformatics, btx731, (2017)
Efficient computational, Predict drugtarget, Interactions, Graph, Mining, Machine learning approaches
Finding computationally drug-target interactions (DTIs) is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, the current DTI prediction methods suffer the high false positive prediction rate.
We developed DDR, a novel method that improves the DTI prediction accuracy. DDR is based on the use of a heterogeneous graph that contains known DTIs with multiple similarities between drugs and multiple similarities between target proteins. DDR applies non-linear similarity fusion method to combine different similarities. Before fusion, DDR performs a pre-processing step where a subset of similarities is selected in a heuristic process to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using five repeats of 10-fold cross-validation, three testing setups, and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best start-of-the-art method for predicting DTIs by 34% when the drugs are new, by 23% when targets are new, and by 34% when the drugs and the targets are known but not all DTIs between them are not known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs.