Identifying the targets that interact with active ingredients is a key step in elucidating the mechanism of action of traditional Chinese medicine (TCM) at the molecular level. This paper proposes a new approach to target identification for the chemical constituents of TCM. Starting from the structure of small-molecule compounds, the method computes molecular descriptors that characterize molecular composition, charge distribution, topology, geometry, and physicochemical properties, and then selects the descriptors related to target activity by combining the BestFirst search strategy with the CfsSubsetEval evaluation strategy. Three machine learning methods (radial basis function neural network, naive Bayes, and random forest) are used to build identification models for a series of targets; the models are then integrated into a target identification system that predicts the targets of TCM active ingredients. Under 10-fold cross-validation, the overall prediction accuracies of the three methods were 83.33%~95.71%, 84.62%~96.43%, and 82.14%~95.59%, respectively, and identification completed in 0.02~0.19 seconds. The experimental results show that the method is not only simple and effective but, more importantly, meets the efficiency requirements of target identification for TCM chemical constituents.
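The descriptor-selection step can be illustrated with a minimal sketch of correlation-based feature selection (CFS), the idea behind an evaluator such as CfsSubsetEval. It scores a descriptor subset of size k by the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean descriptor-to-class correlation and r_ff the mean descriptor-to-descriptor correlation. Several things here are assumptions rather than the paper's implementation: Pearson correlation is used as the correlation measure (the original tool may use a different one), a plain greedy forward search stands in for the BestFirst strategy, and the three descriptor columns and the activity labels are made-up toy values.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """CFS merit of a descriptor subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute correlation between each descriptor and the class label.
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    # Mean absolute correlation between descriptor pairs (0 for a single descriptor).
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_cfs(columns, labels):
    """Greedy forward search (a simplification, not BestFirst): add the descriptor
    that raises the merit most, and stop when no addition improves it."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        merit, j = max((cfs_merit(columns, labels, selected + [c]), c)
                       for c in sorted(remaining))
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best


# Hypothetical toy data: 8 compounds, 1 = active against the target, 0 = inactive.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 1, 2, 5, 6, 5, 6],  # descriptor 0: strongly related to activity
    [2, 1, 2, 1, 5, 4, 5, 4],  # descriptor 1: related, partly redundant with 0
    [3, 1, 4, 1, 3, 1, 4, 1],  # descriptor 2: unrelated to activity
]

selected, best = greedy_forward_cfs(columns, labels)
print("selected descriptors:", selected, "merit:", round(best, 4))
```

In this toy setting the search keeps the two activity-related descriptors and rejects the unrelated one, because adding it lowers the mean descriptor-to-class correlation more than it reduces redundancy. The retained columns would then be passed to the per-target classifiers.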