Objective: B-cell epitopes are divided into subclasses according to the isotype of the specific antibody they induce. Exploring the differences between these epitope subclasses is important because it can help reveal why the immune system mounts specific antibody responses against different epitopes. This work develops a model, based on a multi-class support vector machine, that distinguishes multiple epitope subclasses and predicts the subclass of a B-cell epitope. Methods: The training dataset was derived from the Immune Epitope Database and contains four classes of data, corresponding to four B-cell epitope subclasses: IgA epitopes, IgE epitopes, IgG epitopes, and IgM epitopes. Using 5-fold cross-validation, the ability of three feature types to discriminate among the epitope subclasses was evaluated separately: amino acid composition, quasi-sequence-order features, and dipeptide composition. Results: The experiments show that dipeptide composition discriminates best, with an overall accuracy of 61.58%. Based on this multi-class classification model, a freely available server for predicting B-cell epitope subclasses, named BCESCP, was developed. BCESCP can be accessed at: http://www.bioinfo.tsinghua.edu.cn/epitope/BCESCP/.
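To make the best-performing feature type concrete, the following is a minimal sketch of how a dipeptide composition vector might be computed for a single epitope sequence. It is not the authors' implementation: the function name `dipeptide_composition`, the alphabet ordering, and the choice to skip pairs containing non-standard residues and to normalize by the number of valid pairs are all assumptions made for illustration. The resulting 400-dimensional vector is the kind of input one could pass to a multi-class SVM and evaluate with 5-fold cross-validation.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20 x 20 = 400 ordered residue pairs, in a fixed order so that every
# peptide maps to a vector with the same layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(peptide):
    """Return a 400-element list of dipeptide frequencies for one peptide.

    Hypothetical helper: adjacent pairs that contain a non-standard residue
    are skipped, and counts are divided by the number of valid pairs.
    """
    seq = peptide.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    valid_pairs = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            valid_pairs += 1
    if valid_pairs == 0:
        # Too short, or no standard residue pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / valid_pairs for d in DIPEPTIDES]


if __name__ == "__main__":
    # "ACAC" is a made-up example sequence, not data from the paper.
    vector = dipeptide_composition("ACAC")
    print(len(vector), vector[DIPEPTIDES.index("AC")])
```

Because every peptide is mapped to a fixed-length vector regardless of its length, epitopes of different sizes can be compared directly by a classifier. The amino acid composition feature mentioned in the abstract follows the same idea with single residues, giving 20 dimensions instead of 400.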