Objective: To identify characteristic genes associated with the histological grade of breast cancer by mining gene chip (microarray) data, as a reference for clinical diagnosis and biomedical research on breast cancer. Methods: Breast cancer microarray expression data were obtained from the public Gene Expression Omnibus (GEO) database. A support vector machine was used to extract characteristic genes from tumor samples of different histological grades, and the biological functions of these genes were analyzed. Results: Sixty-four characteristic genes were obtained, with a classification accuracy of 100%. These genes are strongly associated with cancer and are concentrated in several biological pathways, including transcriptional regulation, ion transport, and organogenesis and development. Conclusion: Mining gene chip data gives a global view of tumor gene expression and deepens understanding of the molecular mechanisms of breast cancer cell differentiation.
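The abstract does not say how the support vector machine was used to pick genes. One common approach is recursive feature elimination (SVM-RFE): train a linear SVM, drop the gene with the smallest absolute weight, and repeat. The sketch below illustrates that idea on synthetic data only. The function names (`make_toy_data`, `train_linear_svm`, `svm_rfe`), the hyperparameters, and the toy expression matrix are all assumptions made for illustration. They are not the authors' pipeline, and no GEO data is used.

```python
import random

def make_toy_data(n_samples=20, n_genes=6, informative=(0, 3), seed=0):
    """Synthetic expression matrix (hypothetical): +1 = high grade, -1 = low grade."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        # Every gene starts as bounded background noise.
        row = [rng.uniform(-0.5, 0.5) for _ in range(n_genes)]
        for g in informative:
            row[g] += 2.0 * label  # informative genes shift with the grade
        X.append(row)
        y.append(label)
    return X, y

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by stochastic subgradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1  # sample lies inside the margin or is misclassified
            for j in range(n_features):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b

def svm_rfe(X, y, n_keep):
    """Repeatedly drop the feature with the smallest |weight| until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: abs(w[k]))
        active.pop(worst)
    return active

X, y = make_toy_data()
selected = svm_rfe(X, y, n_keep=2)
print(selected)  # -> [0, 3], the two genes constructed to track the grade
```

On real microarray data the candidate set would be thousands of probes rather than six, so implementations typically remove genes in batches per round and estimate accuracy by cross-validation rather than on the training samples.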