Using MATLAB and the optimal complete subgraph algorithm, the microsatellite distribution characteristics of the complete genome of tobacco vein clearing virus (NC_003378.1) in the NCBI database were extracted and displayed. For every 1-base to 6-base group, the number of repeated occurrences and the positions of those occurrences along the complete genome sequence were computed, and their distribution patterns, which follow exponential functions, are presented. The maximum number of repeated occurrences among the N-base groups decreases exponentially with N; when the N-base groups are sorted from fewest to most repeated occurrences, the occurrence count increases with the rank index. The method can be applied systematically to extract and display the microsatellite distribution characteristics of the complete genome sequences of other viruses, providing a basis for using these characteristics to study the structure and function of complete genomes and the patterns of heredity and variation.
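To make the counting step concrete, the following is a minimal Python sketch of how the occurrences and positions of N-base groups (N = 1 to 6) could be tallied with a plain sliding window. It is not the authors' MATLAB implementation and does not reproduce the optimal complete subgraph algorithm; the function names `base_group_positions` and `summarize` and the toy sequence are assumptions made purely for illustration.

```python
from collections import defaultdict


def base_group_positions(sequence, n):
    """Map each n-base group to the 1-based start positions where it occurs.

    Overlapping occurrences are counted (sliding window with step 1).
    """
    positions = defaultdict(list)
    for i in range(len(sequence) - n + 1):
        positions[sequence[i:i + n]].append(i + 1)
    return dict(positions)


def summarize(sequence, max_n=6):
    """For N = 1..max_n, return (largest count, all counts sorted ascending)."""
    summary = {}
    for n in range(1, max_n + 1):
        groups = base_group_positions(sequence, n)
        counts = sorted(len(p) for p in groups.values())
        if counts:  # skip N longer than the sequence
            summary[n] = (counts[-1], counts)
    return summary


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT the tobacco vein clearing virus genome.
    toy = "ATATATGCGCATAT"
    for n, (max_count, counts) in summarize(toy).items():
        print(n, max_count, counts)
```

On a real genome, the per-N maxima and the sorted count lists returned by `summarize` are the quantities whose trends the abstract describes; fitting an exponential to them would be a separate step not shown here.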