Confidence intervals for Markov chain transition probabilities based on next generation sequencing

Source: Quantitative Biology (English edition)
Background: Markov chains (MC) have been widely used to model molecular sequences. The estimation of the MC transition matrix and of confidence intervals for the transition probabilities from long sequence data has been studied intensively over the past decades. Next generation sequencing (NGS) generates a large number of short reads. These short reads can overlap, and some regions of the genome may not be sequenced at all, which results in a new type of data. Based on NGS data, the transition probabilities of an MC can be estimated by moment estimators. However, the classical asymptotic distribution theory for MC transition probability estimators based on long sequences is no longer valid.

Methods: In this study, we present the asymptotic distributions of several statistics related to MC based on NGS data. We show that, after scaling by the effective coverage d defined in a previous study by the authors, these statistics based on NGS data approximately follow the same distributions as the corresponding statistics for long sequences.

Results: We apply the asymptotic properties of these statistics to find theoretical confidence regions for MC transition probabilities based on NGS short-read data. We validate our theoretical confidence intervals using both simulated and real data sets, and compare the results with those obtained by the parametric bootstrap method.

Conclusions: We find that the asymptotic distributions of these statistics and the theoretical confidence intervals for transition probabilities based on NGS data given in this study are highly accurate, providing a powerful tool for NGS data analysis.
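As a rough illustration of the kind of estimator and interval the abstract refers to, the following Python sketch pools adjacent-letter counts over all short reads, forms count-ratio (moment-type) estimates of first-order transition probabilities, and builds a Wald-type interval. The abstract does not give the formulas, so several things here are assumptions: the function names are hypothetical, the effective coverage d is taken as a user-supplied number (its definition is in the authors' earlier study and is not reproduced here), and the way d enters, as a variance inflation factor, is only one plausible reading of "scaling by the effective coverage".

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def transition_counts(reads):
    """Count adjacent-letter pairs (a, b), pooled over all reads."""
    counts = Counter()
    for read in reads:
        for a, b in zip(read, read[1:]):
            # Skip pairs containing ambiguous bases such as 'N'.
            if a in ALPHABET and b in ALPHABET:
                counts[(a, b)] += 1
    return counts


def estimate_transitions(reads):
    """Count-ratio estimates p_hat(a -> b) = N_ab / sum_c N_ac."""
    counts = transition_counts(reads)
    estimates = {}
    for a in ALPHABET:
        row_total = sum(counts[(a, b)] for b in ALPHABET)
        for b in ALPHABET:
            estimates[(a, b)] = (
                counts[(a, b)] / row_total if row_total else float("nan")
            )
    return estimates, counts


def wald_interval(p_hat, n_from, d=1.0, z=1.96):
    """Wald-type interval for one transition probability.

    ASSUMPTION: d inflates the long-sequence variance p(1-p)/n.
    With d = 1.0 this is the classical long-sequence interval.
    """
    half_width = z * math.sqrt(d * p_hat * (1.0 - p_hat) / n_from)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Toy reads, invented for illustration only.
reads = ["ACGTAC", "GTACGA", "CGTACG"]
estimates, counts = estimate_transitions(reads)
n_from_g = sum(counts[("G", b)] for b in ALPHABET)
interval_classical = wald_interval(estimates[("G", "T")], n_from_g, d=1.0)
interval_scaled = wald_interval(estimates[("G", "T")], n_from_g, d=2.0)
```

With d greater than 1, the interval is wider than the classical one, which matches the intuition that overlapping reads carry less independent information than the same number of letters in a single long sequence. The value d=2.0 above is arbitrary and not taken from the paper.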