Indexing Techniques of Distributed Ordered Tables: A Survey and Analysis

Source: Journal of Computer Science and Technology (English edition)
Many NoSQL (Not Only SQL) databases have been proposed to store and query huge amounts of data. Some of them, such as BigTable, PNUTS, and HBase, can be modeled as distributed ordered tables (DOTs). Many additional indexing techniques have been presented to support queries on non-key columns of DOTs. However, there has been no comprehensive analysis or comparison of these techniques, which makes it difficult for users to select or propose a suitable indexing technique for a given workload. This paper proposes a taxonomy based on six indexing issues to classify indexing techniques on DOTs and provides a comprehensive review of the state-of-the-art techniques. Based on the taxonomy, we propose a performance model named QSModel to estimate the query time and storage cost of these techniques, and run experiments on a practical workload from Tencent to evaluate this model. The results show that the maximum error rates of the query time and storage cost are 24.2% and 9.8%, respectively. Furthermore, we propose IndexComparator, an open source project that implements representative indexing techniques. Users can therefore select the best-fit indexing technique based on both theoretical analysis and practical experiments.