Ensemble learning methods are widely used across many areas of artificial intelligence research, and their strong performance has attracted a large number of researchers. Classifier fusion is a core problem in ensemble learning, and researchers have proposed many different fusion methods. This paper introduces the concept of an average distribution of classifiers: the weights of the base classifiers are adjusted so that their performance is as even as possible across different samples. This strategy gives a chance to samples that only a few classifiers predict correctly. The paper also introduces the concept of a classifier equivalence coefficient, which addresses how to weigh two classifiers of different accuracy against each other within an ensemble. Experiments on 12 UCI datasets, evaluated with strict ten-fold cross-validation, show that the average distribution ensemble algorithm outperforms simple majority voting, LP-Adaboost, and LP1.
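The contrast between simple majority voting and a weighted vote can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not say how the weights are computed, so the weights below are hand-picked for illustration, and the names `weighted_vote`, `margins`, `Y`, and `PREDICTIONS` are hypothetical. The toy data assumes binary labels in {-1, +1} and five base classifiers, with the last sample predicted correctly by only two of them.

```python
def weighted_vote(predictions, weights):
    """Combine per-classifier labels in {-1, +1} by a weighted vote.

    predictions[j][i] is the label classifier j assigns to sample i.
    A zero score is resolved to -1 (arbitrary tie-breaking choice).
    """
    n_samples = len(predictions[0])
    labels = []
    for i in range(n_samples):
        score = sum(w * p[i] for w, p in zip(weights, predictions))
        labels.append(1 if score > 0 else -1)
    return labels


def margins(predictions, weights, y):
    """Normalized margin of each sample: positive means correctly classified."""
    total = sum(weights)
    return [
        y[i] * sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(len(y))
    ]


# Toy data (illustrative only, not taken from the paper).
Y = [1, -1, 1, -1, 1]
PREDICTIONS = [
    [-1, -1,  1,  1,  1],   # classifier A: wrong on samples 1 and 4
    [ 1,  1, -1, -1,  1],   # classifier B: wrong on samples 2 and 3
    [ 1, -1,  1, -1, -1],   # classifier C: wrong on sample 5
    [ 1, -1,  1,  1, -1],   # classifier D: wrong on samples 4 and 5
    [ 1, -1, -1, -1, -1],   # classifier E: wrong on samples 3 and 5
]

UNIFORM = [1, 1, 1, 1, 1]       # simple majority voting
HAND_PICKED = [2, 2, 1, 1, 1]   # assumed weights, chosen by hand

if __name__ == "__main__":
    for name, w in [("majority", UNIFORM), ("weighted", HAND_PICKED)]:
        print(name, weighted_vote(PREDICTIONS, w), min(margins(PREDICTIONS, w, Y)))
```

With uniform weights the fifth sample is misclassified, because three of the five classifiers get it wrong. With the hand-picked weights every sample has a positive margin, even though classifier C, the most accurate one individually, does not receive the largest weight. How the paper actually selects weights, and how the equivalence coefficient enters, is not specified in the abstract.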