Identifying attribute values that refer to the same real-world entity but have different representations can improve the effectiveness of many applications (e.g., duplicate record detection and functional dependency mining). The state-of-the-art approach for attribute value matching is based on string similarity measurement. Its effectiveness depends on the assumption that equivalent attribute values appear similar, while non-matching values appear less similar. Unfortunately, it may not perform well in circumstances where the string similarity specified by a metric is not a reliable indicator of attribute value equivalence.
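The limitation described above can be illustrated with a minimal sketch. It assumes a normalized Levenshtein similarity and a fixed threshold of 0.8; the metric, the threshold, the function names, and the sample values are all illustrative assumptions, not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical matcher: declare equivalence when similarity >= threshold."""
    return similarity(a, b) >= threshold


# Hypothetical attribute values chosen to show both failure modes.
pairs = [
    ("NY", "New York"),          # same entity, dissimilar strings
    ("VLDB 2010", "VLDB 2011"),  # different entities, near-identical strings
]
for left, right in pairs:
    print(left, "|", right, "|", round(similarity(left, right), 3),
          "|", is_match(left, right))
```

Under these assumptions the matcher rejects the first pair, although both values name the same entity, and accepts the second pair, although the values name different entities. No single threshold separates the two cases, which is the situation the abstract refers to.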