Applying metadata requires controlled vocabularies suited to the subject domain in order to meet users' retrieval needs, yet little is known about which words users actually search with. To find out what terms users of a digital education library use when searching, the authors mined the library's search logs. After a preliminary analysis of more than one million search terms contained in more than 400,000 search records, they focused on how often controlled vocabulary terms appeared in user queries and compared them with the uncontrolled terms users entered. They found that users employed controlled terms far more frequently than uncontrolled terms. Further analysis of uncontrolled term usage can supply candidates for supplementing and updating the controlled vocabulary, and can also provide important evidence for establishing semantic mappings between controlled and uncontrolled terms.
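The kind of tally described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `tally_terms`, the whitespace tokenization, the case-insensitive exact matching, and the toy vocabulary and log are all assumptions made for illustration.

```python
from collections import Counter


def tally_terms(queries, controlled_vocab):
    """Count occurrences of controlled and uncontrolled terms in a query log.

    Assumption: a query is split on whitespace and each token is matched
    case-insensitively against the vocabulary. A real study would also need
    to handle multi-word terms, spelling variants, and stop words.
    """
    vocab = {term.lower() for term in controlled_vocab}
    controlled, uncontrolled = Counter(), Counter()
    for query in queries:
        for term in query.lower().split():
            if term in vocab:
                controlled[term] += 1
            else:
                uncontrolled[term] += 1
    return controlled, uncontrolled


# Hypothetical toy data, not taken from the study.
vocab = {"algebra", "geometry", "physics"}
log = ["algebra worksheets", "Physics lab", "geometry", "algebra homework"]

controlled, uncontrolled = tally_terms(log, vocab)
total = sum(controlled.values()) + sum(uncontrolled.values())
controlled_share = sum(controlled.values()) / total  # share of controlled terms
```

The most frequent entries in `uncontrolled` are the natural candidates to review when updating the vocabulary or mapping user wording to existing controlled terms.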