The BLCU Corpus Center (北京语言大学语料库中心, BCC) is an online corpus that consists mainly of Chinese and also covers other languages. With a total size of tens of billions of characters, BCC is an online big-data system that serves both research on language itself and research on language applications. A BCC query expression is composed of units such as characters, words, and grammatical tags, and it supports wildcards and queries for separable (discontinuous) constructions. This paper gives an overview of BCC, including the construction of the corpus and the development of its search engine, and focuses on BCC's formal query language and the use of the online system.